Have you ever wondered what effect seatbelt usage has on the amount of money spent for hospital admission for crash victims? Or perhaps you want to know if size-appropriate splinting in the prehospital setting reduces hospital admissions, lengths of stay, and charges?

Using Existing Databases

The answers to questions like these are rarely contained in a single database. A researcher must therefore start from scratch, building a new database that follows patients from the point of splinting to the emergency department and, finally, records whether each patient was admitted. Building such a database can be expensive in both time and money. However, if you have access to existing databases, such as computerized EMS run reports and an emergency department or hospital discharge database, then probabilistic record linkage may be the tool for you.

Purpose

The purpose of probabilistic record linkage is to combine multiple databases into one extensive database for analysis. It can also be used to link multiple events within one database that refer to a single patient or individual.

Description

Probabilistic record linkage is accomplished by comparing data fields in two files, such as birth date or gender. The comparison of numerous data fields leads to a judgment of whether two records refer to the same patient and/or event (and should be linked). This judgment is based on the cumulative weight of agreement and disagreement among field values. The amount of information in a field affects the field’s impact on whether two records should be linked. For instance, agreement on the gender field alone would not establish that two records refer to the same patient, but agreement on Social Security Number nearly guarantees that two records refer to the same individual. Probabilistic linkage software uses mathematical algorithms to determine whether two records should be linked based on the information in each record.

Technical Details

By assigning log-likelihood ratios to field comparisons, it is possible to computerize the judgment process. Let m_i be the probability that the i-th field agrees, given that the records are known to refer to the same person or event (a true match). Let u_i be the probability that the i-th field agrees by chance among records known not to match.

Then, for a given pair of records, if field i agrees, the agreement weight is w_i = log2( m_i / u_i ). If field i disagrees, the disagreement weight w_i = log2( (1 - m_i) / (1 - u_i) ) is assigned. The composite weight for a record pair is the sum of the agreement and disagreement weights over all fields available for comparison.
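
The weight calculation can be sketched in a few lines of Python. This is only an illustration, not the algorithm of any particular linkage package: the function names, the field names, and the m and u values are hypothetical, chosen to show how a low-information field (gender) and a high-information field (birth date) contribute to the composite weight.

```python
import math

def field_weight(m, u, agrees):
    """Log2 likelihood-ratio weight for one field comparison."""
    if agrees:
        return math.log2(m / u)          # agreement weight
    return math.log2((1 - m) / (1 - u))  # disagreement weight

def composite_weight(record_a, record_b, probabilities):
    """Sum the weights over every field present in both records."""
    total = 0.0
    for field, (m, u) in probabilities.items():
        a, b = record_a.get(field), record_b.get(field)
        if a is None or b is None:
            continue  # field not available for comparison: adds nothing
        total += field_weight(m, u, a == b)
    return total

# Hypothetical (m, u) values, for illustration only.
PROBS = {"gender": (0.98, 0.50), "birth_date": (0.95, 0.003)}

ems = {"gender": "F", "birth_date": "1984-07-12"}
ed = {"gender": "F", "birth_date": "1984-07-12"}
other = {"gender": "F", "birth_date": "1990-01-30"}

print(composite_weight(ems, ed, PROBS))     # both fields agree: positive
print(composite_weight(ems, other, PROBS))  # birth date disagrees: negative
```

With these assumed values, agreement on gender adds less than one unit of weight, while agreement on birth date adds more than eight, which mirrors the point that fields carrying more information have more influence on the linkage decision.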

To improve computation time, both files are sorted on one or more data fields, called blocking variables, and comparisons are made only between records that agree on those fields. If an error occurs in a data field used for blocking, records that should match will never be compared. To account for this problem, records that fail to match are re-blocked on different data fields and subjected to further matching passes.
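
A minimal sketch of blocking follows. It groups records with a dictionary rather than sorting the files, which has the same effect of restricting comparisons to records that agree on the blocking variables. The function name, the field names, and the sample records are hypothetical.

```python
from collections import defaultdict

def candidate_pairs(file_a, file_b, blocking_fields):
    """Yield index pairs (i, j) whose records agree on every blocking field."""
    index = defaultdict(list)
    for j, rec in enumerate(file_b):
        key = tuple(rec.get(f) for f in blocking_fields)
        if None not in key:  # records missing a blocking value are skipped
            index[key].append(j)
    for i, rec in enumerate(file_a):
        key = tuple(rec.get(f) for f in blocking_fields)
        for j in index.get(key, []):
            yield i, j

# Hypothetical records: the first EMS birth date has two digits transposed.
ems = [
    {"birth_date": "1984-07-21", "zip": "84108"},
    {"birth_date": "1990-01-30", "zip": "84101"},
]
hospital = [
    {"birth_date": "1984-07-12", "zip": "84108"},
    {"birth_date": "1990-01-30", "zip": "84101"},
]

# Pass 1 blocks on birth date and never compares the mistyped record.
print(list(candidate_pairs(ems, hospital, ["birth_date"])))
# Pass 2 re-blocks on ZIP code and brings that pair up for comparison.
print(list(candidate_pairs(ems, hospital, ["zip"])))
```

In a full linkage run, only the records left unlinked after the first pass would be carried into the second; the sketch re-blocks both files in full to keep the example short.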

Based on the sizes of the databases being linked and the number of expected matches, researchers can relate the match weight for a pair of records to the probability that the records are correctly matched. Generally, only record pairs with a probability of at least 0.90 of being correct are linked and considered true matches.
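
One common way to make this conversion, shown here as an assumption rather than as the method of any specific linkage software, is Bayes' rule on the odds scale. The prior probability that a randomly chosen pair is a match is taken to be the expected number of matches divided by the number of possible pairs, and the composite weight (a base-2 log-likelihood ratio) scales the prior odds by 2 raised to the weight. The function names and the file sizes below are hypothetical.

```python
import math

def match_probability(weight, expected_matches, size_a, size_b):
    """Posterior probability of a true match, given a composite weight."""
    prior = expected_matches / (size_a * size_b)
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 2 ** weight
    return posterior_odds / (1 + posterior_odds)

def weight_threshold(probability, expected_matches, size_a, size_b):
    """Smallest composite weight that reaches the requested probability."""
    prior = expected_matches / (size_a * size_b)
    prior_odds = prior / (1 - prior)
    target_odds = probability / (1 - probability)
    return math.log2(target_odds / prior_odds)

# Hypothetical files: 1,000 records each, about 500 true matches expected.
cutoff = weight_threshold(0.90, 500, 1000, 1000)
print(cutoff)
print(match_probability(cutoff, 500, 1000, 1000))
```

Under this formulation, larger files or fewer expected matches lower the prior odds, so a pair needs a higher composite weight to reach the same 0.90 cutoff.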

Examples:

Probabilistic record linkage has been used on a national level to look at:

• The effects of seatbelts and motorcycle helmets on medical outcomes.1

The Intermountain Injury Control Research Center (housed with NEDARC) has used probabilistic linkage to study a variety of topics, including:

• Drivers with medical conditions.2
• Effect of wearing only a shoulder strap in a motor vehicle crash.3
• Older and teenage drivers as well as children involved in motor vehicle crashes.4-7
• Pediatric utilization of pre-hospital emergency medical services.8
• Injuries sustained in shop classes at public schools.9

Bibliography

