"" Using Data to Improve Care for Children EKG

Probabilistic Linkage

Combining multiple databases into one extensive database for analysis or linking multiple events...

Have you ever wondered what effect seatbelt usage has on the amount of money spent for hospital admission for crash victims? Or perhaps you want to know if size-appropriate splinting in the prehospital setting reduces hospital admissions, lengths of stay, and charges?

Using Existing Databases

The answers to questions like these are rarely contained in a single database. A researcher would therefore have to start from scratch, building a new database that follows patients from the point of splinting, to the emergency department, and finally to the decision on whether the patient was admitted. Building such a database can be expensive in both time and money. However, if you have access to existing databases, such as computerized EMS run reports and an emergency department or hospital discharge database, then probabilistic record linkage may be the tool for you.

Purpose

The purpose of probabilistic record linkage is to combine multiple databases into one extensive database for analysis. It can also be used to link multiple events within one database that refer to a single patient or individual.

Description

Probabilistic record linkage is accomplished by comparing data fields in two files, such as birth date or gender. The comparison of numerous data fields leads to a judgment of whether two records refer to the same patient and/or event (and should be linked). This judgment is based on the cumulative weight of agreement and disagreement among field values. The amount of information in a field affects the field's impact on whether two records should be linked. For instance, agreement on the gender field alone would not determine that two records refer to the same patient, but agreement on Social Security Number nearly guarantees that two records refer to the same individual. Probabilistic linkage software uses mathematical algorithms to determine whether two records should be linked based on the information in each record.

Technical Details

By assigning log-likelihood ratios to field comparisons, it is possible to computerize the judgment process. Let m_i equal the probability that the ith field agrees, given that the records are known to refer to the same person or event (a true match). Let u_i equal the probability that the ith field will agree by chance among records known not to match.

Then for a given pair of records, if field i agrees, the agreement weight is w_i = log2(m_i / u_i). If field i disagrees, a disagreement weight w_i = log2((1 - m_i) / (1 - u_i)) is assigned. The composite weight for a record pair is the sum of the agreement and disagreement weights for all fields available for comparison.
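The weight calculation above can be sketched in a few lines of Python. This is only an illustration of the formulas, not the code used by any particular linkage package. The field names and the m and u values in FIELD_PROBS are hypothetical, chosen to show that a low-information field such as gender contributes far less than birth date.

```python
import math

def field_weight(m, u, agrees):
    """Return the log2 weight for a single field comparison."""
    if agrees:
        return math.log2(m / u)            # agreement weight
    return math.log2((1 - m) / (1 - u))    # disagreement weight

# Hypothetical (m, u) values for illustration only; real values are
# estimated from the files being linked.
FIELD_PROBS = {
    "gender": (0.98, 0.50),
    "birth_date": (0.95, 0.01),
}

def composite_weight(record_a, record_b, field_probs=FIELD_PROBS):
    """Sum the weights over all fields available for comparison."""
    total = 0.0
    for field, (m, u) in field_probs.items():
        a, b = record_a.get(field), record_b.get(field)
        if a is None or b is None:
            continue  # field missing in one record, so it is not compared
        total += field_weight(m, u, a == b)
    return total
```

With these assumed values, agreement on gender adds a little under one unit of weight, while agreement on birth date adds more than six, which mirrors the point that fields carrying more information have more influence on the linkage decision.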

To reduce computation time, both files are sorted on one or several data fields, called blocking variables, and comparisons are made only between records that agree on those fields. The drawback is that if an error occurs in a data field used for blocking, records that should match are never compared. To account for this problem, records that fail to match are subjected to subsequent matching attempts after the files are re-blocked on different data fields.
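A minimal sketch of blocking with repeated passes is shown below. It groups records with a dictionary instead of sorting the files, which gives the same candidate pairs. The function names, the is_match callable (standing in for a comparison of the composite weight against a threshold), and the rule of accepting the first matching record in a block are all simplifying assumptions for illustration.

```python
from collections import defaultdict

def block_key(record, fields):
    """Build the blocking key, or None if any blocking field is missing."""
    key = tuple(record.get(f) for f in fields)
    return None if None in key else key

def link_with_reblocking(file_a, file_b, passes, is_match):
    """Link records over several blocking passes.

    passes   -- list of tuples of field names, one tuple per pass
    is_match -- hypothetical callable(record_a, record_b) -> bool
    Returns a dict mapping an index in file_a to an index in file_b.
    """
    links = {}
    used_b = set()
    for fields in passes:
        # Group the still-unlinked records of file_b by blocking key.
        blocks = defaultdict(list)
        for j, rec in enumerate(file_b):
            key = block_key(rec, fields)
            if j not in used_b and key is not None:
                blocks[key].append(j)
        # Compare each unlinked record of file_a only within its block.
        for i, rec in enumerate(file_a):
            if i in links:
                continue
            key = block_key(rec, fields)
            if key is None:
                continue
            for j in blocks.get(key, []):
                if j not in used_b and is_match(rec, file_b[j]):
                    links[i] = j
                    used_b.add(j)
                    break
    return links
```

Calling link_with_reblocking(file_a, file_b, [("birth_date",), ("zip",)], is_match) would first block on birth date and then retry the leftover records blocked on ZIP code, so a pair with a mistyped birth date still gets compared in the second pass.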

Based on the sizes of the databases being linked and the number of expected matches, researchers can relate the match weight for a pair of records to the probability that the records are correctly matched. Generally, only record pairs with a probability of at least 0.90 of being a correct match are linked and considered true matches.
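One common way to relate a match weight to a probability is a Bayes-rule calculation, sketched below. This is an assumption about how the conversion can be done, not a description of what any specific linkage software does. It treats the composite weight as a log2 likelihood ratio and takes the prior odds of a match from the file sizes and the expected number of matches. The function and parameter names are hypothetical.

```python
import math

def prior_odds(n_a, n_b, expected_matches):
    """Odds that a randomly chosen record pair is a true match."""
    total_pairs = n_a * n_b
    return expected_matches / (total_pairs - expected_matches)

def match_probability(weight, n_a, n_b, expected_matches):
    """Convert a composite weight (log2 scale) to a match probability."""
    # Posterior odds = prior odds * likelihood ratio, where the
    # likelihood ratio is 2 raised to the composite weight.
    odds = prior_odds(n_a, n_b, expected_matches) * 2.0 ** weight
    return odds / (1.0 + odds)

def weight_for_probability(prob, n_a, n_b, expected_matches):
    """Composite weight needed to reach a target match probability."""
    target_odds = prob / (1.0 - prob)
    return math.log2(target_odds / prior_odds(n_a, n_b, expected_matches))
```

Under these assumptions, weight_for_probability(0.90, n_a, n_b, expected_matches) gives the cutoff weight corresponding to the 0.90 threshold. Larger files with the same number of expected matches require a higher weight to reach the same probability, because a chance agreement becomes more likely.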

 

Examples:

Probabilistic record linkage has been used on a national level to look at:

  • The effects of seatbelts and motorcycle helmets on medical outcomes [1].
  • The Intermountain Injury Control Research Center (housed with NEDARC) has used probabilistic linkage to study a variety of topics including:
    • Drivers with medical conditions [2].
    • Effect of wearing only a shoulder strap in a motor vehicle crash [3].
    • Older and teenage drivers as well as children involved in motor vehicle crashes [4-7].
    • Pediatric utilization of pre-hospital emergency medical services [8].
    • Injuries sustained in shop classes at public schools [9].

Bibliography

1. Johnson SW, Walker J. The Crash Outcome Data Evaluation System (CODES). Washington DC: National Highway Traffic Safety Administration; 1996.

2. Diller E, Cook LJ, Leonard DR, Dean M, Reading JM, Vernon DD. Evaluating Drivers Licensed with Medical Conditions in Utah, 1992 – 1996. National Highway Traffic Safety Administration 1999 June; Report No. DOT HS 809 023.

3. Knight S, Cook LJ, Nechodom PJ, Olson LM, Reading JC, Dean JM. Improper Use of Shoulder Straps in Motor Vehicle Crashes: A Statewide Analysis of Restraint Efficacy. In Press; Accident Analysis and Prevention.

4. Cook LJ, Knight S, Olson LM, Nechodom PJ, Dean JM. Crash Characteristics and Medical Outcomes of Older Drivers in Motor Vehicle Crashes in Utah, 1992 – 1995. Annals of Emergency Medicine 2000;35(6):585-591.

5. Cvijanovich NZ, Cook LJ, Nechodom PJ, Dean JM. A Population-Based Study of Teenage Drivers: 1992-1996. 43rd Annual Proceedings Association for the Advancement of Automotive Medicine 1999;175-186.

6. Berg M, Cook LJ, Corneli H, Vernon D, Dean JM. Effect of Seating Position and Restraint Use on Injuries to Children in Motor Vehicle Crashes. Pediatrics 2000;105(4):831-835.

7. Corneli HM, Cook LJ, Dean JM. Adults and Children in Severe Motor Vehicle Crashes: A Matched-Pairs Study. Annals of Emergency Medicine 2000 Oct;36(4):340-345.

8. Suruda AJ, Vernon DD, Reading J, Cook LJ, Nechodom PJ, Leonard D, Dean JM. Pre-Hospital Emergency Medical Services: A Population-Based Study of Pediatric Utilization. Injury Prevention 1999;5(4):294-297.

9. Knight S, Junkins EP, Lightfoot AC, Cazier C, Olson LM. Injuries in School Shop Classes. Pediatrics 2000;106(1):10-13.

 


 

 

rev. 04-Aug-2022

 

 

 

© NEDARC 2010




This website is supported by the Health Resources and Services Administration (HRSA) of the U.S. Department of Health and Human Services (HHS) as part of the Emergency Medical Services for Children Data Center award totaling $3,200,000 with 0% financed with non-governmental sources. The contents are those of the author(s) and do not necessarily represent the official views of, nor an endorsement, by HRSA, HHS, or the U.S. Government. For more information, please visit HRSA.gov.

(In accordance with the Americans with Disabilities Act, the information in this site is
available in alternate formats upon request.)


08-Sep-2011