"" Using Data to Improve Care for Children EKG

Clean the Data Using a Predefined Specification

Once you’ve identified the problems in your dataset, the next step is to develop a cleaning routine. Applying the same routine every time helps you produce more reliable, consistent, and accurate results from your data.

A Typical Cleaning Routine

  1. Identify invalid data. Use your data quality standards and your key requirements to identify all invalid or inaccurate data.
  2. Investigate the reasons for the bad data. Understanding the causes will help you take the necessary actions to correct the data.
  3. Determine how the dirty data should be cleaned. Whenever possible, invalid data should be corrected so it can be used for your project.
  4. Perform accuracy tests to ensure the data were properly cleaned. An accuracy test is a physical comparison of the collected data with the actual event or object it describes.

    For example, you may want to compare the written run report with the electronic version that was recently entered into the database.
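
The routine above can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed method: the field names (run_id, patient_age, dispatch_time), the sample records, the validation rules, and the paper_reports lookup are all hypothetical and should be replaced with your own data quality standards.

```python
from datetime import datetime

# Hypothetical electronic records; every value here is made up for illustration.
records = [
    {"run_id": 1, "patient_age": "7", "dispatch_time": "2016-03-01 14:05"},
    {"run_id": 2, "patient_age": "-3", "dispatch_time": "2016-03-01 15:40"},
    {"run_id": 3, "patient_age": " 12 ", "dispatch_time": "03/02/2016 09:15"},
    {"run_id": 4, "patient_age": "", "dispatch_time": "2016-03-02 11:30"},
]

TIME_FORMAT = "%Y-%m-%d %H:%M"      # assumed standard format
ALT_TIME_FORMAT = "%m/%d/%Y %H:%M"  # assumed known non-standard format


def find_problems(record):
    """Step 1: list the fields in one record that break the assumed rules."""
    problems = []
    age = record["patient_age"]
    # Assumed rule: age is a whole number of years between 0 and 120.
    if not age.isdecimal() or not 0 <= int(age) <= 120:
        problems.append("patient_age")
    try:
        datetime.strptime(record["dispatch_time"], TIME_FORMAT)
    except ValueError:
        problems.append("dispatch_time")
    return problems


def clean(record):
    """Step 3: correct what can be corrected safely; leave the rest as is."""
    cleaned = dict(record)
    cleaned["patient_age"] = record["patient_age"].strip()
    try:
        parsed = datetime.strptime(record["dispatch_time"], ALT_TIME_FORMAT)
        cleaned["dispatch_time"] = parsed.strftime(TIME_FORMAT)
    except ValueError:
        pass  # already in the standard format, or not recoverable here
    return cleaned


def accuracy_mismatches(cleaned_records, source):
    """Step 4: compare cleaned records with the original source documents."""
    mismatches = []
    for rec in cleaned_records:
        expected = source.get(rec["run_id"])
        if expected is None:
            continue  # no source document available for this run
        for field, value in expected.items():
            if rec[field] != value:
                mismatches.append((rec["run_id"], field))
    return mismatches


cleaned = [clean(r) for r in records]
usable = [r for r in cleaned if not find_problems(r)]
# Step 2 is manual: these records need someone to investigate the cause.
needs_review = [r for r in cleaned if find_problems(r)]

# Hypothetical values transcribed from the written run reports.
paper_reports = {
    1: {"patient_age": "7", "dispatch_time": "2016-03-01 14:05"},
    3: {"patient_age": "12", "dispatch_time": "2016-03-02 09:15"},
}

print("Usable runs:", [r["run_id"] for r in usable])
print("Needs review:", [r["run_id"] for r in needs_review])
print("Mismatches:", accuracy_mismatches(usable, paper_reports))
```

In this sketch, records that cannot be corrected automatically are set aside for review instead of being deleted, which keeps them available once the reason for the bad data is understood.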

These steps may seem time-consuming, but they are worth every minute!

Next Step
Identify Methods to Minimize More Bad Data >>

rev. 29-Aug-2016
© NEDARC 2010