Establishing a sorting protocol for healthcare databases

 https://doi.org/10.4081/jphr.2021.1722
  • Elie Ghabi
    Faculty of Medicine, University of Balamand, Lebanon.
    https://orcid.org/0000-0002-9026-9681
  • Wehbeh Farah
    UEGP, Faculty of Sciences, Saint Joseph University of Beirut, Lebanon.
  • Maher Abboud
    UEGP, Faculty of Sciences, Saint Joseph University of Beirut, Lebanon.
  • Elias Chalhoub
    Medical Laboratory Sciences Department, Faculty of Health Sciences, University of Balamand, Lebanon.
  • Nelly Ziade
    Faculty of Medicine, Saint Joseph University of Beirut, Lebanon.
    https://orcid.org/0000-0002-4479-7678
  • Isabella Annesi-Maesano
    Institut Pierre Louis d’Epidémiologie et de Santé Publique, Equipe EPAR, Sorbonne Universités, Paris, France.
  • Laurie Abi-Habib
    Public Health Department, Faculty of Health Sciences, University of Balamand, Lebanon.
  • Myriam Mrad Nakhle
    Public Health Department, Faculty of Health Sciences, University of Balamand, Lebanon.
    https://orcid.org/0000-0003-0030-0700

ABSTRACT

Background: Health information records in many countries, especially developing countries, are still paper-based. Compared with electronic systems, paper-based systems are at a disadvantage in terms of data storage and data extraction. Given the importance of health records for epidemiological studies, guidelines for effective data cleaning and sorting are essential; they are, however, largely absent from the literature. This paper describes the process by which an algorithm was developed for cleaning and sorting a database generated from emergency department records in Lebanon.

Design and methods: Demographic and health-related information was extracted from the emergency department records of three hospitals in Beirut. Appropriate categories were selected for data categorization. For health information, disease categories and codes were selected according to the International Classification of Diseases, 10th Revision (ICD-10).

Results: A total of 16,537 entries were collected. Demographic information was categorized into groups for future epidemiological studies. Analysis of the health information led to the creation of a sorting algorithm, which was then used to categorize and code the health data. Several counts were then performed to represent and visualize the data numerically and graphically.

Conclusions: The article describes the current state of health information records in Lebanon and the disadvantages of a paper-based system in terms of storage and data extraction. It also describes the algorithm by which health information from paper records was sorted and categorized to allow for future data analysis.
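
The abstract does not spell out the sorting algorithm itself, so the following is only a minimal sketch of what keyword-based categorization of free-text diagnoses into ICD-10 chapters, followed by a count per category, might look like. The keyword lists, the `categorize` and `count_by_category` function names, and the "unclassified" fallback are all assumptions made for illustration; only the ICD-10 chapter code ranges and titles come from the classification itself.

```python
from collections import Counter

# Hypothetical rules: (ICD-10 chapter code range, chapter title, keywords).
# The keywords are illustrative assumptions, not the authors' actual list.
RULES = [
    ("J00-J99", "Diseases of the respiratory system",
     ("asthma", "pneumonia", "bronchitis")),
    ("I00-I99", "Diseases of the circulatory system",
     ("myocardial infarction", "hypertension", "stroke")),
]

# Fallback for entries that no rule matches (assumed to need manual review).
UNCLASSIFIED = ("unclassified", "Needs manual review")


def categorize(diagnosis):
    """Return (code range, chapter title) for a free-text diagnosis."""
    # Normalize case and collapse repeated whitespace before matching.
    text = " ".join(diagnosis.lower().split())
    for code_range, title, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return code_range, title
    return UNCLASSIFIED


def count_by_category(diagnoses):
    """Count entries per ICD-10 chapter code range."""
    return Counter(categorize(d)[0] for d in diagnoses)


if __name__ == "__main__":
    entries = ["Acute ASTHMA  attack", "Hypertension", "fracture of wrist", "pneumonia"]
    for code_range, n in count_by_category(entries).items():
        print(code_range, n)
```

A first-match rule list keeps the behavior easy to audit, and routing unmatched entries to an explicit fallback category rather than guessing makes it clear which records still need a human reviewer.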
