Running Head: HEALTH STUDY RESULTS AND FINDINGS SUMMARY 1
Study Topic: Combining Knowledge and Data Driven Insights for Identifying Risk Factors using Electronic Health Records
Professor’s Name:
Student’s Name:
Date:
The study discussed here describes the electronic health record (EHR) data and the other features used in the research, along with the evaluation metrics and the findings. The patient EHRs used in the study date from 2003 to 2010 and come from the Geisinger Health System (GHS). GHS is an integrated healthcare system that offers health care services in central and northeastern Pennsylvania. The health system includes forty-one community practice clinics serving over 400,000 patients, whose data were used in this study. From these EHRs, the study identified 4,644 heart failure (HF) cases based on clinical diagnosis. HF diagnosis was one of the many criteria used in deriving the results of this study. The diagnosis date was defined as the first appearance of an HF diagnosis in the EHR.
For each incident HF case, ten eligible clinic-, sex-, and age-matched controls were selected. Control patients were required to have their initial Geisinger Clinic office visit within a period of less than one year. The outpatient problem diagnosis list, the patients' lab results, and other health reconciliation documents were extracted for use in this study. In feature construction, an index date was anchored for every patient so that a feature vector could be constructed from the observation window. The observation window is the fixed-size time window immediately before the index date. For an HF case patient, the index date is the date of diagnosis. Longitudinal clinical events that fall within the observation window are summarized into statistical measures that make up the feature vector. As described in the system overview section of the study, every clinical event forms a feature.
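The observation-window idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's implementation: the event layout (date, feature name, value), the 365-day window length, the feature names, and the use of a plain mean for every feature are all hypothetical choices made for the example.

```python
from datetime import date, timedelta
from statistics import mean


def build_feature_vector(events, index_date, window_days=365):
    """Summarize a patient's events that fall inside the observation window.

    events: list of (event_date, feature_name, value) tuples (assumed layout).
    The window covers the window_days days immediately before index_date;
    the 365-day default is an assumption, not a value taken from the study.
    """
    window_start = index_date - timedelta(days=window_days)
    values_by_feature = {}
    for event_date, name, value in events:
        # Keep only events inside [window_start, index_date).
        if window_start <= event_date < index_date:
            values_by_feature.setdefault(name, []).append(value)
    # One statistical measure per feature; here simply the mean.
    return {name: mean(vals) for name, vals in values_by_feature.items()}


# Hypothetical events for one patient.
events = [
    (date(2009, 1, 10), "systolic_bp", 150.0),
    (date(2009, 6, 1), "systolic_bp", 140.0),
    (date(2007, 3, 5), "systolic_bp", 120.0),  # before the window, ignored
    (date(2009, 9, 9), "dx_hypertension", 1.0),
]
features = build_feature_vector(events, index_date=date(2010, 1, 1))
```

With these made-up events, the 2007 reading is dropped because it precedes the window, and the two readings from 2009 are averaged into a single value for the feature.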
Specifically, the values for a feature are derived from the patient's EHR records that fall within the observation window. For continuous events such as lab measures, an average is computed over the observation window after invalid outliers have been removed. Data-driven features are obtained by filtering out candidate features that are weakly correlated with the target variable. The results are evaluated quantitatively by validating the performance of the selected features. To this end, 5-fold cross-validation is run over the entire dataset, with the area under the ROC curve (AUC) as the performance metric. Within each fold, only the training data is processed during feature selection, so the testing data is held out for the AUC validation.
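The evaluation protocol in this paragraph (select features on the training part of each fold, then measure AUC on the held-out part) can be illustrated with a short, standard-library Python sketch. Several pieces are assumptions made for the example and are not taken from the study: the Pearson-correlation threshold of 0.1, the correlation-weighted sum used as a stand-in scoring model, and all function names.

```python
import random
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def auc(scores, labels):
    """Chance that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(X, y, k=5, min_abs_corr=0.1, seed=0):
    """Mean AUC over k folds, with feature selection done per training fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    fold_aucs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        y_train = [y[i] for i in train_idx]
        # Feature selection sees training rows only (threshold is an assumption).
        weights = {}
        for j in range(len(X[0])):
            r = pearson([X[i][j] for i in train_idx], y_train)
            if abs(r) >= min_abs_corr:
                weights[j] = r
        # Stand-in scorer (not the study's model): correlation-weighted sum.
        scores = [sum(w * X[i][j] for j, w in weights.items()) for i in test_idx]
        fold_aucs.append(auc(scores, [y[i] for i in test_idx]))
    return mean(fold_aucs)


# Synthetic data: feature 0 tracks the label, feature 1 is pure noise.
rng = random.Random(1)
y = [i % 2 for i in range(200)]
X = [[label + rng.gauss(0, 0.5), rng.gauss(0, 1)] for label in y]
mean_auc = cross_validated_auc(X, y)
```

Keeping the correlation filter inside the fold loop is the point of the sketch: if features were selected on the full dataset first, information from the test rows would leak into the selection and the reported AUC would be optimistic.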
First, the performance of the different knowledge-driven features was compared, with the objective of making the predictive power of common risk factors easier to understand. Next, the evaluation gradually added data-driven features in the order of the feature ranking computed by the SOR algorithm. For the comparison among knowledge-driven features, three primary knowledge-driven feature sets were picked based on their prevalence in HF cases. In this comparison, the hypertension and coronary disease features performed better than the diabetes features. As expected, all comorbidities combined performed better than any individual feature set. Knowledge-driven and data-driven features were then combined according to the feature ranking. Currently, all knowledge-driven features are treated as highly trusted and are therefore all included in the model. This is one limitation of the study: for reasons such as data quality, some knowledge-driven features may not be predictive. Overall, the study presents a systematic framework through which knowledge-driven and data-driven features are combined for the identification of risk factors.
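The incremental step described above, where all knowledge-driven features are kept and ranked data-driven features are appended one at a time, might be sketched as follows. The SOR ranking itself is not implemented here; the ranked list is simply taken as an input. The function name, the `evaluate` callback interface, the feature names, and the toy gain values are all hypothetical and are not results from the study.

```python
def incremental_evaluation(knowledge_features, ranked_data_features, evaluate, step=1):
    """Return (n_added, score) pairs as ranked data-driven features are appended.

    evaluate: any callable that takes a list of feature names and returns a
    score, for example a cross-validated AUC (assumed interface).
    """
    # Baseline: knowledge-driven features only.
    results = [(0, evaluate(list(knowledge_features)))]
    for n in range(step, len(ranked_data_features) + 1, step):
        # Always keep every knowledge-driven feature, then add the top n ranked ones.
        selected = list(knowledge_features) + list(ranked_data_features[:n])
        results.append((n, evaluate(selected)))
    return results


# Toy stand-in for a real evaluation; the gains below are arbitrary numbers.
toy_gain = {"hypertension": 0.10, "diabetes": 0.05,
            "lab_a": 0.04, "dx_b": 0.02, "med_c": 0.0}


def toy_evaluate(features):
    return round(0.5 + sum(toy_gain[f] for f in features), 2)


curve = incremental_evaluation(
    ["hypertension", "diabetes"],   # knowledge-driven, always included
    ["lab_a", "dx_b", "med_c"],     # data-driven, assumed already ranked
    toy_evaluate,
)
```

Plotting the resulting pairs would show where additional data-driven features stop improving the score. Because the knowledge-driven features are never dropped in this scheme, the sketch also reflects the limitation noted above.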
In my view, the study first of all appears to have complied fully with the HIPAA Act. Surveys and research that use patients' health data must understand and follow the data privacy requirements that protect personal health information (PHI). The use of patients' data provides the starting point and the foundation for the study's subsequent findings. As noted, the diagnosis records kept by health care institutions remain an essential tool for future assessment. In this study, the diagnosis records are critical to developing the feature vector, and the performance and risk assessment procedures rely on comparing past health records. In my assessment, the study succeeds in developing features that can be ranked, and the SOR algorithm used to compute the feature ranking supports the accuracy and reliability of its results.
References
Sun, J. (2012). Combining knowledge and data driven insights for identifying risk factors using electronic health records. PubMed Central (PMC).
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3540578/