David Carrell, PhD, is an assistant investigator who develops and applies technology for extracting rich information from unstructured clinical text, such as physician progress notes. This work uses state-of-the-art clinical natural language processing (NLP) technologies in single- and multi-site settings.
An example of this work is an NLP system to identify women who have been diagnosed with recurrent breast cancer. Although it is a common and consequential clinical diagnosis, recurrent breast cancer cannot be tracked reliably using the standard medical codes found in a person’s chart. Supported by a grant from the National Cancer Institute, Dr. Carrell and his colleagues used information from clinician progress notes, radiology reports, and pathology reports to classify women by breast cancer recurrence.
Working with teams of researchers inside and outside Kaiser Permanente Washington Health Research Institute, Dr. Carrell has applied similar precision phenotyping methods to identify evidence of carotid artery stenosis, colon polyps, problem use of prescription opioids, and colonoscopy quality.
Dr. Carrell’s current research projects apply NLP and machine learning methods to improve medication safety surveillance (through the Food and Drug Administration Sentinel Initiative) and to evaluate the impact of screening for unhealthy cannabis and other drug use on the diagnosis and treatment of drug use disorders among Kaiser Permanente Washington patients. His ongoing work also includes developing and applying automated algorithms based on electronic health record data to identify patients with particular health conditions (called “patient phenotypes”) for use in genetic and epidemiological research.
Surveillance methods for adverse events associated with medication exposure, including problem use of prescription opioids
Methods for using structured and unstructured electronic health record data to identify patients with (or without) specific clinical conditions or phenotypes for large scale epidemiological and genomic studies
Identifying recurrent breast cancer using EHR text; Colonoscopy quality metrics
Recurrent breast cancer; Colonoscopy quality; Extracting information from clinical text; Automated de-identification of clinical text; Methods for applying NLP in multi-site research
Prevention and treatment
Mercaldo ND, Brothers KB, Carrell DS, Clayton EW, Connolly JJ, Holm IA, Horowitz CR, Jarvik GP, Kitchner TE, Li R, McCarty CA, McCormick JB, McManus VD, Myers MF, Pankratz JJ, Shrubsole MJ, Smith ME, Stallings SC, Williams JL, Schildcrout JS. Enrichment sampling for a multi-site patient survey using electronic health records and census data. J Am Med Inform Assoc. 2019 Mar 1;26(3):219-227. doi: 10.1093/jamia/ocy164.
Liu Y, Wan Z, Xia W, Kantarcioglu M, Vorobeychik Y, Clayton EW, Kho A, Carrell D, Malin BA. Detecting the presence of an individual in phenotypic summary data. AMIA Annu Symp Proc. 2018 Dec 5;2018:760-769. eCollection 2018.
Mosley JD, Benson MD, Smith JG, Melander O, Ngo D, Shaffer CM, Ferguson JF, Herzig MS, McCarty CA, Chute CG, Jarvik GP, Gordon AS, Palmer MR, Crosslin DR, Larson EB, Carrell DS, Kullo IJ, Pacheco JA, Peissig PL, Brilliant MH, Kitchner TE, Linneman JG, Namjou B, Williams MS, Ritchie MD, Borthwick KM, Kiryluk K, Mentch FD, Sleiman PM, Karlson EW, Verma SS, Zhu Y, Vasan RS, Yang Q, Denny JC, Roden DM, Gerszten RE, Wang TJ. Probing the virtual proteome to identify novel disease biomarkers. Circulation. 2018;138(22):2469-2481. doi: 10.1161/CIRCULATIONAHA.118.036063.
Hall TO, Stanaway IB, Carrell DS, Carroll RJ, Denny JC, Hakonarson H, Larson EB, Mentch FD, Peissig PL, Pendergrass SA, Rosenthal EA, Jarvik GP, Crosslin DR. Unfolding of hidden white blood cell count phenotypes for gene discovery using latent class mixed modeling. Genes Immun. 2019 Sep;20(7):555-565. doi: 10.1038/s41435-018-0051-y. Epub 2018 Nov 21.
Ezaz G, Leffler DA, Beach S, Schoen RE, Crockett SD, Gourevitch RA, Rose S, Morris M, Carrell DS, Greer JB, Mehrotra A. Association between endoscopist personality and rate of adenoma detection. Clin Gastroenterol Hepatol. 2018 Oct 13. pii: S1542-3565(18)31140-6. doi: 10.1016/j.cgh.2018.10.019. [Epub ahead of print].
Stanaway IB, Hall TO, Rosenthal EA, Palmer M, Naranbhai V, Knevel R, Namjou-Khales B, Carroll RJ, Kiryluk K, Gordon AS, Linder J, Howell KM, Mapes BM, Lin FTJ, Joo YY, Hayes MG, Gharavi AG, Pendergrass SA, Ritchie MD, de Andrade M, Croteau-Chonka DC, Raychaudhuri S, Weiss ST, Lebo M, Amr SS, Carrell D, Larson EB, Chute CG, Rasmussen-Torvik LJ, Roy-Puckelwartz MJ, Sleiman P, Hakonarson H, Li R, Karlson EW, Peterson JF, Kullo IJ, Chisholm R, Denny JC, Jarvik GP; eMERGE Network, Crosslin DR. The eMERGE genotype set of 83,717 subjects imputed to ~40 million variants genome wide and association with the herpes zoster medical record phenotype. Genet Epidemiol. 2019 Feb;43(1):63-81. doi: 10.1002/gepi.22167. Epub 2018 Oct 8.
Crockett SD, Gourevitch RA, Morris M, Carrell DS, Rose S, Shi Z, Greer JB, Schoen RE, Mehrotra A. Serrated polyp detection is related to training and colonoscopy volume: results from a multicenter study. Endoscopy. 2018 Oct;50(10):984-992.