David Carrell, PhD, is an assistant investigator who develops and applies technology for extracting rich information from unstructured clinical text, such as physician progress notes. This work uses state-of-the-art clinical natural language processing (NLP) technologies in single- and multi-site settings.
An example of this work is an NLP system to identify women who have been diagnosed with recurrent breast cancer. Despite being a common and consequential clinical diagnosis, recurrent breast cancer cannot be tracked reliably using standard medical codes found in a person’s chart. Supported by a grant from the National Cancer Institute, he and his colleagues used information from clinician progress notes, radiology reports, and pathology reports to classify women by breast cancer recurrence.
Working with teams of researchers inside and outside Kaiser Permanente Washington Health Research Institute, Dr. Carrell has applied similar precision phenotyping methods to identify evidence of carotid artery stenosis, colon polyps, problem use of prescription opioids, and colonoscopy quality.
Dr. Carrell’s current research projects apply NLP and machine learning methods to improve medication safety surveillance (through the Food and Drug Administration Sentinel Initiative) and to evaluate how screening Kaiser Permanente Washington patients for unhealthy cannabis and other drug use affects the diagnosis and treatment of drug use disorders. His ongoing work also includes developing and applying automated algorithms based on electronic health record data to identify patients with particular health conditions (called “patient phenotypes”) for use in genetic and epidemiological research.
Surveillance methods for adverse events associated with medication exposure, including problem use of prescription opioids
Methods for using structured and unstructured electronic health record data to identify patients with (or without) specific clinical conditions or phenotypes for large scale epidemiological and genomic studies
Identifying recurrent breast cancer using EHR text; Colonoscopy quality metrics
Recurrent breast cancer; Colonoscopy quality; Extracting information from clinical text; Automated de-identification of clinical text; Methods for applying NLP methods in multi-site research
Prevention and treatment
Joo YY, Pacheco JA, Thompson WK, Rasmussen-Torvik LJ, Rasmussen LV, Lin FTJ, Andrade M, Borthwick KM, Bottinger E, Cagan A, Carrell DS, Denny JC, Ellis SB, Gottesman O, Linneman JG, Pathak J, Peissig PL, Shang N, Tromp G, Veerappan A, Smith ME, Chisholm RL, Gawron AJ, Hayes MG, Kho AN. Multi-ancestry genome- and phenome-wide association studies of diverticular disease in electronic health records with natural language processing enriched phenotyping algorithm. PLoS One. 2023 May 17;18(5):e0283553. doi: 10.1371/journal.pone.0283553. eCollection 2023.
Pacheco JA, Rasmussen LV, Wiley K Jr, Person TN, Cronkite DJ, Sohn S, Murphy S, Gundelach JH, Gainer V, Castro VM, Liu C, Mentch F, Lingren T, Sundaresan AS, Eickelberg G, Willis V, Furmanchuk A, Patel R, Carrell DS, Deng Y, Walton N, Satterfield BA, Kullo IJ, Dikilitas O, Smith JC, Peterson JF, Shang N, Kiryluk K, Ni Y, Li Y, Nadkarni GN, Rosenthal EA, Walunas TL, Williams MS, Karlson EW, Linder JE, Luo Y, Weng C, Wei W. Evaluation of the portability of computable phenotypes with natural language processing in the eMERGE network. Sci Rep. 2023 Feb 3;13(1):1971. doi: 10.1038/s41598-023-27481-y.
Brandt PS, Kho A, Luo Y, Pacheco JA, Walunas TL, Hakonarson H, Hripcsak G, Liu C, Shang N, Weng C, Walton N, Carrell DS, Crane PK, Larson EB, Chute CG, Kullo IJ, Carroll R, Denny J, Ramirez A, Wei WQ, Pathak J, Wiley LK, Richesson R, Starren JB, Rasmussen LV. Characterizing variability of electronic health record-driven phenotype definitions. J Am Med Inform Assoc. 2022 Dec 6;ocac235. doi: 10.1093/jamia/ocac235. Online ahead of print.
Carrell DS, Gruber S, Floyd JS, Bann MA, Cushing-Haugen KL, Johnson RL, Graham V, Cronkite DJ, Hazlehurst BL, Felcher AH, Bejan CA, Kennedy A, Shinde M, Karami S, Ma Y, Stojanovic D, Zhao Y, Ball R, Nelson J. Improving methods of identifying anaphylaxis for medical product safety surveillance using natural language processing and machine learning. Am J Epidemiol. 2022 Nov 4:kwac182. doi: 10.1093/aje/kwac182. [Epub ahead of print].
Shinde M, Rodriguez-Watson C, Zhang TC, Carrell DS, Mendelsohn AB, Nam YH, Carruth A, Petronis KR, McMahill-Walraven CN, Jamal-Allial A, Nair V, Pawloski PA, Hickman A, Brown MT, Francis J, Hornbuckle K, Brown JS, Mo J. Patient characteristics, pain treatment patterns, and incidence of total joint replacement in a US population with osteoarthritis. BMC Musculoskelet Disord. 2022 Sep 23;23(1):883. doi: 10.1186/s12891-022-05823-7.
Floyd JS, Bann MA, Felcher AH, Sapp D, Nguyen MD, Ajao A, Ball R, Carrell DS, Nelson JC, Hazlehurst B. Validation of acute pancreatitis among adults in an integrated healthcare system. Epidemiology. 2023 Jan 1;34(1):33-37. doi: 10.1097/EDE.0000000000001541. Epub 2022 Aug 25.
Penfold RB, Carrell DS, Cronkite DJ, Pabiniak C, Dodd T, Glass AM, Johnson E, Thompson E, Arrighi HM, Stang PE. Development of a machine learning model to predict mild cognitive impairment using natural language processing in the absence of screening. BMC Med Inform Decis Mak. 2022 May 12;22(1):129. doi: 10.1186/s12911-022-01864-z.