David Carrell, PhD

“My work uses computers to mine and analyze information about patients’ health from the millions of clinical notes Kaiser Permanente Washington doctors and nurses write about their patients in a typical year.”

Associate Investigator, Kaiser Permanente Washington Health Research Institute
Affiliate Associate Professor, Dept. of Biomedical Informatics and Medical Education, University of Washington School of Medicine


David Carrell, PhD, is an associate investigator who develops and applies technology for extracting rich information from unstructured clinical text, such as physician progress notes. This work uses state-of-the-art clinical natural language processing (NLP) technologies in single- and multi-site settings.

An example of this work is an NLP system to identify women who have been diagnosed with recurrent breast cancer. Despite being a common and consequential clinical diagnosis, recurrent breast cancer cannot be tracked reliably using standard medical codes found in a person’s chart. Supported by a grant from the National Cancer Institute, Dr. Carrell and his colleagues used information from clinician progress notes, radiology reports, and pathology reports to classify women by breast cancer recurrence status.

Working with teams of researchers inside and outside Kaiser Permanente Washington Health Research Institute, Dr. Carrell has applied similar precision phenotyping methods to identify evidence of carotid artery stenosis, colon polyps, problem use of prescription opioids, and colonoscopy quality.

Dr. Carrell’s current research projects apply NLP and machine learning methods to improve medication safety surveillance (through the Food and Drug Administration Sentinel Initiative) and to evaluate the impact on the diagnosis and treatment of drug use disorders among Kaiser Permanente Washington patients screened for unhealthy cannabis and other drug use. His ongoing work also includes developing and applying automated algorithms based on electronic health record data to identify patients with particular health conditions (called “patient phenotypes”) for use in genetic and epidemiological research.


  • Medication Use & Patient Safety

    Surveillance methods for adverse events associated with medication exposure, including problem use of prescription opioids

  • Health Informatics

    Methods for using structured and unstructured electronic health record data to identify patients with (or without) specific clinical conditions or phenotypes for large scale epidemiological and genomic studies

  • Cancer and Cancer Screening

    Identifying recurrent breast cancer using EHR text; Colonoscopy quality metrics

  • Clinical Natural Language Processing

    Recurrent breast cancer; Colonoscopy quality; Extracting information from clinical text; Automated de-identification of clinical text; Methods for applying NLP in multi-site research

  • Addictions

    Prevention and treatment

  • Substance Use Disorders

  • Mental Health

  • Pharmacoepidemiology


  • Clinical Text De-identification

Recent publications

Wald A, Carrell D, Remington M, Kexel E, Zeh J, Corey L. Two-day regimen of acyclovir for treatment of recurrent genital herpes simplex virus type 2 infection. Clin Infect Dis. 2002 Apr 1;34(7):944-8. Epub 2002 Feb 20. PubMed

Wilkerson JD, Carrell D. Money, politics, and medicine: the American Medical PAC's strategy of giving in U.S. House races. J Health Polit Policy Law. 1999 Apr;24(2):335-55. PubMed

Bowen DJ, Kestin M, McTiernan A, Carrell D, Green P. Effects of dietary fat intervention on mental health in women. Cancer Epidemiol Biomarkers Prev. 1995 Jul-Aug;4(5):555-9. PubMed

Bowen DJ, Urban N, Carrell D, Kinne S. Comparisons of strategies to prevent breast cancer mortality. J Soc Issues. 1993 Summer;49(2):35-60. PubMed

Carrell D. Whither the Revolution? The Tocqueville Review. 1987, 8:39-92.

Rosenthal E, Jarvik GP, Crosslin DR, Gordon S, Carrell D, Stanaway IB, Larson EB, Grafton J, Wei WQ, Denny JC, Shah A, Ritchie M, Hakonarson H, Rasmussen-Torvik LJ, Connolly JJ, Sturm A, Feng Q, Kullo IJ. Association between triglycerides, known risk SNVs, and conserved rare variation in SLC25A40 in a multi-ancestry cohort. BMC Med Genomics. 2021 Jan 6;14(1):11. doi: 10.1186/s12920-020-00854-2. PubMed

Suri P, Stanaway IB, Zhang Y, Freidin MB, Tsepilov YA, Carrell DS, Williams FMK, Aulchenko YS, Hakonarson H, Namjou B, Crosslin DR, Jarvik GP, Lee MT. Genome-wide association studies of low back pain and lumbar spinal disorders using electronic health record data identify a locus associated with lumbar spinal stenosis. Pain. 2021 Aug 1;162(8):2263-2272. doi: 10.1097/j.pain.0000000000002221. PubMed



