Clinically reported covert cerebrovascular disease and risk of neurological disease: a whole-population cohort of 395,273 people using natural language processing


Abstract

Background

The relevance of covert cerebrovascular disease (CCD) in practice is uncertain, in part because estimation of risk in whole clinical populations is difficult. To address this gap, we measured clinically reported CCD in a large clinical cohort using natural language processing (NLP) and estimated subsequent disease risk in linked health record data.

Methods

From all people with brain imaging in Scotland from 2010 to 2018, we selected people with no prior hospitalisation for neurological disease. Four phenotypes were identified with NLP of imaging reports: white matter hypoattenuation or hyperintensities (WMH), lacunes, cortical infarcts and cerebral atrophy. For each phenotype, we calculated adjusted hazard ratios (aHR) for stroke, dementia and Parkinson’s disease (conditions previously associated with CCD), epilepsy (a brain-based control condition) and colorectal cancer (a non-brain control condition). Models were adjusted for age, sex, deprivation, region, scan modality and pre-scan healthcare.
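As an illustration of how a hazard ratio and its 95% confidence interval are usually reported, the sketch below converts a log-hazard coefficient and its standard error, as produced by a proportional hazards model, into an aHR with interval. It assumes a Cox-type model and a normal approximation; the function name and the input values are hypothetical and are not taken from the study.

```python
import math


def hazard_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-hazard coefficient and its standard error
    into a hazard ratio with an approximate 95% confidence interval."""
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return hr, lower, upper


# Hypothetical coefficient and standard error, chosen only for illustration.
hr, lower, upper = hazard_ratio_with_ci(beta=0.47, se=0.03)
print(f"aHR {hr:.1f} (95% CI: {lower:.1f}-{upper:.1f})")
```

Because the interval is symmetric on the log scale, it is asymmetric around the hazard ratio itself once exponentiated.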

Findings

Of 395,273 people with brain imaging and no history of neurological disease, 145,978 (37%) had ≥1 phenotype. For each phenotype, the aHR of any stroke was: WMH 1·4 (95%CI: 1·3–1·4), lacunes 1·6 (1·5–1·6), cortical infarct 1·7 (1·6–1·8) and cerebral atrophy 1·1 (1·0–1·1). The aHR of any dementia was: WMH 1·3 (1·3–1·3), lacunes 1·0 (0·9–1·0), cortical infarct 1·1 (1·0–1·1) and cerebral atrophy 1·7 (1·7–1·7). The aHR of Parkinson’s disease was: WMH 1·1 (1·0–1·2), lacunes 1·1 (0·9–1·2), cortical infarct 0·7 (0·6–0·9) and cerebral atrophy 1·4 (1·3–1·5). For epilepsy and colorectal cancer, the confidence intervals of the aHRs for CCD phenotypes overlapped the null.
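The reported proportion and the reading of intervals against the null can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in the Findings; the helper name `excludes_null` is hypothetical. The published bounds are rounded to one decimal place, so a bound of exactly 1·0 is ambiguous, and the sketch treats it as not excluding the null.

```python
def excludes_null(lower, upper, null=1.0):
    """Return True when the interval lies strictly on one side of the null."""
    return lower > null or upper < null


# Share of the cohort with at least one phenotype, from the reported counts.
share = 145_978 / 395_273
print(f"{share:.1%} of the cohort had at least one phenotype")

# A few of the reported 95% CIs (rounded, as published in the Findings).
reported = {
    ("any stroke", "cortical infarct"): (1.6, 1.8),
    ("any dementia", "lacunes"): (0.9, 1.0),
    ("Parkinson's disease", "cortical infarct"): (0.6, 0.9),
}
for (outcome, phenotype), (lower, upper) in reported.items():
    verdict = "excludes" if excludes_null(lower, upper) else "does not exclude"
    print(f"{phenotype} -> {outcome}: CI {lower}-{upper} {verdict} the null")
```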

Interpretation

NLP identified CCD and atrophy phenotypes from routine clinical image reports, and these had important associations with future stroke, dementia and Parkinson’s disease. Prevention of neurological disease in people with CCD should be a priority for healthcare providers and policymakers.

Funding

The Chief Scientist’s Office, the Medical Research Council, the Alzheimer’s Society, Health Data Research UK, the Wellcome Trust, Research Data Scotland, MQ – Transforming Mental Health, The Alan Turing Institute, the National Institute for Health Research, and the Stroke Association.

Research in context

Evidence before this study

Systematic reviews of magnetic resonance imaging (MRI) in cohort studies show that people with asymptomatic, covert cerebrovascular disease (CCD) have an increased risk of stroke, dementia and Parkinson’s disease. Covert brain infarcts identified with natural language processing (NLP) show similar associations in one US-based insurance system. However, there is a lack of data on clinically reported CCD from whole populations.

Added value of this study

This study used a validated NLP algorithm to identify CCD and cerebral atrophy from both MRI and computed tomography (CT) imaging reports generated during routine healthcare in more than 395,000 people in Scotland. It distinguished between three CCD phenotypes – white matter hypoattenuation/hyperintensities, lacunes and cortical infarcts – and cerebral atrophy, and estimated their associations with stroke, dementia and their subtypes. In adjusted models, the risk of dementia (particularly Alzheimer’s disease) was higher in people with atrophy, and the risk of stroke was higher in people with cortical infarcts, which highlights differential associations across phenotypes. Associations with an age-associated control outcome (colorectal cancer) were neutral, which supports a causal relationship.

Implications of all the available evidence

CCD or atrophy on brain imaging reports in routine clinical practice is associated with a higher risk of stroke or dementia. Evidence is needed to support treatment strategies to reduce this risk. NLP can identify these important, otherwise uncoded, disease phenotypes, allowing research at scale into imaging-based biomarkers of dementia and stroke.
