Agent Orange Exposure Increases Lymphoma Risk in Million Veteran Program Cohort
TOPLINE: Agent Orange exposure was associated with a 26% to 71% increased risk for multiple lymphoid cancers in veterans enrolled in the US Department of Veterans Affairs (VA) Million Veteran Program (MVP), while genetic predisposition independently raised risk by 12% to 81% across different lymphoma subtypes. A case-control analysis of 255,155 veterans found no significant interaction between genetic risk scores and Agent Orange exposure.
METHODOLOGY:
A case-control study included 255,155 non-Hispanic White veterans (median age 67 years, 92.5% male) enrolled in the VA MVP with genotype and Agent Orange exposure data.
Researchers analyzed five lymphoid malignant neoplasm subtypes: chronic lymphocytic leukemia, diffuse large B-cell lymphoma, follicular lymphoma, marginal zone lymphoma, and multiple myeloma diagnosed from January 1965 through June 2024.
Agent Orange exposure was determined through self-reported survey responses, while polygenic risk scores were derived from genome-wide association studies of lymphoid malignant neoplasms.
Analysis included adjustments for age at enrollment, sex, and the first 10 genetic principal components in logistic regression models evaluating Agent Orange exposure, polygenic risk scores, and their potential interaction.
TAKEAWAY:
Agent Orange exposure significantly increased risk for chronic lymphocytic leukemia (odds ratio [OR], 1.61; 95% CI, 1.40-1.84), diffuse large B-cell lymphoma (OR, 1.26; 95% CI, 1.03-1.53), follicular lymphoma (OR, 1.71; 95% CI, 1.39-2.11), and multiple myeloma (OR, 1.58; 95% CI, 1.35-1.86).
Polygenic risk scores were independently associated with all lymphoma subtypes, with strongest associations for chronic lymphocytic leukemia (OR, 1.81; 95% CI, 1.70-1.93) and multiple myeloma (OR, 1.41; 95% CI, 1.31-1.52).
Analysis in African American participants showed similar associations for multiple myeloma with both Agent Orange exposure (OR, 1.56; 95% CI, 1.18-2.07) and polygenic risk scores (OR, 1.31; 95% CI, 1.15-1.49).
According to the researchers, no significant polygenic risk score and Agent Orange exposure interactions were observed for any lymphoma subtype.
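The percentage range quoted in the topline can be recovered from the odds ratios listed above. The short sketch below shows that arithmetic; the function name is illustrative and not taken from the study, and the conversion describes a change in odds, which approximates a change in risk only when the outcome is rare.

```python
def percent_change_from_ratio(ratio: float) -> float:
    """Convert an odds ratio into a percent change in odds relative to the reference group."""
    return (ratio - 1.0) * 100.0


# Odds ratios for Agent Orange exposure as reported in the takeaway above.
reported_odds_ratios = {
    "chronic lymphocytic leukemia": 1.61,
    "diffuse large B-cell lymphoma": 1.26,
    "follicular lymphoma": 1.71,
    "multiple myeloma": 1.58,
}

# Round to whole percentages, as the topline does.
percent_increase = {
    subtype: round(percent_change_from_ratio(odds_ratio))
    for subtype, odds_ratio in reported_odds_ratios.items()
}

lowest = min(percent_increase.values())
highest = max(percent_increase.values())
print(f"Increase in odds across subtypes: {lowest}% to {highest}%")
```

The smallest and largest values correspond to diffuse large B-cell lymphoma and follicular lymphoma, respectively.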
IN PRACTICE: "Our study addressed the public health concerns surrounding Agent Orange exposure and lymphoid malignant neoplasms, finding that both Agent Orange exposure and polygenic risk are independently associated with disease, suggesting potentially distinct and additive pathways that merit further investigation," wrote the authors of the study.
SOURCE: The study was led by researchers at the University of California, Irvine, and the Tibor Rubin Veterans Affairs Medical Center, Long Beach, California, and was published online on August 13 in JAMA Network Open.
LIMITATIONS: According to the authors, while this represents the largest case-control study of Agent Orange exposure and lymphoid malignant neoplasm risk, the power to detect interaction associations in specific subtypes might be limited. Self-reported Agent Orange exposure data may have introduced survival bias, particularly in aggressive subtypes, as patients with aggressive tumors may have died before joining the MVP. Additionally, about half of the patients were diagnosed with lymphoid malignant neoplasms before self-reporting Agent Orange exposure, potentially introducing recall bias.
DISCLOSURES: The research was supported by a Veterans Affairs Career Development Award. Xueyi Teng, PhD, received grants from the George E. Hewitt Foundation for Medical Research Postdoc Fellowship during the study.
This article was created using several editorial tools, including AI, as part of the process. Human editors reviewed this content before publication.
Lower Cancer Risk in Veterans With COVID-19 Infection
TOPLINE: COVID-19 infection is associated with a 25% reduction in cancer risk over 3 years among veterans who survived the initial infection. This protective effect was observed across sexes and racial groups, with stronger benefits seen in older patients and those with mild disease.
METHODOLOGY:
Researchers conducted a retrospective cohort study comparing veterans who tested positive for COVID-19 between March 15, 2020, and November 30, 2020, with those who tested negative.
Analysis included 499,396 veterans, with 88,590 (17.2%) COVID-19 positive and 427,566 (82.8%) COVID-19 negative patients, with mean (SD) ages of 57.9 (16.4) and 59.5 (15.8) years, respectively.
Investigators used Cox proportional hazards regression models to estimate the hazard ratio for a new cancer diagnosis within a 3-year follow-up period.
Covariates in the analysis included age, race, ethnicity, sex, BMI, smoking status, and various comorbidities.
TAKEAWAY:
For patients surviving ≥ 30 days after COVID-19 testing, infection was associated with a 25% reduction in cancer hazard (hazard ratio [HR], 0.75; 95% CI, 0.73-0.77).
The reduction in cancer risk was similar across sexes and races, with the exception of Asian patients, and grew larger with advancing age above 45 years.
Patients with mild COVID-19 showed the strongest reduction in cancer risk (adjusted HR, 0.72; 95% CI, 0.70-0.74), while those with moderate COVID-19 showed an 11% reduction (adjusted HR, 0.89; 95% CI, 0.83-0.93), and severe COVID-19 showed no significant reduction in cancer risk.
IN PRACTICE: "Regarding age, the incidence of cancer appeared to decrease with each decade of life in the COVID-19 group compared to that in the non-exposed group," the authors noted. "This is surprising, given that cancer diagnoses typically increase with age."
SOURCE: The study was led by researchers at the Miami Veterans Affairs (VA) Healthcare System Geriatric Research, Education, and Clinical Center and was published online on August 25 in PLoS One.
LIMITATIONS: The findings of this retrospective and observational study should be interpreted with caution. Results may not be generalizable beyond the predominantly male, older veteran population. The 3-year follow-up period may be insufficient to fully understand long-term cancer incidence patterns. Researchers could not capture all COVID-19 reinfection cases due to testing occurring outside the Veterans Affairs system, including at-home testing. The impact of vaccination status and reinfection on cancer risk could not be fully assessed, as the initial study cohort was grouped prior to vaccine availability.
DISCLOSURES: The authors report no financial support was received for this study and declare no competing interests.
This article was created using several editorial tools, including AI, as part of the process. Human editors reviewed this content before publication.
LLMs Show High Accuracy in Extracting CRC Data From VA Health Records
TOPLINE: Large language models (LLMs) achieve more than 95% accuracy in extracting colorectal cancer and dysplasia diagnoses from Veterans Health Administration (VHA) pathology reports, including those of patients with Million Veteran Program (MVP) genomic data. The validated approach, which uses publicly available LLMs, performs well in both inflammatory bowel disease (IBD) and non-IBD populations.
METHODOLOGY:
Researchers analyzed 116,373 pathology reports generated in the VHA between 1999 and 2024, utilizing search term filtering followed by simple yes/no question prompts for identifying colorectal dysplasia, high-grade dysplasia and/or colorectal adenocarcinoma, and invasive colorectal cancer.
Results were compared to blinded manual chart review of 200 to 300 pathology reports for each patient cohort and diagnostic task, totaling 3,816 reviewed reports, to validate the LLM approach.
Validation was performed independently in IBD and non-IBD populations using Gemma-2 and Llama-3 LLMs without any task-specific training or fine-tuning.
Performance metrics included F1 scores, positive predictive value, negative predictive value, sensitivity, specificity, and Matthews correlation coefficient to evaluate accuracy across different tasks.
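For readers less familiar with these metrics, the sketch below shows how each one is computed from a 2x2 confusion matrix comparing LLM answers with manual chart review. The counts are hypothetical and chosen only for illustration; they are not data from the study, and the function name is likewise an assumption.

```python
import math


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts.

    Assumes every row and column total is non-zero; otherwise a division by zero occurs.
    """
    sensitivity = tp / (tp + fn)          # share of true positives that were detected
    specificity = tn / (tn + fp)          # share of true negatives that were detected
    ppv = tp / (tp + fp)                  # positive predictive value (precision)
    npv = tn / (tn + fn)                  # negative predictive value
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of PPV and sensitivity
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator  # Matthews correlation coefficient
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for 200 reviewed reports (illustrative only, not study data).
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike the F1 score, the Matthews correlation coefficient uses all four cells of the matrix, so it is less sensitive to how rare positive reports are.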
TAKEAWAY:
In patients with IBD in the MVP, the LLM achieved an F1-score of 96.9% (95% confidence interval [CI], 94.0%-99.6%) for identifying dysplasia, 93.7% (95% CI, 88.2%-98.4%) for identifying high-grade dysplasia/colorectal cancer, and 98% (95% CI, 96.3%-99.4%) for identifying colorectal cancer.
In non-IBD MVP patients, the LLM achieved an F1-score of 99.2% (95% CI, 98.2%-100%) for identifying colorectal dysplasia, 96.5% (95% CI, 93.0%-99.2%) for identifying high-grade dysplasia/colorectal cancer, and 95% (95% CI, 92.8%-97.2%) for identifying colorectal cancer.
Agreement between reviewers was excellent across tasks, with a Cohen's kappa of 89%-97% for the main tasks and 78.1%-93.1% for the indefinite-for-dysplasia category in the IBD cohort.
The LLM approach maintained high accuracy when applied to full pathology reports, with an F1-score of 97.1% (95% CI, 93.5%-100%) for dysplasia detection in IBD patients.
IN PRACTICE: “We have shown that LLMs are powerful, potentially generalizable tools for accurately extracting important information from clinical semistructured and unstructured text and which require little human-led development,” the authors of the study wrote.
SOURCE: The study was based on data from the Million Veteran Program and supported by the Office of Research and Development, Veterans Health Administration, and the US Department of Veterans Affairs Biomedical Laboratory. It was published online in BMJ Open Gastroenterology.
LIMITATIONS: According to the authors, this research may be specific to the VHA system and the LLMs used. The authors acknowledge that, without long-term access to graphics processing units, they could not feasibly test larger models, which may overcome some of the shortcomings seen in smaller models. Additionally, the researchers could not rule out overlap between Million Veteran Program and Corporate Data Warehouse reports, though they state that results in either cohort alone are sufficient validation compared with previously published work.
DISCLOSURES: The study was supported by a Merit Review Award from the United States Department of Veterans Affairs Biomedical Laboratory Research and Development Service, the AGA Research Foundation, National Institutes of Health grants, and the National Library of Medicine Training Grant. Kit Curtius reported receiving an investigator-led research grant from Phathom Pharmaceuticals. Shailja C Shah disclosed being a paid consultant for RedHill Biopharma and Phathom Pharmaceuticals, and an unpaid scientific advisory board member for Ilico Genetics, Inc.
This article was created using several editorial tools, including AI, as part of the process. Human editors reviewed this content before publication.