All-department team publishes paper in Computational and Structural Biotechnology Journal
Congratulations to former visiting student Chia-Jung "Charlene" Chang (PhD candidate in biomedical engineering at National Cheng Kung University), research assistant professor Chih-Yuan Hsu, and professors Qi Liu and Yu Shyr on the publication of "VICTOR: Validation and inspection of cell type annotation through optimal regression." The article on this new method appeared online ahead of print on October 15 and will go to press as part of Computational and Structural Biotechnology Journal's December issue. As described in the paper's abstract:
Single-cell RNA sequencing provides unprecedented opportunities to explore the heterogeneity and dynamics inherent in cellular biology. An essential step in the data analysis involves the automatic annotation of cells. Despite development of numerous tools for automated cell annotation, assessing the reliability of predicted annotations remains challenging, particularly for rare and unknown cell types. Here, we introduce VICTOR: Validation and inspection of cell type annotation through optimal regression. VICTOR aims to gauge the confidence of cell annotations by an elastic-net regularized regression with optimal thresholds. We demonstrated that VICTOR performed well in identifying inaccurate annotations, surpassing existing methods in diagnostic ability across various single-cell datasets, including within-platform, cross-platform, cross-studies, and cross-omics settings.
Figure 1 in the paper provides "one example in diagnosing the reliability of cell annotations": a) diagnostic performance of singleR, scmap, SCINA, scPred, CHETAH, scClassify, and Seurat; b) diagnostic performance of VICTOR when applied to annotations from singleR, scmap, SCINA, scPred, CHETAH, scClassify, and Seurat.

New evaluation system to render AI chatbots safe, empathetic
Alumnus Zhijun Yin (MS, 2017) is co-PI of this ARPA-H-funded project, and professor Bradley Malin is a member of the team.
Vanderbilt Biostatistics at AMIA 2024
The 2024 AMIA (American Medical Informatics Association) Annual Symposium will take place in San Francisco from November 9 through November 13. Department members with work to be presented at the symposium include:
Saturday, November 9
Workshop 17, "REDCap on FHIR: Implementing and Using Clinical Data Interoperability Services" - professor Paul Harris, co-instructor/author
Sunday, November 10
Workshop 27, "Advancing Biomedical Research Using Multi-omics Data in the All of Us Researcher Workbench," 8:30 am - co-authored by Paul Harris
Session 7, "Pediatric Health Informatics - Kid Coders," 3:30 pm
"Revealing Patterns of Child Maltreatment Policy Differences and Demographic Dynamics using BERT-Networks and Clustering Approach" - co-authored by associate professor Rameela Raman
Monday, November 11
Session 17, "LIEAF: Artificial Intelligence and Data Science in Health Informatics Education," 8:30 am
"Enhancing Causes of Death Prediction from Electronic Health Records through Multi-Modal Integration of Structured and Unstructured EHR Data" - co-authored by professor Michael Matheny
Session 22, "AI Fairness and Ethics - Justice League," 8:30 am
- "Fairness of AI Collaboration and Suppression in Emergency Triage" - co-authored by professor Bradley Malin
- "Enhancement of Fairness in AI for Chest X-ray Classification" - co-authored by Bradley Malin
Session 53, "Utilization Data and Data Utilization - Auditory Audits, Listening to the Data," 3:30 pm
"Optimizing Large Language Models for Discharge Prediction: Best Practices in Leveraging Electronic Health Record Audit Logs" - co-authored by Bradley Malin
Session 54, "Patient Generated Data - Organic Certified," 3:30 pm
"Examining Oral Anti-Cancer Medication Continuation Using Questionnaires, Prescription Refills, and Structured Electronic Health Records" - co-authored by professor Qingxia Chen, Bradley Malin, and alumnus Zhijun Yin (MS 2017)
Poster session 1, 5:00 pm
P114: "Machine Learning Methods for Estimating Gestational Age at Birth from Electronic Health Records" - co-authored by professor Leena Choi
P118: "Large Language Models Enhance the Identification of Emergency Department Visits for Symptomatic Kidney Stones" - co-authored by PhD candidate Siwei Zhang and assistant professor Yaomin Xu
P171: "Comparing EHR-recorded Race/Ethnicity to Self-reported Race/Ethnicity: Insights from the All of Us Research Program" - co-authored by Xiaoke (Sarah) Feng (first author), biostatistician Andrew Guide, assistant in biostatistics Shawn Garbett, and Qingxia Chen
Tuesday, November 12
Session 98, "Wearable Sensor Data - Data on the Go," 3:30 pm
"'I worry we’ll blow right by it': Barriers to Uptake of the STRATIFY CDSS for ED Discharge in Acute Heart Failure" - co-authored by associate professor Dandan Liu
P05: "Utilizing Large Language Models (LLM) to Optimize Domain-Specific Natural Language Processing (NLP) for Identifying Patients with No Reason for Not Prescribing ACEI/ARB in Chronic Kidney Disease (CKD) Management" - co-authored by Michael Matheny
P27: "Assessing ChatGPT Responses to Alzheimer’s Disease Myths" - co-authored by Bradley Malin and Zhijun Yin
P117: "Algorithmic Matching of Unique Device Information to Electronic Health Record Data" - co-authored by Michael Matheny
P178: "A Study of Challenges In Algorithmic Transportability Between VHA Sites" - co-authored by Michael Matheny
P188: "Real-Time Automated Billing for Tobacco Treatment: A CDS Hook Approach for Simulating Clinician Facing Coding Prompts Within EHRs" - co-authored by Michael Matheny
Wednesday, November 13
Session 102, "Self-Service Software Tools for Clinical and Translational Research: Rationale, Benefits, Limitations, Challenges, and the Future," 8:00 am - Paul Harris, speaker
Updated 11.11.2024 to include P01.
Vanderbilt Biostatistics at WSDS 2024
The 2024 Women in Statistics and Data Science Conference is underway in Reston, Virginia, from October 16 through 18. We are proud of the department members and alumni involved with this year's meeting. They include:
MS student Zongyue Teng
- First and presenting author of "Going for gold: Using record linkage and Bayesian hierarchical modeling to select winning gymnasts at the 2024 Paris Olympics" (speed session Wednesday, poster Thursday)
Sarah Lotspeich (PhD 2021)
- Co-author of "Quantifying the impact of measurement error on health disparities models" (speed session Wednesday, poster Thursday)
- Organizer of and speaker in "Mastering Data: Insights into Master's Degrees in Statistics and Analytics" (panel, Thursday)
- Organizer of "More than Statistics: Improving Maternal and Infant Health with Data" (invited session, Thursday)
- Co-author of "Adjusting for covariate misclassification to quantify the relationship between diabetes and local access to healthy food" (speed session 3, Thursday)
- Co-organizer of and speaker in "Statistical Methods For HIV Research: Battling An Epidemic With Linked, Missing, And Error-prone Data" (invited session, Thursday)
- Panelist for "Statistical Storytelling: Insights Into Effective Presentation Strategies" (Friday)
- Organizer of "Cause For Celebration: Adapting Causal Inference Methods For Challenging Datasets" (invited session, Friday)
PhD student Ashley Mullan
- First and presenting author of "Adjusting for covariate misclassification to quantify the relationship between diabetes and local access to healthy food" (speed session and poster, Thursday)
- Chair of and panelist for "Statistical Storytelling: Insights Into Effective Presentation Strategies" (Friday)
Lead biostatistician Amy Perkins
- First and presenting author of "Machine Learning Model Robustness and Performance Stability in Future Years when Predicting Adverse Events in a Veteran Population and a Diabetic Subpopulation" (speed session and poster, Thursday). Co-authors include assistant professor Amber Hackstadt and professor Michael Matheny.
Lucy D'Agostino McGowan (PhD 2018)
- Speaker in "Statistical Methods for Missing Data Imputation" (panel, Thursday)
- Co-organizer of "Statistical Methods For HIV Research: Battling An Epidemic With Linked, Missing, And Error-prone Data" (invited session, Thursday)
- Panelist for "Statistical Storytelling: Insights Into Effective Presentation Strategies" (Friday)
- Speaker in "Cause For Celebration: Adapting Causal Inference Methods For Challenging Datasets" (invited session, Friday)
Statistical Computing Series: Intro to GitHub
The Department of Biostatistics' Statistical Computing Series focuses on the implementation of statistical models and methods, statistical computation, and graphics. These informal meetings allow experienced statisticians and developers to share their expertise on computing topics with practitioners across Vanderbilt. On Thursday, October 31, at 1:00 pm, application developer Savannah Obregon will present "Introduction to GitHub" on Microsoft Teams:
Use GitHub for seamless collaboration and robust version control in your projects. This presentation will guide you through the essential features and best practices for using GitHub in a team setting. Learn how to manage repositories, branches, pull requests, and issues to streamline your workflow.
For an example of what's possible on GitHub, see Obregon's own website: https://smobregon.github.io. She was recently named winner of the department's IT Innovation Award and has delivered presentations at conferences such as R/Medicine.
To obtain a link to this webinar, contact series organizer Ryan Moore.
Brant Imhoff promoted to senior biostatistician
We are pleased to announce the promotion of Brant Imhoff to senior biostatistician, effective October 1. Imhoff earned his bachelor's and master's degrees in statistics at Miami University (Ohio), where he won the Teradata Analytics Challenge in 2020 for COVID-19 time series modeling and completed his thesis, "Evaluating Scores for Comparing Powerlifters." He worked as a statistician for the Army National Guard and as a credit risk analyst for Macy's prior to joining our department in 2022. His support for projects at the Vanderbilt Biostatistics Data Coordinating Center, Pragmatic Critical Care Research Group, and other teams has included leading DSMB (data and safety monitoring board) closed session meetings for multiple clinical trials; constructing, maintaining, and monitoring electronic data captures; generating statistical analyses, custom reports, and billing; and working on manuscripts and study analysis plans, with co-authorship of peer-reviewed papers published in the New England Journal of Medicine, American Heart Journal, and BMJ Open. A certified personal trainer and competitive powerlifter, Imhoff is also interested in sports science, and he is proficient in Spanish as well as R, Python, and Java. Click his name to view his staff profile.

Department of Biostatistics 2024 Service Milestones
Congratulations to the following faculty and staff members for reaching these service milestones at Vanderbilt University Medical Center:
Name | Years at Vanderbilt University |
Sandra Hewston, senior financial analyst | 25 |
Jonathan Schildcrout, vice chair for research | 20 |
Robert Greevy, director of health services research | 20 |
Janey Wang, chief business officer | 15 |
Cierra Streeter, program manager - operational support | 5 |
Chih-Yuan Hsu, research assistant professor | 5 |

Aaron Lee promoted to senior biostatistician
We are pleased to announce the promotion of Aaron Lee to senior biostatistician, effective September 27. A 2021 graduate of our MS program, Lee is supervised by professor Tatsuki Koyama and supports investigators across the medical center and beyond by planning and conducting statistical analyses, plus preparing formal reports and presentations about those analyses using RStudio, LaTeX, and R Markdown. He has applied regression modeling and other biostatistical tools to studies of overactive bladder syndrome and other urinary conditions, COVID, cancer, kidney disease, HIV, liver disease, workplace violence, lung transplant outcomes, and more, with co-authorship of peer-reviewed papers in The American Surgeon, Clinical Neurophysiology, BMC Cancer, British Journal of Cancer, and Pan African Medical Journal. Click his name to view his staff profile.

Methods and Publication Awards
Each year, our department produces, presents, and publishes hundreds of papers and software packages. We are pleased to recognize some of the best work from the past year:
Left to right, top: Savannah Obregon, Cole Beck, Shawn Garbett, Onur Orun. Bottom: Bryan Blette, Shengxin Tu, Bryan Shepherd
The IT Innovation Award celebrates the creative and crucial contribution that IT members make to department operations, and to the research program within the department, across the medical center, and more broadly. The 2023 IT Innovation Award goes to application developer Savannah Obregon, senior application developer Cole Beck, and director of informatics software development Shawn Garbett for REDCapAPI: Interface to REDCap. The review panel commented that while the package was "initially conceived to export the raw API from REDCap into R, it has grown dramatically. Its features collectively ensure reliable and automated retrieval of REDCap data in a format that is ready for analysis." It is deserving of the award because it "provides a robust tool for researchers, aiding them in conducting efficient and effective research studies and is being used both inside VUMC and at institutions such as Meharry Medical College, Harvard Medical School, Children’s Hospital of Orange County, University of Colorado, Virginia Tech, Indiana and others."
The Linda Stewart Analysis Report Award recognizes an exceptional applied analysis report written by a staff biostatistician in our department. The winner of the 2023 award is Onur Orun, for a BRAIN-ICU long-term outcomes latent trajectory analysis report, co-authored by associate professor Rameela Raman. The judges for this award stated that "the report is extremely well polished, including high quality figures and tables combined with a thoroughly comprehensive view of the study with solid explanations of background and summary of results."
The Patrick Arbogast Collaborative Publication Award recognizes an exceptional collaborative publication from a biostatistician in our department. Assistant professor Bryan Blette received this award for "Is low-risk status a surrogate outcome in pulmonary arterial hypertension? An analysis of three randomised trials," which was published in The Lancet Respiratory Medicine (October 2023) with co-authors at Penn, Brown, Yale, and Cedars-Sinai. According to the judges, "This biostatistician first-authored paper provided convincing evidence for the invalidity of a widely accepted and used surrogate outcome in clinical care and trials. It serves as an outstanding example of how biostatisticians influence medical practice in a critical and positive manner."
The Methods Publication Award recognizes an exceptional methodological publication from a biostatistician or team of biostatisticians in our department. It was awarded to recent graduate Shengxin Tu (PhD, 2024) and professor Bryan Shepherd for "Rank intraclass correlation for clustered data," which was published in Statistics in Medicine (August 2023) with co-authors at University of Southern California and University of North Carolina. The judges wrote: "This paper introduces a novel rank-based approach to Intraclass Correlation Coefficient that enhances its robustness and applicability, showcasing a solid theoretical foundation and notable methodological creativity. The inclusion of real-world examples, comprehensive simulation results, and an accompanying R package heighten its practical value."
The winners receive personalized plaques and $200, and their names are added to the awards wall in the department, on the 11th floor of 2525 West End Avenue. The list of past winners is posted in the About section of this website.
We are immensely grateful to the faculty and staff who contributed their time and expertise to evaluating and discussing this year's entries. The panel was divided into separate committees for each award, with results relayed to an administrator who compiled and announced the results at the September All-Department Meeting. Judges did not participate on the committees of awards they were in contention for. This year's slate of volunteers:
Gustavo Amorim | Bryan Blette | Hank Domenico | Svetlana Eden |
Cathy Jenkins | Tatsuki Koyama | Jinyuan Liu | Trey McGonigle |
Hui Nian | Laurie Samuels | Jonathan Schildcrout | Yaping Shi |
Jing Wang | Shilin Zhao |
Becker's names VUMC a leading health system in AI
Bradley Malin, Accenture Professor of Biomedical Informatics, Biostatistics, and Computer Science, is co-director of ADVANCE (AI Discovery and Vigilance to Accelerate Innovation and Clinical Excellence) and is quoted in the article.