Curriculum Overview of the Data Science Track of the BMI PhD Program

A key aspect of our training philosophy is that students need to be exposed to a variety of real-world research applications and innovations along the big data spectrum while they are setting their methodological foundation in biomedical informatics, computer science, and statistics. Vanderbilt has an outstanding environment for this, as big data paradigms are being used in a wide range of interesting biomedical and health policy applications, e.g., *omics analysis, protein functioning based on structural biology, the development of decision support tools for clinicians based on EMR data, and the development of precision medicine regimens. This fully integrated philosophy marries real-world applications of big data methods with their foundations, making the governing principles of data science (namely generalizability, reproducibility, and validity) much less abstract. A welcome byproduct is that students are encouraged to contribute to projects by building tools, which often end up being widely used for similar applications.

 

Classroom Foundations

The BIDS PhD program will be organized to provide all matriculating students with a shared core curriculum that is split across four areas, as shown in the Figure below: 1) Biomedical Informatics, which will be composed of three required courses in the foundations of clinical informatics and bioinformatics and the methods that support them, a course on scientific communication, a journal club, and a research colloquium (these courses will be complemented by a new course, Biomedical Data Science Laboratory, discussed in detail below); 2) Computer Science, which will be composed of four required courses in data structures, algorithms, machine learning, and big data infrastructure; 3) Statistical Methods, which will be composed of four courses in biostatistics that progress from basic principles to regression analysis and modeling and conclude with statistical inference methodologies; and 4) Biomedical Science, where students will have the opportunity to take either a sequence of two courses from the School of Nursing (for students focused on modeling, managing, and analyzing data from the clinical domain and studying clinical workflow) or one intensive course (double the normal number of credits) focused on the general biomedical graduate school program, which covers a wide range of topics from biochemistry to immunology.

[Figure: BIDS Chart]

Mentored Research Rotations and Dissertation

Identification of a research problem and faculty mentor begins upon matriculation into the program, with research rotations, seminar series (detailed below), and classes, all of which acquaint the student with challenging problems in the field and with the faculty's range of research expertise and opportunities. PhD students select an advising committee to conduct the qualifying exam and mentor the dissertation. Contingent on successful written and oral qualifying examinations, the student becomes a doctoral candidate. For the doctoral dissertation research, the candidate prepares a research plan, executes the research, and prepares and publicly defends a written dissertation, as per university rules.

 

Ensuring Depth and Breadth

The new Data Science track will require students to take a deep dive into the computer science and statistical research methodology that is necessary to understand concepts of big data. Additionally, given that trainees will come from a range of undergraduate disciplines and experiences, the BIDS curriculum is designed to ensure, in two ways, that trainees achieve sufficient breadth in areas that complement their undergraduate degree and sufficient depth in BD2K areas. First, the student can choose among 16+ electives to create a tailored plan of study. Second, if students are deemed to have satisfied the requirements of a core competency through previous coursework and/or experience, they will have opportunities to take additional electives in the same and/or other disciplines, to deepen and complement their existing knowledge and skillset.

 

Pragmatic Laboratory Course

In addition, we will develop a course called Biomedical Data Science Laboratory, which will be subtitled Everything You Wanted to Know about Building Big Systems But Were Too Afraid to Ask. This course, to be developed by the PIs with assistance from other faculty mentors, will serve as a synthesis course and provide hands-on learning in big biomedical data science. It will build on the foundational courses pursued by the students in that it will be tailored to biomedical applications, focus on team-based problem solving, and review big data software systems as opposed to foundational tools and techniques.

 

Additional BIDS Educational Opportunities

Didactic courses and mentored research will be complemented by numerous opportunities for learning and critical thinking, such as several research seminar series and a career development seminar series (Academic Survival Skills), all of which are described in more detail in this section.

 

Biomedical Data Science Journal Club and Research Colloquium. This weekly colloquium will serve as an environment for reviewing advances in big data science and discussing their application in biomedical settings. Examples of the types of systems that may be reviewed in this course include PARAMO, a pipeline for massively parallel hypothesis testing and predictive modeling in EMR data; Bioconductor, an open source toolkit for the analysis and comprehension of genomic data; SPRINT, which extends Bioconductor to the Amazon Elastic Compute Cloud (EC2); and omniClassifier, recently introduced to support analytics over gene expression datasets on grid computing frameworks.

Biomedical Informatics Seminar Series. This weekly research seminar series, held Wednesdays at noon from September through May, features current and relevant research presented by Vanderbilt faculty, advanced trainees, and guests of national and international stature. All students must attend throughout their course of study.

Software Carpentry Workshop.  The Software Carpentry Workshop (Bootcamp) at Vanderbilt is a workshop designed to help scientists and engineers get more research done in less time and with less pain by teaching them basic lab skills for scientific computing. This hands-on workshop will cover basic concepts and tools, including program design, version control, data management, and task automation.

Biomedical Informatics Summer Seminar Series. DBMI faculty conduct 16 biweekly noon seminars that engage full-time and summer internship students in exploring the research opportunities in biomedical informatics and the range of career options available to students interested in the field.

Biostatistics Seminar Series. This weekly seminar series is sponsored by the Department of Biostatistics. Although wide ranging, it routinely covers issues related to the analysis of complex biomedical data, such as missing data or statistical learning algorithms. One seminar per month is dedicated to a computational topic, which is also likely to be of interest to BIDS students.

Biostatistics Summer Book Club. Each summer, graduate program students choose a book to read with faculty supervision. There are weekly discussions, led by the students. This is a student-run enterprise: students rotate the presentation of chapters or sections and prepare computational examples for discussion.

Academic Survival Skills Seminar Series. This series is designed to provide, in a seminar format, training experiences that complement other curricular activities. Topics include career progression in academic settings, job search strategies, and using professional contacts, meetings, and community service to further career goals.

Attendance at National Research Conferences. Students will attend and present research findings at conferences appropriate to their subdiscipline. These will expand and deepen the trainees’ appreciation for meaningful contributions to big data and its applications in biomedical settings.

External Internships. Historically, the BIDS faculty members have been successful in helping their students obtain internships in various academic, government, and industrial environments beyond Vanderbilt. This will continue with students in the BIDS program, to ensure they are exposed to other research settings and create partnerships for future collaborations and, eventually, employment. Faculty have active relationships with organizations in these environments, have published jointly with researchers in these settings, and will assist in introducing students to their connections when possible.