Grants | Hail Lab

Automatic Sensing for Clinical Documentation

Award Number: W81XWH-17-C-0252; Principal Investigator: Daniel Fabbri

We propose to develop a novel hands-free clinical documentation system for use in the operational environment. The system combines off-the-shelf sensors, accelerometers, and cameras with software that automatically detects the motion signatures associated with key clinical tasks and generates an abbreviated care record that can be transmitted upstream in real time. Clinical documentation during both the point-of-injury and en route phases of care in the theater and operational environments continues to be incomplete and inaccurate, which undermines the goal of ensuring that receiving providers at Role 1, 2, or 3 facilities can rapidly gain situational awareness of the patients moving through the system. Major limitations that currently prevent the creation of timely and accurate clinical documentation include time pressure, the unique stress of providing care under fire, the use of personal protective equipment, limited visibility, and constrained working spaces. Additionally, even when documentation is generated, it is rarely transmitted in a timely, clear, or effective manner.

Crowd Sourcing Labels From Electronic Medical Records to Enable Biomedical Research

Award Number: 1 UH2 CA203708-01; Principal Investigator: Daniel Fabbri

Supervised machine learning is a popular method that uses labeled training examples to predict future outcomes. Unfortunately, supervised machine learning for biomedical research is often limited by a lack of labeled data. Current methods to produce labeled data involve manual chart reviews that are laborious and do not scale with data creation rates. This project aims to develop a framework to crowd source labeled data sets from electronic medical records by forming a crowd of clinical personnel labelers. The construction of these labeled data sets will allow for new biomedical research studies that were previously infeasible to conduct. There are numerous practical and theoretical challenges in developing a crowd sourcing platform for clinical data. First, popular public crowd sourcing platforms such as Amazon's Mechanical Turk are not suitable for medical record labeling because HIPAA makes clinical data sharing risky. Second, the types of clinical questions that are amenable to crowd sourcing are not well understood. Third, it is unclear whether the clinical crowd can produce labels quickly and accurately. Each of these challenges will be addressed in a separate Aim. As the first Aim of this project, the team will evaluate different clinical crowd sourcing architectures. The architecture must leverage the scale of the crowd while minimizing patient information exposure. De-identification tools will be considered to scrub clinical notes to reduce information leakage. Using this design, the team will extend a popular open source crowd sourcing tool, Pybossa, and release it to the public. As the second Aim, the team will study the type, structure, topic, and specificity of clinical prediction questions, and how these characteristics impact labeler quality. Lastly, the team will evaluate the quality and accuracy of collected clinical crowd sourced data on two existing chart review problems to determine the platform's utility.
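
As a rough illustration of the final evaluation step, the sketch below combines several labelers' answers for each chart and compares the combined label against an existing chart review. The abstract does not say how labels are aggregated or scored; majority voting, the function names, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def majority_label(votes):
    """Return the most common label and the fraction of labelers who chose it.

    Majority voting is an assumed aggregation rule, not one named by the project.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def accuracy(crowd_labels, reference_labels):
    """Fraction of charts where the crowd label matches the reference label."""
    shared = crowd_labels.keys() & reference_labels.keys()
    if not shared:
        return 0.0
    correct = sum(crowd_labels[chart] == reference_labels[chart] for chart in shared)
    return correct / len(shared)


# Hypothetical votes from three clinical labelers per chart.
votes_by_chart = {
    "chart_1": ["yes", "yes", "no"],
    "chart_2": ["no", "no", "no"],
    "chart_3": ["yes", "no", "no"],
}

# Hypothetical labels from an existing manual chart review.
reference = {"chart_1": "yes", "chart_2": "no", "chart_3": "yes"}

crowd = {chart: majority_label(votes)[0] for chart, votes in votes_by_chart.items()}
print(crowd)
print(accuracy(crowd, reference))
```

The agreement fraction returned by majority_label could also serve as a simple per-chart confidence signal, for example to route low-agreement charts to additional labelers.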

TWC: Small: Analysis and Tools for Auditing Insider Accesses

Award Number: 1526014; Principal Investigator: Daniel Fabbri; Co-Principal Investigators: Jonathan Wanderer, Bradley Malin

Compliance officers specify organizations' policies and procedures for mitigating risk to sensitive data. However, demands for employees' quick access to organizational data often limit which security technologies can be deployed. As a result, many organizations configure an open access environment in which authenticated employees can access any piece of data (a common practice across health care facilities, for example). One specific risk of an open access environment is that employees may access data they do not need for their role or responsibilities, potentially resulting in data breaches or privacy violations. This insider threat is extremely challenging for compliance officers to detect because of the dynamic nature of access patterns and the large volume of accesses. This project is developing an auditing framework that allows for the simple, interpretable, and efficient monitoring of accesses to detect insiders' inappropriate use. The framework will allow compliance officers to drill down into the access history, filter away accesses that occur for valid operational reasons, and focus on suspicious behavior, thereby improving the overall security of sensitive data.

The main hypothesis of this research is that most appropriate accesses in specialized organizations, such as health care facilities, occur for valid operational reasons and that those reasons are documented in the organization's database. Therefore, if a reason for an access can be gleaned from operational and workflow data and metadata, the log record of that access can be automatically filtered without requiring manual compliance officer review. This work contrasts with alternative methods that use the access log in isolation and produce results that are difficult to interpret. This project is studying how explanations for accesses (1) are modeled and capture these operational reasons, (2) can be mined directly from the database, (3) can be enhanced by filling in frequently missing types of data, and (4) can drastically reduce the auditing burden compared to current manual auditing approaches. The explanation methodology is being evaluated on data from a large health care system, which produces approximately one billion logged accesses per year. The empirical evaluation also compares the approach against current common methods for identifying high-risk insider accesses. Hospital compliance officers are consulting with the research team to verify the approach.
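
The filtering idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual framework: it treats a documented appointment between a clinician and a patient as the only kind of explanation, and the record types, field names, and sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One access log record (hypothetical schema)."""
    user: str
    patient: str


@dataclass(frozen=True)
class Appointment:
    """One operational record that could explain an access (hypothetical schema)."""
    clinician: str
    patient: str


def split_by_explanation(accesses, appointments):
    """Partition accesses into (explained, unexplained).

    An access counts as explained when the database documents an
    appointment between the same user and patient.
    """
    documented = {(appt.clinician, appt.patient) for appt in appointments}
    explained, unexplained = [], []
    for access in accesses:
        if (access.user, access.patient) in documented:
            explained.append(access)
        else:
            unexplained.append(access)
    return explained, unexplained


log = [
    Access("dr_lee", "p1"),
    Access("dr_lee", "p2"),
    Access("nurse_kim", "p3"),
]
appointments = [
    Appointment("dr_lee", "p1"),
    Appointment("nurse_kim", "p3"),
]

explained, unexplained = split_by_explanation(log, appointments)
print(unexplained)  # accesses left for manual compliance review
```

Only the unexplained accesses would reach a compliance officer. A fuller version would presumably draw on more kinds of operational data than appointments, in line with the abstract's mention of workflow data and metadata.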

EAGER: Managing Information Risk and Breach Discovery

Award Number: 1536871; Principal Investigator: Daniel Fabbri; Co-Principal Investigators: Laurie Novak, Bradley Malin

Increasing demands for data access dominate privacy concerns, putting both data and organizations at risk. However, there is currently a shortage of research on how organizations develop and maintain practices to ensure information privacy. Small-scale, preliminary investigations suggest that organizational practices vary and that the practices studied so far only minimally reflect documented organizational policies. While technologies exist to help monitor accesses to data, they are rarely deployed, so manual audits remain the norm. This project aims to improve security measures in organizations by better understanding risk management and breach discovery life cycles. Traditional technological solutions lack grounding in real organizational routines, resulting in poor fit with existing work practices and limited adoption. The problem demands a multi-disciplinary effort to represent organizational risks and practices, theory to quantify the risk, and methods to translate the findings into privacy and security practices and technologies that seek to mitigate the risk. This work will influence the development and deployment of technological cybersecurity tools in multiple industries. Specifically, it will provide concrete assessments of breach management routines, how they are structured, and the uptake that can reasonably be expected of breach management technologies given industry-specific constraints.

This project uses a sociotechnical approach, integrating qualitative data on privacy practices, and on perceived constraints and influences within the process, into a computational model that represents constraints and influences on the deployment of privacy and security measures. This model will account for various actors within the privacy and security hierarchy, such as compliance officers, security officers, and executives. It allows for conceptualization of organizational practices and of the areas where those practices could be adapted. In particular, the computational contributions are two-fold: (i) an optimization problem formulation of the risk management and breach discovery life cycle, and (ii) a taxonomy of perceived organizational risks and their mapping to mitigating technological measures. In addition, these computational methods will inform changes in the life cycle process and identify gaps among current technological offerings. Results include tools that analyze an organization's security routines and risk perspectives and output guidance to help the organization better manage risk.