KMCI - KnowledgeMap Concept Indexer
August 7, 2015
The KnowledgeMap Concept Indexer (KMCI) is the underlying natural language processing engine used in the KnowledgeMap and Learning Portfolio website, and has been used for many clinical and genomic research studies. It identifies biomedical concepts in natural language documents and clinical notes and maps them to Unified Medical Language System (UMLS) concepts.
Nature Biotech article on PheWAS
August 4, 2015
http://www.nature.com/nbt/journal/v29/n1/full/nbt0111-46.htm
Nature Biotech featured the PheWAS paper as one of the top computational biology innovations of 2010.
PheWAS R Package
July 20, 2015
This package contains methods for performing PheWAS. Please contact PheWAS@vumc.org if you encounter any errors or apparent bugs. The documentation is provided natively in R. Once the package is loaded, the command ?PheWAS will direct you to the package description, which includes references to each function and an example. The command vignette("PheWAS-package") will display the package vignette with further how-tos.
NLP, Genomics/Pharmacogenomics, PheWAS, Genetic association studies
KMCI employs part-of-speech information to develop a shallow sentence parse, and performs variant generation and normalization using the SPECIALIST Lexicon and related tools.
PheWAS: Demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations
Denny JC, Ritchie MD, Basford M, Pulley J, Bastarache L, Brown-Gentry K, Wang D, Masys DR, Roden DM, Crawford DC. PheWAS: Demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics. 2010 Mar 24. [Epub ahead of print]
Date Published: Wed, 03/24/2010
Replicating known SNP-disease associations using an EMR
(reported odds ratios 1.14-2.36) in at least two previous studies. We developed automated phenotype identification algorithms that used NLP techniques (to identify key findings, medication names, and family history), billing code queries, and structured data elements (such as laboratory results) to identify cases (n=70-698) and controls (n=808-3818). Final algorithms achieved positive predictive values (PPV) of ≥97% for cases and 100% for controls on randomly selected cases and controls.
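As a reminder of what the PPV figures above measure, here is a minimal Python sketch of the calculation. It assumes the usual definition, PPV = true positives / (true positives + false positives). The function name and the chart-review counts are hypothetical and are not taken from the study.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of algorithm-flagged records that manual review confirms."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no records were flagged, so PPV is undefined")
    return true_positives / flagged


# Hypothetical review of 100 algorithm-identified cases (illustrative only):
# 97 confirmed on manual review, 3 not confirmed.
case_ppv = positive_predictive_value(true_positives=97, false_positives=3)
print(f"PPV for cases: {case_ppv:.0%}")
```

The same calculation applies to controls: there, a "true positive" is a record the algorithm labeled as a control that review confirms does not have the condition.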