The KnowledgeMap Concept Indexer (KMCI) is the underlying natural language processing engine used in the KnowledgeMap and Learning Portfolio website, and has been used in many clinical and genomic research studies. It identifies biomedical concepts in natural language documents and clinical notes and maps them to Unified Medical Language System (UMLS) concepts.
KMCI employs part-of-speech information to develop a shallow sentence parse, and performs variant generation and normalization using the SPECIALIST Lexicon and related tools. The KMCI system was designed particularly for poorly formatted documents containing ad hoc abbreviations and underspecified concepts (e.g., the document phrase “ST” implying the “ST segment” of an electrocardiogram instead of the abnormal finding “ST elevation”). Using probabilistic information and concept co-occurrence data derived from PubMed, KMCI can map ambiguous strings such as “CHF” to the UMLS concept C0018802 “Congestive heart failure” in an echocardiogram report but to the concept C0009714 “Congenital hepatic fibrosis” in a document discussing infantile polycystic kidney disease (a condition genetically related to congenital hepatic fibrosis).
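The co-occurrence idea can be illustrated with a minimal Python sketch. This is not KMCI's actual algorithm or code: the scoring rule (summing raw counts), the `COOCCURRENCE` table, the `CANDIDATES` table, and the `disambiguate` function are all hypothetical, and the counts are invented for illustration rather than derived from PubMed. Only the two UMLS identifiers come from the description above.

```python
# Hypothetical co-occurrence counts between a candidate concept (by CUI)
# and a context concept (by name). The numbers are invented for this
# example; they are NOT real PubMed statistics.
COOCCURRENCE = {
    ("C0018802", "echocardiogram"): 900,
    ("C0018802", "ejection fraction"): 700,
    ("C0018802", "polycystic kidney disease"): 5,
    ("C0009714", "echocardiogram"): 2,
    ("C0009714", "polycystic kidney disease"): 120,
}

# Candidate concepts for each ambiguous string (hypothetical lookup table).
CANDIDATES = {
    "CHF": ["C0018802", "C0009714"],  # congestive heart failure, congenital hepatic fibrosis
}


def disambiguate(term, context_concepts):
    """Return the candidate CUI whose summed co-occurrence with the
    concepts already found in the document is highest."""
    def score(cui):
        return sum(COOCCURRENCE.get((cui, ctx), 0) for ctx in context_concepts)

    return max(CANDIDATES[term], key=score)


# Usage: the same string resolves differently depending on document context.
echo_report = ["echocardiogram", "ejection fraction"]
kidney_note = ["polycystic kidney disease"]
print(disambiguate("CHF", echo_report))
print(disambiguate("CHF", kidney_note))
```

Summing raw counts favors frequently mentioned concepts, so a real system would more plausibly normalize the counts or combine them with other probabilistic evidence; the sketch only shows how surrounding concepts can tip the choice.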
KMCI has performed favorably in comparison to MetaMap and has been validated in a variety of clinical and educational contexts (see publications). Later additions to KMCI include the ability to detect negated terms (e.g., "no chest pain") via a Perl implementation of NegEx.
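As a rough illustration of trigger-based negation detection in the spirit of NegEx, the following Python sketch checks whether a negation trigger precedes a term within a short token window. It is a simplified stand-in, not the Perl implementation that KMCI uses: the trigger list, the terminator list, the window size, and the `is_negated` function are assumptions chosen for this example, and the real NegEx rule set is considerably larger.

```python
import re

# Small, illustrative subsets; these lists are assumptions for the sketch.
NEGATION_TRIGGERS = ["no", "denies", "without", "negative for"]
TERMINATORS = {"but", "however", "although"}
WINDOW = 5  # number of tokens before the term that a trigger may govern


def is_negated(sentence, term):
    """Return True if a negation trigger appears within WINDOW tokens
    before `term`, with no terminator word in between."""
    text = sentence.lower()
    match = re.search(r"\b" + re.escape(term.lower()) + r"\b", text)
    if match is None:
        return False

    # Tokens immediately preceding the term.
    preceding = text[:match.start()].split()[-WINDOW:]

    # A terminator ends the scope of any earlier trigger, so keep only
    # the tokens that follow the last terminator.
    for i in range(len(preceding) - 1, -1, -1):
        if preceding[i].strip(",.;:") in TERMINATORS:
            preceding = preceding[i + 1:]
            break

    scope = " ".join(preceding)
    return any(
        re.search(r"\b" + re.escape(trigger) + r"\b", scope)
        for trigger in NEGATION_TRIGGERS
    )


# Usage
print(is_negated("Patient reports no chest pain.", "chest pain"))
print(is_negated("Patient denies fever but has chest pain.", "chest pain"))
```

The sketch handles only triggers that precede the term; triggers that follow it (such as "chest pain was ruled out") and pseudo-negation phrases are left out for brevity.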
For further information, please contact Josh Denny.