Documentation

A number of approaches to clinical Natural Language Processing (NLP) have evolved, including simple rule-based methods, symbolic and grammatical computations, artificial intelligence or machine learning techniques, and empirical derivations. These methods are described in more detail in An Introduction to Clinical Natural Language Processing and can each be considered a series of transforms, or a processing "pipeline".

The Natural Language Processing Evaluation Workbench (NLPEW) was developed so that an end user can compare and adjudicate the accuracy of clinical document annotations produced by two different NLP transformation pipelines. Transformation pipeline outputs may originate from reference standard human annotation systems (e.g., Knowtator) or automated NLP annotation systems (e.g., Topaz).

 

NLPEW - Two Tools

Annotations are the fundamental unit of analysis in the NLPEW application. In NLP transformations, annotations have zero or more attributes that specify details about the annotation, including relations and spans. Beyond this basic level, systems differ in how they organize the next level of annotations and in how they describe relations between levels or between annotations. Given these differences in NLP transformation pipeline outputs, the NLPEW distribution consists of two tools: the Type Modeling Tool (TMT), which is used to map transformation output to a given "level", and the Evaluation Workbench (Workbench), which is used to display the annotations and related statistics for the mapped transformation output.


TMT: Mapping Developed Pipelines

The TMT is used to determine which types and features in the two annotation pipelines under investigation are significant to the end user. It is also used to assign, or map, a "level" (e.g., Document or Snippet) to each annotation pipeline type and to indicate which features from each pipeline are to be treated as classifications versus attributes. Classification features are those considered intrinsic to the meaning of an annotation, for example a Unified Medical Language System (UMLS) Concept Unique Identifier (CUI). Attributes are modifiers that are important to the interpretation of an annotation but not intrinsic to its meaning, such as whether a finding is present or absent.
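
The distinction between levels, classifications, and attributes can be illustrated with a minimal Python sketch. This is not the TMT's actual file format or API: the names TypeMapping and split_features, the feature keys "cui" and "polarity", and the placeholder CUI value are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TypeMapping:
    """Hypothetical record mapping one pipeline type to a Workbench level."""
    pipeline_type: str                      # type name as emitted by the pipeline
    level: str                              # e.g. "Document" or "Snippet"
    classifications: List[str] = field(default_factory=list)  # intrinsic features
    attributes: List[str] = field(default_factory=list)       # modifier features


def split_features(mapping: TypeMapping,
                   features: Dict[str, object]) -> Tuple[dict, dict]:
    """Partition an annotation's features into classifications and attributes.

    Features named in neither list are dropped, mirroring the idea that only
    features of significance to the end user are carried forward.
    """
    cls = {k: v for k, v in features.items() if k in mapping.classifications}
    attrs = {k: v for k, v in features.items() if k in mapping.attributes}
    return cls, attrs


# Illustrative use: the CUI is a placeholder, not a real UMLS identifier.
mapping = TypeMapping("Finding", "Snippet",
                      classifications=["cui"], attributes=["polarity"])
cls, attrs = split_features(
    mapping, {"cui": "C0000000", "polarity": "absent", "internal_id": 7})
```

Here the CUI ends up among the classifications and the polarity among the attributes, while the unmapped internal_id feature is ignored.
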
A primitive Unstructured Information Management Architecture (UIMA™) analysis engine was created for the TMT, so that a type model can be extracted from information contained in a UIMA Common Analysis System (CAS) object, as opposed to standalone UIMA type definition XML files as was done in the previous release. Similarly, an analysis engine was created for generating Workbench annotation files directly from a UIMA pipeline. For instance, the user can create an aggregate UIMA pipeline consisting of an existing domain pipeline plus the Workbench analysis engine, then generate a set of Workbench-readable annotation files by using the UIMA Collection Processing Engine configurator to apply that pipeline to a set of medical reports. Finally, the TMT can now accept a Knowtator schema file as input, in addition to UIMA type definition files, so that a Workbench type model can be extracted from Knowtator annotation sets. For more detailed instructions on how to employ the TMT, please visit Tutorial II. Type Modeling Tool.


Workbench: Evaluation Ideology

The Workbench is used to display relevance statistics for extracted features. The graphical interface allows the user to view and explore differences and similarities between designated schema levels of two NLP transformation pipelines. A user can navigate the interface in several ways, including directly investigating false positive or false negative annotations, browsing an annotated file and viewing disagreements, or investigating outcome measures for annotations or attributes of the annotations. As a user selects particular level items, statistics and other information are displayed in several panels, along with the full text and relevant annotations for the selected level. A major part of the design philosophy behind the Workbench was to minimize the amount of user effort required to activate the display functions of the tool. For example, nearly all behaviors of the tool are activated by moving the mouse around while holding down the control key. The main display panel contains outcome measure statistics for the two sets of annotations, including true positives, false positives, true negatives, false negatives, accuracy, positive predictive value, sensitivity, specificity, and negative predictive value. By default, the match criterion is the semantic type assigned to the annotations. For more detailed instructions on how to employ the Workbench, please visit Tutorial I. Workbench.
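
The outcome measures listed above follow from the four confusion-matrix counts by their standard definitions. The Python sketch below shows that arithmetic only. It is not the Workbench's own code, and the names Counts and outcome_measures, as well as the choice to report an undefined measure as None, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Counts:
    """Confusion-matrix counts for one pair of annotation sets."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # A measure is undefined when its denominator is zero; report None.
    return numerator / denominator if denominator else None


def outcome_measures(c: Counts) -> Dict[str, Optional[float]]:
    """Compute the standard measures from the four counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "ppv": _ratio(c.tp, c.tp + c.fp),          # positive predictive value
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # also called recall
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),          # negative predictive value
    }


# Illustrative counts only; not taken from any real evaluation.
measures = outcome_measures(Counts(tp=8, fp=2, tn=85, fn=5))
```
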

 

Future Development

The NLPEW distribution is still under development and is being debugged. Future development plans include enhancing the TMT to allow iterative development of type models by accepting existing Workbench type models as input in addition to UIMA (or other) type system definitions, and increasing the number of Knowtator-generated annotation examples included in the distribution by September 2012.




SHARPn 3rd Annual Summit Presentation; NLP Software Demos - Part IV: Evaluation Workbench

Abstract

The most recent Natural Language Processing Evaluation Workbench (NLPEW) distribution includes two tools for comparing and adjudicating the accuracy of pairs of clinical document annotations produced by various NLP transformation "pipelines": the Workbench and the Type Modeling Tool (TMT). Several functions and enhancements have been added to each tool, including the ability to display multiple classification names for each annotation (e.g., UMLS CUI versus normalized concept name), to match pairs of annotations on strict (same start and end positions) versus relaxed (any character overlap) criteria, and to select matching pairs based on whether they belong to the same semantic category, possess a common attribute (e.g., polarity), or simply overlap textually.
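
The strict and relaxed criteria can be stated precisely in terms of character offsets. The following Python sketch illustrates the two definitions as described above. It is not the Workbench implementation. The names Span, strict_match, and relaxed_match are hypothetical, and treating the end offset as exclusive is an assumption.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Character span of an annotation (end offset assumed exclusive)."""
    start: int
    end: int


def strict_match(a: Span, b: Span) -> bool:
    # Strict: both annotations cover exactly the same start and end positions.
    return a.start == b.start and a.end == b.end


def relaxed_match(a: Span, b: Span) -> bool:
    # Relaxed: the annotations share at least one character.
    return a.start < b.end and b.start < a.end


# Illustrative offsets only.
first = Span(10, 20)
second = Span(12, 25)
strict_result = strict_match(first, second)    # spans differ at both ends
relaxed_result = relaxed_match(first, second)  # characters 12..19 are shared
```

Under the exclusive-end assumption, two spans that merely touch, such as (10, 20) and (20, 30), share no character and therefore do not match under the relaxed criterion either.
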

The Java class that converts Unstructured Information Management Architecture (UIMA) and other annotation pipeline outputs into Workbench format was made extensible for cases where annotation values need to be changed in some way before being displayed in the Workbench. A primitive UIMA analysis engine was created for the TMT, so that a type model can be extracted from information contained in a UIMA Common Analysis System (CAS) object, as opposed to standalone UIMA type definition XML files as was done in the previous release. Similarly, an analysis engine was created for generating Workbench annotation files directly from a UIMA pipeline. For instance, the user can create an aggregate UIMA pipeline consisting of an existing domain pipeline plus the Workbench analysis engine, then generate a set of Workbench-readable annotation files by using the UIMA Collection Processing Engine configurator to apply that pipeline to a set of medical reports. Finally, the TMT can now accept a Knowtator schema file as input, in addition to UIMA type definition files, so that a Workbench type model can be extracted from Knowtator annotation sets. Although the NLPEW Knowtator interface is still under development, preliminary versions of the Workbench and TMT are now available here, along with a limited number of example annotations based on Topaz outputs.

Future development plans include enhancing the TMT to allow iterative development of type models by accepting existing Workbench type models as input in addition to UIMA (or other) type system definitions, and increasing the number of Knowtator-generated examples included in the distribution. The next release is scheduled for July 2012.

Handouts for the SHARPn Summit "Secondary Use" 3rd Annual Face-to-Face demo given on June 11-12, 2012.