|
A number of approaches to clinical Natural Language Processing (NLP) have evolved, including simple rule-based methods, symbolic and grammatical computations, artificial intelligence or machine learning techniques, and empirical derivations. These methods are described in more detail in An Introduction to Clinical Natural Language Processing, and each can be considered a series of transforms, or a processing "pipeline". |
|
The Natural Language Processing Evaluation Workbench (NLPEW) was developed so that an end user can compare and adjudicate the accuracy of clinical document annotations produced by two different NLP transformation pipelines. Transformation pipeline outputs may originate from reference standard human annotation systems (e.g. Knowtator) or from automated NLP annotation systems (e.g. Topaz). |
|
|
NLPEW - Two Tools |
| Annotations are the fundamental unit of analysis in the NLPEW application. In NLP transformations, annotations have zero or more attributes that specify details about the annotation, including relations and spans. Beyond this basic structure, systems differ in how they organize the next level of annotations and in how they describe relations between levels or between annotations. Because of these differences in NLP transformation pipeline outputs, the NLPEW distribution consists of two tools: the Type Modeling Tool (TMT), which maps transformation output to a given "level", and the Evaluation Workbench (Workbench), which displays the annotations and related statistics for the mapped transformation output. |
TMT: Mapping Developed Pipelines |
| The TMT is used to determine which types and features in the two annotation pipelines under investigation are of significance to the end user. It is also used to assign, or map, a "level" (e.g. Document or Snippet) to each annotation pipeline type and to indicate which features from each pipeline are to be treated as a classification versus an attribute. Classification features are those considered intrinsic to the meaning of an annotation, for example a Unified Medical Language System (UMLS) Concept Unique Identifier (CUI). Attributes are modifiers that are important to the interpretation of an annotation but not intrinsic to its meaning, for example whether a finding is present or absent. |
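The mapping described above can be pictured with a small Python sketch. It is illustrative only and is not the TMT's actual file format or API: the `TypeMapping` class, the `split_features` helper, the feature names, and the placeholder CUI value are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class TypeMapping:
    """Hypothetical record of how one pipeline type is mapped in a type model."""
    pipeline_type: str                      # type name as emitted by the pipeline
    level: str                              # e.g. "Document" or "Snippet"
    classifications: List[str] = field(default_factory=list)  # intrinsic features
    attributes: List[str] = field(default_factory=list)       # modifier features


def split_features(
    mapping: TypeMapping, features: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate an annotation's features into classification and attribute parts.

    Features named in neither list are treated as not significant and dropped.
    """
    classification = {k: v for k, v in features.items() if k in mapping.classifications}
    attributes = {k: v for k, v in features.items() if k in mapping.attributes}
    return classification, attributes


# Illustrative mapping: a "Finding" type at the Snippet level, with the CUI as
# its classification and presence/absence as an attribute (names are assumed).
mapping = TypeMapping(
    pipeline_type="Finding",
    level="Snippet",
    classifications=["cui"],
    attributes=["presence"],
)

# "C0000000" is a placeholder, not a real UMLS identifier.
features = {"cui": "C0000000", "presence": "absent", "debug_id": 17}
classification, attributes = split_features(mapping, features)
```

In this sketch, two annotations with the same CUI but different `presence` values share a classification and differ only in an attribute, which is the distinction the type model is meant to capture.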
| A primitive Unstructured Information Management Architecture (UIMA™) analysis engine was created for the TMT, so that a type model can be extracted from information contained in a UIMA Common Analysis System (CAS) object, rather than from standalone UIMA type definition XML files as in the previous release. Similarly, an analysis engine was created for generating Workbench annotation files directly from a UIMA pipeline. For instance, the user can create an aggregate UIMA pipeline consisting of an existing domain pipeline plus the Workbench analysis engine, then generate a set of Workbench-readable annotation files by using the UIMA Collection Processing Engine configurator to apply that pipeline to a set of medical reports. Finally, the TMT can now accept a Knowtator schema file as input, in addition to UIMA type definition files, so that a Workbench type model can be extracted from Knowtator annotation sets. For more detailed instructions on how to use the TMT, please see Tutorial II. Type Modeling Tool. |
Workbench: Evaluation Ideology |
| The Workbench is used to display relevance statistics for extracted features. The GUI allows the user to view and explore differences and similarities between designated schema levels of two NLP transformation pipelines. A user can navigate the interface in several ways, including directly investigating false positive or false negative annotations, browsing an annotated file and viewing disagreements, or investigating outcome measures for annotations or for attributes of the annotations. As a user selects items at a particular level, statistics and other information are displayed in several panels, along with the full text and relevant annotations for the selected level. A major part of the design philosophy behind the Workbench was to minimize the amount of user effort required to activate the display functions of the tool. For example, nearly all behaviors of the tool are activated by moving the mouse while holding down the control key. The main display panel contains outcome measure statistics for the two sets of annotations, including counts of true positives, false positives, true negatives, and false negatives, as well as accuracy, positive predictive value, sensitivity, specificity, and negative predictive value. By default, the match criterion is the semantic type assigned to the annotations. For more detailed instructions on how to use the Workbench, please see Tutorial I. Workbench. |
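For readers unfamiliar with the outcome measures listed above, the following Python sketch shows their standard textbook definitions in terms of the four confusion counts. It is not taken from the Workbench source. The function names and the example counts are invented for illustration, and the sketch assumes one annotation set is treated as the reference against which the other is scored.

```python
from typing import Dict, Optional


def safe_div(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def outcome_measures(tp: int, fp: int, tn: int, fn: int) -> Dict[str, Optional[float]]:
    """Compute standard outcome measures from confusion counts."""
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "positive_predictive_value": safe_div(tp, tp + fp),  # also called precision
        "sensitivity": safe_div(tp, tp + fn),                # also called recall
        "specificity": safe_div(tn, tn + fp),
        "negative_predictive_value": safe_div(tn, tn + fn),
    }


# Made-up counts for illustration only; they do not come from any real evaluation.
measures = outcome_measures(tp=40, fp=10, tn=45, fn=5)
for name, value in measures.items():
    print(f"{name}: {value:.3f}" if value is not None else f"{name}: undefined")
```

Returning `None` for an undefined ratio (for example, positive predictive value when a pipeline produces no positive annotations) keeps an empty category from being reported as a misleading zero.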
|
|
Future Development |
| The NLPEW application distribution is still under development and undergoing debugging. Future development plans include enhancing the TMT to allow iterative development of type models by accepting existing Workbench type models as inputs, in addition to UIMA (or other) type system definitions, and increasing the number of Knowtator-generated annotation examples included in the distribution by September 2012. |

