Importance of Measuring Sentential Semantic Knowledge Base of a "Free Text" Medical Corpus

Language
American English
Degree
M.S.
Degree Year
2008-05
Department
School of Informatics
Grantor
Indiana University
Abstract

At present, the healthcare industry uses codified data mainly for billing purposes. Codified data could also be used to improve patient care through decision support and analytical systems. However, to reduce medical errors, these systems need access to a wide range of medical data. Unfortunately, a great deal of data is available only in narrative or free-text form, and natural language processing (NLP) techniques are required to codify it. Structuring narrative data and analyzing its underlying meaning within a medical domain requires extensive knowledge acquired by studying the domain empirically. Existing NLP systems such as MedLEE have a limited ability to analyze free-text medical observations and codify data against Unified Medical Language System (UMLS) codes. MedLEE was successful in extracting meaning from relatively simple sentences in radiological reports, but could not analyze the more complicated sentences that appear frequently in medical reports. An important problem in medical NLP is understanding how many codes or symbols are necessary to codify a medical domain completely. Another problem is determining whether existing medical lexicons such as SNOMED-CT and ICD-9 are suitable for representing the knowledge in medical reports unambiguously. This thesis investigates the problems behind current NLP systems and lexicons, and attempts to estimate the number of symbols or codes required to represent a large corpus of radiology reports. This knowledge will provide a greater understanding of how many symbols may be needed for the complete representation of concepts in other medical domains.

Type
Thesis