Comparison of Deep Learning based Concept Representations for Biomedical Document Clustering

Date
2018
Language
English
Found At
IEEE
Abstract

This research explores document representations built from distributed representations of concepts, together with new weighting schemes for documents. The baseline weighting scheme is the traditional Term Frequency-Inverse Document Frequency (TF-IDF) of the concepts, whereas the two newly proposed schemes consider both local content, through TF-IDF, and associations between concepts. The distributed representations of the concepts are obtained using a deep learning algorithm. The proposed document representations are evaluated on k-means clustering results. The results show that the document representation combining TF-IDF with term-based distributed representations of concepts outperforms the other two on the reported evaluation metrics: F1-measure (80.21%) and Purity (77.1%).
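The general idea of combining TF-IDF weights with distributed concept representations, and of scoring a clustering by purity, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the proposed weighting schemes, so the weighted average below is only one plausible reading, and the concept IDs, the toy embeddings, and the function names `tfidf_weights`, `document_vector`, and `purity` are all hypothetical. The k-means step is omitted; any standard implementation could be applied to the resulting vectors.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {concept: tf-idf} dict per document.

    Each document is a list of concept IDs (hypothetical IDs, for illustration).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each concept.
    doc_freq = Counter(c for doc in docs for c in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            c: (counts[c] / len(doc)) * math.log(n_docs / doc_freq[c])
            for c in counts
        })
    return weights


def document_vector(weights, embeddings, dim):
    """TF-IDF-weighted average of concept embeddings (an assumed scheme)."""
    vec = [0.0] * dim
    total = 0.0
    for concept, w in weights.items():
        if concept not in embeddings:
            continue  # skip concepts that have no embedding
        for i, x in enumerate(embeddings[concept]):
            vec[i] += w * x
        total += w
    return [v / total for v in vec] if total > 0 else vec


def purity(cluster_labels, true_labels):
    """Fraction of documents that belong to the majority class of their cluster."""
    clusters = {}
    for c, t in zip(cluster_labels, true_labels):
        clusters.setdefault(c, Counter())[t] += 1
    majority = sum(max(counts.values()) for counts in clusters.values())
    return majority / len(true_labels)


# Toy data: two documents over three made-up concepts with 2-d embeddings.
docs = [["C1", "C2"], ["C1", "C3"]]
embeddings = {"C1": [1.0, 0.0], "C2": [0.0, 1.0], "C3": [1.0, 1.0]}
weights = tfidf_weights(docs)
vectors = [document_vector(w, embeddings, dim=2) for w in weights]
```

In this toy example the concept shared by both documents receives a TF-IDF weight of zero, so each document vector is determined entirely by its distinguishing concept.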

Cite As
Shah, S., & Luo, X. (2018). Comparison of deep learning based concept representations for biomedical document clustering. In 2018 IEEE EMBS International Conference on Biomedical Health Informatics (BHI) (pp. 349–352). https://doi.org/10.1109/BHI.2018.8333440
Journal
2018 IEEE EMBS International Conference on Biomedical Health Informatics
Type
conference proceedings
Version
Author's manuscript