Journal of Internet Computing and Services
ISSN 2287-1136 (Online) / ISSN 1598-0170 (Print)
https://jics.or.kr/

Representative Labels Selection Technique for Document Cluster using WordNet


Tae-Hoon Kim, Mye Sohn, Journal of Internet Computing and Services, Vol. 18, No. 2, pp. 61-74, Apr. 2017
DOI: 10.7472/jksii.2017.18.2.61
Keywords: Documents Cluster Labeling, Information content, WordNet, Similarity Calculation

Abstract

In this paper, we propose a document cluster labeling method that uses the information content of the words in a cluster to help users understand what the cluster implies. To do so, we first calculate the weight and frequency of the words; these two measures are used to determine the relative weight of each word in the cluster. As a next step, we identify candidate labels using WordNet: each candidate label is the least common hypernym of words in the cluster. Finally, the representative labels are determined with respect to the information content and the weight of the words. To demonstrate the superiority of our method, we perform a heuristic experiment using two measures: the suitability of the candidate label ($Suitability_{cl}$) and the appropriacy of the representative label ($Appropriacy_{rl}$). With the proposed method, the suitability of the candidate label decreases slightly compared with existing methods, but the computational cost is about 20% of that of the conventional methods. We also confirmed that the appropriacy of the representative label is better than that of the existing methods. As a result, the method is expected to help data analysts interpret document clusters more easily.
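The pipeline outlined in the abstract (word weights, candidate labels from least common hypernyms, ranking by information content) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy taxonomy `PARENT` stands in for WordNet, the word counts are invented, and the scoring rule (information content multiplied by the share of cluster frequency a label covers) is an assumption made purely for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy taxonomy (child -> parent), a stand-in for WordNet hypernyms.
PARENT = {
    "dog": "canine", "wolf": "canine", "canine": "carnivore",
    "cat": "feline", "feline": "carnivore", "carnivore": "animal",
    "sparrow": "bird", "bird": "animal", "animal": "entity",
}

def hypernym_path(word):
    """Return the chain from the word up to the root, word first."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def least_common_hypernym(a, b):
    """Return the deepest concept shared by both hypernym paths, or None."""
    ancestors_of_a = set(hypernym_path(a))
    for node in hypernym_path(b):
        if node in ancestors_of_a:
            return node
    return None

def information_content(concept, corpus_counts):
    """Resnik-style IC: -log of the probability mass subsumed by the concept."""
    total = sum(corpus_counts.values())
    subsumed = sum(n for w, n in corpus_counts.items()
                   if concept in hypernym_path(w))
    return -math.log(subsumed / total) if subsumed else float("inf")

def rank_labels(cluster_counts, corpus_counts):
    """Rank candidate labels by IC times covered cluster weight (assumed score)."""
    candidates = set()
    for a, b in combinations(cluster_counts, 2):
        label = least_common_hypernym(a, b)
        if label is not None:
            candidates.add(label)
    total = sum(cluster_counts.values())
    scored = []
    for label in candidates:
        covered = sum(n for w, n in cluster_counts.items()
                      if label in hypernym_path(w))
        score = information_content(label, corpus_counts) * covered / total
        scored.append((label, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented word frequencies for one cluster and for a background corpus.
cluster_counts = Counter({"dog": 5, "wolf": 3, "cat": 2})
corpus_counts = {"dog": 50, "wolf": 10, "cat": 40, "sparrow": 100}

ranked = rank_labels(cluster_counts, corpus_counts)
print(ranked[0][0])  # the highest-scoring candidate label
```

In this toy setting the more specific hypernym wins over the more general one because it carries more information while still covering most of the cluster's word frequency. With real data, the taxonomy lookup would be replaced by WordNet queries and the score by the measures defined in the paper.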



Cite this article
[APA Style]
Kim, T. & Sohn, M. (2017). Representative Labels Selection Technique for Document Cluster using WordNet. Journal of Internet Computing and Services, 18(2), 61-74. DOI: 10.7472/jksii.2017.18.2.61.

[IEEE Style]
T. Kim and M. Sohn, "Representative Labels Selection Technique for Document Cluster using WordNet," Journal of Internet Computing and Services, vol. 18, no. 2, pp. 61-74, 2017. DOI: 10.7472/jksii.2017.18.2.61.

[ACM Style]
Tae-Hoon Kim and Mye Sohn. 2017. Representative Labels Selection Technique for Document Cluster using WordNet. Journal of Internet Computing and Services, 18, 2, (2017), 61-74. DOI: 10.7472/jksii.2017.18.2.61.