Journal of Internet Computing and Services
    ISSN 2287 - 1136 (Online) / ISSN 1598 - 0170 (Print)
    https://jics.or.kr/

Frequently Occurred Information Extraction from a Collection of Labeled Trees


Ju-Ryon Paik, Jung-Hyun Nam, Sung-Joon Ahn, Ung-Mo Kim, Journal of Internet Computing and Services, Vol. 10, No. 5, pp. 65-78, Oct. 2009
Keywords: Tree mining, Maximal frequent subtree, Embedded tree, Pattern-growth method

Abstract

The most common approach to finding valuable information in tree data is to extract frequently occurring subtree patterns. Because frequent tree pattern mining has a wide range of applications, such as XML mining, web usage mining, bioinformatics, and network multicast routing, many algorithms have recently been proposed to find such patterns. However, existing tree mining algorithms suffer from several serious pitfalls when finding frequent tree patterns in massive tree datasets. Some of the major problems are due to (1) modeling data as a hierarchical tree structure, (2) the high computational cost of candidate maintenance, (3) repeated scans of the input dataset, and (4) heavy memory dependency. These problems arise because most of these algorithms are based on the well-known Apriori algorithm and rely on the anti-monotone property for candidate generation and frequency counting. To solve these problems, we adopt a pattern-growth approach rather than the Apriori approach, and we extract maximal frequent subtree patterns instead of all frequent subtree patterns. The proposed method not only removes the step of pruning infrequent subtrees but also eliminates candidate subtree generation entirely. Hence, it significantly improves the whole mining process.
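To make the terms "embedded" and "frequent" concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes a tree is represented as a hypothetical `(label, children)` pair and counts, for each ancestor-descendant label pair (the smallest embedded pattern), how many trees in a collection contain it. The function names, the sample data, and the support threshold are illustrative assumptions only.

```python
from collections import Counter

# Assumed representation: a tree is (label, children), where children
# is a list of trees of the same shape.

def embedded_pairs(tree, ancestors=()):
    """Return the set of (ancestor_label, descendant_label) pairs in one tree."""
    label, children = tree
    # Pair the current node with every node above it, not just its parent.
    pairs = {(a, label) for a in ancestors}
    for child in children:
        pairs |= embedded_pairs(child, ancestors + (label,))
    return pairs

def frequent_pairs(trees, min_support):
    """Keep the pairs that occur in at least min_support trees."""
    counts = Counter()
    for tree in trees:
        # embedded_pairs returns a set, so each tree contributes at most 1.
        counts.update(embedded_pairs(tree))
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical sample collection of labeled trees.
trees = [
    ("book", [("author", [("name", [])]), ("title", [])]),
    ("book", [("editor", [("name", [])]), ("title", [])]),
    ("book", [("title", []), ("price", [])]),
]

result = frequent_pairs(trees, min_support=2)
print(result)
```

In this sample, the pair `("book", "name")` reaches the threshold even though `name` is never a direct child of `book`; that is the sense in which an embedded pattern relaxes the parent-child relation to ancestor-descendant. A full miner would grow such pairs into larger subtrees and report only the maximal ones, which this sketch does not attempt.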



Cite this article
[APA Style]
Paik, J., Nam, J., Ahn, S., & Kim, U. (2009). Frequently Occurred Information Extraction from a Collection of Labeled Trees. Journal of Internet Computing and Services, 10(5), 65-78.

[IEEE Style]
J. Paik, J. Nam, S. Ahn, U. Kim, "Frequently Occurred Information Extraction from a Collection of Labeled Trees," Journal of Internet Computing and Services, vol. 10, no. 5, pp. 65-78, 2009.

[ACM Style]
Ju-Ryon Paik, Jung-Hyun Nam, Sung-Joon Ahn, and Ung-Mo Kim. 2009. Frequently Occurred Information Extraction from a Collection of Labeled Trees. Journal of Internet Computing and Services, 10, 5, (2009), 65-78.