• Journal of Internet Computing and Services
    ISSN 2287 - 1136 (Online) / ISSN 1598 - 0170 (Print)

Opera Clustering: K-means on librettos datasets

Harim Jeong, Joo Hun Yoo, Journal of Internet Computing and Services, Vol. 23, No. 2, pp. 45-52, Apr. 2022
DOI: 10.7472/jksii.2022.23.2.45
Keywords: Music Analysis, Music Information Retrieval, Natural Language Processing, Embedding, Classification, K-Means Clustering


As artificial intelligence methods, especially machine learning, have developed, they have been applied across a widening range of fields. Classical music, however, remains difficult to approach with machine learning techniques. Genre classification and music recommendation systems built on deep learning algorithms are actively used for music in general, but not for classical music. In this paper, we attempt to classify opera, one genre within classical music. To this end, we conducted an experiment to determine which of three basic features of music (composer, period of composition, and emotional atmosphere) is the most suitable classification criterion. To generate emotion labels, we adopted zero-shot classification with four basic emotions: ‘happiness’, ‘sadness’, ‘anger’, and ‘fear’. After embedding the opera librettos with the doc2vec model, we computed the optimal number of clusters using the elbow method. The resulting four centroids were then used in k-means clustering to group the unlabeled libretto dataset. We obtained an optimized clustering based on adjusted Rand index scores and compared the result with the annotated variables of the music. This comparison confirmed that the four clusters computed by the machine after training were most similar to the grouping by period. Additionally, we verified that there was no significant emotional similarity between composer and period. Since the results indicate that period is the appropriate criterion, we hope this will make it easier for listeners to find music that suits their tastes.
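
The clustering and evaluation steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The 2-D points are synthetic stand-ins for doc2vec libretto embeddings, the `period` and `composer` labels are invented for the example, and the farthest-point seeding is chosen here only to make the run reproducible. The sketch shows how the inertia values used by the elbow method and the adjusted Rand index (ARI) against each candidate labeling could be computed.

```python
import random
from collections import Counter
from math import comb


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from all seeds so far."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(sq_dist(p, s) for s in seeds)))
    return [list(s) for s in seeds]


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns (labels, inertia)."""
    centroids = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, inertia


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    sum_cells = sum(comb(v, 2) for v in Counter(zip(a, b)).values())
    sum_a = sum(comb(v, 2) for v in Counter(a).values())
    sum_b = sum(comb(v, 2) for v in Counter(b).values())
    expected = sum_a * sum_b / comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Synthetic data: four well-separated groups standing in for embeddings.
rng = random.Random(0)
centers = [(0, 0), (10, 0), (0, 10), (10, 10)]
embeddings, period, composer = [], [], []
for group, (cx, cy) in enumerate(centers):
    for _ in range(10):
        embeddings.append((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)))
        period.append(group)                 # hypothetical "period" label
        composer.append(len(composer) % 3)   # hypothetical, unrelated label

# Elbow method: inspect how inertia falls as k grows from 1 to 6.
inertias = [kmeans(embeddings, k)[1] for k in range(1, 7)]

# Cluster with k = 4 and compare against each candidate criterion.
labels, _ = kmeans(embeddings, 4)
ari_period = adjusted_rand_index(labels, period)
ari_composer = adjusted_rand_index(labels, composer)
print("inertia by k:", [round(v, 1) for v in inertias])
print("ARI vs period:", ari_period, "| ARI vs composer:", ari_composer)
```

In this toy setup the labeling with the higher ARI is the one that best matches the machine-computed clusters, which mirrors the way the abstract selects among composer, period, and emotion. With real embeddings, a library implementation with randomized restarts would be the more usual choice.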

Cite this article
[APA Style]
Jeong, H. & Yoo, J. (2022). Opera Clustering: K-means on librettos datasets. Journal of Internet Computing and Services, 23(2), 45-52. DOI: 10.7472/jksii.2022.23.2.45.

[IEEE Style]
H. Jeong and J. H. Yoo, "Opera Clustering: K-means on librettos datasets," Journal of Internet Computing and Services, vol. 23, no. 2, pp. 45-52, 2022. DOI: 10.7472/jksii.2022.23.2.45.

[ACM Style]
Harim Jeong and Joo Hun Yoo. 2022. Opera Clustering: K-means on librettos datasets. Journal of Internet Computing and Services, 23, 2, (2022), 45-52. DOI: 10.7472/jksii.2022.23.2.45.