Journal of Internet Computing and Services
    ISSN 2287 - 1136 (Online) / ISSN 1598 - 0170 (Print)
    https://jics.or.kr/

Time Series Forecasting Using RAG-based News Information and Large Language Models: A Case Study on COVID-19 Data


Dongkuk Kim, Yeonjun Choi, Beakcheol Jang, Journal of Internet Computing and Services, Vol. 26, No. 4, pp. 79-88, Aug. 2025
DOI: 10.7472/jksii.2025.26.4.79
Keywords: Time Series Forecasting, Large Language Model (LLM), COVID-19, RAG, Natural language-based time series forecasting

Abstract

Time series forecasting is widely used across industrial and public sectors. In particular, forecasting the spread of COVID-19 plays a crucial role in planning quarantine policies. Traditional numerical forecasting models effectively learn from historical data but struggle to incorporate external factors such as social events or policy changes. In rapidly evolving situations like a pandemic, it is also difficult to secure sufficient high-quality time series data. Recently, studies have attempted to use large language models (LLMs) for time series forecasting. However, most rely on pre-trained internal knowledge, which limits their adaptability to external changes. This study proposes a new LLM-based forecasting method that integrates external information to address these limitations. Using the Retrieval-Augmented Generation (RAG) approach, we retrieve COVID-19-related news from the three days preceding the target date. These documents are summarized and converted into a natural language template, which is then used as input to the LLM. Experiments using daily confirmed COVID-19 case data in South Korea show that the proposed framework consistently outperforms traditional models such as LSTM, TCN, Transformer, and Informer, as well as existing LLM-based methods. By refining and integrating unstructured external information into the LLM input, this approach offers a new direction for time series forecasting. It also shows the potential for broader applications beyond infectious disease prediction, including domains such as economics, climate, and healthcare.
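The retrieval and templating steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the retriever, the summarizer, or the exact template wording, so `NewsItem`, `news_in_window`, `build_prompt`, and the prompt text are all hypothetical. The sketch assumes news summaries already exist and that the three-day window excludes the target date itself.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class NewsItem:
    published: date
    summary: str  # assumed to be summarized upstream (summarizer not shown)


def news_in_window(items, target, days=3):
    """Return items published in the `days` days before `target`.

    Assumption: the target date itself is excluded from the window.
    """
    start = target - timedelta(days=days)
    selected = [it for it in items if start <= it.published < target]
    return sorted(selected, key=lambda it: it.published)


def build_prompt(history, news, target):
    """Render case counts and news summaries into one text prompt.

    `history` is a list of (date, count) pairs. The wording below is a
    hypothetical template, not the one used in the paper.
    """
    lines = ["Daily confirmed COVID-19 cases:"]
    for day, count in history:
        lines.append(f"- {day.isoformat()}: {count}")
    lines.append("Recent news summaries:")
    for it in news:
        lines.append(f"- {it.published.isoformat()}: {it.summary}")
    lines.append(
        f"Predict the number of confirmed cases on {target.isoformat()}."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values for illustration only; not real data.
    target = date(2021, 1, 10)
    items = [
        NewsItem(date(2021, 1, 6), "placeholder summary A"),
        NewsItem(date(2021, 1, 8), "placeholder summary B"),
    ]
    history = [(date(2021, 1, 8), 100), (date(2021, 1, 9), 120)]
    print(build_prompt(history, news_in_window(items, target), target))
```

The resulting string would then be passed to an LLM of choice; that call is omitted here because the abstract does not specify the model or its interface.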


Cite this article
[APA Style]
Kim, D., Choi, Y., & Jang, B. (2025). Time Series Forecasting Using RAG-based News Information and Large Language Models: A Case Study on COVID-19 Data. Journal of Internet Computing and Services, 26(4), 79-88. DOI: 10.7472/jksii.2025.26.4.79.

[IEEE Style]
D. Kim, Y. Choi, and B. Jang, "Time Series Forecasting Using RAG-based News Information and Large Language Models: A Case Study on COVID-19 Data," Journal of Internet Computing and Services, vol. 26, no. 4, pp. 79-88, 2025. DOI: 10.7472/jksii.2025.26.4.79.

[ACM Style]
Dongkuk Kim, Yeonjun Choi, and Beakcheol Jang. 2025. Time Series Forecasting Using RAG-based News Information and Large Language Models: A Case Study on COVID-19 Data. Journal of Internet Computing and Services, 26, 4, (2025), 79-88. DOI: 10.7472/jksii.2025.26.4.79.