Journal of Internet Computing and Services
    ISSN 2287-1136 (Online) / ISSN 1598-0170 (Print)
    https://jics.or.kr/

Advanced Dyna-Q: A Real-World Reinforcement Learning Approach for Systems with Physical Delays


Jinuk Huh, YongJin Kwon, Journal of Internet Computing and Services, Vol. 26, No. 6, pp. 93-100, Dec. 2025
DOI: 10.7472/jksii.2025.26.6.93
Keywords: Real-World RL, Dyna-Q, Advanced Dyna-Q, DQN, Greenhouse Temperature Control, Physical Delay, Physical AI

Abstract

Deep reinforcement learning (DRL) has shown outstanding performance in simulated domains such as strategy games and robotic control, where abundant data and rapid iteration are possible. However, transferring DRL to physical real-world systems, often referred to as physical AI, introduces additional challenges, including limited data availability, sensor and actuator noise, non-stationary dynamics, and physical response delays. In the context of a real farm greenhouse, we observed an average three-minute delay between executing actuator commands, such as activating the heater or air conditioner, and observing a measurable change in indoor temperature. If left unaddressed, such delays can misalign actions with their actual outcomes, thereby degrading learning efficiency and policy stability. To address this issue, we reformulated the Markov Decision Process (MDP) so that each decision step matches the system’s physical response interval, ensuring that state transitions and rewards are attributed to the correct action. Building on this formulation, we propose Advanced Dyna-Q, an extension of the original Dyna-Q algorithm that integrates a simulator-initialized environment model with continual mixed-experience learning from both real and model-generated transitions. The simulator, built using Gaussian Process Regression on historical greenhouse log data, is used solely for transfer learning to initialize the environment model prior to deployment. Once in operation, the model is periodically updated with new real-world data, enabling the generation of increasingly accurate simulated experiences to complement scarce real interactions. We evaluated Advanced Dyna-Q on a real-world greenhouse temperature control task, comparing it with a Deep Q-Network (DQN) baseline trained entirely in simulation. 
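The paper describes Advanced Dyna-Q only at a high level here. As a rough illustration of the underlying Dyna-Q idea of mixing real and model-generated experience, the following tabular sketch interleaves one update from a real transition with several planning updates replayed from a learned model. All names and hyperparameters are illustrative, not the authors' implementation; in the paper, the environment model is a Gaussian Process Regression simulator pre-trained on greenhouse logs rather than this lookup table.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, N_PLANNING = 0.1, 0.95, 10   # illustrative hyperparameters

Q = defaultdict(float)   # Q[(state, action)] -> estimated action value
model = {}               # model[(state, action)] -> (reward, next_state)

def q_update(s, a, r, s_next, n_actions):
    """One-step Q-learning backup."""
    best_next = max(Q[(s_next, a2)] for a2 in range(n_actions))
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

def dyna_q_step(s, a, r, s_next, n_actions=2):
    """Learn from one real transition, then replay N_PLANNING simulated ones."""
    q_update(s, a, r, s_next, n_actions)   # direct RL from the real transition
    model[(s, a)] = (r, s_next)            # update the environment model
    for _ in range(N_PLANNING):            # planning from model-generated experience
        (ps, pa), (pr, ps_next) = random.choice(list(model.items()))
        q_update(ps, pa, pr, ps_next, n_actions)
```

In this reading, "simulator-initialized" would correspond to seeding `model` (or its function-approximation analogue) from the GPR simulator before deployment, after which each real transition both updates the policy directly and refreshes the model used for planning.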
Over a 24-hour evaluation period, Advanced Dyna-Q reduced the mean absolute control error from 0.32 °C to 0.10 °C and increased the time the indoor temperature remained within the target range by more than three hours. The proposed method also demonstrated smoother control behavior, particularly under rapidly changing external conditions, by anticipating environmental changes and adjusting actions proactively. These results demonstrate that aligning the RL framework with the physical characteristics of the target system, combined with simulator-aided transfer learning and continual integration of real and simulated experiences, can yield accurate, stable, and data-efficient policies in real environments. The successful deployment of Advanced Dyna-Q in a real greenhouse suggests its potential as a step toward practical, reliable physical AI systems, with broader applicability to other delayed-response control problems in industrial and agricultural domains.

