Autocorrelation effects in a stochastic-process model for decision making via time series

Dataemia
2 Min Read



arXiv:2603.05559v1 Announce Type: new
Abstract: Decision makers exploiting photonic chaotic dynamics obtained by semiconductor lasers provide an ultrafast approach to solving multi-armed bandit problems by using a temporal optical signal as the driving source for sequential decisions. In such systems, the sampling interval of the chaotic waveform shapes the temporal correlation of the resulting time series, and experiments have reported that decision accuracy depends strongly on this autocorrelation property. However, it remains unclear whether the benefit of autocorrelation can be explained by a minimal mathematical model. Here, we analyze a stochastic-process model of the time-series-based decision making using the tug-of-war principle for solving the two-armed bandit problem, where the threshold and a two-valued Markov signal evolve jointly. Numerical results reveal an environment-dependent structure: negative (positive) autocorrelation is optimal in reward-rich (reward-poor) environments. These findings show that negative autocorrelation of the time series is advantageous when the sum of the winning probabilities is more than $1$, whereas positive autocorrelation is useful when the sum of the winning probabilities is less than $1$. Moreover, the performance is independent of autocorrelation if the sum of the winning probabilities equals $1$, which is mathematically clarified. This study paves the way for improving the decision-making scheme for reinforcement learning applications in wireless communications and robotics.



Source link

Share This Article
Leave a Comment

Leave a Reply

Your email address will not be published. Required fields are marked *

error: Content is protected !!