Photonic artificial intelligence has attracted considerable interest for accelerating machine learning; however, the unique properties of optics have not been fully exploited to achieve higher-order functionalities. Chaotic itinerancy, with its spontaneous transient dynamics among multiple quasi-attractors, can be used to realize brain-like functionalities. In this study, we numerically and experimentally investigate a method for controlling chaotic itinerancy in a multimode semiconductor laser to solve a machine learning task, namely, the multiarmed bandit problem, which is fundamental to reinforcement learning. The proposed method uses chaotic itinerant motion in mode competition dynamics controlled via optical injection. We found that its exploration mechanism differs fundamentally from that of conventional search algorithms and is highly scalable, outperforming conventional approaches for large-scale bandit problems. This study paves the way for using chaotic itinerancy in photonic hardware accelerators to solve complex machine learning tasks effectively.
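For readers unfamiliar with the multiarmed bandit problem, the following minimal sketch shows the task in software: a player repeatedly chooses one of several slot machines (arms) with unknown reward probabilities and tries to maximize the total reward. The strategy shown is epsilon-greedy, a common software baseline chosen here purely for illustration; it is not the laser-based method of this study and is not necessarily one of the conventional approaches used for comparison. The function name `run_epsilon_greedy` and all parameter values are hypothetical.

```python
import random


def run_epsilon_greedy(probs, steps=2000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy.

    probs   : reward probability of each arm (unknown to the player)
    steps   : number of plays
    epsilon : probability of choosing a random arm (exploration)
    Returns (pull counts per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(probs)
    counts = [0] * n_arms      # how often each arm was played
    values = [0.0] * n_arms    # running mean reward of each arm
    total = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far
            # (ties resolve to the lowest index).
            arm = max(range(n_arms), key=lambda i: values[i])
        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the mean reward estimate.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return counts, total


if __name__ == "__main__":
    counts, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts, "total reward:", total)
```

The balance between exploring uncertain arms and exploiting the best-known arm is set here by the fixed parameter `epsilon`; in the proposed method, that exploration is instead provided by the itinerant mode-competition dynamics of the laser.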