TY  - JOUR
AU  - Gambo, Ishaya
AU  - Agbonkhese, Christopher
AU  - Aina, Segun
AU  - Otegbayo, Mogboluwaga Tayo
AU  - Adekunle, Johnson Bayo
AU  - Odetola, Israel
AU  - Gambo, Omobola
AU  - Oluwadare, Tolulope
AU  - Odetola, Oluwatoni
PY  - 2024
TI  - Reinforcement Learning in Financial Services: Modelling Payment Switching as a Multi-Armed Bandit Problem
JF  - Journal of Computer Science
VL  - 20
IS  - 11
DO  - 10.3844/jcssp.2024.1519.1529
UR  - https://thescipub.com/abstract/jcssp.2024.1519.1529
AB  - The ever-evolving landscape of digital payments demands continuous innovation and self-improvement. This study addresses this imperative by simulating a model for payment routing, a crucial aspect of the digital payment ecosystem. Industry professionals were interviewed to inform the approach, and they emphasized data randomization for effective data collection. Using Python, a randomized dataset was created and three Reinforcement Learning (RL) algorithms were implemented and evaluated: Epsilon Greedy, Upper Confidence Bound (UCB), and Thompson Sampling. The paper adopts the Multi-Armed Bandit (MAB) framework to model payment routing as a resource allocation problem, offering a computational approach to real-world resource allocation dilemmas. Simulation eliminates real-time transaction costs, allowing the study to focus on algorithmic approaches without implications for customers, businesses, or payment providers. Among the RL algorithms studied, UCB emerges as the most effective in addressing this MAB problem, corroborating findings from prior research. The study suggests both the potential of modeling real-world problems as MAB problems and the superior performance of UCB among the RL algorithms evaluated. The paper underscores the need for increased focus on non-consumer-facing aspects of the financial services industry, emphasizing cross-disciplinary research to create infrastructure and software solutions. Researchers can extend this study by applying MAB algorithms in other domains where a system must choose among several options. The simulation-based approach offers a cost-effective means of testing system performance and hypotheses across a range of industries.