This is a React-based simulator for the k-armed bandit problem, designed to test and visualize reinforcement learning strategies with an interactive UI styled using Tailwind CSS.
- React - A JavaScript library for building user interfaces.
- Tailwind CSS - A utility-first CSS framework for rapid UI development.
First, ensure you have Node.js installed to manage your project's dependencies.
Clone the project and install dependencies:
git clone https://github.com/mweglowski/bandit_demonstration.git
cd bandit_demonstration
npm install

To run the application in development mode:

npm start

This will open the simulator in your default web browser. For production builds, you can use:

npm run build

- Multiple bandits with unique probabilistic reward distributions.
- Interactive interface for 'pulling' bandit arms, built with React.
- Responsive and modern UI using Tailwind CSS.
- Visualization of action counts and estimated values.
The simulator focuses on the ε-greedy strategy, which balances exploration and exploitation: with probability 1 − ε it selects the best-known action, and with probability ε it explores a random action.
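As a rough illustration of the ε-greedy rule, here is a minimal sketch in plain JavaScript. It is not taken from the simulator's source: the function name `selectAction`, its parameters, and the injectable `rng` argument are all hypothetical, and the sketch assumes exploration picks uniformly among all arms and that ties are broken in favor of the first arm.

```javascript
// Hypothetical helper, not the simulator's actual code.
// estimates: array of current estimated values, one per arm.
// epsilon:   exploration probability in [0, 1].
// rng:       function returning a number in [0, 1); defaults to Math.random.
function selectAction(estimates, epsilon, rng = Math.random) {
  if (rng() < epsilon) {
    // Explore: pick an arm index uniformly at random (assumption).
    return Math.floor(rng() * estimates.length);
  }
  // Exploit: pick the arm with the highest estimate; the first arm wins ties.
  let best = 0;
  for (let i = 1; i < estimates.length; i++) {
    if (estimates[i] > estimates[best]) best = i;
  }
  return best;
}
```

Passing a fixed `rng` makes the choice deterministic, which can help when testing; with `epsilon` set to 0 the function always exploits.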
The simulator updates the estimated action value Q using the formula:
Q(n+1) = Q(n) + (1/n) * (Rn - Q(n))
Where:
- Q(n+1) is the new estimate,
- Q(n) is the current estimate,
- Rn is the reward received,
- n is the number of times the action has been chosen.
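The update formula above can be sketched in a few lines of JavaScript. This is an illustrative example under stated assumptions, not the simulator's implementation: `recordReward` and the `{ counts, estimates }` state shape are hypothetical names, and the function returns new arrays rather than mutating its input so that it would fit React's state update model.

```javascript
// Hypothetical helper, not the simulator's actual code.
// state:  { counts: number[], estimates: number[] }, one entry per arm.
// action: index of the arm that was pulled.
// reward: reward received for that pull (Rn in the formula).
function recordReward(state, action, reward) {
  // Copy the arrays so the previous state object is left untouched.
  const counts = state.counts.slice();
  const estimates = state.estimates.slice();

  // n: number of times this action has been chosen, including this pull.
  counts[action] += 1;

  // Q(n+1) = Q(n) + (1/n) * (Rn - Q(n))
  estimates[action] += (reward - estimates[action]) / counts[action];

  return { counts, estimates };
}

// Usage: three pulls of arm 0 with rewards 1, 0, 1.
let state = { counts: [0, 0], estimates: [0, 0] };
for (const r of [1, 0, 1]) {
  state = recordReward(state, 0, r);
}
```

With a step size of 1/n, the estimate for an arm is the running average of the rewards seen for it, so no reward history needs to be stored.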
After launching the simulator, interact with the UI by selecting a bandit to 'pull'. Observe the algorithm's performance and how estimated values update based on the reward distributions.
Explore the simulator online at https://bandit-problem-simulator.vercel.app/.
I welcome contributions! If you have suggestions or are interested in improving the k-armed bandit simulator, please feel free to fork the repository, make changes, and submit a pull request.
Inspired by the foundational reinforcement learning work of Sutton and Barto.




