---
title: ai-q-learning-vacuum-robot-sm
emoji: 🏠⚙️🤖
colorFrom: pink
colorTo: green
sdk: gradio
sdk_version: "4.12.0"
app_file: app.py
pinned: false
---

# AI-Q-Learning-Vacuum-Robot-Cleaner-Simulation

This project is an experimental application (v2.0).

## Project Overview

This application allows users to train a robot vacuum cleaner to learn to clean a grid-based environment.

## Technical Details

The project utilizes the following technologies:

- **Q-Learning**: Reinforcement learning algorithm for training the robot.
- **Gradio**: Provides an interactive web interface for users to set up the environment and adjust parameters.

## Instructions

**1- Set up the environment**:

- Edit the grid (0 = Empty, 1 = Dirt, 2 = Wall, 3 = Vacuum) and click Generate Environment. A minimal encoding of such a grid is sketched in the Example Sketches section at the end of this README.

**2- Train the robot vacuum cleaner**:

- Reinforcement learning: Q-Learning is used to train the robot vacuum cleaner (see the training-loop sketch at the end of this README).
- Start position check: ensures that the robot does not start on a dirt or wall cell.
- Dirt cleaning: after finding dirt, the robot cleans it, updating that cell to 0.
- Epsilon decay: the decay factor was reduced to 0.999, which lets the robot explore for longer before it shifts toward exploitation.
- House state reset: the house is reset every episode so that dirt reappears and the robot has new opportunities to learn.
- Stuck detection: a mechanism was added to ensure that the robot does not get stuck in a cycle of invalid states.

Tuning suggestions:

- Increase the learning rate: set alpha higher (e.g. 0.2) to see if the robot learns faster; if it converges too slowly or too quickly, adjust alpha accordingly.
- Increase the discount factor: set gamma higher (e.g. 0.95) to give more value to future rewards.
- Randomize the initial state more: starting from a greater variety of initial positions varies the training experiences and helps the robot explore more of the environment.
- Reduce the reward for dirt: a smaller immediate reward can push the robot to learn other parts of the environment.
- Add a penalty for movement: a small penalty on each move encourages the robot to find dirt more efficiently.
- Add more dirt or obstacles: more elements make the environment more challenging and interesting for the robot.
- Test different exploration-exploitation (epsilon) policies: experiment with different epsilon decay strategies to find a good balance between exploration and exploitation.
- Increase the number of episodes: in some cases, training for more episodes can further improve the robot's performance.

**3- Simulate**:

- Create a new simulation grid (0 = Empty, 1 = Dirt, 2 = Wall, 3 = Vacuum), set the number of iterations (episodes/epochs), and simulate the robot (a greedy rollout is sketched at the end of this README).

## License

ECL

## Developer Information

Developed by Ramon Mayor Martins, Ph.D. (2024)

- Email: rmayormartins@gmail.com
- Homepage: [https://rmayormartins.github.io/](https://rmayormartins.github.io/)
- Twitter: @rmayormartins
- GitHub: [https://github.com/rmayormartins](https://github.com/rmayormartins)

## Acknowledgements

Special thanks to Instituto Federal de Santa Catarina (Federal Institute of Santa Catarina), IFSC-São José, Brazil.
## Contact

For any queries or suggestions, please contact the developer using the information provided above.
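
## Example Sketches

The snippets below are minimal, illustrative sketches in Python, not the app's actual source code: the cell codes match the grid editor above, but the grid layout, reward values, and function names (`step`, `train`, `simulate`) are assumptions chosen for the example.

First, a grid encoded with the same cell codes as the editor, plus a simple reward scheme along the lines suggested in the tuning list: a bonus for cleaning dirt and a small penalty per move.

```python
import numpy as np

# Cell codes used by the grid editor above.
EMPTY, DIRT, WALL, VACUUM = 0, 1, 2, 3

# A small 5x5 house: two wall segments, three dirt cells, vacuum at top-left.
house = np.array([
    [VACUUM, 0, 0, 1, 0],
    [0,      2, 2, 0, 0],
    [0,      0, 1, 0, 2],
    [0,      2, 0, 0, 0],
    [0,      0, 0, 1, 0],
])

# Illustrative reward values (assumptions, not the app's actual numbers).
DIRT_REWARD = 10.0    # reward for cleaning a dirt cell
MOVE_PENALTY = -0.1   # small per-move penalty to encourage efficiency

def step(house, pos, action):
    """Apply one move in-place; return (new_pos, reward).

    Actions: 0 = up, 1 = down, 2 = left, 3 = right.
    """
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    r = pos[0] + moves[action][0]
    c = pos[1] + moves[action][1]
    rows, cols = house.shape
    # Moving off the grid or into a wall leaves the robot where it is.
    if not (0 <= r < rows and 0 <= c < cols) or house[r, c] == WALL:
        return pos, MOVE_PENALTY
    reward = MOVE_PENALTY
    if house[r, c] == DIRT:
        house[r, c] = EMPTY  # cleaning updates the cell to 0
        reward += DIRT_REWARD
    return (r, c), reward
```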
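Next, a tabular Q-learning loop that uses the parameters discussed in step 2 (alpha = 0.2, gamma = 0.95, epsilon decay = 0.999), resets the house every episode, never starts on dirt or a wall, and caps the steps per episode so the robot cannot get stuck in a loop. It builds on `step` and the cell codes from the previous sketch, and for simplicity treats only the robot's position as the state, which is a simplification of the full problem.

```python
import random

def train(initial_house, episodes=2000, alpha=0.2, gamma=0.95,
          epsilon=1.0, epsilon_decay=0.999, max_steps=200):
    """Tabular Q-learning over (row, col, action); returns the Q-table."""
    rows, cols = initial_house.shape
    q = np.zeros((rows, cols, 4))

    for _ in range(episodes):
        house = initial_house.copy()      # reset the house so dirt reappears
        starts = [(r, c) for r in range(rows) for c in range(cols)
                  if house[r, c] == EMPTY]
        pos = random.choice(starts)       # random start, never on dirt or a wall

        for _ in range(max_steps):        # step cap: the robot cannot loop forever
            if random.random() < epsilon:  # epsilon-greedy: explore ...
                action = random.randrange(4)
            else:                          # ... or exploit the best known action
                action = int(np.argmax(q[pos[0], pos[1]]))
            new_pos, reward = step(house, pos, action)
            # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            best_next = np.max(q[new_pos[0], new_pos[1]])
            q[pos[0], pos[1], action] += alpha * (
                reward + gamma * best_next - q[pos[0], pos[1], action])
            pos = new_pos
            if not (house == DIRT).any():  # episode ends once the house is clean
                break
        epsilon *= epsilon_decay          # slow decay (0.999) = longer exploration
    return q
```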
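Finally, a greedy rollout in the spirit of step 3 (Simulate): the trained robot exploits its Q-table with no exploration. It reuses `step`, `train`, and `house` from the sketches above. Because this sketch's state ignores the dirt layout, the greedy robot may wander; the step cap keeps the rollout bounded.

```python
def simulate(initial_house, q, max_steps=50):
    """Roll out the learned (greedy) policy and print each step."""
    house = initial_house.copy()
    starts = [(r, c) for r in range(house.shape[0])
              for c in range(house.shape[1]) if house[r, c] == EMPTY]
    pos = random.choice(starts)
    for t in range(max_steps):
        action = int(np.argmax(q[pos[0], pos[1]]))  # exploit only, no exploration
        pos, reward = step(house, pos, action)
        print(f"step {t:3d}: pos={pos} reward={reward:+.1f}")
        if not (house == DIRT).any():
            print("house is clean")
            break

q = train(house)    # train on the grid from the first sketch
simulate(house, q)
```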