---
title: ai-q-learning-vacuum-robot-sm
emoji: 🏠⚙️🤖
colorFrom: pink
colorTo: green
sdk: gradio
sdk_version: 4.12.0
app_file: app.py
pinned: false
---

AI-Q-Learning-Vacuum-Robot-Cleaner-Simulation

This project is an experimental application v2.0 ...

Project Overview

This application allows users to train a robot vacuum cleaner to explore a grid environment and clean up dirt using Q-learning.

Technical Details

The project utilizes the following technologies:

  • Q-Learning: Reinforcement learning algorithm for training the robot (a sketch of the update rule follows this list).
  • Gradio: Provides an interactive web interface for editing the environment grid and adjusting training parameters.
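
At its core, training applies the standard Q-learning update after every move. A minimal sketch of that update is below; the grid size, variable names, and hyperparameter values are illustrative assumptions, not the exact code in app.py:

```python
import numpy as np

# Hypothetical illustration of the Q-learning update; the real app.py may use
# different names, grid size, and hyperparameters.
n_states, n_actions = 25, 4           # e.g. a 5x5 grid, actions: up/down/left/right
Q = np.zeros((n_states, n_actions))   # Q-table: expected return per (state, action)

alpha, gamma = 0.1, 0.9               # learning rate and discount factor

def q_update(state, action, reward, next_state):
    """Standard off-policy temporal-difference (Q-learning) update."""
    best_next = np.max(Q[next_state])   # value of the greedy next action
    Q[state, action] += alpha * (reward + gamma * best_next - Q[state, action])
```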

Instructions

1- Set up the environment:

  • Edit the grid: 0 = Empty, 1 = Dirt, 2 = Wall, 3 = Vacuum, then click Generate Environment (an example encoding is sketched below).
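
For example, a hand-edited environment could look like the grid below. The layout itself is only an illustration; the app expects the same 0/1/2/3 cell codes:

```python
import numpy as np

# Illustrative 5x5 environment using the interface's cell codes:
# 0 = Empty, 1 = Dirt, 2 = Wall, 3 = Vacuum (starting position)
grid = np.array([
    [3, 0, 0, 1, 0],
    [0, 2, 0, 2, 0],
    [0, 0, 1, 0, 0],
    [1, 2, 0, 2, 0],
    [0, 0, 0, 0, 1],
])

# The environment should contain exactly one vacuum start cell.
assert np.count_nonzero(grid == 3) == 1
```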

2- Train the robot vacuum cleaner:

  • Reinforcement learning: Q-learning is used to train the robot vacuum cleaner (a training-loop sketch follows this list).
  • Start Position Check: Ensure that the robot does not start on a dirt or wall cell.
  • Dirt Cleaning: After finding dirt, the robot cleans it, setting that cell back to 0.
  • Reduce Epsilon Decay Rate: A slower epsilon decay lets the robot explore for longer before it shifts toward exploitation.
  • Reset the Home State Periodically: To ensure that dirt reappears and the robot has new opportunities to learn.
  • Check that the Robot is Not Stuck: A mechanism was added to ensure that the robot does not get trapped in a cycle of invalid states.
  • Epsilon Decay: A slower decay rate (reduced to 0.999) allows for more exploration.
  • House State Reset: The house is reset every episode to ensure that dirt is present in each new episode.
  • Increase the learning rate: Set alpha higher (e.g. 0.2) to see if the robot learns faster.
  • Increase the discount factor: Set gamma higher (e.g. 0.95) to give more value to future rewards.
  • Add more randomness to the choice of initial state: This can help to vary training experiences more.
  • Reduce the reward when encountering dirt: A smaller immediate reward can push the robot to keep learning about other parts of the environment.
  • Add penalties for movement: Adding a small penalty for each movement can encourage the robot to find dirt more efficiently.
  • Increase the variation of initial states: Starting from a greater variety of initial positions can help the robot explore more of the environment.
  • Change the learning rate (alpha): If you notice that the robot is converging too slowly or too quickly, adjusting the learning rate can help.
  • Add more dirt or obstacles: Adding more elements to the environment can make the problem more challenging and interesting for the robot.
  • Test different exploration-exploitation (epsilon) policies: Experiment with different epsilon decay strategies to find a good balance between exploration and exploitation.
  • Increase the number of episodes: In some cases, training for more episodes can help to further improve the robot's performance.
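
The tips above map onto a single training loop. The sketch below shows one way the pieces (random start on a free cell, per-episode house reset, epsilon-greedy action choice with slow decay, movement penalty, dirt reward, and the Q-update with alpha and gamma) could fit together; the reward values, grid size, and names are assumptions for illustration rather than the exact settings in app.py:

```python
import numpy as np
import random

# Hypothetical training loop combining the tips above.
ROWS, COLS, ACTIONS = 5, 5, 4                   # actions: up, down, left, right
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

alpha, gamma = 0.2, 0.95                        # learning rate, discount factor
epsilon, eps_decay, eps_min = 1.0, 0.999, 0.05  # slow decay -> longer exploration

Q = np.zeros((ROWS * COLS, ACTIONS))

def train(initial_grid, episodes=2000, max_steps=200):
    global epsilon
    for _ in range(episodes):
        grid = initial_grid.copy()              # reset the house so dirt reappears
        # random start on a cell that is neither dirt nor wall
        free = [(r, c) for r in range(ROWS) for c in range(COLS) if grid[r, c] in (0, 3)]
        r, c = random.choice(free)
        for _ in range(max_steps):              # step cap avoids getting stuck in cycles
            state = r * COLS + c
            if random.random() < epsilon:
                action = random.randrange(ACTIONS)   # explore
            else:
                action = int(np.argmax(Q[state]))    # exploit
            dr, dc = MOVES[action]
            nr, nc = r + dr, c + dc
            if not (0 <= nr < ROWS and 0 <= nc < COLS) or grid[nr, nc] == 2:
                reward, nr, nc = -5, r, c       # bumped a wall or the boundary
            elif grid[nr, nc] == 1:
                reward = 10                     # found dirt
                grid[nr, nc] = 0                # clean it: the cell becomes empty
            else:
                reward = -1                     # small penalty for each movement
            next_state = nr * COLS + nc
            Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
            r, c = nr, nc
            if not (grid == 1).any():           # all dirt cleaned: end the episode
                break
        epsilon = max(eps_min, epsilon * eps_decay)
    return Q
```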

3- Simulate:

  • New Simulation Grid: edit the grid (0 = Empty, 1 = Dirt, 2 = Wall, 3 = Vacuum), set the number of iterations (episodes/epochs), and simulate the robot (a greedy-rollout sketch follows below).
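
Once training has finished, the simulation step is essentially a greedy rollout of the learned Q-table. This sketch reuses the illustrative ROWS, COLS, MOVES, and trained Q from the training sketch above:

```python
import numpy as np

def simulate(grid, Q, max_steps=100):
    """Greedy rollout of a trained Q-table; returns the visited cells.
    Reuses ROWS, COLS, and MOVES from the training sketch above."""
    grid = grid.copy()
    r, c = map(int, np.argwhere(grid == 3)[0])   # start at the vacuum cell
    path = [(r, c)]
    for _ in range(max_steps):
        state = r * COLS + c
        action = int(np.argmax(Q[state]))        # always exploit during simulation
        dr, dc = MOVES[action]
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS and grid[nr, nc] != 2:
            r, c = nr, nc                        # legal move: step into the cell
            if grid[r, c] == 1:
                grid[r, c] = 0                   # clean the dirt cell
        path.append((r, c))
        if not (grid == 1).any():                # stop once everything is clean
            break
    return path
```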

License

ECL

Developer Information

Developed by Ramon Mayor Martins, Ph.D. (2024)

Acknowledgements

Special thanks to Instituto Federal de Santa Catarina (Federal Institute of Santa Catarina) IFSC-São José-Brazil.

Contact

For any queries or suggestions, please contact the developer using the information provided above.