---
title: ai-q-learning-vacuum-robot-sm
emoji: 🏠⚙️🤖
colorFrom: pink
colorTo: green
sdk: gradio
sdk_version: 4.12.0
app_file: app.py
pinned: false
---
AI-Q-Learning-Vacuum-Robot-Cleaner-Simulation
This project is an experimental application (v2.0) ...
Project Overview
This application allows users to train a robot vacuum cleaner, via Q-learning, to learn a grid environment and clean the dirt in it.
Technical Details
The project utilizes the following technologies:
- Q-Learning: A reinforcement learning algorithm used to train the robot.
- Gradio: Provides an interactive web interface for users to edit the environment grid and adjust training parameters.
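As a rough illustration of the Gradio side, the sketch below wires the training parameters into a simple interface. The `train_robot` function, the component labels, and the default values are assumptions made for this example and are not the actual code in app.py.

```python
import gradio as gr

# Minimal sketch of a Gradio interface in the spirit of this Space.
# The function name, labels, and defaults are assumptions, not app.py itself.
def train_robot(grid_text, alpha, gamma, episodes):
    # Here the real app would parse the grid, run Q-learning, and render results.
    return f"Trained for {int(episodes)} episodes (alpha={alpha}, gamma={gamma})."

demo = gr.Interface(
    fn=train_robot,
    inputs=[
        gr.Textbox(label="Grid (0=Empty, 1=Dirt, 2=Wall, 3=Vacuum)"),
        gr.Slider(0.01, 1.0, value=0.1, label="alpha (learning rate)"),
        gr.Slider(0.1, 0.99, value=0.9, label="gamma (discount factor)"),
        gr.Number(value=1000, label="episodes"),
    ],
    outputs=gr.Textbox(label="Training summary"),
)

if __name__ == "__main__":
    demo.launch()
```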
Instructions
1- Set up the environment:
- Edit the grid (0 = Empty, 1 = Dirt, 2 = Wall, 3 = Vacuum) and click Generate Environment (see the sketch below for one way to represent this encoding).
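As a rough illustration of this encoding, the snippet below builds a small hypothetical grid; the layout and variable names are invented for the example and do not come from the app.

```python
import numpy as np

# Hypothetical 5x5 environment using the same cell codes as the UI:
# 0 = empty, 1 = dirt, 2 = wall, 3 = vacuum (starting cell).
grid = np.array([
    [3, 0, 1, 0, 0],
    [0, 2, 2, 0, 1],
    [0, 0, 0, 2, 0],
    [1, 0, 2, 0, 0],
    [0, 0, 0, 0, 1],
])

dirt_cells = list(zip(*np.where(grid == 1)))   # cells the robot must clean
start_cell = tuple(np.argwhere(grid == 3)[0])  # robot's initial position
print(dirt_cells, start_cell)
```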
2- Train the robot vacuum cleaner:
- Reinforcement learning: Q-learning is used to train the robot vacuum cleaner (a training-loop sketch follows this list).
- Start Position Validation: Ensure that the robot does not start on a dirt or wall cell.
- Dirt Cleaning: After finding dirt, the robot cleans it, updating that cell to 0.
- Slow the Epsilon Decay: A slower decay keeps epsilon high for longer, so the robot explores more before shifting toward exploitation.
- Reset the House State Periodically: So that dirt reappears and the robot has new opportunities to learn.
- Check that the Robot is Not Stuck: A mechanism was added to ensure that the robot does not get stuck in a cycle of invalid states.
- Epsilon Decay: A decay rate of 0.999 allows for more exploration.
- House State Reset: The house is reset every episode so that dirt is present in each new episode.
- Increase the learning rate: Set alpha higher (e.g., 0.2) to see if the robot learns faster.
- Increase the discount factor: Set gamma higher (e.g., 0.95) to give more weight to future rewards.
- Add more randomness to the choice of initial state: This can help to vary training experiences more.
- Reduce the reward when encountering dirt: Reducing the direct reward can make the robot try harder to learn other parts of the environment.
- Add penalties for movement: Adding a small penalty for each movement can encourage the robot to find dirt more efficiently.
- Increase the variation of initial states: Starting from a greater variety of initial positions can help the robot explore more of the environment.
- Change the learning rate (alpha): If you notice that the robot is converging too slowly or too quickly, adjusting the learning rate can help.
- Add more dirt or obstacles: Adding more elements to the environment can make the problem more challenging and interesting for the robot.
- Test different exploration-exploitation (epsilon) policies: Experiment with different epsilon decay strategies to find a good balance between exploration and exploitation.
- Increase the number of episodes: In some cases, training for more episodes can help to further improve the robot's performance.
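The sketch below pulls the tips above into one minimal tabular Q-learning loop: epsilon decay, a per-episode house reset, valid start cells, a small step penalty, and a positive dirt reward. All names, hyperparameter values, and reward values are illustrative assumptions, not the code used in app.py.

```python
import random
import numpy as np

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def train(initial_grid, episodes=5000, alpha=0.2, gamma=0.95,
          epsilon=1.0, epsilon_decay=0.999, epsilon_min=0.05,
          dirt_reward=10, step_penalty=-1, max_steps=200):
    rows, cols = initial_grid.shape
    Q = np.zeros((rows * cols, len(ACTIONS)))
    # Start only on empty cells, never on dirt or a wall.
    valid_starts = [tuple(p) for p in np.argwhere(initial_grid == 0)]

    for _ in range(episodes):
        grid = initial_grid.copy()           # reset the house so dirt reappears
        r, c = random.choice(valid_starts)   # vary the initial state
        for _ in range(max_steps):           # step limit also guards against getting stuck
            state = r * cols + c
            if random.random() < epsilon:                # explore
                action = random.randrange(len(ACTIONS))
            else:                                        # exploit
                action = int(np.argmax(Q[state]))
            dr, dc = ACTIONS[action]
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr, nc] == 2:
                nr, nc = r, c                            # blocked by a wall or the border
            reward = step_penalty
            if grid[nr, nc] == 1:
                reward += dirt_reward
                grid[nr, nc] = 0                         # clean the dirt
            next_state = nr * cols + nc
            Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state])
                                         - Q[state, action])
            r, c = nr, nc
            if not (grid == 1).any():                    # house is clean, end the episode
                break
        epsilon = max(epsilon_min, epsilon * epsilon_decay)
    return Q
```

With the hypothetical grid from step 1, `Q = train(grid)` would produce a Q-table that the simulation step can replay.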
3- Simulate:
- New Simulation Grid: 0 = Empty, 1 = Dirt, 2 = Wall, 3 = Vacuum. Set the number of iterations (episodes/epochs) and simulate the robot (a greedy rollout sketch follows below).
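For the simulation step, a greedy rollout of a learned Q-table might look like the sketch below; it reuses the same cell codes and is, again, an illustrative assumption rather than the app's actual code.

```python
import numpy as np

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def simulate(grid, Q, start, max_steps=100):
    rows, cols = grid.shape
    grid = grid.copy()
    r, c = start
    path = [(r, c)]
    for _ in range(max_steps):
        action = int(np.argmax(Q[r * cols + c]))      # always exploit the learned policy
        dr, dc = ACTIONS[action]
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] != 2:
            r, c = nr, nc                             # move only into non-wall cells
        if grid[r, c] == 1:
            grid[r, c] = 0                            # clean dirt on arrival
        path.append((r, c))
        if not (grid == 1).any():                     # stop once the house is clean
            break
    return path
```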
License
ECL
Developer Information
Developed by Ramon Mayor Martins, Ph.D. (2024)
- Email: [email protected]
- Homepage: https://rmayormartins.github.io/
- Twitter: @rmayormartins
- GitHub: https://github.com/rmayormartins
Acknowledgements
Special thanks to Instituto Federal de Santa Catarina (Federal Institute of Santa Catarina) IFSC-São José-Brazil.
Contact
For any queries or suggestions, please contact the developer using the information provided above.