Update readme.md
readme.md
CHANGED
@@ -1,72 +1,28 @@
-# Context-Aware Multimodal Assistant
+# Context-Aware Multimodal Assistant
 
-##
+## Overview
+This project builds a multimodal assistant that helps users manage cognitive load by detecting stress from voice recordings and facial images. Based on the stress level, it simplifies or rephrases user tasks or messages to make them easier to understand.
 
-
+## Features
+- Detects stress from voice and face inputs (placeholder logic, easy to replace).
+- Simplifies text input using the `facebook/bart-large-cnn` model from Hugging Face.
+- Interactive UI built with Gradio for easy testing and deployment.
 
-
+## How to Use
+1. Upload a voice recording (.wav) and a face image.
+2. Enter the task or message you want help with.
+3. The assistant detects your stress level and simplifies your input accordingly.
 
-##
+## Technologies
+- Hugging Face Transformers (`facebook/bart-large-cnn`)
+- Gradio for the user interface
+- Torch and other libraries for processing
 
-
--
--
--
+## Future Work
+- Integrate real stress detection models for voice and facial expressions.
+- Add real-time input support.
+- Extend functionality for calendar and email summarization.
 
 ---
 
-
-
-- Reduces **cognitive overload** by presenting information in a simpler, clearer way.
-- Supports users in **staying focused and productive** during stressful or distracting moments.
-- Offers a **personalized interaction** by combining multimodal inputs (voice and vision) to better understand user context.
-- Makes digital communication and task management feel less daunting when the user is under pressure.
-
----
-
-## Key Features & Technologies Used
-
-- **Multimodal Inputs:**
-  - **Speech (voice input):** Users upload voice recordings that the system analyzes for stress cues.
-  - **Vision (facial images):** Webcam images are analyzed to detect facial expressions related to stress.
-
-- **Stress Detection Models:**
-  - Placeholder dummy functions simulate stress detection for voice and face input (replaceable with real pretrained models).
-
-- **Task Simplification:**
-  - Uses the **T5-base** transformer model from Hugging Face for natural language simplification and paraphrasing.
-  - Prompts guide the model to adapt outputs based on detected stress levels.
-
-- **User Interface:**
-  - Built with **Gradio** for easy prototyping and interaction within a Google Colab notebook.
-  - Planned deployment on **Hugging Face Spaces** with a simple UI for user-friendly access.
-
----
-
-## How to run the project
-
-1. Run the app locally or in Google Colab by uploading voice recordings and face images.
-2. Type your task or message into the input box.
-3. The assistant detects your stress level from voice and facial cues, then simplifies your message if needed.
-4. Get a clear, simplified response to help you manage your cognitive load.
-
----
-
-## Future Improvements
-
-- Replace dummy stress detection functions with real pretrained models for accurate voice and facial stress recognition.
-- Add real-time stress detection via webcam and live microphone.
-- Extend to handle calendar and email data for task summarization.
-- Personalize responses based on user history and preferences.
-- Add multilingual support for wider accessibility.
-
----
-
-## Acknowledgments
-
-This project leverages pretrained models and libraries from Hugging Face Transformers and Gradio, enabling accessible and powerful multimodal AI applications.
-
----
-
-Feel free to reach out if you want to collaborate or improve the project!
-
+Feel free to explore and contribute!
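
The flow the updated README describes (placeholder stress detection, then simplification only when stress is detected) could be sketched roughly as below. This is a minimal illustration and not the Space's actual code: the function names, the fixed dummy scores, the averaging rule, the 0.5 threshold, and the `first_sentence` stand-in summarizer are all assumptions made for this example.

```python
from typing import Callable, Optional


def detect_voice_stress(audio_path: Optional[str]) -> float:
    """Hypothetical placeholder: fixed score when a recording is supplied."""
    return 0.8 if audio_path else 0.0


def detect_face_stress(image_path: Optional[str]) -> float:
    """Hypothetical placeholder: fixed score when an image is supplied."""
    return 0.6 if image_path else 0.0


def stress_level(audio_path: Optional[str],
                 image_path: Optional[str],
                 threshold: float = 0.5) -> str:
    """Average the two dummy scores and bucket them (assumed rule)."""
    score = (detect_voice_stress(audio_path) + detect_face_stress(image_path)) / 2
    return "high" if score >= threshold else "low"


def assist(text: str,
           audio_path: Optional[str],
           image_path: Optional[str],
           summarize: Callable[[str], str]) -> str:
    """Simplify the text only when the detected stress level is high."""
    if stress_level(audio_path, image_path) == "high":
        return summarize(text)
    return text


def first_sentence(text: str) -> str:
    """Stand-in summarizer for offline use: keep only the first sentence."""
    head = text.split(". ")[0].strip()
    return head if head.endswith(".") else head + "."


if __name__ == "__main__":
    message = "Finish the report. Then email it to the team."
    print(assist(message, "voice.wav", "face.png", first_sentence))
    print(assist(message, None, None, first_sentence))
```

Passing the summarizer in as a callable keeps the sketch runnable without downloading a model. With `transformers` installed, one could presumably build `summarizer = pipeline("summarization", model="facebook/bart-large-cnn")` and pass `lambda t: summarizer(t)[0]["summary_text"]` in place of `first_sentence`.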