Sam Heutmaker
committed on
Commit · a14369e
1 Parent(s): 45a2264
update readme
README.md
CHANGED
@@ -51,7 +51,6 @@ The model generates structured, schema-consistent JSON outputs for every video f
 - **Production-ready** - Battle-tested on trillion-scale video frame captioning workloads
 - **Schema-consistent JSON** - Reliable structured output for every frame
 - **Cost-efficient** - Optimized for high-throughput inference
-- **Temporal consistency** - Maintains semantic coherence across video sequences
 - **Open source** - Build and deploy without proprietary API dependencies
 
 ## Architecture
@@ -223,28 +222,15 @@ Given a nature scene with a wooden boardwalk through grassland:
 - **Video Analytics** - Extract insights from large video collections
 - **Content Management** - Automatic tagging and organization of video libraries
 
-##
+## Interested in training your own model?
 
-
-- English-only descriptions (can identify text in other languages)
-- Maximum image size: 1MB
-- Requires specific prompts for optimal performance
-- Not supported on A100 GPUs (no native FP8)
-
-## Best Practices
-
-1. **Use exact prompts** - The provided system and user prompts are optimized for best results
-2. **Set low temperature** - Use temperature=0.1 for consistent outputs
-3. **Enable JSON mode** - Always set response_format to ensure valid JSON
-4. **Process systematically** - Maintain temporal order when processing video sequences
-5. **Batch similar content** - Group frames from the same video for efficiency
+Contact us at [[email protected]](mailto:[email protected]) for a free consultation with our research team.
 
 ## Support
 
-- **Documentation**: [docs.inference.net](https://
-- **API Access**: [inference.net/use-cases/video-understanding](https://
+- **Documentation**: [docs.inference.net](https://inference.net/use-cases/video-understanding)
+- **API Access**: [inference.net/use-cases/video-understanding](https://inference.net/use-cases/video-understanding)
 - **Email**: [email protected]
-- **Enterprise**: [Schedule a consultation](https://inference.net/sales)
 
 ## License
 