AI & ML interests

Small LMs for small computers

Recent Activity

M4-ai's activity

prithivMLmods
posted an update about 4 hours ago
Dropping downstream tasks that use newly initialized parameters (classifier weights & bias) to support domain-specific image classification. Based on siglip2-base-patch16-224 and DomainNet (single-domain, multi-source adaptation), with Fashion-MNIST for experimental testing.

Fashion-MNIST : prithivMLmods/Fashion-Mnist-SigLIP2
Multisource-121 : prithivMLmods/Multisource-121-DomainNet
Painting-126 : prithivMLmods/Painting-126-DomainNet
Sketch-126 : prithivMLmods/Sketch-126-DomainNet
Clipart-126 : prithivMLmods/Clipart-126-DomainNet

Models are trained with different parameter settings for experimental purposes only, with the intent of further development. Refer to each model page for instructions on running it with Transformers.
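As a rough, self-contained sketch of the last step these classifiers perform (mapping the classifier head's logits to a predicted label): the label set below is hypothetical, and the released models ship their own id2label mapping in the config.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]

# Hypothetical 3-class head for illustration only; the released
# Fashion-MNIST model actually has 10 classes.
id2label = {0: "t-shirt", 1: "trouser", 2: "sneaker"}
label, prob = predict([0.2, 2.1, -0.5], id2label)  # label == "trouser"
```

The fine-tune only replaces this final linear head; the SigLIP2 vision backbone stays intact.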

Collection : prithivMLmods/domainnet-exp-67e0e3c934c03cc40c6c8782

Citations :
- SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features : https://arxiv.org/pdf/2502.14786
- Moment Matching for Multi-Source Domain Adaptation : https://arxiv.org/pdf/1812.01754

prithivMLmods
posted an update 4 days ago
Play with Orpheus TTS, a Llama-based speech LLM designed for high-quality, empathetic text-to-speech generation. This model has been fine-tuned to deliver human-level speech synthesis.

GitHub: https://github.com/PRITHIVSAKTHIUR/Orpheus-TTS-Edge

The demo supports both text-to-speech and LLM responses delivered as speech.

> voice: tara, dan, emma, josh
> emotion: <laugh>, <chuckle>, <sigh>, <cough>, <sniffle>, <groan>, <yawn>, <gasp>.
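A small helper for composing prompts from the voices and emotion tags above (the `voice: text` layout is an assumed convention for illustration; check the model card for the exact prompt format):

```python
import re

# Voice names and emotion tags listed above.
VOICES = {"tara", "dan", "emma", "josh"}
EMOTIONS = {"laugh", "chuckle", "sigh", "cough", "sniffle", "groan", "yawn", "gasp"}

def build_prompt(voice: str, text: str) -> str:
    """Validate the voice and any inline <emotion> tags, then compose a prompt."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice}")
    for tag in re.findall(r"<(\w+)>", text):
        if tag not in EMOTIONS:
            raise ValueError(f"unknown emotion tag: <{tag}>")
    return f"{voice}: {text}"
```

For example, `build_prompt("tara", "Well <sigh> that took a while.")` passes validation, while a misspelled voice or tag fails fast instead of producing silent audio glitches.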

Orpheus-3b-0.1-ft
Model Page: canopylabs/orpheus-3b-0.1-ft
Colab Inference Notebook: https://colab.research.google.com/drive/1KhXT56UePPUHhqitJNUxq63k-pQomz3N?usp=sharing

Finetune [ orpheus-3b-0.1-pretrained ]
Resource: https://github.com/canopyai/Orpheus-TTS/tree/main/finetune

Model releases:
https://canopylabs.ai/model-releases
AtAndDev
posted an update 8 days ago
There seem to be multiple paid apps shared here that are based on models hosted on HF, but some people sell their wrappers as "products" and promote them here. For a long time, HF was the best and only platform for open-source model work, but with the recent AI website builders anyone can create a product (really crappy ones, btw) and try to sell it with no contribution to open source. Please don't do this, or at least try fine-tuning the models you use...
Sorry for filling y'all's feeds with this, but you know...
prithivMLmods
posted an update 10 days ago
Hey guys! One small announcement:
Stranger Zone now accepts LoRA requests!

โœ๏ธRequest : strangerzonehf/Request-LoRA [ or ] strangerzonehf/Request-LoRA#1

Page : https://huggingface.co/strangerzonehf

Describe the artistic properties by posting sample images or links to similar images in the request discussion. If the adapters you're asking for are truly creative and safe for work, I'll train and upload the LoRA to the Stranger Zone repo!

Thank you!
AtAndDev
posted an update 12 days ago
Gemma 3 seems to be really good at human preference. Just waiting for people to notice.
prithivMLmods
posted an update 12 days ago
Gemma-3-4B : Image and Video Inference

Space: prithivMLmods/Gemma-3-Multimodal
GitHub: https://github.com/PRITHIVSAKTHIUR/Gemma-3-Multimodal

@gemma3 : {Tag + Space + 'prompt'}
@video-infer : {Tag + Space + 'prompt'}

+ Gemma3-4B : google/gemma-3-4b-it
+ By default, it runs : prithivMLmods/Qwen2-VL-OCR-2B-Instruct
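The tag routing above can be sketched roughly like this (a simplified illustration; the Space's actual implementation may differ):

```python
# Feature tags from the post, each routed to the model that serves it.
ROUTES = {
    "@gemma3": "google/gemma-3-4b-it",
    "@video-infer": "google/gemma-3-4b-it",
}
DEFAULT_MODEL = "prithivMLmods/Qwen2-VL-OCR-2B-Instruct"  # used when no tag is given

def route(message: str):
    """Split '@tag prompt' into (model, prompt); untagged input uses the default model."""
    tag, _, rest = message.partition(" ")
    if tag in ROUTES and rest:
        return ROUTES[tag], rest
    return DEFAULT_MODEL, message
```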

Gemma 3 Technical Report : https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf
not-lain
posted an update 12 days ago
prithivMLmods
posted an update 13 days ago
Tonic
posted an update 17 days ago
Hey there folks,

Did you know that you can use ModernBERT to detect model hallucinations?

Check out the demo: Tonic/hallucination-test

See the medical-context demo here: MultiTransformer/tonic-discharge-guard

Check out the model from KRLabs: KRLabsOrg/lettucedect-large-modernbert-en-v1

And the library they kindly open-sourced for it: https://github.com/KRLabsOrg/LettuceDetect

If you like this topic, please contribute code upstream!

Tonic
posted an update 19 days ago
Powered by KRLabsOrg/lettucedect-large-modernbert-en-v1 from KRLabsOrg.

Detect hallucinations in answers based on context and questions using ModernBERT with 8192-token context support!

### Model Details
- **Model Name**: [lettucedect-large-modernbert-en-v1](KRLabsOrg/lettucedect-large-modernbert-en-v1)
- **Organization**: [KRLabsOrg](https://huggingface.co/KRLabsOrg)
- **Github**: [https://github.com/KRLabsOrg/LettuceDetect](https://github.com/KRLabsOrg/LettuceDetect)
- **Architecture**: ModernBERT (Large) with extended context support up to 8192 tokens
- **Task**: Token Classification / Hallucination Detection
- **Training Dataset**: [RAGTruth](wandb/RAGTruth-processed)
- **Language**: English
- **Capabilities**: Detects hallucinated spans in answers, provides confidence scores, and calculates average confidence across detected spans.

LettuceDetect excels at processing long documents to determine if an answer aligns with the provided context, making it a powerful tool for ensuring factual accuracy.
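The capabilities above mention per-span confidence scores plus an average across detected spans; a minimal sketch of that aggregation step (the span dicts below are an assumed shape for illustration, not LettuceDetect's actual output schema):

```python
def average_confidence(spans):
    """Mean confidence over detected hallucination spans; None if nothing was flagged."""
    if not spans:
        return None
    return sum(s["confidence"] for s in spans) / len(spans)

# Assumed span shape: character offsets plus a per-span confidence score.
spans = [
    {"start": 10, "end": 24, "confidence": 0.91},
    {"start": 40, "end": 52, "confidence": 0.73},
]
avg = average_confidence(spans)  # ≈ 0.82
```

Returning `None` for an empty span list keeps "no hallucination detected" distinct from "zero confidence".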
prithivMLmods
posted an update 19 days ago
Locutusque
posted an update 27 days ago
Exciting news, everyone! I've just released **Thespis-Llama-3.1-8B**, a new language model designed for enhanced roleplaying!

It's built on Llama-3.1 and fine-tuned with a focus on Theory of Mind reasoning to create more believable and engaging characters. It even learned a few tricks on its own, like adding in-character thought processes!

Check it out here: Locutusque/Thespis-Llama-3.1-8B

Give it a try and let me know what you think! I'm especially interested in feedback on how well the characters stay in role and if the responses feel natural. Looking forward to seeing what amazing stories you create! โœ๏ธ
prithivMLmods
posted an update 27 days ago
Dropping some custom fine-tunes based on SigLIP2,
with single- and multi-label classification problem types!

- AI vs Deepfake vs Real : prithivMLmods/AI-vs-Deepfake-vs-Real-Siglip2
- Deepfake Detect : prithivMLmods/Deepfake-Detect-Siglip2
- Fire Detection : prithivMLmods/Fire-Detection-Siglip2
- Deepfake Quality Assess : prithivMLmods/Deepfake-Quality-Assess-Siglip2
- Guard Against Unsafe Content : prithivMLmods/Guard-Against-Unsafe-Content-Siglip2

Collection : prithivMLmods/siglip2-custom-67bcdb2de8fe96b99fb4e19e
KnutJaegersberg
posted an update about 1 month ago
prithivMLmods
posted an update about 1 month ago
Really interesting news about the deployment of a new state of matter in Majorana 1: the world's first quantum processor powered by topological qubits. If you missed this news this week, here are some links for you:

Topological qubit arrays: https://arxiv.org/pdf/2502.12252

Quantum Blog: https://azure.microsoft.com/en-us/blog/quantum/2025/02/19/microsoft-unveils-majorana-1-the-worlds-first-quantum-processor-powered-by-topological-qubits/

Read the story: https://news.microsoft.com/source/features/innovation/microsofts-majorana-1-chip-carves-new-path-for-quantum-computing/

๐Ÿ“ Majorana 1 Intro: https://youtu.be/Q4xCR20Dh1E?si=Z51DbEYnZFp_88Xp

The Path to a Million Qubits: https://youtu.be/wSHmygPQukQ?si=TS80EhI62oWiMSHK
mmhamdy
posted an update about 1 month ago
We're excited to introduce MemoryCode, a novel synthetic dataset designed to rigorously evaluate LLMs' ability to track and execute coding instructions across multiple sessions. MemoryCode simulates realistic workplace scenarios where a mentee (the LLM) receives coding instructions from a mentor amidst a stream of both relevant and irrelevant information.

But what makes MemoryCode unique? The combination of the following:

- Multi-Session Dialogue Histories: MemoryCode consists of chronological sequences of dialogues between a mentor and a mentee, mirroring real-world interactions between coworkers.

- Interspersed Irrelevant Information: Critical instructions are deliberately interspersed with unrelated content, replicating the information overload common in office environments.

- Instruction Updates: Coding rules and conventions can be updated multiple times throughout the dialogue history, requiring LLMs to track and apply the most recent information.

- Prospective Memory: Unlike previous datasets that cue information retrieval, MemoryCode requires LLMs to spontaneously recall and apply relevant instructions without explicit prompts.

- Practical Task Execution: LLMs are evaluated on their ability to use the retrieved information to perform practical coding tasks, bridging the gap between information recall and real-world application.
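The "Instruction Updates" property boils down to last-write-wins resolution over the dialogue history; a toy sketch of what a model must implicitly do (the dataset's real format is much richer than these pairs):

```python
def effective_instructions(dialogue):
    """Resolve which instruction is currently in force for each topic.

    `dialogue` is a chronological list of (topic, instruction) pairs;
    later entries override earlier ones for the same topic, mimicking
    a mentor updating a convention partway through the sessions.
    """
    current = {}
    for topic, instruction in dialogue:
        current[topic] = instruction  # last write wins
    return current

history = [
    ("naming", "use camelCase for functions"),
    ("docstrings", "always add docstrings"),
    ("naming", "switch to snake_case for functions"),  # update overrides the earlier rule
]
```

The benchmark's difficulty is that the model must perform this bookkeeping implicitly, across sessions, with the pairs buried in irrelevant chatter.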

Our Findings

1๏ธโƒฃ While even small models can handle isolated coding instructions, the performance of top-tier models like GPT-4o dramatically deteriorates when instructions are spread across multiple sessions.

2๏ธโƒฃ This performance drop isn't simply due to the length of the context. Our analysis indicates that LLMs struggle to reason compositionally over sequences of instructions and updates. They have difficulty keeping track of which instructions are current and how to apply them.

Paper: From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions (2502.13791)
Code: https://github.com/for-ai/MemoryCode
KnutJaegersberg
posted an update about 1 month ago
Mimicking Consciousness in LLMs: Ascending the Dimensions of Thought with Recurrent Processing

This blog post explores how **recurrent processing** can transform Large Language Models (LLMs) to mimic aspects of human thought by engaging in iterative feedback loops. Inspired by string theory, the post describes how LLMs can "ascend dimensions" of cognition, progressing through foundational cognitive loops (such as basic cognition, executive functions, and meta-cognition) before advancing into **world simulation**. In this stage, LLMs explore higher dimensions, perceiving non-linear time, simulating branching possibilities, and integrating multiple realities. The interaction between the **Generator** and **Reflective Compass** allows AI systems to refine their outputs iteratively, moving toward a **point attractor** where ideas become coherent and polished. While this process doesn't bestow true consciousness, it offers a compelling imitation of reflective and adaptive thinking, leading to smarter dialogue, enhanced creativity, and more robust problem-solving.

https://huggingface.co/blog/KnutJaegersberg/oscillatory-recurrence-for-llms
prithivMLmods
posted an update about 1 month ago
Dino: The Minimalist Multipurpose Chat System
Agent-Dino : prithivMLmods/Agent-Dino
Github: https://github.com/PRITHIVSAKTHIUR/Agent-Dino

By default, it performs the following tasks:
{Text-to-Text Generation}, {Image-Text-Text Generation}
@image: Generates an image using Stable Diffusion XL.
@3d: Generates a 3D mesh.
@web: Web search agents.
@rAgent: Initiates a reasoning chain using Llama mode for coding explanations.
@tts1-♀, @tts2-♂: Voice generation (female and male voices).
@yolo: Object detection.
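The command set above amounts to a simple dispatch table; a hypothetical sketch (subsystem names are illustrative, not Dino's internals):

```python
# Feature tags from the post mapped to the subsystem each one triggers.
COMMANDS = {
    "@image": "stable-diffusion-xl",
    "@3d": "3d-mesh",
    "@web": "web-search",
    "@rAgent": "reasoning-chain",
    "@tts1": "tts-female",
    "@tts2": "tts-male",
    "@yolo": "object-detection",
}

def dispatch(message: str) -> str:
    """Pick the subsystem for a message; plain text falls through to chat."""
    tag = message.split(" ", 1)[0].split("-", 1)[0]  # strip suffixes like '-♀'
    return COMMANDS.get(tag, "chat")
```

Untagged messages fall through to the default text/image chat path, matching the "by default" behavior described above.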