
nicolo

nicolollo

AI & ML interests

None yet

Organizations

Hugging Face Discord Community

nicolollo's activity

reacted to nicolay-r's post with 🔥 17 days ago
🚀 Delighted to share a major milestone in adapting reasoning techniques for augmenting data collections!
Introducing bulk-chain 1.0.0, the first major release of a no-string API for adapting your LLM to Chain-of-Thought-like reasoning over records with a large number of parameters across large datasets.

⭐ Check it out: https://github.com/nicolay-r/bulk-chain

What’s new and why it matters:
📦 Fully no-string API for easy client deployment
🔥 Demos are now standalone projects:

📺 bash / shell (dispatched): https://github.com/nicolay-r/bulk-chain-shell
📺 tksheet: https://github.com/nicolay-r/bulk-chain-tksheet-client

Using nlp-thirdgate to host the supported providers:
🌌 LLM providers: https://github.com/nicolay-r/nlp-thirdgate
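
To make the idea concrete, here is a minimal sketch of the general pattern bulk-chain automates: applying an ordered prompt schema to every record of a dataset, with each step seeing the outputs of earlier steps. This is not bulk-chain's actual API; the schema format, field names, and `infer_record` helper are hypothetical.

```python
from typing import Callable

# Hypothetical schema: ordered (field, prompt template) steps; each template
# can reference the record's columns and the outputs of earlier steps.
SCHEMA = [
    ("reasoning", "Think step by step about the sentiment of: {text}"),
    ("label", "Given the reasoning: {reasoning}\nAnswer positive or negative."),
]

def infer_record(record: dict, llm: Callable[[str], str]) -> dict:
    """Run the schema steps in order, chaining outputs into later prompts."""
    state = dict(record)
    for field, template in SCHEMA:
        state[field] = llm(template.format(**state))
    return state

# Usage with any text-in/text-out LLM callable:
# results = [infer_record(row, my_llm) for row in rows]
```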
reacted to grimjim's post with ❤️ 3 months ago
This recent paper points to an explanation for the unreasonable effectiveness of frankenmerges: "Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach" (arXiv:2502.05171)

Specifically, the duplication of layers in frankenmerges serves a purpose similar to the weight-shared recurrence in the paper's recurrent-depth architecture. Successful frankenmerges that operate without additional fine-tuning are able to recover, or "heal", from the damage caused by abrupt transitions between layer blocks. Replicated layer blocks that remain operational can provide functional benefits grounded in latent reasoning. Frankenmerges can also produce hybrid reasoning by splicing together the latent reasoning of different models.

Back in April 2024, I was able to duplicate a few layers in the Llama 3 8B model, turning it into a 9B model, without harming benchmarks significantly, despite any transition damage.
grimjim/llama-3-experiment-v1-9B
My informal experimentation suggested that latent reasoning circuits could occupy contiguous stacks of 2-4 layers, though the result was highly sensitive to the choice of transition location between layers.
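
For illustration, here is a rough sketch of this kind of layer duplication using plain PyTorch and transformers. The layer range and model ID are arbitrary examples, and frankenmerges are typically produced with merge tools such as mergekit rather than a script like this.

```python
import copy
import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
layers = model.model.layers  # nn.ModuleList of decoder blocks

# Duplicate a contiguous block of decoder layers (range chosen arbitrarily
# for illustration) and splice the copies in right after the originals.
start, end = 12, 16
copies = [copy.deepcopy(layers[i]) for i in range(start, end)]
model.model.layers = nn.ModuleList(
    list(layers[:end]) + copies + list(layers[end:])
)

# Keep per-layer indices consistent so the KV cache indexes correctly,
# and record the new depth in the config.
for idx, layer in enumerate(model.model.layers):
    layer.self_attn.layer_idx = idx
model.config.num_hidden_layers = len(model.model.layers)
```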
updated a model 4 months ago
published a model 4 months ago
reacted to merve's post with ❤️🚀 4 months ago
supercharge your LLM apps with smolagents 🔥

however cool your LLM is, without being agentic it can only go so far

enter smolagents: a new agent library by Hugging Face to make the LLM write code, do analysis and automate boring stuff!

Here's our blog post to get you started: https://huggingface.co/blog/smolagents
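
For reference, a minimal example following the quickstart in that blog post might look like the sketch below. The model ID and query are illustrative, and the class names reflect the library as announced; they may have changed in later releases.

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# The CodeAgent solves tasks by writing and executing Python snippets,
# calling the provided tools (here, web search) when it needs them.
agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],
    model=HfApiModel("Qwen/Qwen2.5-Coder-32B-Instruct"),
)

print(agent.run("How many seconds are there in a leap year?"))
```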