
Nicolay Rusnachenko

nicolay-r

AI & ML interests

Information Retrieval・Medical Multimodal NLP (🖼+📝) Research Fellow @BU_Research・software developer http://arekit.io・PhD in NLP

Organizations

None yet

nicolay-r's activity

reacted to fdaudens's post with 🤯 5 days ago
🎨 Designers, meet OmniSVG! This new model helps you create professional vector graphics from text/images, generate editable SVGs from icons to detailed characters, convert rasters to vectors, maintain style consistency with references, and integrate into your workflow.

@OmniSVG
posted an update 7 days ago
posted an update 12 days ago
📢 For those in textual IR experimenting with quick deployment of CoT / reasoning, the following update might be relevant. I am happy to announce a new version of bulk-chain, 0.25.3. It is a no-string framework for quick application of a reasoning schema over your data.

https://github.com/nicolay-r/bulk-chain/releases/tag/0.25.3

The latest release brings major updates:
✅ Reworked model inference mechanism that works in streaming mode
- Callback support for streaming mode (earlier available only in the demo)
- Deployment of various clients (shell, tksheet; see attachment)
✅ Support for batching (earlier available in API mode only)
✅ Optional caching of inferred data in SQLite (previously always enabled)
- This makes it possible to launch small (but mighty) LLMs faster (see the usage sketch below)

🌟 Project: https://github.com/nicolay-r/bulk-chain
🌌 Providers: https://github.com/nicolay-r/nlp-thirdgate
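
For context, a minimal sketch of what applying a reasoning schema looks like from Python. The `iter_content` entry point and argument names follow the project README, but exact signatures may shift between releases, so treat every name here as an assumption and check it against the repo:

```python
# Minimal sketch; iter_content and its argument names are assumptions
# drawn from the bulk-chain README and may differ in 0.25.3.
from bulk_chain.api import iter_content

# A reasoning schema: each step prompts the LLM and stores the answer
# under "out", so later steps can reference it in their own prompts.
schema = {"schema": [
    {"prompt": "Text: {text}\nList the symptoms mentioned.", "out": "symptoms"},
    {"prompt": "Given symptoms: {symptoms}\nAssign a triage label.", "out": "label"},
]}

data_it = iter([{"text": "Patient reports chest pain and shortness of breath."}])

# llm is a provider instance, e.g. one of the nlp-thirdgate wrappers
# (transformers, Replicate, OpenRouter, ...); construction omitted here.
for record in iter_content(schema=schema, llm=llm, input_dicts_it=data_it):
    print(record["label"])
```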

posted an update 26 days ago
The concept behind xLSTM has recently turned into the xLSTM-7B model, which showcases performance in the same category as the similar-scale Gemma 7B, Llama2 7B, and FalconMamba 7B, but with a higher-performing inference kernel.

Model: NX-AI/xLSTM-7b
Paper: https://arxiv.org/abs/2503.13427
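
For anyone who wants to probe the model, a minimal loading sketch via the standard transformers auto classes; I have not verified which extra dependencies (e.g. xLSTM kernels) the checkpoint pulls in, so treat this as an assumption:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes the checkpoint loads through the standard auto classes; the
# repo may require additional xLSTM kernel dependencies not shown here.
tok = AutoTokenizer.from_pretrained("NX-AI/xLSTM-7b")
model = AutoModelForCausalLM.from_pretrained("NX-AI/xLSTM-7b", device_map="auto")

inputs = tok("The xLSTM architecture replaces attention with",
             return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```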

posted an update about 1 month ago
📢 Several weeks ago Microsoft announced Phi-4. My most recent list of LLM wrappers had only one for Phi-2, so it was time to update! With this post, I am happy to share that a Phi-4 wrapper is now available at nlp-thirdgate for adopting Chain-of-Thought reasoning:

🤖 https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_phi4.py

📢 https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_phi4.py

Findings on adaptation: I was able to reproduce only the pipeline-based model launching. This version is for the textual LLM only; Microsoft also released a multimodal Phi-4, which is out of scope of this wrapper.

🌌 nlp-thirdgate: https://lnkd.in/ef-wBnNn
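
Since only pipeline-based launching worked in my tests, a sketch of that mode using the standard transformers pipeline with chat-format input; generation parameters are illustrative defaults, not the wrapper's exact settings:

```python
from transformers import pipeline

# Pipeline-based launching, the mode reported to work above;
# max_new_tokens and the prompts are illustrative, not tuned values.
pipe = pipeline(
    "text-generation",
    model="microsoft/phi-4",
    device_map="auto",
    torch_dtype="auto",
)

messages = [
    {"role": "system", "content": "Reason step by step before answering."},
    {"role": "user", "content": "If a train covers 120 km in 1.5 hours, what is its average speed?"},
]

# With chat-format input, recent transformers versions return the full
# message list; the last entry holds the assistant's reply.
result = pipe(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])
```
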
posted an update about 1 month ago
📢 Delighted to announce the updated version of the no-string framework for chain-of-thought application over JSONL/CSV data:
https://github.com/nicolay-r/bulk-chain/releases/tag/0.25.2

🔧 Fixes:
- Fixed issues with batching mode
- Fixed problem with parsing and passing args in shell mode

⚠️ Limitation: batching mode is still available only via the API (see the sketch below).

📢 Quick Start with Gemma-3 in batching mode: https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_gemma_3.ipynb
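
Since batching is exposed only through the API for now, usage presumably amounts to passing several input records at once. A sketch under the same `iter_content` assumption as in the 0.25.3 post above, with `batch_size` as a hypothetical parameter name:

```python
# Sketch of batching through the API (shell mode does not support it yet);
# iter_content and batch_size are assumptions about the bulk-chain API.
from bulk_chain.api import iter_content

records = [{"text": t} for t in [
    "first input text ...",
    "second input text ...",
    "third input text ...",
]]

# llm is a provider instance as before; batch_size groups records per call.
for out in iter_content(schema="schema.json", llm=llm,
                        input_dicts_it=iter(records), batch_size=2):
    print(out)
```
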
replied to their post about 1 month ago

An important note: use the very latest version of bulk-chain from GitHub, which fixes the double-inference bug in batching mode.

posted an update about 1 month ago
📢 With the recent release of Gemma-3, if you are interested in playing with textual chain-of-thought, the notebook below is a wrapper over the model (native transformers inference API) for passing a predefined schema of prompts in batching mode.
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_gemma_3.ipynb

Limitation: the schema supports text only (for now), while Gemma-3 is a text+image-to-text model.

Model: google/gemma-3-1b-it
Provider: https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_gemma3.py
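
Underneath, the wrapper boils down to plain batched generation with the native transformers API. A minimal sketch of that core, with the predefined prompt schema omitted and the example prompts being my own placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Plain batched generation with the native transformers API; the actual
# notebook wraps this with a predefined prompt schema, omitted here.
model_id = "google/gemma-3-1b-it"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

questions = [
    "What is chain-of-thought prompting?",
    "Summarize: the cat sat on the mat.",
]
prompts = [
    tok.apply_chat_template(
        [{"role": "user", "content": q}],
        tokenize=False, add_generation_prompt=True,
    )
    for q in questions
]

tok.padding_side = "left"  # left-pad so generation continues each prompt
batch = tok(prompts, return_tensors="pt", padding=True).to(model.device)
out = model.generate(**batch, max_new_tokens=64)

# Strip the prompt tokens and decode only the generated continuations.
gen = out[:, batch["input_ids"].shape[1]:]
print(tok.batch_decode(gen, skip_special_tokens=True))
```
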
reacted to onekq's post with 👀 about 1 month ago
The performance of deepseek-r1-distill-qwen-32b is abysmal. I know Qwen instruct (not coder) is quite poor at coding. As such, I have low expectations for other R1 reproduction works that are also based on Qwen instruct. onekq-ai/r1-reproduction-works-67a93f2fb8b21202c9eedf0b

This makes it particularly mysterious what went into QwQ-32B. Why did it work so well? Was it trained from scratch? Does anyone have insights about this?
onekq-ai/WebApp1K-models-leaderboard
replied to ritvik77's post about 1 month ago

@ritvik77, your plans sound good! Meanwhile, I am looking forward to adapting the 7B version for experiments in the radiology domain. Happy to read more on that, and if and when it makes it into a paper, I can add it to my survey of related advances.