do-me posted an update 4 months ago
Wrote a quick one-liner to run Qwen3-Next-80B-A3B-Instruct-8bit with mlx-lm and uv on macOS:

curl -sL https://gist.githubusercontent.com/do-me/34516f7f4d8cc701da823089b09a3359/raw/5f3b7e92d3e5199fd1d4f21f817a7de4a8af0aec/prompt.py | uv run --with git+https://github.com/ml-explore/mlx-lm.git python - --prompt "What is the meaning of life?"


... or, if you prefer, the more secure two-liner version (so you can inspect the script before executing it):

curl -sL https://gist.githubusercontent.com/do-me/34516f7f4d8cc701da823089b09a3359/raw/5f3b7e92d3e5199fd1d4f21f817a7de4a8af0aec/prompt.py -o prompt.py
uv run --with git+https://github.com/ml-explore/mlx-lm.git python prompt.py --prompt "What is the meaning of life?"


I get around 45-50 tokens per second on an M3 Max, pretty happy with the generation speed!

Stats from the video:
Prompt: 15 tokens, 80.972 tokens-per-sec
Generation: 256 tokens, 45.061 tokens-per-sec
Peak memory: 84.834 GB
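A quick sanity check: the reported token counts and throughputs imply the wall-clock time per stage. A tiny illustrative helper (not part of the original script):

```python
# Back out wall-clock durations from the reported stats above.
def duration_s(tokens: int, tokens_per_sec: float) -> float:
    """Seconds spent on a stage, given its token count and throughput."""
    return tokens / tokens_per_sec

prompt_time = duration_s(15, 80.972)        # prompt processing
generation_time = duration_s(256, 45.061)   # token generation

print(f"prompt:     {prompt_time:.2f} s")       # ~0.19 s
print(f"generation: {generation_time:.2f} s")   # ~5.68 s
print(f"total:      {prompt_time + generation_time:.2f} s")  # ~5.87 s
```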
do-me posted an update about 1 year ago
do-me posted an update over 1 year ago
What are your favorite text chunkers/splitters?
Mine are:
- https://github.com/benbrandt/text-splitter (Rust/Python, battle-tested, Wasm version coming soon)
- https://github.com/umarbutler/semchunk (Python, really performant but some issues with huge docs)

I tried the huge Jina AI regex, but it failed on my (admittedly messy) documents, e.g. from EUR-LEX. Their free Segmenter API is really cool but unfortunately times out on my huge docs (~100 pages): https://jina.ai/segmenter/

Also, I wrote a vanilla JS chunker with simple, adjustable hierarchical logic (inspired by the above). I think it does a decent job for so few lines of code: https://do-me.github.io/js-text-chunker/
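For illustration, the hierarchical logic can be sketched in a few lines of Python as well: split on the coarsest separator first and recurse into pieces that are still too long (the separators and size limit below are made up, not the actual js-text-chunker settings):

```python
def chunk(text: str, max_len: int = 200,
          separators: tuple[str, ...] = ("\n\n", "\n", ". ", " ")) -> list[str]:
    """Recursively split text on the coarsest separator that applies."""
    if len(text) <= max_len:
        return [text] if text.strip() else []
    for i, sep in enumerate(separators):
        if sep in text:
            chunks: list[str] = []
            for piece in text.split(sep):
                # A piece no longer contains `sep`, so recurse with finer ones.
                chunks.extend(chunk(piece, max_len, separators[i + 1:]))
            return chunks
    # No separator left: hard-cut into fixed-size slices.
    return [text[j:j + max_len] for j in range(0, len(text), max_len)]

print(chunk("First paragraph.\n\nSecond, much longer paragraph. It has two sentences.",
            max_len=30))
```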

Happy to hear your thoughts!
  • 1 reply
do-me posted an update over 1 year ago
SemanticFinder now supports WebGPU thanks to @Xenova 's efforts with transformers.js v3!
Expect massive performance gains. I ran inference on a whole book with 46k chunks in under 5 minutes. If your device doesn't support #WebGPU, use the classic Wasm-based version:
- WebGPU: https://do-me.github.io/SemanticFinder/webgpu/
- Wasm: https://do-me.github.io/SemanticFinder/

WebGPU harnesses the full power of your hardware instead of being restricted to the CPU. The speedup is significant (4-60x) across all kinds of devices: consumer-grade laptops, heavy Nvidia GPU setups, and Apple Silicon. Measure the difference for your device here: Xenova/webgpu-embedding-benchmark
Chrome currently works out of the box; Firefox requires some tweaking.

WebGPU + transformers.js makes it possible to build amazing applications and make them accessible to everyone. For example, SemanticFinder could become a simple GUI for populating your (vector) DB of choice. See the pre-indexed community texts here: do-me/SemanticFinder
Happy to hear your ideas!
  • 1 reply
do-me posted an update over 1 year ago
Hey Hugging Face, love your open-source attitude, and particularly transformers.js for embedding models! Your current "use this model" integration gives you the transformers.js code, but there is no quick way to really test a model in one click.
SemanticFinder (do-me/SemanticFinder) offers such an integration for all compatible feature-extraction models! All you need to do is append a URL parameter with the model ID, like so: https://do-me.github.io/SemanticFinder/?model=Xenova/bge-small-en-v1.5. You can also choose between quantized and normal mode with https://do-me.github.io/SemanticFinder/?model=Xenova/bge-small-en-v1.5&quantized=false. Maybe that would do for an HF integration?
I know it's a small open-source project, but I really believe it provides value for devs deciding between models. It's also much easier than having to spin up a notebook, install dependencies, etc. And it's private, so you could even do some real-world evaluation on personal data without having to worry about third-party services' data policies.
Happy to hear the community's thoughts!
  • 1 reply
do-me posted an update over 1 year ago
Get daily/weekly/monthly notifications about the latest trending feature-extraction models compatible with transformers.js for semantic search! All open source, built on GitHub Actions and ntfy.sh.

I'm also providing daily updated tables (filterable, and sortable by ONNX model size too!) if you only want to check in once in a while. Download whatever format suits you best: CSV, XLSX, Parquet, JSON, or HTML.

Would you like to monitor other models/tags? Feel free to open a PR :)

GitHub: https://github.com/do-me/trending-huggingface-models
Ntfy.sh daily channel: https://ntfy.sh/feature_extraction_transformers_js_models_daily
Sortable table: https://do-me.github.io/trending-huggingface-models/

And the best part: all 145 models are integrated in SemanticFinder to play around with: https://do-me.github.io/SemanticFinder/

do-me posted an update over 1 year ago
Question: HF model search not showing all results

I noticed that when I use the HF model search with these tags:
- feature-extraction
- transformers.js
it is not showing all models that are actually tagged.

Example: all Alibaba-NLP models (e.g. the gte family) are correctly tagged but don't show up here:
- https://huggingface.co/models?pipeline_tag=feature-extraction&library=transformers.js&sort=trending&search=gte
- correctly tagged model: Alibaba-NLP/gte-large-en-v1.5

Does anyone know why?

fyi @Xenova
  • 3 replies
do-me posted an update almost 2 years ago
Hey, I just added three useful advanced use cases to do-me/SemanticFinder.
SemanticFinder is a collection of embedding indexes for public documents and books. You can create your own index file from any text or PDF and save it, without installing or downloading anything. Try it yourself:

1. Translating from 100+ languages to English (even though it might confuse a strawberry with a grapefruit ;D): https://do-me.github.io/SemanticFinder/?hf=List_of_the_Most_Common_English_Words_70320cde&firstOnly=true&inferencingActive=False
2. Finding English synonyms: https://do-me.github.io/SemanticFinder/?hf=List_of_the_Most_Common_English_Words_0d1e28dc&firstOnly=true&inferencingActive=False
3. The "universal index" idea: create an embedding index of 30k English words and reuse it on unseen texts. You can either fill the gaps in the index with additional inference or stick to the 30k index for instant semantic similarity.
Initial idea: https://github.com/do-me/SemanticFinder/discussions/48
Try here: https://do-me.github.io/SemanticFinder/?hf=List_of_the_Most_Common_English_Words_0d1e28dc&inferencingActive=False&universalIndexSettingsWordLevel with a text of your choice.

This could be enhanced by adding word pairs or triplets like "climate change" or "greenhouse gas". Eventually I'd like to set up vector DB integrations.
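The lookup behind the universal index is essentially nearest-neighbor search over a precomputed word-to-embedding map. A toy Python sketch with made-up 3-d vectors (a real index would hold actual model embeddings for the 30k words):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy precomputed index: word -> embedding (values invented for illustration).
index = {
    "climate": [0.9, 0.1, 0.0],
    "weather": [0.8, 0.2, 0.1],
    "banana":  [0.0, 0.1, 0.9],
}

def most_similar(query_vec: list[float], index: dict[str, list[float]]) -> str:
    """Return the indexed word whose embedding is closest to the query."""
    return max(index, key=lambda word: cosine(query_vec, index[word]))

print(most_similar([0.9, 0.1, 0.0], index))  # climate
```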

Super happy to hear your feedback, ideas and maybe even contributions! :)

---
Edit: apparently Markdown URL formatting only works for HF links.