AI & ML interests

NLP, ASR, Text-To-Speech

Recent Activity

ayymen updated a dataset 21 days ago
Tamazight-NLP/AmaWar
ayymen updated a collection 21 days ago
Text Datasets
ayymen updated a collection 21 days ago
Language Models

ayymen
updated a collection about 1 month ago
omarkamali
posted an update about 2 months ago
New year, new dataset 🚀

I just released omarkamali/wikipedia-labels, with all the structural labels and namespaces from Wikipedia in 300+ languages. A gift for the data preprocessors and cleaners among us.

Happy new year 2026 everyone! 🎆
omarkamali
posted an update 2 months ago
Picomon v0.2.0 released! 💫

- Supports all of AMD, Nvidia and Apple Silicon 🧑‍🧑‍🧒‍🧒
- Beautiful TUI with themes (who said monitoring should be boring?) 💅
- Shareable Rig Cards! Boast to friends, family and foes alike 🫨

Get it now! Run uvx picomon, or pip install picomon and then picomon
omarkamali
posted an update 3 months ago
Hello picomon! AMD GPU Monitoring made easy

Just run uvx picomon and behold:
┌──────────────────────────────────────────┐  ┌──────────────────────────────────────────┐
│ GPU 0  GFX  42%  UMC  21%                │  │ GPU 1  GFX  78%  UMC  66%                │
│ PWR 135/250W (54%)  VRAM 10.0/16.0GB 62% │  │ PWR 210/250W (84%)  VRAM 14.5/16.0GB 90% │
│                                          │  │                                          │
│ GFX ▁▂▂▃▄▄▅▆▆▇█▇▆▅▄▃▂▁                   │  │ GFX ▂▃▄▅▆▇██▇▆▅▄▂▂▃▅▆                    │
│ PWR ▁▁▂▂▃▄▄▅▆▇██▇▆▅▄▂▁                   │  │ PWR ▂▂▃▄▅▆▇██▇▆▅▄▃▂▂▃                    │
│ VRM ▁▁▂▂▃▄▄▅▆▇███▇▆▅▄▂                   │  │ VRM ▂▃▄▅▆▆▇███▇▆▅▄▃▂▂                    │
└──────────────────────────────────────────┘  └──────────────────────────────────────────┘
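
For readers curious how history rows like GFX, PWR and VRM in the panel above can be drawn, here is a minimal sketch that maps utilization samples to the eight block characters. It is not taken from picomon's source: the function name sparkline, its lo/hi parameters and the scaling rule are all assumptions made for illustration.

```python
# Hypothetical sketch, not picomon's actual implementation.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(samples, lo=0.0, hi=100.0):
    """Map each sample in [lo, hi] to one of eight block characters."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    span = hi - lo
    chars = []
    for value in samples:
        # Clamp so out-of-range readings still render.
        clamped = min(max(value, lo), hi)
        # Scale to an index 0..7; the top of the range maps to the full block.
        index = min(int((clamped - lo) / span * len(BLOCKS)), len(BLOCKS) - 1)
        chars.append(BLOCKS[index])
    return "".join(chars)


if __name__ == "__main__":
    # Example: a short run of GPU utilization percentages.
    print(sparkline([5, 20, 42, 60, 78, 100]))
```

A fixed 0 to 100 range suits percentage metrics; for power draw in watts, one could pass the card's power limit as hi instead.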


Repo at https://github.com/omarkamali/picomon
Or on PyPI at https://pypi.org/project/picomon
omarkamali
posted an update 3 months ago
Exciting updates to the Wikipedia Monthly dataset for November! 🚀

・ Fixed a bug to remove infobox leftovers and other wiki markers such as __TOC__
・ New Python package https://pypi.org/project/wikisets: a dataset builder with efficient sampling so you can seamlessly combine the languages you want for any date (ideal for pretraining data but works for any purpose)
・ Moved the pipeline to a large server. Much higher costs but with better reliability and predictability (let me know if you'd like to sponsor this!).
・ Dataset sizes are unfortunately missing for this month due to shenanigans with the migration, but should be back in December's update.
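
To illustrate the kind of cleanup the first bullet describes, here is a minimal sketch that strips double-underscore markers such as __TOC__ from extracted text. It is not the Wikipedia Monthly pipeline's code: the pattern, the function name strip_wiki_markers and the whitespace handling are assumptions, and infobox leftovers would need separate handling.

```python
import re

# Assumption: markers are uppercase words wrapped in double underscores,
# e.g. __TOC__ or __NOTOC__. Illustrative only, not the pipeline's pattern.
MARKER_RE = re.compile(r"__[A-Z]+__")


def strip_wiki_markers(text):
    """Remove __UPPERCASE__ markers, then tidy the whitespace left behind."""
    without_markers = MARKER_RE.sub("", text)
    # Collapse runs of spaces or tabs created by the removal.
    collapsed = re.sub(r"[ \t]{2,}", " ", without_markers)
    # Drop trailing whitespace on each line and limit blank-line runs to one.
    lines = [line.rstrip() for line in collapsed.split("\n")]
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()


if __name__ == "__main__":
    print(strip_wiki_markers("Intro text.\n__TOC__\nFirst section"))
```

Note that this pattern would also remove any other uppercase word wrapped in double underscores, so a real pipeline may prefer an explicit list of known markers.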

Check out the dataset:
omarkamali/wikipedia-monthly