Stefan Schweter PRO
stefan-it
AI & ML interests
Flair Library, NER & PoS Tagging, LM Pretraining (mostly encoder-only), Historical Language Models
Recent Activity
updated a model about 12 hours ago: stefan-it/bert5urk
reacted to davanstrien's post with 🔥 2 days ago
🌍 Big step for multilingual AI data!
The Hugging Face community has rated educational content in languages spoken by 1.6 billion people! New additions:
• Japanese
• Italian
• Old High German
Learn more and contribute: https://huggingface.co/blog/davanstrien/fineweb2-community
These ratings can help enhance training data for major world languages.
stefan-it's activity
upvoted a paper 10 days ago
upvoted a paper 14 days ago
upvoted a paper 15 days ago
upvoted an article about 1 month ago
Article
FineWeb2-C: Help Build Better Language Models in Your Language
18
jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images
Paper • 2412.08802 • Published • 5
Evaluating Pixel Language Models on Non-Standardized Languages
Paper • 2412.09084 • Published • 1
Training LayoutLM from Scratch for Efficient Named-Entity Recognition in the Insurance Domain
Paper • 2412.09341 • Published • 1
OpenNER 1.0: Standardized Open-Access Named Entity Recognition Datasets in 50+ Languages
Paper • 2412.09587 • Published • 3
The Impact of Copyrighted Material on Large Language Models: A Norwegian Perspective
Paper • 2412.09460 • Published • 7
upvoted an article about 2 months ago
Article
They Said It Couldn’t Be Done
77
upvoted a paper 2 months ago
upvoted a collection 3 months ago
upvoted a paper 3 months ago