Omartificial-Intelligence-Space's Collections

Huggingface FineWeb2 Arabic Dataset Portions

updated Jun 22

A collection of Arabic text datasets sourced from the FineWeb2 project, covering diverse content in Modern Standard Arabic (MSA) and several Arabic dialects.

  • HuggingFaceFW/fineweb-2

    Viewer • Updated Jun 27 • 5.02B • 672k • 601

    Note: This is the original FineWeb2 repository, which includes 1000s of languages. Find the Arabic portions below.


  • Omartificial-Intelligence-Space/FineWeb2-MSA

    Viewer • Updated Dec 15, 2024 • 907M • 236 • 1

  • Omartificial-Intelligence-Space/FineWeb2-Egyptian-Arabic

    Viewer • Updated Dec 12, 2024 • 23.9M • 61 • 2

  • Omartificial-Intelligence-Space/FineWeb2-Moroccan-Arabic

    Viewer • Updated Dec 12, 2024 • 69.6M • 92 • 1

  • Omartificial-Intelligence-Space/FineWeb2-North-Levantine-Arabic

    Viewer • Updated Dec 12, 2024 • 223k • 51 • 1

  • Omartificial-Intelligence-Space/FineWeb2-Najdi-Arabic

    Viewer • Updated Dec 12, 2024 • 48.4M • 76 • 1
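
A minimal sketch of how one of the Arabic portions listed above might be loaded, assuming the Hugging Face `datasets` library is installed. The repo IDs are copied from the listing; the short keys (`msa`, `egyptian`, and so on), the helper names `load_portion` and `peek_columns`, and the `train` split name are illustrative assumptions, not something this page confirms. Streaming is used so that nothing is downloaded in full before inspection.

```python
# Repo IDs copied from the collection listing; the short keys are
# hypothetical labels chosen for this example.
ARABIC_PORTIONS = {
    "msa": "Omartificial-Intelligence-Space/FineWeb2-MSA",
    "egyptian": "Omartificial-Intelligence-Space/FineWeb2-Egyptian-Arabic",
    "moroccan": "Omartificial-Intelligence-Space/FineWeb2-Moroccan-Arabic",
    "north_levantine": "Omartificial-Intelligence-Space/FineWeb2-North-Levantine-Arabic",
    "najdi": "Omartificial-Intelligence-Space/FineWeb2-Najdi-Arabic",
}


def load_portion(name, split="train", streaming=True):
    """Open one Arabic portion as a (by default streaming) dataset.

    The "train" split name is an assumption; check the dataset viewer
    for the splits each repo actually provides.
    """
    # Imported lazily so the mapping above is usable without `datasets`.
    from datasets import load_dataset

    return load_dataset(ARABIC_PORTIONS[name], split=split, streaming=streaming)


def peek_columns(name):
    """Return the sorted column names of the first record of a portion."""
    first_record = next(iter(load_portion(name)))
    return sorted(first_record.keys())
```

Calling `peek_columns("north_levantine")` on the smallest portion is a cheap way to discover the column names before writing any processing code, since the listing does not state them.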