## Commit History

- Update README.md · `e008419` · verified · committed by gizadatateam

- Update README.md · `134e74d` · verified · committed by gizadatateam

- Update README.md · `5faf516` · verified · committed by gizadatateam

- Create README.md · `1b9c79a` · verified · committed by gizadatateam

- Delete README.md · `c567c55` · verified · committed by gizadatateam

---

## Model Card for `gizadatateam/Arabic-ModernBERT`

### Summary

`Arabic-ModernBERT` is a ModernBERT-architecture masked language model tailored for Arabic. It supports long contexts (up to 8,192 tokens), rotary position embeddings, alternating local-global attention, and GeGLU activations.

### Files & Versions

- **.gitattributes** (1.63 kB)
- **config.json** (1.41 kB)
- **model.safetensors** (845 MB; LFS)
- **special_tokens_map.json** (866 Bytes)
- **tokenizer.json** (18.6 MB; LFS)
- **tokenizer_config.json** (14.2 kB; LFS)
- **training_state.pt** (1.65 GB; LFS)

### Configuration (from `config.json`)

```jsonc
{
  "_attn_implementation_autoset": true,
  "_name_or_path": "./checkpoint_step_20000/",
  "architectures": ["ModernBertForMaskedLM"],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 50281,
  "classifier_activation": "gelu",
  "classifier_bias": false,
  "classifier_dropout": 0.0,
  "classifier_pooling": "mean",
  "cls_token_id": 50281,
  "decoder_bias": true,
  "deterministic_flash_attn": false,
  "embedding_dropout": 0.0,
  "eos_token_id": 50282,
  "global_attn_every_n_layers": 3,
  "global_rope_theta": 160000.0,
  "gradient_checkpointing": false,
  "hidden_activation": "gelu",
  "hidden_size": 768,
  "initializer_cutoff_factor": 2.0,
  "initializer_range": 0.02,
  "intermediate_size": 1152,
  "layer_norm_eps": 1e-05,
  "local_attention": 128,
  "local_rope_theta": 10000.0,
  "max_position_embeddings": 8192,
  "mlp_bias": false,
  "mlp_dropout": 0.0,
  "model_type": "modernbert",
  "norm_bias": false,
  "norm_eps": 1e-05,
  "num_attention_heads": 12,
  "num_hidden_layers": 22,
  "pad_token_id": 50283,
  "position_embedding_type": "absolute",
  "reference_compile": false,
  "repad_logits_with_grad": false,
  "sep_token_id": 50282,
  "sparse_pred_ignore_index": -100,
  "sparse_prediction": false,
  "torch_dtype": "float32",
  "transformers_version": "4.48.3",
  "use_cache": false,
  "vocab_size": 130368
}
```

All fields above are verbatim from the repository's `config.json`.

### Model Statistics

- **Model size**: 211M parameters
- **Tensor type**: `float32`
- **Downloads (last month)**: 0
- **Inference Providers**: none (not deployed)

### Usage Example

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_id = "gizadatateam/Arabic-ModernBERT"

# Load the tokenizer and the masked-LM head from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill in the masked token; the prompt means "Hello, O [MASK]!".
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask("مرحبا أيها [MASK]!"))
```

### Citation

If you use this model, please cite the underlying ModernBERT paper:

```bibtex
@misc{modernbert,
  title={Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference},
  author={Warner, Benjamin and others},
  year={2024},
  eprint={2412.13663},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2412.13663}
}
```

---
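The attention layout and context length claimed in the summary can be read straight off the configuration above. Below is a minimal sketch of how to verify them; it assumes the checkpoint is publicly loadable from the Hub with `transformers`.

```python
from transformers import AutoConfig

# Pull the configuration shown above directly from the Hub.
config = AutoConfig.from_pretrained("gizadatateam/Arabic-ModernBERT")

# Maximum context length claimed in the summary.
print(config.max_position_embeddings)     # 8192

# Alternating attention: every 3rd layer is global, the remaining
# layers use a 128-token local sliding window.
print(config.global_attn_every_n_layers)  # 3
print(config.local_attention)             # 128

# Separate RoPE frequency bases for global and local layers.
print(config.global_rope_theta)           # 160000.0
print(config.local_rope_theta)            # 10000.0
```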
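The usage example above hardcodes the `[MASK]` literal. The card does not state this model's mask token, so a more defensive variant (a sketch, not taken from the repository) builds the prompt from `tokenizer.mask_token` and inspects the top predictions:

```python
from transformers import AutoTokenizer, pipeline

model_id = "gizadatateam/Arabic-ModernBERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build the prompt from the tokenizer's own mask token instead of
# assuming it is the literal string "[MASK]".
prompt = f"مرحبا أيها {tokenizer.mask_token}!"  # "Hello, O [MASK]!"

fill_mask = pipeline("fill-mask", model=model_id, tokenizer=tokenizer)
for prediction in fill_mask(prompt, top_k=5):
    print(f"{prediction['token_str']!r}: {prediction['score']:.4f}")
```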
- `0d36657` · verified · committed by gizadatateam

- Add tokenizer files · `60338d1` · verified · committed by gizadatateam

- Add model files · `7e9b733` · verified · committed by gizadatateam