---
language: id
pipeline_tag: dependency-parsing  # The appropriate task tag for this model
widget:  # Example sentence the model can process
  - text: "Presiden Joko Widodo mengunjungi korban bencana alam di Palu."
license: mit
library_name: spacy
tags:
  - id
  - spacy
  - dependency-parsing
  - indonesian
  - gsd  # Trained on the GSD corpus
model-index:  # This section helps with further indexing (optional but useful)
  - name: spacy-dep-parsing-id  # Model/repository name
    results:  # Main evaluation results
      - task:
          type: dependency-parsing
          name: Dependency Parsing
        dataset:
          type: ud-id-gsd  # Refers to the test dataset
          name: UD Indonesian GSD (Test Split)
          config: test
          split: test
          revision: main  # or a specific commit hash
        metrics:
          - type: dep_uas
            value: 0.8282  # UAS on the test set
            name: UAS (Unlabeled Attachment Score)
          - type: dep_las
            value: 0.7436  # LAS on the test set
            name: LAS (Labeled Attachment Score)
          - type: sents_f
            value: 0.9937  # Sentence F-score on the test set
            name: Sentence F-Score
---

# spaCy Dependency Parsing Model for Indonesian (UD-ID-GSD)

This repository contains a spaCy v3 model trained for **Dependency Parsing** of Indonesian text. The model was trained with the configuration generated by `spacy init config`, using the default settings for the parser component.

## Dataset

The model was trained on the **Universal Dependencies Indonesian GSD (UD-ID-GSD)** dataset.

*(Reference: McDonald, R., Nivre, J., Quirmbach-Brundage, Y., Goldberg, Y., Das, D., Ganchev, K., Hall, K., Petrov, S., Zhang, H., Täckström, O., Bedini, C., Bertomeu Castelló, N., & Lee, J. (2013). Universal Dependency Annotation for Multilingual Parsing.)*

The dataset splits contained the following numbers of sentences:

* **Total sentences (approx.):** 5,593
* **Training set:** 4,477 sentences
* **Development (dev) set:** 559 sentences
* **Test set:** 557 sentences

## Pipeline Components

This model's pipeline contains only the `parser` component; it does **not** include a tagger, NER, or other components by default. The parser relies on token-to-vector embeddings learned internally during training (a quick way to check the installed components is sketched under "Inspecting the Pipeline" below).

## How to Use

You can load this model directly with spaCy after installing it:

```python
import spacy

# Load the model from the Hugging Face Hub
model_id = "freksowibowo/spacy-dep-parsing-id"

try:
    nlp = spacy.load(model_id)
    print(f"Model '{model_id}' loaded successfully.")

    # Example usage
    text = "Gubernur Jawa Barat Ridwan Kamil meresmikan jembatan baru di Cirebon."
    doc = nlp(text)

    print("\nDependency Parse Results:")
    print(f"{'Token':<15} {'Relation':<10} {'Head':<15} {'Head POS':<8}")
    print("-" * 50)
    for token in doc:
        print(f"{token.text:<15} {token.dep_:<10} {token.head.text:<15} {token.head.pos_:<8}")

    # You can also visualize the parse with displaCy (in Jupyter/IPython)
    # from spacy import displacy
    # displacy.render(doc, style="dep", jupyter=True, options={"distance": 100})

except OSError:
    print(f"Error: Model '{model_id}' not found.")
    print("Please ensure the model is installed and the repository ID is correct.")
except Exception as e:
    print(f"An error occurred: {e}")
```
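
### Inspecting the Pipeline

If you want to confirm what the installed pipeline actually contains, spaCy exposes the component names and the parser's label set directly. The snippet below is a minimal sketch and assumes the model loads as shown above; depending on how the pipeline was packaged, a shared `tok2vec` component may or may not appear alongside `parser`.

```python
import spacy

# Assumes the model is installed and loadable as shown in "How to Use".
nlp = spacy.load("freksowibowo/spacy-dep-parsing-id")

# Component names in the pipeline; `parser` is expected here, possibly
# together with a shared `tok2vec` component depending on the config.
print(nlp.pipe_names)

# Dependency relation labels the parser can predict.
parser = nlp.get_pipe("parser")
print(sorted(parser.labels))
```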
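
### Visualizing the Parse

Outside a notebook, `displacy.render` returns the dependency tree as SVG markup that can be written to a file. The snippet below is a minimal sketch using the widget sentence from this card; the output filename is just an example.

```python
from pathlib import Path

import spacy
from spacy import displacy

nlp = spacy.load("freksowibowo/spacy-dep-parsing-id")
doc = nlp("Presiden Joko Widodo mengunjungi korban bencana alam di Palu.")

# With jupyter=False, displacy.render returns the SVG markup as a string.
svg = displacy.render(doc, style="dep", jupyter=False, options={"distance": 100})

# Save the rendered tree; "parse.svg" is an example filename.
Path("parse.svg").write_text(svg, encoding="utf-8")
```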