SaCTI: Sanskrit Compound Type Identifier
Trained models for the paper "A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in Sanskrit" (COLING 2022). If you use these models, please cite the paper.
How to use the models
1. Clone the GitHub repository of the paper:
git clone https://github.com/ashishgupta2598/SaCTI.git
2. Create a new conda environment and activate it:
conda create --name sactienv python=3.9
conda activate sactienv
3. Install all required packages:
pip3 install -r requirements.txt
4. Download the model corresponding to your experiment from this Hugging Face repository, choosing from the Available Models listed below (a scripted-download sketch follows the list):
/save_models_english
/save_models_marathi
/save_models_saCTIbase_coarse
/save_models_saCTIbase_fine
/save_models_saCTIlarge_coarse
/save_models_saCTIlarge_fine
Each of the above folders contains a bert model, a posdep model, and an xlm-roberta-base model.
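If you prefer to script the download and the models are hosted on the Hugging Face Hub, a minimal sketch using huggingface_hub is shown below; the repo id is a placeholder that must be replaced with this repository's actual id:

```python
# Minimal sketch: fetch one experiment's model folder from the Hugging Face Hub.
# Assumption: REPO_ID is a placeholder for this repository's actual Hub id.
from huggingface_hub import snapshot_download

REPO_ID = "<user-or-org>/SaCTI-models"  # placeholder

local_dir = snapshot_download(
    repo_id=REPO_ID,
    allow_patterns=["save_models_saCTIbase_coarse/*"],  # only one experiment's folder
)
print(local_dir)  # root directory of the downloaded snapshot
```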
5. Run the following command in bash:
python3 main.py --model_path='<path to downloaded model>' --experiment='<exp-name>' --training=False
The following are the valid exp-names:
english
marathi
sacti-base_coarse
sacti-base_fine
sacti-large_coarse
sacti-large_fine
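For example, to run inference with the saCTI-base coarse model (the local path below is hypothetical; point it at wherever you downloaded the folder):

```bash
python3 main.py --model_path='./save_models_saCTIbase_coarse' --experiment='sacti-base_coarse' --training=False
```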
NOTE: These models were obtained by running the training pipeline described in the official GitHub repository with the default batch size of 75 and 70 epochs.
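If you want to reproduce the training run, the sketch below assumes that main.py's --training flag also accepts True and that batch size (75) and epochs (70) come from the repository defaults; check both assumptions against the official repository:

```bash
# Assumption: --training=True triggers the training pipeline;
# batch size (75) and epochs (70) are the repository defaults.
python3 main.py --experiment='sacti-base_coarse' --training=True
```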
Folder Structure
├── LICENSE
├── README.md
├── save_models_english/
│   ├── bert/
│   │   └── model.pth
│   ├── posdep/
│   │   └── model.pth
│   └── xlm-roberta-base/
│       └── customized-mwt-ner/
│           ├── customized-mwt-ner.tagger.mdl
│           └── customized-mwt-ner.vocabs.json
├── save_models_marathi/
│   └── ... (same structure as above)
├── save_models_saCTIbase_coarse/
│   └── ... (same structure as above)
├── save_models_saCTIbase_fine/
│   └── ... (same structure as above)
├── save_models_saCTIlarge_coarse/
│   └── ... (same structure as above)
└── save_models_saCTIlarge_fine/
    └── ... (same structure as above)
Each folder contains three models:
bert/model.pth
posdep/model.pth
xlm-roberta-base/customized-mwt-ner/
customized-mwt-ner.tagger.mdl
customized-mwt-ner.vocabs.json
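As a quick sanity check after downloading, the sketch below verifies that a model folder has the expected layout and that the two PyTorch checkpoints deserialize; the folder path is a placeholder, and loading fully pickled models may require the SaCTI repository code on your PYTHONPATH:

```python
# Sanity-check a downloaded model folder (the path is a placeholder).
from pathlib import Path
import torch

root = Path("save_models_saCTIbase_coarse")
expected = [
    "bert/model.pth",
    "posdep/model.pth",
    "xlm-roberta-base/customized-mwt-ner/customized-mwt-ner.tagger.mdl",
    "xlm-roberta-base/customized-mwt-ner/customized-mwt-ner.vocabs.json",
]
for rel in expected:
    assert (root / rel).is_file(), f"missing: {rel}"

# Confirm the checkpoints deserialize on CPU. If model.pth stores a full
# pickled model rather than a state dict, the repository's classes must be
# importable for this to succeed.
for rel in ("bert/model.pth", "posdep/model.pth"):
    obj = torch.load(root / rel, map_location="cpu")
    print(rel, "->", type(obj).__name__)
```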
Citation
@inproceedings{sandhan-etal-2022-novel,
title = "A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in {S}anskrit",
author = "Sandhan, Jivnesh and Gupta, Ashish and Terdalkar, Hrishikesh and Sandhan, Tushar and Samanta, Suvendu and Behera, Laxmidhar and Goyal, Pawan",
booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
month = oct,
year = "2022",
address = "Gyeongju, Republic of Korea",
publisher = "International Committee on Computational Linguistics",
url = "https://aclanthology.org/2022.coling-1.358",
pages = "4071--4083",
abstract = "The phenomenon of compounding is ubiquitous in Sanskrit. It serves for achieving brevity in expressing thoughts, while simultaneously enriching the lexical and structural formation of the language. In this work, we focus on the Sanskrit Compound Type Identification (SaCTI) task, where we consider the problem of identifying semantic relations between the components of a compound word. Earlier approaches solely rely on the lexical information obtained from the components and ignore the most crucial contextual and syntactic information useful for SaCTI. However, the SaCTI task is challenging primarily due to the implicitly encoded context-sensitive semantic relation between the compound components. Thus, we propose a novel multi-task learning architecture which incorporates the contextual information and enriches the complementary syntactic information using morphological tagging and dependency parsing as two auxiliary tasks. Experiments on the benchmark datasets for SaCTI show 6.1 points (Accuracy) and 7.7 points (F1-score) absolute gain compared to the state-of-the-art system. Further, our multi-lingual experiments demonstrate the efficacy of the proposed architecture in English and Marathi languages.",
}
License
This project is licensed under the terms of the Apache License 2.0.
Acknowledgements
The models in this repository were obtained by training as described in the original paper. See the official GitHub repository of the paper: https://github.com/ashishgupta2598/SaCTI