SaCTI: Sanskrit Compound Type Identifier
Trained models for the paper "A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in Sanskrit" (COLING 2022). If you use these models, please cite the paper.
How to use the models
1. Clone the GitHub repository of the paper:
git clone https://github.com/ashishgupta2598/SaCTI.git
2. Create a new conda environment and activate it:
conda create --name sactienv python=3.9
conda activate sactienv
3. Install all required packages:
pip3 install -r requirements.txt
4. Download the model corresponding to your experiment from this Hugging Face repository, choosing from the Available Models listed below (a scripted-download sketch follows the list):
/save_models_english
/save_models_marathi
/save_models_saCTIbase_coarse
/save_models_saCTIbase_fine
/save_models_saCTIlarge_coarse
/save_models_saCTIlarge_fine
Each of the above folders contains a bert model, a posdep model, and an xlm-roberta-base model.
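If you prefer to script the download and the models are hosted on the Hugging Face Hub, a minimal sketch using huggingface_hub is shown below; the repo id is a placeholder that must be replaced with this repository's actual id:

```python
# Minimal sketch: fetch one experiment's model folder from the Hugging Face Hub.
# Assumption: REPO_ID is a placeholder for this repository's actual Hub id.
from huggingface_hub import snapshot_download

REPO_ID = "<user-or-org>/SaCTI-models"  # placeholder

local_dir = snapshot_download(
    repo_id=REPO_ID,
    allow_patterns=["save_models_saCTIbase_coarse/*"],  # only one experiment's folder
)
print(local_dir)  # root directory of the downloaded snapshot
```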
5. Run the following command in bash:
python3 main.py --model_path='<path to downloaded model>' --experiment='<exp-name>' --training=False
The following are the valid exp-names:
english
marathi
sacti-base_coarse
sacti-base_fine
sacti-large_coarse
sacti-large_fine
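For example, to run inference with the saCTI-base coarse model (the local path below is hypothetical; point it at wherever you downloaded the folder):

```bash
python3 main.py --model_path='./save_models_saCTIbase_coarse' --experiment='sacti-base_coarse' --training=False
```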
NOTE: These models were obtained by running the training pipeline described in the official GitHub repository with the default batch size of 75 and 70 epochs.
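If you want to reproduce the training run, the sketch below assumes that main.py's --training flag also accepts True and that batch size (75) and epochs (70) come from the repository defaults; check both assumptions against the official repository:

```bash
# Assumption: --training=True triggers the training pipeline;
# batch size (75) and epochs (70) are the repository defaults.
python3 main.py --experiment='sacti-base_coarse' --training=True
```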
Folder Structure
├── LICENSE
├── README.md
├── save_models_english/
│   ├── bert/
│   │   └── model.pth
│   ├── posdep/
│   │   └── model.pth
│   └── xlm-roberta-base/
│       └── customized-mwt-ner/
│           ├── customized-mwt-ner.tagger.mdl
│           └── customized-mwt-ner.vocabs.json
├── save_models_marathi/
│   └── ... (same structure as above)
├── save_models_saCTIbase_coarse/
│   └── ... (same structure as above)
├── save_models_saCTIbase_fine/
│   └── ... (same structure as above)
├── save_models_saCTIlarge_coarse/
│   └── ... (same structure as above)
└── save_models_saCTIlarge_fine/
    └── ... (same structure as above)
Each folder contains three models:
bert/model.pth
posdep/model.pth
xlm-roberta-base/customized-mwt-ner/
customized-mwt-ner.tagger.mdl
customized-mwt-ner.vocabs.json
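As a quick sanity check after downloading, the sketch below verifies that a model folder has the expected layout and that the two PyTorch checkpoints deserialize; the folder path is a placeholder, and loading fully pickled models may require the SaCTI repository code on your PYTHONPATH:

```python
# Sanity-check a downloaded model folder (the path is a placeholder).
from pathlib import Path
import torch

root = Path("save_models_saCTIbase_coarse")
expected = [
    "bert/model.pth",
    "posdep/model.pth",
    "xlm-roberta-base/customized-mwt-ner/customized-mwt-ner.tagger.mdl",
    "xlm-roberta-base/customized-mwt-ner/customized-mwt-ner.vocabs.json",
]
for rel in expected:
    assert (root / rel).is_file(), f"missing: {rel}"

# Confirm the checkpoints deserialize on CPU. If model.pth stores a full
# pickled model rather than a state dict, the repository's classes must be
# importable for this to succeed.
for rel in ("bert/model.pth", "posdep/model.pth"):
    obj = torch.load(root / rel, map_location="cpu")
    print(rel, "->", type(obj).__name__)
```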
Citation
@inproceedings{sandhan-etal-2022-novel,
title = "A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in {S}anskrit",
author = "Sandhan, Jivnesh and Gupta, Ashish and Terdalkar, Hrishikesh and Sandhan, Tushar and Samanta, Suvendu and Behera, Laxmidhar and Goyal, Pawan",
booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
month = oct,
year = "2022",
address = "Gyeongju, Republic of Korea",
publisher = "International Committee on Computational Linguistics",
url = "https://aclanthology.org/2022.coling-1.358",
pages = "4071--4083",
abstract = "The phenomenon of compounding is ubiquitous in Sanskrit. It serves for achieving brevity in expressing thoughts, while simultaneously enriching the lexical and structural formation of the language. In this work, we focus on the Sanskrit Compound Type Identification (SaCTI) task, where we consider the problem of identifying semantic relations between the components of a compound word. Earlier approaches solely rely on the lexical information obtained from the components and ignore the most crucial contextual and syntactic information useful for SaCTI. However, the SaCTI task is challenging primarily due to the implicitly encoded context-sensitive semantic relation between the compound components. Thus, we propose a novel multi-task learning architecture which incorporates the contextual information and enriches the complementary syntactic information using morphological tagging and dependency parsing as two auxiliary tasks. Experiments on the benchmark datasets for SaCTI show 6.1 points (Accuracy) and 7.7 points (F1-score) absolute gain compared to the state-of-the-art system. Further, our multi-lingual experiments demonstrate the efficacy of the proposed architecture in English and Marathi languages.",
}
License
This project is licensed under the terms of the Apache License 2.0.
Acknowledgements
The models in this repository were obtained by training as described in the original paper. See the official GitHub repository of the paper: https://github.com/ashishgupta2598/SaCTI