updated readme
README.md
CHANGED
|
@@ -58,14 +58,18 @@ widget:
|
| 58 |
example_title: "gast and dust"
|
| 59 |
---
|
| 60 |
|
| 61 |
-
# astroBERT
|
| 62 |
-
|
| 63 |
-
|
| 64 |
-
This model is cased (it treats `ads` and `ADS` differently).
|
| 65 |
|
| 66 |
-
##
|
| 67 |
-
0. [
|
| 68 |
-
1. [
|
| 69 |
|
| 70 |
|
| 71 |
### BibTeX
|
| 58 |
example_title: "gast and dust"
|
| 59 |
---
|
| 60 |
|
| 61 |
+
# ***astroBERT: a language model for astrophysics***
|
| 62 |
+
This public repository contains the work of [NASA/ADS](https://ui.adsabs.harvard.edu/) on building a language model tailored to astrophysics, along with tutorials and miscellaneous related files.
|
| 63 |
+
This model is **cased** (it treats `ads` and `ADS` differently).
|
| 64 |
|
| 65 |
+
## astroBERT models
|
| 66 |
+
0. **Base model**: A model pretrained on English-language text using masked language modeling (MLM) and next sentence prediction (NSP) objectives. It was introduced in [this paper at ADASS 2021](https://arxiv.org/abs/2112.00590) and made public at ADASS 2022.
|
| 67 |
+
1. **NER-DEAL model**: The base model with an added token classification head, fine-tuned on the [DEAL@WIESP2022 named entity recognition](https://ui.adsabs.harvard.edu/WIESP/2022/SharedTasks) task. It must be loaded from the `NER-DEAL` branch by passing `revision='NER-DEAL'` (see tutorial 2).
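
The two variants listed above differ in both the branch they live on and the head they carry, so loading them might look roughly like the sketch below. It assumes the Hugging Face `transformers` library, that the base model sits on the default `main` branch, and that a tokenizer is available on each branch; the `VARIANTS` table and the `load_astrobert` helper are hypothetical names introduced only for illustration and are not part of the repository.

```python
# Minimal sketch: pick the branch and head for each astroBERT variant.
MODEL_ID = "adsabs/astroBERT"  # repository name as it appears in the tutorial URLs

# Hypothetical lookup table (an assumption, not shipped with the model):
# maps a variant name to its Hub revision and the transformers Auto class.
VARIANTS = {
    "base": {"revision": "main", "auto_class": "AutoModelForMaskedLM"},
    "ner-deal": {
        "revision": "NER-DEAL",
        "auto_class": "AutoModelForTokenClassification",
    },
}


def load_astrobert(variant="base"):
    """Return (tokenizer, model) for the requested variant."""
    # Imported lazily so the table above can be inspected without transformers.
    import transformers

    spec = VARIANTS[variant]
    tokenizer = transformers.AutoTokenizer.from_pretrained(
        MODEL_ID, revision=spec["revision"]
    )
    model_class = getattr(transformers, spec["auto_class"])
    # `revision` selects the branch; NER-DEAL weights are not on the default one.
    model = model_class.from_pretrained(MODEL_ID, revision=spec["revision"])
    return tokenizer, model
```

Calling `load_astrobert("ner-deal")` would download the weights on first use, so it needs network access. Tutorial 2 remains the reference for the options the authors actually use.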
|
| 68 |
+
|
| 69 |
+
### Tutorials
|
| 70 |
+
0. [generate text embeddings (for downstream tasks)](https://nbviewer.org/urls/huggingface.co/adsabs/astroBERT/raw/main/Tutorials/0_Embeddings.ipynb)
|
| 71 |
+
1. [use astroBERT for the Fill-Mask task](https://nbviewer.org/urls/huggingface.co/adsabs/astroBERT/raw/main/Tutorials/1_Fill-Mask.ipynb)
|
| 72 |
+
2. [make NER-DEAL predictions](https://nbviewer.org/urls/huggingface.co/adsabs/astroBERT/raw/main/Tutorials/2_NER_DEAL.ipynb)
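
As a rough companion to the Fill-Mask tutorial listed above, the sketch below shows one way the task could be set up with the `transformers` `pipeline` API. The helpers `mask_word` and `fill_mask` are hypothetical, and `[MASK]` is assumed to be the mask token (the usual BERT convention); in real use it is safer to read `tokenizer.mask_token` than to hard-code it.

```python
def mask_word(sentence, word, mask_token="[MASK]"):
    """Replace the first whitespace-delimited occurrence of `word` with the mask token.

    The comparison is case-sensitive, in keeping with a cased model.
    """
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok == word:
            tokens[i] = mask_token
            return " ".join(tokens)
    raise ValueError(f"{word!r} not found in sentence")


def fill_mask(sentence, top_k=3):
    """Return (token, score) pairs for a sentence containing one mask token."""
    # Lazy import: transformers and network access are only needed here.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model="adsabs/astroBERT")
    return [(p["token_str"], p["score"]) for p in unmasker(sentence, top_k=top_k)]
```

For example, `fill_mask(mask_word("Stars form in clouds of gas and dust", "dust"))` would ask the model for its top candidates for the masked word. The predictions themselves depend on the downloaded weights.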
|
| 73 |
|
| 74 |
|
| 75 |
### BibTeX
|