sjhuskey committed · Commit 3f1798a · verified · 1 Parent(s): 1ec9ee6

update readme with ref to dataset

Files changed (1)
  1. README.md +6 -1
README.md CHANGED

@@ -6,6 +6,11 @@ language:
  base_model:
  - distilbert/distilbert-base-multilingual-cased
  library_name: transformers
+ datasets:
+ - sjhuskey/latin_author_dll_id
+ metrics:
+ - f1
+ - accuracy
  ---

  # DLL Catalog Author Reconciliation Model

@@ -26,4 +31,4 @@ Achieving accuracy and reliability in this process will make the second goal of

  ## The Model

- After preliminary experiments with sequential neural network models using [bag-of-words](https://en.wikipedia.org/wiki/Bag-of-words_model), [term frequency-inverse document frequency](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) (tf-idf), and custom word embedding encoding, I settled on using a pretrained BERT model developed by [Devlin et al. 2018](https://arxiv.org/abs/1810.04805v2). Specifically, I'm using [Hugging Face's DistilBert base multilingual (cased) model](https://huggingface.co/distilbert/distilbert-base-multilingual-cased), which is based on work by [Sanh et al. 2020](https://doi.org/10.48550/arXiv.1910.01108).
+ After preliminary experiments with sequential neural network models using [bag-of-words](https://en.wikipedia.org/wiki/Bag-of-words_model), [term frequency-inverse document frequency](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) (tf-idf), and custom word embedding encoding, I settled on using a pretrained BERT model developed by [Devlin et al. 2018](https://arxiv.org/abs/1810.04805v2). Specifically, I'm using [Hugging Face's DistilBert base multilingual (cased) model](https://huggingface.co/distilbert/distilbert-base-multilingual-cased), which is based on work by [Sanh et al. 2020](https://doi.org/10.48550/arXiv.1910.01108).
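
For context (not part of the commit itself), here is a minimal sketch of how the dataset newly referenced in the front matter and the base model already listed under `base_model:` could be loaded with the Hugging Face `datasets` and `transformers` libraries. The split name (`"train"`), the label column name (`"label"`), and the sequence-classification head are assumptions about the dataset layout and fine-tuning setup, not details taken from this commit.

```python
# Hypothetical usage sketch: the split name ("train") and label column
# ("label") are assumptions about the layout of sjhuskey/latin_author_dll_id.
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Dataset added to the README front matter in this commit.
dataset = load_dataset("sjhuskey/latin_author_dll_id")

# Base model already listed under `base_model:` in the front matter.
checkpoint = "distilbert/distilbert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# One class per distinct DLL author identifier (assumed label column).
num_labels = len(set(dataset["train"]["label"]))
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=num_labels)
```

The `metrics:` entries added here (`f1`, `accuracy`) would correspond to metrics such as `evaluate.load("f1")` and `evaluate.load("accuracy")` from the Hugging Face `evaluate` library when evaluating the fine-tuned model.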