
scibert-wechsel-korean

SciBERT (🇺🇸) converted into Korean (🇰🇷) using the WECHSEL technique.

Description

  • SciBERT is trained on papers from the semanticscholar.org corpus. The corpus contains 1.14M papers (3.1B tokens).
  • WECHSEL converts the embedding layer's subword tokens from the source language to the target language.
  • This model is the English-trained SciBERT converted to Korean with the WECHSEL technique.
  • The Korean tokenizer was chosen from the KLUE PLMs' tokenizers for its similar vocabulary size (32,000) and its performance.
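To make the embedding conversion described above more concrete, here is a minimal sketch of the general idea: each target-language subword gets an initial embedding that is a similarity-weighted average of the source model's subword embeddings. It assumes that auxiliary vectors for source and target subwords already live in one shared, aligned space and are non-zero. The function names, the `k` and `temperature` parameters, and the toy data are hypothetical illustrations, not the API of the WECHSEL library and not necessarily the exact procedure used to build this model.

```python
import math


def _cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def init_target_embeddings(source_emb, source_aux, target_aux, k=2, temperature=0.1):
    """Initialize target-subword embeddings from source-subword embeddings.

    source_emb: embeddings of the source model, one row per source subword
    source_aux: auxiliary vectors of the source subwords (shared space, assumed given)
    target_aux: auxiliary vectors of the target subwords (same shared space)
    k, temperature: hypothetical knobs for the neighbour count and softmax sharpness
    """
    dim = len(source_emb[0])
    target_emb = []
    for t_vec in target_aux:
        # Rank source subwords by similarity to this target subword.
        sims = [(_cosine(t_vec, s_vec), i) for i, s_vec in enumerate(source_aux)]
        nearest = sorted(sims, reverse=True)[:k]
        # Softmax over the k nearest neighbours (shifted by the max for stability).
        top = nearest[0][0]
        weights = [math.exp((sim - top) / temperature) for sim, _ in nearest]
        total = sum(weights)
        # Weighted average of the neighbours' source-model embeddings.
        row = [
            sum(w / total * source_emb[i][d] for w, (_, i) in zip(weights, nearest))
            for d in range(dim)
        ]
        target_emb.append(row)
    return target_emb


# Toy data: two source subwords, three target subwords.
src_emb = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
src_aux = [[1.0, 0.0], [0.0, 1.0]]
tgt_aux = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
new_emb = init_target_embeddings(src_emb, src_aux, tgt_aux, k=2)
```

In this toy setup, a target subword whose auxiliary vector points the same way as one source subword inherits essentially that subword's embedding, while one that is equally similar to both receives their average. The remaining Transformer layers are left unchanged in this sketch; only the embedding table is rebuilt for the new vocabulary.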

Reference
