shisa-v1 - a shisa-ai Collection

shisa-ai 's Collections

updated May 27, 2024

JA/EN Bilingual LLMs

shisa-ai/shisa-v1-llama3-70b

Text Generation • Updated Mar 19 • 86 • 3

Note 2024-05: The shisa-v1 dataset applied to Llama 3 Instruct 70B outperforms gpt-3.5-turbo
shisa-ai/shisa-v1-llama3-8b

Text Generation • Updated Mar 19 • 51 • 6

Note 2024-05: The shisa-v1 dataset applied to Llama 3 Instruct 8B leads to significantly improved performance
augmxnt/shisa-gamma-7b-v1

Text Generation • Updated Mar 9 • 21.3k • • 18

Note 2023-12: A version using the shisa-v1 dataset applied to Japanese Stable LM Base Gamma 7B. Less tokenizer efficiency, but better overall performance
augmxnt/shisa-7b-v1

Text Generation • Updated Dec 20, 2023 • 387 • 28

Note 2023-12: In addition to SFT, this also underwent a DPO round which improved human preference rating
augmxnt/ultra-orca-boros-en-ja-v1

Viewer • Updated Apr 12 • 188k • 34 • 10

Note Largely synthetic dataset combining Airoboros, Ultrachat, Orca in JA and EN. Also, the Jaster train set
augmxnt/shisa-base-7b-v1

Text Generation • Updated Dec 9, 2023 • 333 • 16

Note 2023-12: A continued pre-train (8B 90% JA tokens) of Mistral 7B v0.1 w/ tokenizer extension; probably needs 10B more tokens of pretraining tbt
augmxnt/shisa-pretrain-en-ja-v1

Viewer • Updated Dec 5, 2023 • 4.7M • 50 • 7