Sarashina2.2 Collection Large Language Models developed by SB Intuitions. Pretrained and instruction-tuned models are available in three sizes: 0.5B, 1B, and 3B. • 6 items • Updated Mar 5 • 6
Asagi-VLM Collection Asagi is a Japanese Vision & Language model, trained on a large-scale synthetic dataset. • 4 items • Updated Feb 24 • 6
TinySwallow Collection Compact Japanese models trained with "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models" • 5 items • Updated Jan 30 • 17
Article Navigating Korean LLM Research #2: Evaluation Tools By amphora • Oct 23, 2024 • 8
LLM-jp-3 Fine-tuned Models Collection Fine-tuned models in the LLM-jp-3 model series • 25 items • Updated 30 days ago • 6
LLM-jp-3 Pre-trained Models Collection Pre-trained models in the LLM-jp-3 model series • 10 items • Updated 30 days ago • 6
Article How to generate text: using different decoding methods for language generation with Transformers By patrickvonplaten • Mar 1, 2020 • 218
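As a quick reference for the article above, here is a minimal sketch of the decoding strategies it compares (greedy decoding, beam search, and nucleus sampling) using the `transformers` `generate()` API; the model name is only an illustrative placeholder, not one prescribed by the article.

```python
# Minimal sketch of the decoding methods discussed in the article above.
# "gpt2" is a placeholder; any causal LM on the Hub (e.g. a Japanese model
# from the collections listed here) can be substituted.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("The quick brown fox", return_tensors="pt")

# Greedy decoding: always pick the single most probable next token.
greedy = model.generate(**inputs, max_new_tokens=20)

# Beam search: keep the num_beams most probable partial sequences.
beam = model.generate(**inputs, max_new_tokens=20, num_beams=5, early_stopping=True)

# Nucleus (top-p) sampling: sample from the smallest token set whose
# cumulative probability exceeds top_p.
sampled = model.generate(
    **inputs, max_new_tokens=20, do_sample=True, top_p=0.92, top_k=0
)

for output in (greedy, beam, sampled):
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```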
PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency Paper • 2410.07563 • Published Oct 10, 2024 • 2
gemma-2-baku Collection The baku model series is based on the gemma-2 series and has been continually pre-trained on Japanese-specific corpora. • 4 items • Updated Mar 16 • 5
Gemma 2 JPN Release Collection A Gemma 2 2B model fine-tuned on Japanese text. It supports the Japanese language with the same level of performance as English-only queries on Gemma 2. • 3 items • Updated 27 days ago • 29
Japanese SimCSE Collection Tsukagoshi et al., Japanese SimCSE Technical Report, arXiv 2023. https://arxiv.org/abs/2310.19349 • 5 items • Updated Apr 18 • 2