Llama-3.1-8B-Spider-SQL-Ko
A Text-to-SQL model that converts Korean questions into SQL queries. It was fine-tuned on spider-ko, a Korean translation of the Spider training set.
📊 Key Performance
Evaluation on the Korean Spider validation set (1,034 examples):
- Exact match: 42.65% (441/1034)
- Execution accuracy: 65.47% (677/1034)
💡 Execution accuracy is higher than exact match because queries that differ in SQL syntax often return identical results.
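To make that gap concrete, here is a minimal, self-contained illustration (simplified from the official Spider evaluation, which matches query components rather than raw strings): two queries that fail string-level exact match but pass an execution check.

```python
import sqlite3

# Tiny in-memory stand-in for a Spider database (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (singer_id INTEGER, name TEXT, country TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?, ?, ?)",
                 [(1, "Liliane", "France", 43), (2, "Joe", "Netherlands", 52)])

gold = "SELECT count(*) FROM singer"
pred = "SELECT COUNT(singer_id) FROM singer"  # different text, same answer

# Exact match compares query strings; execution accuracy compares result sets.
print(gold.lower() == pred.lower())                                    # False
print(conn.execute(gold).fetchall() == conn.execute(pred).fetchall())  # True
```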
🚀 Quick Start
```python
from unsloth import FastLanguageModel

# Load the model (4-bit quantized)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="huggingface-KREW/Llama-3.1-8B-Spider-SQL-Ko",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch to inference mode

# Korean question → SQL
question = "가수는 몇 명이 있나요?"  # "How many singers are there?"
schema = """테이블: singer
컬럼: singer_id, name, country, age"""  # Table / Columns

prompt = f"""데이터베이스 스키마:
{schema}
질문: {question}
SQL:"""  # "Database schema: ... / Question: ... / SQL:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
# Result: SELECT count(*) FROM singer
```
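If unsloth is not available, the checkpoint should also load with plain transformers + bitsandbytes. This is an untested sketch that assumes the repository ships standard merged weights (if it contains only LoRA adapters, loading via `peft` would be needed instead):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "huggingface-KREW/Llama-3.1-8B-Spider-SQL-Ko"
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
```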
📋 Model Overview
- Base model: Llama 3.1 8B Instruct (4-bit quantized)
- Training data: spider-ko (1 epoch)
- Supported DBs: 166 databases across diverse domains (Spider dataset)
- Training method: LoRA (r=16, alpha=32)
💬 Usage Examples
Basic usage
```python
def generate_sql(question, schema_info):
    """Convert a Korean question into a SQL query."""
    # Prompt: "Referring to the database schema below, write a SQL query for the question."
    prompt = f"""다음 데이터베이스 스키마를 참고하여 질문에 대한 SQL 쿼리를 생성하세요.
### 데이터베이스 스키마:
{schema_info}
### 질문: {question}
### SQL 쿼리:"""
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=150, temperature=0.1, do_sample=True)
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # The decoded text echoes the prompt, so keep only what follows the SQL marker.
    return response.split("### SQL 쿼리:")[-1].strip()
```
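For example, calling the helper with the singer schema from the quick start (schema string and expected output taken from the examples above):

```python
schema_info = """테이블: singer
컬럼: singer_id, name, country, age"""

sql = generate_sql("가수는 몇 명이 있나요?", schema_info)  # "How many singers are there?"
print(sql)  # Expected: SELECT count(*) FROM singer
```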
Worked examples
```python
# Example 1: aggregation
question = "부서장들 중 56세보다 나이가 많은 사람이 몇 명입니까?"
# ("How many department heads are older than 56?")
# Result: SELECT count(*) FROM head WHERE age > 56

# Example 2: join
question = "가장 많은 대회를 개최한 도시의 상태는 무엇인가요?"
# ("What is the status of the city that hosted the most competitions?")
# Result: SELECT T1.Status FROM city AS T1 JOIN farm_competition AS T2 ON T1.City_ID = T2.Host_city_ID GROUP BY T2.Host_city_ID ORDER BY COUNT(*) DESC LIMIT 1

# Example 3: subquery
question = "기업가가 아닌 사람들의 이름은 무엇입니까?"
# ("What are the names of people who are not entrepreneurs?")
# Result: SELECT Name FROM people WHERE People_ID NOT IN (SELECT People_ID FROM entrepreneur)
```
⚠️ Usage Notes
Limitations
- Uses English table/column names (Korean question → English SQL)
- Optimized for the domains covered by the Spider dataset
- NoSQL and graph databases are not supported
- Accuracy drops on very complex nested queries (see the guard sketch below)
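Because of the last point in particular, it is worth executing generated SQL defensively rather than trusting it. A minimal sketch using sqlite3 in read-only mode; the database path is illustrative, not part of this repository:

```python
import sqlite3

def safe_execute(db_path: str, sql: str, max_rows: int = 100):
    """Run model-generated SQL read-only; return (rows, error) instead of raising."""
    # mode=ro opens the file read-only, so a bad query cannot modify data.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return conn.execute(sql).fetchmany(max_rows), None
    except sqlite3.Error as exc:
        return None, str(exc)  # e.g. a malformed, overly nested query
    finally:
        conn.close()

# Hypothetical Spider-style database file:
rows, err = safe_execute("spider/database/concert_singer/concert_singer.sqlite",
                         "SELECT count(*) FROM singer")
```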
🔧 Technical Specifications
Training environment
- GPU: NVIDIA Tesla T4 (16GB)
- Training time: ~4 hours
- Memory usage: up to 7.6 GB VRAM
Hyperparameters
```python
training_args = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 5e-4,
    "num_train_epochs": 1,
    "optimizer": "adamw_8bit",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.05,
}

lora_config = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
}
```
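For reference, these values would wire into an Unsloth + TRL run roughly as follows. This is a reconstruction from the numbers above, not the authors' actual script; the dataset loading and the `text` column of pre-formatted prompts are assumptions:

```python
from trl import SFTTrainer
from transformers import TrainingArguments

# Attach LoRA adapters with the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,       # spider-ko examples (assumed)
    dataset_text_field="text",         # column with formatted prompts (assumed)
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=5e-4,
        num_train_epochs=1,
        optim="adamw_8bit",
        lr_scheduler_type="cosine",
        warmup_ratio=0.05,
        output_dir="outputs",
    ),
)
trainer.train()
```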
📚 References
Citation
```bibtex
@misc{llama31_spider_sql_ko_2025,
  title={Llama-3.1-8B-Spider-SQL-Ko: Korean Text-to-SQL Model},
  author={Sim, Sohyun and Cho, Youngjun and Choi, Seongwoo},
  year={2025},
  publisher={Hugging Face KREW},
  url={https://huggingface.co/huggingface-KREW/Llama-3.1-8B-Spider-SQL-Ko}
}
```
Related papers
- Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task (Yu et al., 2018)
🤝 Contributors
Sohyun Sim, Youngjun Cho, Seongwoo Choi