Training Dataset

[AI Hub] Natural-language query (NL2SQL) search and generation data

https://huggingface.co/combe4259/NHSQLNL/blob/main/TEXT_NL2SQL_label_nh_consultation.json
https://huggingface.co/combe4259/NHSQLNL/blob/main/nh_consultation_db_annotation.json

NHSQLNL: Financial Natural Language → SQL Conversion Model

NHSQLNL is a Text-to-SQL (NL2SQL) model that converts Korean natural-language financial queries into SQL queries.
It automatically converts banking and financial-domain questions into database queries (SQL), so it can be used in customer question-answering systems and financial data analysis.


Key Features

  • Converts Korean financial-domain natural-language input into SQL queries
  • Generates safe SQL that conforms to a predefined schema
  • Built on PyTorch and Hugging Face transformers
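
The card does not say how the "safe SQL" property is enforced, so callers may want a guard of their own before running model output. The following is a minimal sketch and not part of NHSQLNL: the helper name `is_read_only_select` and the blocked-keyword list are assumptions. The check is deliberately conservative, so it also rejects harmless queries whose string literals contain a semicolon or a blocked keyword.

```python
import re

# Hypothetical guard, not part of NHSQLNL: accept only one read-only SELECT.
BLOCKED_KEYWORDS = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|TRUNCATE|GRANT|REVOKE)\b",
    re.IGNORECASE,
)


def is_read_only_select(sql: str) -> bool:
    """Return True if `sql` looks like a single SELECT statement."""
    # Allow one optional trailing semicolon, then forbid any others.
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False
    # The statement must start with the SELECT keyword.
    if not re.match(r"SELECT\b", statement, re.IGNORECASE):
        return False
    # Reject anything that mentions a data-modifying keyword.
    return BLOCKED_KEYWORDS.search(statement) is None
```

A check like this would sit alongside database-level protections such as a read-only connection or a restricted database user, not replace them.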

How to Use

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the model and tokenizer
MODEL_PATH = "combe4259/NHSQLNL"
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_PATH)

# Input question (Korean): "Tell me the number of deposit accounts opened in 2023"
query = "2023년에 개설된 예금 계좌 수를 알려줘"

inputs = tokenizer(query, return_tensors="pt")

# Generate the SQL
outputs = model.generate(**inputs, max_length=128)
sql = tokenizer.decode(outputs[0], skip_special_tokens=True)

print("Input:", query)
print("Generated SQL:", sql)
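
To act on the generated SQL, it has to be executed against a database. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `accounts` table, its columns, the sample rows, and the `example_sql` string are all hypothetical stand-ins; they are not the model's real schema or real model output.

```python
import sqlite3


def run_query(conn: sqlite3.Connection, sql: str) -> list:
    """Execute one SQL statement and return all result rows."""
    return conn.execute(sql).fetchall()


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "account_id INTEGER PRIMARY KEY, product_type TEXT, opened_at TEXT)"
)
conn.executemany(
    "INSERT INTO accounts (product_type, opened_at) VALUES (?, ?)",
    [
        ("deposit", "2023-03-01"),
        ("deposit", "2022-11-15"),
        ("loan", "2023-07-20"),
        ("deposit", "2023-12-31"),
    ],
)

# Hypothetical stand-in for the `sql` string produced by the model.
example_sql = (
    "SELECT COUNT(*) FROM accounts "
    "WHERE product_type = 'deposit' AND opened_at LIKE '2023%'"
)

rows = run_query(conn, example_sql)
print(rows)  # [(2,)]
```

In a real deployment the connection would point at the production schema the model was trained for, ideally through a read-only account.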


---

## Training Data

- Uses a financial-domain **natural language ↔ SQL mapping dataset** built in-house
- Data preprocessing: SQL schema normalization and tokenizer-based input conversion
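
The card does not spell out its normalization rules. As an illustration only, the sketch below shows one plausible reading of SQL normalization before tokenization: collapsing whitespace, dropping a trailing semicolon, and upper-casing a small set of keywords. The function name `normalize_sql` and the keyword list are assumptions, and the keyword pass does not skip string literals.

```python
import re

# Hypothetical keyword list; the real preprocessing rules are not documented.
KEYWORDS = (
    "select", "from", "where", "and", "or",
    "group by", "order by", "count", "limit",
)


def normalize_sql(sql: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, upper-case keywords."""
    text = re.sub(r"\s+", " ", sql).strip().rstrip(";").strip()
    for keyword in KEYWORDS:
        # Match whole words only, so "count" does not match inside "accounts".
        pattern = r"\b" + keyword.replace(" ", r"\s+") + r"\b"
        text = re.sub(pattern, keyword.upper(), text, flags=re.IGNORECASE)
    return text


raw = "select  count(*)\n from accounts  where opened_at >= '2023-01-01';"
print(normalize_sql(raw))
# SELECT COUNT(*) FROM accounts WHERE opened_at >= '2023-01-01'
```

Normalizing targets this way gives the model one consistent surface form per query, which tends to make sequence-to-sequence training targets less noisy.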

---

## Applications

- Financial-sector chatbots and consultation automation
- Natural-language data lookup and report generation
- SQL learning and practice tools for non-experts