Christopher Glaze committed · Commit cbaea3e · Parent(s): 4bd3fde
Add blogpost link
README.md CHANGED
@@ -17,6 +17,10 @@ tags:
 - data filtering
 ---
 
+## See [our blogpost](https://snorkel.ai/how-we-built-a-better-genai-with-programmatic-data-development/) for a more in-depth discussion of the model.
+
+<br>
+
 # Summary
 Instruction tuning has emerged as an important step in developing performant large language models (LLMs) for generative AI tasks. While industry-backed LLMs such as ChatGPT, Bard, Claude, and even the open-source Llama 2 have relied on massive, expensive proprietary datasets unavailable to the public, the open source community has banded together to create similar datasets such as OpenAssistant and Dolly that are available to everyone. However, high variance in the quality and distribution of responses collected by volunteers has limited the quality of resulting open source models.
 