Peanut Jar Mixers Development

Activity Feed

AI & ML interests

WIP LLM datasets and models.

Recent Activity

xzuyn updated a dataset 14 days ago

PJMixers-Dev/Fundus-AP-News-Formatted-3-AutoSuffixCleaned

xzuyn published a dataset 14 days ago

PJMixers-Dev/Fundus-AP-News-Formatted-3-AutoSuffixCleaned

xzuyn updated a dataset 14 days ago

PJMixers-Dev/Fundus-AP-News-Formatted-AutoSuffixCleaned

View all activity

PJMixers-Dev 's collections 19

Sets for Automated Cleaning

PJMixers-Dev/AP-Fundus-Suffix-Cleaner

Viewer • Updated 18 days ago • 1.83k • 110
PJMixers-Dev/Nelathan_synthetic-sugar-quill-cleaner

Viewer • Updated 19 days ago • 9.62k • 122 • 1

Human vs. AI Writing RL Datasets

PJMixers-Dev/Lit-axo-Shuffled-RL-v0.1

Viewer • Updated May 20 • 976 • 22
PJMixers-Dev/Nelathan_synthetic-sugar-quill-RL-v0.1

Viewer • Updated May 20 • 141 • 23
jondurbin/gutenberg-dpo-v0.1

Viewer • Updated Jan 12, 2024 • 918 • 805 • 147
nbeerbower/gutenberg2-dpo

Viewer • Updated Nov 16, 2024 • 293 • 67 • 20

Fundus News Scrapes

https://github.com/flairNLP/fundus

PJMixers-Dev/Fundus-105K

Viewer • Updated Dec 8, 2024 • 105k • 10
PJMixers-Dev/Fundus-105K-Formatted

Viewer • Updated Dec 8, 2024 • 71.6k • 9
PJMixers-Dev/Fundus-CC-2.5M

Viewer • Updated Dec 30, 2024 • 2.22M • 291
PJMixers-Dev/Fundus-CC-2.5M-Formatted

Viewer • Updated May 15 • 1.4M • 20 • 1

QwQ Re-Genned RP Datasets

QwQ responses on the "1" split of the datasets.

PJMixers-Dev/aesir-rpg-charcards-gpt4-nsfw-sharegpt-flatguard-split-qwq

Updated Mar 29 • 14
PJMixers-Dev/aesir-rpg-charcards-gpt4-sfw-sharegpt-flatguard-split-qwq

Viewer • Updated Mar 29 • 1.36k • 23
PJMixers-Dev/aesir-rpg-fantasy-markdown-flatguard-split-qwq

Viewer • Updated Mar 29 • 277 • 7 • 1
PJMixers-Dev/aesir-rpg-fantasy-novel-flatguard-split-qwq

Viewer • Updated Mar 29 • 767 • 17

Prepped for Re-Gen

Dataset flattened, S4 removed, and split into 4 even splits. Ready to have final turn regenerated with another model.

PJMixers-Dev/aesir-rpg-charcards-gpt4-nsfw-sharegpt-flatguard-split

Updated Mar 28 • 29
PJMixers-Dev/aesir-rpg-charcards-gpt4-sfw-sharegpt-flatguard-split

Viewer • Updated Mar 28 • 6.04k • 32
PJMixers-Dev/aesir-rpg-fantasy-markdown-flatguard-split

Viewer • Updated Mar 28 • 1.11k • 29
PJMixers-Dev/aesir-rpg-fantasy-novel-flatguard-split

Viewer • Updated Mar 28 • 4.19k • 38

[Old-ish] QwQ RP/Co-Writing Datasets

https://github.com/xzuyn/axolotl/blob/came-plus-formatters/src/axolotl/prompt_strategies/customchatml-regex-last-only.py

PJMixers-Dev/allura-org_gryphe-sonnet-3.5-charcards-names-added-qwq-all-aphrodite

Viewer • Updated Mar 12 • 11.9k • 57
PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite

Viewer • Updated Mar 12 • 5.76k • 40
PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite

Viewer • Updated Mar 12 • 6.86k • 28
PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite

Viewer • Updated Mar 8 • 1.73k • 41

LLaMa-Guard-3 Classified Datasets

null classification means it was skipped due to the turns not correctly converting within apply_chat_template

PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite-classified

Viewer • Updated Mar 27 • 5.76k • 3
PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite-classified

Viewer • Updated Mar 27 • 200 • 5
PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite-classified

Viewer • Updated Mar 27 • 1.73k • 4
PJMixers-Dev/lemonilia_LimaRP-Only-NonSus-Simple-CustomShareGPT-qwq-all-aphrodite-classified

Viewer • Updated Mar 27 • 2.11k • 4

gemini-2.0-flash-thinking-exp-1219 Datasets

Existing datasets with responses regenerated using gemini-2.0-flash-thinking-exp-1219. Currently only single-turn.

PJMixers-Dev/Weyaxi_HelpSteer-filtered-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 1.36k • 10 • 3
PJMixers-Dev/lmsys_lmsys-chat-1m-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 1.09k • 12
PJMixers-Dev/allenai_WildChat-1M-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 16.9k • 9 • 4
PJMixers-Dev/WizardLMTeam_WizardLM_evol_instruct_70k-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 4.41k • 16

gemini-exp-1206 Datasets

Existing datasets with responses regenerated using gemini-exp-1206. Currently only single-turn.

PJMixers-Dev/allenai_WildChat-1M-gemini-exp-1206-ShareGPT

Viewer • Updated Jan 13 • 205 • 16
PJMixers-Dev/grimulkan_theory-of-mind-gemini-exp-1206-ShareGPT

Viewer • Updated Jan 13 • 463 • 8
PJMixers-Dev/camel-ai_biology-gemini-exp-1206-ShareGPT

Viewer • Updated Jan 14 • 87 • 9

Fake Distill Datasets

PJMixers-Dev/Tiefighter-13B-Fake-Distill-ShareGPT

Viewer • Updated Jan 10 • 4.85k • 45 • 1

Earthen Model Series

PJMixers-Dev/Gemma-3-Earthen-Completion-v0.1-4B

Text Generation • 4B • Updated May 21 • 7
PJMixers-Dev/Gemma-3-Earthen-v0.1-4B

Text Generation • 4B • Updated May 22 • 9
PJMixers-Dev/Gemma-3-Earthen-v0.2-4B

Text Generation • 4B • Updated May 22 • 10
PJMixers-Dev/Granite-3.1-Earthen-v0.3-3B-A800M

Text Generation • 3B • Updated May 24 • 15

Salesforce Writing Quality Reward Models Datasets

⁣https://github.com/salesforce/creativity_eval/tree/main/WritingRewards/training/Llama/data

AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation

Paper • 2504.07532 • Published Apr 10
PJMixers-Dev/Salesforce_LAMP-R

Viewer • Updated May 17 • 2.13k • 14 • 1
PJMixers-Dev/Salesforce_LAMP-P

Viewer • Updated May 17 • 1.79k • 14
PJMixers-Dev/Salesforce_LAMP-P-exp

Viewer • Updated May 17 • 1.79k • 13

V3-0324 Re-Genned RP Datasets

V3-0324 responses on the "0" split of the datasets.

PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split-v3-0324

Viewer • Updated Mar 28 • 2.58k • 16 • 1

R1 Rag Datasets

PJMixers-Dev/Subtitles-rag-questions-r1

Viewer • Updated Mar 26 • 622 • 32
PJMixers-Dev/Subtitles-rag-answers-r1

Viewer • Updated Mar 28 • 6.97k • 32

QwQ Rag Datasets

https://github.com/xzuyn/axolotl/blob/came-plus-formatters/src/axolotl/prompt_strategies/customchatml-regex-last-only.py

PJMixers-Dev/AP-News-2024-rag-questions-qwq-all-kcpp

Viewer • Updated Mar 21 • 167 • 10
PJMixers-Dev/AP-News-2024-rag-answers-qwq-all-kcpp

Viewer • Updated Mar 21 • 290 • 9
PJMixers-Dev/Subtitles-rag-questions-qwq-all-kcpp

Viewer • Updated Mar 19 • 133 • 50

Length Filtered Thinking Datasets

Filtered to remove thinking or responses which are too long compared to the average distribution. Also tried to clean some stuff.

PJMixers-Dev/nvidia-r1-code-1k-think-256-response-filtered-ShareGPT

Viewer • Updated Mar 20 • 136k • 34
PJMixers-Dev/OpenThoughts-114k-Code_decontaminated-4k-think-2k-response-filtered-ShareGPT

Viewer • Updated Mar 20 • 6.08k • 39
PJMixers-Dev/dolphin-deepseek-1k-think-1k-response-filtered-ShareGPT

Viewer • Updated Mar 22 • 86.1k • 130
PJMixers-Dev/KodCode_KodCode-V1-SFT-R1-4k-think-1k-response-ShareGPT

Viewer • Updated Mar 23 • 174k • 82 • 1

gemini-2.0-flash-thinking-exp-01-21 Datasets

Existing datasets with responses regenerated using gemini-2.0-flash-thinking-exp-01-21. Currently only single-turn.

PJMixers-Dev/m-a-p_CodeFeedback-Filtered-Instruction-gemini-2.0-flash-thinking-exp-01-21-CustomShareGPT

Viewer • Updated Jan 28 • 1.94k • 11 • 2
PJMixers-Dev/cognitivecomputations_dolphin-r1-reasoning-flash-CustomShareGPT

Viewer • Updated Jan 30 • 250k • 53 • 1

gemini-2.0-flash-exp Datasets

Existing datasets with responses regenerated using gemini-2.0-flash-exp. Currently only single-turn.

PJMixers-Dev/allenai_WildChat-1M-gemini-2.0-flash-exp-ShareGPT

Viewer • Updated Jan 13 • 485 • 32
PJMixers-Dev/grimulkan_theory-of-mind-gemini-2.0-flash-exp-ShareGPT

Viewer • Updated Jan 13 • 539 • 6
PJMixers-Dev/grimulkan_physical-reasoning-gemini-2.0-flash-exp-ShareGPT

Viewer • Updated Jan 13 • 898 • 16
PJMixers-Dev/WizardLMTeam_WizardLM_evol_instruct_70k-gemini-2.0-flash-exp-ShareGPT

Viewer • Updated Jan 13 • 136 • 14

Thinking/Reasoning Datasets

PJMixers-Dev/Weyaxi_HelpSteer-filtered-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 1.36k • 10 • 3
PJMixers-Dev/lmsys_lmsys-chat-1m-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 1.09k • 12
PJMixers-Dev/allenai_WildChat-1M-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 16.9k • 9 • 4
PJMixers-Dev/WizardLMTeam_WizardLM_evol_instruct_70k-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 4.41k • 16

Sets for Automated Cleaning

PJMixers-Dev/AP-Fundus-Suffix-Cleaner

Viewer • Updated 18 days ago • 1.83k • 110
PJMixers-Dev/Nelathan_synthetic-sugar-quill-cleaner

Viewer • Updated 19 days ago • 9.62k • 122 • 1

Earthen Model Series

PJMixers-Dev/Gemma-3-Earthen-Completion-v0.1-4B

Text Generation • 4B • Updated May 21 • 7
PJMixers-Dev/Gemma-3-Earthen-v0.1-4B

Text Generation • 4B • Updated May 22 • 9
PJMixers-Dev/Gemma-3-Earthen-v0.2-4B

Text Generation • 4B • Updated May 22 • 10
PJMixers-Dev/Granite-3.1-Earthen-v0.3-3B-A800M

Text Generation • 3B • Updated May 24 • 15

Human vs. AI Writing RL Datasets

PJMixers-Dev/Lit-axo-Shuffled-RL-v0.1

Viewer • Updated May 20 • 976 • 22
PJMixers-Dev/Nelathan_synthetic-sugar-quill-RL-v0.1

Viewer • Updated May 20 • 141 • 23
jondurbin/gutenberg-dpo-v0.1

Viewer • Updated Jan 12, 2024 • 918 • 805 • 147
nbeerbower/gutenberg2-dpo

Viewer • Updated Nov 16, 2024 • 293 • 67 • 20

Salesforce Writing Quality Reward Models Datasets

⁣https://github.com/salesforce/creativity_eval/tree/main/WritingRewards/training/Llama/data

AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation

Paper • 2504.07532 • Published Apr 10
PJMixers-Dev/Salesforce_LAMP-R

Viewer • Updated May 17 • 2.13k • 14 • 1
PJMixers-Dev/Salesforce_LAMP-P

Viewer • Updated May 17 • 1.79k • 14
PJMixers-Dev/Salesforce_LAMP-P-exp

Viewer • Updated May 17 • 1.79k • 13

Fundus News Scrapes

https://github.com/flairNLP/fundus

PJMixers-Dev/Fundus-105K

Viewer • Updated Dec 8, 2024 • 105k • 10
PJMixers-Dev/Fundus-105K-Formatted

Viewer • Updated Dec 8, 2024 • 71.6k • 9
PJMixers-Dev/Fundus-CC-2.5M

Viewer • Updated Dec 30, 2024 • 2.22M • 291
PJMixers-Dev/Fundus-CC-2.5M-Formatted

Viewer • Updated May 15 • 1.4M • 20 • 1

V3-0324 Re-Genned RP Datasets

V3-0324 responses on the "0" split of the datasets.

PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split-v3-0324

Viewer • Updated Mar 28 • 2.58k • 16 • 1

QwQ Re-Genned RP Datasets

QwQ responses on the "1" split of the datasets.

PJMixers-Dev/aesir-rpg-charcards-gpt4-nsfw-sharegpt-flatguard-split-qwq

Updated Mar 29 • 14
PJMixers-Dev/aesir-rpg-charcards-gpt4-sfw-sharegpt-flatguard-split-qwq

Viewer • Updated Mar 29 • 1.36k • 23
PJMixers-Dev/aesir-rpg-fantasy-markdown-flatguard-split-qwq

Viewer • Updated Mar 29 • 277 • 7 • 1
PJMixers-Dev/aesir-rpg-fantasy-novel-flatguard-split-qwq

Viewer • Updated Mar 29 • 767 • 17

R1 Rag Datasets

PJMixers-Dev/Subtitles-rag-questions-r1

Viewer • Updated Mar 26 • 622 • 32
PJMixers-Dev/Subtitles-rag-answers-r1

Viewer • Updated Mar 28 • 6.97k • 32

Prepped for Re-Gen

Dataset flattened, S4 removed, and split into 4 even splits. Ready to have final turn regenerated with another model.

PJMixers-Dev/aesir-rpg-charcards-gpt4-nsfw-sharegpt-flatguard-split

Updated Mar 28 • 29
PJMixers-Dev/aesir-rpg-charcards-gpt4-sfw-sharegpt-flatguard-split

Viewer • Updated Mar 28 • 6.04k • 32
PJMixers-Dev/aesir-rpg-fantasy-markdown-flatguard-split

Viewer • Updated Mar 28 • 1.11k • 29
PJMixers-Dev/aesir-rpg-fantasy-novel-flatguard-split

Viewer • Updated Mar 28 • 4.19k • 38

QwQ Rag Datasets

https://github.com/xzuyn/axolotl/blob/came-plus-formatters/src/axolotl/prompt_strategies/customchatml-regex-last-only.py

PJMixers-Dev/AP-News-2024-rag-questions-qwq-all-kcpp

Viewer • Updated Mar 21 • 167 • 10
PJMixers-Dev/AP-News-2024-rag-answers-qwq-all-kcpp

Viewer • Updated Mar 21 • 290 • 9
PJMixers-Dev/Subtitles-rag-questions-qwq-all-kcpp

Viewer • Updated Mar 19 • 133 • 50

[Old-ish] QwQ RP/Co-Writing Datasets

https://github.com/xzuyn/axolotl/blob/came-plus-formatters/src/axolotl/prompt_strategies/customchatml-regex-last-only.py

PJMixers-Dev/allura-org_gryphe-sonnet-3.5-charcards-names-added-qwq-all-aphrodite

Viewer • Updated Mar 12 • 11.9k • 57
PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite

Viewer • Updated Mar 12 • 5.76k • 40
PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite

Viewer • Updated Mar 12 • 6.86k • 28
PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite

Viewer • Updated Mar 8 • 1.73k • 41

Length Filtered Thinking Datasets

Filtered to remove thinking or responses which are too long compared to the average distribution. Also tried to clean some stuff.

PJMixers-Dev/nvidia-r1-code-1k-think-256-response-filtered-ShareGPT

Viewer • Updated Mar 20 • 136k • 34
PJMixers-Dev/OpenThoughts-114k-Code_decontaminated-4k-think-2k-response-filtered-ShareGPT

Viewer • Updated Mar 20 • 6.08k • 39
PJMixers-Dev/dolphin-deepseek-1k-think-1k-response-filtered-ShareGPT

Viewer • Updated Mar 22 • 86.1k • 130
PJMixers-Dev/KodCode_KodCode-V1-SFT-R1-4k-think-1k-response-ShareGPT

Viewer • Updated Mar 23 • 174k • 82 • 1

LLaMa-Guard-3 Classified Datasets

null classification means it was skipped due to the turns not correctly converting within apply_chat_template

PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite-classified

Viewer • Updated Mar 27 • 5.76k • 3
PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite-classified

Viewer • Updated Mar 27 • 200 • 5
PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite-classified

Viewer • Updated Mar 27 • 1.73k • 4
PJMixers-Dev/lemonilia_LimaRP-Only-NonSus-Simple-CustomShareGPT-qwq-all-aphrodite-classified

Viewer • Updated Mar 27 • 2.11k • 4

gemini-2.0-flash-thinking-exp-01-21 Datasets

Existing datasets with responses regenerated using gemini-2.0-flash-thinking-exp-01-21. Currently only single-turn.

PJMixers-Dev/m-a-p_CodeFeedback-Filtered-Instruction-gemini-2.0-flash-thinking-exp-01-21-CustomShareGPT

Viewer • Updated Jan 28 • 1.94k • 11 • 2
PJMixers-Dev/cognitivecomputations_dolphin-r1-reasoning-flash-CustomShareGPT

Viewer • Updated Jan 30 • 250k • 53 • 1

gemini-2.0-flash-thinking-exp-1219 Datasets

Existing datasets with responses regenerated using gemini-2.0-flash-thinking-exp-1219. Currently only single-turn.

PJMixers-Dev/Weyaxi_HelpSteer-filtered-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 1.36k • 10 • 3
PJMixers-Dev/lmsys_lmsys-chat-1m-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 1.09k • 12
PJMixers-Dev/allenai_WildChat-1M-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 16.9k • 9 • 4
PJMixers-Dev/WizardLMTeam_WizardLM_evol_instruct_70k-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 4.41k • 16

gemini-2.0-flash-exp Datasets

Existing datasets with responses regenerated using gemini-2.0-flash-exp. Currently only single-turn.

PJMixers-Dev/allenai_WildChat-1M-gemini-2.0-flash-exp-ShareGPT

Viewer • Updated Jan 13 • 485 • 32
PJMixers-Dev/grimulkan_theory-of-mind-gemini-2.0-flash-exp-ShareGPT

Viewer • Updated Jan 13 • 539 • 6
PJMixers-Dev/grimulkan_physical-reasoning-gemini-2.0-flash-exp-ShareGPT

Viewer • Updated Jan 13 • 898 • 16
PJMixers-Dev/WizardLMTeam_WizardLM_evol_instruct_70k-gemini-2.0-flash-exp-ShareGPT

Viewer • Updated Jan 13 • 136 • 14

gemini-exp-1206 Datasets

Existing datasets with responses regenerated using gemini-exp-1206. Currently only single-turn.

PJMixers-Dev/allenai_WildChat-1M-gemini-exp-1206-ShareGPT

Viewer • Updated Jan 13 • 205 • 16
PJMixers-Dev/grimulkan_theory-of-mind-gemini-exp-1206-ShareGPT

Viewer • Updated Jan 13 • 463 • 8
PJMixers-Dev/camel-ai_biology-gemini-exp-1206-ShareGPT

Viewer • Updated Jan 14 • 87 • 9

Thinking/Reasoning Datasets

PJMixers-Dev/Weyaxi_HelpSteer-filtered-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 1.36k • 10 • 3
PJMixers-Dev/lmsys_lmsys-chat-1m-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 1.09k • 12
PJMixers-Dev/allenai_WildChat-1M-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 16.9k • 9 • 4
PJMixers-Dev/WizardLMTeam_WizardLM_evol_instruct_70k-gemini-2.0-flash-thinking-exp-1219-CustomShareGPT

Viewer • Updated Jan 13 • 4.41k • 16

Fake Distill Datasets

PJMixers-Dev/Tiefighter-13B-Fake-Distill-ShareGPT

Viewer • Updated Jan 10 • 4.85k • 45 • 1

AI & ML interests

Recent Activity

Team members 1

PJMixers-Dev 's collections 19