Bend Ed

melekuk

AI & ML interests

None yet

Recent Activity

new activity about 4 hours ago

Skywork/Skywork-OR1-32B-Preview:Thinking in LM Studio

new activity about 4 hours ago

ds4sd/SmolDocling-256M-preview:Relicensing issues

new activity about 5 hours ago

AlexBefest/CardProjector-27B-v4:Smaller version?

Organizations

None yet

melekuk's activity

New activity in Skywork/Skywork-OR1-32B-Preview about 4 hours ago

Thinking in LM Studio

#1 opened 2 days ago by

urtuuuu

New activity in ds4sd/SmolDocling-256M-preview about 4 hours ago

Relicensing issues

#37 opened 10 days ago by

JLouisBiz

New activity in AlexBefest/CardProjector-27B-v4 about 5 hours ago

Smaller version?

#1 opened about 5 hours ago by

melekuk

liked a model 2 days ago

AlexBefest/CardProjector-27B-v4

Text Generation • Updated 4 days ago • 44 • 11

liked a model 5 days ago

abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1

Text Generation • Updated about 19 hours ago • 815 • 11

New activity in unsloth/Llama-4-Scout-17B-16E-Instruct 10 days ago

DOA

#1 opened 10 days ago by

MrDevolver

liked a model 20 days ago

AlexBefest/CardProjector-7B-v3

Updated 20 days ago • 15 • 5

New activity in arcee-ai/mergekit-gui 23 days ago

Update

#47 opened about 1 month ago by

Kirkito

liked 2 models 24 days ago

tianweiy/DMD2

Text-to-Image • Updated Jun 11, 2024 • 37.6k • 149

h1t/TCD-SDXL-LoRA

Text-to-Image • Updated Apr 16, 2024 • 1.5k • 109

liked a model 25 days ago

knoveleng/Open-RS3

Text Generation • Updated 23 days ago • 4.43k • 19

reacted to Jaward's post with 👀 26 days ago

Post

1760

Finally, the ground truth / AlexNet’s original source code is available to all.
Context: AlexNet had a historic win in the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), reducing the top-5 error rate from about 26% (previous best) to 15.3%. It’s a deep CNN with 8 layers (5 convolutional + 3 fully connected) that popularized ReLU activations for faster training, dropout for regularization, and GPU acceleration for large-scale learning. This moment marked the beginning of the deep learning revolution, inspiring architectures like VGG, ResNet, and modern transformers.
Code: https://github.com/computerhistory/AlexNet-Source-Code
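
To make the "8 layers (5 convolutional + 3 fully connected)" description concrete, here is a minimal pure-Python sketch that traces tensor shapes through an AlexNet-style stack. The kernel sizes, strides, padding, and channel counts are not stated in the post; they are assumptions taken from the commonly cited single-tower description of the 2012 architecture, and the 227x227 input is assumed because that size makes the stride arithmetic work out. ReLU, dropout, local response normalization, and the original two-GPU split are left out since they do not change shapes. The `Layer` and `trace_shapes` names are hypothetical and are not taken from the linked repository.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    kind: str            # "conv", "pool", or "fc"
    out_channels: int = 0
    kernel: int = 0
    stride: int = 1
    padding: int = 0


def out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed hyperparameters (illustrative, not from the post).
ALEXNET = [
    Layer("conv1", "conv", 96, 11, 4, 0),
    Layer("pool1", "pool", 0, 3, 2),
    Layer("conv2", "conv", 256, 5, 1, 2),
    Layer("pool2", "pool", 0, 3, 2),
    Layer("conv3", "conv", 384, 3, 1, 1),
    Layer("conv4", "conv", 384, 3, 1, 1),
    Layer("conv5", "conv", 256, 3, 1, 1),
    Layer("pool5", "pool", 0, 3, 2),
    Layer("fc6", "fc", 4096),
    Layer("fc7", "fc", 4096),
    Layer("fc8", "fc", 1000),
]


def trace_shapes(layers, size=227, channels=3):
    """Return (layer name, output shape) pairs for one input image."""
    shapes = []
    for layer in layers:
        if layer.kind == "conv":
            size = out_size(size, layer.kernel, layer.stride, layer.padding)
            channels = layer.out_channels
            shapes.append((layer.name, (channels, size, size)))
        elif layer.kind == "pool":
            # Pooling keeps the channel count and shrinks the spatial size.
            size = out_size(size, layer.kernel, layer.stride, layer.padding)
            shapes.append((layer.name, (channels, size, size)))
        else:
            # Fully connected layers output a flat feature vector.
            shapes.append((layer.name, (layer.out_channels,)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes(ALEXNET):
        print(f"{name:6s} -> {shape}")
```

Counting only the `conv` and `fc` entries gives the 5 + 3 weighted layers mentioned in the post; the pooling layers have no learned parameters.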

reacted to v2ray's post with 👍 26 days ago

Post

2212

GPT4chan Series Release

GPT4chan is a series of models I trained on the v2ray/4chan dataset, which is based on lesserfield/4chan-datasets. The dataset contains mostly posts from 2023. Not every board is included; for example, /pol/ is NOT included. To see which boards are included, visit v2ray/4chan.

This release contains two model sizes, 8B and 24B. The 8B model is based on meta-llama/Llama-3.1-8B and the 24B model is based on mistralai/Mistral-Small-24B-Base-2501.

Why did I make these models? Because for a long time after the original gpt-4chan model, there haven't been any serious fine-tunes on 4chan datasets. 4chan is a good data source since it contains coherent replies and nice topics. It's fun to talk to an AI-generated version of 4chan and get instant replies without needing to actually visit 4chan. You can also sort of analyze the content and behavior of 4chan posts by probing the model's outputs.

Disclaimer: The GPT4chan models should only be used for research purposes. The outputs they generate do not represent my views on the subjects. Moderate the responses before posting them online.

Model links:

Full model:
- v2ray/GPT4chan-8B
- v2ray/GPT4chan-24B

Adapter:
- v2ray/GPT4chan-8B-QLoRA
- v2ray/GPT4chan-24B-QLoRA

AWQ:
- v2ray/GPT4chan-8B-AWQ
- v2ray/GPT4chan-24B-AWQ

FP8:
- v2ray/GPT4chan-8B-FP8