Şuayp Talha Kocabay
suayptalha
AI & ML interests
NLP, LLMs, Transformers, Merging, RNNs, CNNs, ANNs, Computer Vision and ML algorithms
Recent Activity
updated
a model
1 day ago
suayptalha/Falcon3-Jessi-v0.4-7B-Slerp
liked
a model
2 days ago
openai-community/gpt2-large
replied to
sometimesanotion's
post
8 days ago
I've managed a #1 score of 41.22% average for 14B parameter models on the Open LLM Leaderboard. As of this writing, sometimesanotion/Lamarck-14B-v0.7 is #8 for all models up to 70B parameters.
It took a custom toolchain around Arcee AI's mergekit to manage the complex merges, gradients, and LoRAs required to make this happen. I really like seeing features of many quality finetunes in one solid generalist model.
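For context, mergekit merges are driven by a declarative YAML recipe. Below is a minimal sketch of launching a SLERP merge from Python; the model IDs, layer count, and interpolation weight are placeholders, not the actual Lamarck recipe, which reportedly involved far more complex gradients and LoRAs.

```python
# Minimal sketch: a SLERP merge driven through Arcee AI's mergekit CLI.
# The config schema follows mergekit's documented YAML format; the model
# IDs and layer_range below are placeholders, not the recipe from the post.
import subprocess
from pathlib import Path

config = """\
slices:
  - sources:
      - model: base-org/model-a-14b      # placeholder model IDs
        layer_range: [0, 48]
      - model: other-org/model-b-14b
        layer_range: [0, 48]
merge_method: slerp
base_model: base-org/model-a-14b
parameters:
  t: 0.5        # interpolation weight between the two models
dtype: bfloat16
"""

Path("merge-config.yml").write_text(config)
# mergekit-yaml <config> <output_dir>; add --cuda to merge on GPU
subprocess.run(["mergekit-yaml", "merge-config.yml", "./merged-model"], check=True)
```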
Organizations
suayptalha's activity
replied to
sometimesanotion's
post
8 days ago
replied to
their
post
8 days ago
Thanks for your support!
reacted to
sometimesanotion's
post with 🔥
8 days ago
posted
an
update
9 days ago
My latest Falcon3-7B merge model, suayptalha/Falcon3-Jessi-v0.4-7B-Slerp, is currently ranked #1 on the open-llm-leaderboard/open_llm_leaderboard among all models with up to 14B parameters.
My Qwen2.5-7B merge model, suayptalha/HomerCreativeAnvita-Mix-Qw7B, is also ranked #7, placing two of my models in the top 10!
posted
an
update
about 1 month ago
Introducing the First Hugging Face Integration of minGRU Models, from the paper "Were RNNs All We Needed?"
🔥 I have integrated next-generation RNNs, specifically minGRU, which offer faster performance compared to Transformer architectures, into Hugging Face. This allows users to leverage the lighter and more efficient minGRU models with the "transformers" library for both usage and training.
💻 I integrated two main tasks: MinGRUForSequenceClassification and MinGRUForCausalLM.
MinGRUForSequenceClassification:
You can use this class for Sequence Classification tasks. I also trained a Sentiment Analysis model with the stanfordnlp/imdb dataset.
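A minimal usage sketch, assuming the checkpoint exposes the custom minGRU class through the transformers Auto API via trust_remote_code; the repo id below is a placeholder (see the Links section for the actual collection):

```python
# Hedged sketch: sentiment inference with the minGRU classification head.
# trust_remote_code pulls in the custom minGRU modeling code from the repo.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo = "suayptalha/minGRU-sentiment-example"  # hypothetical checkpoint id
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(repo, trust_remote_code=True)

inputs = tokenizer("This movie was surprisingly good!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])  # e.g. POSITIVE / NEGATIVE
```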
MinGRUForCausalLM:
You can use this class for Causal Language Model tasks, as in models such as GPT and Llama. I also trained an example model with the roneneldan/TinyStories dataset. You can fine-tune and use it!
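Likewise, a hedged sketch of generation with the causal LM class, assuming it supports the standard generate() API; again, the repo id is a placeholder:

```python
# Hedged sketch: text generation with the minGRU causal LM via generate().
from transformers import AutoTokenizer, AutoModelForCausalLM

repo = "suayptalha/minGRU-tinystories-example"  # hypothetical checkpoint id
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

inputs = tokenizer("Once upon a time", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```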
Links:
Models: suayptalha/mingru-676fe8d90760d01b7955d7ab
GitHub: https://github.com/suayptalha/minGRU-hf
LinkedIn Post: https://www.linkedin.com/posts/suayp-talha-kocabay_mingru-a-suayptalha-collection-activity-7278755484172439552-wNY1
Credits:
Paper Link: https://arxiv.org/abs/2410.01201
I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio, and Hossein Hajimirsadeghi for their paper.
reacted to
AdinaY's
post with 🔥
about 1 month ago
The Chinese community is shipping 🚢
DeepSeek V3 (685B MoE) has quietly been released on the Hub!
Base: deepseek-ai/DeepSeek-V3-Base
Instruct: deepseek-ai/DeepSeek-V3
Can't wait to see what's next!
replied to
their
post
about 1 month ago
Thank you for your reply, but I never intended to mislead anyone. This post was not entirely created with AI. We spent hours working on the models and dataset preparation and are proud of our efforts. I expected more constructive criticism.
posted
an
update
about 1 month ago
Introducing Substitution Cipher Solvers!
As @suayptalha and @Synd209, we are thrilled to share an update!
This project contains a text-to-text model designed to decrypt English and Turkish text encoded using a substitution cipher. In a substitution cipher, each letter in the plaintext is replaced by a corresponding, unique letter to form the ciphertext. The model leverages statistical and linguistic properties of the target language to make educated guesses about the letter substitutions, aiming to recover the original plaintext message.
These models were fine-tuned on T5-base. They handle monoalphabetic English and Turkish substitution ciphers, and they output both the decoded text and the recovered alphabet with an accuracy not achieved before!
Example:
Encoded text: Z hztwgx tstcsf qf z ulooqfe osfuqb tzx uezx awej z ozewsbe vlfwby fsmqisfx.
Decoded text: A family member or a support person may stay with a patient during recovery.
Model Collection Link: Cipher-AI/substitution-cipher-solvers-6731ebd22f0f0d8e0e2e2e00
Organization Link: https://huggingface.co/Cipher-AI
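As an illustration of the setup, here is a hedged sketch that builds a random monoalphabetic key, enciphers a sentence, and asks a T5-based solver to invert it. The repo id is a guess at the collection's naming, not a confirmed path; check the collection link above for the real checkpoints.

```python
# Hedged sketch: apply a monoalphabetic substitution cipher, then decode it
# with a T5-based solver. Only the cipher logic here is guaranteed correct;
# the model repo id is hypothetical.
import random
import string
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Build a random substitution key: each plaintext letter maps to a unique letter.
letters = list(string.ascii_lowercase)
shuffled = letters[:]
random.shuffle(shuffled)
key = str.maketrans(dict(zip(letters, shuffled)))

plaintext = "a family member or a support person may stay with a patient"
ciphertext = plaintext.translate(key)

repo = "Cipher-AI/substitution-cipher-solver-en"  # hypothetical repo id
tokenizer = T5Tokenizer.from_pretrained(repo)
model = T5ForConditionalGeneration.from_pretrained(repo)

ids = tokenizer(ciphertext, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))  # ideally the plaintext
```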
posted
an
update
about 1 month ago
FastLlama Series is Live!
🦾 Experience faster, lighter, and smarter language models! The new FastLlama makes Meta's LLaMA models work with smaller file sizes, lower system requirements, and higher performance. The model supports 8 languages, including English, German, and Spanish.
Built on the LLaMA 3.2-1B-Instruct model, fine-tuned with Hugging Face's SmolTalk and MetaMathQA-50k datasets, and powered by LoRA (Low-Rank Adaptation) for improved mathematical reasoning.
💻 Its compact size makes it versatile for a wide range of applications!
💬 Chat with the model:
Chat Link: suayptalha/Chat-with-FastLlama
Model Link: suayptalha/FastLlama-3.2-1B-Instruct
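A minimal sketch of chatting with the model through the transformers pipeline API, assuming the repo ships a standard chat template (the model id is taken from the post above):

```python
# Hedged sketch: a one-turn chat with FastLlama via the text-generation
# pipeline, which applies the repo's chat template to the messages list.
from transformers import pipeline

chat = pipeline("text-generation", model="suayptalha/FastLlama-3.2-1B-Instruct")

messages = [
    {"role": "user", "content": "What is 12 * 17? Explain briefly."},
]
result = chat(messages, max_new_tokens=128)
# The pipeline returns the full conversation; the last message is the reply.
print(result[0]["generated_text"][-1]["content"])
```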