🚀 Introducing the First Hugging Face Integration of minGRU Models from the paper "Were RNNs All We Needed?"
🔥 I have integrated next-generation RNNs, specifically minGRU, which offer faster performance than Transformer architectures, into Hugging Face. This lets users leverage the lighter, more efficient minGRU models with the "transformers" library for both inference and training.
💻 I integrated two main task classes: MinGRUForSequenceClassification and MinGRUForCausalLM.
MinGRUForSequenceClassification: You can use this class for sequence classification tasks. I also trained a sentiment analysis model on the stanfordnlp/imdb dataset.
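A minimal inference sketch (the checkpoint id below is a placeholder, not the published repo name, and loading assumes the MinGRU classes are reachable through the Auto API, e.g. via trust_remote_code):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo id -- substitute the actual minGRU sentiment checkpoint.
checkpoint = "your-username/minGRU-imdb-sentiment"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, trust_remote_code=True  # needed if the classes live in the repo
)

inputs = tokenizer("This movie was surprisingly good!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(dim=-1).item()])
```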
MinGRUForCausalLM: You can use this class for causal language modeling, the same task family as GPT and Llama. I also trained an example model on the roneneldan/TinyStories dataset. You can fine-tune and use it!
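A similar sketch for generation (again, the checkpoint id is a placeholder for the TinyStories example model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- substitute the actual minGRU TinyStories checkpoint.
checkpoint = "your-username/minGRU-tinystories"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

inputs = tokenizer("Once upon a time", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```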
🔐 This project contains a text-to-text model designed to decrypt English and Turkish text encoded with a substitution cipher. In a substitution cipher, each letter of the plaintext is replaced by a unique corresponding letter to form the ciphertext. The model leverages statistical and linguistic properties of the language to make educated guesses about the letter substitutions, aiming to recover the original plaintext message.
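As a toy illustration of the cipher itself (this encoder is a sketch for clarity, not part of the project's code):

```python
import random
import string

def make_substitution_key(seed: int = 42) -> dict[str, str]:
    """Build a random one-to-one mapping over the lowercase English alphabet."""
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))

def encode(plaintext: str, key: dict[str, str]) -> str:
    """Replace each letter via the key; spaces and punctuation pass through."""
    return "".join(key.get(ch.lower(), ch) for ch in plaintext)

key = make_substitution_key()
print(encode("a family member may stay with a patient", key))
```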
These models were fine-tuned from T5-base. They handle monoalphabetic English and Turkish substitution ciphers, and they output both the decoded text and the recovered alphabet with an accuracy that has never been achieved before!
Example:
Encoded text: Z hztwgx tstcsf qf z ulooqfe osfuqb tzx uezx awej z ozewsbe vlfwby fsmqisfx.
Decoded text: A family member or a support person may stay with a patient during recovery.
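Running the decoder follows the standard T5 text-to-text pattern. A sketch, assuming a placeholder checkpoint id and no special task prefix (the released model may expect one):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Placeholder repo id -- substitute the released cipher-decryption checkpoint.
checkpoint = "your-username/t5-base-substitution-decipher"

tokenizer = T5Tokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

encoded = "Z hztwgx tstcsf qf z ulooqfe osfuqb tzx uezx awej z ozewsbe vlfwby fsmqisfx."
inputs = tokenizer(encoded, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```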
🦾 Experience faster, lighter, and smarter language models! The new FastLlama makes Meta's LLaMA models work with smaller file sizes, lower system requirements, and higher performance. The model supports 8 languages, including English, German, and Spanish.
🤖 Built on the LLaMA 3.2-1B-Instruct model, fine-tuned with Hugging Face's SmolTalk and MetaMathQA-50k datasets, and powered by LoRA (Low-Rank Adaptation) for groundbreaking mathematical reasoning.
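A rough sketch of what such a LoRA setup looks like with the peft library; the rank, alpha, and target modules below are illustrative assumptions, not the exact training recipe:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Base model named in the post; LoRA hyperparameters here are assumptions.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

lora_config = LoraConfig(
    r=16,                                 # low-rank dimension
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```

Because only the small adapter matrices are trained, fine-tuning fits in far less memory than full-parameter training, which is what enables the smaller file sizes and lower system requirements.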