SANA: Ultra HD Fast Text-to-Image Model from NVIDIA - Step-by-Step Tutorial on Windows, Cloud & Kaggle - Generate 2048x2048 Images
Below is the YouTube link for a step-by-step tutorial and a 1-click installer with a very advanced Gradio app for using the newest text-to-image SANA model locally on your Windows PC and on cloud services such as Massed Compute, RunPod, and free Kaggle.
The tutorial covers the newest SANA 2K model, and I expect a SANA 4K model to be published as well. The SANA 2K model outputs around 4 megapixels (e.g., 2048x2048), so it handles a wide range of aspect ratios and resolutions very well.
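For reference, here is a minimal sketch of generating a 2048x2048 image with the SanaPipeline integration in diffusers; the checkpoint id and sampling settings are assumptions for illustration, not taken from the tutorial or the installer above.

```python
# Sketch only: the 2K checkpoint id below is an assumption; check the model hub
# for the exact SANA repository you want to use.
import torch
from diffusers import SanaPipeline

pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_2Kpx_BF16_diffusers",  # assumed 2K checkpoint
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = pipe(
    prompt="a cozy cabin in the mountains at sunset, ultra detailed",
    height=2048,
    width=2048,
    guidance_scale=5.0,       # illustrative values
    num_inference_steps=20,
).images[0]
image.save("sana_2048.png")
```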
Hi HuggingFacers, I decided to ship early this year, and here's what I came up with:
PdfItDown (https://github.com/AstraBert/PdfItDown) - If you're like me and your whole RAG pipeline is optimized for PDFs but not for other data formats, here is your solution! With PdfItDown you can convert Word documents, presentations, HTML pages, Markdown sheets and (why not?) CSVs and XMLs to PDF, for seamless integration with your RAG pipelines. Built upon MarkItDown by Microsoft.
GitHub repo: https://github.com/AstraBert/PdfItDown
PyPI package: https://pypi.org/project/pdfitdown/
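PdfItDown's own API may differ, but here is a rough sketch of the underlying idea: MarkItDown extracts Markdown from the source file, and a renderer (WeasyPrint is assumed here purely for illustration) turns it into a PDF.

```python
# Sketch of the idea behind PdfItDown, not its actual API: any document ->
# Markdown via Microsoft's MarkItDown, then Markdown -> HTML -> PDF.
import markdown                      # Markdown -> HTML
from markitdown import MarkItDown    # .docx/.pptx/.csv/... -> Markdown
from weasyprint import HTML          # HTML -> PDF (assumed renderer)

def file_to_pdf(src_path: str, pdf_path: str) -> None:
    md_text = MarkItDown().convert(src_path).text_content
    html = markdown.markdown(md_text, extensions=["tables"])
    HTML(string=html).write_pdf(pdf_path)

file_to_pdf("report.docx", "report.pdf")
```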
SenTrEv v1.0.0 (https://github.com/AstraBert/SenTrEv/tree/v1.0.0) - If you need to evaluate the retrieval performance of your text embedding models, I have good news for you! The new release of SenTrEv now supports dense and sparse retrieval (thanks to FastEmbed by Qdrant) with text-based file formats (.docx, .pptx, .csv, .html, .xml, .md, .pdf) and new relevance metrics!
GitHub repo: https://github.com/AstraBert/SenTrEv
Release notes: https://github.com/AstraBert/SenTrEv/releases/tag/v1.0.0
PyPI package: https://pypi.org/project/sentrev/
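This is not SenTrEv's API, just a tiny hand-rolled illustration of the kind of retrieval evaluation it automates: embed a corpus and queries with FastEmbed, then score how highly each query ranks its known-relevant document (model name and corpus are made up for the example).

```python
# Toy retrieval evaluation with FastEmbed + NumPy (illustrative only).
import numpy as np
from fastembed import TextEmbedding

docs = ["Qdrant is a vector database.", "FastEmbed provides lightweight embeddings."]
queries = ["what is qdrant?", "library for light embeddings"]
relevant = [0, 1]  # index of the document each query should retrieve

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
D = np.array(list(model.embed(docs)))
Q = np.array(list(model.embed(queries)))
D /= np.linalg.norm(D, axis=1, keepdims=True)   # normalize for cosine similarity
Q /= np.linalg.norm(Q, axis=1, keepdims=True)

# Rank of the relevant document for each query (1 = retrieved first).
ranks = [
    int(np.where(np.argsort(-(Q[i] @ D.T)) == relevant[i])[0][0]) + 1
    for i in range(len(queries))
]
print("hit@1:", np.mean([r == 1 for r in ranks]), "MRR:", np.mean([1 / r for r in ranks]))
```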
Happy New Year, Hugging Face community! In 2025, I'll continue my quantization (and some fine-tuning) efforts to support open-source AI and make knowledge free for everyone.
That didn't take long! Nomic AI has finetuned the new ModernBERT-base encoder model into a strong embedding model for search, classification, clustering and more!
Details:
- Based on ModernBERT-base with 149M parameters.
- Outperforms both nomic-embed-text-v1 and nomic-embed-text-v1.5 on MTEB!
- Immediate FA2 and unpadding support for super efficient inference.
- Trained with Matryoshka support, i.e. 2 valid output dimensionalities: 768 and 256.
- Maximum sequence length of 8192 tokens!
- Trained in 2 stages: unsupervised contrastive data -> high-quality labeled datasets.
- Integrated in Sentence Transformers, Transformers, LangChain, LlamaIndex, Haystack, etc.
- Apache 2.0 licensed: fully commercially permissible.
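A minimal usage sketch with Sentence Transformers; the `nomic-ai/modernbert-embed-base` id and the `search_query:` / `search_document:` prefixes follow Nomic's embedding-model conventions and should be checked against the model card.

```python
from sentence_transformers import SentenceTransformer

# truncate_dim=256 uses the smaller Matryoshka dimensionality; drop it for the full 768.
model = SentenceTransformer("nomic-ai/modernbert-embed-base", truncate_dim=256)

query_emb = model.encode(["search_query: What is t-SNE?"])
doc_emb = model.encode([
    "search_document: t-SNE is a dimensionality reduction technique.",
    "search_document: ModernBERT is an encoder with an 8192-token context window.",
])
print(model.similarity(query_emb, doc_emb))  # cosine similarities, query vs. each document
```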
The deepseek-ai/DeepSeek-V3-Base model was featured today on CNBC tech news. The whale made a splash by using FP8 and shrinking the cost of training significantly!
So a cool thing happened: Nomic/GPT4All released a "Reasoning/Thinking" (QwQ/o1/o3-type) model that uses JavaScript functions to calculate things like the haversine distance between two places. It's VERY cool to see such complex calculative/recursive AI in such a small package.
I was able to adapt their methods to one of my small models, "Replicant" (2 GB), and created a new model with importance-matrix quantization using the "THE_KEY" dataset for better inference in the coding model I pulled from WhiteRabbitNeo's Qwen2.5 model... I give you Reasoning Rabbit. Enjoy!
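The GPT4All tooling does this in JavaScript; here is an equivalent Python sketch of the haversine calculation such a model can call instead of estimating distances in its head.

```python
# Haversine great-circle distance between two (lat, lon) points, in kilometres.
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km = mean Earth radius

print(round(haversine_km(52.52, 13.405, 48.8566, 2.3522)))  # Berlin -> Paris, ~878 km
```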
Experience faster, lighter, and smarter language models! The new FastLlama makes Meta's LLaMA models work with smaller file sizes, lower system requirements, and higher performance. The model supports 8 languages, including English, German, and Spanish.
Built on the Llama-3.2-1B-Instruct model, fine-tuned with Hugging Face's SmolTalk and MetaMathQA-50k datasets, and powered by LoRA (Low-Rank Adaptation) for stronger mathematical reasoning.
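This is not the FastLlama training script, just a minimal sketch of what attaching LoRA adapters to Llama-3.2-1B-Instruct looks like with PEFT, as described above; the rank and target modules are illustrative defaults.

```python
# Minimal LoRA setup with PEFT (illustrative hyperparameters, not FastLlama's).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a small fraction of the 1B weights are trained
```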
In the past seven days, the Diffusers team has shipped:
1. Two new video models
2. One new image model
3. Two new quantization backends
4. Three new fine-tuning scripts
5. Multiple fixes and library QoL improvements
Coffee on me if someone can guess 1 - 4 correctly.
8pm EST: new discussion on AI privatization and its importance for cooperative and confidential development, client services, and family use.
We can also touch on the NEW OPEN SOURCE, which will solve MANY of the current problems we face, not only with AI but as a society. (Sorry, on startup someone hacked the chat or simply crashed it.) New link for 8pm EST: https://x.com/i/spaces/1MnxnDQrkjYGO
I'm super excited to release my first open-source text dataset:
WorldScenario 20K is a novel dataset of 20,000 synthetically generated multi-stakeholder scenarios designed to simulate real-world decision-making processes. Each scenario explores a unique environmental, societal, or economic issue.
I used the brand-new meta-llama/Llama-3.3-70B-Instruct model to generate this dataset, and I ran it through some post-processing to clean it and evaluate it for diversity.
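Not the author's actual pipeline, but a rough sketch of how such scenarios could be generated with Llama-3.3-70B-Instruct through the Hugging Face Inference API; the prompt and parameters are purely illustrative.

```python
# Illustrative synthetic-scenario generation via the HF Inference API.
from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Llama-3.3-70B-Instruct")
prompt = (
    "Write a short multi-stakeholder scenario about an environmental, societal, "
    "or economic issue. Name the stakeholders and the decision they face."
)
out = client.chat_completion(
    messages=[{"role": "user", "content": prompt}],
    max_tokens=400,
    temperature=0.9,  # higher temperature to encourage scenario diversity
)
print(out.choices[0].message.content)
```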
I'd appreciate some feedback and thoughts on my new release! Thanks!
For anyone looking to boost their LLM fine-tuning and alignment skills this December, we're running a free and open course called smol course. It's not big like Li Yin and @mlabonne, it's just smol.
- It focuses on practical use cases, so if you're working on something, bring it along.
- It's peer reviewed and open, so you can discuss and get feedback.
- If you're already a smol pro, feel free to drop a star or an issue.
Part 1 starts now, and it's on instruction tuning!
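A minimal instruction-tuning sketch in the spirit of the course, using TRL's SFTTrainer on SmolLM2-135M with a SmolTalk subset; the dataset subset name and hyperparameters are assumptions, not the course's own notebook.

```python
# Minimal supervised fine-tuning (instruction tuning) sketch with TRL.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Chat-format ("messages") examples; subset name is assumed for illustration.
dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-135M",  # small base model to instruction-tune
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="smollm2-instruct",
        max_steps=100,                   # short demo run, not a full training schedule
        per_device_train_batch_size=4,
    ),
)
trainer.train()
```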