Tucano2
An open suite of large language models (LLMs) with 0.5-3.7 billion parameters, designed to address the gap in open-source development for Portuguese.
- Paper • 2603.03543 • Published • 6
Tucano2Cool Chat Demo
Tucano 2 is the coolest open source Portuguese LLM!
Polygl0t/Tucano2-0.6B-Base
Text Generation • 0.7B • Updated • 25
Note: 🧱 Base version of Tucano2 0.6B. Use as a foundation for post-training.
Polygl0t/Tucano2-qwen-0.5B-Base
Text Generation • 0.5B • Updated • 27
Note: 🧱 Base version of Tucano2 0.5B. Use as a foundation for post-training.
Polygl0t/Tucano2-qwen-0.5B-Instruct
Text Generation • 0.5B • Updated • 70 • 1
Note: 💬 Instruct version of Tucano2 0.5B. Suited for chat applications.
Polygl0t/Tucano2-qwen-0.5B-Think
Text Generation • 0.5B • Updated • 56
Note: 🤔 Think version of Tucano2 0.5B. Suited for reasoning tasks.
Polygl0t/Tucano2-qwen-1.5B-Base
Text Generation • 2B • Updated • 313
Note: 🧱 Base version of Tucano2 1.5B. Use as a foundation for post-training.
Polygl0t/Tucano2-qwen-1.5B-Instruct
Text Generation • 2B • Updated • 326 • 1
Note: 💬 Instruct version of Tucano2 1.5B. Suited for chat applications.
Polygl0t/Tucano2-qwen-1.5B-Think
Text Generation • 2B • Updated • 25
Note: 🤔 Think version of Tucano2 1.5B. Suited for reasoning tasks.
Polygl0t/Tucano2-qwen-3.7B-Base
Text Generation • 4B • Updated • 20
Note: 🧱 Base version of Tucano2 3.7B. Use as a foundation for post-training.
Polygl0t/Tucano2-qwen-3.7B-Instruct
Text Generation • 4B • Updated • 55 • 1
Note: 💬 Instruct version of Tucano2 3.7B. Suited for chat applications.
Polygl0t/Tucano2-qwen-3.7B-Think
Text Generation • 4B • Updated • 40
Note: 🤔 Think version of Tucano2 3.7B. Suited for reasoning tasks.
Polygl0t/gigaverbo-v2
Viewer • Updated • 375M • 112
Note: Pretraining dataset.
Polygl0t/gigaverbo-v2-synth
Viewer • Updated • 11.2M • 66
Note: Synthetic dataset.
Polygl0t/gigaverbo-v2-sft
Viewer • Updated • 4.09M • 96 • 1
Note: Supervised fine-tuning dataset.
Polygl0t/gigaverbo-v2-preferences
Viewer • Updated • 28.4k • 49
Note: Preference dataset.
Polygl0t/GigaVerbo-v2-ablation-EDU-Synth-1.5B
Text Generation • 2B • Updated • 16
Note: 🔬 Ablation experiment (Edu+Synth).
Polygl0t/GigaVerbo-v2-ablation-EDU-1.5B
Text Generation • 2B • Updated • 14
Note: 🔬 Ablation experiment (Edu).
Polygl0t/GigaVerbo-v2-ablation-Synth-1.5B
Text Generation • 2B • Updated • 13
Note: 🔬 Ablation experiment (Synth).
Polygl0t/GigaVerbo-v2-ablation-NonEDU-1.5B
Text Generation • 2B • Updated • 13
Note: 🔬 Ablation experiment (NonEdu).
Polygl0t/portuguese-edu-qwen-annotations
Viewer • Updated • 700k • 5
Note: Annotations to train classifiers/filters (Educational).
Polygl0t/portuguese-toxicity-qwen-annotations
Viewer • Updated • 700k • 7
Note: Annotations to train classifiers/filters (Toxicity).
Polygl0t/portuguese-instruct-quality-qwen-annotations
Viewer • Updated • 500k • 3
Note: Annotations to train classifiers/filters (Instructions).
Polygl0t/portuguese-bertimbau-edu-classifier
Text Classification • 0.1B • Updated • 14
Note: 🎯 Quality filter (Educational).
Polygl0t/portuguese-bertimbau-large-edu-classifier
Text Classification • 0.3B • Updated • 14
Note: 🎯 Quality filter (Educational).
Polygl0t/portuguese-bertimbau-toxicity-classifier
Text Classification • 0.1B • Updated • 15
Note: 🎯 Quality filter (Toxicity).
Polygl0t/portuguese-bertabaporu-large-toxicity-classifier
Text Classification • 0.4B • Updated • 14
Note: 🎯 Quality filter (Toxicity).
Polygl0t/portuguese-qwen3-4b-instruct-quality-classifier
Text Classification • 4B • Updated • 21
Note: 🎯 Quality filter (Instructions).
Polygl0t/portuguese-qwen3-4b-instruct-quality-judge
Text Generation • 4B • Updated • 19
Note: 🎯 Quality filter (Instructions).
Polygl0t/tokenizers
Viewer • Updated • 8.98M • 15
Note: Data used to train the Tucano2 tokenizer.
Polygl0t/gsm8k-pt
Viewer • Updated • 8.76k • 15
Note: An evaluation for mathematical reasoning in Portuguese.
Polygl0t/IFEval-PT
Viewer • Updated • 300 • 14
Note: An evaluation for instruction following in Portuguese.
Polygl0t/portuguese-eval-logs-olmo2-smollm3
Viewer • Updated • 203 • 24
Note: 🔬 Evaluation suite experiments.
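
The Tucano2 model repositories listed above appear to share a naming pattern: owner `Polygl0t`, the prefix `Tucano2-qwen-`, a size (0.5B, 1.5B, 3.7B), and a variant (Base, Instruct, Think). The following is a minimal sketch, assuming that pattern holds, of how one might enumerate or select a repository ID in Python. The helper names `tucano2_repo_ids` and `pick_tucano2` are hypothetical and not part of any library, and `Polygl0t/Tucano2-0.6B-Base` is left out because it does not follow the `qwen` pattern.

```python
from itertools import product

# Values copied from the listing above; the pattern itself is an assumption.
OWNER = "Polygl0t"
SIZES = ("0.5B", "1.5B", "3.7B")
VARIANTS = ("Base", "Instruct", "Think")


def tucano2_repo_ids():
    """Return the nine repo IDs implied by the size x variant grid."""
    return [f"{OWNER}/Tucano2-qwen-{size}-{variant}"
            for size, variant in product(SIZES, VARIANTS)]


def pick_tucano2(size, variant):
    """Build one repo ID, rejecting sizes or variants not in the listing."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return f"{OWNER}/Tucano2-qwen-{size}-{variant}"


if __name__ == "__main__":
    for repo_id in tucano2_repo_ids():
        print(repo_id)
```

A string returned by `pick_tucano2("1.5B", "Instruct")` is the kind of identifier one would typically pass to a Hub loader such as `AutoModelForCausalLM.from_pretrained` in the `transformers` library. Whether these checkpoints need any extra loading options is not stated in the listing.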