nielsr (HF Staff) committed
Commit da9493c · verified · 1 Parent(s): cc8711e

Improve model card: Update license & pipeline tag, add project page


This PR improves the model card by:

* Updating the `license` to `mit` for consistency with the associated GitHub repository.
* Changing the `pipeline_tag` to `text-generation` to better reflect the model's primary use case as a large language model and align with the provided usage examples.
* Adding a link to the project page (`https://itay1itzhak.github.io/planted-in-pretraining`) in the model card content for easier access to more project details.

Please review and merge this PR if everything looks good.
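For reviewers who want to sanity-check the updated `pipeline_tag` against the card's usage examples, a minimal loading sketch is shown below. The Hub repository id is a placeholder (this diff does not show it), and because the base model is an encoder-decoder T5, the sketch uses the seq2seq classes from `transformers`; the card's own snippet may differ in detail.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder Hub id -- substitute the actual repository id for this checkpoint.
model_id = "itay1itzhak/T5-Tulu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

prompt = "Answer the following question: What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```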

Files changed (1)
  1. README.md +30 -29
README.md CHANGED
@@ -1,18 +1,18 @@
  ---
- license: apache-2.0
- tags:
- - language-modeling
- - causal-lm
- - bias-analysis
- - cognitive-bias
+ base_model:
+ - google/t5-v1_1-xxl
  datasets:
  - allenai/tulu-v2-sft-mixture
  language:
  - en
- base_model:
- - google/t5-v1_1-xxl
- pipeline_tag: text2text-generation
  library_name: transformers
+ license: mit
+ pipeline_tag: text-generation
+ tags:
+ - language-modeling
+ - causal-lm
+ - bias-analysis
+ - cognitive-bias
  ---

  # Model Card for T5-Tulu
@@ -25,12 +25,13 @@ This 🤗 Transformers model was finetuned using LoRA adapters for the arXiv pap
  We study whether cognitive biases in LLMs emerge from pretraining, instruction tuning, or training randomness.
  This is one of 3 identical versions trained with different random seeds.

- - **Model type**: encoder-decoder based transformer
- - **Language(s)**: English
- - **License**: Apache 2.0
- - **Finetuned from**: `google/t5-v1_1-xxl`
- - **Paper**: https://arxiv.org/abs/2507.07186
- - **Repository**: https://github.com/itay1itzhak/planted-in-pretraining
+ - **Model type**: encoder-decoder based transformer
+ - **Language(s)**: English
+ - **License**: MIT
+ - **Finetuned from**: `google/t5-v1_1-xxl`
+ - **Paper**: https://arxiv.org/abs/2507.07186
+ - **Repository**: https://github.com/itay1itzhak/planted-in-pretraining
+ - **Project Page**: https://itay1itzhak.github.io/planted-in-pretraining

  ## Uses

@@ -55,26 +56,26 @@ print(tokenizer.decode(outputs[0]))

  ## Training Details

- - Finetuning method: LoRA (high-rank, rank ∈ [64, 512])
- - Instruction data: Tulu-2
- - Seeds: 3 per setting to evaluate randomness effects
- - Batch size: 128 (OLMo) / 64 (T5)
- - Learning rate: 1e-6 to 1e-3
- - Steps: ~5.5k (OLMo) / ~16k (T5)
- - Mixed precision: fp16 (OLMo) / bf16 (T5)
+ - Finetuning method: LoRA (high-rank, rank ∈ [64, 512])
+ - Instruction data: Tulu-2
+ - Seeds: 3 per setting to evaluate randomness effects
+ - Batch size: 128 (OLMo) / 64 (T5)
+ - Learning rate: 1e-6 to 1e-3
+ - Steps: ~5.5k (OLMo) / ~16k (T5)
+ - Mixed precision: fp16 (OLMo) / bf16 (T5)

  ## Evaluation

- - Evaluated on 32 cognitive biases from Itzhak et al. (2024) and Malberg et al. (2024)
- - Metrics: mean bias score, PCA clustering, MMLU accuracy
- - Findings: Biases primarily originate in pretraining; randomness introduces moderate variation
+ - Evaluated on 32 cognitive biases from Itzhak et al. (2024) and Malberg et al. (2024)
+ - Metrics: mean bias score, PCA clustering, MMLU accuracy
+ - Findings: Biases primarily originate in pretraining; randomness introduces moderate variation

  ## Environmental Impact

- - Hardware: 4× NVIDIA A40
- - Estimated time: ~120 GPU hours/model
+ - Hardware: 4× NVIDIA A40
+ - Estimated time: ~120 GPU hours/model

  ## Technical Specifications

- - Architecture: T5-11B
- - Instruction dataset: Tulu-2
+ - Architecture: T5-11B
+ - Instruction dataset: Tulu-2
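
As a companion to the Training Details section touched above (high-rank LoRA on a T5-11B base, bf16 precision), a minimal PEFT-style configuration could look like the sketch below. The rank, alpha, dropout, and target modules are illustrative assumptions; the card only states that the rank falls in [64, 512] and does not list the exact adapter targets or training loop.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSeq2SeqLM

# Illustrative sketch only: the card reports rank in [64, 512], lr 1e-6 to 1e-3, and bf16 for T5.
base = AutoModelForSeq2SeqLM.from_pretrained(
    "google/t5-v1_1-xxl", torch_dtype=torch.bfloat16
)

lora_cfg = LoraConfig(
    r=256,                      # high-rank LoRA, within the reported [64, 512] range
    lora_alpha=512,             # assumption; not stated in the card
    lora_dropout=0.05,          # assumption; not stated in the card
    target_modules=["q", "v"],  # assumption: T5 attention projections
    task_type="SEQ_2_SEQ_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
```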