Improve model card: Add library_name, science tag, GitHub link, and usage example

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +51 -4
README.md CHANGED
@@ -1,17 +1,25 @@
1
  ---
2
- license: llama3.1
 
3
  datasets:
4
  - MegaScience/MegaScience
5
  language:
6
  - en
 
7
  metrics:
8
  - accuracy
9
- base_model:
10
- - meta-llama/Llama-3.1-8B
11
  pipeline_tag: text-generation
12
  ---
 
13
  # [MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning](https://arxiv.org/abs/2507.16812)
14
 
15
  ## Llama3.1-8B-MegaScience
16
 
17
  ### Training Recipe
@@ -39,6 +47,45 @@ pipeline_tag: text-generation
39
  <img src="https://cdn-uploads.huggingface.co/production/uploads/616bfc2b40e2f69baa1c7add/VogIpBbjfNxXFP9DfVMms.png" alt="Data Pipeline" style="width:100%;">
40
  </div>
41
 
42
  ## Citation
43
 
44
  Check out our [paper](https://arxiv.org/abs/2507.16812) for more details. If you use our dataset or find our work useful, please cite
@@ -51,4 +98,4 @@ Check out our [paper](https://arxiv.org/abs/2507.16812) for more details. If you
51
  journal={arXiv preprint arXiv:2507.16812},
52
  url={https://arxiv.org/abs/2507.16812}
53
  }
54
- ```
 
1
  ---
2
+ base_model:
3
+ - meta-llama/Llama-3.1-8B
4
  datasets:
5
  - MegaScience/MegaScience
6
  language:
7
  - en
8
+ license: llama3.1
9
  metrics:
10
  - accuracy
 
 
11
  pipeline_tag: text-generation
12
+ library_name: transformers
13
+ tags:
14
+ - science
15
  ---
16
+
17
  # [MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning](https://arxiv.org/abs/2507.16812)
18
 
19
+ **Llama3.1-8B-MegaScience** is a model fine-tuned on **MegaScience**, a large-scale mixture of high-quality open-source scientific datasets totaling 1.25 million instances, as presented in the paper "MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning". The MegaScience dataset features truthful reference answers extracted from 12k university-level scientific textbooks, comprising 650k reasoning questions spanning 7 scientific disciplines. According to the paper, models trained on MegaScience significantly outperform the corresponding official instruct models in average performance on scientific reasoning tasks, and the gains are greater for larger and stronger base models, suggesting a scaling benefit for scientific tuning.
20
+
21
+ For more details on the project, including the data curation pipeline and evaluation system, visit the [official GitHub repository](https://github.com/GAIR-NLP/lm-open-science-evaluation).
22
+
23
  ## Llama3.1-8B-MegaScience
24
 
25
  ### Training Recipe
 
47
  <img src="https://cdn-uploads.huggingface.co/production/uploads/616bfc2b40e2f69baa1c7add/VogIpBbjfNxXFP9DfVMms.png" alt="Data Pipeline" style="width:100%;">
48
  </div>
49
 
50
+ ### Usage
51
+
52
+ You can use the model with the `transformers` library:
53
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ model_id = "MegaScience/Llama3.1-8B-MegaScience"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto"
+ )
+
+ messages = [
+     {"role": "user", "content": "Explain the concept of quantum entanglement."},
+ ]
+
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt"
+ ).to(model.device)
+
+ outputs = model.generate(
+     input_ids,
+     max_new_tokens=512,
+     eos_token_id=tokenizer.eos_token_id,
+     do_sample=True,
+     temperature=0.7,
+     top_p=0.9
+ )
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ print(response)
+ ```
88
+
89
  ## Citation
90
 
91
  Check out our [paper](https://arxiv.org/abs/2507.16812) for more details. If you use our dataset or find our work useful, please cite
 
98
  journal={arXiv preprint arXiv:2507.16812},
99
  url={https://arxiv.org/abs/2507.16812}
100
  }
101
+ ```
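
A note on the usage example in this diff: for a decoder-only model, `model.generate` returns the prompt tokens followed by the newly generated tokens, so decoding `outputs[0]` prints the prompt text together with the answer. The sketch below shows one way to keep only the continuation. The helper name `strip_prompt` is hypothetical (it is not part of `transformers`), and plain Python lists stand in for the token tensors so the snippet runs without downloading the model.

```python
def strip_prompt(output_ids, prompt_len):
    """Return only the tokens generated after the prompt.

    `output_ids` is one generated sequence (prompt + continuation) and
    `prompt_len` is the number of prompt tokens. Hypothetical helper,
    shown for illustration only.
    """
    if prompt_len < 0 or prompt_len > len(output_ids):
        raise ValueError("prompt_len must be between 0 and len(output_ids)")
    return output_ids[prompt_len:]


# Toy IDs standing in for real token IDs: 3 prompt tokens, 2 generated.
prompt_ids = [101, 7592, 102]
output_ids = prompt_ids + [2023, 2003]
print(strip_prompt(output_ids, len(prompt_ids)))  # prints [2023, 2003]
```

With the tensors from the usage example, the equivalent slice would be `outputs[0][input_ids.shape[-1]:]`, applied before calling `tokenizer.decode`; this assumes `input_ids` is the 2-D tensor returned by `apply_chat_template` as written in the PR.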