PKU-ML committed on
Commit
26a49a7
·
verified ·
1 Parent(s): ae22f6b

Update README.md

Files changed (1)
  1. README.md +122 -3
README.md CHANGED
@@ -1,3 +1,122 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - PKU-ML/Erdos
5
+ language:
6
+ - en
7
+ metrics:
8
+ - accuracy
9
+ base_model:
10
+ - Qwen/Qwen2.5-3B-Instruct
11
+ pipeline_tag: text-generation
12
+ tags:
13
+ - graph
14
+ - chat
15
+ library_name: transformers
16
+ ---
17
+
18
+
19
+ # G1-3B
20
+
21
+ ## Introduction
22
+
23
+ G1 is a series of large language models for solving graph reasoning tasks, based on Qwen2.5-Instruct and trained on our benchmark [Erdős](https://huggingface.co/datasets/PKU-ML/Erdos).
24
+ We apply Group Relative Policy Optimization (GRPO) for reinforcement learning, with supervised fine-tuning as a preliminary step.
25
+
26
+ G1 brings the following improvements:
27
+
28
+ - **Significant improvement on graph reasoning**: G1 models achieve up to 46% improvement over baselines on Erdős, with the 7B variant matching OpenAI’s o3-mini and the 3B model surpassing Qwen2.5-72B-Instruct by notable margins.
29
+ - **Strong generalization to unseen graph tasks**: G1 exhibits zero-shot generalization on unseen graph tasks, improving performance on *other graph reasoning benchmarks* (GraphWiz, GraphArena) and *real-world graphs* (Cora, PubMed).
30
+ - **No compromise on general reasoning**: Crucially, G1 preserves general reasoning ability (GSM8K, MATH, MMLU-Pro), demonstrating its versatility.
31
+
32
+
33
+ **This repo contains the G1-3B model**, which has the following features:
34
+ - Type: Causal Language Models
35
+ - Training Stage: SFT & RL
36
+ - Architecture: the same as Qwen2.5-Instruct
37
+ - Number of Parameters: 3.09B
38
+ - Context Length: 32,768 tokens in total, with generation of up to 8,192 tokens
39
+
40
+ For more details, please refer to our [paper](https://arxiv.org/pdf/2505.18499) and [GitHub](https://github.com/PKU-ML/G1/tree/main).
41
+
42
+
43
+ ## Requirements
44
+
45
+ The model is trained from Qwen/Qwen2.5-3B-Instruct. Support for Qwen2.5 is included in recent releases of Hugging Face `transformers`, and we advise you to use the latest version.
46
+
47
+ With `transformers<4.37.0`, you will encounter the following error:
48
+ ```
49
+ KeyError: 'qwen2'
50
+ ```
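A quick way to catch this before loading the model is to compare the installed version against the 4.37.0 threshold mentioned above. The following is a minimal sketch, not part of the G1 codebase: the helper names `parse_version` and `supports_qwen2` are hypothetical, and the parser deliberately ignores pre-release suffixes such as `.dev0`.

```python
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Turn '4.37.0' or '4.38.0.dev0' into a tuple of three integers.

    Hypothetical helper: pre-release suffixes are ignored, so
    '4.37.0rc1' is treated the same as '4.37.0'.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '4.37' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def supports_qwen2(version: str, minimum: str = "4.37.0") -> bool:
    """Return True if `version` is at least `minimum`."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "OK" if supports_qwen2(installed) else "too old, expect KeyError: 'qwen2'"
        print(f"transformers {installed}: {status}")
```

If the check reports a version that is too old, upgrading `transformers` should resolve the `KeyError`.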
51
+
52
+
53
+ ## Quickstart
54
+
55
+ The following code snippet uses `apply_chat_template` to show how to load the tokenizer and model and how to generate a response.
56
+
57
+ ```python
58
+ from transformers import AutoModelForCausalLM, AutoTokenizer
59
+
60
+ INSTRUCTION_TEMPLATE = """
61
+ {instruction}
62
+
63
+ Solve the above problem efficiently and clearly. The last line of your response should be of the following format: 'Therefore, the final answer is: $\\boxed{{ANSWER}}$. I hope it is correct' (without quotes) where ANSWER is just the final number or expression that solves the problem. Think step by step before answering.
64
+ """.strip()
65
+
66
+ model_name = "PKU-ML/G1-3B"
67
+
68
+ model = AutoModelForCausalLM.from_pretrained(
69
+     model_name,
70
+     torch_dtype="auto",
71
+     device_map="auto"
72
+ )
73
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
74
+
75
+ prompt = "The task is to determine the degree centrality of a node in the graph.\n\n"\
76
+          "Degree centrality for a node is the fraction of nodes it is connected to.\n\n"\
77
+          "Here is an undirected graph containing nodes from 1 to 15. The edges are: (1, 15), (15, 11), (2, 3), (2, 6), (3, 6), (3, 7), (6, 7), (6, 8), (7, 8), (7, 14), (4, 10), (10, 5), (10, 12), (8, 14), (8, 9), (12, 11), (12, 13).\n\n"\
78
+          "Question: What is the degree centrality of node 2 in the graph?\n\n"\
79
+          "You need to format your answer as a float number."
80
+ messages = [
81
+     {"role": "user", "content": INSTRUCTION_TEMPLATE.format(instruction=prompt)}
82
+ ]
83
+ text = tokenizer.apply_chat_template(
84
+     messages,
85
+     tokenize=False,
86
+     add_generation_prompt=True
87
+ )
88
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
89
+
90
+ generated_ids = model.generate(
91
+     **model_inputs,
92
+     max_new_tokens=4096,
93
+     top_p=0.95,
94
+     top_k=30,
95
+     temperature=0.6
96
+ )
97
+ generated_ids = [
98
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
99
+ ]
100
+
101
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
102
+ print(response)
103
+ ```
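Because the instruction template asks the model to end with `\boxed{ANSWER}`, the reply can be checked against a reference computed directly from the edge list. The sketch below is illustrative only and is not the evaluation code from the paper: `degree_centrality`, `extract_boxed_answer`, `is_correct`, the tolerance, and the sample response string are all assumptions. It also assumes the "fraction of nodes it is connected to" is taken over the other n - 1 nodes, which is the convention NetworkX uses.

```python
import re

# Edge list and node count copied from the Quickstart prompt.
EDGES = [(1, 15), (15, 11), (2, 3), (2, 6), (3, 6), (3, 7), (6, 7), (6, 8),
         (7, 8), (7, 14), (4, 10), (10, 5), (10, 12), (8, 14), (8, 9),
         (12, 11), (12, 13)]
NUM_NODES = 15


def degree_centrality(edges, num_nodes, node):
    """Fraction of the other (num_nodes - 1) nodes adjacent to `node`."""
    neighbors = set()
    for u, v in edges:
        if u == node and v != node:
            neighbors.add(v)
        elif v == node and u != node:
            neighbors.add(u)
    return len(neighbors) / (num_nodes - 1)


def extract_boxed_answer(response):
    """Return the contents of the last \\boxed{...} in the response, or None.

    Assumption: the answer contains no nested braces.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", response)
    return matches[-1].strip() if matches else None


def is_correct(response, expected, tol=1e-2):
    """Compare the boxed answer with `expected` (hypothetical tolerance)."""
    answer = extract_boxed_answer(response)
    if answer is None:
        return False
    try:
        return abs(float(answer) - expected) <= tol
    except ValueError:
        return False


# Node 2 is adjacent to nodes 3 and 6, so the reference value is 2 / 14.
reference = degree_centrality(EDGES, NUM_NODES, 2)

# Made-up response text, used only to exercise the parser.
sample = r"Therefore, the final answer is: $\boxed{0.1429}$. I hope it is correct"
print(round(reference, 4), is_correct(sample, reference))
```

In practice `sample` would be replaced by the `response` string produced in the Quickstart code.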
104
+
105
+
106
+ ## Evaluation & Performance
107
+
108
+ Detailed evaluation results are reported in this [📑 paper](https://arxiv.org/pdf/2505.18499).
109
+
110
+
111
+ ## Citation
112
+
113
+ If you find our work helpful, please cite our paper.
114
+
115
+ ```
116
+ @article{guo2025g1,
117
+ title={G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning},
118
+ author={Guo, Xiaojun and Li, Ang and Wang, Yifei and Jegelka, Stefanie and Wang, Yisen},
119
+ journal={arXiv preprint arXiv:2505.18499},
120
+ year={2025}
121
+ }
122
+ ```