metadata

pipeline_tag: text-generation
inference: true
widget:
  - text: 'def print_hello_world():'
    example_title: Hello world
    group: Python
datasets:
  - bigcode/commitpackft
  - bigcode/oasst-octopack
metrics:
  - code_eval
library_name: transformers
language:
  - zh
  - en
tags:
  - codegeex
  - glm
  - chatglm
model-index:
  - name: OctoGeeX
    results:
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Python
        metrics:
          - name: pass@1
            type: pass@1
            value: 44.7
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize JavaScript
        metrics:
          - name: pass@1
            type: pass@1
            value: 33.8
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Java
        metrics:
          - name: pass@1
            type: pass@1
            value: 36.9
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Go
        metrics:
          - name: pass@1
            type: pass@1
            value: 21.9
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize C++
        metrics:
          - name: pass@1
            type: pass@1
            value: 32.3
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Rust
        metrics:
          - name: pass@1
            type: pass@1
            value: 15.7
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Average
        metrics:
          - name: pass@1
            type: pass@1
            value: 30.9
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Python
        metrics:
          - name: pass@1
            type: pass@1
            value: 28.1
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix JavaScript
        metrics:
          - name: pass@1
            type: pass@1
            value: 27.7
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Java
        metrics:
          - name: pass@1
            type: pass@1
            value: 30.4
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Go
        metrics:
          - name: pass@1
            type: pass@1
            value: 27.6
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix C++
        metrics:
          - name: pass@1
            type: pass@1
            value: 22.9
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Rust
        metrics:
          - name: pass@1
            type: pass@1
            value: 9.6
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Average
        metrics:
          - name: pass@1
            type: pass@1
            value: 24.4
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Python
        metrics:
          - name: pass@1
            type: pass@1
            value: 30.4
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain JavaScript
        metrics:
          - name: pass@1
            type: pass@1
            value: 24
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Java
        metrics:
          - name: pass@1
            type: pass@1
            value: 24.7
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Go
        metrics:
          - name: pass@1
            type: pass@1
            value: 21.7
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain C++
        metrics:
          - name: pass@1
            type: pass@1
            value: 21
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Rust
        metrics:
          - name: pass@1
            type: pass@1
            value: 15.9
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Average
        metrics:
          - name: pass@1
            type: pass@1
            value: 22.9
            verified: false

OctoGeeX

Play with the model on the TODO Playground.

Model Summary
Use
Limitations
Training
License
Citation

Model Summary

OctoGeeX is an instruction-tuned model with 6B parameters, created by fine-tuning CodeGeeX2 on CommitPackFT and OASST as described in the OctoPack paper.

Repository: bigcode/octopack
Paper: TODO
Languages: 100+ Programming languages

OctoPack🐙🎒:

Data	CommitPack	4TB of GitHub commits across 350 programming languages
	CommitPackFT	Filtered version of CommitPack for high-quality commit messages that resemble instructions
Model	OctoCoder	StarCoder (16B parameters) instruction tuned on CommitPackFT + OASST
	OctoGeeX	CodeGeeX2 (6B parameters) instruction tuned on CommitPackFT + OASST
Evaluation	HumanEvalPack	Extension of OpenAI's HumanEval to cover 3 scenarios across 6 languages
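
The model-index scores for HumanEvalPack are reported as pass@1. As a sketch of how such a score is commonly estimated, the function below implements the unbiased pass@k estimator introduced alongside HumanEval for a single problem. Whether the reported numbers were produced with exactly this procedure is an assumption, and the name `pass_at_k` and the sample counts in the usage note are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, c of which pass the tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples fail, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 - C(n - c, k) / C(n, k): chance that a random size-k subset has a pass.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Illustrative numbers: 20 samples for one problem, 5 of them correct.
    print(pass_at_k(n=20, c=5, k=1))
```

For k = 1 the estimate reduces to the fraction of passing samples (here 5/20 = 0.25). A benchmark score is then the mean of this value over all problems.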

Use

Intended use

The model follows instructions provided in the input. We recommend prefacing your input with "Question: " and finishing with "Answer:", for example: "Question: Please write a function in Python that performs bubble sort.\n\nAnswer:"
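
A minimal sketch of the recommended prompt format follows. The helper name `build_prompt` is hypothetical and not part of any library; it only wraps an instruction in the "Question: ... Answer:" template described above.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the recommended Question/Answer template."""
    # Hypothetical helper: trims stray whitespace, then applies the template.
    return f"Question: {instruction.strip()}\n\nAnswer:"


if __name__ == "__main__":
    prompt = build_prompt("Please write a function in Python that performs bubble sort.")
    print(repr(prompt))
```

Keeping the template in one place avoids small mismatches (a missing blank line or colon) between prompts.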

Feel free to share your generations in the Community tab!

Generation

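# pip install -q transformers torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/octogeex"
device = "cuda"  # use "cpu" to run without a GPU

# Load the tokenizer and the model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

# Wrap the instruction in the recommended "Question: ... Answer:" format
prompt = "Question: Please write a function in Python that performs bubble sort.\n\nAnswer:"
inputs = tokenizer.encode(prompt, return_tensors="pt").to(device)

# Set an explicit limit; the default generation length is often too short for a full function
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))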
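
For a causal language model, the decoded output normally contains the prompt followed by the completion, so it can help to keep only the text after the "Answer:" marker. The sketch below assumes that behavior; `extract_answer` is a hypothetical helper, not part of transformers.

```python
def extract_answer(decoded: str, marker: str = "Answer:") -> str:
    """Return the text after the first occurrence of `marker`."""
    # partition() splits on the first match; `found` is empty if there is none.
    _, found, tail = decoded.partition(marker)
    # If the marker is missing, fall back to the whole string.
    return tail.strip() if found else decoded.strip()


if __name__ == "__main__":
    decoded = "Question: Say hi.\n\nAnswer: print('hi')"
    print(extract_answer(decoded))
```

Splitting on the first marker keeps any later "Answer:" that the model itself generates. If the instruction text contains the marker, a more specific delimiter would be needed.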

Training

Model

Architecture: GPT-2 model with multi-query attention and Fill-in-the-Middle objective
Steps: 250k pretraining & 30 instruction tuning
Tokens: 1 trillion pretraining & 2M instruction tuning
Precision: bfloat16

Hardware

Pretraining:
- GPUs: 512 Tesla A100
- Training time: 24 days
Instruction tuning:
- GPUs: 8 Tesla A100
- Training time: 4 hours
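
To put the two stages on a common scale, the figures above can be converted to GPU-hours. This is a rough sketch that assumes the listed training times are wall-clock durations with every GPU busy throughout; the function name `gpu_hours` is illustrative.

```python
def gpu_hours(num_gpus: int, hours: float) -> float:
    """GPU-hours, assuming all GPUs are busy for the whole duration."""
    return num_gpus * hours


# Pretraining: 512 GPUs for 24 days of 24 hours each.
pretraining = gpu_hours(512, 24 * 24)
# Instruction tuning: 8 GPUs for 4 hours.
instruction_tuning = gpu_hours(8, 4)

if __name__ == "__main__":
    print(f"pretraining:        {pretraining:,.0f} GPU-hours")
    print(f"instruction tuning: {instruction_tuning:,.0f} GPU-hours")
    print(f"ratio:              {pretraining / instruction_tuning:,.0f}x")
```

Under these assumptions pretraining comes to 294,912 GPU-hours and instruction tuning to 32 GPU-hours.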

Software

Orchestration: Megatron-LM/Transformers
Neural networks: PyTorch

License

The code in this repository is open-source under the MIT license (the original Chinese notice names Apache-2.0 instead). The model weights are licensed under the Model License.

Citation

TODO
