BigCodeLlama 92b GGUF files 🚀

Experimental 92B CodeLlama merge that should be better than the stock CodeLlama-70b models

Models Merged with base codellama/CodeLlama-70b-Instruct-hf

Full model here: https://huggingface.co/nisten/BigCodeLlama-92b

Models Merged

The following models were included in the merge:

  • ../CodeLlama-70b-Python-hf
  • ../CodeLlama-70b-Instruct-hf

Configuration

The following YAML configuration was used to produce this model:

dtype: bfloat16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 69]
    model:
      model:
        path: ../CodeLlama-70b-Instruct-hf
- sources:
  - layer_range: [42, 80]
    model:
      model:
        path: ../CodeLlama-70b-Python-hf
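
As a rough illustration of what the passthrough configuration above does, the sketch below counts the layers in the stacked model and records which source layer each output layer comes from. It assumes `layer_range` is half-open (`[start, end)`), which matches the `[42, 80]` range on an 80-layer model; the function names are hypothetical helpers, not part of mergekit.

```python
# Layer ranges copied from the configuration above.
# Assumption: each range is half-open, i.e. [start, end).
slices = [
    ("../CodeLlama-70b-Instruct-hf", (0, 69)),
    ("../CodeLlama-70b-Python-hf", (42, 80)),
]


def stacked_layer_count(slices):
    """Total number of transformer layers after stacking the slices in order."""
    return sum(end - start for _, (start, end) in slices)


def layer_origins(slices):
    """For each output layer, the (source model, source layer index) it is copied from."""
    return [(path, i) for path, (start, end) in slices for i in range(start, end)]


total = stacked_layer_count(slices)
print(total)  # 107 = 69 Instruct layers + 38 Python layers
```

Under that assumption, layers 42 to 68 are represented twice (once from each model), which is where the extra depth over the 80-layer base comes from.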

The larger quants are split into parts. To reassemble one (the 6-bit quant, for example), download both parts and concatenate them in order:

cat BigCodeLlama-92b-q6.gguf.part0 BigCodeLlama-92b-q6.gguf.part1 > BigCodeLlama-92b-q6.gguf
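
Where `cat` is not available (for example on Windows), the same concatenation can be sketched in Python. This is a minimal sketch, not a tool shipped with this repository: `join_parts` and `looks_like_gguf` are hypothetical helper names, and the only check performed is that the joined file starts with the 4-byte `GGUF` magic.

```python
import shutil
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def join_parts(parts, output):
    """Concatenate the files in `parts`, in the given order, into `output`."""
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Streams in chunks, so the parts never need to fit in RAM.
                shutil.copyfileobj(src, out)


def looks_like_gguf(path):
    """Cheap sanity check: does the file begin with the GGUF magic bytes?"""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    name = "BigCodeLlama-92b-q6.gguf"
    parts = [f"{name}.part0", f"{name}.part1"]
    # Only run when both downloaded parts are present in the current directory.
    if all(Path(p).exists() for p in parts):
        join_parts(parts, name)
        print("GGUF header found:", looks_like_gguf(name))
```

The order of the parts matters: `part0` must come first, exactly as in the `cat` command.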

Comparison against the stock model, using this prompt:

Plan and write code for building a city on mars via calculating aldrin cycler orbits in js for cargo shipments starting in year 2030, and after coding it in python and c++ output a table of calendar of deliver dates.

Don't ask for clarification just do the work smartly.

[Image: response from the stock model]

And the response from our 6-bit quant:

[Image: response from the 6-bit quant]

Format: GGUF
Model size: 92.1B params
Architecture: llama