epfl-dlab/zip2zip-Llama-3.2-3B-Instruct-v0.1

+---
+library_name: transformers
+tags: []
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed before further recommendations can be made.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]

zip2zip_config.json ADDED

@@ -0,0 +1,275 @@

+{
+  "base_model_name_or_path": "meta-llama/Llama-3.2-3B-Instruct",
+  "compression": {
+    "disabled_ids": [
+      128000,
+      128001,
+      128002,
+      128003,
+      128004,
+      128005,
+      128006,
+      128007,
+      128008,
+      128009,
+      128010,
+      128011,
+      128012,
+      128013,
+      128014,
+      128015,
+      128016,
+      128017,
+      128018,
+      128019,
+      128020,
+      128021,
+      128022,
+      128023,
+      128024,
+      128025,
+      128026,
+      128027,
+      128028,
+      128029,
+      128030,
+      128031,
+      128032,
+      128033,
+      128034,
+      128035,
+      128036,
+      128037,
+      128038,
+      128039,
+      128040,
+      128041,
+      128042,
+      128043,
+      128044,
+      128045,
+      128046,
+      128047,
+      128048,
+      128049,
+      128050,
+      128051,
+      128052,
+      128053,
+      128054,
+      128055,
+      128056,
+      128057,
+      128058,
+      128059,
+      128060,
+      128061,
+      128062,
+      128063,
+      128064,
+      128065,
+      128066,
+      128067,
+      128068,
+      128069,
+      128070,
+      128071,
+      128072,
+      128073,
+      128074,
+      128075,
+      128076,
+      128077,
+      128078,
+      128079,
+      128080,
+      128081,
+      128082,
+      128083,
+      128084,
+      128085,
+      128086,
+      128087,
+      128088,
+      128089,
+      128090,
+      128091,
+      128092,
+      128093,
+      128094,
+      128095,
+      128096,
+      128097,
+      128098,
+      128099,
+      128100,
+      128101,
+      128102,
+      128103,
+      128104,
+      128105,
+      128106,
+      128107,
+      128108,
+      128109,
+      128110,
+      128111,
+      128112,
+      128113,
+      128114,
+      128115,
+      128116,
+      128117,
+      128118,
+      128119,
+      128120,
+      128121,
+      128122,
+      128123,
+      128124,
+      128125,
+      128126,
+      128127,
+      128128,
+      128129,
+      128130,
+      128131,
+      128132,
+      128133,
+      128134,
+      128135,
+      128136,
+      128137,
+      128138,
+      128139,
+      128140,
+      128141,
+      128142,
+      128143,
+      128144,
+      128145,
+      128146,
+      128147,
+      128148,
+      128149,
+      128150,
+      128151,
+      128152,
+      128153,
+      128154,
+      128155,
+      128156,
+      128157,
+      128158,
+      128159,
+      128160,
+      128161,
+      128162,
+      128163,
+      128164,
+      128165,
+      128166,
+      128167,
+      128168,
+      128169,
+      128170,
+      128171,
+      128172,
+      128173,
+      128174,
+      128175,
+      128176,
+      128177,
+      128178,
+      128179,
+      128180,
+      128181,
+      128182,
+      128183,
+      128184,
+      128185,
+      128186,
+      128187,
+      128188,
+      128189,
+      128190,
+      128191,
+      128192,
+      128193,
+      128194,
+      128195,
+      128196,
+      128197,
+      128198,
+      128199,
+      128200,
+      128201,
+      128202,
+      128203,
+      128204,
+      128205,
+      128206,
+      128207,
+      128208,
+      128209,
+      128210,
+      128211,
+      128212,
+      128213,
+      128214,
+      128215,
+      128216,
+      128217,
+      128218,
+      128219,
+      128220,
+      128221,
+      128222,
+      128223,
+      128224,
+      128225,
+      128226,
+      128227,
+      128228,
+      128229,
+      128230,
+      128231,
+      128232,
+      128233,
+      128234,
+      128235,
+      128236,
+      128237,
+      128238,
+      128239,
+      128240,
+      128241,
+      128242,
+      128243,
+      128244,
+      128245,
+      128246,
+      128247,
+      128248,
+      128249,
+      128250,
+      128251,
+      128252,
+      128253,
+      128254,
+      128255
+    ],
+    "initial_vocab_size": 128256,
+    "max_codebook_size": 2048,
+    "max_subtokens": 4
+  },
+  "encoder": {
+    "hidden_size": 3072,
+    "intermediate_size": null,
+    "num_heads": 24,
+    "num_hidden_layers": 2,
+    "position_encoding": "learnable",
+    "tie_encoders": true
+  },
+  "encoder_type": "transformer"
+}
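
A rough reading of the config above, sketched in Python. The dict is a condensed copy of the file, with the 256 `disabled_ids` written as a range. Everything else is an assumption rather than the zip2zip library's actual behaviour: the helper names `summarize` and `lzw_compress` are hypothetical, and so is the interpretation that codebook entries receive IDs starting at `initial_vocab_size`, that `max_subtokens` caps the length of a merged entry, and that `disabled_ids` are never merged. The compressor is plain LZW with a length cap, shown only to illustrate how the three `compression` fields could interact.

```python
# Condensed copy of zip2zip_config.json; the file lists the 256 disabled IDs
# one per line, and they form the range 128000..128255.
config = {
    "base_model_name_or_path": "meta-llama/Llama-3.2-3B-Instruct",
    "compression": {
        "disabled_ids": list(range(128000, 128256)),
        "initial_vocab_size": 128256,
        "max_codebook_size": 2048,
        "max_subtokens": 4,
    },
    "encoder": {
        "hidden_size": 3072,
        "intermediate_size": None,
        "num_heads": 24,
        "num_hidden_layers": 2,
        "position_encoding": "learnable",
        "tie_encoders": True,
    },
    "encoder_type": "transformer",
}


def summarize(cfg):
    """Derive a few quantities from the config (interpretations are assumptions)."""
    comp, enc = cfg["compression"], cfg["encoder"]
    ids = comp["disabled_ids"]
    return {
        "num_disabled": len(ids),
        "disabled_contiguous": ids == list(range(ids[0], ids[-1] + 1)),
        # Assumption: codebook entries are numbered after the base vocabulary.
        "max_total_ids": comp["initial_vocab_size"] + comp["max_codebook_size"],
        "head_dim": enc["hidden_size"] // enc["num_heads"],
    }


def lzw_compress(tokens, cfg):
    """Toy LZW over token IDs, capped by max_subtokens and max_codebook_size."""
    comp = cfg["compression"]
    disabled = set(comp["disabled_ids"])
    next_id = comp["initial_vocab_size"]
    limit = next_id + comp["max_codebook_size"]
    codebook = {}  # tuple of base token IDs -> hypothetical new ID
    out, cur = [], ()

    def emit(seq):
        # A single base token is emitted as-is; longer runs use their codebook ID.
        out.append(seq[0] if len(seq) == 1 else codebook[seq])

    for tok in tokens:
        if tok in disabled:
            # Disabled tokens pass through unchanged and break the current run.
            if cur:
                emit(cur)
            out.append(tok)
            cur = ()
            continue
        cand = cur + (tok,)
        if len(cand) == 1 or cand in codebook:
            cur = cand  # keep extending the longest known match
            continue
        emit(cur)
        if len(cand) <= comp["max_subtokens"] and next_id < limit:
            codebook[cand] = next_id
            next_id += 1
        cur = (tok,)
    if cur:
        emit(cur)
    return out, codebook


summary = summarize(config)
compressed, codebook = lzw_compress([1, 2, 1, 2, 1, 2, 1, 2], config)
```

Under these assumptions, `summary` reports 256 disabled IDs, at most 130304 distinct IDs, and a per-head dimension of 128, and the eight-token toy input compresses to `[1, 2, 128256, 128258, 2]`.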