---
license: mit
model_name: Avern Prism 1.0X
version: 1.0X
tags:
- text-generation
- LLM
- PyTorch
- unsloth
- code
- Qwen
- Qwen2.5
- reasoning
- general-intelligence
- programming
- avern
- uk
library_name: transformers
pipeline_tag: text-generation
metrics:
- accuracy
- perplexity
- character
---

# Avern Prism 1.0X

**Avern Prism 1.0X** is a state-of-the-art language model developed by **Avern Technology UKI**, built on the **Qwen2.5 14B** architecture and optimized with the **Unsloth** framework. Prism 1.0X is designed to perform at the intersection of **reasoning**, **coding**, and **general intelligence**, making it suitable for complex problem-solving and logical tasks across applications ranging from software development to AI-driven research and creative work.

## Model Description

- **Base Model**: Qwen2.5 14B
- **Architecture**: Transformer (decoder-only)
- **Training Framework**: PyTorch + Unsloth
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Context Length**: Up to 4096 tokens (see the sketch after this list)
- **Use Cases**: Advanced reasoning, problem-solving, code generation, creative content generation, AI research, and knowledge extraction.
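
Because the card lists a 4096-token window, a minimal sketch of enforcing that limit at tokenization time may be useful. This is not from the model card: the repo id is taken from the "How to Use" section below, and the truncation settings are standard `transformers` tokenizer arguments.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("avernai/prism-1.0x")

# Placeholder for an input that may exceed the context window.
long_prompt = "Summarise the following log: " + "error 42; " * 2000

inputs = tokenizer(
    long_prompt,
    return_tensors="pt",
    truncation=True,   # drop tokens beyond max_length
    max_length=4096,   # matches the context length listed above
)
print(inputs["input_ids"].shape)  # at most (1, 4096)
```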

## Key Features

- **Reasoning**: Optimized for solving complex logical problems, answering deep conceptual questions, and providing step-by-step reasoning for math and algorithmic problems.
- **Code Generation**: Supports code generation in multiple languages (Python, JavaScript, C++, and others), helping developers write, debug, and optimize code.
- **General Intelligence**: Broad general-purpose capabilities, including understanding abstract concepts, creating creative content, and answering domain-specific queries across multiple fields.
- **Size**: 14B parameters, balancing capability against compute cost.
- **Adaptability**: Can be fine-tuned for specific domains, allowing customization for research, business, education, or entertainment (a fine-tuning sketch follows this list).
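
The card names LoRA as the fine-tuning method but does not publish a recipe, so the following is a hypothetical sketch using the `peft` library; the rank, alpha, and target modules are illustrative assumptions, not settings from the card.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("avernai/prism-1.0x")

lora_config = LoraConfig(
    r=16,                                 # assumed rank of the low-rank update matrices
    lora_alpha=32,                        # assumed scaling factor for the LoRA updates
    target_modules=["q_proj", "v_proj"],  # attention projections, common for Qwen-style models
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```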

## Intended Use

This model is ideal for:

- **Developers**: Assisting with code generation, algorithmic problem solving, and software development tasks.
- **Researchers**: Leveraging its broad general intelligence to assist with exploratory research, hypothesis generation, and complex problem-solving.
- **Educators and Students**: Providing tools for learning programming, mathematics, and critical thinking.
- **Creative Applications**: Writing, brainstorming, and idea generation for creative work.
- **AI Enthusiasts**: Building custom AI-driven applications with advanced reasoning and coding capabilities.

## Training Data

Prism 1.0X was fine-tuned on a combination of datasets:

- **Code**: Datasets featuring a wide variety of programming languages and coding tasks.
- **Reasoning**: Datasets for logical reasoning, problem-solving, mathematics, and algorithm design.
- **General Knowledge**: General-domain knowledge, creative writing, and abstract reasoning datasets, including encyclopedic knowledge and instructional content.

**Note**: The training data excludes proprietary or private data.

## Limitations

- **Reasoning and Accuracy**: While Prism 1.0X is strong at reasoning, it may produce incorrect solutions to highly specialized problems or in unseen domains.
- **Hallucination Risk**: As with most large language models, Prism 1.0X may generate fabricated or incorrect information, especially in highly abstract or speculative scenarios.
- **Context**: Though highly capable, it can struggle to maintain context over long conversations or complex multi-step tasks without further fine-tuning.

## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# torch_dtype="auto" loads the weights in the checkpoint's native precision;
# device_map="auto" places the 14B parameters on available devices.
model = AutoModelForCausalLM.from_pretrained(
    "avernai/prism-1.0x", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("avernai/prism-1.0x")

# Example: Code generation
prompt = "Write a Python function that calculates the Fibonacci sequence up to n."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Example: Logical reasoning
prompt = "What is the next number in the sequence: 2, 4, 8, 16, ?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Example: General intelligence application
prompt = "Explain the theory of relativity in simple terms."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
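
Since Prism 1.0X is built on Qwen2.5, the checkpoint may ship a chat template, but the card does not confirm this. The sketch below (continuing from the block above) is therefore an assumption: it uses the template when present and falls back to plain prompting otherwise.

```python
# Hypothetical chat-style usage; assumes tokenizer.chat_template may be set.
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Debug this: def add(a, b): return a - b"},
]

if tokenizer.chat_template is not None:
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
else:
    prompt = messages[-1]["content"]  # fall back to the raw user message

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```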