---
license: cc-by-4.0
---

What if I told you I found that some layers do absolutely nothing?

We are not training models efficiently when we race to be the biggest; big things come from optimizing small packages.

Cutting any one of these layers out does not change my model's output at all, but cutting all of the redundant layers at once breaks the model completely. So the trick is to balance the extractions strategically and see whether the model can recover after the redundancy is removed.

Please try my Mermaid-Llama-3-8B and this Mermaid-Llama-3-Pruned-6B; you might be surprised.

This is a 24-of-32-layers model.
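The idea can be sketched in a few lines of plain Python. This is a toy illustration with made-up numbers, not the actual model or the MergeKit procedure: each "layer" nudges a hidden state, a final threshold decodes it, and removing any single near-redundant layer leaves the output unchanged while removing all of them at once flips it.

```python
# Toy illustration of redundant-layer pruning (assumed numbers, not the real model).
# Each "layer" adds a small residual contribution to the hidden state; a final
# threshold turns the state into an output. The redundant layers each contribute
# so little that dropping any ONE of them leaves the output unchanged -- but
# dropping ALL of them at once pushes the state below the threshold.

def forward(layers, x=1.0):
    for delta in layers:
        x += delta  # each layer's residual contribution
    return x

def output(x, threshold=1.09):
    return 1 if x >= threshold else 0  # stand-in for the decoded token

useful = [0.02, 0.02, 0.02, 0.02]        # layers that do real work
redundant = [0.006, 0.006, 0.006, 0.006]  # layers that barely matter

baseline = output(forward(useful + redundant))      # unpruned model
drop_one = output(forward(useful + redundant[1:]))  # one redundant layer removed
drop_all = output(forward(useful))                  # all redundant layers removed

print(baseline, drop_one, drop_all)  # 1 1 0
```

The real model behaves the same way at scale: each redundant transformer layer is individually droppable, but their small contributions still add up, which is why the extractions have to be balanced.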

# Mermaid-Llama-3-Pruned-6B

Introducing Mermaid-Llama-3-6B, a robust language model designed for Python code understanding and for crafting captivating story flow maps. It has been pruned down to 6 billion parameters to show we don't need the bloat.

See the MergeKit notes, try trimming the model yourself, and explore my world of trimming models into smarter models with lower requirements for specific tasks. Mermaid is just a start: hire me to solve your problem and I will build the smallest-footprint model that solves just that problem.

I want to specialize in packing models onto edge devices.

I am open for hire; see my LinkedIn link for more.

![MermaidLlama GIF](Mermaid_ShowCase/MermaidLlama.webp)

## Key Features

1. **Code Understanding:**
   - Masters Python intricacies with finesse.
   - Generates clear and accurate Mermaid diagram flow charts.
   - Ideal for developers seeking visual representations of their code logic.

2. **Storytelling Capabilities:**
   - Converts narrative inputs into captivating Mermaid diagrams.
   - Maps character interactions, plot developments, and narrative arcs.

3. **Unmatched Performance:**
   - Surpasses GPT-4 in generating well-organized Mermaid diagrams.

4. **Training Insights:**
   - Trained on a diverse dataset, including 800 unique, hand-curated Mermaid graph examples drawn from 478 complete Python programs.
   - Exhibits emergent properties in story-to-flow-map translations and step-by-step instruction flow maps.

## Collaboration

Interested in enhancing Mermaid's capabilities? Contact [email protected] for collaboration opportunities.

## Example Use Cases
- **Retrieval-Augmented Generation (RAG):** Use Mermaid-Llama-3-8B to create condensed knowledge graphs. The model excels at generating flow diagrams that enhance the retrieval process. These knowledge graphs are stored in a vector database, which allows quick and efficient retrieval of contextually relevant information. When a query is received, the system retrieves a pertinent knowledge graph and appends it as context to the model. This enriched context enables the model to deliver more accurate and nuanced responses. The approach is particularly beneficial in applications requiring deep, context-aware interactions, such as sophisticated Q&A systems, dynamic data analysis, and complex decision-making tasks.
- **Code Documentation:** Automatic visual flow charts from Python code.
- **Storyboarding:** Visually appealing diagrams for storytelling.
- **Project Planning:** Visual project flow maps for effective team communication.
- **Learning Python:** Helps students visually understand Python code structures.
- **Game Design:** Visualizing game storylines for coherent narrative structure.
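The RAG flow described above can be sketched in a few lines. This is a minimal illustration: a bag-of-words cosine similarity stands in for a real embedding model and vector database, and the stored "knowledge graphs" are hand-written Mermaid snippets, not model output.

```python
import math
from collections import Counter

# Toy vector store: Mermaid "knowledge graphs" keyed by the text they summarize.
# A real system would use an embedding model and a vector database; the
# snippets and similarity measure here are illustrative only.
store = {
    "user login checks password and creates a session":
        "flowchart TD; A[Login] --> B{Password ok?}; B -->|yes| C[Create session]; B -->|no| D[Reject]",
    "the parser reads a file and builds a syntax tree":
        "flowchart TD; A[Read file] --> B[Tokenize]; B --> C[Build syntax tree]",
}

def embed(text):
    # Bag-of-words stand-in for a learned embedding.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query):
    # Return the stored entry most similar to the query.
    q = embed(query)
    return max(store, key=lambda doc: cosine(q, embed(doc)))

best = retrieve("how does login check the password")
context = store[best]  # appended to the prompt as context for the model
print(best)
```

From here, `context` would be prepended to the user's question before it is sent to the model, exactly as described in the bullet above.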

## Proof of Concept

Stay tuned for the release of a VSCode extension that displays a live flow map every time the user stops typing for more than 10 seconds.

## Training Specifications

- **LoRA Rank:** 2048
- **LoRA Alpha:** 4096
- **Batch Size:** 1
- **Micro Batch Size:** 1
- **Cutoff Length:** 4096
- **Save every n steps:** 1000
- **Epochs:** 3
- **Learning Rate:** 1e-6
- **LR Scheduler:** Cosine

**Target Modules:**
- q_proj
- v_proj
- k_proj
- o_proj
- gate_proj
- down_proj
- up_proj
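For reference, the settings above map onto a PEFT `LoraConfig` roughly as follows. This is a sketch, not the actual training script; the `task_type` is an assumption, since it is not stated in this card.

```python
from peft import LoraConfig

# Sketch of the LoRA settings listed above as a PEFT config.
# task_type is an assumption; it is not stated in this card.
lora_config = LoraConfig(
    r=2048,
    lora_alpha=4096,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "down_proj", "up_proj"],
    task_type="CAUSAL_LM",
)
```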

## Getting Started

Start by downloading one of my models.

![0 TroyDoesAI GIF](Mermaid_ShowCase/0_TroyDoesAI.gif)

Load the model.

![1 Load Model in 4-bit Show Example Use GIF](Mermaid_ShowCase/1_LoadModel_in_4bit_Show_Example_Use.gif)
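The 4-bit load shown in the clip can be reproduced with `transformers` and `bitsandbytes` along these lines. This is a sketch: the repo id is an assumption based on this card's name, and a CUDA GPU with the model weights downloaded is required.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed repo id -- substitute whichever of my models you downloaded.
model_id = "TroyDoesAI/Mermaid-Llama-3-Pruned-6B"

# Quantize to 4-bit on load to fit consumer VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```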

Use my prompt template to generate a Mermaid code block, which can be viewed in the Mermaid Live Editor or rendered with the Mermaid CLI tool.

![2 Loaded Model in Full Precision 16-bit Show Inference and Mermaid Live Editor GIF](Mermaid_ShowCase/2_Loaded_Model_in_Full_Precision_16bit_Show_Inference_and_Mermaid_Live_editor.gif)
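As an illustration of the kind of output to expect, here is a hand-written Mermaid flowchart for a small recursive Python function (illustrative only, not actual model output):

```mermaid
flowchart TD
    A["factorial(n)"] --> B{"n <= 1?"}
    B -->|yes| C["return 1"]
    B -->|no| D["return n * factorial(n-1)"]
    C --> E["end"]
    D --> E
```

Pasting a block like this into the Mermaid Live Editor renders the diagram instantly.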

Here we open the VLLM GUI program while Mermaid-Llama-8B is still running in VRAM, compare the flow diagram to the actual program, and show the lightweight capabilities of small models on consumer hardware.

![3 Open The Program VLLM Program With Full Precision Mermaid-Llama-8B Running to Evaluate Flow Map GIF](Mermaid_ShowCase/3_Open_The_Program_VLLM_Program_With_Full_Precision_Mermaid-Llama-8B-Running_to_evaluate_flow_map.gif)

## More on my VLLM class and inference GUI: https://github.com/Troys-Code/VLLM

![Python RtdBsaz8gy GIF](Mermaid_ShowCase/python_RtdBsaz8gy.gif)
---

Note: This model should be treated as an auto-complete model. Do not try talking to it in chat; you will get garbage. Those layers have been pruned and replaced, and that is all you will hear of my secret sauce for training on small (< 1000 entry) datasets.