Commit bb92f5b (verified) by justinj92 · Parent(s): d8620be

Update README.md

Files changed (1): README.md (+8 −6)
```diff
@@ -1,6 +1,5 @@
 ---
 language:
-- ml
 - en
 base_model: Qwen/Qwen3-8B
 library_name: transformers
@@ -11,18 +10,21 @@ tags:
 - lora
 - merged
 license: apache-2.0
+datasets:
+- NousResearch/Hermes-3-Dataset
+- QuixiAI/dolphin
 ---
 
 # Delphermes-8B-cpt-epoch2
 
-This is a merged LoRA model based on Qwen/Qwen3-8B, fine-tuned for Malayalam language tasks.
+This is a merged LoRA model based on Qwen/Qwen3-8B, fine-tuned using Hermes3 & Dolphin synth data.
 
 ## Model Details
 
 - **Base Model**: Qwen/Qwen3-8B
-- **Language**: Malayalam (ml), English (en)
+- **Language**: English (en)
 - **Type**: Merged LoRA model
-- **Library**: transformers
+- **Library**: transformers, axolotl
 
 ## Usage
 
@@ -39,7 +41,7 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 
 # Example usage
-text = "നമസ്കാരം"
+text = "Who are you?"
 inputs = tokenizer(text, return_tensors="pt")
 outputs = model.generate(**inputs, max_length=100)
 response = tokenizer.decode(outputs[0], skip_special_tokens=True)
@@ -48,4 +50,4 @@ print(response)
 
 ## Training Details
 
-This model was created by merging a LoRA adapter trained for Malayalam language understanding and generation.
+This model was created by merging a LoRA adapter trained for understanding and generation.
```
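The "merged LoRA model" wording in the README refers to folding the low-rank adapter back into the base weights so no separate adapter is needed at inference. A minimal numerical sketch of what that merge does, with toy shapes and a made-up `alpha`/rank — illustrative assumptions only, not this model's actual training configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy base weight matrix W (d_out x d_in) and a rank-r LoRA pair (B, A).
d_out, d_in, r, alpha = 8, 8, 2, 4
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in))
B = rng.standard_normal((d_out, r))

# Merging folds the scaled adapter product into the base weights:
#   W_merged = W + (alpha / r) * B @ A
W_merged = W + (alpha / r) * B @ A

# The merged forward pass needs only one matmul, yet matches the
# base-plus-adapter forward pass exactly.
x = rng.standard_normal(d_in)
y_adapter = W @ x + (alpha / r) * B @ (A @ x)
y_merged = W_merged @ x
assert np.allclose(y_adapter, y_merged)
```

In practice this folding is typically done per target module with a tool such as peft's `merge_and_unload()`; the resulting checkpoint loads as a plain `transformers` model, which is why the README's usage snippet needs no peft import.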