Our Models

Model Card for VecTeus-v1.0

VecTeus-v1.0 is a Mistral-7B-based Large Language Model (LLM): a version of Mistral-7B-v0.1 fine-tuned on a novel dataset.

VecTeus differs from Mistral-7B-v0.1 in the following ways:

  • 128k context window (8k context in v0.1)
  • High-quality generation in both Japanese and English
  • Can generate NSFW content
  • Retains earlier context even during long-context generation

This model was created with the help of GPUs from the first LocalAI hackathon.

We would like to take this opportunity to thank

List of Creation Methods

  • Chat Vector applied to multiple models
  • Simple linear merging of the resulting models
  • Domain and sentence enhancement with LoRA
  • Context expansion
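
The first two steps in the list above can be illustrated with plain arithmetic on model weights. The sketch below is an assumption about the general techniques, not the project's actual pipeline. A chat vector is commonly taken to be the element-wise difference between an instruction-tuned model and its base, added to another model that shares the same base. A linear merge is a weighted average of matching parameters. Weights are represented as dicts of float lists to keep the example self-contained; the function names and toy values are hypothetical.

```python
from typing import Dict, List

# Parameter name -> flattened values (a toy stand-in for real tensors).
Weights = Dict[str, List[float]]


def apply_chat_vector(target: Weights, chat: Weights, base: Weights,
                      scale: float = 1.0) -> Weights:
    """Add the chat vector (chat - base) to target, parameter by parameter."""
    return {
        name: [t + scale * (c - b)
               for t, c, b in zip(target[name], chat[name], base[name])]
        for name in target
    }


def linear_merge(models: List[Weights], ratios: List[float]) -> Weights:
    """Weighted average of matching parameters; ratios are normalised to sum to 1."""
    total = sum(ratios)
    return {
        name: [
            sum(r * m[name][i] for m, r in zip(models, ratios)) / total
            for i in range(len(models[0][name]))
        ]
        for name in models[0]
    }


base = {"w": [1.0, 2.0]}
chat = {"w": [1.5, 1.0]}      # instruction-tuned variant of base
target = {"w": [3.0, 4.0]}    # another fine-tune of the same base

with_chat = apply_chat_vector(target, chat, base)
merged = linear_merge([target, with_chat], [0.5, 0.5])
print(with_chat, merged)
```

With real checkpoints the same arithmetic would run over each tensor in the state dict, and every model involved would need an identical architecture and parameter naming.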

Instruction format

No instruction template is required. Congratulations!

Examples of improving prompts (Japanese)

  • BAD: あなたは○○として振る舞います (You behave as ○○)

  • GOOD: あなたは○○です (You are ○○)

  • BAD: あなたは○○ができます (You can do ○○)

  • GOOD: あなたは○○をします (You do ○○)

Performing inference

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Local-Novel-LLM-project/Vecteus-v1"
new_tokens = 1024  # maximum number of tokens to generate

# attn_implementation="flash_attention_2" requires the flash-attn package and a supported GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Japanese for: "You are a professional novelist. Please write a novel."
system_prompt = "あなたはプロの小説家です。\n小説を書いてください\n-------- "

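prompt = input("Enter a prompt: ")
system_prompt += prompt + "\n-------- "
# Move the inputs to the device chosen for the model by device_map="auto".
model_inputs = tokenizer([system_prompt], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=new_tokens, do_sample=True)
# The decoded text contains the prompt followed by the generated continuation.
print(tokenizer.batch_decode(generated_ids)[0])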
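
The prompt layout used by the script above (system text, a `--------` separator line, the user text, then another separator) can be factored into a small helper. This is only a sketch: `build_prompt` and `SEPARATOR` are hypothetical names introduced here, and the layout is inferred from the script rather than from any documented template.

```python
SEPARATOR = "\n-------- "  # same separator string the inference script appends


def build_prompt(system_text: str, user_text: str) -> str:
    """Join the system text and user text using the script's separator layout."""
    return system_text + SEPARATOR + user_text + SEPARATOR


full_prompt = build_prompt(
    "あなたはプロの小説家です。\n小説を書いてください",  # system text from the script
    "A short story about a lighthouse keeper",            # example user text
)
print(full_prompt)
```

Because `generate` returns the prompt tokens followed by the new tokens, slicing with `generated_ids[0][model_inputs["input_ids"].shape[1]:]` before decoding would keep only the generated continuation.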

Merge recipe

  • VT0.1 = Ninjav1 + Original LoRA

  • VT0.2 = Ninjav1 128k + Original LoRA

  • VT0.2on0.1 = VT0.1 + VT0.2

  • VT1 = all VT series + LoRA + Ninja 128k and Normal
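
The recipe does not say how each LoRA was combined with its base model. One common approach, shown here purely as an assumption, folds the low-rank update into the weight matrix as W' = W + (alpha / r) * B A, where A and B are the LoRA matrices, r is the rank, and alpha is the scaling hyperparameter. The helper names and toy matrices below are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight: Matrix, lora_a: Matrix, lora_b: Matrix,
               alpha: float, rank: int) -> Matrix:
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    scale = alpha / rank
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]   # toy 2x2 base weight
lora_b = [[1.0], [2.0]]             # 2 x r, with r = 1
lora_a = [[0.5, 0.0]]               # r x 2

merged_weight = merge_lora(weight, lora_a, lora_b, alpha=2.0, rank=1)
print(merged_weight)
```

After such a merge the result is an ordinary weight matrix, which is what would allow it to take part in a further merge like the VT0.1 + VT0.2 step.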

Other points to keep in mind

  • The training data may be biased, so review generated text with care.
  • Memory usage can be large during long-context inference.
  • If possible, we recommend running inference with llama.cpp rather than Transformers.
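
To give a rough sense of the memory point above, the key-value cache can be estimated from the model shape. The sketch below assumes the published Mistral-7B-v0.1 configuration (32 layers, 8 key-value heads, head dimension 128) and 16-bit cache values; VecTeus's own configuration may differ. It also assumes a full attention cache with no sliding window, and it excludes the model weights and activations. `kv_cache_bytes` is a hypothetical helper name.

```python
def kv_cache_bytes(num_tokens: int, num_layers: int = 32, num_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """Estimate KV-cache size: one key and one value vector per layer, head, and token."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * num_tokens


# Compare an 8k-token context with a 128k-token context (taking 1k = 1024 tokens).
for tokens in (8 * 1024, 128 * 1024):
    gib = kv_cache_bytes(tokens) / 1024**3
    print(f"{tokens:>7} tokens -> {gib:.1f} GiB of KV cache")
```

Under these assumptions the cache grows linearly with context length, from about 1 GiB at 8k tokens to about 16 GiB at 128k tokens.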

Model size: 7.24B params (Safetensors); tensor type: BF16