	---
	library_name: transformers
	tags:
	- text-generation-inference
	- pretraining/SFT
	- code
	- math
	license: apache-2.0
	language:
	- en
	base_model:
	- Gensyn/Qwen2.5-0.5B-Instruct
	pipeline_tag: text-generation
	---

	![4.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/O97CXaIMZRnhzV7yZhZV4.png)

	# Lang-Exster-0.5B-Instruct

	> Lang-Exster-0.5B-Instruct is a general-purpose instruction-following LLM fine-tuned from Qwen2.5-0.5B. It is optimized for lightweight deployments and instructional clarity, and it can perform a wide range of natural language and programming-related tasks efficiently.

	## Key Features

	1. Instruction Following & Explanation
	Trained to understand, follow, and respond to natural language instructions with clear, logical, and relevant output. Suitable for Q&A, step-by-step reasoning, and guided code generation.

	2. Lightweight General-Purpose Model
	Fine-tuned from Qwen2.5-0.5B, it is small enough to run on edge devices, in local tools, and in low-resource applications while remaining useful for everyday tasks.

	3. Multi-Domain Task Handling
	Handles coding, writing, summarization, chat, translation, and educational queries, thanks to its broad general-purpose instruction tuning.

	4. Compact and Efficient
	At just 0.5B parameters, Lang-Exster is optimized for fast inference, low memory usage, and seamless integration into developer tools and workflows.

	5. Code Assistance (Lite)
	Capable of basic code generation, syntax checking, and conceptual explanations, especially useful for beginners and instructional applications.
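
To make the "low memory usage" point concrete, here is a rough back-of-the-envelope estimate of the memory needed for the weights alone at different precisions. It assumes the nominal 0.5B parameter count from the model name; `weight_memory_gib` is a hypothetical helper, and the figures exclude activations, the KV cache, and framework overhead.

```python
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


# Assumption: nominal parameter count taken from the model name.
N_PARAMS = 0.5e9

for label, nbytes in [("float32", 4), ("float16/bfloat16", 2), ("int8", 1)]:
    print(f"{label:>17}: {weight_memory_gib(N_PARAMS, nbytes):.2f} GiB")
```

At 16-bit precision this comes to a little under 1 GiB for the weights, which is why a model of this size can plausibly fit on modest hardware.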

	## Quickstart with Transformers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "prithivMLmods/Lang-Exster-0.5B-Instruct"

	# Load the model and tokenizer. device_map="auto" requires the `accelerate` package.
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype="auto",
	    device_map="auto"
	)
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	prompt = "Write a Python function that checks if a number is prime, and explain how it works."

	# Build the conversation and render it with the model's chat template.
	messages = [
	    {"role": "system", "content": "You are an instructional assistant. Follow user instructions clearly and explain your reasoning."},
	    {"role": "user", "content": prompt}
	]
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True
	)
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	generated_ids = model.generate(
	    **model_inputs,
	    max_new_tokens=512
	)
	# Drop the prompt tokens so that only the newly generated tokens are decoded.
	generated_ids = [
	    output_ids[len(input_ids):]
	    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(response)
	```
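
For repeated calls, the same steps can be wrapped in two small helpers. This is a sketch, not part of the model's own API: `build_messages` and `generate_reply` are hypothetical names, and `generate_reply` expects a model and tokenizer that have already been loaded with `from_pretrained`.

```python
DEFAULT_SYSTEM_PROMPT = (
    "You are an instructional assistant. "
    "Follow user instructions clearly and explain your reasoning."
)


def build_messages(user_prompt, system_prompt=DEFAULT_SYSTEM_PROMPT):
    """Return a chat-format message list with one system and one user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate_reply(model, tokenizer, messages, max_new_tokens=512):
    """Render the chat template, generate, and decode only the new tokens."""
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
    generated_ids = model.generate(**model_inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt so the reply does not echo the input.
    prompt_len = model_inputs.input_ids.shape[1]
    new_tokens = generated_ids[:, prompt_len:]
    return tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0]
```

With a loaded `model` and `tokenizer`, a call would look like `generate_reply(model, tokenizer, build_messages("Explain recursion in two sentences."))`.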

	## Intended Use

	- General-Purpose Assistant:
	Performs everyday tasks such as Q&A, summarization, light coding, language generation, and translation.

	- Educational Support:
	Aids learners in understanding topics through guided explanations, basic coding help, and concept breakdowns.

	- Lightweight Developer Integration:
	Ideal for command-line assistants, browser plugins, and desktop utilities with limited compute resources.

	- Instruction Clarity Demonstrator:
	Serves as a baseline for developing and evaluating instruction-tuned capabilities in constrained environments.
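
As an illustration of the command-line assistant use case, the following sketch runs a simple chat loop and keeps the conversation history short, which suits a small model. All names here (`SYSTEM_PROMPT`, `trim_history`, `chat_loop`, `reply_fn`) are hypothetical; `reply_fn` stands in for any callable that maps a message list to a string, such as a wrapper around `model.generate`.

```python
SYSTEM_PROMPT = "You are an instructional assistant. Answer clearly and concisely."


def trim_history(messages, max_turns=4):
    """Keep the first system message plus the most recent 2 * max_turns other messages."""
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    if max_turns <= 0:
        return system
    return system + rest[-2 * max_turns:]


def chat_loop(reply_fn, read=input, write=print, max_turns=4):
    """Read user input until 'exit' or 'quit', answering each turn with reply_fn."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    while True:
        try:
            user_text = read("you> ").strip()
        except EOFError:
            break
        if user_text.lower() in {"exit", "quit"}:
            break
        if not user_text:
            continue
        messages.append({"role": "user", "content": user_text})
        messages = trim_history(messages, max_turns)
        answer = reply_fn(messages)
        messages.append({"role": "assistant", "content": answer})
        write(answer)
    return messages
```

Passing `read` and `write` as parameters keeps the loop easy to exercise without a terminal or a loaded model.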

	## Limitations

	1. Scale Limitations
	As a 0.5B-parameter model, it has limited capacity and may not handle deep context or long documents effectively.

	2. Reasoning Depth
	Provides surface-level reasoning and may struggle with highly technical, abstract, or creative prompts.

	3. Basic Code Generation
	Supports basic scripting and logic but may miss edge cases or advanced patterns in complex code.

	4. Prompt Design Sensitivity
	Performs best with clear, concise, and well-structured instructions.
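
One common way to work around the long-document limitation is to split the input into overlapping chunks and process each chunk with a short, explicit instruction. The sketch below is an assumption about how this might be done, not a documented feature of the model: `chunk_words` is a hypothetical helper, and it counts whitespace-separated words as a rough stand-in for tokens.

```python
def chunk_words(text, max_words=300, overlap=30):
    """Split text into chunks of at most max_words words, sharing `overlap` words between neighbors."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this chunk has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


def summarize_prompt(chunk):
    """Wrap a chunk in a clear, single-purpose instruction."""
    return f"Summarize the following passage in three sentences:\n\n{chunk}"
```

Each prompt produced by `summarize_prompt` can then be sent to the model separately, and the partial summaries combined in a final pass.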