---
license: mit
---

# **Phi-3.5-moe-mlx-int4**

<b><span style="text-decoration:underline">Note: This is an unofficial version, intended only for testing and development.</span></b>

This is an INT4 quantized model of Phi-3.5-MoE-Instruct, converted with the Apple MLX framework. You can deploy it on Apple Silicon devices (M1, M2, M3).

## Installation

```bash
pip install -U mlx-lm
```
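
If you want to double-check that MLX is installed correctly before converting a large MoE model, a quick device query is enough. This is just an optional sanity check, not part of the original instructions:

```python
# Optional sanity check: MLX should report the Apple Silicon GPU as its default device.
import mlx.core as mx

print(mx.default_device())  # e.g. Device(gpu, 0) on an M1/M2/M3 machine
```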
## Conversion

```bash
python -m mlx_lm.convert --hf-path microsoft/Phi-3.5-MoE-instruct --mlx-path ./phi-3.5-moe-mlx-int4 -q
```
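
The `-q` flag quantizes the weights (4-bit by default) while converting, and `--mlx-path` writes the result to the folder that the sample below loads. If you prefer to drive the conversion from Python, mlx-lm exposes the same functionality as a `convert` function; this is a minimal sketch, and keyword defaults may vary slightly between mlx-lm versions:

```python
# Minimal sketch: convert + quantize from Python instead of the CLI.
from mlx_lm import convert

convert(
    "microsoft/Phi-3.5-MoE-instruct",    # Hugging Face repo to convert
    mlx_path="./phi-3.5-moe-mlx-int4",   # output folder expected by the sample below
    quantize=True,                       # 4-bit quantization is the default when enabled
)
```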
## Samples

```python
from mlx_lm import load, generate

model, tokenizer = load("./phi-3.5-moe-mlx-int4")

sys_msg = """You are a helpful AI assistant. You are an agent capable of using a variety of tools to answer a question. Here are a few of the tools available to you:

- Blog: This tool helps you describe a certain knowledge point and content, and finally write it into Twitter or Facebook style content.
- Translate: This tool helps you translate content into any language, using plain language as required.

To use these tools you must always respond in JSON format containing `"tool_name"` and `"input"` key-value pairs. For example, to answer the request "Build Multi Agents with MoE models" you must use the Blog tool like so:

{
    "tool_name": "Blog",
    "input": "Build Multi Agents with MoE models"
}

Or to translate the question "can you introduce yourself in Chinese" you must respond:

{
    "tool_name": "Translate",
    "input": "can you introduce yourself in Chinese"
}

Remember to output only the final result, in JSON format containing `"agentid"`, `"tool_name"`, `"input"` and `"output"` key-value pairs:

[
    {
        "agentid": "step1",
        "tool_name": "Blog",
        "input": "Build Multi Agents with MoE models",
        "output": "........."
    },
    {
        "agentid": "step2",
        "tool_name": "Translate",
        "input": "can you introduce yourself in Chinese",
        "output": "........."
    },
    {
        "agentid": "final",
        "tool_name": "Result",
        "output": "........."
    }
]

The user's request is as follows.
"""

query = 'Write something about Generative AI with MoE, and translate it to Chinese'

prompt = tokenizer.apply_chat_template(
    [{"role": "system", "content": sys_msg}, {"role": "user", "content": query}],
    tokenize=False,
    add_generation_prompt=True,
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=1024, verbose=True)
```
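
The system prompt above asks the model to reply with a JSON array of agent steps, so it is often convenient to parse the reply programmatically. The snippet below is a minimal sketch built on that assumption; the model is only instructed to emit JSON, so the output is not guaranteed to parse:

```python
# Minimal sketch: try to parse the agent-style JSON array returned above.
import json

try:
    steps = json.loads(response)
except json.JSONDecodeError:
    steps = None

if isinstance(steps, list):
    for step in steps:
        if isinstance(step, dict):
            print(step.get("agentid"), "->", step.get("tool_name"))
            print(step.get("output"))
else:
    # Fall back to the raw text if the model did not return valid JSON.
    print(response)
```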