QiMing
An AI that rewrites its own rules for greater intelligence.
Result = Model Content × Math²
"Logic is the soul of a model, for it defines:
- How it learns from data (The Power of Induction);
- How it reasons and decides (The Power of Deduction);
- Its capacity to align with human values (The Ethical Boundary);
- Its potential to adapt to future challenges (The Evolutionary Potential).
If a model pursues nothing but sheer scale or computational power, ignoring the depth and breadth of its logic, it risks becoming a 'paper tiger'—imposing on the surface, yet hollow at its core. Conversely, a model built upon elegant logic, even with fewer parameters, can unleash its true vitality in our complex world."
DISCLAIMER
The content generated by this model is for reference purposes only. Users are advised to verify its accuracy independently before use.
This is a 20-billion-parameter (20B) foundation model. It may produce incomplete or inaccurate information, including hallucinations.
If you find this AI too human-like, please remember: it is merely a more intelligent model — not an actual person.
Acknowledgments
- mradermacher: for creating the GGUF versions of these models:
  https://huggingface.co/mradermacher/QiMing-Sales-20B-MXFP4-GGUF
  https://huggingface.co/mradermacher/QiMing-Sales-20B-MXFP4-i1-GGUF
- aifeifei798: for developing the foundational model (aifeifei798/QiMing-Sales-20B-MXFP4) used in this project.
- unsloth.ai (Unsloth): for their work enabling these models to run smoothly on standard hardware, such as a Google Colab T4 with 16GB of VRAM.
- Google Colab: for providing the T4 16GB GPU environment.
QiMing-Sales-20B-MXFP4
Model Description
QiMing-Sales-20B-MXFP4 is not just a sales chatbot; it is a sophisticated Cognitive Simulator designed for B2B sales expertise. Fine-tuned from the powerful gpt-oss-20B foundational model, QiMing-Sales-20B-MXFP4 has been meticulously trained on a proprietary, synthetically generated dataset that embodies the core principles of modern sales science.
The model's key innovation lies in its architecture, which functions as a "Model Capability Control Layer". This allows it to dynamically adopt different professional personas—from a junior Intern to a strategic CEO—and apply specific, context-aware sales logic to a given situation.
Its core capabilities are structured around distinct "Logic Modules":
- Diagnosis & Amplification Logic (Pain -> Impact): Diagnose a client's surface-level pain and connect it to deep, strategic business impacts.
- Solution Matching Logic (Feature -> Value): Translate abstract technical features into tangible, quantifiable business value.
- Objection Handling Logic (Objection -> Reframe): Reframe customer objections into opportunities for deeper value discussions.
- Action Guidance Logic (Status -> Next Step): Propose clear, actionable next steps to maintain deal momentum.
- Comparison & Selection Logic (Comparison -> Recommendation): Provide consultative advice by comparing solutions based on the client's core needs.
- Strategic Synthesis Logic (Strategic Synthesis): Combine multiple logic modules to address complex, high-level strategic challenges from decision-makers.
This allows QiMing-Sales-20B-MXFP4 to go beyond generic advice, providing responses that are contextually aware, role-appropriate, and strategically sound, making it a powerful co-pilot for sales professionals at all levels.
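The role-to-module routing described above can be sketched as a minimal control layer. This is illustrative only: the role names and module keys below are assumptions for the sketch, not the model's actual internal representation.

```python
# Illustrative sketch of the "Model Capability Control Layer" routing.
# Role names and logic-module keys are hypothetical examples, not the
# model's actual internal format.

ROLE_MODULES = {
    "Intern": ["pain_diagnosis"],
    "Account Executive": ["pain_diagnosis", "solution_matching", "objection_handling"],
    "CEO": ["pain_diagnosis", "solution_matching", "objection_handling",
            "action_guidance", "comparison_selection", "strategic_synthesis"],
}

def build_instruction(role: str, scenario: str) -> str:
    """Compose a persona-scoped instruction for a given sales scenario."""
    modules = ROLE_MODULES.get(role, ["pain_diagnosis"])
    return (
        f"You are a {role} in B2B sales. "
        f"Apply these logic modules: {', '.join(modules)}. "
        f"Scenario: {scenario}"
    )

print(build_instruction("CEO", "The client says our SaaS platform is too expensive."))
```

A junior role is routed to a single diagnostic module, while executive roles combine several, mirroring the Strategic Synthesis behavior described above.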
Intended Uses
- Sales Training & Role-Playing: Simulate various customer interactions for training new and experienced sales staff.
- Sales Call & Email Generation: Assist in drafting scripts, emails, and proposals that are logically sound and persuasive.
- Strategic Deal Coaching: Act as a sparring partner for sales leaders to brainstorm strategies for complex deals.
- Sales Enablement Content Creation: Generate high-quality content, such as case studies and value propositions.
Limitations and Ethical Considerations
- Knowledge Cutoff: The model's knowledge is based on its training data and does not have access to real-time information.
- Potential for Hallucination: Like all LLMs, QiMing-Sales-20B-MXFP4 can occasionally generate plausible but incorrect information ("hallucinate"). All outputs, especially quantitative metrics, should be verified by a human expert.
- Bias: The model's "expert" persona is defined by its training data, which reflects a specific B2B sales philosophy. This perspective may not be universally applicable to all industries or cultures.
- Not a Replacement for Human Judgment: The model is intended to be a co-pilot, not an autonomous agent. Final sales decisions and client communications should always be reviewed and owned by a human professional.
Training Procedure
Training Data
QiMing-Sales-20B-MXFP4 was not trained on scraped web data. It was fine-tuned on a dataset of approximately 500 high-quality, structured JSON examples, synthetically generated by a proprietary, Python-based "Meta-Prompt Factory".
This generator systematically combines elements from three core components:
- SALES_WORLDS: A knowledge base defining realistic B2B scenarios across industries like SaaS, High-End Manufacturing, and Professional Services.
- ROLE_HIERARCHY: A framework that maps sales roles (from Intern to CEO) to the specific logic modules they are expected to master.
- LOGIC_MODULES: A library of core sales reasoning patterns (e.g., Pain Diagnosis, Objection Handling).
Each data point consists of a structured `instruction`, an `input`, and an ideal `output`, guided by a gold-standard few-shot example. This methodology ensures that the model learns not just to mimic language, but to internalize and apply complex, role-specific reasoning frameworks.
Fine-tuning
The model was fine-tuned from the gpt-oss-20B base model using the generated instruction-following dataset. The training process focused on teaching the model to accurately interpret the "Model Capability Control Layer" (`instruction`), apply the specified role and logic to the provided context (`input`), and generate a high-quality, structured response (`output`).
Highlights
- Permissive Apache 2.0 license: Build freely without copyleft restrictions or patent risk—ideal for experimentation, customization, and commercial deployment.
- Configurable reasoning effort: Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs.
- Full chain-of-thought: Gain complete access to the model’s reasoning process, facilitating easier debugging and increased trust in outputs. The chain-of-thought is not intended to be shown to end users.
- Fine-tunable: Fully customize models to your specific use case through parameter fine-tuning.
- Agentic capabilities: Use the models’ native capabilities for function calling, web browsing, Python code execution, and Structured Outputs.
- MXFP4 quantization: The models were post-trained with MXFP4 quantization of the MoE weights, allowing QiMing-Sales-20B-MXFP4 to run within 16GB of memory. All evals were performed with the same MXFP4 quantization.
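A rough back-of-the-envelope check of why MXFP4 fits in 16GB. The 4.25 bits-per-weight figure is an approximation (4-bit values plus shared per-block scales); the exact overhead varies with block size.

```python
# Approximate weight-memory footprint of a 20B-parameter model under MXFP4.
params = 20e9                 # ~20B parameters
bits_per_weight = 4.25        # ~4-bit values + per-block scale overhead (approximation)
weight_gb = params * bits_per_weight / 8 / 1e9
print(f"~{weight_gb:.1f} GB for weights")  # prints "~10.6 GB for weights"
```

That leaves several gigabytes of headroom for activations and the KV cache within a 16GB budget.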
Inference examples
Transformers
You can use QiMing-Sales-20B-MXFP4 with Transformers. If you use the Transformers chat template, it will automatically apply the harmony response format. If you use `model.generate` directly, you need to apply the harmony format manually via the chat template, or use the openai-harmony package.
To get started, install the necessary dependencies to set up your environment:
pip install -U transformers kernels torch
Once set up, you can run the model with the snippet below:
from transformers import pipeline
import torch

model_id = "aifeifei798/QiMing-Sales-20B-MXFP4"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain quantum mechanics clearly and concisely."},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
Alternatively, you can run the model via Transformers Serve to spin up an OpenAI-compatible webserver:
transformers serve
transformers chat localhost:8000 --model-name-or-path aifeifei798/QiMing-Sales-20B-MXFP4
Learn more about how to use gpt-oss with Transformers.
vLLM
vLLM recommends using uv for Python dependency management. You can use vLLM to spin up an OpenAI-compatible webserver. The following command will automatically download the model and start the server.
uv pip install --pre vllm==0.10.1+gptoss \
--extra-index-url https://wheels.vllm.ai/gpt-oss/ \
--extra-index-url https://download.pytorch.org/whl/nightly/cu128 \
--index-strategy unsafe-best-match
vllm serve aifeifei798/QiMing-Sales-20B-MXFP4
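Once the server is running, any OpenAI-compatible client can talk to it. Below is a minimal sketch that only builds the request payload; the base URL `http://localhost:8000/v1` is vLLM's default and is an assumption about your local setup.

```python
def build_chat_request(user_message: str, reasoning: str = "medium") -> dict:
    """Build an OpenAI-style chat completion payload for the local server."""
    return {
        "model": "aifeifei798/QiMing-Sales-20B-MXFP4",
        "messages": [
            {"role": "system", "content": f"Reasoning: {reasoning}"},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": 256,
    }

payload = build_chat_request("Draft a follow-up email after a discovery call.")
# POST this to http://localhost:8000/v1/chat/completions, or pass the same
# fields to an OpenAI SDK client configured with base_url="http://localhost:8000/v1".
print(payload["model"])
```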
Learn more about how to use gpt-oss with vLLM.
PyTorch / Triton
To learn about how to use this model with PyTorch and Triton, check out our reference implementations in the gpt-oss repository.
LM Studio
If you are using LM Studio, you can use the following command to download the model.
# QiMing-Sales-20B-MXFP4
lms get aifeifei798/QiMing-Sales-20B-MXFP4
Check out our awesome list for a broader collection of gpt-oss resources and inference partners.
Download the model
You can download the model with the Hugging Face CLI:
# QiMing-Sales-20B-MXFP4
huggingface-cli download aifeifei798/QiMing-Sales-20B-MXFP4 --local-dir QiMing-Sales-20B-MXFP4/
pip install gpt-oss
python -m gpt_oss.chat QiMing-Sales-20B-MXFP4/
Reasoning levels
You can adjust the reasoning level that suits your task across three levels:
- Low: Fast responses for general dialogue.
- Medium: Balanced speed and detail.
- High: Deep and detailed analysis.
The reasoning level can be set in the system prompts, e.g., "Reasoning: high".
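With the Transformers chat format shown earlier, the level can be selected via a system message. A minimal sketch:

```python
reasoning_level = "high"  # one of: low, medium, high

messages = [
    {"role": "system", "content": f"Reasoning: {reasoning_level}"},
    {"role": "user", "content": "Analyze why this enterprise deal stalled after the pilot."},
]
# Pass `messages` to the pipeline exactly as in the Transformers example above.
print(messages[0]["content"])
```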
Tool use
The gpt-oss models are excellent for:
- Web browsing (using built-in browsing tools)
- Function calling with defined schemas
- Agentic operations like browser tasks
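For function calling, tools are typically described with JSON-Schema-style definitions. A hedged sketch of one such tool follows; the function name and fields are invented for illustration, not part of the model or any real CRM API.

```python
# Hypothetical tool definition in the common OpenAI-style schema format.
crm_lookup_tool = {
    "type": "function",
    "function": {
        "name": "lookup_account",  # invented example name
        "description": "Fetch a customer account record by company name.",
        "parameters": {
            "type": "object",
            "properties": {
                "company": {"type": "string", "description": "Company name"},
            },
            "required": ["company"],
        },
    },
}
# Pass a list like [crm_lookup_tool] as the `tools` argument of an
# OpenAI-compatible chat completion request.
print(crm_lookup_tool["function"]["name"])
```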
Fine-tuning
QiMing-Sales-20B-MXFP4 models can be fine-tuned for a variety of specialized use cases.
This smaller model, QiMing-Sales-20B-MXFP4, can be fine-tuned on consumer hardware.
Citation
If you use QiMing-Sales-20B-MXFP4 in your research or application, please cite the model creator.
@software{qiming_sales_2025,
  author = {aifeifei798},
  title = {QiMing-Sales-20B-MXFP4: A Cognitive Simulator for B2B Sales Expertise},
  month = {September},
  year = {2025},
  url = {https://huggingface.co/aifeifei798/QiMing-Sales-20B-MXFP4}
}