---
license: apache-2.0
---
# RWKV v7 Potato Model Card
## Model Overview
- Name: RWKV v7 Potato
- Architecture: RWKV v7 with MoLE (Mixture of LoRA Experts)
- Base Model: RWKV-x070-World-0.4B-v2.9-20250107-ctx4096
- Parameter Count: 0.6B (540M)
- License: Apache 2.0
## Technical Specifications
- Training Approach: LoRA (r=256)
- Expert Configuration (see the sketch after this list):
  - Total LoRA Experts: 4
  - Active LoRA Experts: 2 (shared Expert 0 + one routed expert)
- End Token: `\n\n\x17`
- Inference: supported only by the latest RWKV-Infer
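
The expert configuration above can be pictured with a minimal, self-contained sketch. This is not the model's actual implementation: only r=256, four LoRA experts, and "shared Expert 0 plus one routed expert" come from this card; the router design, the `alpha` scaling, and names such as `MoLELinear` are illustrative assumptions.

```python
# Minimal Mixture-of-LoRA-Experts (MoLE) sketch in PyTorch.
# Hypothetical: router, init, and alpha are assumptions; only r=256, 4 experts,
# and "shared expert 0 + 1 routed expert" are taken from the model card.
import torch
import torch.nn as nn


class MoLELinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, n_experts: int = 4, r: int = 256, alpha: int = 256):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)  # frozen pretrained projection
        self.base.weight.requires_grad_(False)
        self.scale = alpha / r
        # One low-rank pair (A: d_in -> r, B: r -> d_out) per expert.
        self.A = nn.ParameterList([nn.Parameter(torch.randn(r, d_in) * 0.01) for _ in range(n_experts)])
        self.B = nn.ParameterList([nn.Parameter(torch.zeros(d_out, r)) for _ in range(n_experts)])
        # Router scores only experts 1..n-1; expert 0 is always active (shared).
        self.router = nn.Linear(d_in, n_experts - 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_in)
        y = self.base(x)
        # Shared Expert 0 is applied to every token.
        y = y + self.scale * ((x @ self.A[0].T) @ self.B[0].T)
        # Top-1 routing picks one additional expert per token.
        expert_idx = self.router(x).argmax(dim=-1) + 1    # expert ids 1..n_experts-1
        for e in range(1, len(self.A)):
            mask = (expert_idx == e).unsqueeze(-1).to(x.dtype)
            y = y + mask * (self.scale * ((x @ self.A[e].T) @ self.B[e].T))
        return y


# Usage: exactly two of the four experts contribute to each token's output.
layer = MoLELinear(d_in=512, d_out=512)
out = layer(torch.randn(1, 8, 512))   # -> shape (1, 8, 512)
```

Because each `B` starts at zero, the layer initially behaves exactly like the frozen base projection, which is the usual LoRA convention.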
## Language Support
- English
- Japanese
- Chinese
## Dataset
- Pre-instruct tuning on a CJE (Chinese/Japanese/English) set of 900k pairs
## Purpose and Use Case
This model is a proof-of-concept experiment investigating the effectiveness of the Mixture of LoRA Experts (MoLE) architecture in small-parameter large language models (LLMs).
## Limitations and Known Issues
The model's small parameter count (0.6B) significantly impacts its performance:
- Responses are consistently inaccurate
- Not suitable for production use or tasks requiring reliability
- Should be considered an experimental research model only
- Inference is slow because the LoRA experts must be applied or merged at runtime (illustrated below)
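
The speed limitation follows from the routing: since the routed expert can change per token, the LoRA deltas cannot be folded into the base weights once at load time, as is usually done with plain LoRA. The toy comparison below uses hypothetical shapes and names unrelated to RWKV-Infer's actual code; it only illustrates the extra low-rank matmuls paid at every decoding step.

```python
import torch

d, r = 1024, 256
W = torch.randn(d, d)           # frozen base weight
A = torch.randn(r, d) * 0.01    # LoRA A (d -> r)
B = torch.randn(d, r) * 0.01    # LoRA B (r -> d)
x = torch.randn(1, d)           # a single decoded token

# Plain LoRA: merge once at load time, then each step is one matmul.
W_merged = W + B @ A
y_merged = x @ W_merged.T

# MoLE-style runtime application: base matmul plus two extra low-rank matmuls
# (or a per-step re-merge) for every active expert, at every decoding step.
y_runtime = x @ W.T + (x @ A.T) @ B.T

assert torch.allclose(y_merged, y_runtime, atol=1e-2)  # same result, more work per step
```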
## Research Context
This implementation explores the viability of the MoLE architecture in resource-constrained environments, specifically examining how expert-mixture mechanisms perform in small-scale language models.
## License Information
This model is released under the Apache 2.0 license, allowing for both academic and commercial use with appropriate attribution.