---
license: mit
pipeline_tag: time-series-forecasting
---

# VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Visual Backbones

This repository hosts the **VisionTS++** model, a state-of-the-art time series foundation model based on continual pre-training of a visual Masked AutoEncoder (MAE) on large-scale time series data. It excels in multivariate and probabilistic time series forecasting by bridging modality gaps between vision and time series data.

The model was introduced in the paper:

[**VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Vision Backbones**](https://arxiv.org/abs/2508.04379)

Official GitHub repository: [https://github.com/HALF111/VisionTSpp](https://github.com/HALF111/VisionTSpp)

Experience **VisionTS++** directly in your browser on the [Hugging Face Space](https://huggingface.co/spaces/Lefei/VisionTSpp)! You can upload your own custom time series CSV file for zero-shot forecasting.

## About

VisionTS++ is built upon continual pre-training of a vision model on large-scale time series, addressing key discrepancies in cross-modal transfer from vision to time series. It introduces three key innovations:

1. **Vision-model-based filtering**: Identifies high-quality sequences to stabilize pre-training and mitigate the data-modality gap.
2. **Colorized multivariate conversion**: Encodes multivariate series as multi-subfigure RGB images to enhance cross-variate modeling.
3. **Multi-quantile forecasting**: Uses parallel reconstruction heads to generate quantile forecasts for probabilistic predictions without parametric assumptions.
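
To make the colorized multivariate conversion more concrete, here is a minimal sketch of one way a multivariate series could be laid out as an RGB image with one subfigure per variate. It is an illustration under stated assumptions, not the VisionTS++ implementation: the function name `series_to_rgb_image`, the `period` parameter, the fold-by-period layout, the per-variate min-max scaling, and the R, G, B channel cycling are all hypothetical choices made for this example.

```python
import numpy as np


def series_to_rgb_image(series: np.ndarray, period: int) -> np.ndarray:
    """Render a (num_variates, length) array as one RGB image of stacked subfigures.

    Assumed layout (hypothetical): each variate is folded into a
    (period, length // period) patch and written to a single colour channel
    that cycles R, G, B, so neighbouring variates get different colours.
    """
    num_variates, length = series.shape
    num_cycles = length // period
    if num_cycles == 0:
        raise ValueError("series must contain at least one full period")

    image = np.zeros((num_variates * period, num_cycles, 3), dtype=np.float32)
    for i, variate in enumerate(series):
        # Keep only whole periods (dropping the oldest points), then fold the
        # series so that each image column holds one period.
        trimmed = variate[length - num_cycles * period:]
        patch = trimmed.reshape(num_cycles, period).T
        # Min-max scale each variate to [0, 1] independently.
        lo, hi = patch.min(), patch.max()
        patch = (patch - lo) / (hi - lo) if hi > lo else np.zeros_like(patch)
        # Place the patch in its own row band and colour channel.
        rows = slice(i * period, (i + 1) * period)
        image[rows, :, i % 3] = patch
    return image


# Four toy variates of length 96 with an assumed period of 24.
t = np.arange(96)
demo = np.stack([
    np.sin(2 * np.pi * t / 24),
    np.cos(2 * np.pi * t / 24),
    t / 95.0,
    -t / 95.0,
])
img = series_to_rgb_image(demo, period=24)
print(img.shape)  # (96, 4, 3)
```

Giving each variate its own band and colour keeps the variates visually separable within a single image, which is the intuition behind the multi-subfigure encoding described above.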

These innovations allow VisionTS++ to achieve state-of-the-art performance in both in-distribution and out-of-distribution forecasting, demonstrating that vision models can generalize effectively to time series forecasting with appropriate adaptation.
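
For readers unfamiliar with quantile forecasting, the sketch below shows the pinball (quantile) loss, a standard objective for training one output per quantile level without assuming a parametric distribution. Whether VisionTS++ uses exactly this objective is an assumption here; the function names `pinball_loss` and `multi_quantile_loss` and the toy numbers are hypothetical and only illustrate the idea of several parallel heads, each responsible for one quantile.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss of forecasts y_pred at quantile level q."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(q * diff, (q - 1) * diff)
    return total / len(y_true)


def multi_quantile_loss(y_true, preds_by_quantile):
    """Mean pinball loss over several heads, one forecast list per quantile level."""
    losses = [pinball_loss(y_true, preds, q) for q, preds in preds_by_quantile.items()]
    return sum(losses) / len(losses)


# Toy example: three hypothetical heads forecasting the 10%, 50% and 90% quantiles.
y_true = [10.0, 12.0, 14.0]
heads = {
    0.1: [8.0, 10.0, 12.0],
    0.5: [10.0, 12.0, 14.0],
    0.9: [12.0, 14.0, 16.0],
}
print(multi_quantile_loss(y_true, heads))
```

A low quantile head is penalized lightly for predicting below the target and heavily for predicting above it, and the reverse holds for a high quantile head, so together the heads trace out a prediction interval.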

<div align="center">
<img src="figures/teaser.png" style="width:80%;" />
</div>

<div align="center">
<img src="figures/main_figure.png" style="width:100%;" />
</div>

## Installation

The VisionTS++ model is available through the `visionts` package on PyPI. Install it with pip:

```shell
pip install visionts
```
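
To confirm that the install succeeded, a quick check with the Python standard library can report the installed version. This is a minimal sketch: the helper name `installed_version` is hypothetical, and it only assumes that the distribution is named `visionts`, as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("visionts")
if found:
    print(f"visionts {found} is installed")
else:
    print("visionts is not installed; run `pip install visionts`")
```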

If you want to modify the inference code, you can instead install from source in editable mode:

```shell
git clone https://github.com/HALF111/VisionTSpp.git
cd VisionTSpp
pip install -e .
```

For detailed inference examples, including visualizations of the image reconstruction, see the `demo.ipynb` notebook in the [official GitHub repository](https://github.com/HALF111/VisionTSpp/blob/main/demo.ipynb).

## Citation

If you use VisionTS++ or VisionTS in your research or applications, please cite them using the following BibTeX entries:

```bibtex
@misc{chen2024visionts,
  title={VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters},
  author={Mouxiang Chen and Lefei Shen and Zhuo Li and Xiaoyun Joy Wang and Jianling Sun and Chenghao Liu},
  year={2024},
  eprint={2408.17253},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2408.17253},
}

@misc{shen2025visiontspp,
  title={VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Visual Backbones},
  author={Lefei Shen and Mouxiang Chen and Xu Liu and Han Fu and Xiaoxue Ren and Jianling Sun and Zhuo Li and Chenghao Liu},
  year={2025},
  eprint={2508.04379},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2508.04379},
}
```