# ReLaX-VQA
Official code for the following paper:

X. Wang, A. Katsenou, and D. Bull, "ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment."

## Performance
We evaluate the performance of ReLaX-VQA on four datasets. ReLaX-VQA has three variants based on the training and testing strategy:
- ReLaX-VQA: Trained and tested on each dataset with a random 80%-20% split.
- ReLaX-VQA (w/o FT): Trained on LSVQ; the frozen model was tested on the other datasets.
- ReLaX-VQA (w/ FT): Trained on LSVQ, then fine-tuned on each of the other datasets.
### Spearman's Rank Correlation Coefficient (SRCC)
| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8643 | 0.8535 | 0.7655 | 0.8014 |
| ReLaX-VQA (w/o FT) | 0.7845 | 0.8312 | 0.7664 | 0.8104 |
| ReLaX-VQA (w/ FT) | 0.8974 | 0.8720 | 0.8468 | 0.8469 |
### Pearson's Linear Correlation Coefficient (PLCC)
| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8895 | 0.8473 | 0.8079 | 0.8204 |
| ReLaX-VQA (w/o FT) | 0.8336 | 0.8427 | 0.8242 | 0.8354 |
| ReLaX-VQA (w/ FT) | 0.9294 | 0.8668 | 0.8876 | 0.8652 |
More results can be found in `reported_result.ipynb`.
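For reference, both metrics can be computed directly with `scipy`: SRCC measures rank (monotonic) agreement and PLCC measures linear agreement between predicted scores and ground-truth MOS. A minimal sketch with dummy values follows; note that VQA papers often fit a nonlinear logistic mapping before reporting PLCC, which this sketch omits:

```python
from scipy.stats import spearmanr, pearsonr

# Hypothetical predicted quality scores and ground-truth MOS for a test set
predicted = [3.2, 4.1, 2.8, 3.9, 4.5]
mos       = [3.0, 4.3, 2.5, 4.0, 4.6]

srcc, _ = spearmanr(predicted, mos)  # rank correlation (monotonicity)
plcc, _ = pearsonr(predicted, mos)   # linear correlation
print(f"SRCC: {srcc:.4f}, PLCC: {plcc:.4f}")
```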
## Proposed Model
The figure shows an overview of the proposed ReLaX-VQA framework. The architectures of the ResNet-50 Stack (I) and ResNet-50 Pool (II) branches are provided in Fig. 2 of the paper.

## Usage
### 🛠 Install Requirements
The repository is built with Python 3.10.14 and can be set up via the following commands:
```bash
git clone https://github.com/xinyiW915/ReLaX-VQA.git
cd ReLaX-VQA
conda create -n relaxvqa python=3.10.14 -y
conda activate relaxvqa
pip install -r requirements.txt
```
### 📥 Download UGC Datasets
The corresponding raw video datasets can be downloaded from the following sources:
LSVQ, KoNViD-1k, LIVE-VQC, YouTube-UGC, CVD2014.
The metadata for the UGC datasets used in our experiments is available under `./metadata`.

Once downloaded, place the datasets in `./ugc_original_videos` or any other storage location of your choice. Ensure that the `video_path` in the `get_video_paths` function inside `main_relaxvqa_feats.py` is updated accordingly.
### 🎬 Test Demo
Run the pre-trained models to evaluate the quality of a single video. The model weights provided in `./model` are the best-performing weights saved during training.

To evaluate the quality of a specific video, run the following command:
```bash
python demo_test_gpu.py \
  -device <DEVICE> \
  -train_data_name <TRAIN_DATA_NAME> \
  -is_finetune <True/False> \
  -save_path <MODEL_PATH> \
  -video_type <DATASET_NAME> \
  -video_name <VIDEO_NAME> \
  -framerate <FRAMERATE>
```
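For example, a filled-in call might look like the following; the video name stays a placeholder, and the framerate value here is an assumption, so substitute values from your own setup:

```bash
python demo_test_gpu.py \
  -device gpu \
  -train_data_name lsvq_train \
  -is_finetune True \
  -save_path ../model/ \
  -video_type youtube_ugc \
  -video_name <VIDEO_NAME> \
  -framerate 30
```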
Or simply try our demo video by running:
```bash
python demo_test_gpu.py
```
### 🧪 How to Use the Pretrained Model
You can download and load the model using `huggingface_hub`:
```python
from huggingface_hub import hf_hub_download
import torch

# Download the pretrained model weights from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="xinyiW915/ReLaX-VQA",
    filename="model/lsvq_train_relaxvqa_byrmse_trained_median_model_param_onLSVQ_TEST.pth"
)

# map_location allows loading the checkpoint on CPU-only machines
state_dict = torch.load(model_path, map_location="cpu")

# fix_state_dict is a helper provided in this repository (e.g. to normalise
# key names in the saved state dict before loading)
fixed_state_dict = fix_state_dict(state_dict)
model.load_state_dict(fixed_state_dict)  # use with your instantiated model class
```
## Training
Steps to train ReLaX-VQA from scratch on different datasets.
### Extract Features
Run the following command to extract features from videos:
```bash
python main_relaxvqa_feats.py -device gpu -video_type youtube_ugc
```
### Train Model
Train the model on the extracted features:
```bash
python model_regression_simple.py -data_name youtube_ugc -feature_path ../features/ -save_path ../model/
```
For LSVQ, train the model using:
```bash
python model_regression.py -data_name lsvq_train -feature_path ../features/ -save_path ../model/
```
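Conceptually, the regression stage maps each video's extracted feature vector to a single quality score. The following is a minimal, illustrative PyTorch sketch of such a regressor; the feature dimension, layer sizes, and training loop are assumptions, not the exact architecture in `model_regression.py`:

```python
import torch
import torch.nn as nn

# Illustrative only: the real feature dimension and architecture live in
# model_regression_simple.py / model_regression.py.
class QualityRegressor(nn.Module):
    def __init__(self, in_dim=4096, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # one predicted quality score per video
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

model = QualityRegressor()
criterion = nn.MSELoss()  # squared-error objective; checkpoints are named 'byrmse'
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

features = torch.randn(8, 4096)  # dummy batch of video-level features
mos = torch.rand(8) * 4 + 1      # dummy MOS targets in [1, 5]

optimizer.zero_grad()
loss = criterion(model(features), mos)
loss.backward()
optimizer.step()
```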
## Fine-Tuning
To fine-tune the pre-trained model on a new dataset, set `train_data_name` to the dataset used for pre-training and `test_data_name` to the dataset you want to fine-tune on, then run:
```bash
python model_finetune.py
```
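Conceptually, fine-tuning reuses the LSVQ-pretrained weights as initialisation and continues training on the target dataset, typically with a smaller learning rate. A minimal, hedged sketch follows; it reuses the illustrative `QualityRegressor` from the training sketch above and is not the exact procedure in `model_finetune.py`:

```python
import torch
import torch.nn as nn

# Reuses the illustrative QualityRegressor sketched in the Training section.
model = QualityRegressor()
model.load_state_dict(
    torch.load("lsvq_pretrained.pth", map_location="cpu")  # hypothetical checkpoint path
)

# Fine-tune with a smaller learning rate so the pretrained weights shift gently.
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
for features, mos in finetune_loader:  # DataLoader over the target dataset (assumed)
    optimizer.zero_grad()
    loss = criterion(model(features), mos)
    loss.backward()
    optimizer.step()
```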
## Ablation Study
A detailed analysis of the different components in ReLaX-VQA.

### Spatio-Temporal Fragmentation & DNN Layer Stacking
Key techniques used in ReLaX-VQA (a minimal sketch of the fragmentation idea follows this list):

Fragmentation with DNN layer stacking:
```bash
python feature_fragment_layerstack.py
```
Fragmentation with DNN layer pooling:
```bash
python feature_fragment_pool.py
```
Frame with DNN layer stacking:
```bash
python feature_layerstack.py
```
Frame with DNN layer pooling:
```bash
python feature_pool.py
```
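To make the fragmentation idea concrete, here is a minimal, self-contained sketch of sampling fragments from the residual between consecutive frames; the patch size, fragment count, and selection rule are illustrative assumptions, not the exact procedure in the scripts above:

```python
import numpy as np

def residual_fragments(prev_frame, curr_frame, patch=32, num_patches=16):
    """Sample the patches with the largest temporal residual energy.

    prev_frame / curr_frame: HxWx3 uint8 arrays from consecutive video frames.
    Returns a list of (patch, patch, 3) crops taken from the current frame.
    """
    residual = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    h, w = residual.shape[:2]
    scores = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            energy = residual[y:y + patch, x:x + patch].sum()
            scores.append((energy, y, x))
    # Keep the patches where the most temporal change happened.
    scores.sort(reverse=True)
    return [curr_frame[y:y + patch, x:x + patch]
            for _, y, x in scores[:num_patches]]
```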
## Other Utilities

### Excluding Greyscale Videos
We exclude greyscale videos in our experiments. You can use `check_greyscale.py` to filter out greyscale videos from the VQA dataset you want to use:
```bash
python check_greyscale.py
```
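For reference, greyscale detection can be done by checking whether a video's colour channels are (near-)identical. A minimal sketch with OpenCV follows; the frame count and tolerance are assumptions, and this is not necessarily the exact logic in `check_greyscale.py`:

```python
import cv2
import numpy as np

def is_greyscale(video_path, num_frames=5):
    """Heuristic: a video is greyscale if its colour channels are (near-)identical."""
    cap = cv2.VideoCapture(video_path)
    checked = 0
    while checked < num_frames:
        ok, frame = cap.read()
        if not ok:
            break
        b, g, r = cv2.split(frame)  # OpenCV frames are BGR
        # Allow a small tolerance for compression noise in the chroma planes.
        if np.mean(np.abs(b.astype(int) - g.astype(int))) > 1 or \
           np.mean(np.abs(g.astype(int) - r.astype(int))) > 1:
            cap.release()
            return False
        checked += 1
    cap.release()
    return True
```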
### Metadata Extraction
For easy extraction of metadata from your VQA dataset, use:
```bash
python extract_metadata_NR.py
```
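As a rough illustration of what such metadata extraction involves, the basic video properties can be read with OpenCV; the fields below are an assumption about what is useful, not necessarily what `extract_metadata_NR.py` outputs:

```python
import cv2

def video_metadata(video_path):
    """Read basic properties (resolution, framerate, length) from a video file."""
    cap = cv2.VideoCapture(video_path)
    meta = {
        "width": int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        "height": int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
        "framerate": cap.get(cv2.CAP_PROP_FPS),
        "frame_count": int(cap.get(cv2.CAP_PROP_FRAME_COUNT)),
    }
    cap.release()
    return meta
```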
## Acknowledgment
This work was funded by the UKRI MyWorld Strength in Places Programme (SIPF00006/1) as part of my PhD study.
## Citation
If you find this paper and the repo useful, please cite our paper 😊:
```bibtex
@article{wang2024relax,
  title={ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment},
  author={Wang, Xinyi and Katsenou, Angeliki and Bull, David},
  year={2024},
  eprint={2407.11496},
  archivePrefix={arXiv},
  primaryClass={eess.IV},
  url={https://arxiv.org/abs/2407.11496},
}
```
## Contact
Xinyi WANG, [email protected]