# ReLaX-VQA
Official code for the following paper:

X. Wang, A. Katsenou, and D. Bull, "ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment."

## Performance
We evaluate the performance of ReLaX-VQA on four datasets. ReLaX-VQA has three variants based on the training and testing strategy:
- ReLaX-VQA: Trained and tested on each dataset with a random 80%-20% split.
- ReLaX-VQA (w/o FT): Trained on LSVQ; the frozen model was tested on the other datasets.
- ReLaX-VQA (w/ FT): Trained on LSVQ, then fine-tuned on each of the other datasets.
### Spearman's Rank Correlation Coefficient (SRCC)
| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8643 | 0.8535 | 0.7655 | 0.8014 |
| ReLaX-VQA (w/o FT) | 0.7845 | 0.8312 | 0.7664 | 0.8104 |
| ReLaX-VQA (w/ FT) | 0.8974 | 0.8720 | 0.8468 | 0.8469 |
### Pearson's Linear Correlation Coefficient (PLCC)
| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8895 | 0.8473 | 0.8079 | 0.8204 |
| ReLaX-VQA (w/o FT) | 0.8336 | 0.8427 | 0.8242 | 0.8354 |
| ReLaX-VQA (w/ FT) | 0.9294 | 0.8668 | 0.8876 | 0.8652 |
More results can be found in `reported_result.ipynb`.
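For reference, both metrics can be computed directly with `scipy`: SRCC measures rank (monotonic) agreement and PLCC measures linear agreement between predicted scores and ground-truth MOS. A minimal sketch with dummy values follows; note that VQA papers often fit a nonlinear logistic mapping before reporting PLCC, which this sketch omits:

```python
from scipy.stats import spearmanr, pearsonr

# Hypothetical predicted quality scores and ground-truth MOS for a test set
predicted = [3.2, 4.1, 2.8, 3.9, 4.5]
mos       = [3.0, 4.3, 2.5, 4.0, 4.6]

srcc, _ = spearmanr(predicted, mos)  # rank correlation (monotonicity)
plcc, _ = pearsonr(predicted, mos)   # linear correlation
print(f"SRCC: {srcc:.4f}, PLCC: {plcc:.4f}")
```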
## Proposed Model
The figure shows an overview of the proposed ReLaX-VQA framework. The architectures of the ResNet-50 Stack (I) and ResNet-50 Pool (II) branches are provided in Fig. 2 of the paper.

## Usage
### 🛠 Install Requirements
The repository is built with Python 3.10.14 and can be set up via the following commands:
```bash
git clone https://github.com/xinyiW915/ReLaX-VQA.git
cd ReLaX-VQA
conda create -n relaxvqa python=3.10.14 -y
conda activate relaxvqa
pip install -r requirements.txt
```
### 📥 Download UGC Datasets
The corresponding raw video datasets can be downloaded from the following sources:
LSVQ, KoNViD-1k, LIVE-VQC, YouTube-UGC, CVD2014.
The metadata for the UGC datasets used in our experiments is available under `./metadata`.

Once downloaded, place the datasets in `./ugc_original_videos` or any other storage location of your choice. Ensure that the `video_path` in the `get_video_paths` function inside `main_relaxvqa_feats.py` is updated accordingly.
### 🎬 Test Demo
Run the pre-trained models to evaluate the quality of a single video. The model weights provided in `./model` are the best-performing weights saved during training.

To evaluate the quality of a specific video, run the following command:
```bash
python demo_test_gpu.py \
  -device <DEVICE> \
  -train_data_name <TRAIN_DATA_NAME> \
  -is_finetune <True/False> \
  -save_path <MODEL_PATH> \
  -video_type <DATASET_NAME> \
  -video_name <VIDEO_NAME> \
  -framerate <FRAMERATE>
```
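For example, a filled-in call might look like the following; the video name stays a placeholder, and the framerate value here is an assumption, so substitute values from your own setup:

```bash
python demo_test_gpu.py \
  -device gpu \
  -train_data_name lsvq_train \
  -is_finetune True \
  -save_path ../model/ \
  -video_type youtube_ugc \
  -video_name <VIDEO_NAME> \
  -framerate 30
```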
Or simply try our demo video by running:
```bash
python demo_test_gpu.py
```
### 🧪 How to Use the Pretrained Model
You can download and load the model using `huggingface_hub`:
```python
from huggingface_hub import hf_hub_download
import torch

# Download the pretrained model weights from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="xinyiW915/ReLaX-VQA",
    filename="model/lsvq_train_relaxvqa_byrmse_trained_median_model_param_onLSVQ_TEST.pth"
)

# map_location allows loading the checkpoint on CPU-only machines
state_dict = torch.load(model_path, map_location="cpu")

# fix_state_dict is a helper provided in this repository (e.g. to normalise
# key names in the saved state dict before loading)
fixed_state_dict = fix_state_dict(state_dict)
model.load_state_dict(fixed_state_dict)  # use with your instantiated model class
```
## Training
Steps to train ReLaX-VQA from scratch on different datasets.
### Extract Features
Run the following command to extract features from videos:
```bash
python main_relaxvqa_feats.py -device gpu -video_type youtube_ugc
```
### Train Model
Train the model on the extracted features:
```bash
python model_regression_simple.py -data_name youtube_ugc -feature_path ../features/ -save_path ../model/
```
For LSVQ, train the model using:
```bash
python model_regression.py -data_name lsvq_train -feature_path ../features/ -save_path ../model/
```
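Conceptually, the regression stage maps each video's extracted feature vector to a single quality score. The following is a minimal, illustrative PyTorch sketch of such a regressor; the feature dimension, layer sizes, and training loop are assumptions, not the exact architecture in `model_regression.py`:

```python
import torch
import torch.nn as nn

# Illustrative only: the real feature dimension and architecture live in
# model_regression_simple.py / model_regression.py.
class QualityRegressor(nn.Module):
    def __init__(self, in_dim=4096, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # one predicted quality score per video
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

model = QualityRegressor()
criterion = nn.MSELoss()  # squared-error objective; checkpoints are named 'byrmse'
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

features = torch.randn(8, 4096)  # dummy batch of video-level features
mos = torch.rand(8) * 4 + 1      # dummy MOS targets in [1, 5]

optimizer.zero_grad()
loss = criterion(model(features), mos)
loss.backward()
optimizer.step()
```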
## Fine-Tuning
To fine-tune the pre-trained model on a new dataset, set `train_data_name` to the dataset used for pre-training and `test_data_name` to the dataset you want to fine-tune on, then run:
```bash
python model_finetune.py
```
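Conceptually, fine-tuning reuses the LSVQ-pretrained weights as initialisation and continues training on the target dataset, typically with a smaller learning rate. A minimal, hedged sketch follows; it reuses the illustrative `QualityRegressor` from the training sketch above and is not the exact procedure in `model_finetune.py`:

```python
import torch
import torch.nn as nn

# Reuses the illustrative QualityRegressor sketched in the Training section.
model = QualityRegressor()
model.load_state_dict(
    torch.load("lsvq_pretrained.pth", map_location="cpu")  # hypothetical checkpoint path
)

# Fine-tune with a smaller learning rate so the pretrained weights shift gently.
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
for features, mos in finetune_loader:  # DataLoader over the target dataset (assumed)
    optimizer.zero_grad()
    loss = criterion(model(features), mos)
    loss.backward()
    optimizer.step()
```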
## Ablation Study
A detailed analysis of the different components in ReLaX-VQA.

### Spatio-Temporal Fragmentation & DNN Layer Stacking
Key techniques used in ReLaX-VQA (a minimal sketch of the fragmentation idea follows this list):

Fragmentation with DNN layer stacking:
```bash
python feature_fragment_layerstack.py
```
Fragmentation with DNN layer pooling:
```bash
python feature_fragment_pool.py
```
Frame with DNN layer stacking:
```bash
python feature_layerstack.py
```
Frame with DNN layer pooling:
```bash
python feature_pool.py
```
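To make the fragmentation idea concrete, here is a minimal, self-contained sketch of sampling fragments from the residual between consecutive frames; the patch size, fragment count, and selection rule are illustrative assumptions, not the exact procedure in the scripts above:

```python
import numpy as np

def residual_fragments(prev_frame, curr_frame, patch=32, num_patches=16):
    """Sample the patches with the largest temporal residual energy.

    prev_frame / curr_frame: HxWx3 uint8 arrays from consecutive video frames.
    Returns a list of (patch, patch, 3) crops taken from the current frame.
    """
    residual = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    h, w = residual.shape[:2]
    scores = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            energy = residual[y:y + patch, x:x + patch].sum()
            scores.append((energy, y, x))
    # Keep the patches where the most temporal change happened.
    scores.sort(reverse=True)
    return [curr_frame[y:y + patch, x:x + patch]
            for _, y, x in scores[:num_patches]]
```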
## Other Utilities

### Excluding Greyscale Videos
We exclude greyscale videos in our experiments. You can use `check_greyscale.py` to filter out greyscale videos from the VQA dataset you want to use:
```bash
python check_greyscale.py
```
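For reference, greyscale detection can be done by checking whether a video's colour channels are (near-)identical. A minimal sketch with OpenCV follows; the frame count and tolerance are assumptions, and this is not necessarily the exact logic in `check_greyscale.py`:

```python
import cv2
import numpy as np

def is_greyscale(video_path, num_frames=5):
    """Heuristic: a video is greyscale if its colour channels are (near-)identical."""
    cap = cv2.VideoCapture(video_path)
    checked = 0
    while checked < num_frames:
        ok, frame = cap.read()
        if not ok:
            break
        b, g, r = cv2.split(frame)  # OpenCV frames are BGR
        # Allow a small tolerance for compression noise in the chroma planes.
        if np.mean(np.abs(b.astype(int) - g.astype(int))) > 1 or \
           np.mean(np.abs(g.astype(int) - r.astype(int))) > 1:
            cap.release()
            return False
        checked += 1
    cap.release()
    return True
```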
### Metadata Extraction
For easy extraction of metadata from your VQA dataset, use:
```bash
python extract_metadata_NR.py
```
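As a rough illustration of what such metadata extraction involves, the basic video properties can be read with OpenCV; the fields below are an assumption about what is useful, not necessarily what `extract_metadata_NR.py` outputs:

```python
import cv2

def video_metadata(video_path):
    """Read basic properties (resolution, framerate, length) from a video file."""
    cap = cv2.VideoCapture(video_path)
    meta = {
        "width": int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        "height": int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
        "framerate": cap.get(cv2.CAP_PROP_FPS),
        "frame_count": int(cap.get(cv2.CAP_PROP_FRAME_COUNT)),
    }
    cap.release()
    return meta
```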
## Acknowledgment
This work was funded by the UKRI MyWorld Strength in Places Programme (SIPF00006/1) as part of my PhD study.
## Citation
If you find this paper and the repo useful, please cite our paper 😊:
```bibtex
@article{wang2024relax,
  title={ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment},
  author={Wang, Xinyi and Katsenou, Angeliki and Bull, David},
  year={2024},
  eprint={2407.11496},
  archivePrefix={arXiv},
  primaryClass={eess.IV},
  url={https://arxiv.org/abs/2407.11496},
}
```
## Contact
Xinyi WANG, [email protected]