- 2026.02.06 The code and pretrained models of RALI and VQ-Insight are released!
- 2026.02.06 RALI has been accepted at ICLR 2026 as an oral presentation!
- 2025.11.08 VQ-Insight has been accepted at AAAI 2026 as an oral presentation!
- 2025.09.19 Q-Insight has been accepted at NeurIPS 2025 as a spotlight (Top 3%)!
- 2025.05.30 Released training and testing code, along with the pretrained model.
- 2025.05.26 Released our v2 paper.
- 2025.03.28 Released the Q-Insight technical report.
Weiqi Li, Xuanyu Zhang, Shijie Zhao, Yabin Zhang, Junlin Li, Li Zhang and Jian Zhang
PLCC comparisons between our proposed Q-Insight and existing IQA metrics (left) and three example applications of our Q-Insight (right) are presented. Q-Insight demonstrates significantly improved performance compared to existing methods, especially on out-of-domain datasets. Additionally, Q-Insight effectively supports quality score regression, image degradation perception, and zero-shot image comparison reasoning tasks.
VQ-Insight: Teaching VLMs for AI-Generated Video Quality Understanding via Progressive Visual Reinforcement Learning
Xuanyu Zhang*, Weiqi Li*, Shijie Zhao, Junlin Li, Li Zhang, Jian Zhang
We propose VQ-Insight, a reasoning-style vision-language model that accurately performs AIGC video preference comparison, multi-dimensional AIGC video scoring, and natural video scoring, accompanied by detailed and well-grounded reasoning. VQ-Insight can be applied to post-training of video generation models and zero-shot content repairing.
Shijie Zhao*, Xuanyu Zhang*, Weiqi Li, Junlin Li, Li Zhang, Tianfan Xue, Jian Zhang
We revisit the reasoning mechanism in MLLM-based IQA models (such as Q-Insight) and propose RALI, a lightweight CLIP-based image scorer. We verify that, through RL training, MLLMs leverage their reasoning capability to convert redundant visual representations into compact, cross-domain aligned text representations, and that this conversion is the source of the generalization exhibited by reasoning-based IQA models. RALI uses only about 4% of Q-Insight's parameters and inference time while achieving comparable accuracy.
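To make the visual-to-text conversion idea concrete, below is a rough CLIP-IQA-style sketch that scores an image by its similarity to quality text anchors. It only illustrates the general principle; it is not the RALI implementation (see the RALI demo and checkpoint below), and the anchor prompts and scoring rule are placeholders.

```python
# Illustrative only: a CLIP-IQA-style scorer that maps an image into a text-anchor
# space, in the spirit of the visual-to-text conversion described above.
# This is NOT the actual RALI model; prompts and the scoring rule are placeholders.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

anchors = ["a high quality photo", "a low quality photo"]  # placeholder text anchors
image = Image.open("example.jpg")                          # hypothetical input image

inputs = processor(text=anchors, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # (1, num_anchors) similarity logits
probability_good = logits.softmax(dim=-1)[0, 0].item()
print(f"quality score in [0, 1]: {probability_good:.3f}")
```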
git clone https://github.com/bytedance/Q-Insight.git
bash setup.sh

To run VQ-Insight, install additional packages:
cd src/eval/qwen-vl-utils
pip install -e .[decord]

cd src/eval/
python demo_score.py
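For reference, a minimal sketch of calling the released checkpoint directly through Hugging Face transformers is shown below. It assumes the checkpoint follows the Qwen2.5-VL chat interface (suggested by the qwen-vl-utils dependency); demo_score.py is the reference script, the model id is reused from the comparison demo below, and the prompt and image path are placeholders.

```python
# Sketch of scoring an image directly through transformers, assuming the released
# checkpoint follows the Qwen2.5-VL chat interface. demo_score.py is the reference
# script; the prompt and image path below are only placeholders.
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from qwen_vl_utils import process_vision_info

model_id = "Bytedance/Q-Insight"  # id taken from demo_vqinsight_comp.py below
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "path/to/your_image.jpg"},  # hypothetical path
        {"type": "text", "text": "Assess the quality of this image. Think step by step, then give a score from 1 to 5."},
    ],
}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, videos=video_inputs, padding=True, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0])
```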
cd src/eval/
python demo_dist.py
cd src/eval/
python demo_comparison.py
cd src/eval/
python demo_vqinsight_score.py \
--video_path "../../assets/demo_natural.mp4" \
--video_type natural

cd src/eval/
python demo_vqinsight_score.py \
--video_path "../../assets/demo_aigc.mp4" \
--video_type aigc

cd src/eval/
python demo_vqinsight_comp.py \
--video_a "../../assets/demo_comp1.mp4" \
--video_b "../../assets/demo_comp2.mp4" \
--model_name_or_path Bytedance/Q-Insight
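As with the image demos, the video demos can in principle be reproduced with a direct transformers call. Below is a minimal sketch under the same Qwen2.5-VL interface assumption; the demo scripts above remain the reference, and the prompt is a placeholder.

```python
# Sketch of a direct video call, assuming the checkpoint follows the
# Qwen2.5-VL chat interface; the prompt is only a placeholder.
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from qwen_vl_utils import process_vision_info

model_id = "Bytedance/Q-Insight"  # id taken from the comparison demo above
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

messages = [{
    "role": "user",
    "content": [
        {"type": "video", "video": "../../assets/demo_aigc.mp4"},
        {"type": "text", "text": "Evaluate the overall quality of this AI-generated video and explain your reasoning."},
    ],
}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, videos=video_inputs, padding=True, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0])
```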
Please download the RALI pretrained weights from the link. After downloading, place the checkpoint under Q-Insight/checkpoints, so that the directory structure becomes:
Q-Insight/
├── checkpoints/
│   ├── ckpt.pt
│   ├── pca.pkl
│   ├── basis.npz
│   └── best/
│       ├── config.json
│       ├── pytorch_model.bin (or *.safetensors)
│       ├── preprocessor_config.json
│       └── ...
├── src/
├── assets/
└── README.md
Then run the following code:
cd src/eval/
python demo_rali_score.py
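demo_rali_score.py is the reference entry point. If you want to confirm the checkpoint files above are in place (and see what they contain) before running it, a small inspection snippet like the following can help; the relative paths assume it is run from src/eval/, and unpickling pca.pkl may require scikit-learn if the PCA object comes from sklearn (an assumption).

```python
# Inspect the downloaded RALI checkpoint files (run from src/eval/).
# This only reads the files; it makes no assumptions about how RALI uses them.
import pickle
import numpy as np
import torch

ckpt = torch.load("../../checkpoints/ckpt.pt", map_location="cpu", weights_only=False)
print("ckpt.pt:", list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))

with open("../../checkpoints/pca.pkl", "rb") as f:
    pca = pickle.load(f)
print("pca.pkl:", type(pca))

basis = np.load("../../checkpoints/basis.npz")
print("basis.npz:", {name: arr.shape for name, arr in basis.items()})
```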
Download meta files from Data-DeQA-Score and the source images from the KONIQ dataset. Arrange the folders in ./src/open-r1-multimodal/data as follows:
|-- Data-DeQA-Score
    |-- KONIQ
        |-- images/*.jpg
        |-- metas
Download the refA_sd_brief subset from KADIS-700K.
Arrange the folders in ./src/open-r1-multimodal/data as follows:
|-- KADIS-700K
    |-- refA_sd_brief
        |-- dist_imgs/*.jpg
        |-- metas/train_dist.json
Download the validation dataset of DiffIQA.
Arrange the folders in ./src/open-r1-multimodal/data as follows:
|-- DiffIQA
    |-- ValidationImage
        |-- images
        |-- train_comparison.json
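Before launching training, it can be worth checking that the folders match the layouts above. A minimal sketch (the paths mirror this README's layout descriptions, so adjust them if yours differ):

```python
# Verify the dataset folders match the layouts described above.
# Paths mirror this README's layout descriptions; adjust if yours differ.
from pathlib import Path

data_root = Path("./src/open-r1-multimodal/data")
expected = [
    data_root / "Data-DeQA-Score" / "KONIQ" / "images",
    data_root / "Data-DeQA-Score" / "KONIQ" / "metas",
    data_root / "KADIS-700K" / "refA_sd_brief" / "dist_imgs",
    data_root / "KADIS-700K" / "refA_sd_brief" / "metas" / "train_dist.json",
    data_root / "DiffIQA" / "ValidationImage",
]
for path in expected:
    print(f"[{'ok' if path.exists() else 'MISSING'}] {path}")
```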
cd src/open-r1-multimodal/
bash run_qinsight_score_and_dist.sh
cd src/open-r1-multimodal/
bash run_qinsight_comparison.sh
- Release the code and model of VQ-Insight
- Add support for LoRA fine-tuning
- Provide a Gradio demo
- Release inference code and weights
- Release training code
- Release the paper
We appreciate the released code and data of VLM-R1, DepictQA, and DeQA-Score.
If the Q-Insight family is helpful, please help to ⭐ the repo.
If you find the code helpful in your research or work, please cite the following papers:
@inproceedings{li2025qinsight,
  title={Q-Insight: Understanding Image Quality via Visual Reinforcement Learning},
  author={Li, Weiqi and Zhang, Xuanyu and Zhao, Shijie and Zhang, Yabin and Li, Junlin and Zhang, Li and Zhang, Jian},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2025}
}

@inproceedings{zhang2025vqinsight,
  title={VQ-Insight: Teaching VLMs for AI-Generated Video Quality Understanding via Progressive Visual Reinforcement Learning},
  author={Zhang, Xuanyu and Li, Weiqi and Zhao, Shijie and Li, Junlin and Zhang, Li and Zhang, Jian},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)},
  year={2026}
}

@inproceedings{zhao2025reasoning,
  title={Reasoning as Representation: Rethinking Visual Reinforcement Learning in Image Quality Assessment},
  author={Zhao, Shijie and Zhang, Xuanyu and Li, Weiqi and Li, Junlin and Zhang, Li and Xue, Tianfan and Zhang, Jian},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}


