48 changes: 32 additions & 16 deletions README.md
@@ -1,16 +1,32 @@
# Nerfies

This is the repository that contains source code for the [Nerfies website](https://nerfies.github.io).

If you find Nerfies useful for your work please cite:
```
@article{park2021nerfies
author = {Park, Keunhong and Sinha, Utkarsh and Barron, Jonathan T. and Bouaziz, Sofien and Goldman, Dan B and Seitz, Steven M. and Martin-Brualla, Ricardo},
title = {Nerfies: Deformable Neural Radiance Fields},
journal = {ICCV},
year = {2021},
}
```

# Website License
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.
# V-Skip

<div align="center">

[![arXiv](https://img.shields.io/badge/arXiv-2601.13879-b31b1b.svg)](https://arxiv.org/pdf/2601.13879)
[![Project Page](https://img.shields.io/badge/Project-Page-green)](https://dongxu-zhang.github.io/v-skip.github.io/)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

**V-Skip: Efficient Multimodal Reasoning via Dual-Path Anchoring**
<br>
[Dongxu Zhang](https://dongxu-zhang.github.io/)<sup>1,*</sup>, [Yiding Sun](https://github.com/Issac-Sun)<sup>1,*</sup>, [Cheng Tan](https://chengtan9907.github.io/)<sup>3</sup>, [Wenbiao Yan](#)<sup>4</sup>, [Ning Yang](http://ningyangcasia.cn/)<sup>2,†</sup>, [Jihua Zhu](https://gr.xjtu.edu.cn/web/zhujh)<sup>1,†</sup>, [Haijun Zhang](https://scce.ustb.edu.cn/shiziduiwu/jiaoshixinxi/2018-04-13/100.html)<sup>5</sup>

<sup>1</sup>State Key Laboratory of Human-Machine Hybrid Augmented Intelligence, XJTU, <sup>2</sup>CASIA, <sup>3</sup>Shanghai AI Laboratory, <sup>4</sup>HITSZ, <sup>5</sup>USTB
</div>

---

## 🚀 Introduction

This repository contains the official implementation (and project page source) for the paper **"Chain-of-Thought Compression Should Not Be Blind: V-Skip for Efficient Multimodal Reasoning via Dual-Path Anchoring"**.

**V-Skip** is a token pruning framework for Multimodal Large Language Models (MLLMs). It targets the **"Visual Amnesia"** problem, in which text-centric compression methods prune tokens that anchor the reasoning to the image. V-Skip uses a dual-path gating mechanism (Linguistic Surprisal + Visual Attention Flow) to preserve visually salient tokens while reducing latency.

![V-Skip Teaser](./static/images/fig1.png)
*Figure 1: Comparison of compression paradigms. V-Skip successfully rescues visual anchors (e.g., "red") that are blindly pruned by text-only methods.*
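
The dual-path idea can be illustrated with a minimal sketch. This is not the released implementation: the function name `dual_path_keep_mask`, the OR-combination of the two paths, both thresholds, and the toy inputs are all assumptions made for illustration. It only shows how a token with low linguistic surprisal (such as "red") might still be kept when its attention to the image is high.

```python
import math


def dual_path_keep_mask(token_probs, visual_attention,
                        surprisal_threshold=2.0, attention_threshold=0.3):
    """Return one boolean per token; True means the token is kept.

    Hypothetical inputs (not from the V-Skip codebase):
      token_probs       probability the language model assigned to each token
      visual_attention  share of each token's attention that lands on image
                        tokens, in [0, 1]
    """
    if len(token_probs) != len(visual_attention):
        raise ValueError("inputs must have the same length")

    mask = []
    for prob, attn in zip(token_probs, visual_attention):
        surprisal = -math.log(prob)                       # linguistic path (nats)
        keep_linguistic = surprisal >= surprisal_threshold
        keep_visual = attn >= attention_threshold         # visual path
        # Assumed gating rule: a token survives if either path votes to keep it.
        mask.append(keep_linguistic or keep_visual)
    return mask


# Toy example with made-up numbers.
tokens = ["the", "red", "car", "is", "parked"]
probs = [0.9, 0.6, 0.05, 0.95, 0.1]
attn = [0.02, 0.55, 0.40, 0.01, 0.05]

mask = dual_path_keep_mask(probs, attn)
kept = [tok for tok, keep in zip(tokens, mask) if keep]

# Text-only baseline: disable the visual path by making its threshold unreachable.
text_only_mask = dual_path_keep_mask(probs, attn, attention_threshold=2.0)
text_only_kept = [tok for tok, keep in zip(tokens, text_only_mask) if keep]
```

With these toy numbers the text-only baseline drops "red" because it is predictable from context, while the dual-path mask keeps it because of its high visual attention. The actual scoring and gating used in the paper may differ.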

## 📈 Key Results
- **Speedup:** Achieves **2.9x** inference speedup on Qwen2-VL.
- **Accuracy:** Outperforms baselines by over **30%** on the DocVQA benchmark.
- **Robustness:** Effectively prevents object hallucination caused by over-pruning.

## 🛠️ Usage
433 changes: 115 additions & 318 deletions index.html

33 changes: 33 additions & 0 deletions static/images/XJTU_emblem.svg
Binary file added static/images/fig1.pdf
Binary file not shown.
Binary file added static/images/fig1.png
Binary file added static/images/pipeline.png
Binary file added static/images/qualitative.png
Binary file added static/images/qualitative2.png