<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="utf-8">
  <!-- Meta tags for social media banners, these should be filled in appropriately as they are your "business card" -->
  <!-- Replace the content attribute with appropriate information -->
  <meta name="description" content="Image Reconstruction as a Tool for Feature Analysis">
  <meta property="og:title" content="Image Reconstruction as a Tool for Feature Analysis" />
  <meta property="og:description"
    content="A novel approach for interpreting vision features via image reconstruction" />
  <meta property="og:url" content="https://fusionbrainlab.github.io/feature_analysis" />
  <!-- Path to banner image, should be in the path listed below. Optimal dimensions are 1200x630 -->
  <meta property="og:image" content="static/images/v1_vs_v2.png" />
  <meta property="og:image:width" content="1200" />
  <meta property="og:image:height" content="630" />


  <meta name="twitter:title" content="Image Reconstruction as a Tool for Feature Analysis">
  <meta name="twitter:description" content="A novel approach for interpreting vision features via image reconstruction">
  <!-- Path to banner image, should be in the path listed below. Optimal dimensions are 1200x600 -->
  <meta name="twitter:image" content="static/images/v1_vs_v2.png">
  <meta name="twitter:card" content="summary_large_image">
  <!-- Keywords for your paper to be indexed by -->
  <meta name="keywords" content="computer vision, feature analysis, image reconstruction, vision encoders">
  <meta name="viewport" content="width=device-width, initial-scale=1">


  <title>Image Reconstruction as a Tool for Feature Analysis</title>
  <link rel="icon" type="image/x-icon" href="static/images/favicon.ico">
  <link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet">

  <link rel="stylesheet" href="static/css/bulma.min.css">
  <link rel="stylesheet" href="static/css/bulma-carousel.min.css">
  <link rel="stylesheet" href="static/css/bulma-slider.min.css">
  <link rel="stylesheet" href="static/css/fontawesome.all.min.css">
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
  <link rel="stylesheet" href="static/css/index.css">

  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
  <script src="https://documentcloud.adobe.com/view-sdk/main.js"></script>
  <script defer src="static/js/fontawesome.all.min.js"></script>
  <script src="static/js/bulma-carousel.min.js"></script>
  <script src="static/js/bulma-slider.min.js"></script>
  <script src="static/js/index.js"></script>
</head>

<body>


  <section class="hero">
    <div class="hero-body">
      <div class="container is-max-desktop">
        <div class="columns is-centered">
          <div class="column has-text-centered">
            <h1 class="title is-1 publication-title">Image Reconstruction as a Tool for Feature Analysis</h1>
            <div class="is-size-5 publication-authors">
              <!-- Paper authors -->
              <span class="author-block">
                <a href="mailto:allakhverdov@2a2i.org" target="_blank">Eduard Allakhverdov</a><sup>1,2</sup>,
              </span>
              <span class="author-block">
                <a href="mailto:d.tarasov@airi.net" target="_blank">Dmitrii Tarasov</a><sup>1</sup>,
              </span>
              <span class="author-block">
                <a href="mailto:goncharova@airi.net" target="_blank">Elizaveta Goncharova</a><sup>1</sup>,
              </span>
              <span class="author-block">
                <a href="mailto:kuznetsov@airi.net" target="_blank">Andrey Kuznetsov</a><sup>1</sup>
              </span>
            </div>
            <div class="is-size-5 publication-authors">
              <span class="author-block"><sup>1</sup>AIRI, Moscow, Russia</span>
              <span class="author-block"><sup>2</sup>MIPT, Dolgoprudny, Russia</span>
            </div>

            <div class="column has-text-centered">
              <div class="publication-links">

                <!-- GitHub link -->
                <span class="link-block">
                  <a href="https://github.com/FusionBrainLab/feature_analysis" target="_blank"
                    class="external-link button is-normal is-rounded is-dark">
                    <span class="icon">
                      <i class="fab fa-github"></i>
                    </span>
                    <span>Code</span>
                  </a>
                </span>

                <!-- arXiv abstract link -->
                <span class="link-block">
                  <a href="https://arxiv.org/abs/<ARXIV PAPER ID>" target="_blank"
                    class="external-link button is-normal is-rounded is-dark">
                    <span class="icon">
                      <i class="ai ai-arxiv"></i>
                    </span>
                    <span>arXiv</span>
                  </a>
                </span>
              </div>
            </div>
          </div>
        </div>
      </div>
    </div>
  </section>



  <!-- Paper abstract -->
  <section class="section hero is-light">
    <div class="container is-max-desktop">
      <div class="columns is-centered has-text-centered">
        <div class="column is-four-fifths">
          <h2 class="title is-3">Abstract</h2>
          <div class="content has-text-justified">
            <p>
              Vision encoders are increasingly used in modern applications, from vision-only models to multimodal
              systems such as vision-language models. Despite their remarkable success, it remains unclear how these
              architectures represent features internally. Here, we propose a novel approach for interpreting vision
              features via image reconstruction. We compare two related model families, SigLIP and SigLIP2, which differ
              only in their training objective, and show that encoders pre-trained on image-based tasks retain
              significantly more image information than those trained on non-image tasks such as contrastive learning.
              We further apply our method to a range of vision encoders, ranking them by the informativeness of their
              feature representations. Finally, we demonstrate that manipulating the feature space yields predictable
              changes in reconstructed images, revealing that orthogonal rotations — rather than spatial transformations
              — control color encoding. Our approach can be applied to any vision encoder, shedding light on the inner
              structure of its feature space.
            </p>
          </div>
        </div>
      </div>
    </div>
  </section>
  <!-- End paper abstract -->
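
  <!-- Method sketch: reconstruction probe (illustrative, not the paper's exact setup) -->
  <section class="section">
    <div class="container is-max-desktop content">
      <h2 class="title is-3">How the reconstruction probe works</h2>
      <p>
        The central idea is to train a lightweight decoder on top of a frozen vision encoder and use
        reconstruction quality as a measure of how much image information the features retain. Below is a
        minimal PyTorch sketch of such a probe. The linear decoder, the patch geometry, and the
        <code>encoder</code> interface are illustrative assumptions rather than the exact configuration
        from the paper; see the code repository for the actual implementation.
      </p>
      <pre><code>import torch
import torch.nn as nn

class ReconstructionProbe(nn.Module):
    """Maps frozen patch features back to RGB pixels (illustrative)."""

    def __init__(self, feat_dim: int, patch: int = 16):
        super().__init__()
        # Simplest possible decoder: one linear layer per patch feature.
        self.to_pixels = nn.Linear(feat_dim, 3 * patch * patch)
        self.patch = patch

    def forward(self, feats, h_patches, w_patches):
        # feats: (B, N, feat_dim) with N = h_patches * w_patches
        x = self.to_pixels(feats)
        x = x.view(-1, h_patches, w_patches, 3, self.patch, self.patch)
        x = x.permute(0, 3, 1, 4, 2, 5)  # (B, 3, h, p, w, p)
        return x.reshape(-1, 3, h_patches * self.patch, w_patches * self.patch)

def train_step(encoder, probe, images, optimizer):
    # The encoder stays frozen; only the probe is optimized.
    with torch.no_grad():
        feats = encoder(images)                       # (B, N, feat_dim)
    recon = probe(feats, h_patches=14, w_patches=14)  # assumes 224 px images, patch 16
    loss = nn.functional.mse_loss(recon, images)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()</code></pre>
      <p>
        Reconstruction error on held-out images can then serve as the informativeness score used to rank
        encoders such as SigLIP and SigLIP2.
      </p>
    </div>
  </section>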

  <!-- TODO: Spell out the contributions explicitly: -->

  <!-- (1) Interpretability metric -->
  <!-- Textual explanation -->
  <!-- Baseline results: SigLIP vs SigLIP2 -->
  <!-- Clearly emphasize the differences between the models -->
  <!-- and how they affect reconstruction -->
  <!-- -->

  <!-- (2) Feature-space transformations -->
  <!-- Textual explanation -->
  <!-- Framework visualization: the operator generalized to both image space and feature space -->
  <!-- Make a video visualizing the framework -->
  <!-- Examples of working with RGB -->
  <!-- Examples of disabling one channel (yellow shift) -->
  <!-- -->

  <!-- Image carousel -->
  <section class="hero is-small">
    <div class="hero-body">
      <div class="container">
        <div id="results-carousel" class="carousel results-carousel">
          <div class="item">
            <!-- Your image here -->
            <img src="static/images/v1_vs_v2.png" alt="Comparison of SigLIP and SigLIP2 reconstructions" />
            <h2 class="subtitle has-text-centered">
              Comparison of image reconstructions from SigLIP and SigLIP2 feature spaces.
            </h2>
          </div>
          <div class="item">
            <!-- Your image here -->
            <img src="static/images/rb_swap.png" alt="Red-blue channel swap visualization" />
            <h2 class="subtitle has-text-centered">
              Visualization of feature-space manipulation through a red-blue channel swap.
            </h2>
          </div>
        </div>
      </div>
    </div>
  </section>
  <!-- End image carousel -->
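
  <!-- Feature-space manipulation sketch (illustrative, not the paper's exact procedure) -->
  <section class="section">
    <div class="container is-max-desktop content">
      <h2 class="title is-3">Sketch: color edits as feature-space rotations</h2>
      <p>
        As the abstract notes, color edits such as a red-blue channel swap act on the feature space as
        near-orthogonal rotations rather than spatial transformations. The sketch below shows one way such
        an operator could be estimated: fit a linear map between features of original and channel-swapped
        images, then project it onto the nearest orthogonal matrix. The least-squares fit, the SVD
        projection, and the <code>encoder</code>/<code>probe</code> interfaces are our assumptions for
        illustration, not the paper's exact procedure.
      </p>
      <pre><code>import torch

def rb_swap(images):
    # Swap the red and blue channels in image space: (B, 3, H, W).
    return images[:, [2, 1, 0]]

@torch.no_grad()
def fit_feature_operator(encoder, images):
    # Patch features of original and channel-swapped images, flattened over patches.
    f = encoder(images).flatten(0, 1)            # (M, D)
    g = encoder(rb_swap(images)).flatten(0, 1)   # (M, D)
    # Least-squares fit of a linear map R so that f @ R approximates g.
    R = torch.linalg.lstsq(f, g).solution        # (D, D)
    # Project R onto the nearest orthogonal matrix via SVD, matching the
    # finding that the operator is close to a rotation.
    U, _, Vh = torch.linalg.svd(R)
    return U @ Vh

# Usage (sketch): rotate the features of a new image, then decode them with
# the reconstruction probe from above to obtain a recolored image.
# R = fit_feature_operator(encoder, calibration_images)
# recolored = probe(encoder(new_images) @ R, h_patches=14, w_patches=14)</code></pre>
    </div>
  </section>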




  <!-- BibTeX citation -->
  <section class="section" id="BibTeX">
    <div class="container is-max-desktop content">
      <h2 class="title">BibTeX</h2>
      <pre><code>@article{feature_analysis,
  title={Image Reconstruction as a Tool for Feature Analysis},
  author={Allakhverdov, Eduard and Tarasov, Dmitrii and Goncharova, Elizaveta and Kuznetsov, Andrey},
  journal={arXiv preprint},
  year={2024}
}</code></pre>
    </div>
  </section>
  <!-- End BibTeX citation -->


  <footer class="footer">
    <div class="container">
      <div class="columns is-centered">
        <div class="column is-8">
          <div class="content">

            <p>
              This page was built using the <a href="https://github.com/eliahuhorwitz/Academic-project-page-template"
                target="_blank">Academic Project Page Template</a>, which was adapted from the <a
                href="https://nerfies.github.io" target="_blank">Nerfies</a> project page.
              You are free to borrow the source code of this website; we just ask that you link back to this page in the
              footer. <br> This website is licensed under a <a rel="license"
                href="http://creativecommons.org/licenses/by-sa/4.0/" target="_blank">Creative
                Commons Attribution-ShareAlike 4.0 International License</a>.
            </p>

          </div>
        </div>
      </div>
    </div>
  </footer>

  <!-- Statcounter tracking code -->

  <!-- You can add a tracker to track page visits by creating an account at statcounter.com -->

  <!-- End of Statcounter Code -->

</body>

</html>