TAB: Transformer Attention Bottlenecks enable User Intervention and Debugging in Vision-Language Models
Published in IEEE/CVF International Conference on Computer Vision (ICCV), 2025
Recommended citation: Rahmanzadehgervi P, Nguyen HH, Liu R, Mai L, Nguyen AT. TAB: Transformer Attention Bottlenecks enable User Intervention and Debugging in Vision-Language Models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025, pp. 22551-22562. https://arxiv.org/pdf/2412.18675
