Feilong Tang 唐飞龙

Feilong Tang (Chinese: 唐飞龙) is a PhD student at the AIM for Health Lab at Monash University, advised by A/Prof. Zongyuan Ge. He received his Bachelor's degree from the University of Liverpool (2019–2023). He has also held research internships at Shanghai AI Lab, HKUST, MBZUAI, and DeepGlint.

His research focuses on video understanding and multimodal large language models (MLLMs). He is particularly interested in:

  • Video Understanding and Temporal Reasoning in Multimodal LLMs
  • Next-generation Vision Transformers (ViTs) designed for the needs of modern MLLMs
  • Hallucination Mitigation in Multimodal Large Language Models

News

  • 2026.04 📰 Paper accepted to Nature Communications — Population-scale Characterization of the Oral Microbiome and Associations with Metabolic Health.
  • 2026.04 🎉 Two papers accepted to ACL 2026 Findings.
  • 2026.02 🚀 New preprint: OneVision-Encoder — Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence. arXiv | Code
  • 2026.02 🎉 Three papers accepted to CVPR 2026, including Thinking in Uncertainty as a Highlight.
  • 2026.01 🎉 Three papers accepted to ICLR 2026, including one Oral.
  • 2025.12 🏆 Co-authored paper ReconVLA receives the Best Paper Award at AAAI 2026.
  • 2025.09 🎉 Three papers accepted to NeurIPS 2025 — UniViT (first author), Towards Dynamic 3D Reconstruction (co-first), Decoding Causal Structure (co-first).
  • 2025.09 🚀 New preprint: LLaVA-OneVision-1.5 — Fully Open Framework for Democratized Multimodal Training.
  • 2025.07 🎉 One paper accepted to ICCV 2025 — Hierarchical Retrieval-Augmented Learning (co-first author).
  • 2025.06 🎉 Two papers accepted to MICCAI 2025.
  • 2025.05 🎉 One paper accepted to ACL 2025 — MMRC benchmark (co-first author).
  • 2025.05 📰 Paper in Communications Medicine (Nature Portfolio) — Forecasting DR progression (co-first). IF 6.3, Q1.
  • 2025.02 🏆 Five papers accepted to CVPR 2025, including the first-author paper Seeing Far and Clearly as an Oral (top ~0.8%).
  • 2025.01 📄 Three papers accepted to AAAI 2025.
  • 2025.01 📄 One paper accepted to ICLR 2025 — Intervening Anchor Token.
  • 2024.08 🏅 SAM2-UNet received the Best Paper Award at an ICCV 2024 Workshop.
  • 2024.07 📄 One paper accepted to ECCV 2024 — OphNet.
  • 2024.03 📄 One paper accepted to AAAI 2024 — SFC (co-first author).
  • 2024.02 📄 One paper accepted to CVPR 2024 — Hunting Attributes (first author).

Selected Publications

* equal contribution  |  corresponding author  |  Bold = myself

  1. OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Technical report, 2026 Paper Code Homepage Bilibili YouTube GitHub ~300 stars · 100k+ views on X/Twitter Feilong Tang, Xiang An, Yunyao Yan, et al.
  2. UniViT: Unifying Image and Video Understanding in One Vision Encoder NeurIPS 2025 Paper Feilong Tang*, Xiang An*, Haolin Yang*, et al.
  3. Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding CVPR 2025 Oral (top ~0.8%) Paper Code Feilong Tang*, Chengzhi Liu*, Zhongxing Xu*, et al.
  4. Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery NeurIPS 2025 Paper Code Ming Hu*, Zhengdi Yu*, Feilong Tang*, et al.
  5. Decoding Causal Structure: End-to-End Mediation Pathways Inference NeurIPS 2025 Paper Yulong Li*, Xiwei Liu*, Feilong Tang*, Ming Hu, Zongyuan Ge, Imran Razzak, Eran Segal
  6. Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining ICCV 2025 Paper Code Ming Hu*, Kun Yuan*, Yaling Shen*, Feilong Tang*, et al.
  7. A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation ACL 2025 Paper Code Haochen Xue*, Feilong Tang*, Ming Hu, et al.

All Publications

* equal contribution  |  corresponding author  |  Bold = myself

2026

  1. OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Preprint (Technical Report), 2026 Paper Code Homepage Bilibili YouTube GitHub ~300 stars · 100k+ views on X/Twitter Feilong Tang, Xiang An, Yunyao Yan, et al., Zongyuan Ge, Jiankang Deng
  2. Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development Preprint (Survey), 2026 Paper Code Zhongying Deng, Cheng Tang, Ziyan Huang, Jiashi Lin, Ying Chen, et al., Feilong Tang, et al., Yu Qiao, Junjun He
  3. Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding CVPR 2026 Highlight Paper Code Zhongxing Xu, Zhonghua Wang, Zhe Qian, Dachuan Shi, Feilong Tang, Ming Hu, Shiyan Su, Xiaocheng Zou, Wei Feng, Dwarikanath Mahapatra, Yifan Peng, Mingquan Lin, Zongyuan Ge
  4. Towards Efficient Medical Reasoning with Minimal Fine-Tuning Data CVPR 2026 Paper Code Xinlin Zhuang, Feilong Tang, Haolin Yang, Xiwei Liu, Ming Hu, Huifa Li, Haochen Xue, Junjun He, Zongyuan Ge, Yichen Li, Ying Qian, Imran Razzak
  5. Seeing Through the Shift: Causality-Inspired Robust Generalized Category Discovery CVPR 2026 Wei Feng, Yiwen Jiang, Sijin Zhou, Zhuang Qi, Zhongxing Xu, Zhonghua Wang, Feilong Tang, Zongyuan Ge
  6. Population-scale Characterization of the Oral Microbiome and Associations with Metabolic Health Nature Communications, 2026 Paper DOI Haochen Xue, Anastasia Godneva, Feilong Tang, Huifa Li, Yulong Li, Ming Hu, Ruobing Li, Jionglong Su, Eran Segal, Imran Razzak
  7. TAGS: A Test-Time Generalist–Specialist Framework with Retrieval-Augmented Reasoning and Verification ACL 2026 Findings Paper Jianghao Wu, Feilong Tang, Yulong Li, Ming Hu, Haochen Xue, Shoaib Jameel, Zongyuan Ge, Yutong Xie, Imran Razzak
  8. PsychEthicsBench: Evaluating Large Language Models Against Australian Mental Health Ethics ACL 2026 Findings Paper Yaling Shen, Stephanie Fong, Yiwen Jiang, Zimu Wang, Feilong Tang, Qingyang Xu, Xiangyu Zhao, Zhongxing Xu, Jiahe Liu, Jinpeng Hu, Dominic Dwyer, Zongyuan Ge

2025

  1. LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Preprint, 2025 Paper Code Xiang An, Yin Xie, Kaicheng Yang, Wenkang Zhang, Xiuwei Zhao, Zheng Cheng, Changrui Chen, Zizhen Yan, Ziyong Feng, Ziwei Liu, Bo Li, Jiankang Deng, et al.
  2. UniViT: Unifying Image and Video Understanding in One Vision Encoder NeurIPS 2025 Paper Feilong Tang, Xiang An, Haolin Yang, Behzad Bozorgtabar, Jiankang Deng, Zongyuan Ge
  3. Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery NeurIPS 2025 Paper Code Ming Hu*, Zhengdi Yu*, Feilong Tang*, Tolga Birdal, Kaijing Zhou, Zongyuan Ge
  4. Decoding Causal Structure: End-to-End Mediation Pathways Inference NeurIPS 2025 Paper Yulong Li*, Xiwei Liu*, Feilong Tang*, Ming Hu, Zongyuan Ge, Imran Razzak, Eran Segal
  5. Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding CVPR 2025 Oral (top ~0.8%) Paper Code Feilong Tang, Chengzhi Liu, Zhongxing Xu, Ming Hu, et al., Xuelian Cheng, Imran Razzak, Zongyuan Ge
  6. Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining ICCV 2025 Paper Code Ming Hu*, Kun Yuan*, Yaling Shen*, Feilong Tang*, et al.
  7. Robust Multimodal Learning for Ophthalmic Disease Grading via Disentangled Representation MICCAI 2025 Paper Code Xinkun Wang*, Yifang Wang*, Senwei Liang*, Feilong Tang, et al.
  8. MSWAL: 3D Multi-class Segmentation of Whole Abdominal Lesions Dataset MICCAI 2025 Paper Code Zhaodong Wu, Qiaochu Zhao, et al., Feilong Tang
  9. A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation ACL 2025 Paper Code Haochen Xue*, Feilong Tang*, Ming Hu, et al.
  10. Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs ICLR 2025 Paper Code Feilong Tang, Zile Huang, Chengzhi Liu, Qiang Sun, Harry Yang, Ser-Nam Lim
  11. Density-Aware Contrastive Learning for Multi-organ Semi-supervised Segmentation AAAI 2025 Paper arXiv Feilong Tang, Zhongxing Xu, Ming Hu, Zongyuan Ge
  12. Toward Modality Gap: Vision Prototype Learning for Text-supervised Semantic Segmentation with CLIP AAAI 2025 Paper Zhongxing Xu*, Feilong Tang*, Jionglong Su, Zongyuan Ge
  13. Towards Realistic Semi-supervised Medical Image Classification AAAI 2025 Paper Wenxue Li*, Lie Ju*, Feilong Tang*, Peng Xia, Xinyu Xiong, Ming Hu, Lei Zhu, Zongyuan Ge
  14. Forecasting the diabetic retinopathy progression using GAN Communications Medicine (Nature Portfolio), 2025 Paper JCR 2024 IF 6.3 · Q1 · Rank 33/195 (Medicine, General & Internal) Huiyu Qiao*, Feilong Tang*, Lie Ju, Wei Feng, Zongyuan Ge, Qiansu Yang

2024

  1. Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation CVPR 2024 Paper Code Feilong Tang, Zhongxing Xu, Zhaojun Qu, Wei Feng, Xingjian Jiang, Zongyuan Ge
  2. SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation 🏆 ICCV Workshop 2024 Best Paper Award Paper Code Xinyu Xiong, Zihuang Wu, Shuangyi Tan, Wenxue Li, Feilong Tang, et al.
  3. OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding ECCV 2024 Paper Code Ming Hu, Peng Xia, Lin Wang, Siyuan Yan, Feilong Tang, et al.
  4. SFC: Shared Feature Calibration in Weakly Supervised Semantic Segmentation AAAI 2024 Paper arXiv Code Xinqiao Zhao*, Feilong Tang*, Xiaoyang Wang, Jimin Xiao
  5. Polyp-Mamba: Polyp Segmentation with Visual Mamba MICCAI 2024 Paper Zhongxing Xu*, Feilong Tang*, Zhe Chen, Jionglong Su
  6. Sight for sore heads: Using CNNs to diagnose migraines ARVO 2024 Trinh Matt*, Feilong Tang*, Angelica Ly, Annita Duong, Fiona Stapleton, Zongyuan Ge, Imran Razzak

2023

  1. DuAT: Dual-aggregation Transformer Network for Medical Image Segmentation PRCV 2023 Paper Code 295+ citations Feilong Tang, Zhongxing Xu, Qiming Huang, Jinfeng Wang, Xianxu Hou, Jionglong Su, Jingxin Liu

2022

  1. Stepwise Feature Fusion: Local Guides Global MICCAI 2022 Paper 387+ citations Jinfeng Wang*, Qiming Huang*, Feilong Tang*, Jia Meng, Jionglong Su, Sifan Song


© 2025 Feilong Tang  ·  Last updated: Mar 2026  ·  Google Scholar
