Yibin Yan

I am a first-year PhD student at Shanghai Jiao Tong University, supervised by Weidi Xie.

I am passionate about advancing Multimodal Perception and Video Representation through innovative research. If you’d like to collaborate or have any questions, feel free to reach out to me via email.

News

  • [2025.04] StreamFormer is now available on arXiv!
  • [2024.10] EchoSight has been accepted to EMNLP Findings!
  • [2024.07] EchoSight is now open source, with code and paper available!

Preprint

Learning Streaming Video Representation via Multitask Training

[Project Page] [Code (releasing soon)] [Paper]

EchoSight: Advancing Visual-Language Models with Wiki Knowledge

[Project Page] [Code] [Paper]