|
Fadime Sener
I am a Senior AI Research Scientist at Meta Superintelligence Labs. Previously, I was at Meta Reality Labs, the National University of Singapore (NUS), and the University of Bonn, where I received my PhD with Angela Yao and Juergen Gall. I hold MSc and BSc degrees in Computer Engineering from Bilkent University and Hacettepe University.
My research focuses on computer vision and multimodal learning. During my PhD, I worked on unsupervised video understanding through vision and language. At Meta, I worked on online action recognition and streaming video understanding with multimodal LLMs, and on multimedia generation and autoregressive video generation. Recently, I've been exploring streaming video recognition with real-time edge models as tools in agentic pipelines, focused on accurate understanding and timely, proactive response.
Email / Google Scholar / LinkedIn
|
|
|
|
Decouple and Cache: KV Cache Construction for Streaming Video Understanding
Zhanzhong Pang,
Dibyadip Chatterjee,
Fadime Sener,
Angela Yao
ICML, 2026
Paper · Code
|
|
|
On Discriminative vs. Generative Classifiers: Rethinking MLLMs for Action Understanding
Zhanzhong Pang,
Dibyadip Chatterjee,
Fadime Sener,
Angela Yao
ICLR, 2026
Paper · Code
|
|
|
Don't Pause! Every prediction matters in a streaming video
Dibyadip Chatterjee,
Zhanzhong Pang,
Fadime Sener,
Yale Song,
Angela Yao
Under review, NeurIPS 2026
Paper · Project
|
|
|
OSMO: Open-vocabulary Self-eMOtion Tracking
CVPR, 2026
|
|
|
TrustCLIP: Learning Private Visual Features via Adversarial Reconstruction
Under review, ECCV 2026
|
|
|
SneakPeek: Future-Guided Instructional Streaming Video Generation
Cheeun Hong,
German Barquero,
Fadime Sener,
Markos Georgopoulos,
Edgar Schönfeld,
Stefan Popov,
Yuming Du,
Oscar Mañas,
Albert Pumarola
Under review, ECCV 2026
Paper
|
|
|
PALM: A Dataset and Baseline for Learning Multi-subject Hand Prior
Zicong Fan,
Edoardo Remelli,
David Dimond,
Fadime Sener,
Liuhao Ge,
Bugra Tekin,
Cem Keskin,
Shreyas Hampali
3DV, 2026
Paper · Code
|
|
|
Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding
Dibyadip Chatterjee,
Edoardo Remelli,
Yale Song,
Bugra Tekin,
Abhay Mittal,
Bharat Lal Bhatnagar,
Necati Cihan Camgoz,
Shreyas Hampali,
Eric Sauser,
Shugao Ma,
Angela Yao,
Fadime Sener
ICCV, 2025
2nd, EgoExo4D Fine-grained Keystep Recognition Challenge, CVPR 2025
Paper · Project · intern project
|
|
|
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang,
Fadime Sener,
Angela Yao
CVPR, 2025
Paper · Code
|
|
|
Spatial and temporal beliefs for mistake detection in assembly tasks
Guodong Ding,
Fadime Sener,
Shugao Ma,
Angela Yao
CVIU, 2025
Paper
|
|
|
Long-Tail Temporal Action Segmentation with Group-wise Temporal Logit Adjustment
Zhanzhong Pang,
Fadime Sener,
Shrinivas Ramasubramanian,
Angela Yao
ECCV, 2024
Paper · Code
|
|
|
Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation
Zhanzhong Pang,
Fadime Sener,
Shrinivas Ramasubramanian,
Angela Yao
BMVC, 2024
Paper · Code
|
|
|
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil,
Dibyadip Chatterjee,
Fadime Sener,
Shugao Ma,
Angela Yao
ECCV, 2024
Paper · Code · Project
|
|
|
DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions
Sammy Christen,
Shreyas Hampali,
Fadime Sener,
Edoardo Remelli,
Tomas Hodan,
Eric Sauser,
Shugao Ma,
Bugra Tekin
SIGGRAPH Asia, 2024
Paper · Project
|
|
|
X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization
Anna Kukleva,
Fadime Sener,
Edoardo Remelli,
Bugra Tekin,
Eric Sauser,
Bernt Schiele,
Shugao Ma
CVPR, 2024
Paper · Code · intern project
|
|
|
Opening the Vocabulary of Egocentric Actions
Dibyadip Chatterjee,
Fadime Sener,
Shugao Ma,
Angela Yao
NeurIPS, 2023
Paper · Code · Project
|
|
|
Temporal Action Segmentation: An Analysis of Modern Techniques
Guodong Ding,
Fadime Sener,
Angela Yao
TPAMI, 2023
Paper · Survey
|
|
|
AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation
Takehiko Ohkawa,
Kun He,
Fadime Sener,
Tomas Hodan,
Luan Tran,
Cem Keskin
CVPR, 2023
EgoVis 2023/2024 Distinguished Paper Winner
Paper · Code · Project
|
|
|
Transferring Knowledge from Text to Video: Zero-Shot Anticipation for Procedural Actions
Fadime Sener,
Rishabh Saraf,
Angela Yao
TPAMI, 2022
Paper
|
|
|
Transformed ROIs for Capturing Visual Transformations in Videos
Abhinav Rai,
Fadime Sener,
Angela Yao
CVIU, 2022
Paper
|
|
|
Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities
Fadime Sener,
Dibyadip Chatterjee,
Daniel Shelepov,
Kun He,
Dipika Singhania,
Robert Wang,
Angela Yao
CVPR, 2022
EgoVis 2022/2023 Distinguished Paper Winner
Paper · Code · Project · Video
|
|
|
Technical report: Temporal aggregate representations
Fadime Sener,
Dibyadip Chatterjee,
Angela Yao
arXiv, EPIC-KITCHENS-Challenges, 2021
Paper · Code
|
|
|
Temporal aggregate representations for long-range video understanding
Fadime Sener,
Dipika Singhania,
Angela Yao
ECCV, 2020
1st in action anticipation, 2nd in action recognition, EPIC-KITCHENS 2020 Challenge
Paper · Code
|
|
|
Unsupervised Learning of Action Classes with Continuous Temporal Embedding
Anna Kukleva,
Hilde Kuehne,
Fadime Sener,
Jürgen Gall
CVPR, 2019
Paper · Code
|
|
|
Learning Style Compatibility for Furniture
Divyansh Aggarwal,
Elchin Valiyev,
Fadime Sener,
Angela Yao
GCPR, 2019
Paper
|
|
|
Zero-Shot Anticipation for Instructional Activities
Fadime Sener,
Angela Yao
ICCV, 2019
Paper
|
|
|
Unsupervised Learning and Segmentation of Complex Activities from Video
Fadime Sener,
Angela Yao
CVPR, 2018
Spotlight
Paper
|
|
|
DRAW: Deep networks for Recognizing styles of Artists Who illustrate children's books
Samet Hicsonmez,
Nermin Samet,
Fadime Sener,
Pinar Duygulu
ACM ICMR, 2017
Paper
|
|
|
Two-person interaction recognition via spatial multiple instance embedding
Fadime Sener,
Nazli Ikizler-Cinbis
JVCIR, 2015
Paper
|
|
|
Ensemble of multiple instance classifiers for image re-ranking
Fadime Sener,
Nazli Ikizler-Cinbis
IMAVIS, 2014
Paper
|
|
|
Identification of illustrators
Fadime Sener,
Nermin Samet,
Pinar Duygulu
ECCV'W, 2012
Paper
|
|
|
On recognizing actions in still images via multiple features
Fadime Sener,
Nazli Ikizler-Cinbis
ECCV'W, 2012
Best poster, ENS/INRIA Visual Recognition and Machine Learning Summer School, 2013
Paper
|
|