Fadime Sener

I am a Senior AI Research Scientist at Meta Superintelligence Labs. Previously, I was at Meta Reality Labs, the National University of Singapore (NUS), and the University of Bonn, where I received my PhD with Angela Yao and Juergen Gall. I hold MSc and BSc degrees in Computer Engineering from Bilkent University and Hacettepe University.

My research focuses on computer vision and multimodal learning. During my PhD, I worked on unsupervised video understanding through vision and language. At Meta, I worked on online action recognition and streaming video understanding with multimodal LLMs, and on multimedia generation and autoregressive video generation. Recently, I've been exploring streaming video recognition with real-time edge models as tools in agentic pipelines, focused on accurate understanding and timely, proactive response.

Email / Google Scholar / LinkedIn

News

06/26 Organizing the second Vision-based Assistants in the Real-World workshop at CVPR 2026.
07/25 Started working at Meta Superintelligence Labs as a Senior AI Research Scientist.
06/25 ProVideLLM ranked 2nd in EgoExo4D Fine-grained Keystep Recognition Challenge at CVPR 2025.
06/25 Organized Vision-based Assistants in the Real-World workshop at CVPR 2025.
06/25 AssemblyHands was awarded the 2023/2024 Distinguished Paper Award at the EgoVis workshop.
06/24 Assembly101 was awarded the 2022/2023 Distinguished Paper Award at the EgoVis workshop.
10/22 Organized ATLAS: AcTion Localization And Segmentation in Video tutorial at ECCV 2022.
10/22 Organized Human Body, Hands, and Activities from Egocentric and Multi-view Cameras (HBHA) workshop at ECCV 2022.
03/22 Released Assembly101 at CVPR 2022.
07/21 Defended my PhD! 🎉 👩‍🎓
06/21 Outstanding reviewer at CVPR 2021.
10/20 Started working at Meta Reality Labs as an AI Research Scientist.
06/20 1st in action anticipation and 2nd in action recognition at EPIC-KITCHENS 2020 Challenge with Temporal aggregate representations for long-range video understanding.
01/20 Started working as a Research Assistant at the National University of Singapore (NUS).
07/16 Attended International Computer Vision Summer School (ICVSS) 2016.
10/15 Started working as a Research Assistant at the University of Bonn.
06/13 Best poster award at INRIA CVML Summer School 2013 with On recognizing actions in still images via multiple features.

Publications

2026
	Decouple and Cache: KV Cache Construction for Streaming Video Understanding Zhanzhong Pang, Dibyadip Chatterjee, Fadime Sener, Angela Yao ICML, 2026 Paper · Code
	On Discriminative vs. Generative Classifiers: Rethinking MLLMs for Action Understanding Zhanzhong Pang, Dibyadip Chatterjee, Fadime Sener, Angela Yao ICLR, 2026 Paper · Code
	Don't Pause! Every prediction matters in a streaming video Dibyadip Chatterjee, Zhanzhong Pang, Fadime Sener, Yale Song, Angela Yao Under review, NeurIPS 2026 Paper · Project
	OSMO: Open-vocabulary Self-eMOtion Tracking CVPR, 2026
	TrustCLIP: Learning Private Visual Features via Adversarial Reconstruction Under review, ECCV 2026
	SneakPeek: Future-Guided Instructional Streaming Video Generation Cheeun Hong, German Barquero, Fadime Sener, Markos Georgopoulos, Edgar Schönfeld, Stefan Popov, Yuming Du, Oscar Mañas, Albert Pumarola Under review, ECCV 2026 Paper
	PALM: A Dataset and Baseline for Learning Multi-subject Hand Prior Zicong Fan, Edoardo Remelli, David Dimond, Fadime Sener, Liuhao Ge, Bugra Tekin, Cem Keskin, Shreyas Hampali 3DV, 2026 Paper · Code
2025
	Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding Dibyadip Chatterjee, Edoardo Remelli, Yale Song, Bugra Tekin, Abhay Mittal, Bharat Lal Bhatnagar, Necati Cihan Camgoz, Shreyas Hampali, Eric Sauser, Shugao Ma, Angela Yao, Fadime Sener ICCV, 2025 2nd, EgoExo4D Fine-grained Keystep Recognition Challenge, CVPR 2025 Paper · Project · intern project
	Context-Enhanced Memory-Refined Transformer for Online Action Detection Zhanzhong Pang, Fadime Sener, Angela Yao CVPR, 2025 Paper · Code
	Spatial and temporal beliefs for mistake detection in assembly tasks Guodong Ding, Fadime Sener, Shugao Ma, Angela Yao CVIU, 2025 Paper
2024
	Long-Tail Temporal Action Segmentation with Group-wise Temporal Logit Adjustment Zhanzhong Pang, Fadime Sener, Shrinivas Ramasubramanian, Angela Yao ECCV, 2024 Paper · Code
	Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation Zhanzhong Pang, Fadime Sener, Shrinivas Ramasubramanian, Angela Yao BMVC, 2024 Paper · Code
	On the Utility of 3D Hand Poses for Action Recognition Md Salman Shamil, Dibyadip Chatterjee, Fadime Sener, Shugao Ma, Angela Yao ECCV, 2024 Paper · Code · Project
	DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions Sammy Christen, Shreyas Hampali, Fadime Sener, Edoardo Remelli, Tomas Hodan, Eric Sauser, Shugao Ma, Bugra Tekin SIGGRAPH Asia, 2024 Paper · Project
	X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization Anna Kukleva, Fadime Sener, Edoardo Remelli, Bugra Tekin, Eric Sauser, Bernt Schiele, Shugao Ma CVPR, 2024 Paper · Code · intern project
2023
	Opening the Vocabulary of Egocentric Actions Dibyadip Chatterjee, Fadime Sener, Shugao Ma, Angela Yao NeurIPS, 2023 Paper · Code · Project
	Temporal Action Segmentation: An Analysis of Modern Techniques Guodong Ding, Fadime Sener, Angela Yao TPAMI, 2023 Paper · Survey
	AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation Takehiko Ohkawa, Kun He, Fadime Sener, Tomas Hodan, Luan Tran, Cem Keskin CVPR, 2023 EgoVis 2023/2024 Distinguished Paper Winner Paper · Code · Project
2022
	Transferring Knowledge from Text to Video: Zero-Shot Anticipation for Procedural Actions Fadime Sener, Rishabh Saraf, Angela Yao TPAMI, 2022 Paper
	Transformed ROIs for Capturing Visual Transformations in Videos Abhinav Rai, Fadime Sener, Angela Yao CVIU, 2022 Paper
	Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities Fadime Sener, Dibyadip Chatterjee, Daniel Shelepov, Kun He, Dipika Singhania, Robert Wang, Angela Yao CVPR, 2022 EgoVis 2022/2023 Distinguished Paper Winner Paper · Code · Project · Video
2021
	Technical report: Temporal aggregate representations Fadime Sener, Dibyadip Chatterjee, Angela Yao arXiv, EPIC-KITCHENS-Challenges, 2021 Paper · Code
2020
	Temporal aggregate representations for long-range video understanding Fadime Sener, Dipika Singhania, Angela Yao ECCV, 2020 1st in action anticipation, 2nd in action recognition, EPIC-KITCHENS 2020 Challenge Paper · Code
2019
	Unsupervised Learning of Action Classes with Continuous Temporal Embedding Anna Kukleva, Hilde Kuehne, Fadime Sener, Jürgen Gall CVPR, 2019 Paper · Code
	Learning Style Compatibility for Furniture Divyansh Aggarwal, Elchin Valiyev, Fadime Sener, Angela Yao GCPR, 2019 Paper
	Zero-Shot Anticipation for Instructional Activities Fadime Sener, Angela Yao ICCV, 2019 Paper
2018
	Unsupervised Learning and Segmentation of Complex Activities from Video Fadime Sener, Angela Yao CVPR, 2018 Spotlight Paper
2017
	DRAW: Deep networks for Recognizing styles of Artists Who illustrate children's books Samet Hicsonmez, Nermin Samet, Fadime Sener, Pinar Duygulu ACM ICMR, 2017 Paper
2015
	Two-person interaction recognition via spatial multiple instance embedding Fadime Sener, Nazli Ikizler-Cinbis JVCIR, 2015 Paper
2014
	Ensemble of multiple instance classifiers for image re-ranking Fadime Sener, Nazli Ikizler-Cinbis IMAVIS, 2014 Paper
2012
	Identification of illustrators Fadime Sener, Nermin Samet, Pinar Duygulu ECCV'W, 2012 Paper
	On recognizing actions in still images via multiple features Fadime Sener, Nazli Ikizler-Cinbis ECCV'W, 2012 Best poster, ENS/INRIA Visual Recognition and Machine Learning Summer School, 2013 Paper