Learning to Remember: Exploring Multimodal Memory Mechanisms in Long Video Understanding | Reading Group

Learning to Remember: Exploring Multimodal Memory Mechanisms in Long Video Understanding

keywords: Memory, Long Video Understanding, VLA

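This paper-sharing session covers the application of memory modules to long video understanding, a topic that fits well with my current research direction.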

The full slide deck is below:

References:

1. [CVPR 2024] "MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding"
https://arxiv.org/pdf/2404.05726

2. [NeurIPS 2025] "VideoLucy: Deep Memory Backtracking for Long Video Understanding"
https://arxiv.org/abs/2510.12422

3. [arXiv 2025] "MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation"
https://arxiv.org/abs/2508.19236

https://fanchenlex.github.io/reandings/RG研一秋/

Author

Wenzhuo Li

Posted on

2026-01-04

Updated on

2026-01-19
