Learning to Remember: Exploring Multimodal Memory Mechanisms in Long Video Understanding | Reading Group
Learning to Remember: Exploring Multimodal Memory Mechanisms in Long Video Understanding
keywords: Memory, Long Video Understanding, VLA
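This paper-sharing session covers the use of memory modules in long video understanding, a topic that closely matches my current research direction.
The full slides (PPT) are below:
References: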
1. [CVPR 2024] "MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding"
https://arxiv.org/pdf/2404.05726
2. [NeurIPS 2025] "VideoLucy: Deep Memory Backtracking for Long Video Understanding"
https://arxiv.org/abs/2510.12422
3. [arXiv 2025] "MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation"
https://arxiv.org/abs/2508.19236