Learning to capture long-range relations is fundamental to image and video recognition. Existing CNN models generally rely on increasing depth to model such relations, which is highly inefficient. In this seminar, we will discuss the “double attention block”, a novel (or not so novel) component from Facebook AI Research that aggregates and propagates informative global features from the entire spatio-temporal space of input images/videos. We will also discuss how it differs from SENet’s squeeze-and-excitation block.
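To make the discussion concrete, here is a minimal NumPy sketch of the block’s two steps (gather global descriptors, then distribute them back to every position). The weight matrices `wa`, `wb`, `wv`, `wz` stand in for the paper’s 1×1 convolutions; names and shapes are illustrative assumptions, not the authors’ code.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def double_attention(x, wa, wb, wv, wz):
    """Sketch of a double attention block on flattened features.

    x  : (C, N) input features, N = H * W spatial positions
    wa : (Cm, C) projection to feature maps
    wb : (Cn, C) projection to gather-attention maps
    wv : (Cn, C) projection to distribute-attention maps
    wz : (C, Cm) projection back to input channels
    """
    a = wa @ x                       # feature maps, (Cm, N)
    b = softmax(wb @ x, axis=1)      # attention over spatial positions, (Cn, N)
    g = a @ b.T                      # step 1: gather global descriptors, (Cm, Cn)
    v = softmax(wv @ x, axis=0)      # per-position weights over descriptors, (Cn, N)
    z = g @ v                        # step 2: distribute descriptors, (Cm, N)
    return wz @ z                    # project back to input channels, (C, N)
```

Note that every output position receives a weighted mix of the same small set of global descriptors, which is how the block captures long-range relations in a single layer rather than through stacked convolutions.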
As a warm-up, since this talk kicks off yet another season of our seminars, we will also briefly discuss how to read a paper.
Speaker: Rauf Kurbanov.
Presentation language: Russian.
Date and time: January 23rd, 18:30-20:00.
Location: Times, room 405.
Chen, Yunpeng, et al. “A²-Nets: Double Attention Networks.” Advances in Neural Information Processing Systems. 2018.
Hu, Jie, Li Shen, and Gang Sun. “Squeeze-and-Excitation Networks.” arXiv preprint arXiv:1709.01507 (2017).
Keshav, S. “How to Read a Paper.” ACM SIGCOMM Computer Communication Review 37.3 (2007): 83-84.