Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input

Bibliographic Details
Published in: ECCV (17th : 2022 : Tel Aviv; Online) Computer vision – ECCV 2022 ; Part 36
Main Author: Guo, Qingpei (Author)
Other Authors: Yao, Kaisheng (Author), Chu, Wei (Author)
Pages: 2022
Format: UnknownFormat
Language: English
Published: 2022
Title Year Author
GRIT: Faster and Better Image Captioning Transformer Using Dual Visual Features 2022 Nguyen, Van-Quang
Trace Controlled Text to Image Generation 2022 Yan, Kun
Video Question Answering with Iterative Video-Text Co-tokenization 2022 Piergiovanni, AJ
Learning Disentanglement with Decoupled Labels for Vision-Language Navigation 2022 Cheng, Wenhao
Selective Query-Guided Debiasing for Video Corpus Moment Retrieval 2022 Yoon, Sunjae
New Datasets and Models for Contextual Reasoning in Visual Dialog 2022 Zhang, Yifeng
FindIt: Generalized Localization with Natural Language Queries 2022 Kuo, Weicheng
FedVLN: Privacy-Preserving Federated Vision-and-Language Navigation 2022 Zhou, Kaiwen
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning 2022 Hessel, Jack
Speaker-Adaptive Lip Reading with User-Dependent Padding 2022 Kim, Minsu
Referring Object Manipulation of Natural Images with Conditional Classifier-Free Guidance 2022 Choi, Myungsub
TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation 2022 Dinh, Tan M.
ASSISTER: Assistive Navigation via Conditional Instruction Generation 2022 Huang, Zanming
Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding 2022 Shi, Cheng
Contrastive Vision-Language Pre-training with Limited Resources 2022 Cui, Quan
Classification-Regression for Chart Comprehension 2022 Levy, Matan
AssistQ: Affordance-Centric Question-Driven Task Completion for Egocentric Assistant 2022 Wong, Benita
Multimodal Transformer with Variable-Length Memory for Vision-and-Language Navigation 2022 Lin, Chuang
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels 2022 Ghiasi, Golnaz
NewsStories: Illustrating Articles with Visual Summaries 2022 Tan, Reuben