X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks

Bibliographic Details
Published in: ECCV (17th : 2022 : Tel Aviv; Online) Computer vision – ECCV 2022 ; Part 36
Main Author: Cai, Zhaowei (Author)
Other Authors: Kwon, Gukyeong (Author), Ravichandran, Avinash (Author), Bas, Erhan (Author), Tu, Zhuowen (Author), Bhotika, Rahul (Author), Soatto, Stefano (Author)
Pages:2022
Format: UnknownFormat
Language: English
Published: 2022
Title Year Author
Object-Centric Unsupervised Image Captioning 2022 Meng, Zihang
Learning Linguistic Association Towards Efficient Text-Video Retrieval 2022 Fang, Sheng
Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing 2022 Boecking, Benedikt
Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input 2022 Guo, Qingpei
Video Graph Transformer for Video Question Answering 2022 Xiao, Junbin
Rethinking Data Augmentation for Robust Visual Question Answering 2022 Chen, Long
Word-Level Fine-Grained Story Visualization 2022 Li, Bowen
Webly Supervised Concept Expansion for General Purpose Vision Models 2022 Kamath, Amita
Unifying Event Detection and Captioning as Sequence Generation via Pre-training 2022 Zhang, Qi
Fine-Grained Visual Entailment 2022 Thomas, Christopher
VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection 2022 Hong, Joanna
Language-Driven Artistic Style Transfer 2022 Fu, Tsu-Jui
GRIT: Faster and Better Image Captioning Transformer Using Dual Visual Features 2022 Nguyen, Van-Quang
Trace Controlled Text to Image Generation 2022 Yan, Kun
Video Question Answering with Iterative Video-Text Co-tokenization 2022 Piergiovanni, AJ
Learning Disentanglement with Decoupled Labels for Vision-Language Navigation 2022 Cheng, Wenhao
Selective Query-Guided Debiasing for Video Corpus Moment Retrieval 2022 Yoon, Sunjae
New Datasets and Models for Contextual Reasoning in Visual Dialog 2022 Zhang, Yifeng
FindIt: Generalized Localization with Natural Language Queries 2022 Kuo, Weicheng
FedVLN: Privacy-Preserving Federated Vision-and-Language Navigation 2022 Zhou, Kaiwen