Image-text retrieval requires the system to bridge the heterogeneous gap
...
In recent years, AI-generated music has made significant progress, with
...
In this paper, we develop an approximation scheme for solving bilevel
pr...
Attention-based neural networks, such as Transformers, have become ubiqu...
A Stackelberg congestion game (SCG) is a bilevel program in which a lead...
Knowledge-based visual question answering requires the ability of associ...
Encrypted traffic classification requires discriminative and robust traf...
External knowledge is commonly used to help machines answer
que...
Question answering systems usually use keyword searches to retrieve pote...
Pre-trained language models like BERT achieve superior performance in
v...
In a social system, the self-interest of agents can be detrimental to th...
We propose a novel Bi-directional Cognitive Knowledge Framework (BCKF) f...
Scene graphs are semantic abstractions of images that encourage visual
un...
To conduct a radiomics or deep learning research experiment, the radiolo...
Knowledge-based Visual Question Answering (KVQA) requires external knowl...
Visual dialogue is a challenging task that requires extracting implicit
inf...
The Visual Dialogue task requires an agent to engage in a conversation w...
Fact-based Visual Question Answering (FVQA) requires external knowledge
...
Fact-based Visual Question Answering (FVQA) requires external knowledge
...
We propose an optimization algorithm to compute the optimal sensor locat...
Unlike the Visual Question Answering task, which requires answering on...
Recently, resilience has been increasingly used as a concept for understanding...
Visual relation reasoning is a central component in recent cross-modal
a...
Blind image deconvolution is the problem of recovering the latent image ...
We initiate a study of the classification of approximation complexity of...
Feature modeling of different modalities is a basic problem in current
r...
Feature representation of different modalities is the main focus of curr...
Cross-modal information retrieval aims to find heterogeneous data of var...