MuG: A Multimodal Classification Benchmark on Game Data with Tabular, Textual, and Visual Fields
Multimodal learning has attracted the interest of the machine learning community because of its potential in a wide variety of applications. To help realize this potential, we propose MuG, a multimodal benchmark of eight datasets that allows researchers to test the multimodal perception capabilities of their models. The datasets are collected from four different genres of games and cover tabular, textual, and visual modalities. We conduct a multi-aspect data analysis to provide insights into the benchmark, including label balance ratios, percentages of missing features, distributions of data within each modality, and correlations between labels and input modalities. We further present experimental results obtained with several state-of-the-art unimodal and multimodal classifiers, which demonstrate that the benchmark is challenging and multimodal-dependent. MuG is released at https://github.com/lujiaying/MUG-Bench with the data, documents, tutorials, and implemented baselines. Extensions of MuG are welcome to facilitate progress in research on multimodal learning problems.