Multimodal Relation Extraction with Efficient Graph Alignment

Changmeng Zheng, Junhao Feng, Ze Fu, Yi Cai*, Qing Li, Tao Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference paper, peer-review

80 Citations (Scopus)

Abstract

Relation extraction (RE) is a fundamental process in constructing knowledge graphs. However, previous relation extraction methods suffer a sharp performance decline on short and noisy social media texts due to a lack of context. Fortunately, the related visual content (objects and their relations) in social media posts can supplement the missing semantics and help to extract relations precisely. We introduce multimodal relation extraction (MRE), a task that identifies textual relations with the help of visual clues. To tackle this problem, we present a large-scale dataset that contains 15000+ sentences with 23 pre-defined relation categories. Considering that the visual relations among objects correspond to textual relations, we develop a dual graph alignment method to capture this correlation. Experimental results demonstrate that visual content helps to identify relations more precisely than text-only baselines. In addition, our alignment method can find the correlations between vision and language, resulting in better performance. Our dataset and code are available at https://github.com/thecharm/Mega.

Original language: English
Title of host publication: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 5298-5306
Number of pages: 9
ISBN (Electronic): 9781450386517
Publication status: Published - 17 Oct 2021
Event: 29th ACM International Conference on Multimedia, MM 2021 - Virtual, Online, China
Duration: 20 Oct 2021 - 24 Oct 2021

Publication series

Name: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

Conference

Conference: 29th ACM International Conference on Multimedia, MM 2021
Country/Territory: China
City: Virtual, Online
Period: 20/10/2021 - 24/10/2021

Keywords

  • graph alignment
  • multimodal dataset
  • multimodal relation extraction
