DocOIE: A Document-level Context-Aware Dataset for OpenIE

Kuicai Dong; Yilin Zhao; Aixin Sun; Jung-Jae Kim; Xiaoli Li

Conference proceeding

FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, pp.2377-2389

01/01/2021

Abstract

Computer Science

Computer Science, Artificial Intelligence

Computer Science, Theory & Methods

Science & Technology

Technology

Open Information Extraction (OpenIE) aims to extract structured relational tuples (subject, relation, object) from sentences, and plays a critical role in many NLP applications. Existing solutions perform extraction at the sentence level, without referring to any additional contextual information. In reality, however, a sentence typically exists as part of a document rather than standalone; we often need to access relevant contextual information around the sentence before we can accurately interpret it. As there is no document-level context-aware OpenIE dataset available, we manually annotate 800 sentences from 80 documents in two domains (Healthcare and Transportation) to form a DocOIE dataset for evaluation. In addition, we propose DocIE, a document-level context-aware OpenIE model. Our experimental results demonstrate that incorporating document-level context is helpful in improving OpenIE performance. Both the DocOIE dataset and DocIE model are available online.

Details

Title: DocOIE: A Document-level Context-Aware Dataset for OpenIE
Creators - without role: Kuicai Dong - Supreme Council Of Health
Yilin Zhao - Supreme Council Of Health
Aixin Sun - Supreme Council Of Health
Jung-Jae Kim - ASTAR, Inst Infocomm Res, Singapore, Singapore
Xiaoli Li - Supreme Council Of Health
Contributors - without role: F Xia
C Zong
W Li
R Navigli
Publication Details: FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, pp.2377-2389
Publisher: Assoc Computational Linguistics-Acl
Number of pages: 13
Grant note: A19E2b0098; A18A2b0046 / Agency for Science, Technology and Research (A*STAR) under its AME Programmatic Funding Scheme; Agency for Science Technology & Research (A*STAR)
Identifiers: 9911017509846
Academic Unit: ISTD Pillar
Language: English
Resource Type: Conference proceeding
