Abstract
Open Information Extraction (OpenIE) aims to extract structured relational tuples (subject, relation, object) from sentences, and plays a critical role in many NLP applications. Existing solutions perform extraction at sentence level, without referring to any additional contextual information. In reality, however, a sentence typically exists as part of a document rather than standalone; we often need to access relevant contextual information around the sentence before we can accurately interpret it. As there is no document-level context-aware OpenIE dataset available, we manually annotate 800 sentences from 80 documents in two domains (Healthcare and Transportation) to form a DocOIE dataset for evaluation. In addition, we propose DocIE, a document-level context-aware OpenIE model. Our experimental results demonstrate that incorporating document-level context is helpful in improving OpenIE performance. Both the DocOIE dataset and DocIE model are available online.