Preserving emails poses particular challenges for their future discovery as alternative historical sources. Emails record communications between individuals and, when viewed as an organisation-wide collection, contain a wealth of information. Existing search tools can extract named entities and support keyword searches, but they are less effective at extracting patterns and contextual information across multiple custodians. To address this, we present EMCODIST, a discovery tool that searches for contextual information across emails using attention-based Natural Language Processing (NLP) models. EMCODIST aims to guide end-users in personalising their searches towards a concept. In this paper, we define the ‘context’ of an email in a way that is also suitable for object-oriented computational modelling. The tool is evaluated on the relevance of the emails it extracts.
|Number of pages||2290|
|Journal||IEEE International Conference on Big Data (Big Data)|
|Publication status||Published - 2021|