Evaluation of a NERD process
People and organisations, places and events - so-called entities - play an important role when searching for documents and evaluating their relevance. In the last few years, technical processes have been developed which can automatically detect mentions of these entities in texts and assign them to descriptive data records. The resulting information makes it possible to offer improved search options for text documents. These processes are called "Named Entity Recognition and Disambiguation" processes (or "NERD" for short). The AIDA software of the Max-Planck-Institut für Informatik (MPI) in Saarbrücken is one such NERD process.
The aim of the "Evaluation of a NERD Process" project was to assess the potential of using such a process for searches in the Nationalbibliografie (national bibliography). The first step involved processing full texts from the stocks of the German National Library using the MPI's AIDA software. In the second step, prototypes were to be developed which make use of the entity information obtained.
Step 1: Introduction of the NERD process
Recognising and disambiguating entities in full texts from the stocks of the German National Library represents a major challenge, as the texts can include highly heterogeneous and in some cases very extensive content. Adjustments and extensions to the AIDA software were therefore to be undertaken as part of the project. The intention was to integrate not only these technical aspects but also the comprehensive entity information which already existed in the Integrated Authority File (GND) into the NERD process.
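To make the two tasks concrete, the following is a minimal Python sketch of recognition (finding mentions of known names) and disambiguation (choosing among several records that share a name). It is a toy illustration only and does not reflect how the AIDA software works internally. The record structure, the identifiers, the keyword sets and all function names are assumptions invented for this example; they are not GND data or part of any real API.

```python
import re

# Hypothetical authority records. Identifiers and keywords are invented
# for illustration and are not taken from the GND.
RECORDS = [
    {"id": "example-1", "name": "Paris", "type": "place",
     "keywords": {"france", "city", "seine"}},
    {"id": "example-2", "name": "Paris", "type": "person",
     "keywords": {"troy", "helen", "myth"}},
    {"id": "example-3", "name": "Saarbrücken", "type": "place",
     "keywords": {"saarland", "city"}},
]


def tokenize(text):
    """Split a text into lower-cased word tokens."""
    return re.findall(r"\w+", text.lower())


def recognise(text, records):
    """Recognition: return the known names that occur as whole words."""
    names = {record["name"] for record in records}
    return [name for name in sorted(names)
            if re.search(rf"\b{re.escape(name)}\b", text)]


def disambiguate(mention, text, records):
    """Disambiguation: pick the candidate record whose keywords
    overlap most with the words of the surrounding text."""
    context = set(tokenize(text))
    candidates = [r for r in records if r["name"] == mention]
    return max(candidates, key=lambda r: len(r["keywords"] & context))


def link_entities(text, records=RECORDS):
    """Map every recognised mention to the identifier of one record."""
    return {mention: disambiguate(mention, text, records)["id"]
            for mention in recognise(text, records)}
```

With these sample records, `link_entities("Paris abducted Helen and fled to Troy.")` assigns the mention "Paris" to the record `example-2`, while a sentence about the Seine and France leads to `example-1`. A real process has to cope with far larger vocabularies, name variants and long, heterogeneous texts, which is why this sketch should be read only as an explanation of the terms.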
Step 2: Use of entity information
A core aspect of the "Evaluation of a NERD Process" project was to explore possible applications for entity information in searches. The primary objective of the project was to turn real usage scenarios for entity information into publicly accessible software prototypes. The following search aspects were evaluated in prototypes:
April 2013 - May 2014
Last update: 18.09.2014