KIM – a semantic platform for information extraction and retrieval
The KIM platform provides a novel Knowledge and Information Management framework and services for automatic semantic annotation, indexing, and retrieval of documents. It provides a mature and semantically enabled infrastructure for scalable and customizable information extraction (IE), as well as annotation and document management, based on GATE.¹ Our understanding is that a system for semantic annotation should be based upon a simple model of real-world entity concepts, complemented with quasi-exhaustive instance knowledge. To ensure efficiency, easy sharing, and reusability of the metadata, we introduce an upper-level ontology. Based on the ontology, a large-scale instance base of entity descriptions is maintained. The knowledge resources involved are handled using state-of-the-art Semantic Web technology and standards, including RDF(S) repositories, ontology middleware, and reasoning. From a technical point of view, KIM-based applications can use the platform for automatic semantic annotation, for content retrieval based on semantic queries, and for semantic repository access. As a framework, KIM also allows various IE modules, semantic repositories, and information retrieval engines to be plugged into it. This paper presents the KIM platform, with an emphasis on its architecture, interfaces, front-ends, and other technical issues.
(Received July 1 2003)
(Revised February 24 2004)
1 General Architecture for Text Engineering (GATE) (http://gate.ac.uk), a leading NLP and IE platform developed at the University of Sheffield.