Document classifier

Gaika is a tool that determines the subject matter of written documents by categorizing them.

Gaika extracts the terms from a text and compares them against a dictionary or lexicon whose entries are classified by subject matter. From this comparison, it determines whether the text or document in question relates to a specific subject (law, biology, etc.).
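
The lookup described above can be illustrated with a minimal sketch. It is not Gaika's actual implementation: the lexicon entries, the function names (extract_terms, subject_scores, classify) and the min_hits threshold are all hypothetical, and term extraction is reduced to simple lowercase tokenization.

```python
import re
from collections import Counter

# Hypothetical lexicon mapping a term to its subject label.
# These entries are illustrative only, not Gaika's real dictionary.
LEXICON = {
    "contract": "law",
    "plaintiff": "law",
    "statute": "law",
    "cell": "biology",
    "enzyme": "biology",
    "protein": "biology",
}


def extract_terms(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def subject_scores(text, lexicon=LEXICON):
    """Count how many extracted terms fall under each subject in the lexicon."""
    scores = Counter()
    for term in extract_terms(text):
        subject = lexicon.get(term)
        if subject is not None:
            scores[subject] += 1
    return scores


def classify(text, lexicon=LEXICON, min_hits=1):
    """Return the subject with the most matching terms, or None if no
    subject reaches min_hits (an assumed threshold)."""
    scores = subject_scores(text, lexicon)
    if not scores:
        return None
    subject, hits = scores.most_common(1)[0]
    return subject if hits >= min_hits else None
```

For example, classify("The plaintiff signed a contract.") would return "law" under this toy lexicon, while a text containing none of the listed terms would return None. A real lexicon would be far larger and would likely weight terms rather than count them equally.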

This classification resource is a fundamental part of many natural language processing (NLP) tasks: information extraction (IE), information retrieval (IR), document classification/categorization, automatic summarization of documents, etc.