pt.tumba.cage

Package pt.tumba.cage

This package implements a utility for extracting named entities from text through lexicons and a variety of orthographic and contextual features.


Class Summary
Cage	Extracting named entities (names, places, dates, and other words and phrases that establish the meaning of a body of text) is critical to software systems that process large amounts of unstructured data coming from sources such as email, document files, and the Web.
DefaultWordFinder	A word finder for normal text documents, which searches text for sequences of words and text blocks. This class also defines common methods and behaviour for the various word-finding subclasses.
ExtractAbbrev	A simple algorithm for extracting abbreviations and their definitions from text.
MathEvaluator	A mathematical expression evaluator.
NamedEntity	A named entity recognized in the text.
Numex	Implements methods for recognizing numeric expressions in both Portuguese and English texts.
StringUtils	A collection of `String` handling utility methods.
TeXWordFinder	A word finder for TeX and LaTeX documents, which searches text for sequences of letters but ignores commands, environments, and math environments.
XMLWordFinder	A word finder for XML documents, which searches text for sequences of letters, but ignores tags.
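
The idea behind a tag-skipping word finder such as XMLWordFinder can be illustrated with a minimal sketch. The class and method names below (TagSkippingWordFinderSketch, findWords) are hypothetical and are not part of pt.tumba.cage; the sketch only shows one plausible way to collect sequences of letters while ignoring anything between `<` and `>`. It makes simplifying assumptions: it does not handle comments, CDATA sections, entities, or a `>` inside an attribute value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the pt.tumba.cage implementation. */
public class TagSkippingWordFinderSketch {

    /** Returns the maximal runs of letters found outside of <...> tags. */
    public static List<String> findWords(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean insideTag = false;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (insideTag) {
                // Skip everything until the tag closes.
                if (c == '>') {
                    insideTag = false;
                }
                continue;
            }
            if (c == '<') {
                // A tag starts: it also terminates any word in progress.
                insideTag = true;
                flush(current, words);
            } else if (Character.isLetter(c)) {
                current.append(c);
            } else {
                // Any non-letter character ends the current word.
                flush(current, words);
            }
        }
        flush(current, words);
        return words;
    }

    /** Moves the word being built, if any, into the result list. */
    private static void flush(StringBuilder current, List<String> words) {
        if (current.length() > 0) {
            words.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        // Prints: [Named, entity, extraction]
        System.out.println(findWords("<p>Named <b>entity</b> extraction</p>"));
    }
}
```

A single pass with one boolean flag keeps the sketch short. A finder for TeX input would presumably follow the same pattern, with different rules for what counts as markup to skip.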