-
Description: UIMA-connectors mainly aims to bridge text formats and the UIMA data structure, namely the CAS. By comparison, the Tika project aims at detecting and extracting metadata and structured text content from documents of various MIME types. UIMA-connectors is more focused on mapping between text formats and the CAS, in both directions. It handles formats such as eXtensible Markup Language (XML), Comma-Separated Values (CSV), whitespace-separated tokens with newline-separated sentences… as well as applications of these formats such as the Message Understanding Conferences (MUC)…
License: Apache 2.0
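
To make the idea of mapping a text format to CAS-style annotations concrete, here is a minimal sketch for the "whitespace token and newline sentence" format mentioned in the description above. It is not the UIMA-connectors code and does not use the UIMA API: the class `WhitespaceNewlineSpans`, the record `Span` and the type names "Sentence" and "Token" are hypothetical, chosen only for illustration. It assumes one sentence per line and tokens separated by whitespace, and it produces begin/end character offsets of the kind a CAS annotation carries.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not part of UIMA-connectors and independent of the UIMA API. */
final class WhitespaceNewlineSpans {

    /** A typed character span, analogous to the begin/end offsets of a CAS annotation. */
    record Span(String type, int begin, int end) {
        String coveredText(String doc) {
            return doc.substring(begin, end);
        }
    }

    /** Assumes one sentence per line and whitespace-separated tokens. */
    static List<Span> parse(String doc) {
        List<Span> spans = new ArrayList<>();
        int lineStart = 0;
        while (lineStart < doc.length()) {
            int lineEnd = doc.indexOf('\n', lineStart);
            if (lineEnd < 0) {
                lineEnd = doc.length();
            }
            List<Span> tokens = new ArrayList<>();
            int i = lineStart;
            while (i < lineEnd) {
                // Skip the whitespace that separates tokens.
                while (i < lineEnd && Character.isWhitespace(doc.charAt(i))) {
                    i++;
                }
                int begin = i;
                // Consume one token.
                while (i < lineEnd && !Character.isWhitespace(doc.charAt(i))) {
                    i++;
                }
                if (i > begin) {
                    tokens.add(new Span("Token", begin, i));
                }
            }
            if (!tokens.isEmpty()) {
                // The sentence covers the first to the last token of the line.
                int sentenceBegin = tokens.get(0).begin();
                int sentenceEnd = tokens.get(tokens.size() - 1).end();
                spans.add(new Span("Sentence", sentenceBegin, sentenceEnd));
                spans.addAll(tokens);
            }
            lineStart = lineEnd + 1;
        }
        return spans;
    }

    public static void main(String[] args) {
        String doc = "The cat sleeps\nIt purrs";
        for (Span span : parse(doc)) {
            System.out.println(span + " -> " + span.coveredText(doc));
        }
    }
}
```

In a real connector, each span would presumably become an annotation added to the CAS; the reverse mapping would write the covered text of the tokens back out, one sentence per line.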
-
Description: For now, UIMA Text Segmenter is a UIMA wrapper for the Java implementations of the C99 and TextTiling segmentation algorithms, written by Freddy Choi and used in the experiments described in choi_2000naacl (see wiki pages). These algorithms segment text at the discourse level, relying essentially on lexical cohesion measures.
License: Apache 2.0
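
As an illustration of what a lexical cohesion measure can look like, the sketch below computes the cosine similarity between the term-frequency vectors of two adjacent blocks of text; a low score between neighbouring blocks suggests a possible segment boundary. This is neither Choi's implementation nor the wrapper's API, and it omits everything else the two algorithms do (stop-word removal, smoothing, ranking, boundary selection). The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of one lexical cohesion measure; not the C99 or TextTiling code. */
final class LexicalCohesionSketch {

    /** Counts lower-cased word occurrences in a block of text. */
    static Map<String, Integer> termCounts(String block) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : block.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Cosine similarity between two term-frequency vectors; 0.0 if either is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            int countA = entry.getValue();
            dot += countA * b.getOrDefault(entry.getKey(), 0);
            normA += countA * countA;
        }
        for (int countB : b.values()) {
            normB += countB * countB;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Cohesion score between two adjacent blocks of text. */
    static double cohesion(String left, String right) {
        return cosine(termCounts(left), termCounts(right));
    }

    public static void main(String[] args) {
        System.out.println(cohesion("cats chase mice", "mice fear cats"));
        System.out.println(cohesion("cats chase mice", "stocks fell sharply"));
    }
}
```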
-
Description: This project offers a software solution to create (we also say _map_), update or delete annotations according to rules expressed over patterns of other annotations. One major use case is to ensure type system interoperability by offering mechanisms to transform annotations into other ones. This is a rule-based system. The rules are expressed in the W3C XPath language. The processor is based on the Apache JXPath engine, and independence from any particular type system is made possible by the Java java.lang.reflect API.
License: Apache 2.0
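
The role of reflection in staying independent of the type system can be sketched as follows. This is not the project's code: the classes `PosToken` and `TaggedWord` are hypothetical stand-ins for annotation types from two unrelated type systems, and `readFeature` is an illustrative helper that assumes features are exposed through JavaBean-style getters. The XPath rule evaluation done by JXPath is not shown.

```java
import java.lang.reflect.Method;

/** Hypothetical sketch; not the project's code and independent of the UIMA and JXPath APIs. */
final class ReflectiveFeatureAccess {

    /** Hypothetical annotation type from a first type system. */
    public static final class PosToken {
        private final String pos;
        public PosToken(String pos) { this.pos = pos; }
        public String getPos() { return pos; }
    }

    /** Hypothetical annotation type from a second, unrelated type system. */
    public static final class TaggedWord {
        private final String tag;
        public TaggedWord(String tag) { this.tag = tag; }
        public String getTag() { return tag; }
    }

    /**
     * Reads a feature by name through its public getter, without knowing
     * the annotation class at compile time.
     */
    static Object readFeature(Object annotation, String feature) {
        String getter = "get" + Character.toUpperCase(feature.charAt(0)) + feature.substring(1);
        try {
            Method method = annotation.getClass().getMethod(getter);
            return method.invoke(annotation);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "No readable feature '" + feature + "' on " + annotation.getClass().getName(), e);
        }
    }

    public static void main(String[] args) {
        // The same call works on both types; only the feature name differs.
        System.out.println(readFeature(new PosToken("NN"), "pos"));
        System.out.println(readFeature(new TaggedWord("VB"), "tag"));
    }
}
```

A rule that maps one annotation type to another could then be written in terms of feature names alone, which is what lets a single processor work across type systems.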
-
Description: Runtime Exec plugin for LinguaStream (extends the ways LinguaStream (LS) can interact with the operating system and with software available on the command line, such as cat, echo, find, env, grep, netcat, ssh, perl…)
System: Tested on Ubuntu
License:
Last release: 071106