Tags : Browse Projects

Natural Language Toolkit (NLTK)

  Analyzed 11 months ago

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data, and documentation for research and development in natural language processing. It supports dozens of NLP tasks and has distributions for Windows, Mac OS X, and Linux.

234K lines of code

42 current contributors

almost 1 year since last commit

45 users on Open Hub

Activity Not Available
5.0
 

Apache UIMA Java SDK

Claimed by Apache Software Foundation

Analyzed 11 months ago

Apache UIMA is an Apache-licensed open source implementation of the UIMA specification (that specification is, in turn, being developed concurrently by a technical committee within OASIS, a standards organization). We invite and encourage you to participate in both the implementation and specification efforts. UIMA is a component framework for analysing unstructured content such as text, audio and video. It comprises an SDK and tooling for composing and running analytic components written in Java and C++, with some support for Perl, Python and TCL.

371K lines of code

3 current contributors

about 1 year since last commit

19 users on Open Hub

Activity Not Available
5.0
 

Apache OpenNLP

Claimed by Apache Software Foundation

Analyzed 11 months ago

Apache OpenNLP is a Java machine learning toolkit for natural language processing (NLP).

157K lines of code

8 current contributors

11 months since last commit

12 users on Open Hub

Activity Not Available
5.0
 

TreeTagger for Java

  Analyzed 11 months ago

TreeTagger for Java is a Java wrapper around the popular TreeTagger package by Helmut Schmid. It was written with a focus on platform independence and easy integration into applications. It is written in Java 5 and has been tested on OS X, Ubuntu Linux, and Windows.

2.67K lines of code

0 current contributors

over 2 years since last commit

12 users on Open Hub

Activity Not Available
5.0
 

LanguageTool

  Analyzed 11 months ago

LanguageTool is an open source language checker for English, German, Polish, Dutch, and other languages. It is rule-based, i.e. it finds errors for which a rule is defined in an XML configuration file. Rules for more complicated errors can be written in Java.

1.24M lines of code

37 current contributors

11 months since last commit

11 users on Open Hub

Activity Not Available
4.66667
   

CMU Sphinx

  Analyzed 11 months ago

CMUSphinx represents Carnegie Mellon University's development of open source, large-vocabulary, speaker-independent continuous speech recognition engines. The distribution contains a library (libsphinx5) and some small examples that link against it.

486K lines of code

9 current contributors

12 months since last commit

7 users on Open Hub

Activity Not Available
4.33333
   
Licenses: No declared licenses

DKPro Core

  Analyzed 11 months ago

DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. Many powerful and state-of-the-art NLP components are already freely available in the NLP research community. New and improved components are being developed and released continuously. The components cover the whole range of NLP-related processing tasks. DKPro Core provides wrappers for such third-party tools as well as original NLP components. DKPro Core builds heavily on uimaFIT, which allows for rapid and easy development of NLP processing pipelines.

158K lines of code

8 current contributors

about 1 year since last commit

6 users on Open Hub

Activity Not Available
4.75
   

Treex - NLP Framework

  Analyzed 11 months ago

Treex (formerly TectoMT) is a highly modular NLP software system implemented in the Perl programming language under Linux. It is primarily aimed at machine translation, making use of the ideas and technology created during the Prague Dependency Treebank project. At the same time, it is also hoped to significantly facilitate and accelerate the development of software solutions for many other NLP tasks, especially through the reusability of the numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces.

242K lines of code

4 current contributors

12 months since last commit

4 users on Open Hub

Activity Not Available
5.0
 

MeCab

  Analyzed 11 months ago

MeCab is a fast and customizable Japanese morphological analyzer. MeCab is designed for general-purpose use and has been applied to a variety of NLP tasks, such as Kana-Kanji conversion. MeCab provides parameter estimation functionality based on CRFs and HMMs.

291K lines of code

0 current contributors

over 1 year since last commit

3 users on Open Hub

Activity Not Available
0.0
 
Licenses: No declared licenses

matxin

  Analyzed 11 months ago

Machine translation engine based on a dependency grammar and an XML interchange format. The Spanish-Basque (es-eu) translation direction is currently supported.

3.41M lines of code

0 current contributors

over 7 years since last commit

3 users on Open Hub

Activity Not Available
5.0
 
Licenses: No declared licenses