Welcome!

April 18, 2010

ScalaNLP is a collection of libraries for Natural Language Processing, Machine Learning, and Statistics. We have a number of subprojects, each with a different focus:

  1. Scalala is a high-performance numerical linear algebra library for Scala, with rich MATLAB-like operators on vectors and matrices, a library of numerical routines, and support for plotting functions and data. Scalala can be used interactively or as a library.
  2. ScalaNLP-Data provides support classes for data processing, text processing, and computation pipelining.
  3. ScalaNLP-Learn includes commonly used learning and optimization algorithms, such as L-BFGS and a logistic classifier. It also contains statistical distributions and sampling routines.
  4. ScalaNLP-FST is a finite-state toolkit for language processing. It implements many common algorithms on weighted string transducers.
  5. (Unreleased) ScalaNLP-Parser will be a high-performance parser written in Scala.

Getting Started

See the Download page for information about downloading the different modules of ScalaNLP. After that, you might want to check out the documentation for Scalala or the incomplete Tour of ScalaNLP-Data and ScalaNLP-Learn. Once you’ve dived in, consider joining our mailing lists.