MaltParser

Introduction

MaltParser is a system for data-driven dependency parsing. It can be used to induce a parsing model from treebank data and to parse new data with an induced model. MaltParser is developed by Johan Hall, Jens Nilsson and Joakim Nivre, members of the MALT group at the School of Mathematics and Systems Engineering (MSI) at Växjö University, Sweden.

MaltParser 1.0.0 and later releases are a complete reimplementation of MaltParser in Java, distributed under an open source license. The earlier versions 0.1 to 0.4 were implemented in C. The Java implementation replaces the C implementation, and MaltParser 0.x is no longer supported or updated.

Inductive Dependency Parsing

MaltParser can be characterized as a data-driven parser-generator. While a traditional parser-generator constructs a parser given a grammar, a data-driven parser-generator constructs a parser given a treebank. MaltParser is an implementation of inductive dependency parsing, where the syntactic analysis of a sentence amounts to the derivation of a dependency structure, and where inductive machine learning is used to guide the parser at nondeterministic choice points (Nivre, 2006). The parsing methodology is based on three essential components:

  1. Deterministic parsing algorithms for building labeled dependency graphs (Kudo and Matsumoto, 2002; Yamada and Matsumoto, 2003; Nivre, 2003)
  2. History-based models for predicting the next parser action at nondeterministic choice points (Black et al., 1992; Magerman, 1995; Ratnaparkhi, 1997; Collins, 1999)
  3. Discriminative learning to map histories to parser actions (Kudo and Matsumoto, 2002; Yamada and Matsumoto, 2003; Nivre et al., 2004; Hall et al., 2006)
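
The three components above can be illustrated with a small sketch. It is not MaltParser code and does not use MaltParser's API. It assumes a textbook arc-standard transition system (which may differ from the variants MaltParser implements), unlabeled arcs, a history reduced to three part-of-speech tags, and a memorizing lookup table standing in for a real discriminative learner. All function names (features, oracle, run, train, parse) are hypothetical.

```python
from collections import Counter, defaultdict

SHIFT, LEFT_ARC, RIGHT_ARC = "SHIFT", "LEFT-ARC", "RIGHT-ARC"


def features(stack, buffer, tags):
    """History-based features: POS tags of the two top stack tokens and the next input token."""
    s0 = tags[stack[-1]] if stack else "NONE"
    s1 = tags[stack[-2]] if len(stack) > 1 else "NONE"
    b0 = tags[buffer[0]] if buffer else "NONE"
    return (s1, s0, b0)


def is_valid(action, stack, buffer):
    if action == SHIFT:
        return bool(buffer)
    if action == LEFT_ARC:
        return len(stack) >= 2 and stack[-2] != 0  # the artificial ROOT never gets a head
    return len(stack) >= 2  # RIGHT_ARC


def apply_action(action, stack, buffer, heads):
    if action == SHIFT:
        stack.append(buffer.pop(0))
    elif action == LEFT_ARC:   # top of stack becomes head of the token below it
        dependent = stack.pop(-2)
        heads[dependent] = stack[-1]
    else:                      # RIGHT_ARC: token below becomes head of the top
        dependent = stack.pop()
        heads[dependent] = stack[-1]


def oracle(stack, buffer, gold):
    """Derive the correct action from a gold tree (assumed projective)."""
    if len(stack) >= 2:
        s1, s0 = stack[-2], stack[-1]
        if s1 != 0 and gold[s1] == s0:
            return LEFT_ARC
        # Attach s0 only once none of its dependents is still waiting in the buffer.
        if gold[s0] == s1 and all(gold[b] != s0 for b in buffer):
            return RIGHT_ARC
    return SHIFT


def run(tags, choose):
    """Deterministic parsing loop; `choose` resolves each nondeterministic choice point."""
    stack, buffer, heads, history = [0], list(range(1, len(tags))), {}, []
    while buffer or len(stack) > 1:
        feats = features(stack, buffer, tags)
        action = choose(feats, stack, buffer)
        if not is_valid(action, stack, buffer):
            action = SHIFT if buffer else RIGHT_ARC  # fallback keeps the parser moving
        history.append((feats, action))
        apply_action(action, stack, buffer, heads)
    return heads, history


def train(treebank):
    """Toy learner: remember the most frequent oracle action for each feature tuple."""
    counts = defaultdict(Counter)
    for tags, gold in treebank:
        _, history = run(tags, lambda f, s, b: oracle(s, b, gold))
        for feats, action in history:
            counts[feats][action] += 1
    return {f: c.most_common(1)[0][0] for f, c in counts.items()}


def parse(tags, model):
    """Parse new input, letting the induced model guide every decision."""
    heads, _ = run(tags, lambda f, s, b: model.get(f, SHIFT))
    return heads


# One-sentence "treebank": token 0 is an artificial ROOT; gold[i] is the head of token i.
treebank = [(["ROOT", "PRON", "VERB", "NOUN"], [None, 2, 0, 2])]
model = train(treebank)
predicted = parse(["ROOT", "PRON", "VERB", "NOUN"], model)
print(predicted)
```

The same loop serves both phases: during training the oracle chooses actions and the (history, action) pairs are recorded, and during parsing the induced model chooses them. Replacing the lookup table with a trained classifier and the three-tag tuple with a richer feature model would change only train and features, not the parsing loop.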

MaltParser 1.0.0

MaltParser implements four deterministic parsing algorithms:

MaltParser allows users to define feature models of arbitrary complexity.

MaltParser currently includes one machine learning package (interfaces to other learning packages will be included in later releases):

References