MaltParser in the CoNLL Shared Task 2007 - Multilingual Track

This web page presents the settings and results for two parsers (the Single Malt Parser and the Blended Parser) in the multilingual track of the CoNLL 2007 shared task on dependency parsing.

In the multilingual track of the CoNLL 2007 shared task on dependency parsing, a single parser must be trained to handle data from ten different languages. For more information about the task and the data sets, see Nivre et al. (2007b) and the CoNLL 2007 shared task web site.

We used the freely available MaltParser system, which performs deterministic, classifier-based parsing with history-based feature models and discriminative learning. In order to maximize parsing accuracy, optimization was carried out in two stages, leading to two different but related parsers: the Single Malt Parser and the Blended Parser.

The two parsers are further described in Hall et al. (2007).
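
As a rough illustration of what deterministic, classifier-based parsing means in practice, the sketch below runs Nivre's arc-eager transition system (the NIVRE -a E algorithm in the tables below) over a sentence in a single left-to-right pass, asking a classifier for the next transition at every step. It is only a schematic Python sketch: the toy predict function and the feature-free configurations are stand-ins for MaltParser's history-based feature models and SVM classifiers, not the settings used in the shared task.

```python
# A schematic sketch of deterministic, classifier-based parsing with
# Nivre's arc-eager transition system. The toy predict() function stands in
# for MaltParser's trained SVM classifiers and history-based feature models.

def arc_eager_parse(words, predict):
    """Parse in a single left-to-right pass over the input.

    words   -- tokens of one sentence, with an artificial root at index 0
    predict -- maps a parser configuration to a transition name, playing
               the role of the trained classifier
    Returns a dict mapping each dependent index to its head index.
    """
    stack, buffer, heads = [0], list(range(1, len(words))), {}
    while buffer:
        t = predict(stack, buffer, heads, words)
        if t == "LEFT-ARC" and stack and stack[-1] != 0 and stack[-1] not in heads:
            heads[stack.pop()] = buffer[0]      # stack top becomes dependent of next token
        elif t == "RIGHT-ARC" and stack:
            heads[buffer[0]] = stack[-1]        # next token becomes dependent of stack top
            stack.append(buffer.pop(0))
        elif t == "REDUCE" and stack and stack[-1] in heads:
            stack.pop()                         # discard a token that already has a head
        else:                                   # SHIFT (also the fallback action)
            stack.append(buffer.pop(0))
    for tok in range(1, len(words)):            # attach anything left over to the root
        heads.setdefault(tok, 0)
    return heads

def toy_predict(stack, buffer, heads, words):
    """Toy stand-in classifier: attach every token to its left neighbour."""
    return "RIGHT-ARC"

if __name__ == "__main__":
    sentence = ["<root>", "Economic", "news", "had", "little", "effect"]
    print(arc_eager_parse(sentence, toy_predict))
```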

The Single Malt Parser

The Single Malt Parser is similar to the parser used in Nivre et al. (2006b) (result page): it parses a sentence deterministically in a single left-to-right pass over the input, with post-processing to recover non-projective dependencies, and it has been tuned for each language by optimizing parameters of the parsing algorithm, the feature model and, to some degree, the learning algorithm.
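
The post-processing step refers to pseudo-projective parsing: before training, non-projective arcs are lifted until the tree is projective and the affected labels are marked according to the chosen marking strategy (the first Pseudo parameter in the table below); after parsing, an inverse transformation recovers the non-projective arcs. The Python sketch below shows only the projectivization idea with a simplistic marking; it is not MaltParser's implementation, and the actual marking strategies and the inverse transformation are described on the Pseudo-Projective Parsing page linked from the table caption.

```python
# A schematic sketch of the projectivization step behind pseudo-projective
# parsing: non-projective arcs are lifted towards the root until the tree is
# projective, and the label of each lifted arc is marked (here with a
# simplistic marking). The inverse transformation applied after parsing,
# and the real marking strategies, are not shown.

def is_projective(head, dep, heads):
    """An arc is projective if the head dominates every token between head and dep."""
    lo, hi = sorted((head, dep))
    for tok in range(lo + 1, hi):
        anc = tok
        while anc != 0 and anc != head:
            anc = heads[anc]
        if anc != head:
            return False
    return True

def projectivize(heads, labels):
    """Lift non-projective arcs to the grandparent until the tree is projective."""
    heads, labels = dict(heads), dict(labels)
    changed = True
    while changed:
        changed = False
        for dep, head in heads.items():
            if head != 0 and not is_projective(head, dep, heads):
                labels[dep] = labels[dep] + "|" + labels[head]  # mark the lifted arc
                heads[dep] = heads[head]                        # lift to the head's head
                changed = True
    return heads, labels

if __name__ == "__main__":
    # Tokens 1-4; the arc 4 -> 2 is non-projective because token 3 in between
    # is not dominated by token 4.
    heads = {1: 0, 2: 4, 3: 1, 4: 1}
    labels = {1: "root", 2: "obj", 3: "pmod", 4: "vg"}
    print(projectivize(heads, labels))
```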

The table below describes the settings and results for the Single Malt Parser when used to parse the blind test data.

| Language | Parser | FM | SVM | Pseudo | LAS | UAS | LACC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Arabic | NIVRE -a E -o 3 -p 0 | ara.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | 0 -cr 0 | 74.75 (3) | 84.21 (3) | 85.73 (2) |
| Basque | NIVRE -a E -o 3 -p 1 | bas.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | 1 -cr 2 | 74.99 (5) | 80.61 (6) | 80.98 (5) |
| Catalan | NIVRE -a E -o 2 -p 1 | cat.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | none -cr 0 | 87.74 (4) | 92.20 (6) | 92.19 (4) |
| Chinese | NIVRE -a S -o 2 -p 0 | chi.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.25 -r 0.3 -e 0.1 -S 0 | none -cr 0 | 83.51 (3) | 87.60 (5) | 86.03 (3) |
| Czech | NIVRE -a E -o 3 -p 1 | cze.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.25 -r 0.3 -e 1.0 -S 2 -F C1 -T 1000 | 1 -cr 3 | 77.22 (6) | 82.35 (6) | 84.55 (5) |
| English | NIVRE -a E -o 3 -p 0 | eng.par | -s 0 -t 1 -d 2 -g 0.18 -c 0.4 -r 0.4 -e 1.0 -S 2 -F C1 -T 1000 | none -cr 0 | 85.81 (12) | 86.77 (12) | 90.53 (12) |
| Greek | NIVRE -a E -o 3 -p 1 | gre.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | 1 -cr 0 | 74.21 (6) | 80.66 (9) | 84.16 (4) |
| Hungarian | NIVRE -a E -o 1 -p 1 | hun.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | 6 -cr 0 | 78.09 (3) | 81.71 (6) | 89.98 (3) |
| Italian | NIVRE -a E -o 2 -p 0 | ita.par | -s 0 -t 1 -d 2 -g 0.1 -c 0.5 -r 0.6 -e 1.0 -S 2 -F C1 -T 1000 | none -cr 1 | 82.48 (5) | 86.26 (5) | 88.83 (6) |
| Turkish | NIVRE -a E -o 2 -p 0 | tur.par | -s 0 -t 1 -d 2 -g 0.12 -c 0.7 -r 0.3 -e 0.5 -T 100 -S 2 -F C0 | 1 -cr 2 | 79.24 (3) | 85.04 (5) | 87.24 (2) |
| Average | | | | | 79.80 (5) | 84.74 (6) | 87.02 (2) |

Table 1. Each language has specific settings for the parsing algorithm (Parser), the feature model (FM), the support vector machines (SVM) and pseudo-projective parsing (Pseudo; the first parameter is the marking strategy). For more details about the different settings, see the user guide for MaltParser and the Pseudo-Projective Parsing page. The last three columns give the results, with the position in the CoNLL Shared Task 2007 in parentheses, for three evaluation metrics: labeled attachment score (LAS), unlabeled attachment score (UAS) and label accuracy (LACC). The last row gives the average over all ten languages.
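
For reference, all three metrics are computed per token over (head, label) predictions: UAS counts correct heads, LACC counts correct labels, and LAS counts tokens where both are correct. The minimal Python sketch below shows how they relate; it ignores the scoring details of the official CoNLL 2007 evaluation script, such as which tokens are counted.

```python
# A minimal sketch of the three evaluation metrics, computed per token over
# (head, label) pairs. The official CoNLL 2007 evaluation script defines the
# exact scoring details; this only shows how LAS, UAS and LACC relate.

def attachment_scores(gold, predicted):
    """gold, predicted -- lists of (head, label) tuples, one per token."""
    total = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, predicted)) / total   # correct heads
    lacc = sum(g[1] == p[1] for g, p in zip(gold, predicted)) / total  # correct labels
    las = sum(g == p for g, p in zip(gold, predicted)) / total         # head and label correct
    return 100 * las, 100 * uas, 100 * lacc

if __name__ == "__main__":
    gold = [(2, "SBJ"), (0, "ROOT"), (2, "OBJ")]
    pred = [(2, "SBJ"), (0, "ROOT"), (1, "OBJ")]
    print(attachment_scores(gold, pred))  # roughly (66.7, 66.7, 100.0)
```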

The Blended Parser

The second parser is an ensemble system that combines the output of six deterministic parsers (three parsing algorithms, each run in both directions, as shown in the table below), each a variation of the Single Malt Parser with parameter settings extrapolated from the optimization of the Single Malt Parser and from many previous experiments.
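
As a rough picture of what combining six parser outputs can look like, the sketch below takes a simple majority vote over each token's predicted head. This is only an illustration: the actual combination used by the Blended Parser is described in Hall et al. (2007) and, unlike this per-token vote, ensures that the combined output is a well-formed dependency tree.

```python
# A rough sketch of combining the output of several deterministic parsers:
# every component parser proposes a head for each token and the most frequent
# proposal wins. Unlike the actual combination described in Hall et al. (2007),
# this simple per-token vote does not guarantee a well-formed dependency tree.
from collections import Counter

def combine_heads(predictions):
    """predictions -- one head sequence per component parser,
    each a list with the predicted head index for tokens 1..n."""
    combined = []
    for tok in range(len(predictions[0])):
        votes = Counter(parser_heads[tok] for parser_heads in predictions)
        combined.append(votes.most_common(1)[0][0])   # majority head for this token
    return combined

if __name__ == "__main__":
    six_parsers = [
        [2, 0, 2], [2, 0, 2], [2, 0, 1],
        [3, 0, 2], [2, 0, 2], [2, 0, 1],
    ]
    print(combine_heads(six_parsers))  # -> [2, 0, 2]
```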

For the single parsers below that use one of the NIVRE algorithms (in either parsing direction), the parser option flags -o and -p have the same values as for the Single Malt Parser for the same language, and the pseudo-projective parsing (Pseudo) options are likewise those of the Single Malt Parser.

The table below describes the settings and results for the Blended Parser when used to parse the blind test data. An empty Parser, FM or SVM cell takes its value from the nearest filled cell above within the same language block; results are given only for the combined (ensemble) output.

| Language | Direction | Parser | FM | SVM | LAS | UAS | LACC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Arabic | L->R | NIVRE -a E | See settings and results for Single Malt | | | | |
| | R->L | | arabic_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F C1 -T 1000 | | | |
| | L->R | NIVRE -a S | arabic_aS.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 1 | arabic_cov.par | | | | |
| | R->L | | | | | | |
| | Ensemble result | | | | 76.52 (1) | 85.81 (2) | 86.55 (1) |
| Basque | L->R | NIVRE -a E | See settings and results for Single Malt | | | | |
| | R->L | | basque_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F C1 -T 1000 | | | |
| | L->R | NIVRE -a S | basque_aS.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 1 | basque_cov.par | | | | |
| | R->L | | | | | | |
| | Ensemble result | | | | 76.94 (1) | 82.84 (1) | 82.52 (2) |
| Catalan | L->R | NIVRE -a E | See settings and results for Single Malt | | | | |
| | R->L | | catalan_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | | | |
| | L->R | NIVRE -a S | catalan_aS.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 1 | catalan_cov.par | | | | |
| | R->L | | | | | | |
| | Ensemble result | | | | 88.70 (1) | 93.12 (3) | 93.02 (1) |
| Chinese | L->R | NIVRE -a S | See settings and results for Single Malt | | | | |
| | R->L | | chinese_aS.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.3 -r 0.2 -e 0.1 -S 2 -F C1 -T 1000 | | | |
| | L->R | NIVRE -a E | chinese_aE.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 0 | chinese_cov.par | | | | |
| | R->L | | | | | | |
| | Ensemble result (official) | | | | 75.82 (15) | 84.52 (12) | 78.78 (15) |
| | Ensemble result (corrected) | | | | 84.67 (2) | 88.70 (3) | 86.98 (2) |
| Czech | L->R | NIVRE -a E | See settings and results for Single Malt | | | | |
| | R->L | | czech_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.25 -r 0.3 -e 1.0 -S 2 -F P1 -T 1000 | | | |
| | L->R | NIVRE -a S | czech_aS.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 1 | czech_cov.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.25 -r 0.3 -e 1.0 -S 2 -F C1 -T 1000 | | | |
| | R->L | | | | | | |
| | Ensemble result | | | | 77.98 (3) | 83.59 (4) | 84.25 (6) |
| English | L->R | NIVRE -a E | See settings and results for Single Malt | | | | |
| | R->L | | english_aE.par | -s 0 -t 1 -d 2 -g 0.18 -c 0.4 -r 0.4 -e 1.0 -S 2 -F C1 -T 1000 | | | |
| | L->R | NIVRE -a S | english_aS.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 0 | english_cov.par | | | | |
| | R->L | | | | | | |
| | Ensemble result | | | | 88.11 (5) | 88.93 (5) | 92.16 (5) |
| Greek | L->R | NIVRE -a E | See settings and results for Single Malt | | | | |
| | R->L | | greek_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | | | |
| | L->R | NIVRE -a S | greek_aS.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 1 | greek_cov.par | | | | |
| | R->L | | | | | | |
| | Ensemble result | | | | 74.65 (2) | 81.22 (4) | 81.64 (16) |
| Hungarian | L->R | NIVRE -a E | See settings and results for Single Malt | | | | |
| | R->L | | hungarian_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | | | |
| | L->R | NIVRE -a S | hungarian_aS.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 0 | hungarian_cov.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | | | |
| | R->L | | | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | | | |
| | Ensemble result | | | | 80.27 (1) | 83.55 (1) | 90.85 (1) |
| Italian | L->R | NIVRE -a E | See settings and results for Single Malt | | | | |
| | R->L | | italian_aE.par | -s 0 -t 1 -d 2 -g 0.1 -c 0.5 -r 0.6 -e 1.0 -S 2 -F C1 -T 1000 | | | |
| | L->R | NIVRE -a S | italian_aS.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 0 | italian_cov.par | | | | |
| | R->L | | | | | | |
| | Ensemble result | | | | 84.40 (1) | 87.77 (2) | 89.62 (3) |
| Turkish | L->R | NIVRE -a E | See settings and results for Single Malt | | | | |
| | R->L | | turkish_aE.par | -s 0 -t 1 -d 2 -g 0.12 -c 0.7 -r 0.3 -e 0.5 -T 100 -S 2 -F C0 | | | |
| | L->R | NIVRE -a S | turkish_aS.par | | | | |
| | R->L | | | | | | |
| | L->R | COVINGTON -g A -r 0 | turkish_cov.par | | | | |
| | R->L | | | | | | |
| | Ensemble result | | | | 79.79 (2) | 85.77 (2) | 87.33 (1) |
| Average (official) | | | | | 80.32 (1) | 85.71 (2) | 86.67 (6) |
| Average (corrected) | | | | | 81.20 (1) | 86.13 (2) | 87.49 (1) |

MaltParser CoNLL Shared Task 2007 Group

Johan Hall, Växjö University, Sweden
Jens Nilsson, Växjö University, Sweden
Joakim Nivre, Växjö University and Uppsala University, Sweden
Gülşen Eryiğit, Istanbul Technical University, Turkey
Beata Megyesi, Uppsala University, Sweden
Mattias Nilsson, Uppsala University, Sweden
Markus Saers, Uppsala University, Sweden

More information

More information about parsing algorithms, learning algorithms and feature models can be found in the following publications: