This web page presents the settings and results for two parsers (the Single Malt Parser and the Blended Parser) in the multilingual track of the CoNLL 2007 shared task on dependency parsing.
In the multilingual track, a parser must be trained on data from ten different languages. For more information about the task and the data sets, see Nivre et al. (2007b) and the CoNLL 2007 shared task web site.
We used the freely available MaltParser system, which performs deterministic, classifier-based parsing with history-based feature models and discriminative learning. To maximize parsing accuracy, optimization was carried out in two stages, leading to two different but related parsers, both described further in Hall et al. (2007).
The Single Malt Parser is similar to the parser used in Nivre et al. (2006b) (result page). It parses a sentence deterministically in a single left-to-right pass over the input, with post-processing to recover non-projective dependencies, and it has been tuned for each language by optimizing the parameters of the parsing algorithm, the feature model, and (to some degree) the learning algorithm.
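As an illustration of the single-pass strategy, the following sketch (our illustration, not MaltParser code) implements Nivre's arc-eager transition system; a static oracle reading the gold tree stands in for the trained classifier that MaltParser would use:

```python
# Illustrative sketch of Nivre's arc-eager transition system, the kind of
# deterministic single-pass algorithm MaltParser uses (NIVRE -a E).  In the
# real parser an SVM predicts each transition from history-based features;
# here a static oracle consults the gold tree instead, which suffices to
# show that one left-to-right pass recovers any projective tree.

def arc_eager_parse(gold_heads):
    """gold_heads: dict mapping token position (1..n) to its head (0 = root).
    Returns the recovered arcs as the same kind of dict."""
    heads = {}                                   # arcs recovered so far
    stack = [0]                                  # 0 is the artificial root
    buf = list(range(1, len(gold_heads) + 1))    # remaining input tokens
    while buf:
        s, b = stack[-1], buf[0]
        if s != 0 and gold_heads[s] == b:        # LEFT-ARC: b -> s, pop s
            heads[s] = b
            stack.pop()
        elif gold_heads[b] == s:                 # RIGHT-ARC: s -> b, push b
            heads[b] = s
            stack.append(buf.pop(0))
        elif any(gold_heads[b] == k or gold_heads.get(k) == b
                 for k in stack[:-1]):           # REDUCE: s is finished
            stack.pop()
        else:                                    # SHIFT: push b
            stack.append(buf.pop(0))
    return heads

# A projective tree is recovered exactly, e.g. for a 5-token sentence:
gold = {1: 2, 2: 3, 3: 0, 4: 5, 5: 3}
assert arc_eager_parse(gold) == gold
```

Non-projective dependencies are out of reach for such a pass alone, which is why the Single Malt Parser adds pseudo-projective pre- and post-processing.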
The table below describes the settings and results for the Single Malt Parser when used to parse the blind test data.
Language | Parser | FM | SVM | Pseudo | LAS RES | LAS POS | UAS RES | UAS POS | LACC RES | LACC POS |
---|---|---|---|---|---|---|---|---|---|---|
Arabic | NIVRE -a E -o 3 -p 0 | ara.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | 0 -cr 0 | 74.75 | 3 | 84.21 | 3 | 85.73 | 2 |
Basque | NIVRE -a E -o 3 -p 1 | bas.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | 1 -cr 2 | 74.99 | 5 | 80.61 | 6 | 80.98 | 5 |
Catalan | NIVRE -a E -o 2 -p 1 | cat.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | none -cr 0 | 87.74 | 4 | 92.20 | 6 | 92.19 | 4 |
Chinese | NIVRE -a S -o 2 -p 0 | chi.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.25 -r 0.3 -e 0.1 -S 0 | none -cr 0 | 83.51 | 3 | 87.60 | 5 | 86.03 | 3 |
Czech | NIVRE -a E -o 3 -p 1 | cze.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.25 -r 0.3 -e 1.0 -S 2 -F C1 -T 1000 | 1 -cr 3 | 77.22 | 6 | 82.35 | 6 | 84.55 | 5 |
English | NIVRE -a E -o 3 -p 0 | eng.par | -s 0 -t 1 -d 2 -g 0.18 -c 0.4 -r 0.4 -e 1.0 -S 2 -F C1 -T 1000 | none -cr 0 | 85.81 | 12 | 86.77 | 12 | 90.53 | 12 |
Greek | NIVRE -a E -o 3 -p 1 | gre.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | 1 -cr 0 | 74.21 | 6 | 80.66 | 9 | 84.16 | 4 |
Hungarian | NIVRE -a E -o 1 -p 1 | hun.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | 6 -cr 0 | 78.09 | 3 | 81.71 | 6 | 89.98 | 3 |
Italian | NIVRE -a E -o 2 -p 0 | ita.par | -s 0 -t 1 -d 2 -g 0.1 -c 0.5 -r 0.6 -e 1.0 -S 2 -F C1 -T 1000 | none -cr 1 | 82.48 | 5 | 86.26 | 5 | 88.83 | 6 |
Turkish | NIVRE -a E -o 2 -p 0 | tur.par | -s 0 -t 1 -d 2 -g 0.12 -c 0.7 -r 0.3 -e 0.5 -T 100 -S 2 -F C0 | 1 -cr 2 | 79.24 | 3 | 85.04 | 5 | 87.24 | 2 |
Average | | | | | 79.80 | 5 | 84.74 | 6 | 87.02 | 2 |
Table 1. For each language, the table gives the settings for the parsing algorithm (Parser), the feature model (FM), the support vector machines (SVM), and pseudo-projective parsing (Pseudo; the first parameter is the marking strategy). For more details about these settings, see the user guides for MaltParser and Pseudo-Projective Parsing. The last six columns give the result (RES) and position (POS) obtained in the CoNLL 2007 shared task for each language under three evaluation metrics: Labeled Attachment Score (LAS), Unlabeled Attachment Score (UAS), and Label Accuracy (LACC). The last row presents the average result over all ten languages.
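The three metrics reduce to simple token-level counts over (head, label) pairs. The following is our own illustrative sketch, not the official CoNLL scorer, and it ignores details of the official evaluation script such as token filtering:

```python
def score(gold, pred):
    """gold, pred: one (head, label) pair per token, in sentence order.
    Returns (LAS, UAS, LACC) as percentages: LAS counts tokens with both
    head and label correct, UAS only the head, LACC only the label."""
    n = len(gold)
    las = sum(g == p for g, p in zip(gold, pred))
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred))
    lacc = sum(g[1] == p[1] for g, p in zip(gold, pred))
    return tuple(round(100.0 * c / n, 2) for c in (las, uas, lacc))

# One wrong head out of two tokens: LAS and UAS drop, LACC does not.
print(score([(2, 'SBJ'), (0, 'ROOT')], [(2, 'SBJ'), (1, 'ROOT')]))
# -> (50.0, 50.0, 100.0)
```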
The Blended Parser is an ensemble system that combines the output of six deterministic parsers, each a variation of the Single Malt Parser with parameter settings extrapolated from the optimization of the Single Malt Parser and from many other previous experiments.
For the component parsers below that use one of the NIVRE algorithms (in either parsing direction), the option flags -o and -p take the same values as for the Single Malt Parser for the same language. Likewise, the Pseudo options for the Blended Parser with the NIVRE algorithms equal the Pseudo options used by the Single Malt Parser.
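To illustrate the arc-level combination, here is a deliberately simplified sketch (ours, with hypothetical names): each component parser votes for a head per token, optionally with a weight. The actual Blended Parser is more careful, extracting a maximum spanning tree over the weighted arcs so that the combined output is guaranteed to be a well-formed tree; see Hall et al. (2007) for details.

```python
from collections import Counter

def vote_heads(predictions, weights=None):
    """predictions: one head sequence per component parser, where
    heads[i] is that parser's predicted head for token i.  Each parser's
    vote can be weighted, e.g. by its accuracy on the language.  NOTE:
    plain per-token voting may not yield a tree; the real ensemble avoids
    this by extracting a maximum spanning tree over the weighted arcs."""
    weights = weights or [1.0] * len(predictions)
    combined = []
    for i in range(len(predictions[0])):
        votes = Counter()
        for heads, w in zip(predictions, weights):
            votes[heads[i]] += w
        combined.append(max(votes, key=votes.get))  # ties broken arbitrarily
    return combined

# Three parsers, three tokens: the majority head wins per token.
print(vote_heads([[2, 0, 2], [2, 0, 1], [3, 0, 2]]))  # -> [2, 0, 2]
```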
The table below describes the settings and results for the Blended Parser when used to parse the blind test data.
| Language | Direction | Parser | FM | SVM | LAS RES | LAS POS | UAS RES | UAS POS | LACC RES | LACC POS |
|---|---|---|---|---|---|---|---|---|---|---|
| Arabic | L->R | NIVRE -a E | See settings and results for Single Malt | | | | | | | |
| | R->L | | arabic_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F C1 -T 1000 | | | | | | |
| | L->R | NIVRE -a S | arabic_aS.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 1 | arabic_cov.par | | | | | | | |
| | R->L | | | | | | | | | |
| Ensemble result | | | | | 76.52 | 1 | 85.81 | 2 | 86.55 | 1 |
| Basque | L->R | NIVRE -a E | See settings and results for Single Malt | | | | | | | |
| | R->L | | basque_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F C1 -T 1000 | | | | | | |
| | L->R | NIVRE -a S | basque_aS.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 1 | basque_cov.par | | | | | | | |
| | R->L | | | | | | | | | |
| Ensemble result | | | | | 76.94 | 1 | 82.84 | 1 | 82.52 | 2 |
| Catalan | L->R | NIVRE -a E | See settings and results for Single Malt | | | | | | | |
| | R->L | | catalan_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | | | | | | |
| | L->R | NIVRE -a S | catalan_aS.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 1 | catalan_cov.par | | | | | | | |
| | R->L | | | | | | | | | |
| Ensemble result | | | | | 88.70 | 1 | 93.12 | 3 | 93.02 | 1 |
| Chinese | L->R | NIVRE -a S | See settings and results for Single Malt | | | | | | | |
| | R->L | | chinese_aS.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.3 -r 0.2 -e 0.1 -S 2 -F C1 -T 1000 | | | | | | |
| | L->R | NIVRE -a E | chinese_aE.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 0 | chinese_cov.par | | | | | | | |
| | R->L | | | | | | | | | |
| Ensemble result (official) | | | | | 75.82 | 15 | 84.52 | 12 | 78.78 | 15 |
| Ensemble result (corrected) | | | | | 84.67 | (2) | 88.70 | (3) | 86.98 | (2) |
| Czech | L->R | NIVRE -a E | See settings and results for Single Malt | | | | | | | |
| | R->L | | czech_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.25 -r 0.3 -e 1.0 -S 2 -F P1 -T 1000 | | | | | | |
| | L->R | NIVRE -a S | czech_aS.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 1 | czech_cov.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.25 -r 0.3 -e 1.0 -S 2 -F C1 -T 1000 | | | | | | |
| | R->L | | | | | | | | | |
| Ensemble result | | | | | 77.98 | 3 | 83.59 | 4 | 84.25 | 6 |
| English | L->R | NIVRE -a E | See settings and results for Single Malt | | | | | | | |
| | R->L | | english_aE.par | -s 0 -t 1 -d 2 -g 0.18 -c 0.4 -r 0.4 -e 1.0 -S 2 -F C1 -T 1000 | | | | | | |
| | L->R | NIVRE -a S | english_aS.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 0 | english_cov.par | | | | | | | |
| | R->L | | | | | | | | | |
| Ensemble result | | | | | 88.11 | 5 | 88.93 | 5 | 92.16 | 5 |
| Greek | L->R | NIVRE -a E | See settings and results for Single Malt | | | | | | | |
| | R->L | | greek_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | | | | | | |
| | L->R | NIVRE -a S | greek_aS.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 1 | greek_cov.par | | | | | | | |
| | R->L | | | | | | | | | |
| Ensemble result | | | | | 74.65 | 2 | 81.22 | 4 | 81.64 | 16 |
| Hungarian | L->R | NIVRE -a E | See settings and results for Single Malt | | | | | | | |
| | R->L | | hungarian_aE.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | | | | | | |
| | L->R | NIVRE -a S | hungarian_aS.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 0 | hungarian_cov.par | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 2 -F P1 -T 1000 | | | | | | |
| | R->L | | | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1.0 -S 0 | | | | | | |
| Ensemble result | | | | | 80.27 | 1 | 83.55 | 1 | 90.85 | 1 |
| Italian | L->R | NIVRE -a E | See settings and results for Single Malt | | | | | | | |
| | R->L | | italian_aE.par | -s 0 -t 1 -d 2 -g 0.1 -c 0.5 -r 0.6 -e 1.0 -S 2 -F C1 -T 1000 | | | | | | |
| | L->R | NIVRE -a S | italian_aS.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 0 | italian_cov.par | | | | | | | |
| | R->L | | | | | | | | | |
| Ensemble result | | | | | 84.40 | 1 | 87.77 | 2 | 89.62 | 3 |
| Turkish | L->R | NIVRE -a E | See settings and results for Single Malt | | | | | | | |
| | R->L | | turkish_aE.par | -s 0 -t 1 -d 2 -g 0.12 -c 0.7 -r 0.3 -e 0.5 -T 100 -S 2 -F C0 | | | | | | |
| | L->R | NIVRE -a S | turkish_aS.par | | | | | | | |
| | R->L | | | | | | | | | |
| | L->R | COVINGTON -g A -r 0 | turkish_cov.par | | | | | | | |
| | R->L | | | | | | | | | |
| Ensemble result | | | | | 79.79 | 2 | 85.77 | 2 | 87.33 | 1 |
| Average (official) | | | | | 80.32 | 1 | 85.71 | 2 | 86.67 | 6 |
| Average (corrected) | | | | | 81.20 | (1) | 86.13 | (2) | 87.49 | (1) |
Johan Hall | Växjö University, Sweden |
Jens Nilsson | Växjö University, Sweden |
Joakim Nivre | Växjö University and Uppsala University, Sweden |
Gülşen Eryiğit | Istanbul Technical University, Turkey |
Beata Megyesi | Uppsala University, Sweden |
Mattias Nilsson | Uppsala University, Sweden |
Markus Saers | Uppsala University, Sweden |
More information about parsing algorithms, learning algorithms and feature models can be found in the following publications: