MaltParser in the CoNLL-X Shared Task

CoNLL-X Shared Task: Multi-lingual Dependency Parsing

MaltParser 0.4 was used in the CoNLL-X Shared Task on multi-lingual dependency parsing, in the system that obtained the second best overall score. That score was not significantly worse than the best score, and the system achieved top results for nine of the thirteen languages (with results significantly better than any other system for Japanese, Swedish and Turkish). In this system, MaltParser was combined with pseudo-projective parsing, which requires preprocessing of the training data and post-processing of the parser output (Nivre and Nilsson 2005). The complete system is described in Nivre, Hall, Nilsson, Eryiğit and Marinov (2006).

This web page summarizes our results in the shared task and gives the necessary information to reproduce the MaltParser results.

MaltParser 0.4

MaltParser 0.4 can be downloaded from the page "MaltParser 0.4: User Guide and Download". The pre- and post-processing tools for pseudo-projective parsing are needed to reproduce the MaltParser results in the shared task and can be downloaded from the page "Pseudo-Projective Parsing". The following settings were kept constant for all languages:

Parsing algorithm	NIVRE
Parser option	-a E (arc-eager)
Projectivization	Marking strategy for pseudo-projective parsing: 1
Learner	SVM
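
The arc-eager algorithm derives only projective dependency trees, which is why the pseudo-projective pre- and post-processing is needed for treebanks that contain non-projective arcs. The following is a minimal sketch of a projectivity check, not part of the pseudo-projective tools themselves. The function name and the list-of-heads representation are assumptions chosen for illustration.

```python
def is_projective(heads):
    """Return True if the dependency tree is projective.

    heads[i] is the head of token i + 1 (tokens are numbered from 1);
    0 denotes the artificial root. The input is assumed to be a tree.
    """
    n = len(heads)
    for dep in range(1, n + 1):
        head = heads[dep - 1]
        lo, hi = min(head, dep), max(head, dep)
        for k in range(lo + 1, hi):
            # Every token strictly between head and dependent must be
            # dominated by the head for the arc to be projective.
            anc, steps = k, 0
            while anc != head and anc != 0:
                anc = heads[anc - 1]
                steps += 1
                if steps > n:
                    raise ValueError("cycle detected: input is not a tree")
            if anc != head:
                return False
    return True


# Token 2 attaches to token 4 across token 3, which token 4 does not dominate.
print(is_projective([2, 0, 2]))     # projective tree
print(is_projective([3, 4, 0, 3]))  # contains a non-projective arc
```

A training set in which this check fails for some sentences is the case that projectivization (with the marking strategy listed above) is meant to handle.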

Settings and Results

Language	FM	P-options	SVM-options	LAS MP	LAS AV	POS	UAS MP	UAS AV	L_ACC MP	L_ACC AV
Arabic	ara5	-a E -o 3	-s 0 -t 1 -d 2 -g 0.16 -c 0.3 -r 0 -e 1.0 -S 0	66.71	59.94	1-3-4	77.52	73.48	80.34	75.12
Bulgarian	bul2	-a E -o 2	-s 0 -t 1 -d 2 -g 0.2 -c 0.3 -r 0.3 -e 0.1 -S 2 -F C1 -T 1000	87.41	79.98	1-2	91.72	85.89	90.44	84.38
Chinese	chi4	-a E -o 2	-s 0 -t 1 -d 2 -g 0.2 -c 0.3 -r 0.3 -e 0.1 -S 0	86.92	78.32	2-3	90.54	84.85	89.01	81.66
Czech	cze4	-a E -o 3	-s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1 -F P1 -S 2 -T 200	78.42	67.17	2	84.80	77.01	85.40	76.59
Danish	dan3	-a E -o 2	-s 0 -t 1 -d 2 -g 0.2 -c 0.6 -r 0.3 -e 1.0 -S 0	84.77	78.31	1-2	89.80	84.52	89.16	84.50
Dutch	dut5	-a E -o 2	-s 0 -t 1 -d 2 -g 0.16 -c 0.3 -r 0.0 -e 1 -S 0	78.59	70.73	1-2-3	81.35	75.07	83.69	77.57
German	ger3	-a E -o 2	-s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1 -S 2 -F P1 -T 1000	85.82	78.58	2-3	88.76	82.60	91.03	86.26
Japanese	jap1	-a E -o 2	-s 0 -t 1 -d 2 -g 0.19 -c 0.6 -r 0 -e 0.1 -S 0	91.65	85.86	1	93.10	89.05	94.34	89.90
Portuguese	por4	-a E -o 3	-s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 0.1 -S 0	87.60	80.63	1-2	91.22	86.46	91.54	85.35
Slovene	slo4	-a E -o 3	-s 0 -t 1 -d 2 -g 0.20 -c 0.1 -r 0.8 -e 0.1 -S 2 -F C1 -T 600	70.30	65.16	NA	78.72	76.53	80.54	76.31
Spanish	spa2	-a E -o 2	-s 0 -t 1 -d 2 -g 0.20 -c 0.5 -r 0 -e 0.01 -S 2 -F P1 -T 1000	81.29	73.52	1-2-3	84.67	77.76	90.06	85.71
Swedish	swe3	-a E -o 2	-s 0 -t 1 -d 2 -g 0.2 -c 0.4 -r 0 -e 0.1 -S 0	84.58	76.44	1	89.50	84.21	87.39	80.00
Turkish	tur1	-a E -o 2	-s 0 -t 1 -d 2 -g 0.12 -c 0.7 -r 0.6 -e 0.01 -S 2 -F C1 -T 100	65.68	55.95	1	75.82	69.35	78.49	69.59
Average (12 languages, excluding Bulgarian)				80.19		1-2	85.48		86.75
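
To make the SVM-options column easier to read, the sketch below splits an option string from the table into the standard LIBSVM training flags and the remaining flags. The uppercase flags (-S, -F, -T) are not LIBSVM options, so they are treated here as the MaltParser-specific ones; their meaning is not described on this page and is left uninterpreted. With -t 1, LIBSVM uses a polynomial kernel of the form (gamma * u·v + coef0)^degree, which the second function evaluates. All function and variable names are illustrative assumptions.

```python
import shlex

# Standard LIBSVM svm-train flags that occur in the table.
LIBSVM_FLAGS = {
    "-s": "svm_type",     # 0 = C-SVC
    "-t": "kernel_type",  # 1 = polynomial
    "-d": "degree",
    "-g": "gamma",
    "-c": "cost",
    "-r": "coef0",
    "-e": "epsilon",      # termination tolerance
}


def parse_options(option_string):
    """Split an option string into (libsvm, other) dictionaries."""
    tokens = shlex.split(option_string)
    if len(tokens) % 2 != 0:
        raise ValueError("expected flag/value pairs")
    libsvm, other = {}, {}
    for flag, value in zip(tokens[0::2], tokens[1::2]):
        if flag in LIBSVM_FLAGS:
            libsvm[LIBSVM_FLAGS[flag]] = float(value)
        else:
            # Flags outside the LIBSVM set are kept as raw strings.
            other[flag] = value
    return libsvm, other


def polynomial_kernel(u, v, gamma, coef0, degree):
    """LIBSVM polynomial kernel: (gamma * <u, v> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(u, v))
    return (gamma * dot + coef0) ** degree


# Option string copied from the Turkish row of the table.
turkish = "-s 0 -t 1 -d 2 -g 0.12 -c 0.7 -r 0.6 -e 0.01 -S 2 -F C1 -T 100"
libsvm, other = parse_options(turkish)
print(libsvm)
print(other)
```

All thirteen rows use -s 0 -t 1 -d 2, so every model is a C-SVC with a quadratic kernel; only gamma, cost, coef0, the tolerance and the uppercase flags vary by language.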

Explanation

FM	Feature model
P-options	Parser options
SVM-options	LIBSVM options together with MaltParser specific options
LAS	Labeled attachment score
UAS	Unlabeled attachment score
L_ACC	Label accuracy
MP	MaltParser 0.4
AV	Average in the CoNLL-X Shared Task
POS	Our position in the shared task (in boldface). Higher and lower positions indicate differences that are not statistically significant.
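
The three metrics above can be sketched as follows, assuming each token is represented as a (head, label) pair and that gold and predicted sequences are aligned token by token. This is an illustration of the definitions, not the official evaluation script; the function name and data layout are assumptions.

```python
def attachment_scores(gold, predicted):
    """Compute LAS, UAS and label accuracy as percentages.

    gold and predicted are equal-length lists of (head, label) pairs,
    one pair per token.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("no tokens to score")
    n = len(gold)
    # LAS: head and label both correct.
    las = sum(g == p for g, p in zip(gold, predicted))
    # UAS: head correct, label ignored.
    uas = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # L_ACC: label correct, head ignored.
    lacc = sum(g[1] == p[1] for g, p in zip(gold, predicted))
    return {
        "LAS": 100.0 * las / n,
        "UAS": 100.0 * uas / n,
        "L_ACC": 100.0 * lacc / n,
    }


gold = [(2, "SBJ"), (0, "ROOT"), (2, "OBJ"), (2, "ADV")]
pred = [(2, "SBJ"), (0, "ROOT"), (2, "ADV"), (3, "ADV")]
print(attachment_scores(gold, pred))
# → {'LAS': 50.0, 'UAS': 75.0, 'L_ACC': 75.0}
```

The official evaluation applied its own rules about which tokens to score, whereas this sketch counts every token, so it should not be expected to reproduce the figures in the table exactly.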

MaltParser CoNLL-X Shared Task Group

Joakim Nivre	Växjö University, Sweden
Johan Hall	Växjö University, Sweden
Jens Nilsson	Växjö University, Sweden
Gülşen Eryiğit	Istanbul Technical University, Turkey
Svetoslav Marinov	University of Skövde, Sweden

More information

More information about parsing algorithms, learning algorithms and feature models can be found in the following publications:

Chang, C.-C. and Lin, C.-J. (2005) LIBSVM: A Library for Support Vector Machines. URL: http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.
Covington, M. A. (2001) A Fundamental Algorithm for Dependency Parsing. In Proceedings of the 39th Annual ACM Southeast Conference, pp. 95-102.
Daelemans, W. and Van den Bosch, A. (2005) Memory-Based Language Processing. Cambridge University Press.
Nivre, J. (2003) An Efficient Algorithm for Projective Dependency Parsing. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT 03), Nancy, France, 23-25 April 2003, pp. 149-160.
Nivre, J., Hall, J. and Nilsson, J. (2004) Memory-Based Dependency Parsing. In Ng, H. T. and Riloff, E. (eds.) Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL), May 6-7, 2004, Boston, Massachusetts, pp. 49-56.
Nivre, J. (2004) Incrementality in Deterministic Dependency Parsing. In Incremental Parsing: Bringing Engineering and Cognition Together. Workshop at ACL-2004, Barcelona, Spain, July 25, 2004.
Nivre, J. and Scholz, M. (2004) Deterministic Dependency Parsing of English Text. In Proceedings of COLING 2004, Geneva, Switzerland, August 23-27, 2004.
Nivre, J. and Nilsson, J. (2005) Pseudo-Projective Dependency Parsing. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 99-106.
Nivre, J., Hall, J. and Nilsson, J. (2006) MaltParser: A Data-Driven Parser-Generator for Dependency Parsing. In Proceedings of LREC.
Nivre, J., Hall, J., Nilsson, J., Eryiğit, G. and Marinov, S. (2006) Labeled Pseudo-Projective Dependency Parsing with Support Vector Machines. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL).
Nivre, J. (2006) Inductive Dependency Parsing. Springer.