MaltParser in the CoNLL-X Shared Task
CoNLL-X Shared Task: Multi-lingual Dependency Parsing
MaltParser 0.4 was used in the CoNLL-X Shared Task on multi-lingual dependency parsing, in the system that obtained the second-best overall score. That score was not significantly worse than the best score, and the system achieved top results for nine of the thirteen languages, with results significantly better than any other system for Japanese, Swedish and Turkish. In this system, MaltParser was combined with pseudo-projective parsing, which requires preprocessing of the training data and post-processing of the parser output (Nivre and Nilsson 2005). The complete system is described in Nivre et al. (2006).
This web page summarizes our results in the shared task and gives the necessary information to reproduce the MaltParser results.
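Pseudo-projective parsing matters here because the arc-eager algorithm only builds projective trees, while treebanks can contain non-projective arcs. As a minimal sketch of the underlying notion (not the actual pre-processing tool), the function below flags arcs that violate projectivity, using the usual definition that an arc from head h to dependent d is projective if every token strictly between them is dominated by h. The function name and the head-list encoding are illustrative assumptions.

```python
def non_projective_arcs(heads):
    """Return the (head, dependent) arcs that are not projective.

    heads[i] is the head of token i + 1; tokens are numbered from 1 and
    0 stands for the artificial root. Assumes a well-formed tree (no cycles).
    """
    def is_dominated_by(node, ancestor):
        # Follow the head chain upwards until we meet the ancestor or the root.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        return ancestor == 0  # the root dominates every token

    violations = []
    for dep, head in enumerate(heads, start=1):
        left, right = sorted((head, dep))
        # Every token strictly between the endpoints must be dominated by the head.
        if any(not is_dominated_by(k, head) for k in range(left + 1, right)):
            violations.append((head, dep))
    return violations


projective = non_projective_arcs([2, 0, 2])        # no crossing arcs
crossing = non_projective_arcs([0, 4, 1, 1])       # arc 4 -> 2 spans token 3
```

In the second example, token 3 lies between tokens 2 and 4 but is attached to token 1, so the arc from 4 to 2 is reported. Projectivization would lift such arcs and record the lift in the label; that step is not shown here.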
MaltParser 0.4
MaltParser 0.4 is available for download (see MaltParser 0.4: User Guide and Download). The pre- and post-processing tools for pseudo-projective parsing are needed to reproduce the MaltParser results in the shared task and are also available for download (see Pseudo-Projective Parsing). The following settings were kept constant for all languages:
Parsing algorithm | NIVRE |
Parser option | -a E (arc-eager) |
Projectivization | Marking strategy for pseudo-projective parsing: 1 |
Learner | SVM |
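The SVM learner is LIBSVM, and the SVM-options column in the results table uses LIBSVM's command-line flags. Assuming their standard LIBSVM meaning (-s SVM type, -t kernel type, -d degree, -g gamma, -r coef0, -c cost, -e termination tolerance), `-s 0 -t 1` selects C-SVC with a polynomial kernel of the form (gamma * u·v + coef0)^degree. The sketch below splits an option string into flag/value pairs and evaluates that kernel on toy vectors. The function names are hypothetical, and the MaltParser-specific flags (-S, -F, -T) are parsed but left uninterpreted.

```python
def parse_options(option_string):
    """Split a flag string such as '-s 0 -t 1 -d 2' into a {flag: value} dict."""
    tokens = option_string.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating flags and values")
    return {flag.lstrip("-"): value
            for flag, value in zip(tokens[::2], tokens[1::2])}


def polynomial_kernel(u, v, gamma, coef0, degree):
    """Polynomial kernel in the LIBSVM form: (gamma * <u, v> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(u, v))
    return (gamma * dot + coef0) ** degree


# Option string copied from the Bulgarian row of the results table.
opts = parse_options(
    "-s 0 -t 1 -d 2 -g 0.2 -c 0.3 -r 0.3 -e 0.1 -S 2 -F C1 -T 1000"
)

# Toy binary feature vectors (illustrative only) sharing two active features.
k = polynomial_kernel(
    [1, 0, 1], [1, 1, 1],
    gamma=float(opts["g"]), coef0=float(opts["r"]), degree=int(opts["d"]),
)
```

With degree 2, as used for every language, the kernel implicitly compares pairs of features as well as single features.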
Settings and Results
Language | FM | P-options | SVM-options | LAS (MP) | LAS (AV) | LAS (POS) | UAS (MP) | UAS (AV) | LACC (MP) | LACC (AV) |
Arabic | ara5 | -a E -o 3 | -s 0 -t 1 -d 2 -g 0.16 -c 0.3 -r 0 -e 1.0 -S 0 | 66.71 | 59.94 | 1-3-4 | 77.52 | 73.48 | 80.34 | 75.12 |
Bulgarian | bul2 | -a E -o 2 | -s 0 -t 1 -d 2 -g 0.2 -c 0.3 -r 0.3 -e 0.1 -S 2 -F C1 -T 1000 | 87.41 | 79.98 | 1-2 | 91.72 | 85.89 | 90.44 | 84.38 |
Chinese | chi4 | -a E -o 2 | -s 0 -t 1 -d 2 -g 0.2 -c 0.3 -r 0.3 -e 0.1 -S 0 | 86.92 | 78.32 | 2-3 | 90.54 | 84.85 | 89.01 | 81.66 |
Czech | cze4 | -a E -o 3 | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1 -F P1 -S 2 -T 200 | 78.42 | 67.17 | 2 | 84.80 | 77.01 | 85.40 | 76.59 |
Danish | dan3 | -a E -o 2 | -s 0 -t 1 -d 2 -g 0.2 -c 0.6 -r 0.3 -e 1.0 -S 0 | 84.77 | 78.31 | 1-2 | 89.80 | 84.52 | 89.16 | 84.50 |
Dutch | dut5 | -a E -o 2 | -s 0 -t 1 -d 2 -g 0.16 -c 0.3 -r 0.0 -e 1 -S 0 | 78.59 | 70.73 | 1-2-3 | 81.35 | 75.07 | 83.69 | 77.57 |
German | ger3 | -a E -o 2 | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 1 -S 2 -F P1 -T 1000 | 85.82 | 78.58 | 2-3 | 88.76 | 82.60 | 91.03 | 86.26 |
Japanese | jap1 | -a E -o 2 | -s 0 -t 1 -d 2 -g 0.19 -c 0.6 -r 0 -e 0.1 -S 0 | 91.65 | 85.86 | 1 | 93.10 | 89.05 | 94.34 | 89.90 |
Portuguese | por4 | -a E -o 3 | -s 0 -t 1 -d 2 -g 0.2 -c 0.5 -r 0 -e 0.1 -S 0 | 87.60 | 80.63 | 1-2 | 91.22 | 86.46 | 91.54 | 85.35 |
Slovene | slo4 | -a E -o 3 | -s 0 -t 1 -d 2 -g 0.20 -c 0.1 -r 0.8 -e 0.1 -S 2 -F C1 -T 600 | 70.30 | 65.16 | NA | 78.72 | 76.53 | 80.54 | 76.31 |
Spanish | spa2 | -a E -o 2 | -s 0 -t 1 -d 2 -g 0.20 -c 0.5 -r 0 -e 0.01 -S 2 -F P1 -T 1000 | 81.29 | 73.52 | 1-2-3 | 84.67 | 77.76 | 90.06 | 85.71 |
Swedish | swe3 | -a E -o 2 | -s 0 -t 1 -d 2 -g 0.2 -c 0.4 -r 0 -e 0.1 -S 0 | 84.58 | 76.44 | 1 | 89.50 | 84.21 | 87.39 | 80.00 |
Turkish | tur1 | -a E -o 2 | -s 0 -t 1 -d 2 -g 0.12 -c 0.7 -r 0.6 -e 0.01 -S 2 -F C1 -T 100 | 65.68 | 55.95 | 1 | 75.82 | 69.35 | 78.49 | 69.59 |
Average (12 lang) | | | | 80.19 | | 1-2 | 85.48 | | 86.75 | |
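The table lists thirteen languages, while the average row covers twelve. The reported averages are reproduced when Bulgarian is left out; this is an inference from the arithmetic, not something stated on this page. The short check below recomputes the three MP averages from the scores in the table. The dictionary and function names are illustrative.

```python
# MaltParser (MP) scores copied from the results table: language -> (LAS, UAS, LACC)
MP_SCORES = {
    "Arabic": (66.71, 77.52, 80.34),
    "Bulgarian": (87.41, 91.72, 90.44),
    "Chinese": (86.92, 90.54, 89.01),
    "Czech": (78.42, 84.80, 85.40),
    "Danish": (84.77, 89.80, 89.16),
    "Dutch": (78.59, 81.35, 83.69),
    "German": (85.82, 88.76, 91.03),
    "Japanese": (91.65, 93.10, 94.34),
    "Portuguese": (87.60, 91.22, 91.54),
    "Slovene": (70.30, 78.72, 80.54),
    "Spanish": (81.29, 84.67, 90.06),
    "Swedish": (84.58, 89.50, 87.39),
    "Turkish": (65.68, 75.82, 78.49),
}


def average_scores(scores, exclude=("Bulgarian",)):
    """Average each metric over all languages except those in `exclude`.

    Excluding Bulgarian is an assumption made to match the 12-language row.
    """
    kept = [values for lang, values in scores.items() if lang not in exclude]
    # zip(*kept) regroups the per-language tuples into one column per metric.
    return tuple(round(sum(column) / len(kept), 2) for column in zip(*kept))


las_avg, uas_avg, lacc_avg = average_scores(MP_SCORES)
```

The three values agree with the LAS, UAS and LACC entries in the average row.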
Explanation
FM | Feature model |
P-options | Parser options |
SVM-options | LIBSVM options together with MaltParser specific options |
LAS | Labeled attachment score |
UAS | Unlabeled attachment score |
LACC | Label accuracy |
MP | MaltParser 0.4 |
AV | Average in the CoNLL-X Shared Task |
POS | Our position in the shared task (in boldface). Higher and lower positions indicate differences that are not statistically significant. |
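To make the three metrics concrete, here is a minimal sketch of how they can be computed from gold-standard and predicted analyses. It is not the official shared-task evaluation script: it assumes each token is given as a (head, label) pair and that any filtering of tokens (for example, punctuation) has already been applied. The function name and the toy labels are illustrative.

```python
def attachment_scores(gold, predicted):
    """Return LAS, UAS and label accuracy (LACC) as percentages.

    gold and predicted are equal-length lists of (head, label) pairs,
    one pair per scored token.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")
    n = len(gold)
    # LAS: both head and label correct; UAS: head correct; LACC: label correct.
    las = sum(g == p for g, p in zip(gold, predicted))
    uas = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    lacc = sum(g[1] == p[1] for g, p in zip(gold, predicted))
    return {"LAS": 100 * las / n, "UAS": 100 * uas / n, "LACC": 100 * lacc / n}


gold = [(2, "SUBJ"), (0, "ROOT"), (2, "OBJ"), (3, "ATT")]
pred = [(2, "SUBJ"), (0, "ROOT"), (2, "ATT"), (2, "ATT")]
scores = attachment_scores(gold, pred)
```

In this toy example the third token has the right head but the wrong label, and the fourth has the right label but the wrong head, so UAS and LACC are both higher than LAS.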
MaltParser CoNLL-X Shared Task Group
Joakim Nivre | Växjö University, Sweden |
Johan Hall | Växjö University, Sweden |
Jens Nilsson | Växjö University, Sweden |
Gülşen Eryiğit | Istanbul Technical University, Turkey |
Svetoslav Marinov | University of Skövde, Sweden |
More information
More information about parsing algorithms, learning algorithms and feature models can be found in the following publications:
- Chang, C.-C. and Lin, C.-J. (2005) LIBSVM: A Library for Support Vector Machines. URL: http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.
- Covington, M. A. (2001) A Fundamental Algorithm for Dependency Parsing. In Proceedings of the 39th Annual ACM Southeast Conference, pp. 95-102.
- Daelemans, W. and Van den Bosch, A. (2005) Memory-Based Language Processing. Cambridge University Press.
- Nivre, J. (2003) An Efficient Algorithm for Projective Dependency Parsing. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT 03), Nancy, France, 23-25 April 2003, pp. 149-160.
- Nivre, J., Hall, J. and Nilsson, J. (2004) Memory-Based Dependency Parsing. In Ng, H. T. and Riloff, E. (eds.) Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL), May 6-7, 2004, Boston, Massachusetts, pp. 49-56.
- Nivre, J. (2004) Incrementality in Deterministic Dependency Parsing. In Incremental Parsing: Bringing Engineering and Cognition Together. Workshop at ACL-2004, Barcelona, Spain, July 25, 2004.
- Nivre, J. and Scholz, M. (2004) Deterministic Dependency Parsing of English Text. In Proceedings of COLING 2004, Geneva, Switzerland, August 23-27, 2004.
- Nivre, J. and Nilsson, J. (2005) Pseudo-Projective Dependency Parsing. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 99-106.
- Nivre, J., Hall, J. and Nilsson, J. (2006) MaltParser: A Data-Driven Parser-Generator for Dependency Parsing. In Proceedings of LREC.
- Nivre, J., Hall, J., Nilsson, J., Eryiğit, G. and Marinov, S. (2006) Labeled Pseudo-Projective Dependency Parsing with Support Vector Machines. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL).
- Nivre, J. (2006) Inductive Dependency Parsing. Springer.