README

This glossary contains German-to-English transfer entries, annotated with probabilities. It is derived from a parallel corpus of 3.8 million sentences, with a focus on the European administrative domain (Europarl, JRC-Acquis).

Three probabilities have been computed: 
	Package probability: the translation probability relative to the transfer package, i.e. relative to the other translations of the given source-language term.
	Target probability: the translation probability relative to all uses of the term as a target term, by any source term.
	Corpus probability: the translation probability relative to all occurrences of the target term anywhere in the corpus.

Probabilities can be 0. The creation of the glossary is described in PANACEA report D5.6: Transfer Selection Support.


The file is tab-delimited; the columns are:
1.	German term
2.	German part of speech
3.	English translation
4.	English part of speech
5.	(irrelevant)
6.	(irrelevant)
7.	Package probability
8.	Target probability
9.	Corpus probability

All entries are lemmas; the glossary contains about 22,000 entries.
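
The column layout described above could be read along the following lines. This is only a minimal Python sketch, not part of the glossary distribution. It assumes that the file has no header line and that probabilities are written as plain decimal numbers with a dot. The names Entry, read_glossary and best_translation are hypothetical, and the sample rows (including their part-of-speech tags) are made up for illustration; they are not taken from the glossary.

```python
import csv
from typing import Iterable, Iterator, List, NamedTuple, Optional


class Entry(NamedTuple):
    """One glossary line; columns 5 and 6 are skipped as irrelevant."""
    de_term: str
    de_pos: str
    en_term: str
    en_pos: str
    package_prob: float
    target_prob: float
    corpus_prob: float


def read_glossary(lines: Iterable[str]) -> Iterator[Entry]:
    """Parse tab-delimited lines; rows with fewer than 9 columns are skipped."""
    # QUOTE_NONE: treat quote characters in terms as ordinary text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 9:
            continue
        # Assumption: columns 7-9 parse as floats (e.g. "0.25" or "0").
        yield Entry(row[0], row[1], row[2], row[3],
                    float(row[6]), float(row[7]), float(row[8]))


def best_translation(entries: List[Entry], de_term: str) -> Optional[Entry]:
    """Return the entry with the highest package probability for a German term."""
    candidates = [e for e in entries if e.de_term == de_term]
    if not candidates:
        return None
    return max(candidates, key=lambda e: e.package_prob)


# Made-up sample rows in the nine-column layout (not real glossary data).
SAMPLE = [
    "Haus\tNo\thouse\tNo\t-\t-\t0.8\t0.5\t0.01\n",
    "Haus\tNo\thome\tNo\t-\t-\t0.2\t0.1\t0\n",
]

if __name__ == "__main__":
    sample_entries = list(read_glossary(SAMPLE))
    print(best_translation(sample_entries, "Haus"))
```

To read the actual file, an open file object can be passed instead of the sample list, e.g. read_glossary(open(path, encoding="utf-8", newline="")), where path and the encoding are assumptions to be checked against the file at hand. Since probabilities can be 0, a caller may want to filter such entries out before ranking.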
