- a bounded collection of texts for linguistic research
- a collection of writings
"he edited the Hemingway corpus"
- Tatoeba's corpus is heterogeneous in many dimensions.
- We created a freely available English-Japanese bilingual corpus.
- I would prefer to have a list of Italian words which aren't in the corpus.
- Do any online learning resources query this corpus for example material?
- The corpus is not structured as a table but as a graph.
- Some people say the corpus is intended for learners, some think it's more for scientific purposes.
- I wish there were more Native American languages in the Tatoeba Corpus.
- I wonder why the names Tom and Mary are often used in sentences that are in the Tatoeba Corpus.
- These and perhaps other sentences need to be removed from the corpus. They are from a copyrighted book.
- As you contribute more sentences to the Tatoeba Corpus in your native language, the percentage of sentences in your native language with errors will likely decrease.
- One way to lower the number of errors in the Tatoeba Corpus would be to encourage people to only translate into their native languages instead of vice versa.
- The Tatoeba corpus contains so many contributions that it cannot be seriously damaged or denatured by the injection of any conceivable amount of noise.
- One way to lower the number of errors in the Tatoeba Corpus would be to encourage people to only translate into their native languages.
- Were we to populate the corpus with unnatural sentences or inaccurate translations, this resource wouldn't be of much use, now would it?
- One way to lower the number of errors in the Tatoeba Corpus would be to encourage people to only translate into their native languages instead of the other way around.