Update 1.3.0

Feb 03, 2017

  • scrapped SoMaJo German tokenizer
  • reverted to Moses built-in tokenizer
  • supplemented tokenizer with true unsupervised compound splitting for any language (ask us if you want to build a custom compound splitting dictionary for your language)
  • ships with German compound splitting dictionary (lexicon of 136,450 unique words, each occurring 3 or more times in the EuroParl German corpus)
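
For readers unfamiliar with dictionary-based compound splitting, here is a minimal Python sketch of one common frequency-based approach. It is not the implementation shipped in this update, whose internals are not described here. The function name `split_compound`, the `min_part` parameter, and the word counts in the usage example are all hypothetical; only the `min_count=3` default mirrors the "3 or more" threshold mentioned above. The sketch also ignores German linking elements such as "s" or "es".

```python
from math import prod


def split_compound(word, freq, min_part=3, min_count=3):
    """Split `word` into lexicon words with the highest geometric-mean frequency.

    `freq` maps lowercase words to corpus counts (an assumed format).
    Returns [word] unchanged when no better split is found.
    """
    word = word.lower()

    def candidates(rest):
        # Yield every way to cut `rest` into words seen at least `min_count` times.
        if freq.get(rest, 0) >= min_count:
            yield [rest]
        for i in range(min_part, len(rest) - min_part + 1):
            head = rest[:i]
            if freq.get(head, 0) >= min_count:
                for tail in candidates(rest[i:]):
                    yield [head] + tail

    # Start from the unsplit word; an unknown word scores 0.
    best, best_score = [word], float(freq.get(word, 0))
    for parts in candidates(word):
        # Geometric mean of the part frequencies.
        score = prod(freq[p] for p in parts) ** (1.0 / len(parts))
        if score > best_score:
            best, best_score = parts, score
    return best


# Usage with made-up counts (not taken from EuroParl):
lexicon = {"donau": 50, "dampf": 40, "schiff": 120, "dampfschiff": 4}
print(split_compound("Donaudampfschiff", lexicon))
```

Scoring by geometric mean favors splits into frequent parts, so a rare compound such as "dampfschiff" is broken down further while a frequent standalone word is left intact.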


WARNING TO ALL GERMAN USERS:

This update requires that you rebuild all of your engines, so plan carefully and apply it only when you have time to do so. Please contact us for instructions if you need to migrate engines gradually over time.

