You're eager to get started. Here are a few steps that will help you get the most out of your Slate Desktop experience:
- Export your translation units from your CAT's native translation memory to translation memory exchange (TMX) files. With large translation memories, this can take several hours in itself.
- Before the export, normalize inconsistent metadata. If a subject is saved as "IT" in some translation units and "Information Technology" in others, normalize it to one or the other. Also normalize casing for consistency. You can do this with Slate Desktop's Curating TMX translation units process, but it might be easier with your TM's native tools.
- During the export, preserve metadata about translation units. Consider preserving the translator's name, the client, the subject or other information that relates to the subject (e.g. department in the company), the original translated file's name, etc.
- Export your translation units to several smaller TMX files consistent with the metadata categories. Again, your TM's native tools might be easier than using Slate Desktop's Curating TMX translation units process.
- If your TM tools give you the option, configure the export process to replace variables/placeholders with meaningful content.
- I do not know if cleaning your TMs with tools such as XBench will help. You might want to experiment. To save time, read these articles that describe Slate Desktop's processing, and compare these processes with XBench or other processes:
- The intention of all processing should be to remove or correct errors and reduce the sentences to their core semantic components.
- It is possible to over-process your TMs. For example, if you convert all source language flat double quote instances from " to “ and ” (as appropriate), then your engine's source language vocabulary will be devoid of the " token. When you later translate a sentence that contains the flat double quote ", the engine won't know how to translate it. Slate Desktop can support custom processing that eliminates these risks, and we can help under a professional services relationship. As a general rule with SMT, it's best to normalize the target half of your TM, but allow the natural variations to remain in the source language half.
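
The general rule in the last step (normalize the target half, leave the source half alone) can be sketched in a few lines of Python. This is only an illustration, not Slate Desktop's own processing: the function names, the sample TMX and the naive "alternate opening and closing" quote logic are all assumptions, and the sketch only touches text before the first inline tag in each segment.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def curl_quotes(text):
    """Replace flat double quotes with alternating curly quotes (naive pairing)."""
    out = []
    opening = True
    for ch in text:
        if ch == '"':
            out.append("\u201c" if opening else "\u201d")
            opening = not opening
        else:
            out.append(ch)
    return "".join(out)


def normalize_target_quotes(tmx_text, target_lang):
    """Normalize quotes only in segments whose language starts with target_lang."""
    root = ET.fromstring(tmx_text)
    for tuv in root.iter("tuv"):
        lang = tuv.get(XML_LANG, "")
        if not lang.lower().startswith(target_lang.lower()):
            continue  # source-language segments keep their natural variation
        seg = tuv.find("seg")
        if seg is not None and seg.text:
            # Only the text before the first inline tag is handled here.
            seg.text = curl_quotes(seg.text)
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-language translation unit for illustration.
SAMPLE_TMX = """<tmx version="1.4"><header/><body>
<tu>
<tuv xml:lang="en-US"><seg>Click "OK" to continue.</seg></tuv>
<tuv xml:lang="fr-FR"><seg>Cliquez sur "OK" pour continuer.</seg></tuv>
</tu>
</body></tmx>"""

print(normalize_target_quotes(SAMPLE_TMX, "fr"))
```

Because the source segments are left untouched, the flat " token stays in the engine's source vocabulary, while the target side settles on one quote style.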
Note that curating your translation units gives you the opportunity to re-mix and prioritize how they influence an engine's performance. This does not guarantee your engines will have better linguistic performance, but as long as all your translation units sit in one big-mama collection, there's no way to experiment to find out.
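
If your TM's native tools cannot split an export by metadata, a small script can do a rough version of the metadata steps above. The sketch below assumes the subject is stored in a `<prop type="x-subject">` element. That property name is hypothetical, since each CAT tool writes its own property types, and the alias table and function names are likewise made up for illustration. It normalizes the subject value and groups the translation units into one TMX document per subject.

```python
import copy
import xml.etree.ElementTree as ET
from collections import defaultdict

SUBJECT_PROP = "x-subject"  # hypothetical property type; check your own export

# Hypothetical alias table: lowercased variants mapped to one canonical label.
ALIASES = {
    "it": "Information Technology",
    "information technology": "Information Technology",
}


def canonical_subject(raw):
    """Collapse whitespace, ignore casing, and apply the alias table."""
    cleaned = " ".join(raw.split())
    return ALIASES.get(cleaned.lower(), cleaned)


def split_by_subject(tmx_text):
    """Return a dict mapping each canonical subject to a TMX document string."""
    root = ET.fromstring(tmx_text)
    header = root.find("header")
    groups = defaultdict(list)
    for tu in root.iter("tu"):
        subject = "unknown"
        for prop in tu.findall("prop"):
            if prop.get("type") == SUBJECT_PROP and prop.text:
                subject = canonical_subject(prop.text)
                prop.text = subject  # write the normalized value back
        groups[subject].append(tu)

    result = {}
    for subject, units in groups.items():
        new_root = ET.Element("tmx", root.attrib)
        if header is not None:
            new_root.append(copy.deepcopy(header))
        body = ET.SubElement(new_root, "body")
        body.extend(units)
        result[subject] = ET.tostring(new_root, encoding="unicode")
    return result


# Hypothetical export with inconsistent subject metadata.
SAMPLE_EXPORT = """<tmx version="1.4"><header/><body>
<tu><prop type="x-subject">IT</prop>
<tuv xml:lang="en"><seg>Restart the server.</seg></tuv></tu>
<tu><prop type="x-subject">information technology</prop>
<tuv xml:lang="en"><seg>Install the update.</seg></tuv></tu>
<tu><prop type="x-subject">Legal</prop>
<tuv xml:lang="en"><seg>The parties agree.</seg></tuv></tu>
</body></tmx>"""

for subject, document in split_by_subject(SAMPLE_EXPORT).items():
    print(subject, document.count("<tu>"))
```

Each returned string can be written to its own file, which gives you separate collections to mix and weight in different ways when you experiment.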