JBI Studios' Blog on Voice-Over, Dubbing, and Multimedia Localization.

4 Tips to Avoid Text Corruption When Subtitling Right-to-Left Languages

Text-based formats like SRT and WebVTT have revolutionized subtitling and captioning, making the process more adaptable, streamlined and cost-effective, especially for multilingual releases. Right-to-left languages, including Arabic, Farsi, Urdu and Hebrew, have been no exception. That said, these languages still require special attention to avoid issues in text-based deliverables.

This post lists the 4 tips multimedia localization professionals must know to ensure the integrity of their right-to-left subtitling projects.

[Average read time: 3 minutes]

Why is right-to-left subtitling susceptible to text corruption?

One main reason – the multimedia localization process itself. There are several steps between translation and final implementation, including linguistic QA, rounds of changes, the first implementation pass, functional testing and QA, functional bug fixes (if any), and final QA. Each of these steps has the potential to introduce text corruption.

Moreover, text-based deliverables – including SRT, WebVTT, STL, DFXP, TTML, CAP and ITT – don’t get burned to picture. The burning process effectively “locks” in the subtitle text, ensuring that no more edits can be made. Text-based deliverables, on the other hand, are editable at all stages of the post-production workflow. As a result, issues can be introduced even towards the end of a project, especially when non-linguistic elements like time-codes and metadata get tweaked.

So – what do video localization professionals need to do to deliver immaculate right-to-left subtitles? Let’s jump right in.

1. Control how your translation and deliverable files are edited.

It’s critical to ensure that any files undergoing translation get opened on systems with proper language support, since saving in the wrong encoding, or even adding a misplaced space, can introduce issues. Be extra careful when small tweaks get made to the time-codes or metadata, as mentioned above, especially after a functional review. If these tweaks are done by editors who don’t speak the target language, or on machines that don’t have the necessary language support installed, they may introduce corruption issues.
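One way to put this into practice is to confirm that the subtitle text is untouched after someone edits time-codes. The following is a minimal Python sketch, not a production tool. It assumes the deliverable is an SRT file saved as UTF-8, and the function names (decode_strict, cue_texts, text_changed) are made up for this illustration.

```python
import re

# SRT timing line, e.g. "00:00:01,000 --> 00:00:04,000"
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def decode_strict(raw: bytes) -> str:
    """Decode bytes as UTF-8 and raise UnicodeDecodeError on invalid bytes.

    The "utf-8-sig" codec also drops a leading byte order mark if present.
    """
    return raw.decode("utf-8-sig")


def cue_texts(srt: str) -> list[str]:
    """Return only the subtitle text lines of an SRT document.

    Blank lines, timing lines and digit-only lines (cue numbers) are
    skipped. Limitation: a subtitle line made only of digits is skipped too.
    """
    lines = []
    for line in srt.splitlines():
        stripped = line.strip()
        if not stripped or stripped.isdigit() or TIMING.match(stripped):
            continue
        lines.append(line)
    return lines


def text_changed(before: bytes, after: bytes) -> list[tuple[str, str]]:
    """Return (old, new) pairs for text lines that differ between versions."""
    old = cue_texts(decode_strict(before))
    new = cue_texts(decode_strict(after))
    diffs = [(o, n) for o, n in zip(old, new) if o != n]
    if len(old) != len(new):
        diffs.append((f"{len(old)} text lines", f"{len(new)} text lines"))
    return diffs
```

Read both versions in binary mode (for example with open(path, "rb")) and pass the bytes to text_changed. An empty result means only time-codes or cue numbers changed; anything else deserves a look from the linguist.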

2. Pay special attention to the left-to-right elements inside your text.

Translations often retain English-language terms within the localized text, especially on corporate or e-Learning projects. This can include brand names, proper names, and street or web addresses. Right-to-left text has to switch directionality to display each string of Latin characters, and then revert to right-to-left, which is why otherwise clean text often develops issues at exactly these points.

It may not be possible to avoid English-language terms in the subtitle translations, but it’s important to pay attention to them. For example, if you’re planning to add line-breaks manually, avoid them inside left-to-right strings. Remember to check the spacing at the beginning and end of these strings, especially if there’s a right-to-left comma adjacent to one of them. And of course, be extra careful when making edits to them, especially towards the end of a workflow.
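To help a reviewer find these spots, the left-to-right strings can be listed automatically. Here is a rough Python sketch that uses the standard unicodedata module to read each character's bidirectional class. The helper names (ltr_runs, break_inside_ltr) and the sample web address are placeholders for this example, and the line-break check is only a heuristic: it will also flag two separate Latin terms that happen to sit on either side of a break.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # Hebrew-type and Arabic-type letters


def is_rtl(ch: str) -> bool:
    return unicodedata.bidirectional(ch) in RTL_CLASSES


def is_ltr(ch: str) -> bool:
    return unicodedata.bidirectional(ch) == "L"


def ltr_runs(line: str) -> list[str]:
    """List left-to-right runs inside a line that also has RTL letters.

    A run starts and ends on a left-to-right letter; spaces, digits and
    punctuation between two such letters stay inside the run.
    """
    if not any(is_rtl(c) for c in line):
        return []
    runs = []
    start = end = None
    for i, ch in enumerate(line):
        if is_ltr(ch):
            if start is None:
                start = i
            end = i
        elif is_rtl(ch) and start is not None:
            runs.append(line[start:end + 1])
            start = end = None
    if start is not None:
        runs.append(line[start:end + 1])
    return runs


def break_inside_ltr(cue_lines: list[str]) -> bool:
    """Heuristic: does a line break appear to split a left-to-right run?"""
    if not any(is_rtl(c) for line in cue_lines for c in line):
        return False
    for upper, lower in zip(cue_lines, cue_lines[1:]):
        u, l = upper.strip(), lower.strip()
        if u and l and is_ltr(u[-1]) and is_ltr(l[0]):
            return True
    return False
```

For a line such as "زوروا JBI Studios على www.example.com اليوم", ltr_runs returns the brand name and the web address as two separate runs, which tells the reviewer which strings need their spacing and neighboring punctuation checked.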

3. Know what to look for.

How can you tell that something’s wrong? The big give-away is punctuation marks that are out of place, like a period that’s not at the end of a sentence, or a comma directly following a period. The following image shows an Arabic subtitling SRT file that’s OK, and one that has issues (period highlighted in yellow).

[Image: an Arabic subtitling SRT file with clean text, next to one with corrupted text]
This is a great tell-tale in Arabic (and in languages that use the Arabic script, like Urdu and Farsi), as well as in Hebrew. In fact, the modern versions of both Arabic and Hebrew scripts have widely adopted Latin punctuation, including question and exclamation marks, which Western eyes can spot easily. Remember, though, that these marks may be mirrored. For example, you'll see a ؟ symbol in Arabic and Urdu text. Hebrew, on the other hand, keeps question marks in the same direction as English, so expect to see ? in Hebrew subtitling files.
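The two give-aways described above can also be screened for automatically before a human review. The Python sketch below is a heuristic only and makes an assumption worth stating: it looks at the text as stored in the file, where a sentence-ending mark normally comes after the last letter, so a right-to-left line that starts with punctuation is suspect. The function name suspicious is invented for this example, and lines that legitimately open with an ellipsis will show up as false positives.

```python
import re
import unicodedata

# Marks that should not normally open a stored subtitle line
LEADING_PUNCT = ".,!?:;،؛؟"
# A Latin or Arabic comma directly following a period
COMMA_AFTER_PERIOD = re.compile(r"\.[,،]")


def has_rtl(line: str) -> bool:
    """True if the line contains Hebrew-type or Arabic-type letters."""
    return any(unicodedata.bidirectional(c) in {"R", "AL"} for c in line)


def suspicious(line: str) -> list[str]:
    """Return the reasons a right-to-left line looks corrupted, if any."""
    reasons = []
    if not has_rtl(line):
        return reasons
    text = line.strip()
    if text[0] in LEADING_PUNCT:
        reasons.append("starts with punctuation")
    if COMMA_AFTER_PERIOD.search(text):
        reasons.append("comma directly after period")
    return reasons
```

Run suspicious over every text line in the deliverable and send the flagged lines to the linguist; a clean result does not replace a proper review.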

4. Get a locked visual reference for the subtitle translations.

This is a multimedia localization best practice, and usually it means asking the linguist to create a PDF of the voice-over script or bilingual file. However, this one is tricky for text-based subtitles, since many translation interfaces don’t support output to PDF. Fortunately, there are other options. For example, your linguist can take a screenshot of a text sample, capturing as much content as possible. Since most text corruptions are global, this can serve as a baseline for catching them.

Test your workflow & implement a thorough QA review

Finally, don’t forget these two video localization best practices. Testing your workflow is particularly necessary if your final deliverable is to a proprietary video player or format, or even to ones that aren’t widely used. And of course, a thorough quality assurance review is essential for any multimedia localization workflow, including on voice-over and dubbing projects. QA is the only way to ensure the quality, accuracy and integrity of your final productions, but on text-based captioning and subtitling projects, it’s also the only way to catch any text corruption issues that may have snuck in at the very last minute.
