Case study: Preparing a non-standard tagged file for translation in Trados Studio

Two years ago I posted an article that described the process of preparing a non-standard tagged file for translation in Trados TagEditor. Now the time has come to re-evaluate the options that I, as a translator, have when translating tagged files that come from various sources and are not necessarily based on HTML, FrameMaker, or any other similar standard.

You will need the following:
  • PSPad or any other Unicode-enabled text editor
  • SDL Trados Studio 2011
  • Dummy.ini (download here)

First, let's take the following sample text:

<ARRAY nElems=36>
<STRING>Error Potentiometer Hip Left</STRING>
<STRING>Error Potentiometer Knee Left</STRING>
<PART order=0><LABEL><STEXT>Operator Switch Light Curtain</STEXT></LABEL></PART>
<PART order=0><LABEL><STEXT>I confirm that I am aware that the XXX? system may only be used if the mandatory maintenance and safety checks have been carried out during the last 15 months by an engineer certified by the manufacturer. These measures are necessary in order to guarantee your patients' safety and the XXX system's reliability. The mandatory maintenance and safety checks are therefore due in no later than %0.f days. Contact your relevant XXX service center for further information, or book the mandatory maintenance check, if you have not already done this, via</STEXT></LABEL></PART>

Only the text between the tags (marked in red in the original post) needs to be translated.

In the previous article, the objective was to prepare the text for translation in TagEditor.

Now, let’s take a look at how to achieve the same result more quickly and conveniently with SDL Trados Studio 2011. First, let’s make a list of tags, i.e. non-translatable content that we want to keep as it is. In our case we have these tags:

  • <ARRAY nElems=36>
  • <STRING>, </STRING>
  • <PART order=0>, </PART>
  • <LABEL>, </LABEL>
  • <STEXT>, </STEXT>

Please note that the ARRAY and PART tags contain attributes.
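For a larger file, collecting the tag list by hand is error-prone. Below is a minimal Python sketch of how such a list could be gathered automatically. The function name tag_inventory and both regular expressions are my own illustration, not part of any Trados tool, and they assume simple tags of the form <NAME attr=value> as in the sample above.

```python
import re

# Assumption: tags look like <NAME>, </NAME> or <NAME attr=value ...>
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w-]*)([^<>]*)>")
ATTR_RE = re.compile(r"([A-Za-z][\w-]*)\s*=")


def tag_inventory(text):
    """Map each tag name found in text to the set of its attribute names."""
    inventory = {}
    for _closing, name, rest in TAG_RE.findall(text):
        attrs = inventory.setdefault(name, set())
        attrs.update(ATTR_RE.findall(rest))
    return inventory


SAMPLE = """<ARRAY nElems=36>
<STRING>Error Potentiometer Hip Left</STRING>
<PART order=0><LABEL><STEXT>Operator Switch Light Curtain</STEXT></LABEL></PART>
"""

# Print one line per tag name, followed by its attribute names
for name, attrs in tag_inventory(SAMPLE).items():
    print(name, sorted(attrs))
```

Each name in the result corresponds to one element to define in Studio, and each attribute name to one entry in that element's Attributes area.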

First, we have to prepare the source file, because the end-of-line (EOL) characters in the source are not recognized as line breaks by Studio. We have to replace the original EOL characters with the <br /> HTML tag. Open the source file (PSPad is used in this case), press CTRL+H, and set up the Replace window so that each EOL character is replaced with <br />.

Press OK and select the Replace All option. The source file is now ready.
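If you prefer a script to a text editor, the same preparation can be sketched in Python. This is only an illustration under two assumptions of mine: the <br /> tag is placed in front of each line break while the break itself is kept (which matches the later cleanup step, where the tags are simply deleted), and the file is stored as UTF-16. The names add_br and prepare_file are hypothetical.

```python
import re

# Matches Windows (CRLF), old Mac (CR) and Unix (LF) line endings
EOL_RE = re.compile(r"\r\n|\r|\n")


def add_br(text):
    """Insert '<br />' in front of every line break, keeping the break itself."""
    return EOL_RE.sub(lambda m: "<br />" + m.group(0), text)


def prepare_file(src_path, dst_path, encoding="utf-16"):
    """Read src_path, add the <br /> tags, and write the result to dst_path."""
    # newline="" stops Python from translating EOL characters on read and write
    with open(src_path, encoding=encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=encoding, newline="") as dst:
        dst.write(add_br(text))
```

Calling prepare_file("source.txt", "source_prepared.html") with your own file names would then replace the manual search-and-replace step.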

  1. Run SDL Trados Studio.
  2. Select Tools > Options. In the left pane, click File Types and click the New button on the right.
  3. In the Select Type window, select HTML. Click OK.
  4. Enter the name of the new file type in the File type name field (in our case, “CaseStudy”). Enter an identifier in the File type identifier field (in our case, “CaseStudy”). This can be anything that describes your settings. Click Next.
  5. In the HTML settings import window, click Browse and locate the Dummy.ini file that you downloaded earlier (unzip it to a location of your choice). Click Dummy.ini and select Open.
  6. Click Finish.

Now, we have our new file type (CaseStudy) in the list of available file types. We are going to enter a list of tags (called elements and attributes).

  • In the left pane, click the CaseStudy and then Elements and attributes.

  • In the right pane, click on the Dummy line and select Delete.
  • Let’s add the tags now. First, <ARRAY nElems=36>. Click Add. In the Element Settings window, enter the element name in the Name field. In our case, enter “ARRAY” (case-sensitive and without brackets!). In Content type, select EMPTY, and for Tag type, select Always inside segment. In Content protection, select Always protected (even in explicitly translatable content).
  • This tag has an attribute (nElems=36), so we have to define it as well. In the Attributes area, click Add. Enter the attribute name. In our case, enter nElems. Do not include “=36”. Keep the Translate checkbox unchecked.
  • Click OK. We have our first tag defined.
  • Next, we define the <STRING> tag. Click Add. In the Element Settings window, enter the element name in the Name field. In our case, enter “STRING” (case-sensitive and without brackets!). In Content type, select EMPTY, and for Tag type, select Always inside segment. In Content protection, select Always protected (even in explicitly translatable content).

Define the remaining tags in the same way, i.e. <PART order=0> (do not forget to define the attribute!), <LABEL>, and <STEXT>.

After all tags are defined, our window looks like this:

Click OK.

Let’s test the sample file in SDL Trados Studio 2011.

  1. Save the sample text as an HTML file using Notepad or any other Unicode-enabled program, such as PSPad. Use UTF-16 LE encoding.
  2. In Studio, select File > Open > Document.
  3. Open the file with sample text you have previously created.
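Saving the file in the right encoding (step 1) can also be scripted. The following Python sketch shows one way to write UTF-16 LE. Writing a byte order mark in front of the text is my assumption about what an editor does when saving as "UTF-16 LE", and the function names are hypothetical.

```python
import codecs


def encode_utf16le(text):
    """Return text as UTF-16 LE bytes, prefixed with a byte order mark."""
    # Assumption: a BOM is written first, as text editors typically do
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def save_as_html(text, path):
    """Write text to path as a UTF-16 LE file."""
    with open(path, "wb") as handle:
        handle.write(encode_utf16le(text))
```

For example, save_as_html(sample_text, "sample.html") would produce a file to open in Studio, where sample_text and the file name are placeholders for your own data.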

Studio opens the file, and all non-translatable content (i.e. <ARRAY nElems=36>, <PART order=0>, <LABEL>, and so on) is tagged and thus cannot be tampered with.

The actual file for translation looks as follows:

Benefit: If you prepare the data for translation like this, you may achieve higher leverage. In the original condition, the tags themselves are included in the analysis, and some matches may be missed purely because of differing tags. This is obvious in segments 2 and 3 above: in the original condition they yield a 0% match, while after preparation you obtain an 83% match. An exact word count analysis also becomes possible.
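To see why the tags distort a word count, consider this rough Python sketch. It uses a naive whitespace-based count, which is my simplification and not the algorithm Studio uses for its analysis. The helper names are hypothetical.

```python
import re

# Assumption: anything between "<" and ">" is a tag, as in the sample text
TAG_RE = re.compile(r"<[^<>]+>")


def naive_word_count(text):
    """Count whitespace-separated tokens (a simplification, not Studio's method)."""
    return len(text.split())


def strip_tags(text):
    """Replace every tag with a space so adjacent words stay separated."""
    return TAG_RE.sub(" ", text)


line = "<PART order=0><LABEL><STEXT>Operator Switch Light Curtain</STEXT></LABEL></PART>"

raw_count = naive_word_count(line)                 # tag fragments are counted as words
clean_count = naive_word_count(strip_tags(line))   # only the translatable text is counted

print(raw_count, clean_count)
```

In the raw line, fragments such as "<PART" inflate the count, while the stripped line contains only the four words that actually need translating.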

Translate the file in Studio and, once finished, export the translation into the original format. The last required step is converting the <br /> HTML tags back into EOL characters. In our case, we can simply delete the <br /> HTML tags. Open the translated file in PSPad, press CTRL+H, and fill in the fields as follows:

The Replace field has to be empty (not even a space)! Press OK and select the Replace All option. Save the file and we are done!
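The cleanup can be scripted as well. Here is a minimal Python sketch, assuming the line breaks themselves are still present in the exported file so that deleting the tags is enough. The function name strip_br is hypothetical, and the pattern also accepts the <br> and <br/> spellings in case the export normalizes the tag.

```python
import re

# Matches <br />, <br/> and <br>, in any letter case
BR_RE = re.compile(r"<br\s*/?>", re.IGNORECASE)


def strip_br(text):
    """Delete every <br /> tag and leave everything else untouched."""
    return BR_RE.sub("", text)
```

The replacement string is empty, just like the Replace field in PSPad.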

Important note: The position of the file type definition in the File Types list (Tools > Options > File Types) matters. Our definition is an HTML file type, and as such it is listed before the standard XHTML 1.1 and HTML file types. This means that our custom definition would be used for opening all HTML files in the future, and we do not want this. Therefore, after finishing work on a project with a file in a special format, we have to either delete the custom definition we created or clear the check box next to the file type.
