Converting TMX into bilingual Excel file without special tools

Not all translators have all of the complex software tools available. Imagine you have a TMX file and you need to extract its contents into a bilingual Excel file. This article describes how to perform this ?data digging? with Microsoft Word and Unicode text editor.

Pre-requisites:

  1. Unicode text editor, such as Notepad 2, PS Pad, or similar
  2. Microsoft Word
  3. Microsoft Excel

Procedure:

  1. Open the TMX file in Unicode editor.
  2. Select all content and copy the same into a new Word file.
  3. Select all text in Word file (Ctrl+A) and set it as hidden (Ctrl+D, select Hidden checkbox).
  4. Press Ctrl-H (Search and Replace) and click the More button.
  5. Select the Use wildcards checkbox.
  6. Enter ?\<seg\>*\</seg\>? (without double quotes) into Find input field.
  7. Click into the ?Replace with? field and leave it empty. Click Format button and then select the Font option. Uncheck the Hidden checkbox). Click Replace All button.
  8. Press Ctrl-H (Search and Replace).
  9. Uncheck the Use wildcards checkbox.
  10. Enter ?<seg>? (without double quotes) into Find input field.
  11. Click the ?Replace with? field and enter ?<seg>?. Click the More button, click Format button and then select the Font option. Check the Hidden checkbox). Click Replace All button.
  12. Press Ctrl-H (Search and Replace). Keep the Use wildcards checkbox unchecked.
  13. Enter ?</seg>? (without double quotes) into Find input field.
  14. Click the ?Replace with? field and enter ?</seg>^p?. Click the More button, click Format button and then select the Font option. Uncheck the Hidden checkbox). Click Replace All button.
  15. Enter ?</seg>? (without double quotes) into Find input field.
  16. Click the ?Replace with? field and enter ?</seg>?. Click the More button, click Format button and then select the Font option. Check the Hidden checkbox). Click Replace All button.
  17. Save the Word file.

Now we have a Word file, where all texts other than source and target are hidden.

  1. Copy all text in the Word file and paste into a new text document in Unicode text editor.
  2. Save a new text file.
  3. Open Microsoft Excel.
  4. Open the text file you have just created in Excel.
  5. Import the TXT following screen instructions. On pages 1 and 2 of the import wizard, click Next button. On the last page, select the Text radio button and click the Finish button.

Now we have an Excel file with our source and translation text. However, pairs are arranged into a single column. As the last step, we need to transpose every second row into a column. For this, we will use the following macro in Excel.

  1. Copy the below code.
  2. Press the keys ALT + F11 to open the Visual Basic Editor.
  3. Press the keys ALT + I to activate the Insert menu.
  4. Press M to insert a Standard Module.
  5. Paste the code by pressing the keys CTRL + V.
  6. Press the keys ALT + Q to exit the Editor, and return to Excel.
  7. To run the macro from Excel, press ALT + F8 to display the Run Macro Dialog. Double Click the macro’s name to Run it.

Important:

You have to change the sheet name in the code below to whatever your sheet name is. Example: If your sheet name is ?List1? (instead of default Sheet1), you need to replace With Sheets(“Sheet1”) with With Sheets(“List1”).

WE ARE DONE. Now we have a bilingual Excel file created from source TMX.

Here is the code:

Option Explicit
Sub ReorgData()
Dim a As Variant, b As Variant
Dim lr As Long, lc As Long
Dim i As Long, ii As Long, c As Long, nc As Long
With Sheets("Sheet1")
lr = .Cells.Find("*", , xlValues, xlWhole, xlByRows, xlPrevious, False).Row
lc = .Cells.Find("*", , xlValues, xlWhole, xlByColumns, xlPrevious, False).Column
a = .Range(.Cells(1, 1), .Cells(lr, lc))
ReDim b(1 To UBound(a, 1), 1 To UBound(a, 2) * 2)
For i = 1 To UBound(a, 1) Step 2
ii = ii + 1
nc = 1
For c = 1 To UBound(a, 2)
b(ii, nc) = a(i, c)
b(ii, nc + 1) = a(i + 1, c)
nc = nc + 2
Next c
Next i
.Range(.Cells(1, 1), .Cells(lr, lc)).ClearContents
.Cells(1, 1).Resize(UBound(b, 1), UBound(b, 2)) = b
.Columns.AutoFit
End With
End Sub

This entry was posted in Computer-aided Translation, Software-related. Bookmark the permalink.

12 Responses to Converting TMX into bilingual Excel file without special tools

  1. Larisa says:

    What can i do if the unicode file is too big and word can not copy it. and even when i copied it piece by piece, when i chenge it, i can not save changes. do you any solution for that?

    thank you!

  2. Khan says:

    I tried this one but it gives me a type mismatch error. Can you please let me know how to resolve this?
    Thanks
    Khanonline

    • admin says:

      Can you be more specific where you get that error?
      If you used the recommended software, it should work fine. At least, works well on my machine.

  3. caroline says:

    Hello,
    from point 11 on, I get 0 results… there must be a mistake on my side, but which one. Could you write a macro for Word? I was trying to, but it seems to be, I do a mistake.
    Issue with:
    Enter ?? (without double quotes) into Find input field.
    Click the ?Replace with? field and enter ??. Click the More button, click Format button and then select the Font option. Check the Hidden checkbox). Click Replace All button.

    thanks
    caroline

    • caroline says:

      I came a bit further, but the following error message appears in xls:
      Cells(1, 1).Resize(UBound(b, 1), UBound(b, 2)) = b has an issue

  4. caroline says:

    Hi,
    forget it! It works a dream!!! Here is the macro I wrote in Word for it, maybe it might be of any helf for anybody else:

    Sub tmxversxls()

    ‘ tmxversxls Macro


    Selection.Paste
    Selection.WholeStory
    With Selection.Font
    .Name = “”
    .Hidden = True
    End With
    Selection.Find.ClearFormatting
    Selection.Find.Font.Hidden = True
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Hidden = False
    With Selection.Find
    .Text = “\*\”
    .Replacement.Text = “”
    .Forward = True
    .Wrap = wdFindAsk
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = True
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.ClearFormatting
    Selection.Find.Font.Hidden = True
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Hidden = True
    With Selection.Find
    .Text = “”
    .Replacement.Text = “”
    .Forward = True
    .Wrap = wdFindAsk
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.ClearFormatting
    Selection.Find.Font.Hidden = True
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Hidden = True
    With Selection.Find
    .Text = “”
    .Replacement.Text = “”
    .Forward = True
    .Wrap = wdFindAsk
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    End With
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Hidden = True
    With Selection.Find
    .Text = “”
    .Replacement.Text = “”
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Hidden = False
    With Selection.Find
    .Text = “”
    .Replacement.Text = “^p”
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Hidden = True
    With Selection.Find
    .Text = “”
    .Replacement.Text = “”
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    End Sub

    I simply copied the text from the txt-file into word and launched my macro 🙂
    and followed your instructions again. I wonder why it didn’t work immediatly, though.
    Thank you Michal!

    • admin says:

      Hi Caroline, I am glad it worked for you after all. And thanks for the macro. Will give it a try later. Take care! Michal

  5. Lisa Pecunia says:

    Unable to get this to work, using MS Word v15 for Mac.

  6. Edith says:

    Thank you so much for this handy tool. I had to try a couple of times, but now it works. Fantastic!

  7. Cynthia Martel says:

    Good afternoon! I know that this post is a few years old… But I’m still hoping to get an answer!

    When I tried this, I got the following message when I tried to run the Macro in Excel:
    Runtime error ‘9’
    Subscript out of range.

    Any ideas on how I could fix that?

    Thanks in advance!

Leave a Reply

Your email address will not be published. Required fields are marked *