Not all translators have all of the complex software tools available. Imagine you have a TMX file and you need to extract its contents into a bilingual Excel file. This article describes how to perform this ?data digging? with Microsoft Word and Unicode text editor.
Pre-requisites:
- Unicode text editor, such as Notepad 2, PS Pad, or similar
- Microsoft Word
- Microsoft Excel
Procedure:
- Open the TMX file in Unicode editor.
- Select all content and copy the same into a new Word file.
- Select all text in Word file (Ctrl+A) and set it as hidden (Ctrl+D, select Hidden checkbox).
- Press Ctrl-H (Search and Replace) and click the More button.
- Select the Use wildcards checkbox.
- Enter ?\<seg\>*\</seg\>? (without double quotes) into Find input field.
- Click into the ?Replace with? field and leave it empty. Click Format button and then select the Font option. Uncheck the Hidden checkbox). Click Replace All button.
- Press Ctrl-H (Search and Replace).
- Uncheck the Use wildcards checkbox.
- Enter ?<seg>? (without double quotes) into Find input field.
- Click the ?Replace with? field and enter ?<seg>?. Click the More button, click Format button and then select the Font option. Check the Hidden checkbox). Click Replace All button.
- Press Ctrl-H (Search and Replace). Keep the Use wildcards checkbox unchecked.
- Enter ?</seg>? (without double quotes) into Find input field.
- Click the ?Replace with? field and enter ?</seg>^p?. Click the More button, click Format button and then select the Font option. Uncheck the Hidden checkbox). Click Replace All button.
- Enter ?</seg>? (without double quotes) into Find input field.
- Click the ?Replace with? field and enter ?</seg>?. Click the More button, click Format button and then select the Font option. Check the Hidden checkbox). Click Replace All button.
- Save the Word file.
Now we have a Word file, where all texts other than source and target are hidden.
- Copy all text in the Word file and paste into a new text document in Unicode text editor.
- Save a new text file.
- Open Microsoft Excel.
- Open the text file you have just created in Excel.
- Import the TXT following screen instructions. On pages 1 and 2 of the import wizard, click Next button. On the last page, select the Text radio button and click the Finish button.
Now we have an Excel file with our source and translation text. However, pairs are arranged into a single column. As the last step, we need to transpose every second row into a column. For this, we will use the following macro in Excel.
- Copy the below code.
- Press the keys ALT + F11 to open the Visual Basic Editor.
- Press the keys ALT + I to activate the Insert menu.
- Press M to insert a Standard Module.
- Paste the code by pressing the keys CTRL + V.
- Press the keys ALT + Q to exit the Editor, and return to Excel.
- To run the macro from Excel, press ALT + F8 to display the Run Macro Dialog. Double Click the macro’s name to Run it.
Important:
You have to change the sheet name in the code below to whatever your sheet name is. Example: If your sheet name is ?List1? (instead of default Sheet1), you need to replace With Sheets(“Sheet1”) with With Sheets(“List1”).
WE ARE DONE. Now we have a bilingual Excel file created from source TMX.
Here is the code:
Option Explicit
Sub ReorgData()
Dim a As Variant, b As Variant
Dim lr As Long, lc As Long
Dim i As Long, ii As Long, c As Long, nc As Long
With Sheets("Sheet1")
lr = .Cells.Find("*", , xlValues, xlWhole, xlByRows, xlPrevious, False).Row
lc = .Cells.Find("*", , xlValues, xlWhole, xlByColumns, xlPrevious, False).Column
a = .Range(.Cells(1, 1), .Cells(lr, lc))
ReDim b(1 To UBound(a, 1), 1 To UBound(a, 2) * 2)
For i = 1 To UBound(a, 1) Step 2
ii = ii + 1
nc = 1
For c = 1 To UBound(a, 2)
b(ii, nc) = a(i, c)
b(ii, nc + 1) = a(i + 1, c)
nc = nc + 2
Next c
Next i
.Range(.Cells(1, 1), .Cells(lr, lc)).ClearContents
.Cells(1, 1).Resize(UBound(b, 1), UBound(b, 2)) = b
.Columns.AutoFit
End With
End Sub
What can i do if the unicode file is too big and word can not copy it. and even when i copied it piece by piece, when i chenge it, i can not save changes. do you any solution for that?
thank you!
There are various open-source tools that can be used for splitting and merging Unicode files.
I tried this one but it gives me a type mismatch error. Can you please let me know how to resolve this?
Thanks
Khanonline
Can you be more specific where you get that error?
If you used the recommended software, it should work fine. At least, works well on my machine.
Hello,
from point 11 on, I get 0 results… there must be a mistake on my side, but which one. Could you write a macro for Word? I was trying to, but it seems to be, I do a mistake.
Issue with:
Enter ?? (without double quotes) into Find input field.
Click the ?Replace with? field and enter ??. Click the More button, click Format button and then select the Font option. Check the Hidden checkbox). Click Replace All button.
thanks
caroline
I came a bit further, but the following error message appears in xls:
Cells(1, 1).Resize(UBound(b, 1), UBound(b, 2)) = b has an issue
Hi,
forget it! It works a dream!!! Here is the macro I wrote in Word for it, maybe it might be of any helf for anybody else:
Sub tmxversxls()
‘
‘ tmxversxls Macro
‘
‘
Selection.Paste
Selection.WholeStory
With Selection.Font
.Name = “”
.Hidden = True
End With
Selection.Find.ClearFormatting
Selection.Find.Font.Hidden = True
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Hidden = False
With Selection.Find
.Text = “\*\”
.Replacement.Text = “”
.Forward = True
.Wrap = wdFindAsk
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = True
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Font.Hidden = True
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Hidden = True
With Selection.Find
.Text = “”
.Replacement.Text = “”
.Forward = True
.Wrap = wdFindAsk
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Font.Hidden = True
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Hidden = True
With Selection.Find
.Text = “”
.Replacement.Text = “”
.Forward = True
.Wrap = wdFindAsk
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Hidden = True
With Selection.Find
.Text = “”
.Replacement.Text = “”
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Hidden = False
With Selection.Find
.Text = “”
.Replacement.Text = “^p”
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Hidden = True
With Selection.Find
.Text = “”
.Replacement.Text = “”
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
End Sub
I simply copied the text from the txt-file into word and launched my macro 🙂
and followed your instructions again. I wonder why it didn’t work immediatly, though.
Thank you Michal!
Hi Caroline, I am glad it worked for you after all. And thanks for the macro. Will give it a try later. Take care! Michal
Unable to get this to work, using MS Word v15 for Mac.
You need to be more specific, if you want any response.
Thank you so much for this handy tool. I had to try a couple of times, but now it works. Fantastic!
Good afternoon! I know that this post is a few years old… But I’m still hoping to get an answer!
When I tried this, I got the following message when I tried to run the Macro in Excel:
Runtime error ‘9’
Subscript out of range.
Any ideas on how I could fix that?
Thanks in advance!