Sunday, November 18, 2007

PDF (and DTP) rant

Yes, you probably already guessed what the subject of this rant is.

The usual nightmare - the client sends you a text to translate, but it's in PDF! And, what's even worse, you need to deliver not only the translation, but they want full DTP service - meaning they want the finalized PDF of the translated material, ready for printing :(

Of course, the PDF is a low-resolution version, and the illustrations will have to be "upped" to 300DPI, as required by the printing service vendor.

So, you try to explain that it's not done that way, that you cannot simply "reconstruct" the translated PDF from the low-res PDF you received. You try to argue that it simply cannot be done - at least not at the usual rates and within the usual timeframe.


However, the end client isn't aware of the technical problems, and you need to "simplify" the explanation of the problem, without getting too technical.

(As an aside, I'm constantly amazed by the fact that so many people think that, in order to get a translation of a 200-page manual you just need to press a few keys on the keyboard and - presto! - the perfectly printed (and bound) 200-page full-color book comes out of your office printer!)

But, to get back to the point, I usually have several "colorful" analogies in reserve, in order to explain WHY it isn't so easy to do. One of my favorites goes something like this:

"Well, to put it simply, you're providing me with 400 pounds of pork sausages and expecting me to return a 400-pound pig that happily runs around the pigsty.... Can't be done, sorry...."

So, somehow you manage to make them understand that it's no easy feat to get the final PDF which is just barely acceptable for printing, you agree on the price and the deadline, and now the real work begins...

The problems with PDF originals are several - first you have to extract the text (preferably preserving at least some of the formatting), then you have to extract the illustrations (and increase the resolution, if necessary), making sure nothing has been left out.

Then you have to do the DTP from scratch, using any of the usual DTP programs (in our case it's usually InDesign).

However, none of the problems stated are simple.

First, you cannot easily extract text from PDF - at least not in a usable format, which would not require additional "manual" work. For this purpose I still use Acrobat 5 and its "Save as RTF" function. It exports all of the text, but all lines of text usually end with line breaks, which have to be removed, so that the text can be translated using Trados or whatever TM software you use.
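The line-break cleanup can be scripted once the exported text has been saved as plain text. What follows is only a minimal sketch, not the procedure we actually use: it assumes that a blank line marks a real paragraph break and that every other line break is an export artifact. The function name `unwrap_lines` is made up for illustration, and words hyphenated across lines are not repaired.

```python
import re


def unwrap_lines(text: str) -> str:
    """Join hard-wrapped lines into paragraphs.

    Assumption: blank lines separate real paragraphs; all other
    line breaks were added by the export and can be replaced
    with a single space.
    """
    # Split on one or more blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    unwrapped = []
    for paragraph in paragraphs:
        # Drop empty lines and surrounding whitespace, then rejoin.
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        unwrapped.append(" ".join(lines))
    return "\n\n".join(unwrapped)
```

The result still needs a read-through, since headings and list items that were never separated by a blank line will be merged into the neighbouring paragraph.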

Acrobat 6 and 7 do this better, and usually do not place line breaks at the end of every line of text, but they have other, more serious faults - one of the most serious being that they tend to arbitrarily skip portions of the text when exporting to RTF - which makes them unusable.

There are other non-Adobe solutions, one of them being ABBYY PDF Transformer. It is OCR-based, and actually works OK. However, it often does not preserve formatting (bold and italic text, etc.), so, again, it's not a perfect solution.

Most of the translations we do this way are manuals of one kind or another, mostly containing references to commands which should appear as bold text, like this:

To open the file, press Open, and browse to the file you want to open.

So, it's extremely important to have the bold formatting applied to appropriate strings...

In order to avoid any errors in the DTP process, and to make the DTP process as fast as possible, we usually tag our PDF-exported RTF files prior to translating - checking whether all the necessary formatting (usually only italic and bold) has been retained. It's manual work, and sometimes it takes a long time, since we also insert special tags as placeholders for pictures and symbols, so that the text which will be translated looks e.g. like this:

To open the file, press Open or use the <PIC> icon on the toolbar.

(Otherwise, the DTP crew would have to find the locations of all those tiny icons themselves, wasting significantly more time, not to mention the increased possibility of errors!)
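A cheap safety net for this kind of tagging is to check that every placeholder in the source segment survives in the translation. The sketch below is an illustration only: it assumes that tags are written in upper case between angle brackets, like <PIC>, and the function name `tag_mismatches` is hypothetical.

```python
import re
from collections import Counter

# Assumed tag shape: upper-case letters in angle brackets, e.g. <PIC>.
TAG_PATTERN = re.compile(r"<[A-Z]+>")


def tag_mismatches(source: str, target: str) -> dict:
    """Return {tag: (count_in_source, count_in_target)} for tags whose counts differ."""
    source_tags = Counter(TAG_PATTERN.findall(source))
    target_tags = Counter(TAG_PATTERN.findall(target))
    all_tags = sorted(source_tags.keys() | target_tags.keys())
    # A Counter returns 0 for a missing key, so dropped tags show up as (n, 0).
    return {
        tag: (source_tags[tag], target_tags[tag])
        for tag in all_tags
        if source_tags[tag] != target_tags[tag]
    }
```

An empty result means the counts agree; it says nothing about whether the tag ended up in the right place in the sentence.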

Then, the translation is done the usual way (Trados + Word), and the final, translated RTF is then processed using the VBA Word macro we developed in-house for export to InDesign tagged text. The tagged text has special formatting (red color) for our picture tags, and also some other stuff which makes the life of our DTP crew a lot easier when doing the DTP "from scratch".

However, the whole process is still extremely labor-intensive and time-consuming - and the end result is usually still just barely "acceptable" (since you obviously can't e.g. "upsize" a picture from 72DPI to 300DPI without any loss of quality...)
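To put a number on that loss, here is a small worked calculation. It is a sketch with made-up function names, and it assumes the picture must keep the same printed size: going from 72 DPI to 300 DPI means multiplying each edge by 300/72 (a bit over 4), so only about 6% of the pixels in the upsized picture come from the original and the rest are interpolated.

```python
def required_pixels(pixels: int, source_dpi: float, target_dpi: float) -> int:
    """Pixels needed along one edge to keep the same printed size at target_dpi."""
    return round(pixels * target_dpi / source_dpi)


def original_pixel_share(source_dpi: float, target_dpi: float) -> float:
    """Fraction of pixels in the upsized image that carry original data.

    The scale factor applies to both width and height, hence the square.
    """
    return (source_dpi / target_dpi) ** 2
```

For example, a picture 720 pixels wide at 72 DPI prints 10 inches wide; at 300 DPI the same 10 inches need 3000 pixels.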

I don't have to mention that for this type of work we charge a lot more than when we work from "normal" DTP files.

But, even when we get the original InDesign (or any other DTP) files with all the links and fonts, the situation isn't always as clear as it might seem - often the original INDD files were prepared by... hmmm, let's say "less than professional" people, and are best left alone. In such cases we just use the original links (pictures), and create new INDD files "from scratch" - which is usually a lot faster than trying to use the originals we receive.

By "less than professional" I mean the situation where you have e.g. 12 InDesign paragraph styles defined, and when ALL of those styles applied in the document use manual overrides :(

Or the situation where e.g. indented text in INDD file is "created" using tabs, or - even worse - spaces... :(

You get the picture....


Saturday, November 17, 2007

Linux (and tools)

Well, this one definitely isn't a rant :)

I'm constantly amazed how much Linux has improved in the last few years.

Right now I'm using SuSE 10.1, and more often than not I find that most of the peripherals I sometimes use (like a cheap digital camera, etc.) work better in Linux than in Windows :)


One of the best Linux programs I've been using for years is Hylafax - a powerful office faxing package, which I can heartily recommend to anyone who needs a robust office faxing solution. And, what's best, there's nothing like it in the Windows world - well, nothing quite as robust, not to mention free.

The only thing I'm still missing is a simple replacement for the FileMaker database. I keep my books in FileMaker - I developed my own accounting system in it, and have been using (and improving) it for years.

So far, I haven't been able to find something as simple and as compact as FileMaker. I don't need a full-blown MySQL or PostgreSQL installation, with dozens of directories, etc. Something simple with an SQLite backend would be quite OK, but the problem is that I also need simple and easy (visual) forms and reports creation... :(

I often need to change the reports I use, due to ever-changing legal requirements, and in FileMaker I can do it in a couple of minutes. No such luck with anything I've tried under Linux so far. However, I'm confident that I won't have to wait much longer, given that there are several promising Linux projects of that kind being actively developed...

Translation tools

As regards translation tools, there are many - but almost none that can be used "out-of-the-box" for any kind of "general" translation work.

By that, I mean there aren't many universal Linux translation tools - most are intended only for use with .po files - i.e. for localization of Linux software.

One Open Source exception is OmegaT - which I intend to test-drive soon.

There are other tools, like Heartsome XLIFF Editor - which isn't free, let alone Open Source, but still (somewhat) usable...

I've tried Heartsome XLIFF Editor (HXE), but wasn't exactly overwhelmed with it. I guess I'm spoiled by years of working with Trados, but HXE isn't exactly intuitive to use, and its Help is also very sparse...

Another thing is that it handles tags rather strangely. I tried to do a translation of an XLF file with it, but it didn't come out very well. Perhaps I did not configure it properly, but even after preparing a good TMX and importing old translations, the matches were rather strange. Combined with some segmentation problems, which I couldn't resolve quickly, I decided I didn't have the time to fiddle with it any more, and gave it up for that particular task. However, for simpler (non-tagged) texts, it should do OK, I guess.

I just wish there was a TM system under Linux that could be plugged into Emacs ;)


Wednesday, November 14, 2007

Working with Passolo

Recently I had a project I had to do in Passolo. I was given instructions to download version 5 Translation Editor (the stripped-down "satellite version", intended for translators), and to use that for translation of a software project.

I'd never used Passolo before, so it was a bit of a challenge at first - trying to get to know the program and understand how it works.
At first I was glad to see that it has some kind of Trados integration, but soon realized that it doesn't really work in "Translation Editor" - at least I couldn't get it to work.


The help file does not really help, since obviously the same version of help file is used both in "full" and "translation editor" versions, so most of the functions described in the Help file don't actually work in the "satellite" Translation edition.

After some e-mails with my PM on this project, it was determined that Trados integration doesn't actually work. :(
I re-checked several times, and even tried to change the Trados integration macro, but to no avail - the macro cannot be edited in the "satellite" edition of Passolo.
BTW, I realized what kind of hack it was when I saw that the reference to Trados Workbench file was hard-coded in the macro! :o

Another thing I couldn't exactly get the hang of was its fuzzy matching. I've properly populated the glossaries I was to use (a kind of workaround in order to be able to use existing translations, at least for terminology and shorter segments/sentences), but the fuzzy matches were rather surprising. Sometimes I'd get the expected match, and sometimes not. Strange.
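I don't know how Passolo scores its matches, so the following is not its algorithm. It is just a sketch of how threshold-based fuzzy lookup behaves in general, using Python's standard `difflib` module; the function name `glossary_matches`, the default cutoff and the glossary entries in the test are all made up. It shows how a small change in the cutoff makes an "obvious" match appear or vanish.

```python
import difflib


def glossary_matches(segment: str, glossary: dict, cutoff: float = 0.75, limit: int = 3) -> list:
    """Return up to `limit` glossary source terms similar to `segment`.

    Terms scoring below `cutoff` (0.0 to 1.0) are dropped, best match first.
    """
    return difflib.get_close_matches(segment, list(glossary), n=limit, cutoff=cutoff)
```

With a glossary containing "Open file", the segment "Open files" is found at a cutoff of 0.9 but not at 1.0, since the two strings are similar but not identical.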

Not to mention that once your glossary exceeds about 2,000 lines/terms, it tends to get rather slow... I had several glossaries, ranging from about 1,000 terms up to 10,000 terms. I had to "switch on" (enable) the larger ones only when absolutely necessary, as they slowed the program almost to a crawl - I'd wait 10 seconds or more for the program to populate the window with matches for each new segment...

In short, not an easy job. What bothered me the most is the fact that the Help file that comes with this "satellite" edition actually refers to the full version, so you have to find out yourself what actually works in this "bare-bones" version :(

It's not all bad, though. The "live dialog" preview works fine, and is helpful when you need to shorten the translated string to make it fit on the dialog.

However, you should not rely completely on Passolo and its fuzzy-matching function to ensure consistency...
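One way to check consistency outside the tool is to export the source/target pairs and look for identical source strings that ended up with different translations. This is a minimal sketch under my own assumptions: the pairs are already available as a list of (source, target) tuples, the export step is not shown, and `inconsistent_translations` is a made-up name.

```python
from collections import defaultdict


def inconsistent_translations(pairs) -> dict:
    """Map each source string to its translations when there is more than one.

    `pairs` is an iterable of (source, target) tuples. Leading and trailing
    whitespace is ignored; the comparison is otherwise exact.
    """
    seen = defaultdict(set)
    for source, target in pairs:
        seen[source.strip()].add(target.strip())
    # Keep only sources that were translated in more than one way.
    return {
        source: sorted(targets)
        for source, targets in seen.items()
        if len(targets) > 1
    }
```

Some hits will be legitimate (the same English string can need different translations in different contexts), so the output is a list to review, not a list of errors.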


Ergonomic considerations

Sometimes I have to work with another piece of proprietary software, which the client uses to translate the software strings for mobile phones - i.e. the texts which appear as phone screen messages, dialogs, etc.

This one is particularly annoying - it's done in Access, and the developers, IMHO, should be shot, as a deterrent to other such "developers" with complete disregard for ergonomics. Perhaps an even worse punishment would be to force them to use their own creation for a couple of hours....


The program is an ergonomic nightmare - for each string you translate, you need to click on several dissociated buttons on the screen. Even the most elementary rule of software UI is broken - the tab order is totally haphazard!

Not only that - but those buttons you need to click all the time do not have any keyboard equivalents!

Since I've been using that particular POS for years, I naturally devised some workarounds.

When I started to work with the program, I developed an immense and strong hatred for whoever programmed it... My right wrist started aching in about half an hour... despite the fact that I always use a wrist rest for my "mouse hand". So, something had to be done.

My first solution was to just extract the strings from the underlying Access file, export them to Word, and use Trados to translate.

However, populating the Access file back with the translated strings wasn't always easy, and, besides, I couldn't see whether my strings were too long while translating in Word - which meant a lot of rewriting once I loaded the translations back into the original program and discovered that many of them were too long.

Not good.
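In hindsight, part of that rewriting could have been caught before loading the strings back, with a simple length check. The sketch below assumes that each string has an ID and a known maximum length in characters; both are hypothetical here, and in reality what fits on a phone screen depends on the font and character widths, so a character count is only a rough proxy.

```python
def too_long(translations: dict, limits: dict) -> list:
    """Return (string_id, actual_length, allowed_length) for over-long strings.

    `translations` maps string IDs to translated text; `limits` maps
    string IDs to the maximum allowed number of characters. Strings
    without a known limit are skipped.
    """
    return [
        (string_id, len(text), limits[string_id])
        for string_id, text in translations.items()
        if string_id in limits and len(text) > limits[string_id]
    ]
```

Run against the exported Word/RTF translations, this gives a list of strings to shorten while the context is still fresh.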

I realized I'd actually have to use the program itself. So, I thought about it, and decided to add my own keyboard shortcuts.

The program comes in executable form (.EXE file), but thanks to a very good program - Resource Hacker - I was able to modify it and add my own keyboard shortcuts (accelerators).

"Resource Hacker" is free to use, so that's another plus!

BTW, after several years (and several versions) of using this particular localization tool, which is the subject of this post, the latest version (which I got about a month ago), finally has keyboard accelerators! Of course, not ALL of them :(

It now has about 3 out of at least 5-6 necessary accelerators.

Sheesh!


Thursday, November 8, 2007

Logoport from Linux

So, you also work for Lionbridge, and have to use Logoport more and more.


Wouldn't it be nice if you could use Word + Logoport directly from Linux, without having to boot a VMware Windows virtual machine?
Well, I've managed to do it, and it works.


I've used Crossover Office to install Word from the Office XP package, which works flawlessly. I've also added the "Logoport.dot" file in the templates directory, and also added the "Logoport.dll" to "system32" directory in the "fake Windows" in cxoffice.


Once you boot Word under Crossover, activate the Logoport.dot, and connect to Logoport using your account details.

Logoport will most likely tell you that you need to update Logoport; just click "OK", and the new DLL will be installed.
After that, close Word and restart it, as per the instructions, and then you can work without any problems in Word, connecting to Logoport and translating as usual.




Saturday, November 3, 2007

Digging our own graves

Just yesterday I had a coffee with an old friend and co-sufferer, and sometimes sub-contractor when I have more on my plate than I can swallow... which isn't that rare :)


We were mostly whining about how it's getting harder and harder to survive, and his situation seemed a bit worse than mine, since his regular clients all seem to be trying to swindle him all the time... using different word counts, Trados match percentages, etc.
At least with most of my regular clients I don't usually have such problems.


We ended up hating mostly everything about our jobs - for similar reasons, like increasing stress, impossible deadlines, clients/agencies trying to squeeze more and more profit from us using dubious (and sometimes quite obvious) schemes, etc.




I remarked that quite a long time ago (about 10 years ago, when Trados became a de facto standard, not only among agencies, but also when the end-clients realized the potential for savings) I had said that we were actually digging our own graves with this kind of work.


"How's that?" you might ask...


Well, if you think Trados (or another TM tool) enables you to work/translate more efficiently, and that it makes your life so much easier, I will probably concur, but there's another side to the story (isn't there always?).


Namely, Trados and translation memories in general were not invented to make life easier for translators, but to enable translation/localization agencies (and their end clients) to make more money! (Well, actually to make them spend less on translation, with the same end result.)


It was obvious from the start that there would be unscrupulous agencies/clients who would insist only on getting the most out of it (Trados TMs), by using whoever was available (or the cheapest) at the moment, filling their TMs with whatever they could lay their hands on, and equating quantity with quality. Well, not exactly equating, but simply taking it for granted that the more segments a TM has, the more money they could squeeze out of it. The practice soon became widespread, even among those who should do (and know) better.


Of course, when the main goal is to have one huge TM, you don't pay much attention to what actually gets into it. Thus we come to the main point - quality.


As freelancers, we try to get paid what we think is a fair price for our work. If you work in a common language pair (let's say German to English), your rates will probably be quite a bit lower than if you're translating into some obscure language, in which you don't have much competition...
Your best bet then, if you want to survive among fierce competition, is to be distinguished from thousands of other freelancers, who are often cheaper, by your quality. So, you either specialize in a very obscure field, or you gain recognition as the best in a certain field/area of work.


So, say you've achieved a nice niche for yourself, and you're charging comfortable rates, you have clients who care about quality (and that's why they hire you, and pay you higher rates), and for a few years everything seems to be working out OK.


But, after a while, the agency you work for starts switching tools and procedures, introducing more and more (unpaid) work for you, suddenly you find yourself doing more and more (unpaid) PM work for them, there are more and more reports to write, and in the end you realize that the time you need to translate those 400 words seems to take longer and longer - and you're still being paid XX $/EUR per word.


You also realize that the source texts you are translating are getting worse and worse, and you start asking questions. It turns out that the end client (a large telecom company or whatever) is gradually switching some of their operations to the cheaper parts of the world. Well, looks like the globalization is beginning to affect you, too. Suddenly, the manuals you translate from English are not written by competent Western European technical writers with excellent command of English, but by underpaid (probably part-time) staff on the other side of the world. Suddenly, those 400 words of English become 400 words of Chinglish (or Engrish, whichever you prefer), and it takes you at least three times longer to penetrate the strange language being used - and you need to develop mind-reading skills, too... in order to realize what the writer wanted to say, but didn't know how to say in English. So, it now takes you three times longer to translate those 400 words. And you're still being paid the same per-word rate...


However, realizing that globalization has caught up with you, you keep your mouth shut and continue working for the same rates, knowing that there are hordes of cheaper translators at your feet, and knowing that, if that "Chingrish" manual is being delivered in English-speaking parts of the world as something "acceptable" (as long as it's cheap), it won't take long before the translation agency (or the end client) realizes that they could save on translation, too.


But, one day you realize that the TM you got for the next job has other people's translations in it - the stuff that was not translated by you! And those come up as 95-100% matches! You can't believe it - and the quality of translation is simply awful! Those segments don't even use the standard terminology, which has been used for that end-client for years!


So, you get in contact with the PM and explain the situation, and say that those "translations" are simply unacceptable, and have to be completely retranslated...


"Sorry, old boy, that's what we've got from the end client, and we're not being paid to recheck the 100% matches, either! Leave those as they are! If you want to change those, you do so without being paid!"


What?


You can't believe your ears! You're being forced either to accept those half-literate segments as your own (as 100% matches), or you'll have to retranslate them for free!


So, naturally, you try to tell them about quality, about language nuances, you point out the worst blunders, and generally try to tell them that such "translations" are rubbish, and will reflect very poorly on the end client and their product - but to no avail.


It's a dog-eat-dog world out there, and the end client has to get the (localized) product on the market before the others do. That's of paramount importance, and "slightly lower" quality of translation is not as important as THAT.


BINGO! Now we're getting to the point...


So, I'm actually FORCED by the agency/end client to reduce the quality of my translation by playing along.


You can see for yourself where this is going - once you accept those rules, you lose your only selling point - your quality! After all, there are many cheaper translators who will do such a (poor) job at a fraction of your cost!


So, what to do about it?


With clients like those, I make it quite clear that I don't use other people's translations and/or TMs. The only translation segments and/or matches I can work with are my own. At least that way I have some control over the quality of the end product (translation).


I made it quite clear that, the next time I find anyone else's translation in the TMs I am forced to use, that will be the end of our (long-term) cooperation.


How do you deal with such issues?


Thursday, November 1, 2007

Never-ending story

It ain't easy. No way.

After almost 15 years as a freelance translator, I thought I was quite prepared for whatever my clients might throw at me.

How wrong I was!

I use Linux - at least my main machine boots into Linux. Like the vast majority of my co-sufferers, I have to use Trados, and that means Windows, too. However, I was able to circumvent this problem and prevent the Redmond Rogue from overtaking my machine by using virtualization.

I've managed to contain Windows by using vmware :)

So, everything that requires Windows runs under Linux, inside this virtual machine, which is NOT allowed to access the Internet. The end result is that the "windowed" Windows XP runs faster in vmware than natively - since I don't have to burden the OS with firewalls, antiviruses, etc., which all slow Windows down, sometimes to a crawl...

Getting it all to work seamlessly wasn't trivial, but now I've got a working system with the best of both worlds - and almost no worries about viruses, trojans, backdoors, etc...

Anyway, I had been using this setup quite successfully for years when my main client decided to switch to a proprietary translation system, which is (ouch!) Internet-based - meaning I have to be online all the time while translating (in Windows)... :(

Not only Internet-based, but some of its features (e.g. invoicing) are accessible only through MS IE :(

Time for more tinkering...

The problem of Web-based translation work was solved by setting up another Windows virtual machine, which is being used only for Web translations (I have to run a firewall in that one, though), and the IE invoicing problem was solved by running IE directly from Linux (installed through Wine).

However, some big changes are on the horizon, and for us translators they mean even less control over the work we do - like doing away with local TMs (on translators' hard disks).

Recently I've even read about "crowdsourcing" in translation context!
Apparently it's a new buzzword among the big players in the l10n field - and a way to squeeze even more profits from the unsuspecting and would-be translators - or should I say "translators"?

It started with Google, I guess - see here or here.

The concept comes from the Open Source community - basically you have a vast community of volunteers spread all over the world, and you harness that potential to get something usable - in this case profitable :)

More info and some insightful comments here.

Now, this is something that really makes my stomach turn...

But, I guess those big boys looking to make a quick buck out of it will soon learn the meaning of the phrase "There's no such thing as a free lunch".

What's this all about?

rant, n.
[f. the vb.]
1. A high-flown, extravagant, or bombastic speech or utterance; a piece of turgid declamation; a tirade.
2. Extravagant or bombastic language or sentiments; magniloquent and empty declamation.

Well, the first two OED definitions should suffice. I'm a translator (and localizer, and more often than not a DTP-monkey [no disrespect intended to the DTP profession!]), and for those who are in the same boat as me, the need to let off some steam from time to time should be something quite familiar.

So, rants it is!