Cleaning up a crappy OCR job for translation

It’s a sad fact in the professional work of translators that a lack of understanding of how to deal effectively with various PDF formats causes enormous loss of productivity and yields results which are not really fit for purpose. The aggressive insistence of many colleagues possessed of a dangerous Halbwissen (half-knowledge) on using half-baked methods and inappropriate tools contributes to the problem, but, bowing to the wisdom about arguing with fools, I now mostly sit back with a bemused and amused smile and watch the tribulations of those who believe in salvation by PDF import filters and cheap or free OCR. “TANSTAAFL” is as true as it ever was.

Just before the weekend I got an inquiry from an agency client I rather like. Nice people, good attitude, but sometimes struggling to find their way with technology despite some in-country “expert” training. This inquiry looked a bit like ripe fish at first glance. The smell got stronger after I was told that, because the corporate end client had converted the PDF of their annual report and begun to edit the mess in the OCR file (and comment on it heavily too), this would be all there was to work with. It was a thoroughly appetizing sight when imported into a translation environment…

Read more | translationtribulations.com

Posted on June 9, 2014 in Field of translation
