In Focus: Localization as Engaging Translation into Actual Reality

Estimated reading time: 33 minutes

·         Dear Project Manager

·         Masters of Craft and Bachelors of Arts

·         Workflow Description

·         Ingredient X: Why Work with Us

Dear Project Manager

Heat is warm, fear is cold; what the two share is that both behave like a gas: once emitted, they travel until they fill the entire space of any closed volume. Depending on the volume, the concentration of heat or fear molecules may differ, but, given time, the whole space will still be filled.

Stage fright is a special type of fear; it will conquer your whole inner self, bringing you to the point of incapacitation, unless you master a trick or two to control it and even use this energy to boost your performance. I was once told a story by a person who attended a marketing presentation of a new generation of some equipment. ‘They both looked excellent, the presenter and his interpreter, impeccable attire, ‘wet look’, and all that, but they were completely scared: one scared to present, the other to interpret in front of a large audience; so for almost two hours they were kind of sight-locked on each other, looking and speaking only to each other. The experience was completely useless for me, until the coffee break began with those generous treats, but it was very captivating in its psychological aspect. I know you kind of like examining things from this perspective too.’

Oh, yes, stage fright is something you have to learn to control. Everyone has their own tricks, and I am not some kind of self-important success coach; I share personal stories to make my point. No matter what you do, the relevant techniques have to achieve two objectives. Firstly, unchain your body, so that the unwanted excess nervous energy is effectively vented; without setting your body free, nothing will happen. Your hands should find where to be and where to go naturally. Doing some acting helps, but not too much - or you may steal the show from your presenting client, and this you do not want. Secondly, personalize the communication, as speaking to hundreds of people at the same time is like speaking to the world in general; addressing the abstract while still keeping mental focus is too much for the mind to bear. I usually find several people in the audience and shift eye contact between them, as if I were talking just to them; I watch their reactions to what I am saying to continually assure myself that the message is understood and engaging.

This article is not addressed to ‘the whole world’ either. It has a very specific, though hypothetical, addressee: Dear Project Manager.

We are project managers ourselves, and in our marketing efforts we basically cater to PMs as ‘birds of a feather’. Project managers hold no executive position in an organization, but they are definitely any corporation’s bulwark; their work is hard and complicated.

A PM has a project to implement and a job to accomplish within a certain budget and timeframe, conforming to the expected quality level, given all possible in-process uncertainties and continual new inputs - and if something can go wrong, it probably will. ‘No plan survives the first contact with reality’, says one of Murphy’s Laws.

A PM has a boss, the boss’s boss or a client, or the boss’s boss’s client; you have no way of controlling them. Nor, most of the time, can they change the budget, deadlines or other requirements. In some localization project chains there can be three or four subcontracted businesses before the assignment comes to a specific PM. Failure to perform a major project important for your company may cost you your job. And then you have vendors, freelancers or not, to implement the project you administer, and you are responsible to the corporate management for their performance on time, on budget and to the required quality. Really a tough situation, and it is recurrent.

Thus, what does Dear Project Manager want?

Effectively implement the project and be safe. So very natural. If you have worked with someone before and were happy with them, you are safe to award a contract to that person again, if they are available for the job. If they are not available, or you need a vendor for a new specific project, you need to find, assess and make a decision about a new service supplier. Initial interviewing and even a recommendation from people you trust may not work, as something done for them may not necessarily suit you. Some would like to look deep into the detail of the potential vendor’s workflow organization to make sure initial contact with the service supplier is really worth the effort.

The whole point of this article is to give a chance to project managers who may potentially be considering us as their prospective service providers to see whether, in their take, we:

·         know what we are talking about;

·         have relevant experience and equipment;

·         have adequate workflow processes and, very importantly, QA/QC in place, and

·         are flexible enough in terms of customer orientation to clients and PMs, predictable and reliable,

to make sure that we can help them be effective, feel safe and decide whether to proceed with us.

So, Dear Project Manager, with your kind permission, allow me to try to convince you of the above.

Masters of Craft and Bachelors of Arts

There is much discussion regarding whether translation is a craft or an art. You definitely need to be quite ingenious to be able to translate texts into another language encountering all those challenges which every translator is well aware of. One of the principles is that you can take something out from the original text but should not add anything, for instance, in order to embellish the original. So, regular translation should be pretty exact and direct, though at times a great deal of creativity should be employed to achieve that.

We believe that AVT, Audio-Visual Translation, is an art in and of itself.

It was not until 2020 that we became seriously involved in AVT. As usual, the move towards AV was driven by operational necessity and the need to better help clients.

In the past we, as simultaneous interpreters, were confronted with a very specific challenge: videos, used in the course of interpreted lectures and workshops as part of the training material that clients brought with them. Unlike PowerPoint presentations or handout texts, those never came localized. Here you are: the narrator reads a perfectly ‘engineered’ written text with all those words like cumbersome, insurmountable, authoritative, impregnable; protagonists speak in local accents and professional jargon; as an interpreter you cannot really compete. (Once we got involved in audio/video editing and sound design, we learned even more about how the narrator’s speech is processed in the post-production phase.) Even if you are a very fast simultaneous interpreter able to break the sound barrier and go straight to Mach 2 in the rendition to chase the original rhythm and pace, it is no good, as the human mind cannot comprehend and retain information effectively once a certain words-per-second rate has been reached.

We always care about the quality of our work. This also means we have to always care about our audience, being prepared to, at times, walk an extra mile or two.

Thus, we decided to transcribe and translate videos for free and then voice over them in a primitive manner right from the interpreter booth, otherwise the whole value of training videos would be lost. Why so? We remember: project managers have limited budgets and operate within a very narrow and strict framework of conditions where substantiating certain decisions takes a lot. And we want project managers to be happy and safe with us and come to us again.

Those ‘in the picture’ know what it will cost to localize a 10-minute video retaining a major localization company. Most probably a large localization company will subcontract a smaller one, and then the latter will deal individually with translators, voice actors, subtitlers, editors, proofreaders, sound studio, post-production mastering specialists, office space rent, administrative personnel, taxes and so forth. And it will take time. If you want to expedite delivery, it will take even more effort and money. And then some modifications may be required on the fly, and this again will cost even more time and money.

Thus, even if your material does not fall into the ‘broadcastable’ category, you should not be too surprised that 30 seconds of professional voice over in the final product will cost $100-150. When you have seven to ten training videos to localize, the ‘check’ for the whole three- to four-day in-country workshop, along with other essential costs, may well equal the price of a decent used car. There are some organizations which can easily afford that, but most will not be able to justify such an expense in terms of cost versus tangible benefit.

Then once we were asked to do a primitive voice over of a two-hour lecture. We used a regular headset and Zoom to do it. Well, it was functional and ‘disposable’. However, this very experience kick-started our move to AV.

Many people need localized content - localized in a good quality manner, both in terms of translation and ‘the way it sounds’, not super-professional recording studio-grade product, but very well done and very affordable. Well, when you pay ten times less for a good quality product it must definitely be affordable.

So, we set our sails for the voyage in the AV waters. Here we go into the Wild Blue Yonder, as the song tells us.

In this journey we found that almost everything in AVT is different from the conventional translator’s world. It starts from smaller things (e.g., for proper sound in Remote Simultaneous Interpreting a USB-connected microphone is a must, whereas in AV it should be a condenser-type XLR mic) and goes on to the really basic and ‘programmatic’ (you shall not use direct word-for-word translation when generating text for Subtitling or Voice-Over).

There is always a first time for revelations.

Here is why AVT is an art. In Audio-Visual projects, directly translated text (and we always first translate the transcribed text word for word) has to be reworked, adjusted and properly synchronized separately for the AV Text Component (Subtitling) and for AV Voice (Narration, Voice-Over, Dub, etc.), and the preparation is done differently for each component. Only the synchronization standard is shared: as a rule, no more than one second of desynchronization is allowed.

The challenge is the length of the translated text: it should not run beyond the corresponding elements of the master audio track. However, when English texts are translated into East Slavic languages, the translation is 1.3-1.4 times longer than the original (in some other languages, e.g. Armenian, it can be 2.0 times longer). So it has to be creatively shortened for voice over while still preserving the rhythm, clarity and culturally engaging capacity and, most importantly, still perfectly conveying the initial message in the target cultural reality.
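
To put rough numbers on this, here is a minimal sketch in Python. It assumes, purely for illustration, that voiced duration grows in proportion to text length at an unchanged speaking pace; the function name `share_to_trim` is hypothetical and not part of any tool we use.

```python
def share_to_trim(expansion: float) -> float:
    """Share of the translated text to cut so that it is back at source length.

    Assumption (ours, for illustration): voiced duration is proportional
    to text length, so fitting the timeline means matching the source length.
    """
    if expansion < 1.0:
        return 0.0  # the translation is already shorter than the original
    return 1.0 - 1.0 / expansion


# Expansion factors quoted in the paragraph above
for factor in (1.3, 1.4, 2.0):
    print(f"x{factor}: trim about {share_to_trim(factor):.0%} of the translation")
```

Under this assumption, a translation 1.3-1.4 times longer than the original has to lose roughly a quarter of its length, and one that is twice as long has to lose half.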

Then you have to comply with the Client’s Style Sheet explaining what is expected to be in the final product.

An additional challenge is subtitling, as the requirements for good quality subtitling are something really extraordinary from the ‘regular translation’ point of view. Subtitling requirements may differ depending on the client and systems; however, there are certain principles (guidelines) that are universal:

·         subtitles should be ‘invisible’, so that you read and understand them without effort, as if they did not exist. This links to the characters-per-second rate and the number of characters per line (there are multiple standards);

·         text is broken into subtitle lines logically. If two lines are used, they should be ‘stacked’ in the form of a pyramid;

·         subtitles appear when the person on the screen begins to speak;

·         there is a limit of two lines at most, and of a certain number of characters in one subtitle, including punctuation (we use 32 by default);

·         subtitles stay on the screen for a minimum of two and a maximum of seven seconds;

·         the ‘distance’ between subtitles should be at least four video frames;

·         subtitles should occupy the ‘subtitle-safe’ portion of the screen so as not to obstruct something important;

·         and more.
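
Several of the guidelines above are mechanical enough to be checked automatically. The following is a minimal Python sketch, not the way Subtitle Edit or any other tool implements it. All names (`Subtitle`, `check_subtitle`) are hypothetical, and three things are assumptions made for the example: the 32-character default is applied per line, the frame rate is 25 fps, and the ‘pyramid’ is read as ‘the top line is not longer than the bottom line’.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_LINES = 2
MAX_CHARS_PER_LINE = 32   # assumption: the 32-character default applies per line
MIN_DURATION_S = 2.0
MAX_DURATION_S = 7.0
MIN_GAP_FRAMES = 4


@dataclass
class Subtitle:
    start: float       # seconds from the start of the video
    end: float         # seconds from the start of the video
    lines: List[str]   # one string per displayed line


def check_subtitle(sub: Subtitle,
                   next_start: Optional[float] = None,
                   fps: float = 25.0) -> List[str]:
    """Return a list of guideline violations (an empty list means none found)."""
    problems = []
    if len(sub.lines) > MAX_LINES:
        problems.append("more than two lines")
    for line in sub.lines:
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(f"line longer than {MAX_CHARS_PER_LINE} characters")
    duration = sub.end - sub.start
    if duration < MIN_DURATION_S:
        problems.append("on screen for less than two seconds")
    elif duration > MAX_DURATION_S:
        problems.append("on screen for more than seven seconds")
    # Assumed reading of the 'pyramid': the top line is the shorter one
    if len(sub.lines) == 2 and len(sub.lines[0]) > len(sub.lines[1]):
        problems.append("two lines are not stacked as a pyramid")
    if next_start is not None:
        gap_frames = (next_start - sub.end) * fps
        if gap_frames < MIN_GAP_FRAMES - 1e-9:  # small tolerance for float rounding
            problems.append("gap to the next subtitle is under four frames")
    return problems


good = Subtitle(1.0, 3.5, ["Subtitles should be", "easy to read at a glance"])
bad = Subtitle(0.0, 1.0, ["This line is much too long to fit the limit", "short"])
print(check_subtitle(good, next_start=4.0))
print(check_subtitle(bad, next_start=1.04))
```

Rules that depend on judgment or on the picture itself, such as logical line breaks, ‘invisibility’ and the subtitle-safe area, are deliberately left out; they still need a human eye.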

The above means that the AVT translator’s job becomes extremely difficult - though equally thrilling and interesting, of course.

What should you do with the text to render it engagingly subtitleable? Some kind of artistic wizardry, perhaps. The result will not be translation as such. Surely, it is not magic, but it is an art, as you basically need to ingeniously re-create the audio-visual material, creating a new version of it which is now different, though remaining the same.

And now here is an outline of the processes involved in our AV project workflow, along with a short presentation of the hardware and software we use to implement such projects.

Workflow Description

There are a number of individual elements of an AV project which can be performed as a stand-alone service, as a combination of elements, or as a complete set if a Client just provides a video to be localized, without a script. These elements are:

·         Transcription;

·         Translation or transcreation of the transcribed text, back translation if necessary;

·         Checking that the translated text fits for subtitling and/or voice over, and adjusting it after such an initial fit check against the video;

·         Creation and timecoding of subtitles;

·         Voice over recording of an audio track;

·         Audio edit;

·         Video edit and synchronization;

·         Pre- and post-production.

Thus, the full scope AV project includes AVT elements (transcription, translation/transcreation, text fit-to-video or subtitles, initial check); production component (voice recording and subtitle creation, timecoding) and editing tasks (audio and video edit).

The following two focal points of a programmatic nature apply to all AVT activities.

1.       It is important to mention upfront that, just like with written translation assignments, all our AVT tasks are subject to a mandatory QA/QC check by a second linguist if the situation allows. That is, insofar as practically achievable, we request permission from the Client to have two of our team members sign an NDA (Non-Disclosure Agreement), so that we can provide such a quality check by a second person. No matter how good one is at translation, the human mind tends to overlook a certain number of proofreading and editing issues. Thus our normal procedure is as follows: the day after you have completed a translation job and done the final proofreading of your text, before turning it in to the client, you wake up early in the morning and read through the text once more with a fresh eye. No matter how thoroughly you did it the day before, you still find a few little bugs or logical inconsistencies. Proofreading by a second linguist is even more efficient.

With AVT, the price of a mistake in translation or an omission in transcribing may be quite high in workflow terms; should this happen, you may easily end up re-recording the whole audio track or quite a big portion of it.

We will illustrate this point with a real life example before we continue the discussion of individual elements of the work process.

While checking what other people do in AV localization, I came across a nice word describing the creation of a new localized variant of an AV material: re-version. It sounded intuitively good to convey the meaning: you do, so you can re-do; you negotiate so you can re-negotiate; you create a version and so you can re-version, sounds natural. I decided to use this word when writing a script for our Interpreters with Equipment video.

The person doing the QA/QC raised a red flag and checked, only to find that ‘reversion’ means: 1a: an act or the process of returning (as to a former condition); 1b: a return toward an ancestral type or condition, the reappearance of an ancestral character; 2: a product of reversion, specifically an organism with an atavistic character. Other words for reversion: retrogradation, reversing, inversion, rotation, reaction, reverting, regression, throwback, atavism, return and relapse.

So, although many may still hear ‘creating a new version’ as a connotation of re-version, the actual dictionary meaning of our message, We Will Completely Re-Version Your AV Material, would be: We Will Degrade Your Material into Other Languages.

However, in this case the audio track had already been recorded. It meant that, in order to remedy the situation, the whole audio track or at least the relevant two-minute section of it had to be re-recorded, re-edited and re-designed. Using what we call ремонтный кусочек in Russian, or a ‘fixing/repair piece’, is a bad idea: the tempo, recording level, gain, distance to the mic, etc., of the new recording session will stand out from the natural flow of the previously recorded track. Since Interpreters with Equipment is intended to be not just a promotional video but also a capability demonstration, we decided to keep the ‘fixing piece’ as a reference to illustrate to clients that revisions mean re-recording, just like an omission in transcribing or translation.

2.       The second ‘high-level general’ point to emphasize is the ability, and the channels, to talk to the client operationally about style sheets, instructions, expectations and the on-going resolution of issues. Due to the nature of AVT, such projects can be very challenging, and difficult or impossible to implement with quality and on time without being able to reach back to the client for direction.

Now, therefore:

1.       Transcription. We use oTranscribe for manual transcription (you can also put time stamps with it), or domain-trained, AI-powered voice recognition software. (It is really efficient and cost-effective. You can export the transcribed text in multiple formats, including SRT subtitle files. Another advantage is that you do not need a monthly subscription for it; it works as ‘pay as you go’: you buy a certain number of minutes and pay again when you have used them up.) Still, you have to be really alert with voice recognition software, even an excellent one, as some 5-6% of the words may not match the actual text even if the video sound quality is good.

2.       Translation (Transcreation). We use memoQ (perpetual Translator Pro license) as our basic CAT tool to allow expeditious integration, editing and content exchange with the client. Still, it is strongly recommended to first translate directly and then transcreate the translated product for localization, for the obvious reason of better integration with the target culture audience. This is yet another stage of the workflow where quick and effective communication with the client is essential. When necessary and deemed expedient by the client, we offer ‘back translation’ to ascertain that the main messages, along with nuances and intricacies, were taken care of in the localized version of the material. Here is a useful link on the back translation technique:


3.       Check-to-Fit. Prior to voice over or subtitling, we try out the finished translation/transcreation to check whether it fits into the original track timeline, and whether the text is directly usable for creating foreign language subtitles synchronized with the original track narration. More often than not, some shortening of the text is necessary. For subtitling we mostly use Subtitle Edit, as it gives the right waveforms and multiple tools for synchronizing subtitles with video, editing and export in all non-binary formats. We use the voice recording function of the Wondershare Filmora X video editor to see how the target language voice over sits on the original narration. Scripts are adjusted again when the translated text is found not to fit the original track timeline.

4.       Recording. It is usually a bad idea to record voice over in a video editor (though it is kind of ‘convenient’), because video editors are not DAWs (Digital Audio Workstations), so the sound quality you get with them is not really impressive. We use Audacity as our DAW, not because it is free, but because it is a very simple yet very effective audio recording and editing tool which many professional voice actors use. As our sound consultants say, the quality of the sound is formed by your mic and audio interface. A DAW is a tool; you can use Adobe Audition, Reaper, etc., if you have more challenging jobs, usually involving recording of music and sophisticated sound design. We especially value the advice and insight of those who were sound engineers before becoming successful voice actors:


On the equipment side, we use a Røde NT1-A studio condenser microphone, a Focusrite Scarlett 2i2 audio interface, Beyerdynamic DT 990 Pro professional production/editing headphones and a pair of studio monitors. Recording takes a lot of time; you can easily spend two to three hours recording a 30-minute voice over, as there will be many takes until you are satisfied.

5.       AV Editing. After the voice over recording is ready, quite a bit of time is spent on audio editing: removing breathing and other extraneous sounds, cutting, making the recording dynamic and so forth. I usually first suppress breathing sounds by applying a negative Amplification value and then cut the right piece, perform two-step noise reduction, and apply a specially developed graphic EQ profile, compression, a limiter and normalization. The necessary video synchronization is done once the audio track is okay.

6.       Pre- and post-production. We usually offer clients one minute of free recorded and edited product sent for vetting prior to the project implementation, so that they can check it for value. After the final product is delivered to the client, we still will be available for a reasonable number of revisions bearing in mind that a revision almost always means re-recording, and, thus, re-editing of the whole track or substantial piece of it.
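
As an illustration of the Check-to-Fit step, here is a minimal Python sketch of the comparison we make by ear and eye in the editor. It is not a feature of Subtitle Edit or Filmora. The function name `segments_to_shorten` is hypothetical, and the example treats ‘desynchronization’ as the amount by which a trial voice-over take overruns its original narration segment, which is an assumption made for the sketch; the one-second tolerance is the one mentioned earlier.

```python
from typing import List, Tuple

MAX_DESYNC_S = 1.0  # tolerance quoted earlier in the article


def segments_to_shorten(original_s: List[float],
                        voiced_s: List[float],
                        tolerance_s: float = MAX_DESYNC_S) -> List[Tuple[int, float]]:
    """Return (segment index, overrun in seconds) for takes that run too long.

    original_s: durations of the narration segments in the master track
    voiced_s:   durations of the trial voice-over takes for the same segments
    """
    if len(original_s) != len(voiced_s):
        raise ValueError("both lists must describe the same segments")
    flagged = []
    for index, (original, voiced) in enumerate(zip(original_s, voiced_s)):
        overrun = voiced - original
        if overrun > tolerance_s:
            flagged.append((index, round(overrun, 2)))
    return flagged


# Made-up durations for three segments, in seconds
print(segments_to_shorten([10.0, 8.0, 12.0], [10.5, 10.0, 12.0]))
```

Any segment that is flagged goes back to the script for shortening before the real recording session, which is far cheaper than re-recording afterwards.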

Ingredient X: Why Work with Us

Some clients are different. How are they different? They are different from the others. Not only do they want a service to be provided fast and with quality, but they are inquisitive minds, wanting to understand how something works; they ask questions, be that in order to become more confident that we can deliver as expected, or just out of a wish to know more about various walks of life. If you are still reading this ‘big read’, most probably you are one of that kind.

‘So, you are either an interpreter or a translator, right?’ I was asked by one such client. Well, theoretically, it is true. However, what is true theoretically or partially will most probably be false in general.

There are more than one million individuals who have accounts at ProZ, the world’s biggest localization marketplace, claiming that they are translators. Only a certain proportion of this number are actually professional translators. Of those, maybe only 10% can interpret, and a very, very small portion of them will be simultaneous or conference interpreters. Each of the functions, translator and interpreter, requires a very particular skill set along with very specific cognitive and psychological qualities and aptitudes, some inborn, some developed with time. Does it mean simultaneous interpreters are the ‘cream of the cream’? Some would say so; however, consecutive interpreting is the most difficult, as you have to hold so much in your memory before you can begin to interpret. The simultaneous function begins with a practical test of whether you are able to do it in principle, and if you are, then you go on to excel, as with everything else.

Can you roll your tongue both sides inwards in your mouth so that it forms some sort of a pipe you can breathe through? If you can, will you be able to explain to somebody else how to do it? I bet you will not.

Based on my experience, I know people who are excellent speakers of a foreign language but simply cannot interpret, or even re-tell; give them a free hand to speak about the subject themselves and you will hear exciting and thrilling stories, but they simply cannot convert somebody else’s words. One super excellent translator told me: ‘Well, I know how to translate it, but when it comes to interpreting it, I have this immobilizing stumbling block’. Another person I know is probably one of the best high-caliber conference interpreters. But when asked to interpret consecutively, his brain’s ‘RAM’ can hold only three words, and then he abruptly begins interpreting. And this is not a matter of training. It is a matter of ‘rolling your tongue in your mouth’, figuratively speaking. So, your ability to perform a certain function is at times inexplicable. You simply can or cannot do it.

In our profession (be that interpreting or translation) there are some other common concepts that you may hear, used for marketing, which again are true statements theoretically but may be quite misleading practically.

Such as: ‘In our agency all our interpreters have a PhD in linguistics’. How will this help in real work, when first of all you should be able to do the simultaneous function per se and then commit yourself to lifelong daily learning of your own craft? Your PhD in Medieval English grammar will not help you in any way.

‘All our linguists interpret only into their own language’ means what? Are there many conferences where there is no bilingual two-way rendition? Do members of two-person conference interpreter teams change shifts not every 20-30 minutes, but whenever they need to interpret into their mother tongue?

If your command of English is ‘close to native’, how does interpreting into your mother tongue help if a person speaks poor English, or if UK northern accents are involved? Wouldn’t it be better to interpret from your mother tongue - where you definitely understand every connotation - into ‘your’ English? I lived in London for one year, so I know that ‘call again’ means ‘come again’ and ‘building society’ is not about buildings. When you go north from London, local accents change remarkably every 20 miles or so, and northern accents are more challenging than those of the US Dixie, which are such a kerfuffle unless you have lived there for quite a long while.

Why should there be only ‘native interpreters’ voices from original countries’? I would rather have the pleasure of listening to the legendary Barry Slaughter Olsen interpreting from English into Russian than to many Russian language interpreters speaking in their original phonetic locales, who do not care much about mastering, using and acting with their voice to sound phonetically neutral and unobtrusive to the audience’s ear.

Thus, I believe that the majority of such statements are rather commercial slogans, which are okay for promotion but not for the attention of the Dear Project Manager, who needs to have a very firm grip on reality. What is advertised can differ from what is actual. In reality, very few good translators are equally good interpreters, as the set of relevant qualities is different, as is the set of stress and multitasking factors. As an interpreter, you interact with people and find yourself being a sort of ‘sponge’ for human emotions, feelings, reactions, criticism, sarcasm and the like. At times you need to act as some kind of ‘traffic controller’. In any case, your psychological make-up has to be very robust; that is why interpreters are so prone to severe professional fatigue, to the point of developing misanthropy, as you, in essence, lease out your mind so that somebody else’s words are spoken through your brain every time, which is utterly unnatural for human consciousness. You have to really love your job to be an interpreter.

At the same time, 50% of interpreters can be equally, or close to equally, effective translators. Perhaps an interpreter will opt out of a translation job if there are enough interpreting contracts, but for those 50% it is very doable. Moreover, translating a lot on a certain subject is critically important to an interpreter for learning more on the subject in terms of terminology and overall understanding of the specific area of knowledge.

Before we got intensely involved in Audio-Visual localization projects, we thought that there were only two distinct language rendering professions, translation and interpreting, and that you gravitate either to one or to the other. However, AV, which includes AVT, dubbing and audio-video editing, is definitely a third one. Of course, as a translator/interpreter you have to master a few new things (dubbing, audio-video editing), but language rendition is still at the core of it, as with interpreting and translation. What makes it different is the pivotal role of creativity in the synergized process, and this is why it is so thrilling if you are willing and dare to get involved.

In yet another promotional video it was said: ‘Do not entrust your AVT to a voice-over actor, nor dubbing to a translator, as these people are trained for their own separate functions. We work individually with the best translators, subtitlers, voice over actors and sound engineers. Come to us, it will be remarkable’. No doubt about that. But one thing was not mentioned, of course: the price. Every specialist involved will charge their own pretty high fees for the job. Combined with other costs of doing business, the total amount to be paid for the finished product will be surprisingly remarkable as well.

And now the whole story boils down to us; what we can do, for what cost, what kind of quality and why we claim that it is worth using our services.

We have been doing translations for over 25 years, at times very creatively, so we will handle AVT. For almost two years we have been constantly learning, both training and working on actual contracts, to do subtitling and audio and video editing. As conference interpreters, we have been continually working on our voices phonetically and rhythmically.

In order to accomplish something, you need to know what you are doing, be capable of it, possess and know how to use your technical equipment and software. Well, this is just exactly what is done by other people who claim that they can do it.

But simple summation of prerequisites will not necessarily produce the expected result. It is just like the concept of happiness: you can have all material and spiritual things available on hand, put that all together in a pattern and still not be happy.

We believe that success is not the sum of ingredients, it is their ‘chemical reaction’.

For this reaction to happen you need an Ingredient X, a catalyst and forming agent.

We claim we have one, and it is our passion for the AV work along with the subjective creativity.

To demonstrate it, we have created our promotional videos; if these are engaging for you, then we are right in our assumption that we can do a good job for you. If not, then we have the wrong type of Ingredient X for you, Dear Project Manager.

And yes, this all will be very reasonably priced; it will be controllable (you will have one single point of responsibility and contact; the response time to your correspondence will normally be under 30 minutes during the whole period of the project implementation), scalable, budget-friendly, with no extras or hidden costs.

As we say in our Remotely Present Video: ‘we want you to become our client and partner, come to us again and again and bring your friends and business associates’.

Now, this completes my presentation, Dear Project Manager. I hope that you will find this information useful.

Kind regards,

Faithfully yours,


Written by Andreas (Andrey Parkalov), Z4T Integrated Text and Voice

Photograph by Andreas


One Stop Localization. English, Russian, Ukrainian and Belarusian. Free video footage by Music by: Oleksii Kaplunskyi, Lesfm. ‘Epic Drums’