Over the years of work, I have noticed an alarming trend — many otherwise capable subtitlers will often follow their client’s guidelines too strictly, almost dogmatically, without a real understanding of why those guidelines are the way they are and not knowing when to deviate from them to ensure the audience’s viewing comfort. This leads to countless subtitling errors, because no matter how robust and well-thought-out a style guide is, there will always be gaps in it, and so there’ll be moments when you need to make a judgement call based on your expertise rather than a written prescription.
In this new article series, I’d like to stress the importance of a thoughtful, intelligent approach to subtitling and to highlight some of those gaps, starting with arguably the biggest one.
At the same time, when given creative freedom, the most skilled and experienced subtitlers don't obsess over that number as much as everyone else seems to believe, because they know just how unreliable it can be, for multiple reasons.
First of all, as I wrote in one of my previous articles, CPS and WPM consider only the volume of subtitle text but not its other properties, such as complexity or format. Unfamiliar words, tricky syntax, puzzling dialogue, italics and some other things will slow down your reading, and these two metrics simply do not reflect that.
Max Deryagin’s Subtitling Studio
Celebrity Interview: Agnieszka Szarkowska
Ladies and gentlemen, and everyone else, here comes a new article in the Celebrity Interview series. This time I am joined by none other than Agnieszka Szarkowska, the newest laureate of the Jan Ivarsson Award. She’s one of the leading experts in media accessibility and eye-tracking research, a renowned academic teacher, ex-translator, trainer, consultant, conference organizer and presenter, and a proponent of cutting-edge technologies in the field of audiovisual translation.
In this interview, Agnieszka talked about her beginnings, demystified the inner workings of subtitling research, expounded upon her studies, addressed some controversies, and gave us a peek into what’s to come.
Max Deryagin (MD): To start, please walk me through your career journey, from your student days and up to now.
Agnieszka Szarkowska (AS): Okay! After doing my MA in English, I was happy to leave the university and wanted to become a translator. I’d tried to get an AVT job but didn’t know anyone, so I ended up as a translator working in-house in a corporation while also freelancing. But after a while I felt something was lacking, I got a bit bored with my work — I needed more intellectual stimulation. So, I went back to the same university and did my PhD. At that time, I was doing some odd subtitling and SDH jobs, but my mind was set on becoming a researcher and educator rather than a full-time translator.
Agnieszka in her 20s
After obtaining my PhD, I transferred to another department, which had translation at the heart of its curriculum, and I have worked there ever since, with a two-year break for a research fellowship at the Centre for Translation Studies at UCL. I have received a number of research grants, which allowed me to work on various projects, such as text-to-speech audio description, AD in education, eye-tracking studies, respeaking, and so on.
MD: Okay, my first question is going to be about your research. How do you choose what issue or subject to explore in a new study? Do you follow the trends? Or does it depend on what would be the most impactful thing for the field or your career?
AS: I like to get to the bottom of things, understand what was really done and what remains to be done. For instance, before I started to work on experimental research into reading speed, I read everything I could find on this topic, because I had the impression that this had already been thoroughly researched and that maybe it didn’t make sense to do it again. After all, everybody was citing Prof. d’Ydewalle and his famous study from 1987 as having provided scientific evidence for the famous 6-second rule (which roughly translates into the speed of 11 cps).
By the way, the paper was quite hard to find and I was so happy when I finally located it in the University of Antwerp library, thanks to the help of my friend Gert Vercauteren. Since then, I always try to make my publications available in open access so that people can find them more easily.
Agnieszka happy to finally find the damn paper
But in this paper Prof. d’Ydewalle himself says that “nobody seems to know how the six-second rule was arrived at” and then describes two studies. In one of them, a group of people were shown videos at three speeds: 32 cps (what they called the “2-second rule”), 16 cps (“4-second rule”) and 10.6 cps (“6-second rule”), and were asked which subtitles they preferred. Remember, it was the mid-1980s, so it doesn’t surprise me that most people chose the 6-second rule. And then I thought: hey, a lot has changed since then, right? The size and quality of TV sets, the speech rate and editing in films, research equipment, the rise of the Internet, streaming and mobile devices, to mention a few. And we still seemed to rely on this old study! So I thought: I’d like to know what it looks like now. That’s how it started. And it turned out to be a very hot and controversial topic.
The kind of computer they used in the old “6-second-rule” study
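As a side note, the three speeds quoted above are consistent with a single full subtitle shown for 2, 4 or 6 seconds. The short sketch below reproduces that arithmetic. The figure of 64 characters (two lines of 32) is an assumption inferred from the numbers in the interview, not something the interview states, and the function name is purely illustrative.

```python
# Assumption: a "full" two-line subtitle holds 64 characters (2 lines x 32).
# This figure is inferred from the speeds quoted in the interview.
FULL_SUBTITLE_CHARS = 64


def chars_per_second(char_count: int, duration_seconds: float) -> float:
    """Return the reading speed in characters per second (cps)."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return char_count / duration_seconds


# The "2-second", "4-second" and "6-second" rules applied to a full subtitle.
for seconds in (2, 4, 6):
    cps = chars_per_second(FULL_SUBTITLE_CHARS, seconds)
    print(f"{seconds}-second rule: {cps:.1f} cps")
```

Under that assumption, the 6-second case works out to 64 / 6 ≈ 10.67 cps, which the interview quotes as 10.6 and elsewhere rounds to 11.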
So, to answer your question, I like to do a deep dive first, see what hasn’t been done yet and what’s interesting and worth doing. I also like for my research to have practical implications for viewers and the industry. A few years back I worked a lot on audio description. We wanted to create AD and test it with blind viewers (together with Ania Jankowska), but we didn’t have the funds to do the recording, so we started working with text-to-speech software, which was better than nothing. Then it became a research project. Nowadays it’s not as controversial as it used to be, but in the early 2010s speech synthesis was not that good, so it was quite a novelty.
MD: So how did you arrive at eye tracking?
AS: I owe my first eye-tracking experience to Pilar Orero, who received some funding and invited me, together with a number of other researchers, to the Autonomous University of Barcelona for a hands-on training session with a Tobii eye tracker. It was there that I met many colleagues and friends who I learnt from and collaborated with for many years to come.
Agnieszka watching the eye-tracking computer
The eye-tracking computer watching Agnieszka
I loved eye tracking from the start: it provided hard evidence for whatever hypotheses you had and also allowed you to test all sorts of assumptions, for instance about subtitling and shot changes. That’s the type of research that’s always fascinated me, as opposed to somebody’s interpretation of what the translator had in mind when they translated this or that.
MD: Right. Okay, the next question is going to be a difficult one. Yes, you probably know what it is: it’s about that one study claiming that people can read subtitles at 20 characters per second. It has received a lot of criticism. So I wonder: what has this whole thing meant to you? Has the feedback changed your perspective or perhaps given you confidence to continue?
AS: Haha. I sort of touched upon this earlier, but the rationale for this study was to try and test what viewers are confronted with in real life, and the reality is that the speed has gone up. So I chose three speeds — 12, 16 and 20 cps — and tested a number of different things with different viewers: English, Spanish and Polish native speakers as well as deaf and hard of hearing people in the UK. My line of thinking was: if some viewers can indeed keep up with subtitles displayed at 20 cps, it must be young people, probably familiar with subtitling. That’s why I chose young participants. I believed that if anyone can read fast, it must be them. If they can’t keep up, then nobody can. So you see that it didn’t really make sense to start with the elderly, right?
This study has cost me a lot of nerves and there are many misunderstandings that arose around it that I've tried to clear up over the years. One misconception is that I only tested students. No, we tested all sorts of people, most of whom weren’t students. I remember one guy said he was a “part-time participant” at UCL, so he was making money by attending various studies, but he didn’t study anything there. We conducted our research using an eye tracker that belonged to UCL and was located on university premises, but if you come to my university lab for a study, it doesn’t automatically make you a student, right?
Another criticism I received was that we tested only a handful of people. Yes and no. There were two of us conducting the study: myself and Olivia Gerber-Morón. Altogether we tested 97 participants, individually, and each test took between 90 and 120 minutes. So you can calculate yourself how many hours, and weeks, we spent in the lab. For an experimental study with eye tracking, that’s a lot. How many eye-tracking studies in our field do you know that used a larger sample? Is it representative of the whole population? Probably not. Yes, we tested Spanish and Polish people living in the UK, so it is possible they had different AVT habits to people living in Spain and Poland. So it’s important to test people in various countries, since they’ve had a different experience with subtitling.
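For readers who want to take up that invitation, here is a minimal sketch of the calculation, using only the figures given above (97 individual sessions of 90 to 120 minutes each). It counts testing time only. Setup, recruitment and data analysis are not included, and the function name is illustrative.

```python
# Figures taken from the interview: 97 participants, tested individually,
# with each session lasting between 90 and 120 minutes.
PARTICIPANTS = 97
MIN_SESSION_MINUTES = 90
MAX_SESSION_MINUTES = 120


def total_lab_hours(participants: int, minutes_per_session: int) -> float:
    """Total testing time in hours if every session has the given length."""
    return participants * minutes_per_session / 60


low = total_lab_hours(PARTICIPANTS, MIN_SESSION_MINUTES)
high = total_lab_hours(PARTICIPANTS, MAX_SESSION_MINUTES)
print(f"Between {low} and {high} hours of testing")
```

That comes to somewhere between 145.5 and 194 hours of sessions alone.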
But I think the harshest criticism that came from professional subtitlers was rooted in the belief that this study goes against the core principles of subtitling, such as condensation. If you can subtitle at 20 cps, you don’t need to condense as much as you do when you subtitle at 12 cps. But in this study we don’t say 20 cps is the optimum speed, it’s the maximum speed that we tested. We now have more evidence, from a study by Jan-Louis Kruger in Australia, which we replicated in Poland and in the UK; it showed little difference in some aspects between the speeds of 12 and 20 cps, but a huge difference at the speed of 28 cps. So we know that 28 cps is an impossible speed to read.
Back to my London study, many participants we talked to during the interviews stressed that they are distracted by the discrepancies between the audio and the subtitles to the extent that they switch the subtitles off or change the stream to intralingual subs. Of course, this only applies to those who can understand the language. But there are millions of people now who watch audiovisual content with the subtitles even though they understand the language of the soundtrack.
Some people criticised the video material I chose, the genre, the duration of the clips, etc. But the fact is that any choice is riddled with problems and at the end of the day you have to make a choice. I made an informed choice, aware of some issues, but otherwise it wouldn’t be possible to do anything really. So, my answer to those critical comments is: let’s do more studies on reading speed with different languages and viewers. If you feel like you would like to do yours, you can conduct an experiment in the way you think is the best. You can even come to my lab and use our eye tracker. In fact, a PhD student, Huihuang, who I co-supervise with Jorge, spent a few months in Warsaw testing Chinese people in a study on reading speed (at 5 versus 9 cps).
Again, this study was not flawless. I have learned a lot and since then I’ve moved on and tried to do some things differently. With any study, we always try to improve and build on the previous ones. What is important to remember is that one study is not enough. We should replicate our studies on other populations and languages, and see if we are getting similar or different results. Then we will be able to state something with more certainty.
MD: In that regard, what are the common misconceptions about academic work and research, from the perspective of us practitioners and also the corporate world?
AS: I think people overestimate what can be done within a subtitling course at a typical university MA programme. First of all, there is not enough time. My subtitling class consists of 15 meetings, 90 minutes each. Can you become a subtitler in less than 4 months, having attended 15 classes? Some people believe that we should teach how to do invoicing, how to market yourself, etc. This is something that I won’t do in my university class. Not because I don’t think it’s worth doing — on the contrary, it is very useful and should be taught — but elsewhere. I’d rather focus on text condensation, reading speed, cultural references, etc. I believe we need to teach those transferable skills that they can apply later in any setting or software. Software is another thing: companies expect people to come out of university and be able to start working immediately in their cloud tool. But most of these tools are not publicly available.
Many people may also think that academics go to work once or twice a week and read books for the rest of the time. What we in fact do is a lot of boring admin work: we attend various committee meetings, mark students’ theses, and do a lot of unpaid work, such as peer reviews.
And one more thing: I remember how surprised my dad was when he found out that I am not getting paid to go to a conference and deliver a talk. He couldn’t believe that it’s me who has to pay to be able to present ;) The same with publications.
MD: Surprising indeed! Now, can you offer us a glimpse of what you’re researching right now? :)
AS: We are doing another study on reading speed, among other things, if you can believe it! But we are approaching it in a different way. One difference is that previously we created the subtitles consistently at the same speed: for the 20 cps condition, all the subtitles were roughly between 19 and 21 cps. This is not very realistic, so we are now working with what we call “actual speed”, with each subtitle being different. We are hoping to see how this actual speed will impact the reading. We also have a fancy new eye tracker which allows us to analyse the reading of subtitles at the word level, rather than at the subtitle level as was the case with the vast majority of previous studies.
Word-level areas of interest in eye tracking
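To illustrate what a per-subtitle “actual speed” might look like in practice, here is a minimal sketch that computes cps separately for each subtitle. It is not the researchers' method. The `Cue` class, the sample cues and the character-counting convention (spaces and punctuation count, line breaks do not) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """A hypothetical subtitle cue: in/out times in seconds plus its text."""
    start: float
    end: float
    text: str


def cue_cps(cue: Cue) -> float:
    """Reading speed of a single cue in characters per second."""
    duration = cue.end - cue.start
    if duration <= 0:
        raise ValueError("cue must have a positive duration")
    # Assumption: spaces and punctuation count as characters, line breaks do not.
    char_count = len(cue.text.replace("\n", ""))
    return char_count / duration


# Invented sample cues, used only to show that speed varies per subtitle.
cues = [
    Cue(0.0, 2.0, "Where were you last night?"),
    Cue(2.5, 4.0, "At home."),
    Cue(4.5, 7.5, "That's not what\nyour neighbour told us."),
]

speeds = [cue_cps(cue) for cue in cues]
for cue, cps in zip(cues, speeds):
    print(f"{cue.start:>4.1f}s-{cue.end:<4.1f}s  {cps:5.2f} cps")
```

Counting conventions differ between tools and style guides, so the same file can yield slightly different speeds depending on whether spaces are included.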
In the AVT Lab, my doctoral students, Gabi and Sonia, are also working on the perception of dubbing, including AI dubbing and the impact of the soundtrack’s language on subtitle processing.
As part of the Watch Me project, our team are also working on watching subtitled videos with the sound off and seeing how people process subtitled videos when there are discrepancies between the audio and the text in the subtitles. We conduct our experiments in three countries: Poland, the UK and Australia (a big thank you to my post-doctoral researcher Valentina Ragni!), and we are hoping to get more generalizable results based on data from various types of viewers.
MD: Now let’s move on to a different topic — namely, teaching. There’s a lot of talk right now about the necessity of giving AVT students the basics of machine translation post-editing. Have you added them to your own university programs? And what are your general thoughts on this?
AS: Last year we had Dr David Orrego-Carmona as a visiting lecturer at the University of Warsaw and he taught what I believe was the first class solely devoted to post-editing and automation in subtitling. And it was a big success.
I have also added MT and automation to my subtitling class. I first teach them everything from scratch, how to do things traditionally, or manually you can say. And then I show them how to automate things, for instance using automatic speech recognition to create transcription and timing, and also show them some tools that integrate MT. I even let them use all this in their final test.
And you know what? I think it’s the older generation — people my age and older — that has a problem with MT. Most young people have embraced it and understand it’s not perfect and has to be post-edited. They understand they need to learn when it’s usable and when it’s not. They can’t imagine translating without these AI tools. However, we just need to make sure that MT is implemented correctly in subtitling workflows (as part of augmented translation) and that the benefits are shared fairly between the companies and the freelancers. Obviously, we need to keep improving these tools as well. And, from a training perspective, we need to learn how to harness them to our advantage.
MD: Not sure about the older generation thing, but yes, technology is marching forward, so we need to prepare for the future. Now, as far as I’m aware, you taught during COVID. Was it the pandemic that inspired you and your colleagues to launch AVT Masterclass?
AS: I started making my first screencasting tutorials ages ago so that I didn’t have to repeat the same things for students who missed a class. And then I thought, hey, since I already have this tutorial, maybe it will be useful to others, so I contacted EZTitles and they made it available on their YouTube channel. I believe you yourself made a tutorial for them too.
So, this made me realise that online education is easily scalable. At the uni, I teach subtitling to a group of 20 or so students, but the needs of the industry are enormous. Not everybody wants to sign up for a fully-fledged university programme in translation; most people just want to learn specific skills, such as SDH or template creation, or how to use a subtitling tool. AVT Masterclass bridges the gap between academic training and the industry needs by offering courses targeted at specific skills. We offer two types of training: publicly available self-paced courses for individuals (as part of their continuing professional development) and private tailor-made courses for companies.
With my colleagues Łukasz Dutka and Agnieszka Walczak, we have worked across various fields in the media localization industry. For instance, Łukasz has extensive experience working as an in-house and a freelance subtitler, both in linear broadcasting and with OTT services. He has a strong background in template creation, SDH, and live subtitling. Agnieszka has a background in audio description, voice-over, subtitling and SDH, and she has a wealth of experience in managing global localization workflows from the perspective of a large LSP and a leading content owner.
Our collective experience has given us an in-depth understanding of the intricacies of media localization. Independently, each of us recognized the pressing need for training in our areas of expertise. All of us, Łukasz and Agnieszka and myself, had trained people at various levels and organizations before. But the pandemic made it easy to train people online on a larger scale. When we conducted our second training for the European Parliament, we thought: a lot more people could benefit from what we have to offer. We can provide flexible self-paced training for people living in different time zones and share our passion and knowledge with students across the globe. And there are great online educational platforms to make this possible. We struck a deal with some subtitling tool manufacturers and we now have a growing portfolio of courses, including big courses and some mini courses, targeted at very specific skills. We train people in various aspects of subtitling: timing, translation strategies, template creation and SDH/CC.
MD: I see. So what are the future plans for Masterclass?
AS: We are working on new publicly available courses for freelancers and tailor-made training programmes for companies. This year, we are planning to release courses on project management, working with subtitling guidelines, post-editing, QC and translating audio description.
In 2022, we made our courses available for free to all Ukrainian linguists. We now continue to support them in their efforts to subtitle audiovisual content into Ukrainian and create SDH in this language.
MD: I think it’s a great initiative, and probably one of the many things that led to you receiving the Jan Ivarsson Award at the Languages and the Media conference. Which I think is a big deal, congratulations!
AS: Thanks! :)
Agnieszka with the award, next to Sharon Black, President of ESIST
Conferences and Technology
MD: Speaking of conferences, you are the main organizer of Intermedia, and it’s a large one, so I’d like to ask you: what goes into organizing a conference?
AS: Oh, you don’t want to know! A lot of useless bureaucracy and paperwork. Intermedia 2019 was a great conference in terms of speakers, participants, atmosphere, access services, and so on — but I’m not organising another conference ever again.
MD: So… Intermedia is over?
AS: We’ve had to suspend Intermedia and are looking into ways to revive it. If anyone’s interested, please reach out to us!
Agnieszka at Intermedia 2019, with a sign-language interpreter
MD: From your perspective, what are some of the new interesting topics that have been discussed at conferences lately?
AS: Hm, let’s see... Augmented translation, integration of translation memory into subtitling tools, subtitling certification, automatic dubbing and automation in general. Also, translator training.
MD: You probably have seen ChatGPT making headlines, and you know about neural machine translation and voice synthesis in dubbing, and all these cutting-edge technologies. What is your opinion about the use of AI in AVT: do you think it is the future and something we should embrace, or do you have reservations regarding it?
AS: I am a big fan of technology and in fact I’m planning to have my own avatar created for online education as soon as I can afford it (although my business partner Łukasz thinks we’re all much better as real-life personas, haha). It will save me a lot of time recording video tutorials, putting on makeup and editing out my hesitations and false starts!
But seriously, AI is a game changer and I'd be interested to find out more about how we can harness it for media localization.
MD: If you had a loudspeaker to the entire AVT world, what would be your one message to translators, academics and industry people?
AS: In the grand scheme of things, we all work towards the same goal: making audiovisual media accessible to viewers around the world. Let’s cooperate and keep the conversation going to better understand each other’s needs and viewpoints!
MD: Alright, awesome! This was the last question I’ve prepared, so we’re done here! Thanks for your time!
AS: You’re welcome!