
Toward Intelligent Subtitling
Part 4: Visual Shifts

Shot changes. A crucial element in the subtitling process. Timing to them has become standard practice today, and we’re quite used to it — we know what to do and what not to do. Indeed, most well-established guidelines — corporate, national, academic, and other — touch upon shot cuts and instruct us to respect them when cueing, in order to match the film’s editing rhythm and to avoid visual flicker.

We take it for granted — and yet, something about this practice has always felt off to me. It seemed rather lacking, like there was more to it than anyone knew, a bigger picture waiting to be discovered, as I couldn’t help but notice — time and time again — one peculiar thing in my work: that there are many different kinds of these image “changes”, all warranting a similar treatment but getting no mention in any style guide whatsoever.

In this fourth part of the Intelligent Subtitling series, I am going to explore this bigger picture, share my findings, and introduce an entirely new concept: the visual shift.


And we’ll begin with...

The Idea


So, what is it exactly, this “visual shift”? Well, basically, it’s any abrupt change in the image. When something happens quickly and extensively on the screen, that is what you get. Like, for example, the light going off:


In this Simpsons scene, it’s the light, but it can be any fast movement or transformation — of people, objects, structures, environments, and so on.
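For readers who like to tinker, here is a rough sketch of how an "abrupt change in the image" could be flagged automatically. It is only an illustration under simplifying assumptions: frames are represented as flat lists of grayscale values, the function names are hypothetical, and the threshold of 40 is an arbitrary placeholder rather than a recommended value. A measure this crude would only surface candidate frames for a human to review.

```python
from typing import List, Sequence

# Assumed representation: one frame = flattened grayscale pixels, 0-255.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel brightness change between two equally sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def candidate_shifts(frames: Sequence[Frame], threshold: float = 40.0) -> List[int]:
    """Return every index i where frame i differs sharply from frame i - 1."""
    return [
        i
        for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) >= threshold
    ]


# Toy clip: a bright room for three frames, then the light goes off.
bright = [200] * 16
dark = [20] * 16
clip = [bright, bright, bright, dark, dark]
print(candidate_shifts(clip))  # [3]
```

Note that a simple difference like this cannot tell a shot cut from a shift inside a single shot, and it says nothing about whether a change is extensive enough to draw the eye. That judgement stays with the subtitler.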


Now notice what happens when I let the second subtitle linger a bit more and go over the switch:


It flickers a little, doesn’t it? As if we had crossed a shot change improperly. But... there isn’t one! It’s the same shot! Because of the cueing error, our eyes get drawn to the subtitle’s untimely disappearance, which breaks the flow and disturbs the viewing process. A small issue on its own, perhaps, but these timing problems will accumulate over a movie’s duration and lead to a tangible difference in how it’s perceived.

Now, with this in mind, let me propose my simple yet important postulate:

Subtitles are to be synchronized not only with sound but also with picture.

Any sufficiently sharp visual shift must be timed to: instant shifts should be treated similarly to shot changes, while abrupt transformations require intuiting the best in- and out-frames. This must be weighed against the other considerations: reading speed, segmentation, etc.
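To make the weighing in this postulate concrete, here is a minimal sketch of the instant-shift case: pulling a subtitle's out-cue onto a nearby shift, but only when the subtitle still stays on screen long enough to be read. Everything in it is an assumption made for illustration: the function name, the frame-based units, and the default tolerance and minimum duration are placeholders, not values taken from any guideline.

```python
from typing import Sequence


def snap_out_frame(
    out_frame: int,
    in_frame: int,
    shift_frames: Sequence[int],
    tolerance: int = 12,      # placeholder: how far a cue may be moved
    min_duration: int = 20,   # placeholder: shortest acceptable display time
) -> int:
    """Move a subtitle's out-frame onto a nearby visual shift when that is safe.

    The subtitle is assumed to be visible up to, but not including, out_frame,
    so out_frame == shift means it disappears exactly as the shift happens.
    """
    candidates = [
        s
        for s in shift_frames
        if abs(s - out_frame) <= tolerance   # close enough to be worth snapping
        and s - in_frame >= min_duration     # still readable after snapping
    ]
    if not candidates:
        return out_frame                     # nothing suitable: keep the cue
    return min(candidates, key=lambda s: abs(s - out_frame))


# The out-cue lingers 5 frames past a shift at frame 125: snap back to it.
print(snap_out_frame(out_frame=130, in_frame=100, shift_frames=[125]))  # 125

# Same shift, but snapping would leave only 15 frames on screen: keep 130.
print(snap_out_frame(out_frame=130, in_frame=110, shift_frames=[125]))  # 130
```

The second call shows the trade-off: when respecting the shift would cost too much reading time, the other consideration wins. Gradual transformations, where the best in- and out-frames have to be intuited, are deliberately left out of this sketch.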



What is “sufficiently sharp” will become apparent later on — and yes, your intuition will always be the best tool for deciding on the optimal timing, as there are too many variables and exceptions for me to be able to come up with a concrete set of rules.


The Map of Shifts


Alright, now that we know what visual shifts are and what to do with them, let’s explore their full classification:

[Image: the map of visual shifts]


We have three main categories: Camerawork, Editing and Mise-en-scène, each with its own two subcategories. And the shot cut is but one small blip in this grand scheme of things. Interesting, isn’t it?


Now let’s discuss the map, part by part, starting with...


1. Editing


In a previous article of the Intelligent Subtitling series, I briefly covered this aspect of filmmaking and explained some of its key components. One of them is shot transitions — a fundamental element of visual storytelling.

There are many kinds of transitions out there, and among them, the simple cut gets employed the most.

Cuts, of course, create visual shifts.

(If any of the examples below don’t seem convincing, try to watch them in full screen.)

But many other transition types — wipe, iris, fade, dissolve, etc. — can produce shifts as well, if fast enough:

And here, the creators of the Sherlock series use the optimal timing for a passing transition:


Next, we have editorial effects. These are modifications made not to the image itself but to how it’s presented: a framerate change (e.g. to stop-motion), aspect ratio switch, sudden slow-mo or fast-mo, rewind effect, etc.


Yes, an abrupt stop can also create a shift if it subverts the viewer’s expectation of what should happen next.

2. Camerawork


Cinematographers, as well as their assistants, adjust and manipulate the camera in very particular, deliberate, orchestrated ways to tell a story as it was envisaged by the movie’s director. This includes moving the camera in multiple axes of space; zooming in and out when needed; choosing the right lens, aperture and film stock; focusing on the right elements in the frame; and making a whole assortment of other related decisions.


As for movements, some can be quite rapid and thus need to be timed to. Like whip pans:


The latter flickers, because it doesn’t respect the shift. The same applies to whip tilts, as well as camera rolls, sped-up dollying, fast horizontal tracking, swift boom shots, and so on and so forth. You need to mind them when working on your subs. Here are a few examples from Wes Anderson’s Isle of Dogs:


Manipulations of the camera’s setup (its aperture, focus distance, depth of field, magnification level, etc.) can also produce visual shifts. This crash zoom in Quentin Tarantino’s Kill Bill is one example:

As well as this rack focus from Bong Joon-ho’s The Host:

Here’s Wes Anderson demonstrating the correct approach in Fantastic Mr. Fox:

The text disappears at just the right frame. Perfect timing, from the master himself.

3. Mise-en-scène


Actors, props, costumes, sets, buildings, lighting, computer-generated imagery — everything that appears within the camera frame is collectively referred to as the mise-en-scène. When any of its elements transform, move, or appear/disappear suddenly and quickly, we get a visual shift.


Here are a few examples:


Yes, removing the subtitle on the shot cut can sometimes be less than ideal. When you have multiple visual shifts happening at the same time or in quick succession, you need to consider the intended gaze path and how the subs’ appearance and disappearance will affect it, and then prioritize accordingly. And just like with shot changes, you can choose to cross or not to cross, depending on what works best.

Moving on, there’s also what I call a picture-in-picture shift, which is a transition not between two shots but rather within one shot, on some sort of a screen — of a phone, computer, TV set, projector, or similar:


But these are all visual objects. Some other parts of the image cannot be categorized as such: lighting, color, weather, the time of day, certain types of computer effects, etc. They’re the environment.
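If you wanted to tag shifts while reviewing a file, the three categories and their six subcategories could be written down as a small lookup table. This is only a sketch: the subcategory labels are paraphrased from the prose of this article rather than copied from the map image, and the names `SHIFT_MAP`, `EXAMPLES`, and `classify` are hypothetical.

```python
from enum import Enum


class Category(Enum):
    EDITING = "Editing"
    CAMERAWORK = "Camerawork"
    MISE_EN_SCENE = "Mise-en-scène"


# Assumed labels, paraphrased from the article text (two per category).
SHIFT_MAP = {
    Category.EDITING: ("transitions", "editorial effects"),
    Category.CAMERAWORK: ("movements", "setup manipulations"),
    Category.MISE_EN_SCENE: ("objects", "environment"),
}

# A few shifts mentioned in the article, mapped to a subcategory.
EXAMPLES = {
    "fast wipe": "transitions",
    "sudden slow-mo": "editorial effects",
    "whip pan": "movements",
    "crash zoom": "setup manipulations",
    "rack focus": "setup manipulations",
    "light going off": "environment",
}


def classify(example: str) -> Category:
    """Look up which main category a known example belongs to."""
    subcategory = EXAMPLES[example]
    for category, subcategories in SHIFT_MAP.items():
        if subcategory in subcategories:
            return category
    raise KeyError(subcategory)


print(classify("whip pan").value)         # Camerawork
print(classify("light going off").value)  # Mise-en-scène
```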

The Role of Sound


Originally, I wanted to create a broader concept — the audiovisual shift — which would include both the image and the sound components. What I found out via testing, however, was that sharp acoustics and aural changes almost never necessitate a special kind of timing on their own. Well, one might be compelled to cue differently in cases like the ones below, where the sound either suddenly starts/stops or abruptly interrupts the dialogue:


To me, though, such cueing feels more like splitting hairs than doing something impactful. Gunshots, screams, claps, shrill electronic sounds, vocables, accented notes in music... none of them seem to mandate special treatment, considering that the viewing device’s volume level and one’s hearing ability can vary a lot. At least I myself couldn’t find evidence to the contrary, and I’ve really tried. (But I’d love to be proven wrong!)

Still, what such sounds actually do is amplify visual shifts, making them sharper:

[loudness warning!]


Hehehe, that was a good one, wasn’t it? 👻



And this is it for now. Again, we have three main categories of shifts: Camerawork, Editing and Mise-en-scène, each with its two subcategories. When subtitling, you need to detect these shifts and find the best timing for them using your intuition, judgement, and sense of visual rhythm, while keeping in mind some other important considerations. Oh, and try not to overdo it: it’s all about getting the right balance.

So, let us move away from the narrow idea of the shot change, toward the greater concept of the visual shift.


Alright, this concludes the article. As always, if you have any questions, thoughts, or remarks, feel free to share them in a comment below.

