Transitions and Cuts for Short-Form Video
Cutting principles from the editors who cut real films (Walter Murch on Apocalypse Now, Thelma Schoonmaker for Scorsese, Lee Smith on Dunkirk, Joe Walker on Dune) re-anchored to the 9:16 phone frame with named techniques and named films.
Walter Murch and the Cut That Lands on the Feeling
By Bell Chen, founder. Updated May 18, 2026.
In In the Blink of an Eye, the editor Walter Murch, who won the Best Sound Oscar for Apocalypse Now and then both the Best Film Editing and Best Sound Oscars in 1997 for The English Patient, ranks the six criteria a cut has to satisfy in descending order of importance: emotion at 51 percent, story at 23 percent, rhythm at 10 percent, eye trace at 7 percent, the two-dimensional plane of the screen at 5 percent, and the three-dimensional space of action at 4 percent.
"What I believe is being looked for, in the editor's choice of the 'right' cut, is something that satisfies all the following six criteria, in their order of importance: 1) it is true to the emotion of the moment; 2) it advances the story; 3) it occurs at a moment that is rhythmically interesting and 'right'; 4) it acknowledges what you might call 'eye-trace'..."
Walter Murch, In the Blink of an Eye, Silman-James Press, 2nd revised edition, 2001
The percentages are not literal arithmetic. They are a ranking of what a cut should serve when those criteria pull in different directions. The number to hold in mind is the first one. Emotion is more than half of why any cut should exist.
That ranking is the single most useful editorial framework for the 9:16 phone frame, because the platform attention model rewards the cut that lands on the feeling and punishes the cut that lands only on the visual. The "viral transition" tutorials that dominate the short-form how-to corpus invert Murch's ranking. They optimize for the three-dimensional space of action (the seamless whip pan, the masked door-wipe, the impossible camera path) and forget that the viewer scrolled because they did not feel anything yet. The named editors below cut for emotion first. The phone shoot earns the same standard.
The Six-Criteria Hierarchy, Used as a Checklist
Murch's hierarchy grew out of a career that includes Apocalypse Now (1979), The Conversation (1974, directed by Francis Ford Coppola and edited by Richard Chew, with Murch as supervising editor), and The English Patient (1996, for which Murch won both the Best Film Editing and Best Sound Oscars in 1997). In the Blink of an Eye is the slim book that captures the framework. The percentages add to 100, but they work as a ranking rather than as a sum: if a cut is perfect on the lower five criteria and wrong on the first, the cut is wrong. If a cut is right on emotion and breaks two-dimensional continuity, the cut is still right. The hierarchy reads, top to bottom, as follows.
Emotion (51 percent). The cut serves what the audience should be feeling at that moment. The next shot is the one whose presence intensifies, sharpens, or pivots the feeling rather than merely advancing the visual.
Story (23 percent). The cut moves the narrative forward. The next shot delivers information the audience needs to understand what is happening or what is about to happen.
Rhythm (10 percent). The cut lands on the beat that the sequence has established (musical, conversational, visual). The cut feels rhythmically right rather than rhythmically arbitrary.
Eye trace (7 percent). The audience's eye is already looking at the part of the frame that will carry the action into the next shot. The cut continues the eye's path rather than restarting it elsewhere.
Two-dimensional plane (5 percent). The 180-degree rule holds. Screen direction is consistent between shots. A character looking left continues to look left.
Three-dimensional space (4 percent). The action and geography between shots make physical sense. The hand reaching in shot A continues to the cup in shot B in a way the room layout supports.
The use of the hierarchy is not theoretical. When two takes both feel viable in the edit and there is no obvious reason to choose one, Murch's framework asks: which cut serves the higher-ranked criterion better? If take A has cleaner eye trace but take B is better on emotion, take B wins. The audience will forgive an imperfect eye trace if the emotion is right. The audience will not forgive an emotionally inert cut, no matter how clean the screen geography is.
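The tie-break described above can be sketched in a few lines of Python. This is one reading of the hierarchy, not Murch's own method: it treats the ranking as strictly ordered, so the first criterion on which two takes differ decides the winner. The 0-to-10 ratings, the `Take` class, and the `pick_take` helper are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Murch's six criteria, highest priority first.
CRITERIA = ("emotion", "story", "rhythm", "eye_trace", "plane_2d", "space_3d")


@dataclass(frozen=True)
class Take:
    name: str
    scores: dict  # criterion -> the editor's own 0-10 rating (hypothetical scale)

    def ranked(self) -> tuple:
        # Order the ratings by Murch's ranking so that tuple comparison
        # checks emotion first, then story, and so on. Missing ratings count as 0.
        return tuple(self.scores.get(c, 0) for c in CRITERIA)


def pick_take(a: Take, b: Take) -> Take:
    """Return the take that wins on the highest-ranked criterion where they differ."""
    return a if a.ranked() >= b.ranked() else b


# Take A has the cleaner eye trace; take B is better on emotion.
take_a = Take("A", {"emotion": 6, "story": 7, "rhythm": 7, "eye_trace": 9})
take_b = Take("B", {"emotion": 8, "story": 7, "rhythm": 6, "eye_trace": 4})
print(pick_take(take_a, take_b).name)  # B
```

The comparison never adds the scores together, which keeps it in line with the point that the percentages are a ranking and not literal arithmetic.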
On the phone, the hierarchy collapses to a faster question because the clip is shorter and the criteria conflict less often. The question becomes: does this cut land on the next thing the viewer should feel? If yes, ship it. If the cut lands on the next thing the viewer should see but does not yet care about, the script has not earned the cut, and the editor's job is to add a beat (a reaction shot, a held look, a delivered punchline) before the cut goes through.
The Vocabulary of Cutting, With Named Cinema Examples
The named techniques below are the working vocabulary that every editor cited in this guide uses. Each has a verifiable origin in cinema history and a direct application to short-form video.
The hard cut
A hard cut is the unmodified transition: one shot ends, the next shot begins, no dissolve, no effect, no audio bleed. It is the default unit of editing and the workhorse of every editor cited in this piece. Thelma Schoonmaker, who has edited Martin Scorsese's films since Raging Bull (1980) and won the Best Film Editing Oscar three times (for Raging Bull in 1981, The Aviator in 2005, and The Departed in 2007), has discussed in interviews archived at A.frame, the Academy's editorial site, how the bulk of Goodfellas (1990) and The Departed (2006) is straight cuts paced to the rhythm of the music and the rhythm of the line readings. The fancy moves are the exceptions. The hard cut is the texture of the film.
On the phone, the hard cut is similarly the default. Ninety percent of the cuts in a good talking-head clip should be hard cuts on action or on dialogue beats. The viewer's brain processes them instantly because that is what brains were trained to do by every TV cut and every film cut since the 1900s. Reach for an effect only when the hard cut cannot do the work.
The dissolve
A dissolve overlaps the end of one shot with the start of the next, fading the first out as the second fades in. It carries the texture of time passing or of a memory shift. The canonical use is in Citizen Kane (1941, edited by Robert Wise), whose opening approach to Xanadu is a chain of lap dissolves. (The film's famous breakfast-table montage, which compresses Kane's marriage to Emily across years, bridges its individual breakfasts with swish pans rather than dissolves.) The contemporary use is the dissolve in a documentary or biopic montage when the story collapses years into seconds.
On the phone, the dissolve is mostly wrong. A two-second dissolve in a fifteen-second TikTok burns 13 percent of the runtime on a transition that is not earning attention. The exceptions are the dream-sequence intro, the memory cutaway in a personal essay, or the deliberate slow-motion vlog reveal. Use the dissolve when the script is asking for a felt sense of time, not for visual softness.
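The 13 percent figure is plain division, and the same check can be run on any clip before committing to a transition. A minimal sketch, with `transition_share` as a hypothetical helper name:

```python
def transition_share(transition_seconds: float, clip_seconds: float) -> float:
    """Fraction of the clip's runtime spent inside transitions."""
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    return transition_seconds / clip_seconds


# A two-second dissolve in a fifteen-second clip.
print(f"{transition_share(2.0, 15.0):.0%}")  # 13%

# The same dissolve in a sixty-second clip costs far less of the runtime.
print(f"{transition_share(2.0, 60.0):.0%}")  # 3%
```

The shorter the clip, the more expensive the same dissolve becomes, which is why the move that reads as breath in a feature reads as dead air on the phone.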
The fade in and fade out
A fade from black to image, or from image to black, marks the start or end of a section. It carries the texture of a curtain rising or falling. In Schindler's List (1993, edited by Michael Kahn for Steven Spielberg), the fades to black bracket the chapters of the film. The black is not absence; it is breath.
On the phone, the fade is almost always a mistake when used as a transition between shots inside a clip. The viewer reads black frames as "the video ended" and the muscle memory of the swipe takes over. The exception is the deliberate stylistic opening (one second of black with a title card) or the closing beat (fade-out timed to a final line that wants room to resonate). Use the fade as a chapter mark, not as glue.
The L-cut and J-cut, the audio overlaps
An L-cut continues the audio of the outgoing shot over the incoming shot. A J-cut starts the audio of the incoming shot before the picture cuts to it. Both are the unsung workhorses of professional dialogue editing. Joe Walker, Denis Villeneuve's editor since Sicario (2015) and the editor of Arrival (2016), Blade Runner 2049 (2017), Dune (2021), and Dune: Part Two (2024), uses audio bleeds across cuts so aggressively that, per the IndieWire Toolkit interview on his Villeneuve method, the audience often does not consciously notice the picture cut at all. The conversation continues in voice while the image shifts to the listener.
On the phone, the J-cut is the single highest-ROI editing move for talking-head content. Start the next sentence's audio one to three frames before the visual cut. The viewer's brain registers the new line before it registers the new shot, and the cut feels like the natural continuation of thought instead of like a stitch. The L-cut is the inverse and works for reaction-shot reveals: hold the audio of the speaker over a cut to the listener's face, then let the listener's reaction land in silence.
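For editors who think in timecode, the frame offset converts to seconds as follows. This is a sketch only: the 30 fps timeline and the `j_cut_times` helper are assumptions for illustration, and the lead should be recomputed for whatever frame rate the project actually uses.

```python
def j_cut_times(picture_cut_s: float, lead_frames: int = 3, fps: float = 30.0):
    """Return (audio_start_s, picture_cut_s) for a J-cut.

    The incoming line's audio starts lead_frames before the picture cut.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    if lead_frames < 0:
        raise ValueError("lead_frames must not be negative")
    lead_s = lead_frames / fps
    # Clamp at zero so a cut near the start of the clip cannot go negative.
    return max(0.0, picture_cut_s - lead_s), picture_cut_s


# Picture cut at 4.0 s on an assumed 30 fps timeline, three frames of audio lead.
audio_start, picture_cut = j_cut_times(4.0, lead_frames=3, fps=30.0)
print(round(audio_start, 3), picture_cut)  # 3.9 4.0
```

At 30 fps, three frames is a tenth of a second: long enough for the ear to register the new line, short enough that the viewer never sees a mismatch between mouth and sound.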
The match cut
A match cut is the cut between two shots that share a similar shape, motion, or sound, so that the cut feels like a single continuous form across an impossible bridge. The most-cited example in cinema is the bone-to-satellite cut in 2001: A Space Odyssey (1968, edited by Ray Lovejoy for Stanley Kubrick), in which a hominid throws a bone into the air and the cut lands on a bone-shaped orbital satellite four million years later. The second most-cited is the match cut in Lawrence of Arabia (1962, edited by Anne V. Coates, who won the Best Film Editing Oscar in 1963 for the film), in which Peter O'Toole blows out a match and the cut lands on the sun rising over the desert.
Coates discussed the cut in a Motion Picture Editors Guild oral history and credited David Lean's instruction to find a "stop and start at the same time" image. The match-cut principle has been reused by editors in every era since; the opening of Apocalypse Now, which Murch worked on, ties helicopter blades to a spinning ceiling fan.
On the phone, the match cut is the rare cinema move that ports cleanly because the viewer's recognition of the shape match happens in under a second and produces a hit of pattern-completion pleasure. Across the high-retention short-form posts I audit, match-cut hooks consistently outrun other hook shapes in the first three seconds for a single reason that Coates would recognize: the cut itself is the punchline. The most-shared TikTok hooks built on match cuts work the same way as Coates's match cut: blow out a candle, cut to a sunrise; close a laptop, cut to a closing door; lift a coffee cup, cut to a hot-air balloon lifting off. The cut earns the next eight seconds because the viewer's brain just felt the trick land.
The jump cut
A jump cut removes a slice of time from inside the same shot, so that the subject appears to leap forward in space or pose without the camera having cut to another angle. Jean-Luc Godard's Breathless (À bout de souffle, 1960, edited by Cécile Decugis) is the canonical first deployment and is the cut that introduced the technique to the broader cinema vocabulary. The jump cuts in Jean Seberg and Jean-Paul Belmondo's car scenes were initially read as mistakes by viewers and as deliberate provocations by critics; they became, over the decade that followed, a French New Wave signature.
On the phone, the jump cut is no longer a provocation; it is the dominant cut shape of every talking-head TikTok and every commentary YouTube short. Daniel Schiffer's tutorial work, the entire CutWithGreg / Greg's Tips Premiere school, and every Casey Neistat-descended vlogger use jump cuts inside a single locked-off camera to compress pauses, breath gaps, and stumbles into a single energetic line. The technique is the L-cut and J-cut's faster cousin: it is what lets a six-minute monologue cut down to ninety seconds without feeling cut. Use it as the default for any seated, single-camera, talking-head shot.
The smash cut
A smash cut is an abrupt cut between two shots of starkly different content, often loud-to-quiet or chaos-to-calm. Edgar Wright and his editors (Paul Machliss, Jonathan Amos, Chris Dickens) have built much of their visual comedy on smash cuts that punctuate punchlines, and Tony Scott's Man on Fire (2004, edited by Christian Wagner) and Domino (2005, edited by William Goldenberg and Christian Wagner) use smash cuts to inject violence into otherwise still moments. Pietro Scalia, who won the Best Film Editing Oscar in 1992 (shared with Joe Hutshing) for JFK and again in 2002 for Black Hawk Down, has used smash cuts as structural devices, not just as punchline tools.
On the phone, the smash cut is the most effective comedic editing tool available. Hold a beat on the setup line, then smash-cut to the punchline visual on the snare hit of the audio. The cut works because the cognitive shift between the two shots is the joke. Wright's tradition ports to short-form one-to-one.
The whip pan and the swish pan as transitions
A whip pan ends one shot in fast horizontal motion (the camera whips left or right), and the next shot starts in matched fast motion, so that the cut is hidden inside the blur. A swish pan, as the term is used here (many editors treat the two names as interchangeable), is the same technique with less speed and more grace, typically inside a single take rather than across a cut. The film tradition includes Robert Yeoman's whip pans on Wes Anderson's films and Edgar Wright's whip-and-cut sequences in Hot Fuzz (2007).
On the phone, the whip-pan transition is the workhorse of travel-vlog editing. Sam Kolder's camera-through-the-floor whip pan, broken down at Rugged Road Trips, pairs a whip pan upward at the end of shot A with a whip pan downward at the start of shot B, so that the cut reads as if the camera passed through a ceiling or a floor between locations. The technique has been imitated to the point of cliché in short-form from 2023 to 2026. It still works when motivated by a genuine location change. It fails when used as visual filler on a single-location clip.
The morph cut, the talking-head cheat
The morph cut is a software-assisted transition introduced in Adobe Premiere Pro CC 2015 that smooths a jump cut inside a single locked-off interview by interpolating frames across the cut. The face appears to move continuously even though a chunk of time has been removed. The morph cut is the polite version of the jump cut and is the secret weapon of cleaned-up corporate interviews and YouTube long-form.
On the phone, the morph cut is the right tool when the talking-head footage has too many awkward pauses or stumbles to be cleaned by jump cuts alone, and when the editorial register is "calm and authoritative" rather than "energetic and casual." The jump-cut register is TikTok; the morph-cut register is LinkedIn. Both are valid. The decision is which register the script is asking for.
What the 9:16 Phone Frame Changes About Cutting
Walter Murch's framework was written for two-hour cinema, and Schoonmaker's, Smith's, and Walker's craft was honed on feature-length timelines. The phone clip runs fifteen to ninety seconds and the cut rate is two to four times faster. Three things change. Three things do not.
What does not change. First, the emotion-first ranking. The cut on the phone serves what the viewer should be feeling next, exactly as Murch describes it for cinema. Second, the audio bleed. The J-cut and L-cut land on the phone for the same reason they land in Dune. Conversation flows over picture; the picture catches up. Third, the match-cut principle. A shape that matches across a cut produces a hit of recognition the algorithm reads as engagement.
What does change. First, the time budget per cut. Lee Smith on Dunkirk (2017, Best Film Editing Oscar 2018, per the IndieWire interview on intercutting three timelines) had roughly 106 minutes of runtime to braid the one-week mole, one-day sea, and one-hour air timelines. The phone editor has thirty seconds. The cuts have to do more work per cut because there are fewer of them and each one has to land. Second, the audio environment. The phone is watched in commute mode, on speaker, with notifications competing for attention. The audio bed cannot rely on a Dolby Atmos theater room to carry subtlety; the cut has to be readable on a single tinny phone speaker. Third, the captions. The viewer often watches muted, with captions doing 60 to 80 percent of the comprehension work. The cut has to align with the caption beat, not just the audio beat.
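One way to act on the caption point is to nudge each picture cut onto the nearest caption start when the two are already close. The sketch below assumes caption start times are available as a list of seconds (for example, exported from a caption file); the `snap_to_caption` name and the 0.15-second tolerance are illustrative choices, not a standard.

```python
from typing import Sequence


def snap_to_caption(cut_s: float, caption_starts: Sequence[float],
                    tolerance_s: float = 0.15) -> float:
    """Move a cut onto the nearest caption start if one is within tolerance.

    Cuts that are not near any caption boundary are returned unchanged.
    """
    if not caption_starts:
        return cut_s
    nearest = min(caption_starts, key=lambda t: abs(t - cut_s))
    return nearest if abs(nearest - cut_s) <= tolerance_s else cut_s


captions = [0.0, 1.8, 3.5, 5.2]  # hypothetical caption start times in seconds
print(snap_to_caption(3.4, captions))  # 3.5
print(snap_to_caption(4.4, captions))  # 4.4
```

A cut that sits a tenth of a second off the caption change reads as sloppy to a muted viewer; a cut that sits nowhere near one is presumably landing on something else and is left alone.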
Across the short-form accounts I audit, the single most reliable upgrade on a slow-performing account is to add J-cuts to every dialogue boundary. Three frames of audio lead-in is enough to make a fifteen-second clip read as professionally edited rather than as raw recording. The accounts that adopt this change first usually see a one-to-three-second uplift in average watch time within the next ten clips. The accounts that resist it tend to be over-relying on whip-pan transitions to do the work that audio overlap should do. The same audio-driven discipline ports back into the camera-movement guide: the moves and the cuts have to agree about where the viewer's attention is going next.
Why Most Viral-Transition Tutorials Fail the Emotion Test
The genre of TikTok transition tutorials ("the camera-through-the-floor whip pan," "the impossible mirror reveal," "the door-handle wipe") is optimized for the lowest two of Murch's six criteria: the 5-percent two-dimensional plane and the 4-percent three-dimensional space. The tutorial teaches the audience how to make a clean visual transition between two physically impossible locations. The thing the tutorial does not teach is why the cut should happen at all.
The result, which I watch repeatedly in audits, is a clip that opens with an over-engineered transition (a slick whip-pan, a perfect masked wipe) into a script that has not earned the transition. The cut lands on the geometry; the emotion is empty. The viewer reads the empty gesture in the first second and swipes. A technically perfect transition is irrelevant to the viewer's nervous system, which is looking for the feeling the cut produced.
Schoonmaker's three-Oscar career is built on the inverse decision. The cuts in The Departed that everyone remembers are not the technically clever ones; they are the emotional ones. The cut from Costigan recognizing Sullivan to Sullivan's reaction, then to the silence before the elevator opens, is hard cuts on faces. There is no transition effect. The cut is just on the right frame of the right look. That cut is doing 51-percent work. Every short-form creator who wants their videos to move people, rather than to look slick, should be studying Schoonmaker's framing decisions, not the masked-wipe tutorials.
The substitute on the phone is not to drop transitions. It is to use the cut to land on the feeling. A jump cut on a punchline serves emotion. A J-cut on the next sentence of the story serves emotion. A match cut from "I closed the laptop" to "I closed the door of my old job" serves emotion. A whip-pan transition from "in San Francisco" to "in Tokyo" can serve emotion if the script is about the leap; it is empty if the script is about something else entirely.
When Transitions Hurt the Cut
There is a counter-perspective worth sitting with, which Sally Menke (Quentin Tarantino's editor from Reservoir Dogs in 1992 through Inglourious Basterds in 2009, before her death in 2010) embodied across two decades of cutting some of the most maximalist sequences in modern cinema. Tarantino's films are full of stylized moves: split-screen, freeze frame, chapter cards, on-screen text, voice-over interruption. Menke's job, on the published evidence of Tarantino's on-record dedications to her at the Sally Menke Memorial Lecture at Sundance and in interviews, was not to add more style; it was to know when to pull back. The non-linear structure of Pulp Fiction (1994) only works because the cuts inside each chapter are unobtrusive. The viewer rides the structural daring at the chapter level while the intra-chapter cuts stay invisible.
The applied lesson for short-form is that the transition vocabulary is a budget. Spend it where it earns its place; spend nothing where it does not. A clip with one match cut on the right beat is stronger than a clip with four whip-pans. The viewer reads four transitions as anxiety; the viewer reads one motivated transition as control. Pamela Martin's editing on King Richard (2021, Best Film Editing Oscar nomination), Battle of the Sexes (2017), and The Fighter (2010, Best Film Editing Oscar nomination) is similarly disciplined: the films are about families and tennis and boxing, and the cuts are about the human faces in those rooms, not about the geometry of the punches.
The rule reduces to: if you cannot defend the transition out loud in one sentence connected to the emotion, replace it with a hard cut.
What This Means for Your Next Edit
Before you commit the cut, run Murch's hierarchy as a three-question checklist. First, does the next shot land on the feeling the viewer should have right now? If yes, the cut is right. Second, if the answer to the first is unclear, does the next shot deliver story information the viewer needs to keep watching? If yes, the cut is right. Third, if both feel ambiguous, does the cut land on the rhythm the sequence has built? If yes, ship it. If all three answers are no, the cut is unmotivated, and you should either replace the next shot or remove the cut entirely.
For talking-head and commentary content, the default cut shape is the J-cut: audio of the next line begins three frames before the picture changes. For match cuts on hooks, the default is to find a shape, motion, or sound that bridges the two locations. For comedic punchlines, the default is the smash cut on the snare. For transitions between actual location changes, the whip pan earns its place once per clip, maximum. Everything else should be hard cuts on action. The pattern is austerity in transitions, generosity in story. Pair the cut decisions on this page with the framing decisions in the shot composition guide and the motion decisions in the camera movement guide; the three together cover the bulk of the craft surface a short-form clip touches. (If you want a second read on which cut shape your competitors use most often, the Superdirector brand-profile analysis surfaces the cut vocabulary repeated across a competitive set; one option among several.)
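The "once per clip, maximum" rule for whip pans is easy to check mechanically once the cuts in an edit are labeled. A rough sketch, assuming the editor tags each cut by hand; the labels, `MAX_PER_CLIP`, and `over_budget` are hypothetical names, and only the whip-pan ceiling comes from the defaults above.

```python
from collections import Counter

# Per-clip ceilings taken from the defaults above; types not listed are unlimited.
MAX_PER_CLIP = {"whip_pan": 1}


def over_budget(cuts: list) -> dict:
    """Return {cut_type: excess_count} for each type used more often than its ceiling."""
    counts = Counter(cuts)
    return {
        kind: counts[kind] - limit
        for kind, limit in MAX_PER_CLIP.items()
        if counts[kind] > limit
    }


# Hand-labeled cuts for one clip, in timeline order.
edit = ["hard", "j_cut", "whip_pan", "hard", "whip_pan", "smash", "whip_pan"]
print(over_budget(edit))  # {'whip_pan': 2}
```

An empty result means the clip is inside the budget. A non-empty one names the transitions to replace with hard cuts.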
The cuts your viewer remembers six months later will not be the slick ones. They will be the ones that landed on a feeling.
FAQ
What is Walter Murch's six-criteria cut hierarchy?
Per In the Blink of an Eye (Silman-James Press, 2nd revised edition 2001), the editor of Apocalypse Now, The Conversation, and The English Patient ranks the criteria a cut should serve in this order: emotion (51 percent), story (23 percent), rhythm (10 percent), eye trace (7 percent), two-dimensional plane of the screen (5 percent), three-dimensional space of action (4 percent). When criteria conflict, the higher-ranked one wins.
What is the difference between a J-cut and an L-cut?
A J-cut starts the audio of the next shot before the picture cuts; the viewer hears the new line first, then sees the new shot. An L-cut continues the audio of the previous shot over the next picture; the viewer keeps hearing the old line while seeing a new image. Joe Walker uses both aggressively on Denis Villeneuve's films, per the IndieWire Toolkit interview, so that conversations flow over picture cuts and the audience often does not consciously notice the cut at all. On phone-frame video, the J-cut is the highest-ROI single addition to talking-head editing.
Is the jump cut bad style?
Not anymore. It was a provocation in Jean-Luc Godard's Breathless (1960, edited by Cécile Decugis); it is the dominant cut shape in modern talking-head TikTok, YouTube commentary, and tutorial video. The jump cut compresses pauses and stumbles inside a single locked-off shot and is the right default for a seated, single-camera, fast-paced clip. The cousin tool when the register needs to read calmer is the morph cut, introduced in Adobe Premiere Pro CC 2015.
What makes a good match cut?
A shared shape, motion, or sound across the two shots, so that the viewer's brain reads the cut as a single continuous form. The most-cited example is the bone-to-satellite cut in 2001: A Space Odyssey (1968). The second is Anne V. Coates's match cut in Lawrence of Arabia (1962, Best Film Editing Oscar 1963, per the Motion Picture Editors Guild oral history), which goes from a blown-out match to a desert sunrise. The phone-frame version uses everyday shapes (a closing laptop, a lifting cup) that bridge two locations or two scales.
Should I use a dissolve or a fade on a TikTok or Reel?
Almost never as glue between shots. A two-second dissolve in a fifteen-second clip burns 13 percent of the runtime on a transition that is not earning attention. The exceptions are deliberate stylistic uses: a dissolve for a memory cutaway or a time-pass montage, a fade for a chapter-mark or a final-line resonance. The default phone cut is the hard cut. The decoration is the exception.
How fast should I cut on short-form video?
Faster than feature cinema but slower than tutorial advice suggests. Lee Smith's Dunkirk (2017, Best Film Editing Oscar 2018) braids three timelines across a feature runtime and the average shot length stays near three seconds inside the action sequences. The phone-frame parallel is a cut every two to four seconds for energetic content and every three to six seconds for thoughtful content. The number to chase is not the count of cuts per minute but whether every cut lands on a feeling, a story beat, or a rhythm beat. The cut that does none of those is the cut to remove.
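To turn those intervals into a shot list, divide the clip length by the interval. A small sketch with a hypothetical `shot_count_range` helper; it counts shots, so the number of cuts is one fewer than each figure.

```python
import math


def shot_count_range(clip_seconds: float, min_interval_s: float, max_interval_s: float):
    """Return (fewest, most) shots needed to fill the clip at the given cut intervals."""
    if min(clip_seconds, min_interval_s, max_interval_s) <= 0:
        raise ValueError("all arguments must be positive")
    fewest = math.ceil(clip_seconds / max_interval_s)
    most = math.ceil(clip_seconds / min_interval_s)
    return fewest, most


# A thirty-second clip, energetic pacing: a cut every two to four seconds.
print(shot_count_range(30, 2, 4))  # (8, 15)

# The same clip, thoughtful pacing: a cut every three to six seconds.
print(shot_count_range(30, 3, 6))  # (5, 10)
```

The range is a planning aid for the shoot, not a target for the edit; a shot that does not land on a feeling, a story beat, or a rhythm beat still comes out.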
Continue Learning
Understanding Beat Structure in Short-Form Video
Learn how to structure your videos using beats, the building blocks of narrative pacing that keep viewers watching until the end.
Camera Movement Techniques for Dynamic Content
Learn professional camera movement techniques that add energy and visual interest to your videos, from smooth pans to dynamic tracking shots.
Shot Composition for Short-Form Video
Composition principles from DPs who shoot real movies (Deakins, Yeoman, Fraser, Messerschmidt) re-anchored to the 9:16 phone frame, with safe-zone math, real creator examples, and decisions you can make on set.