The Video Essay as Cumulative and Recursive Scholarship

Suppose you wanted to make a video essay around ten minutes long. You might go about it in two different ways. First, you might pick a bunch of different video clips and show each example one time. Second, you might pick a single video clip and show that one example several times. The end result will be ten minutes long in both cases, but you arrive at those ten minutes by a different path. The first approach is cumulative: you fill out the video by accumulating different examples without repeating them. The second approach is recursive: you fill out the video by repeating the same example, not by adding new ones. Admittedly, the cumulative/recursive distinction is a blunt one, and many (perhaps most) videos work between these two extremes. But the distinction seems valuable for several reasons: as a guide to noticing affinities and contrasts across a range of videos, as an indication of some of the different rhetorical problems that a video essayist might face, and as a way of thinking about how a video essay may draw on or depart from existing traditions in written scholarship.


Works that operate primarily in the cumulative mode include Dissolves of Passion by Catherine Grant, Minnelli Red by Carlos Valladares, Kim Novak: A Profile Piece by Claire Steinman,[1] Sound in Hanna-Barbera by Patrick Sullivan, and Gilmore Girls Hair: Unraveling Rory’s Locks, by Screenprism.[2] These videos assemble a remarkably wide range of elements: respectively, all 64 of the dissolves in Brief Encounter (1945); clips featuring various shades of red, drawn from sixteen different films directed by Vincente Minnelli; nearly two dozen scenes from Vertigo depicting Kim Novak in profile; audio and video clips from twelve separate Hanna-Barbera programs; and examples of Rory Gilmore’s hairstyles across all seven seasons of Gilmore Girls, plus the reunion special.[3] Note that the cumulative mode cuts across other useful distinctions that scholars have drawn to make sense of the videographic field, such as the distinction between the explanatory and the poetic.[4] For instance, Screenprism’s cumulative Gilmore Girls video contains the explanatory voice-over that is typical of the pedagogical demonstration, while Steinman’s cumulative Kim Novak essay features the rhythmic editing typical of the cine-poem.[5] The cumulative mode may be completist, as in Grant’s Dissolves of Passion, which works through every single example in order; or it may be selective, as in Valladares’s Minnelli Red, which uses its wide range of examples to represent the even wider range of possibilities one might find in Minnelli’s films.

Although it has roots in the fan-made supercut, the cumulative video often draws on established traditions in written scholarship, where certain arguments demand a long list of examples to be fully convincing, as in genre studies that draw examples from dozens of films. For comparison, consider two classic pieces of written criticism. In a book chapter on the functions of dialogue in narrative film, Sarah Kozloff makes the case that dialogue serves the function of character revelation by quoting three script passages (The Fugitive [1993], Shadow of a Doubt [1943], and Tootsie [1982]) and making a handful of short supporting references.[6] Charles Ramírez Berg, in a book passage arguing that Emilio Fernández and cinematographer Gabriel Figueroa persistently favored low-angle compositions, provides captioned illustrations from La Perla (1947), María Candelaria (1944), Flor Silvestre (1943), Río Escondido (1948), and Enamorada (1946).[7] In both cases, it is the accumulation of diverse examples that makes the argument convincing. A single example would not prove the point that character revelation is a commonplace function; nor would one citation prove the point that the low-angle composition is a recurring technique. Audiovisual scholarship in the cumulative mode often works in the same way, presenting a wealth of examples to provide compelling support for a generalized claim. At one point in the Hanna-Barbera video, Sullivan uses a split-screen effect to introduce twelve separate examples, all in support of his larger claim that sound effects combine with a visibly shaking image to give life to off-screen crashes, with the important implication that this function is fully consistent with the life-giving aims of animation. 
The members of Screenprism support their argument that Rory’s changing hair offers insights about her changing character by offering quick analyses of nearly a dozen hairstyles: long and simple, an up-do, the high-society bun, the retro look, long curls, girl braids, a long bob, tight curls, wild-child bangs, pink dye, and a professional side pony.

The cumulative-recursive distinction also cuts across the sometimes-fuzzy distinction between the scholarly video and the popular video; some of my examples are scholarly, some are popular, and some are both. At first glance, Screenprism’s Gilmore Girls video seems to belong firmly on the popular side of the spectrum. It was posted to YouTube (where it has received nearly 300,000 views), and its wit and pace suggest that it was aimed at fellow fans, not media scholars. However, the video does not just assemble a range of clips for the purposes of entertainment. It actively interprets those clips, finding layers of thematic meaning in the changing patterns. In so doing, it accomplishes the familiar scholarly goal of interpretation, while treating that goal as one purpose among others. Does this mean that interpretation is what separates a scholarly video from a popular one? Not necessarily. Sullivan’s video is certainly scholarly, but, in my view, it is more concerned with theory than interpretation. The video argues that we must take sound into account when we evaluate a work of animation; a video may appear to be poorly animated when we experience it as a silent image, but it may create a more vivid impression of life and movement when we experience the image and sound together. In comparison to the Gilmore Girls essay, this insight tells us little about characterization and theme, but it offers a fresh way of thinking about longstanding issues in animation theory. Rather than draw a sharp line splitting videos into popular and scholarly camps, my instinct is to use the term scholarship to refer to a cluster of activities (researching, arguing, theorizing, interpreting, evaluating, and more), which do not necessarily have any one thing in common. Both of these videos are doing many things at once, some of which are recognizably scholarly, and some of which are not.[8]

Either way, these videos show how the accumulation of examples can advance an argument, even when the argument is never stated in so many words. Whereas writers often state their key points clearly and up front, as in a traditional thesis statement, many audiovisual critics prefer to leave central claims implicit; they compile and organize the examples in such a way that the viewer must infer the argument. In just two minutes of screen time, Steinman manages to establish that the profile shot is a recurring motif in Vertigo, that it is tied to Madeleine/Judy (Kim Novak) specifically, and that it is often linked to the gaze of Scottie, and she does all this without the benefit of words. A written essay would struggle to get through so many examples in two pages, or even in ten. Other cumulative videos rely on words more extensively: Screenprism and Valladares use voice-overs, while Sullivan favors onscreen text. But the essayists’ handling of audiovisual form still does much of the argumentative work. For instance, Valladares makes sense of Minnelli’s color palette by clustering his examples into four categories: red as melancholy, red as love, red as panic, and red as red. Although he makes these claims in the voice-over and reinforces them with onscreen text, the crucial work here lies in the editing: in the selection and organization of clips, which meaningfully interprets each clip by grouping it into a particular category. At one point, Valladares juxtaposes the “I Remember It Well” number from Gigi (1958) with “The Party’s Over” from Bells Are Ringing (1960). One could compare the scenes on a number of levels, but the precise placement of the clips within Valladares’s essay brings one specific aspect to the fore: the way that the color red comes to express the melancholy tone of each song. 
This aspect becomes salient precisely because the two clips join two larger accumulations: a cluster of clips involving red in general and a sub-cluster involving melancholy in particular. The clustering furthers the goal of interpretation, but with a twist. Valladares proposes that the color red has at least three possible emotional meanings in Minnelli’s films: melancholy, love, and panic. The crucial twist comes in the section on “red as red,” which daringly swerves against such interpretations by arguing that the color has a sensory appeal beyond whatever emotional meaning it might add to a given film’s story.

Immediately after this example, Valladares summarizes his argument so far by repeating three recently shown clips: Lust for Life (1956), Bells Are Ringing, and Meet Me in St. Louis (1944). This brief passage is simultaneously cumulative (stringing together three short clips) and recursive (repeating key moments from clips we have already seen). Clearly, the two modes are not mutually exclusive, and the essayist may shift between them depending on the needs of the argument. Adding another layer of complication, the video essayist must decide whether to present the clips simultaneously or sequentially—that is, via split screen or via an unfolding timeline. In her Vertigo essay, Steinman favors sequential presentation, moving through the examples one by one, building up to a rapidly edited passage where eight clips appear in less than eight seconds, perfectly timed to the music. By contrast, in the previously mentioned off-screen crash sequence, Sullivan gradually puts twelve Hanna-Barbera examples onscreen one by one, until they fill the entire screen. Although I found myself persuaded after the first three or four, seeing all twelve together clinched the case, while providing the opportunity to recognize how many variations there are on the basic pattern.


Works that are partly or wholly in the recursive mode include Gentlemen Prefer Blondes (remix remixed 2013), by Laura Mulvey; Jacques Tati’s Play Time—How to Make a [Critical] Joke, by Miklós Kiss; Opening Choices: Notorious, by John Gibbs and Douglas Pye; Un/Contained by Catherine Grant; and Variations on a Scene by Davide Rapp. Replaying is the defining tactic of the recursive mode. The videos by Mulvey and Rapp are unusually pure examples, presenting a single clip multiple times while deforming it in ways that ask viewers to experience the clip anew each time. Mulvey’s video shows a clip from Gentlemen Prefer Blondes (1953) at different speeds and rhythms five or more times (depending on how one counts); Rapp’s video shows a clip from Mario Bava’s Kill, Baby, Kill (Operazione Paura, 1966) and then repeats the same clip in five different variations, including one where the clip has been rendered as a Moebius strip. Even videos that are not recursive overall may employ recursive strategies at key moments, as in my other three examples. Kiss’s essay brings together several clips and slides from Tati’s film (in the cumulative mode), but the heart of the video is a close analysis of a single joke showing Giffard (Georges Montant) walking into a glass door. After an initial presentation of the joke, Kiss rewinds the clip four times to make four distinct points about it. Similarly, Gibbs and Pye introduce an eighteen-minute video with a four-minute segment showing the opening scene of Notorious (1946), or portions of it, several times. The remainder of the essay shifts to a more cumulative approach, assembling clips from elsewhere in Hitchcock’s film to explain how the opening has established key themes that the rest of the film will explore. The recursive mode often focuses on a very small fragment and analyzes it closely. Grant examines a four-second shot of a broken window from Fish Tank (2009). 
By my count, the video shows this cryptic image seven times over the course of five minutes; additionally, we sometimes hear the clip without seeing it. Along the way, Grant juxtaposes the clip with several other materials, including relevant scenes from elsewhere in the film and quotations from scholarly sources.

Like the cumulative mode, the recursive mode may be explanatory or poetic, and it may present its clips simultaneously or sequentially. Also like the cumulative mode, the recursive approach has some notable precursors in written scholarship. For instance, Michel Chion opens his book Audio-Vision: Sound on Screen with a description of the opening sequence of Bergman’s Persona (1966); he then describes the sequence again as it might appear without the sound, thereby demonstrating the value that sound has added to the work.[9] Significantly, Chion’s written account assumes that the reader has access to film or video technology, at least in an imaginary sense; he even asks readers to join him in rewinding the film. Other forms of written recursion are less explicit in their invocation of audiovisual technologies, but they “replay” the scene nonetheless. When Douglas Pye analyzes Lisa’s entrance in Rear Window (1954), he quotes several lines of dialogue, as they might appear in a screenplay. Then he goes through the conversation again, quoting some of the same lines but pairing them with richer descriptions of Grace Kelly’s movements as she walks through the scene.[10] Though Pye’s essay was published in 2010, its effect is strikingly similar to that of a video essay that plays the scene all the way through a single time and then replays the same scene, starting and stopping to highlight significant moments. For both Chion and Pye, the scholarly goal is to push the reader to notice nuances that are easy to miss precisely because the scene in question has become so famous and familiar.

As we have seen, cumulative scholarship (whether written or audiovisual) might contribute to a number of different scholarly programs: genre studies, national cinema studies, auteur studies, and more. At first glance, the recursive technique seems to have a more specific affinity for one particular method: close reading or analysis. The scholar who practices close analysis in written criticism typically examines one film at a time and sometimes homes in on a particular scene or shot for special attention. Similarly, all five of the videos listed above focus on a single film, and the recursive passages (by definition) repeat a single clip. However, it should be noted that the practice of close analysis can be surprisingly flexible, intersecting with many scholarly approaches. Close analysis has a long history in film studies, stretching back to include works of auteur criticism (e.g. Robin Wood’s book on Hitchcock, originally published in 1965), structural analysis (e.g. Raymond Bellour’s 1973 analysis of a 12-shot sequence in The Big Sleep [1946]), and neoformalist criticism (e.g. Kristin Thompson’s 1988 chapter on Late Spring [1949]).[11] Continuing and revising this tradition, recent scholars have employed close analysis to develop arguments about many subjects, from ideology and technology to cinephilia and philosophy.[12]

Reviewing some celebrated examples of close analysis, I initially supposed that books and essays practicing the method would rely on recursion extensively, describing the same scene in different passages of prose. However, that is not always the case: many great works of close analysis take the reader through the scene by describing its sounds or images in order, creating an argument that proceeds step-by-step rather than back-and-forth. For instance, Mary Ann Doane’s classic analysis of the home-movie scene in Rebecca (1940) moves systematically from the beginning of the scene to the end. Along the way, Doane makes connections with other scenes and draws a comparison with a similar sequence from Caught (1949), but the underlying structure of the passage remains chronological.[13] Similarly, Douglas Pye’s close analysis of Distant Voices, Still Lives (1988) uses prose to create a vivid impression of moving through the scene moment by moment, as if we were watching the scene one time with a remote control: “There is now a cut to inside the house…. The radio sound has now become inaudible…. An unaccompanied female voice now starts singing.”[14] Indeed, recursion in the strong sense of the word—describing the same exact moment again and again—arguably seems out of place in written prose, where craft norms advise against redundancy.

By contrast, audiovisual criticism has allowed scholars to take full advantage of recursion as an analytical technique. Indeed, the format seems particularly well suited to the goal of bringing easy-to-miss details to the surface. At the sensory level, movies can be quite dense, with layers of intricacy from foreground to background, all subject to change moment by moment. By playing the same clip repeatedly, a recursive passage gives the viewer the opportunity to notice complications, felicities, and contradictions. Editing software empowers the scholar to bring these nuances to the fore. For instance, Kiss lists four distinct points about the glass-door joke in Play Time, and he clarifies each point using a different technique: slow motion (to show how Tati stages distracting action in the foreground), red and blue circles (to illustrate the similarity between the door handle and the briefcases), blue lines (to highlight the difficult-to-see outline of the glass door), and a moving red line (to show how the door’s outline overlaps the outline of a distant building). Meanwhile, a precisely timed voice-over connects these details to the larger argument. Even in a well-illustrated book, it would be difficult to convey these points so clearly. Grant’s essay on Fish Tank uses no voice-over, instead relying on montage and onscreen text to make its points. The result is a brilliant demonstration of the scholarly power of juxtaposition. At first, the central clip of the cracked window appears by itself, following a text slide discussing Grant’s affective response. Later, the clip is shown in its original context from Fish Tank, where it appears as an unexpected cutaway during a scene of the protagonist weeping. The next few iterations (including one with audio but no picture) emphasize particular formal features, such as camera movement and sound design.
Eventually, the video arrives at its culminating juxtaposition, using a split screen to compare the cracked-window shot with a similar image from earlier in the film. While the video’s onscreen text does important work by situating the scene within the context of psychoanalytic theory, much of the argument is carried by the montage, which uncovers layers of meaning in this four-second shot by presenting it in so many different ways: by itself, within its original context, with or without sound, or in comparison to an earlier scene. Again, such interpretive recontextualization is a tactic with precedent in written criticism, but it carries special force in videographic form. Grant has described her video as a “dense yet concise study (and experience) of the intricate poetic-cinematic patterning” of Arnold’s film; the video allows viewers to experience the clip and reflect on their experiences.[15] Whenever I watch this video, I get the sense that I have learned some new ways of thinking about the clip, but I also get the sense that it remains deeply mysterious. In other words, the repetitions explain how the shot of the broken window fits into several patterns in the film, without diminishing the shot’s disturbing power.

It is quite common for video makers to shift from mode to mode, depending on the creative and rhetorical needs of the project at hand. My own work has switched between modes over time. My first video was firmly in the cumulative mode; more recently, I have produced close analyses of individual scenes, and my editing has shifted toward the recursive mode. In the process, I have found myself confronting a whole new set of problems that were less salient in the cumulative mode: How many times should I show the scene? Should the repetitions be consecutive or spaced far apart? Must each repetition feature its own technical modification? At first, the problems seemed to be more creative than scholarly—a matter of making sure no one got bored. But watching these other essays has shown me that the most successful recursive videos treat these problems as scholarly problems, as well—as matters of interpretation, of analysis, of argument, and, ideally, of insight.

Patrick Keating is a Professor in the Department of Communication at Trinity University, where he teaches film studies and video production. He is the author of The Dynamic Frame: Camera Movement in Classical Hollywood. His videos have appeared at [in]Transition, Movie: A Journal of Film Criticism, and NECSUS: European Journal of Media Studies.

1. I am proud to report that Steinman made the essay in my class. So far, few people have seen it. I hope that readers will take the time to watch this extraordinary two-minute cine-poem.

2. The members of Screenprism include Susannah B. McCullough, Naina Lee, and Leigh Raper.

3. I take the number 64 from Catherine Grant, “Dissolves of Passion: Materially Thinking through Editing in Videographic Criticism,” in The Videographic Essay: Practice and Pedagogy, ed. Christian Keathley, Jason Mittell, and Catherine Grant, 2019. Some of the other numbers are estimates; it can be difficult to arrive at an exact figure because certain sequences draw multiple examples from individual scenes.

4. On the distinction between the explanatory and the poetic, see Christian Keathley, “La caméra-stylo: Notes on Video Criticism and Cinephilia,” in The Language and Style of Film Criticism, ed. Alex Clayton and Andrew Klevan (London: Routledge, 2012), 181.

5. Cristina Álvarez López and Adrian Martin use the term “pedagogical demonstration” to refer to videos that follow the format of an “illustrated lecture.” They contrast this category with the “cine-poem.” See “The One and the Many: Making Sense of Montage in the Audiovisual Essay,” in The Audiovisual Essay: Practice and Theory of Videographic Film and Moving Image Studies, September 2014.

6. Sarah Kozloff, Overhearing Film Dialogue (Berkeley: University of California Press, 2000), 43-47.

7. Charles Ramírez Berg, The Classical Mexican Cinema: The Poetics of the Exceptional Golden Age Films (Austin: The University of Texas Press, 2015), 109-112.

8. One could say the same thing about much written criticism.

9. Michel Chion, Audio-Vision: Sound on Screen, second ed., trans. Claudia Gorbman (New York: Columbia University Press, 2019), 3-4.

10. Douglas Pye, “Enter Lisa: Rear Window (1954),” in Film Moments: Criticism, History, Theory, ed. Tom Brown and James Walters (London: British Film Institute, 2010), 45.

11. Robin Wood, Hitchcock’s Films Revisited, revised ed. (New York: Columbia University Press, 2002); Raymond Bellour, “The Obvious and the Code,” trans. Diana Matias, in The Analysis of Film, ed. Constance Penley (Bloomington: Indiana University Press, 2000), 69-76; and Kristin Thompson, “Late Spring and Ozu’s Unreasonable Style,” in Breaking the Glass Armor: Neoformalist Film Analysis (Princeton: Princeton University Press, 1988), 317-352. Wood’s volume incorporates the chapters from his first volume on Hitchcock, originally published in 1965.

12. To cite two among many possible examples, consider Rashna Wadia Richards, Cinematic Flashes: Cinephilia and Classical Hollywood (Bloomington: Indiana University Press, 2013) and Donna Kornhaber, Wes Anderson (Urbana: University of Illinois Press, 2017), which use insightful passages of close analysis in strikingly different ways.

13. Mary Ann Doane, The Desire to Desire: The Woman’s Film of the 1940s (Bloomington: Indiana University Press, 1987), 163-168.

14. Douglas Pye, “Movies and Tone,” in Close Up 02, ed. John Gibbs and Douglas Pye (London: Wallflower Press, 2007), 23-24.

15. Catherine Grant, “Beyond Tautology? Audiovisual Film Criticism,” Film Criticism 40, no. 1 (2016).