If you were to step into a vaudeville house or storefront picture show in 1897, you would not simply be a quieter version of a modern moviegoer. You would be a different kind of participant altogether. The flickering images you saw, whether a train pulling into a station, workers exiting a factory, or a couple sharing a kiss, were novelties, spectacular in their mere existence. They were what film scholar Tom Gunning has famously termed a “Cinema of Attractions.” This cinema didn’t tell complex stories; it exhibited. It confronted the viewer directly, much like a magic trick or an amusement park ride, prioritizing showmanship over narrative.
The journey from this cinema of direct address to the narrative-absorbed, psychologically immersive experience we know today was not a foregone conclusion. It was a revolutionary transformation, not just of technology, but of human perception. The first decades of cinema were a grand, unconscious pedagogical experiment, a period where filmmakers and audiences collaboratively had to learn, invent, and internalize a new visual language. The spectator, as we understand the role, had to be born.
This post will explore that education of the gaze. We will move from the confrontational shock of the early “actuality” film to the sophisticated narrative grammar pioneered by D.W. Griffith, examining how a series of technological and formal innovations slowly trained audiences to read images, follow stories, and ultimately, lose themselves in the dream of the screen.
The “Cinema of Attractions”: A World of Views, Not Stories
In its infancy, cinema was an extension of the 19th-century visual culture of spectacle. It shared DNA with the magic lantern show, the diorama, the kaleidoscope, and the carnival midway. The primary goal was not to tell a story but to present a view, a fascinating event, or a documented reality.
The Lumière Brothers: The World as Document
The films of Auguste and Louis Lumière are the purest examples of this. Workers Leaving the Lumière Factory (1895) is precisely what its title promises. The Arrival of a Train at La Ciotat (1896) is arguably the most famous film of this era, not for its narrative, but for its visceral impact. The legend of audiences ducking in terror as the train steamed toward the camera is likely apocryphal, but it perfectly encapsulates the nature of this early cinema: it was an attraction that directly addressed the viewer, creating a sense of immediate, present-tense spectacle.
These films were “actualities”—short slices of life. There was no character development, no plot, and crucially, no editing within the scene. The camera was a static observer, capturing an event in a single, continuous take, replicating the perspective of a theatergoer watching a staged scene. The audience’s role was to witness, to marvel at the technological miracle of captured motion.
Georges Méliès: The World as Illusionistic Stage
If the Lumières showed the world as it was, Georges Méliès showed the world as it could be. A magician by trade, Méliès saw in cinema the ultimate tool for illusion. In films like A Trip to the Moon (1902), he used stop-motion substitutions, multiple exposures, and elaborate painted sets to create a cinema of magical trickery. Yet, for all his narrative ambition, Méliès’s approach was still fundamentally that of the “Cinema of Attractions.”
His camera remained static, placed as if in the best seat of a theater. The magic happened within this single frame. The edits were not used to build psychological continuity but to create a magical effect—a person vanishing, a skeleton appearing. The audience was not asked to follow a character’s emotional journey, but to be delighted by the succession of wonders presented directly to them. The showman, Méliès himself in many cases, was never far from the surface, winking at the audience through the spectacle.
At this stage, the spectator was a witness to a demonstration. They were being trained in the basic grammar of the medium: understanding that a two-dimensional, black-and-white image represented a three-dimensional world, and accepting the temporal compression of real events. But the deeper language of narrative cinema was still to be invented.
The American School: Forging a Narrative Grammar
The shift from attraction to narrative required a new set of cinematic tools. This development happened most systematically in the United States, driven by commercial pressures to regularize production and keep audiences coming back. Key figures like Edwin S. Porter began to experiment with how shots could be combined to create meaning beyond a single viewpoint.
Edwin S. Porter and the Editorial Link
Porter’s The Great Train Robbery (1903) is a landmark in this transition. It’s a film that still contains elements of attraction—the famous close-up of outlaw Justus Barnes firing his gun at the camera could be shown at the beginning or end of the film as a standalone thrill. However, its true innovation lies in its editing.
Porter assembled the film from 14 separate shots, creating a narrative that moved across time and space. We see the robbery at the telegraph office, then the robbery of the train itself, then the bandits’ escape. This was not yet the “continuity editing” we know today; it was a series of tableaux, each a self-contained scene. The connection between them was largely logistical and chronological. The audience had to perform the cognitive work of understanding that these disjointed spaces were part of a single, linear story.
Porter’s genius was in realizing that filmic meaning could be constructed in the relationship between shots. He discovered, perhaps intuitively, the power of what later Soviet theorists would call “linkage.” In The Life of an American Fireman (1903), he even experimented with parallel action, showing a fireman racing to a rescue both inside and outside a burning building, though the temporal overlap is awkward by modern standards. The spectator was now learning to be a linker of scenes, an assembler of a fragmented narrative.
D.W. Griffith: The Architect of the Modern Film Language
If Porter provided the nouns and verbs of cinematic narrative, it was D.W. Griffith who, between 1908 and 1915, systematized the syntax. Working at the Biograph Company, and driven by an ambition to elevate cinema to the level of literature and history, Griffith (in collaboration with his cameraman G.W. “Billy” Bitzer and a talented stock company of actors) synthesized and popularized a suite of techniques that fundamentally reoriented the spectator’s relationship to the story.
The Close-Up: From Face to Psyche
The most revolutionary of these techniques was the analytical close-up. In the cinema of attractions, the human figure was part of the spectacle. Griffith turned the face into a landscape of the soul. By cutting from a medium shot to a close-up of an actor’s face—Lillian Gish’s terrified eyes, Henry B. Walthall’s noble anguish—Griffith shifted the cinematic focus from external action to internal emotion.
This was a profound change. It asked the audience to stop looking at a character and start looking into them. The close-up created psychological intimacy, privileging individual experience over collective spectacle. The spectator was no longer a witness to an event, but a confidant to a feeling. They were learning to read micro-expressions, to invest emotionally in the private turmoil of a character, a skill that is second nature to us now.
Cross-Cutting: The Engine of Suspense
Griffith’s masterful use of cross-cutting (or parallel editing) was perhaps his most significant contribution to film grammar. By cutting back and forth between two or more simultaneous lines of action, he created a new, purely cinematic form of suspense. The climax of The Lonedale Operator (1911), where he cuts between a besieged telegraph operator and the train rushing to her rescue, is a textbook example.
This technique required a massive cognitive leap from the audience. They had to hold multiple narrative threads in their mind simultaneously and understand their temporal convergence. Crucially, cross-cutting creates a sense of omniscience in the spectator. They know more than any single character on screen. This active, god-like position—piecing together the separate strands of the narrative—is the default mode for the modern viewer. Griffith taught us how to feel the tension of “in the nick of time.”
Griffith combined these techniques with others, like the iris-in/iris-out (to focus attention) and more fluid camera movement, to develop what would become the foundation of the “Classical Hollywood Style.” The goal of this style was, and remains, invisibility. The techniques are not meant to be noticed as techniques; they are meant to serve the story seamlessly, pulling the viewer deeper into the narrative world rather than calling attention to the mechanics of its construction.
The “Invisible” Style and the Ascendancy of Narrative
In a Griffith film, the edit is not a magical trick (Méliès) or a simple scene change (Porter); it is a psychological and narrative cue. A close-up tells us what to feel. A cross-cut tells us what to anticipate. The camera placement guides our attention and allegiance. The spectator, educated by this new, comprehensive system, could now relax into the story. They didn’t have to work to understand the basic rules; the rules operated subliminally, allowing them to invest their energy in empathy, suspense, and catharsis.
Griffith’s The Birth of a Nation (1915), for all its abhorrent racism, stands as the ultimate expression of this new language. It is a sprawling epic that uses every tool in the Griffith arsenal to tell a story, manipulating the audience’s emotions with a sophistication previously unimaginable. It demonstrated, conclusively, that cinema’s power lay not in its ability to record reality, but in its ability to construct a persuasive, emotionally compelling fictional one.
The Spectator’s New Role: From Collective Gaze to Individual Immersion
This transformation in film form precipitated a parallel transformation in the nature of spectatorship and the architecture of movie-going.
The Nickelodeon vs. The Picture Palace
The early nickelodeon was a cramped, often raucous space. The films were short, the atmosphere was communal, and the spectacle was often supplemented by a live lecturer (the benshi in Japan being a famous example) who would narrate the action and speak the characters’ lines. The audience’s gaze was still partly external, shared with the crowd and mediated by a live performer.
As films grew longer and narratives more complex, the role of the lecturer diminished. The film itself was now doing the narrative work. This shift was cemented by the rise of the “Picture Palace” in the 1910s and 1920s. These grandiose theaters, with their opulent décor and orchestra pits, were cathedrals of a new secular religion. They encouraged a different kind of viewing: silent, reverent, and individual. In the dark, surrounded by a grandeur that directed all attention to the glowing screen, the spectator was meant to be absorbed, to lose themselves. The collective gaze of the attraction-era audience gave way to the individual, immersive experience of the narrative-film viewer.
The Kuleshov Effect: Proof of Concept
The final piece of evidence for this newly trained perception came from a filmmaker who doubled as a theorist. In the early 1920s, Soviet filmmaker Lev Kuleshov conducted a now-legendary experiment. He shot a single, neutral close-up of the actor Ivan Mozhukhin. He then intercut this identical shot with three different images: a bowl of soup, a dead woman in a coffin, and a child playing. Audiences, shown the sequences, praised Mozhukhin’s performance: his hunger, his grief, his joy.
The Kuleshov Effect demonstrated conclusively what Griffith had already intuited: that meaning in cinema is created not in the shot itself, but in the juxtaposition of shots, in the mental bridge the spectator builds between them. The audience had become an active, if unconscious, collaborator in creating the film’s emotional and narrative meaning. They were now fluent in the language of cinema.
Conclusion: The Legacy of the First Gaze
The silent era was not a primitive prelude to the “real” cinema of sound. It was the foundational period in which the very contract of viewing was written. The journey from the Lumière train to the climax of a Griffith epic represents one of the most significant shifts in the history of human sensory perception. We learned to parse rapid edits, to read the language of the face, to hold parallel narratives in mind, and to emotionally invest in shadows on a wall.
When we sit in a theater today, or scroll through our feeds, we are the heirs to this revolution. The grammar feels natural, invisible, but it is a learned one. Every time we lean in during a close-up, gasp at a cross-cut climax, or feel a surge of emotion at a perfectly timed reaction shot, we are exercising skills that were taught to us over a century ago in the flickering nickelodeons and grand picture palaces. The first spectators were students of a new art; we are now its fluent native speakers, our perception forever shaped by the birth of the cinematic gaze.
