IRIS Wiki - Narrative Theories - Mood-Cue


Greg M. Smith's (2003) Mood-Cue approach to the analysis of filmic emotion follows an aesthetic perspective on the mechanisms of effect at work in cinematic experience, providing a noteworthy outgrowth of, and interesting alternative to, individual- and character-focused cognitivist theories (e.g. Anderson 1996, Bordwell 1985, Carroll 1990, Currie 1995, Smith 1995, Branigan 1992, Grodal 1997, Tan 1996, 2008; Ortony 2003). It thereby illustrates viable means to overcome the traditionally dominant schemas of motive, action, and goals (as exemplified in Ed Tan's analytic focus on character structures such as empathy, sympathy, admiration, and compassion, and on thematic action and plot structures) by paying as close attention to style as to character. Indeed, Smith proposes the development of a functional understanding of how style should be coordinated with narration as the key to filmic emotion (Smith 2003, p. 109).

The Scenario Domain Addressed

The scenario domain (cf. Rank & Petta, 2006) of Greg Smith's effort comprises the narrative film and the viewer, in the Western (Hollywood) cultural embedding characterised by the clarity of emotional appeal (Smith 2003, p. 96). Narrative "film" as the domain of this effort is understood to be a highly coordinated visual and auditory medium for storytelling in uninterrupted real time, with encultured conventional strategies shared by producers and consumers.

Accordingly, the availability of specific prior knowledge on the part of the individual viewer is explicitly stated as a required component of the target scenario. Further, the potential contribution of the film within the scenario setting is clearly delimited: invitations to feel offered by films are not necessarily accepted by their audiences. Matching the desired coverage of both local and global levels of effect (see below), the approach holds that a film typically extends multiple such invitations, over multiple and possibly overlapping durations.

While avant-garde and non-narrative films are explicitly excluded from the scope of the present theory and method, the potential relevance of the approach for other audiovisual narrative media unfolding in uninterrupted time is also explicitly asserted (ibid., Note 5 to chapter 1, p. 195).

Aims: Desiderata for an approach to filmic emotions

Greg Smith argues for the importance of laying out a scientific complement to the purely commonsense and empirically based understanding of emotions, i.e., a theory of emotions, as a necessary basis for the development of a principled and practical methodological approach to the analysis of filmic emotions. Further desiderata identified for such an approach are:

  • to lead towards the particulars of a film and provide specific explanations for how particular films elicit emotions;
  • to provide terminology suitable for discussing emotions and their elicitation, i.e., to label emotional states (with some measure of certainty) and to discuss film structures encouraging such responses;
  • to cover emotional phenomena at both local and global levels, i.e., to explain processes operating across whole films as well as micro processes governing individual scenes, and to explain how these process classes interoperate;
  • to discuss both emotional states and their dynamic evolution;
  • to explain why films can elicit dependable reactions across a broad range of audiences without denying individual variability in affective reactions;
  • to explain emotion in a wide range of films across multiple genres;
  • to cover a wide range of cinematic devices (accordingly, the specific approach developed emphasises the importance of cinematic style in encouraging emotional responses);
  • to explain why certain narrative structures are more effective than others in cuing emotion, i.e., explaining not only successes but also failures of specific films to elicit emotions;
  • to generate specific questions for future research.

A key motivation for delineating a dedicated supporting emotion theory for the present purposes stems from the need to make a careful selection of useful contributions from the vast range of available research scattered across disciplines, to identify questions of high relevance that have been researched little (or not at all), and to develop an integrated, complete picture. In particular, the declared aim of this theory is not a discussion of any specific reified emotion, but rather to identify the foundational structures at work in emotion episodes and to describe how they interact across modalities. The focus of the theory lies with the coordination of information from different subsystems (i.e., understanding sound, music, facial expressions, camera framing, etc.).

The two main research questions addressed regard the structure of the emotion system, and identifying specific characteristics of filmic structures particularly suited for exploiting this structure.

The Structure of the Emotion System

In his emotion theory, Greg Smith conceptualises emotions as groups of responses to several potential eliciting systems, comprising in particular: facial nerves and muscles; vocalisation; body posture and skeletal muscles; the autonomic nervous system; conscious cognition; and the nonconscious central nervous system. Out of all of these systems, only the last (i.e., the limbic system with the amygdala) has been scientifically shown to play a necessary role in emotion, but even this one is by itself not considered to be sufficient to cause emotion without any further contributions from some of the others. The overall implication of these circumstances is that the model must allow for multiple causes of experiencing emotion without prescribing any specific order of the contributions of the eliciting systems involved, given that no single one of the possible causes needs to be present for any particular emotion instance.

This criterion is met by the use of an associative network model as an approximation of the emotion system. The strength and clarity of an experienced emotion instance (which, as already indicated, is not to be equated with socially stereotyped "modal emotions", to borrow Klaus Scherer's terminology) is represented as the activation of a network node. The activation of this emotion node correlates directly with (primarily) the breadth and (secondarily) the intensity of the activated components feeding it. For the network, basic thresholds are defined in terms of the number of feeds from different supporting subsystems and the overall cuing intensity; both must be exceeded for an emotion node to be activated. By requiring redundant cuing to elicit forceful emotions, the model balances flexibility and dependability. This approach also meets the desideratum of explaining both the consistency of responses across audiences (under the assumption of a broad sharing of encultured network configurations throughout the population) and the variance of responses across subjects (resulting from differences in the specifics of the makeup of these associative emotion networks and in the sensitivities to the contributions of individual subsystems).
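The threshold behaviour described above can be illustrated with a minimal sketch. It is not taken from Smith (2003): the class name, the subsystem labels, the threshold values, and the activation formula are all hypothetical, chosen only so that breadth of cuing dominates and intensity acts as a secondary contribution.

```python
from dataclasses import dataclass


@dataclass
class EmotionNode:
    """Toy emotion node; all parameter values are illustrative assumptions."""
    name: str
    min_feeds: int = 2          # hypothetical breadth threshold (number of subsystems)
    min_intensity: float = 1.0  # hypothetical threshold on summed cue intensity
    cue_floor: float = 0.1      # cues weaker than this are treated as inactive

    def activation(self, cues: dict) -> float:
        """Return 0.0 unless both thresholds are met; otherwise breadth plus a
        bounded intensity term, so one more subsystem outweighs any intensity gain."""
        active = {k: v for k, v in cues.items() if v >= self.cue_floor}
        breadth = len(active)
        intensity = sum(active.values())
        if breadth < self.min_feeds or intensity < self.min_intensity:
            return 0.0
        # intensity / (1 + intensity) lies in [0, 1): secondary to breadth
        return breadth + intensity / (1.0 + intensity)


fear = EmotionNode("fear")
single_strong = fear.activation({"limbic": 5.0})                      # one channel only
two_strong = fear.activation({"face": 3.0, "voice": 3.0})             # narrow but intense
three_mild = fear.activation({"face": 0.4, "voice": 0.4, "autonomic": 0.4})
```

Under these assumptions a single intense cue leaves the node inactive, while three mild but redundant cues activate it more strongly than two intense ones, which is one way to read the claim that redundant cuing is needed for forceful emotions.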

At the next higher level, emotion nodes may in turn be interconnected in virtually any combination, shaped by individual experience and, to an important but still limited extent, the regularities of the socio-cultural environment. In this way, specific emotional reactions to given stimulations are reinforced, and the sensitivities to specific subsystem input channels and combinations are adjusted.

It is worthwhile to point out how this model falls in line with the substantial evidence accumulated in emotion psychology that questions the relevance (if not the very existence) of so-called "basic emotions", which are instead replaced completely by configurations of functional adaptive relevance: emotions operate as action tendencies spurring subjects towards functional activity, or serve to express a person's internal state. Each emotion prototype represented by an emotion node collects and integrates a rich set of information regarding: appropriate responses; typical emotion-eliciting objects; and scripts of how the emotion typically evolves over time. This information feeds in turn into the individual subsystems, influencing their activity, which again alters the activation of the network's emotion nodes, etc. Clearly, such a conceptualisation of emotion engenders the challenge of how to label the complex and multifarious network configurations that occur in an unambiguous and reliable way (ibid., p. 75).

Emotions, Moods and Emotion Episodes

The functional activities towards which action tendencies spur a subject comprise orienting towards the environment and lending urgency to the way information is gathered. The orienting function of emotion promotes seeking out environmental cues confirming the subject's internal state: Greg Smith equates the primary set of orienting emotion states with moods. Moods are preparatory states wherein one seeks opportunities to express a particular emotion (or set of emotions connected at the higher level); these lower-level expectancies of emotions about to be experienced favour an evaluation of the environment in a mood-congruent fashion, focusing on a specific selection of stimuli. Moods tend to be more diffuse (i.e., tendencies towards expressing emotions rather than emotions proper) and longer lasting than particular emotions.

Moods have an inertia, which encourages revisiting the same mood-matching stimulus time and time again, in order to sustain the current orientation with fresh bursts of emotion. There is thus on the one hand an inherent tendency to lock into a mood once it is established; on the other hand, a mood is also dependent on the briefer but stronger emotion experiences to retain its dominant influence on overall behaviour. An emotion episode is a sequence of such emotional moments, formed by the combination of the emotional orientation provided by an established mood with the supporting external circumstances. An emotion episode, i.e., such a series of changing emotions, is experienced as a structured and coherent entity with a defined start, middle, and end. Instantiated emotion episodes resemble the actual emotion prototypes represented in emotion networks. By requiring a specific object, an emotion episode is more focused than a mood; it comprises an action tendency as well as an activation state. An emotion episode terminates before its supporting mood, which will then still carry on in its attempts to initiate further supportive emotion instances for some time (or until it is displaced by a new subject-environment configuration).
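The interplay between a slowly fading mood and the brief emotions that refresh it can be made concrete with a toy simulation. This is only a sketch under stated assumptions: the function name, the exponential decay, the boost size, and the discrete time steps are hypothetical and are not part of Smith's account, which gives no quantitative model.

```python
def simulate_mood(bursts, steps, initial=1.0, decay=0.9, boost=0.3):
    """Return the mood level after each time step.

    bursts: set of step indices at which a mood-congruent emotion occurs.
    decay and boost are illustrative parameters, not empirical values.
    """
    level = initial
    trace = []
    for t in range(steps):
        level *= decay                       # inertia: the mood fades only gradually
        if t in bursts:
            level = min(1.0, level + boost)  # a brief emotion refreshes the mood
        trace.append(level)
    return trace


unsupported = simulate_mood(set(), steps=20)
supported = simulate_mood({4, 9, 14, 19}, steps=20)
```

In this sketch the unsupported mood lingers for a while but steadily loses strength, whereas periodic congruent bursts keep it at a markedly higher level, mirroring the dependence of mood on recurrent emotional moments.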

It is important to note that the emotional inertia of moods imposes constraints on what narrative pace is viable: the spectators of the narrative must be allotted sufficient time, as well as sufficient cues, to change from one mood orientation to any significantly differing one, beyond all due considerations for the intellectual preparedness for such a shift (Smith 2003 discusses related examples in some detail). Furthermore, the capacity of the emotion system is bounded (in particular for uninterrupted experiences), so that a film's emotional appeal can wear itself out, e.g. by exhausting capacity while preserving consistency over time (i.e., overly extending the duration of one and the same established mood), but also by placing excessive demands on the level of effort that has to be expended to update or set up an additional new emotional orientation frame.

Somewhat confusingly, Greg Smith employs the term "emotion work" for such unconditionally required effort, in contrast to the widespread definition introduced by Arlie Hochschild (1983), and explains how one of the appeals of mainstream cinema can be understood to lie in the maximisation of viewers' emotional investment, by having limited emotion work (e.g., careful attention being paid to the exposition) lead to very large paybacks in terms of the enabled emotional experience. In this context, it must be stressed that containing the emotion workload is of particular importance in the event of any nonlinearities in film presentation; this is a reason put forward to explain the use of strongly marked devices to delineate flashbacks, or the employment of symmetrical flashbacks which reliably end with the anchor character who set out from the diegetic present being returned near the initial point of departure.

As typical durations of instances of the different concepts (which of course are co-determined by the pace of the environment), Greg Smith states: seconds to minutes for emotions; a few minutes for emotion episodes; and four to fifteen minutes (but up to hours) for moods. This set of temporal ranges is instrumental in the model's coverage of the dynamics of filmic emotions at global and local levels. The briefer extents of emotions are suited to the urgency and speed required to cope with sudden environmental changes. The consistency of emotional orientation adaptive for stable environments is provided by mood, which through its selective filtering of perception itself contributes to the shaping of coherent experiences of longer duration. Together, the concepts are instrumental in explaining how flexibility and efficiency, speed and stability, and adaptability and coherence can all be brought about by the interplay of a parsimonious number of complementary interacting model components.

Given this picture of such a flexible emotion system, the next challenge is to answer the question of how films can be structured so as to meet the desideratum of reliably eliciting consistent responses from across a wide variety of viewers. It is the very structure of the emotion system as presented that informs the identification of the structures that filmmaking practice has developed to this end.

The Mood-Cue Approach to the Analysis of Filmic Emotion

Against the theoretical background developed, Greg Smith maintains that the primary emotive effect of film is to create mood (as defined in the previous section). The task of successful film structures is to increase the probability of evoking emotion. This is achieved by paving the way through the establishment of a specific viewer predisposition towards the experiencing of emotion, i.e. by setting up a specific mood. This approach is practical given that, owing to its more diffuse nature, the lower-level mood state can be established with less concentrated cuing than is required by any specific emotion. In other words, the first task of a film is to install a specific emotional orientation of the viewer towards itself. Within this task, the need to sustain mood by a recurrent feed of stronger brief emotional moments must also be covered.

The options available to that end are vast, given that the full range of perceptual cues can be exploited, which includes, but is not limited to: facial expression, figure movement, dialogue, vocal expression and tone, costume, sound, music, lighting, mise-en-scène, set design, editing, camera (cf. IRIS deliverable D5.1), depth of field, character qualities and histories, and narrative situation (Smith 2003, p. 42). Such a range of choices is also important for successfully addressing the challenge posed by the variability of the audience: the use of a variety of redundant emotive cues increases the likelihood that audience members with differing preferences of emotional access are all aligned towards an appropriate emotional orientation.

To summarise, according to the mood-cue approach, a film sets out by establishing a (first) basic orientation of the viewers towards the film. Once such coherence is achieved, subsequent dense bursts of highly coordinated cuing are used to either bolster or alter the mood. Greg Smith illustrates the importance of careful management of emotional orientation in his discussion of instances of bad practice, ranging from the early The Strike directed by Sergei Eisenstein to Wayne Wang's The Joy Luck Club.

Coordinated Cuing: Emotion Markers

Greg Smith introduces the term "emotion marker" for configurations of highly visible and typically simple and direct textual cues employed in classical Hollywood cinema with the main aim of eliciting brief moments of emotion. Herein, the availability of alternatives to the use of strictly diegetic means (e.g., achievements, obstacles and failures in pursuit of diegetic goals) is of importance, since the purpose of emotion markers is neither to advance nor to retard progress in the narrative, nor to provide information or commentary about the story, but to bring about a brief burst of experienced emotion to sustain or change a viewer's established mood. With respect to the sustaining function of emotion markers, it should be recalled that it suffices for them to be congruent with the established mood (rather than to match it exactly), since the self-sustaining nature of moods will then see to it that viewers avail themselves of suitable aspects of the offered experience.

The identification, in a range of films across different styles and genres, of a substantial number of emotionally marked moments that do not serve any apparent significant goal-oriented function but plausibly serve as emotion markers lends credible support to the theory put forward.

To recap, within the proposed conceptual framework, individual emotional cues form the basic building blocks for analysing a text's emotional appeal. Larger structures, such as emotion markers, are assembled out of such (redundant) cues. Mood is sustained by successions of cues, not all of which need themselves to be instances of larger substructures such as diegetic narrative obstacles or emotion markers.

Genre Microscripts

Against this framework, the notion of genre is itself expanded beyond its composition out of narrative and iconographic patterns, to further include patterns of emotional address equipping spectators with scripts to use in interpreting a genre film. It is the smaller genre microscripts, intertextual sets of expectations for sequences and scenes, that are of particular significance. The educated/encultured viewer of Hollywood films has accumulated huge collections of such microscripts that invite anticipation of narrative, stylistic, and emotional developments and are invoked quickly and simply, e.g., with just a very few lines of dialogue. With their limited extent, microscripts match the likewise short scope of emotions and thus form useful guides for the viewers' emotion systems.

Greg Smith argues that in the event of a recognised genre microscript being consonant with an already established mood, such agreement encourages not just detached recognition, but active execution within the viewer's own emotion system: the spectators are invited to genuinely experience, i.e., feel, the associated feelings. The earlier caveat of course still holds: by itself, the film cannot go beyond extending such invitations, and it remains within the authority of the viewer whether these are actually accepted. Smith (2003) provides detailed analytical examples of how films can deploy selections of microscripts from different genres to achieve characteristic affective signatures: through qualities such as the frequency and clarity of microscripts and other compatible cues indicating specific genres, the relative dominance and robustness of established moods, i.e., of basic interpretive references, can be defined, providing the director with significant latitude to feed the audience with cues of other kinds without threatening their basic orientation. Conversely, overturning a very firmly established mood can become an impossible task, and Smith also discusses how varying density of (non-diegetic) emotional informativeness and varying degrees of goal orientation of a text open up different strategies for attempting to manipulate audience mood.
Sustained higher degrees of goal orientation carry the danger of overly constraining the achievable emotional appeal, since it may be all too easy to label the emotional states of highly goal-oriented characters performing actions moving them towards a clear series of goals and to make sense of the available emotion cues; at the opposite extreme, it is demonstrated how a lack of sufficient emotional guidance of the audience leads to experienced unpredictability and failure to follow a story even though it may provide sufficient cohesion at a purely intellectual level. In contrast, approaches such as setting out from a low degree of redundancy and associated expectations of sparse emotion cuing, followed by a gradual increase in the number of redundant emotion cues towards the film's climax, are reported to provide significant emotional payoffs. Similar patternings of emotion cues

For all these considerations, it holds that, as with the underlying physiological system, textual qualities such as density of emotional information and level of goal orientation are relative rather than absolute terms, which require intertextual comparisons for their definition.

In the mood-cue approach, music realises a primary function closely related to the role of genre microscripts, by providing local structures. Short musical cues in particular, such as motifs of brief duration, are well suited to appeal to the emotion system at specific moments, eliciting short-lived emotional experiences. In the same way, proximate sound effects, for example, can endow diegesis with a quality of "nearness". Anempathetic music (which shows conspicuous indifference to the situation and exactly thereby reinforces the apparently ignored individual emotion of characters and spectators) is a further example of providing emotional significance with musical means. A short catalogue of emotion-eliciting roles of music provided by Smith (p. 106) includes: conveying simple emotional states (e.g. via particular harmonic intervals); reminding the viewers of who possesses emotional knowledge (viewers vs. characters); helping to prompt key expectations (e.g. via historical, geographical, or cultural motifs); commenting ironically on action; linking moments, endowing a specific situation with an aggregate emotional force; and subserving both local and global purposes by repetitions. An additional important function results from withholding music from particular scenes, thereby e.g. preventing overdetermination.

Feeling-for vs. Feeling-with

A further particular role of emotional cuing concerns the orienting function in the context of the distinction between "feeling-for" and "feeling-with", or, in the terminology of (Tan, 2000), between non-empathetic and empathetic feelings within so-called R-emotions (emotions resulting from appraisals involving elements of the represented world): in empathetic R-emotion, the beholder pictures the meaning of the represented situation for some represented or implied person or character ("feeling-with"), while in non-empathetic R-emotion, the beholder takes on an outside perspective ("feeling-for"). Greg Smith explains the difference in terms of the state of narrational knowledge of the viewer. Feeling-with is said to depend on the spectators being on a par with the characters in terms of knowledge of the emotional situation, in which case "the character exemplifies what the viewer's appropriate reaction should be." (Smith 2003, p. 90). In contrast, in a situation where the viewer possesses knowledge denied to a character and that character could be predicted to experience specific feelings if that knowledge were disclosed, the condition for the viewer to feel for the character holds, wherein the spectator can rely more on their own competence of adequate affective reaction, based on microscripts and knowledge from other sources. It is thus in this latter class of situations that viewers have the freedom to reject any invitation to feel for a character, and it is in this regard that a sufficient supply of narratively significant cues can provide decisive encouragement for viewers to actually engage in feeling-for. Likewise, through strong cuing towards feeling-with, the stage can be set for the character to be endowed with the authority to establish the viewer's emotional orientation and thus steer it in a specific direction.

Forward-looking vs. Rear-driven narration

Within his more detailed discussion of the film Casablanca, Greg Smith introduces the notions of "forward-looking" and "rear-driven" narration. These regard the two basic modes into which viewers may be cued, so as to predict the behaviour of characters. A rear-driven process selects operators based on the current and previous states, relying essentially on consistency of operator selection under sufficient knowledge of the influence of environmental conditions. In contrast, when forward-looking, it is an understanding of a character's goals that is used as the key to prediction: given an initial state, candidate actions are then selected based on whether and how well/quickly/safely/... they approach the known goal state.
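The contrast between the two prediction modes can be sketched in a few lines. The example is an illustration only, not drawn from Smith (2003): states are reduced to integer positions, and the function names, the operator table, and the sample history are hypothetical.

```python
from collections import Counter


def rear_driven(history, state):
    """Predict the operator the character most often chose in this state before.

    history: list of (state, operator_name) pairs observed so far.
    Returns None when the state has never been seen.
    """
    seen = Counter(op for s, op in history if s == state)
    return seen.most_common(1)[0][0] if seen else None


def forward_looking(state, goal, operators):
    """Predict the operator whose outcome lies closest to the known goal state."""
    return min(operators, key=lambda name: abs(goal - operators[name](state)))


# Hypothetical operators acting on a one-dimensional state
operators = {
    "advance": lambda s: s + 1,
    "retreat": lambda s: s - 1,
    "wait": lambda s: s,
}
history = [(0, "wait"), (0, "wait"), (0, "retreat"), (1, "advance")]

past_based = rear_driven(history, 0)                # relies on consistency with the past
goal_based = forward_looking(0, 3, operators)       # relies on knowledge of the goal
```

With this history, the rear-driven prediction for state 0 is "wait", since that is what the character has mostly done there, while the forward-looking prediction for a goal of 3 is "advance". The same situation thus yields different expectations depending on which mode the viewer has been cued into.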

According to Smith, classical cinema narration traditionally subscribes to the forward-looking approach while attempting to sustain the appearance of being rear-driven. Exposition of characters' background is limited to what is needed to uphold an impression of their being coherently motivated by their pasts. The characters' past usually remains largely hidden; and any exploration of a character's biography usually occurs again in a forward-looking fashion, e.g. resorting to devices such as extended flashbacks that are traversed via erotetic narration (i.e., the posing of a series of questions, which are subsequently adopted as goals to be answered).


