Music and the Aging Brain
Ebook, 951 pages, 9 hours


About this ebook

Music and the Aging Brain describes brain functioning in aging and addresses the power of music to protect the brain from loss of function and to help cope with the ravages of the brain diseases that accompany aging. By studying the power of music in aging through the lens of neuroscience and behavioral and clinical science, the book explains brain organization and function. Written for those researching the brain and aging, the book provides solid examples of research fundamentals, including rigorous standards for sample selection, control groups, description of intervention activities, measures of health outcomes, statistical methods, and logically stated conclusions.
  • Summarizes brain structures supporting music perception and cognition
  • Examines and explains music as neuroprotective in normal aging
  • Addresses the association of hearing loss with dementia
  • Promotes a neurological approach for research in music as therapy
  • Proposes questions for future research in music and aging
Language: English
Release date: May 28, 2020
ISBN: 9780128174234

    Book preview

    Music and the Aging Brain - Lola Cuddy


    Chapter 1

    The musical brain

    Stefan Koelsch¹ and Geir Olve Skeie²,³
    ¹Department for Biological and Medical Psychology, University of Bergen, Bergen, Norway; ²The Grieg Academy – Department of Music, University of Bergen, Bergen, Norway; ³Department of Neurology, Haukeland University Hospital, Bergen, Norway

    Abstract

    During listening, acoustic features of sounds are extracted in the auditory system (in the auditory brainstem, thalamus, and auditory cortex). To establish auditory percepts of melodies and rhythms (i.e., to establish auditory Gestalten and auditory objects), sound information is buffered and processed in the auditory sensory memory. Musical structure is then processed based on acoustical similarities and rhythmical organization, and according to (implicit) knowledge about musical regularities underlying scales, melodic and harmonic progressions, etc. These structures are based on both local and (hierarchically organized) nonlocal dependencies. In addition, music can evoke representations of meaningful concepts, and elicit emotions. This chapter reviews neural correlates of these processes, with regard to both brain-electric responses to sounds, and the neuroanatomical architecture of music perception.

    Keywords

    Musical processing; music-evoked emotions; musical syntax; mismatch negativity (MMN)

    Introduction

    Music is a special case of sound: as opposed to a noise, or noise-textures (e.g., wind, fire crackling, rain, water bubbling, etc.), musical sounds have a particular structural organization in both the time and the frequency domain. In the time domain, the most fundamental principle of musical structure is the temporal organization of sounds based on an isochronous pulse (the tactus, or beat), although there are notable exceptions (such as some kinds of meditation music, or some pieces of modern art music). In the frequency (pitch) domain, the most fundamental principle of musical structure is an organization of pitches according to the overtone series, resulting in simple (e.g., pentatonic) scales. Note that the production of overtone-based scales is, in turn, rooted in the perceptual properties of the auditory system, especially in octave equivalence and fifth equivalence (Terhardt, 1991). Inharmonic spectra (e.g., of inharmonic metallophones) give rise to different scales, such as the pelog and slendro scales (Sethares, 2005). Thus for a vast number of musical traditions around the globe, and presumably throughout human history, these two principles (pulse and scale) build the nucleus of a universal musical grammar. Out of this nucleus, a seemingly infinite number of musical systems, styles, and compositions evolved. This evolution appears to have followed computational principles described, for example, in the Chomsky hierarchy¹ and its extensions (Rohrmeier, Zuidema, Wiggins, & Scharff, 2015)—that is, local relationships between sounds based on a finite state grammar, and nonlocal relationships between sounds based on a context-free grammar (possibly even a context-sensitive grammar; Rohrmeier et al., 2015).
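
    As a rough numerical illustration of these two frequency-domain principles (a sketch added here, not part of the chapter), the snippet below lists the first partials of the overtone series for an arbitrary fundamental of 110 Hz, and derives a pentatonic-like scale by stacking perfect fifths (ratio 3:2) and folding them into a single octave (ratio 2:1); the fundamental, the number of partials, and the number of fifths are assumptions made only for the example.

        # Illustrative sketch (not from the chapter): overtone series and a
        # fifth-derived, pentatonic-like scale. The 110 Hz fundamental is arbitrary.

        def overtone_series(f0, n_partials=8):
            """Return the first n_partials harmonics of the fundamental f0 (in Hz)."""
            return [f0 * k for k in range(1, n_partials + 1)]

        def pentatonic_from_fifths(f0, n_fifths=5):
            """Stack perfect fifths (3:2) and fold each pitch into one octave of f0."""
            pitches = []
            freq = f0
            for _ in range(n_fifths):
                while freq >= 2 * f0:      # octave equivalence: fold into [f0, 2*f0)
                    freq /= 2
                pitches.append(round(freq, 2))
                freq *= 3 / 2              # fifth equivalence: step up a perfect fifth
            return sorted(pitches)

        if __name__ == "__main__":
            print(overtone_series(110.0))         # 110, 220, 330, 440, ...
            print(pentatonic_from_fifths(110.0))  # five pitches within one octave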

    By virtue of its fundamental structural principles (pulse and scale), music immediately allows several individuals to produce sounds together. Notably, only humans can synchronize their movements (including vocalizations) flexibly in a group to an external pulse (see also Merchant & Honing, 2014; Merker, Morley, & Zuidema, 2015). This ability is possibly the simplest cognitive function that separates us from other animals (Koelsch, 2018), which would make music the decisive evolutionary step of Homo sapiens, maybe even of the genus Homo. Animals have song and drumming (e.g., bird song, ape drumming, etc.), but gorillas do not drum in synchrony, and whales do not sing in unison in choirs. Making music together in a group, that is, joint synchronized action, is a potent elicitor of social bonding, associated with the activation of pleasure circuits, and sometimes even happiness circuits, in the brain. Analogous to Robin Dunbar’s vocal grooming hypothesis (Dunbar, 1993), music might have replaced manual grooming as the main mechanism for social bonding during human evolution. Dunbar’s hypothesis (according to which vocal grooming paved the way for the evolution of language) is based on the observation that, similar to manual grooming, human language plays a key role in building and maintaining affiliative bonds and group cohesion. This process is putatively driven by increased group size, which increases the number of social relationships an individual needs to maintain and monitor. Once the number of relationships becomes too large, an individual can no longer maintain its social networks with tactile interactions alone, and selection will favor alternative mechanisms such as talking to several individuals at the same time. However, many more individuals can make music together in a group (compared to the group size typical for conversations), and thus establish and foster social relationships. This makes music the more obvious candidate for maintaining social networks and increasing social cohesion in large groups. Thus a musical grooming hypothesis seems at least as plausible an explanation for the emergence of music as the vocal grooming hypothesis is for the emergence of language.

    Like music, the term language refers to structured sounds that are produced by humans, and like music, spoken language has melody, rhythm, accents, and timbre. However, language is a special case of music because it needs neither a pulse nor a scale, and because it has very rich and specific meaning (it can, e.g., be used very effectively to express who did what to whom). Ulrich Reich (personal communication) once noted that language is music distorted by (propositional) semantics. Thus the terms music and language both refer to structured sounds that are produced by humans as a means of social interaction, expression, diversion, or evocation of emotion (Koelsch, 2014), with language, in addition, affording the property of propositional semantics. However, in language normally only one individual speaks at a time (otherwise the language cannot be understood, and the sound is unpleasant). By contrast, music provides the possibility for several individuals to produce sounds at the same time. In this sense, language is the music of the individual, and music is the language of the group.

    These introductory thoughts illustrate that, at its core, music is not a cultural epiphenomenon of modern human societies, but at the heart of what makes us human, and thus deeply rooted in our brain. Engaging in music elicits a large array of cognitive and affective processes, including perception, multimodal integration, attention, social cognition, memory, communicative functions (including syntactic processing and processing of meaning information), bodily responses, and—when making music—action. By virtue of this richness, we presume that there is no structure of the brain in which activity cannot be modulated by music, which would make music the ideal tool to investigate the workings of the human brain. The following sections will review neuroscientific research findings about some of these processes.

    We do not only hear with our cochlea

    The auditory system evolved phylogenetically from the vestibular system. Interestingly, the vestibular nerve contains a substantial number of acoustically responsive fibers. The otolith organs (saccule and utricle) are sensitive to sounds and vibrations (Todd, Paillard, Kluk, Whittle, & Colebatch, 2014), and the vestibular nuclear complex in the brainstem exerts a major influence on spinal (and ocular) motoneurons in response to loud sounds with low frequencies, or with sudden onsets (Todd & Cody, 2000; Todd et al., 2014). Moreover, both the vestibular nuclei and the auditory cochlear nuclei in the brainstem project to the reticular formation (also in the brainstem), and the vestibular nucleus also projects to the parabrachial nucleus, a convergence site for vestibular, visceral, and autonomic processing in the brainstem (Balaban & Thayer, 2001; Kandler & Herbert, 1991). Such projections initiate and support movements, and contribute to the arousing (or calming) effects of music. Moreover, the inferior colliculus encodes consonance/dissonance (as well as auditory signals evoking fear or feelings of security), and this encoding is associated with preference for more consonant over more dissonant music. Notably, in addition to its projections to the auditory thalamus, the inferior colliculus hosts numerous other projections, for example, into both the somatomotor and the visceromotor (autonomic) system, thus initiating and supporting activity of skeletal muscles, smooth muscles, and cardiac muscles. These brainstem connections are the basis of our visceral reactions to music, and represent the first stages of the auditory-limbic pathway, which also includes the medial geniculate body of the thalamus, the auditory cortex (AC), and the amygdala (see Fig. 1.1). Thus subcortical processing of sounds does not only give rise to auditory sensations, but also to somatomotor and autonomic responses, and the stimulation of motoneurons and autonomic neurons by low-frequency beats might contribute to the human impetus to move to the beat (Grahn & Rowe, 2009; Todd & Cody, 2000).

    Figure 1.1 Illustration of the auditory-limbic pathway. Several nuclei of the auditory pathway in the brainstem, as well as the central nuclei group of the amygdala, give rise to somatomotor and autonomic responses to sounds. Note that, in addition to the auditory nerve, the vestibular nerve also contains acoustically responsive fibers. Also note that nuclei of the medial geniculate body of the thalamus project to both the auditory cortex and the amygdala. The auditory cortex also projects to the orbitofrontal cortex and the cingulate cortex (projections not shown). Moreover, amygdala, orbitofrontal cortex, and cingulate cortex have numerous projections to the hypothalamus (not shown) and thus also exert influence on the endocrine system, including the neuroendocrine motor system.

    In addition to vibrations of the vestibular apparatus and cochlea, sounds also evoke resonances in vibration receptors, that is, in the Pacinian corpuscles (which are sensitive from 10 Hz to a few kHz, and located mainly in the skin, the retroperitoneal space in the belly, the periosteum of the bones, and the sex organs), and maybe even responses in mechanoreceptors of the skin that detect pressure. The international concert percussionist Dame Evelyn Glennie is profoundly deaf, and hears mainly through vibrations felt in the skin (personal communication with Dame Glennie), probably with contributions of the vestibular organ. Thus we do not only hear with our cochlea, but also with the vestibular apparatus and mechanoreceptors distributed throughout our body.

    Auditory feature extraction in brainstem and thalamus

    Neural activity originating in the auditory nerve is progressively transformed in the auditory brainstem, as indicated by different neural response properties for the periodicity of sounds, timbre (including roughness, or consonance/dissonance), sound intensity, and interaural disparities in the superior olivary complex and the inferior colliculus (Pickles, 2008; Schnupp, Nelken, & King, 2011). The inferior colliculi can initiate flight and defensive behavior in response to threatening stimuli (even before the acoustic information reaches the AC; Cardoso, Coimbra, & Brandão, 1994; Lamprea et al., 2002), providing evidence of relatively elaborated auditory processing already in the brainstem. This stands in contrast to the visual system: Bard (1934) observed that decortication, that is, removing the neocortex, led to blindness in cats and dogs, but not to deafness (the hearing thresholds appeared to be elevated, but the animals were capable of differentiating sounds). From the thalamus, particularly via the medial geniculate body, neural impulses are mainly projected into different divisions of the AC [but note that the thalamus also projects auditory impulses into the amygdala and the medial orbitofrontal cortex (Kaas, Hackett, & Tramo, 1999; LeDoux, 2000; Öngür & Price, 2000)].

    The exact mechanisms underlying pitch perception are not known (and will not be discussed here), but it is clear that both information originating from the tonotopic organization of the cochlea (space information) and information originating from the integer time intervals of neural spiking in the auditory nerve (time information) contribute to pitch perception (Moore, 2008; Plack, 2005). Importantly, the auditory pathway consists not only of bottom-up but also of top-down projections, and nuclei such as the dorsal nucleus of the inferior colliculus presumably receive even more descending than ascending projections from diverse auditory cortical fields (Huffman & Henson, 1990). Given the massive top-down projections within the auditory pathway, it also becomes increasingly obvious that, in addition to space and time information, top-down predictions play an important role in pitch perception (Koelsch, Skouras, & Lohmann, 2018; Koelsch, Vuust, & Friston, 2018; Malmierca, Anderson, & Antunes, 2015). Within the predictive coding framework (currently one of the dominant theories of sensory perception), such top-down projections are thought to afford passing on top-down predictions, while sensory information is passed bottom-up, signaling prediction errors, that is, sensory information that does not match a prediction (Koelsch, Skouras, et al., 2018; Koelsch, Vuust, et al., 2018).
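
    The predictive-coding loop mentioned above can be illustrated with a toy numerical sketch (added here for illustration; it is not the authors’ model, and the input values and learning rate are arbitrary assumptions): a higher level sends down a prediction, the lower level returns the prediction error, and the prediction is updated by a fraction of that error.

        # Toy sketch of a predictive-coding loop (illustration only).
        # The "sensory_input" values and the learning rate are arbitrary assumptions.

        def predictive_coding(sensory_input, learning_rate=0.3):
            prediction = 0.0  # initial top-down prediction
            for observation in sensory_input:
                error = observation - prediction     # bottom-up prediction error
                prediction += learning_rate * error  # top-down prediction update
                print(f"obs={observation:.2f}  error={error:+.2f}  prediction={prediction:.2f}")

        if __name__ == "__main__":
            # A repeating "standard" value followed by a "deviant" yields a large error,
            # loosely analogous to an unexpected sound within a regular sequence.
            predictive_coding([1.0, 1.0, 1.0, 1.0, 2.0, 1.0])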

    Numerous studies investigated decoding of frequency information in the auditory brainstem using the frequency-following response (FFR; Kraus & Chandrasekaran, 2010). The FFR can be elicited preattentively, and is thought to originate mainly from the inferior colliculus (but note also that it is likely that the AC is at least partly involved in shaping the FFRs, e.g., by virtue of top-down projections to the inferior colliculus, see also above). Using FFRs, Wong, Skoe, Russo, Dees, and Kraus (2007) measured brainstem responses to three Mandarin tones that differed only in their (F0) pitch contours. Participants were amateur musicians and nonmusicians, and results revealed that musicians had more accurate encoding of the pitch contour of the phonemes (as reflected in the FFRs) than nonmusicians. This finding indicates that the auditory brainstem is involved in the encoding of pitch contours of speech information (vowels), and that the correlation between the FFRs and the properties of the acoustic information may be modulated by musical training. Similar training effects on FFRs elicited by syllables with a dipping pitch contour were observed in native English speakers (nonmusicians) after a training period of 14 days (with eight 30-min sessions of training on lexical pitch patterns, Song, Skoe, Wong, & Kraus, 2008). The latter results show the contribution of the brainstem in language learning, and its neural plasticity in adulthood.
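
    The encoding-accuracy logic of such FFR studies, i.e., correlating the pitch contour of the response with that of the stimulus, can be sketched as follows. This is only an added illustration, not the analysis pipeline of the studies cited: the 16 kHz sampling rate, the frame-wise autocorrelation pitch tracker, and the synthetic stimulus/response stand-ins are all assumptions made for the example.

        import numpy as np

        FS = 16000  # sampling rate (Hz); an assumption for this illustration

        def f0_contour(signal, frame_len=480, fmin=80, fmax=400):
            """Very rough frame-wise F0 estimate via autocorrelation peak picking."""
            f0s = []
            for start in range(0, len(signal) - frame_len, frame_len):
                frame = signal[start:start + frame_len]
                frame = frame - frame.mean()
                ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
                lo, hi = int(FS / fmax), int(FS / fmin)
                lag = lo + int(np.argmax(ac[lo:hi]))
                f0s.append(FS / lag)
            return np.array(f0s)

        if __name__ == "__main__":
            # Synthetic "stimulus" with a rising pitch contour, and a noisier "response"
            # used as a crude stand-in for a recorded FFR.
            t = np.arange(0, 0.3, 1 / FS)
            contour = 120 + 80 * t / t[-1]                       # 120 -> 200 Hz
            phase = 2 * np.pi * np.cumsum(contour) / FS
            stimulus = np.sin(phase)
            response = stimulus + 0.5 * np.random.randn(len(t))
            r = np.corrcoef(f0_contour(stimulus), f0_contour(response))[0, 1]
            print(f"stimulus-response F0 contour correlation: {r:.2f}")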

    A study by Strait, Kraus, Skoe, and Ashley (2009) also reported musical training effects on the decoding of the acoustic features of an affective vocalization (an infant’s unhappy cry), as reflected in auditory brainstem potentials. This suggests (1) that the auditory brainstem is involved in the auditory processing of communicated states of emotion (which substantially contributes to the decoding and understanding of affective prosody), and (2) that musical training can lead to a finer tuning of such (subcortical) processing.

    Acoustical equivalency of timbre and phoneme. With regard to a comparison between music and speech, it is worth mentioning that, in terms of acoustics, there is no difference between a phoneme and the timbre of a musical sound (and it is only a matter of convention that some phoneticians rather use terms such as vowel quality or vowel color, instead of timbre). Both are characterized by the two physical correlates of timbre: spectrum envelope (i.e., differences in the relative amplitudes of the individual harmonics, or overtones) and amplitude envelope (also sometimes called the amplitude contour or energy contour of the sound wave, i.e., the way that the loudness of a sound changes over time, particularly with regard to the on- and offset of a sound). Aperiodic sounds can also differ in spectrum envelope (see, e.g., the difference between /ʃ/ as in ship and /s/ as in sip), and timbre differences related to amplitude envelope play a role in speech, for example, in the shape of the attack for /b/ versus /w/ and /ʃ/ versus /tʃ/ (as in chip).
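
    To make the two physical correlates of timbre concrete, here is a minimal sketch (an added illustration; the sampling rate and the synthetic harmonic tone are assumptions) that estimates a spectrum envelope from the relative magnitudes of the strongest spectral components and an amplitude envelope from frame-wise RMS values.

        import numpy as np

        FS = 16000  # assumed sampling rate (Hz)

        def make_tone(f0=220.0, harmonic_amps=(1.0, 0.5, 0.25), dur=0.5):
            """Synthetic harmonic tone with a 50 ms linear attack (amplitude envelope)."""
            t = np.arange(0, dur, 1 / FS)
            tone = sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t)
                       for k, a in enumerate(harmonic_amps))
            attack = np.minimum(t / 0.05, 1.0)  # onset shape ("attack")
            return tone * attack

        def spectrum_envelope(signal, n_peaks=3):
            """Relative amplitudes of the strongest spectral components."""
            mag = np.abs(np.fft.rfft(signal))
            freqs = np.fft.rfftfreq(len(signal), 1 / FS)
            idx = np.argsort(mag)[-n_peaks:][::-1]
            return sorted(zip(freqs[idx].round(1), (mag[idx] / mag.max()).round(2)))

        def amplitude_envelope(signal, frame_len=160):
            """Frame-wise RMS: how the loudness of the sound changes over time."""
            frames = signal[:len(signal) // frame_len * frame_len].reshape(-1, frame_len)
            return np.sqrt((frames ** 2).mean(axis=1))

        if __name__ == "__main__":
            tone = make_tone()
            print("spectrum envelope (Hz, relative amplitude):", spectrum_envelope(tone))
            print("amplitude envelope (first 5 frames):", amplitude_envelope(tone)[:5].round(3))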

    Auditory feature extraction in the AC. As mentioned above, auditory information is projected mainly via the subdivisions of the medial geniculate body into the primary AC [PAC, corresponding to Brodmann’s area (BA) 41] and adjacent secondary auditory fields (corresponding to BAs 42 and 52). For a detailed description of primary auditory core, and secondary auditory belt fields, as well as their connectivity, see Kaas and Hackett (2000). Large parts of the AC are buried in brain fissures, while other parts are located on the lateral surface of the superior and middle temporal gyri (see Fig. 1.2). For example, the lateral sulcus (also referred to as Sylvian fissure, not indicated in Fig. 1.2) is a deep fissure, part of which separates the superior temporal gyrus (STG) from the frontal and the parietal lobe. Within this fissure (and thus not visible from the outside), on top of the STG, lies the superior temporal plane which hosts the PAC (see also Fig. 1.5), and numerous auditory belt and parabelt regions (the region of the superior temporal plane anterior of the PAC is also referred to as planum polare, and the region posterior of the PAC as planum temporale).

    Figure 1.2 Lateral views of the left and right hemispheres of the brain. The auditory cortex (gray area) corresponds to large parts of the superior temporal gyrus, the superior temporal sulcus, and parts of the middle temporal gyrus (the approximate borders of the gyri are indicated by dashed lines). In the left hemisphere, the posterior part of the superior temporal gyrus is referred to as Wernicke’s area, and the posterior part of the inferior frontal gyrus (indicated by a dotted line) is referred to as Broca’s area (see dotted area). While these areas in the left hemisphere appear to be more strongly engaged during the processing of language than of music, the homotope areas in the right hemisphere appear to be more strongly engaged during the processing of music than of language. The hatched area indicates the (lateral) premotor cortex.

    With regard to the functional properties of primary and secondary auditory fields, a study by Petkov, Kayser, Augath, and Logothetis (2006) showed that, in the macaque monkey, all of the PAC core areas, and most of the surrounding belt areas, show a tonotopic organization. (Tonotopic refers to the spatial arrangement of sounds in the brain. Tones close to each other in terms of frequency are mapped in spatially neighboring regions in the brain.) Tonotopic organization is clearest in the field A1, and some belt areas seem to show only weak, or no, tonotopic organization. These auditory areas perform a more fine-grained, and more specific, analysis of acoustic features compared to the auditory brainstem.

    Tramo, Shah, and Braida (2002) reported that a patient with a bilateral lesion of the PAC (1) had normal detection thresholds for sounds (i.e., the patient could say whether there was a tone or not); but (2) had elevated thresholds for determining whether two tones had the same pitch or not (i.e., the patient had difficulty detecting fine-grained frequency differences between two subsequent tones); and (3) had markedly increased thresholds for determining the pitch direction [i.e., the patient had great difficulty saying whether the second tone was higher or lower in pitch than the first tone, even though he could tell that the two tones differed; for similar results obtained from patients with (right) PAC lesions, see Johnsrude, Penhune, & Zatorre, 2000; Zatorre, 2001].

    The (primary) AC is involved in the transformation of acoustic features (such as frequency information) into percepts (such as pitch height and pitch chroma). For example, a sound with the frequencies 200, 300, and 400 Hz is transformed into the pitch percept of 100 Hz (thus the missing fundamental of 100 Hz is actually perceived as the pitch of the sound, a phenomenon also referred to as residue pitch or virtual pitch). Lesions of the (right) PAC result in a loss of the ability to perceive residue pitch in both animals (Whitfield, 1980) and humans (Zatorre, 1988), and neurons in the anterolateral region of the PAC show responses to a missing fundamental frequency (Bendor & Wang, 2005). Moreover, magnetoencephalographic data indicate that response properties in the PAC depend on whether or not a missing fundamental of a complex tone is perceived (Patel & Balaban, 2001). Note, however, that combination tones emerge already in the cochlea, and that the periodicity of complex tones is coded in the spike pattern of auditory brainstem neurons; therefore different mechanisms contribute to the perception of residue pitch on at least three different levels (basilar membrane, brainstem, and AC). However, the studies mentioned above suggest that, compared to the brainstem or the basilar membrane, the AC plays a more prominent role for the transformation of acoustic features into auditory percepts (such as the transformation of information about the frequencies of a complex sound, as well as about the periodicity of a sound, into a pitch percept).
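
    The 200/300/400 Hz example above can be reproduced numerically. The following sketch (added here for illustration; the sampling rate and the autocorrelation-based periodicity estimate are assumptions, not a model of cortical processing) synthesizes that complex tone and shows that its dominant periodicity corresponds to the missing fundamental of 100 Hz rather than to the lowest partial of 200 Hz.

        import numpy as np

        FS = 16000  # assumed sampling rate (Hz)

        # Illustration of the "missing fundamental" example above: a complex of
        # 200, 300 and 400 Hz partials has a common period of 10 ms, i.e., 100 Hz.
        t = np.arange(0, 0.5, 1 / FS)
        complex_tone = sum(np.sin(2 * np.pi * f * t) for f in (200.0, 300.0, 400.0))

        # Autocorrelation peak picking over a plausible pitch range (50-500 Hz).
        ac = np.correlate(complex_tone, complex_tone, mode="full")[len(t) - 1:]
        lo, hi = int(FS / 500), int(FS / 50)
        best_lag = lo + int(np.argmax(ac[lo:hi]))
        print(f"estimated periodicity: {FS / best_lag:.1f} Hz")  # ~100 Hz, not 200 Hz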

    Beyond pitch perception, the AC is also involved in a number of other functions, including auditory sensory memory (ASM), extraction of intersound relationships, discrimination and organization of sounds, as well as sound patterns, stream segregation, automatic change detection, and multisensory integration (for reviews see Hackett & Kaas, 2004; Winkler, 2007). With regard to functional differences between the left and the right PAC, as well as neighboring auditory association cortex, several studies suggest that the left AC has a higher resolution of temporal information than the right AC, and that the right AC has a higher spectral resolution than the left AC (Hyde, Peretz, & Zatorre, 2008; Perani et al., 2010; Zatorre, Belin, & Penhune, 2002). This might explain why the left AC reacts more strongly to language (in which phonemes usually occur more rapidly than tones of melodies), while the right AC reacts more strongly to music (in which sound sequences usually have higher pitch variation than in language). Finally, the AC also prepares acoustic information for further conceptual and conscious processing. For example, with regard to the meaning of sounds, just a short single tone can sound, for example, bright, rough, or dull. That is, the timbre of a single sound is already capable of conveying meaningful information.

    Operations within the (primary and adjacent) AC related to auditory feature analysis are reflected in electrophysiological recordings in brain-electric responses that have latencies of about 10–100 ms, particularly middle-latency responses, including the auditory P1 (a response with positive polarity and a latency of around 50 ms), and the later auditory N1, or N100, component (the N1 is a response with negative polarity and a latency of around 100 ms). Such brain-electric responses are also referred to as event-related potentials (ERPs) or evoked potentials.

    Echoic memory and Gestalt formation in the auditory cortex

    While auditory features are extracted, the acoustic information enters the ASM (or echoic memory), and representations of auditory Gestalten (or auditory objects; Griffiths & Warren, 2004) are formed. The ASM retains information only for a few seconds, and information stored in the ASM fades quickly. The ASM is thought to store physical features of sounds (such as pitch, intensity, duration, location, timbre, etc.), sound patterns, and even abstract features of sound patterns (e.g., Paavilainen, Simola, Jaramillo, Näätänen, & Winkler, 2001). Operations of the ASM are at least partly reflected electrically in the mismatch negativity (MMN; e.g., Näätänen, Tervaniemi, Sussman, Paavilainen, & Winkler, 2001). The MMN is typically elicited in so-called auditory oddball paradigms: when individuals are presented with a sequence of repeating standard sounds or sound sequences (such as a repeating pitch, or a repeating sequence of a few pitches), the occurrence of a deviant sound (such as a sound with a different pitch) elicits an MMN. The MMN is an ERP with negative polarity and a peak latency of about 100–200 ms, and appears to receive its main contributions from neural sources located in the PAC and adjacent auditory (belt) fields, with additional (but smaller) contributions from frontal cortical areas (for reviews, see Deouell, 2007; Schönwiesner et al., 2007).
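
    To make the oddball logic concrete, the sketch below (added for illustration; the pitches, deviant probability, and simulated single-trial "responses" are invented assumptions, not real EEG data) builds a standard/deviant sound sequence and computes a deviant-minus-standard difference, the subtraction by which an MMN-like response is typically isolated.

        import random

        # Illustration of an auditory oddball sequence and the deviant-minus-standard
        # subtraction used to isolate an MMN-like difference. All values are assumptions.

        STANDARD_HZ, DEVIANT_HZ, P_DEVIANT = 440.0, 494.0, 0.15

        def oddball_sequence(n_trials=200, seed=1):
            rng = random.Random(seed)
            return [DEVIANT_HZ if rng.random() < P_DEVIANT else STANDARD_HZ
                    for _ in range(n_trials)]

        def simulated_response(pitch):
            """Toy single-trial 'ERP amplitude': deviants evoke an extra negativity."""
            return -1.0 if pitch == DEVIANT_HZ else 0.0

        def mmn_difference(sequence):
            std = [simulated_response(p) for p in sequence if p == STANDARD_HZ]
            dev = [simulated_response(p) for p in sequence if p == DEVIANT_HZ]
            return sum(dev) / len(dev) - sum(std) / len(std)  # deviant minus standard

        if __name__ == "__main__":
            seq = oddball_sequence()
            print(f"deviants: {seq.count(DEVIANT_HZ)} of {len(seq)} tones")
            print(f"deviant-minus-standard difference: {mmn_difference(seq):.2f} (negative = MMN-like)")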

    Numerous MMN studies have contributed to the understanding of the neural correlates of music processing by investigating different response properties of the ASM to musical and speech stimuli, by using melodic and rhythmic patterns to investigate auditory Gestalt formation, or by studying effects of long- and short-term musical training on processes underlying ASM operations. Especially the latter studies have contributed substantially to our understanding of neuroplasticity (i.e., to changes in neuronal structure and function due to experience), and thus to our understanding of the neural basis of learning (for a review see Tervaniemi, 2009). Here, suffice it to say that MMN studies showed differences between musicians and nonmusicians on the processing of sound localization, pitch, melody, rhythm, musical key, timbre, tuning, and timing (e.g., Koelsch, Schröger, & Tervaniemi, 1999; Putkinen, Tervaniemi, Saarikivi, de Vent, & Huotilainen, 2014; Rammsayer & Altenmüller, 2006; Tervaniemi, Castaneda, Knoll, & Uther, 2006; Tervaniemi, Janhunen, Kruck, Putkinen, & Huotilainen, 2016).

    Auditory oddball paradigms were also used to investigate processes of melodic and rhythmic grouping of tones occurring in tone patterns (such grouping is essential for auditory Gestalt formation, see also Sussman, 2007), as well as effects of musical long-term training on these processes. These studies showed musician/nonmusician differences (1) in the processing of 4- or 5-tone melodic patterns (Fujioka, Trainor, Ross, Kakigi, & Pantev, 2004; Tervaniemi, Ilvonen, Karma, Alho, & Näätänen, 1997; Tervaniemi, Rytkönen, Schröger, Ilmoniemi, & Näätänen, 2001; van Zuijen, Sussman, Winkler, Näätänen, & Tervaniemi, 2004); (2) in the encoding of the number of elements in a tone pattern (van Zuijen, Sussman, Winkler, Näätänen, & Tervaniemi, 2005); and (3) in the processing of patterns consisting of two voices (Fujioka, Trainor, Ross, Kakigi, & Pantev, 2005).

    The formation of auditory Gestalten entails processes of perceptual separation, as well as processes of melodic, rhythmic, timbral, and spatial grouping. Such processes have been summarized under the concepts of auditory scene analysis and auditory stream segregation (Bregman, 1994). Grouping of acoustic events follows Gestalt principles such as similarity, proximity, and continuity (for acoustic cues used for perceptual separation and auditory grouping see Darwin, 1997, 2008). In everyday life, such operations are not only important for music processing, but also, for instance, for separating a speaker’s voice during a conversation from other sound sources in the environment. That is, these operations are important because their function is to recognize and to follow acoustic objects, and to establish a cognitive representation of the acoustic environment. It appears that the planum temporale (located posterior of the PAC) is a crucial structure for auditory scene analysis and stream segregation, particularly due to its role for the processing of pitch intervals and sound sequences (Griffiths & Warren, 2002; Patterson, Uppenkamp, Johnsrude, & Griffiths, 2002; Snyder & Elhilali, 2017).

    Musical expectancy formation: processing of local dependencies

    The processing of regularities inherent in sound sequences can be performed based on two different principles. The first is based on the regularities inherent in the acoustical properties of the sounds. For example, after a sequence of several sounds with the same pitch, a sound with a different pitch sounds irregular. This type of processing is assumed to be performed by the ASM, and processing of irregular sounds is reflected in the MMN (see above). Note that the extraction of the regularity underlying such sequences does not require memory capabilities beyond the ASM (i.e., the regularity is extracted in real time, on a moment-to-moment basis). I have referred previously to such syntactic processes as knowledge-free structuring (Koelsch, 2012).

    The second principle is that the local arrangement of elements in language and music includes numerous regularities that cannot simply be extracted on a moment-to-moment basis, but have to be learned over an extended period of time (local refers here to the arrangement of adjacent, or directly succeeding, elements). For example, it usually takes months, or even years, to learn the syntax of a language, and it takes a considerable amount of exposure and learning to establish (implicit) knowledge of the statistical regularities of a certain type of music. I have referred previously to such syntactic processes as musical expectancy formation (Koelsch, 2012).
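
    As a minimal sketch of how such statistical regularities can be estimated from exposure (an added illustration with an invented toy chord sequence, far smaller than the Bach chorale corpus analyzed in the study described next), the code below counts chord bigrams and converts the counts into transition probabilities.

        from collections import Counter, defaultdict

        # Toy illustration of bigram (chord-transition) statistics; the chord sequence
        # below is invented and stands in for a much larger corpus.

        def transition_probabilities(sequence):
            """Estimate P(next chord | current chord) from bigram counts."""
            counts = defaultdict(Counter)
            for current, nxt in zip(sequence, sequence[1:]):
                counts[current][nxt] += 1
            return {chord: {nxt: n / sum(follow.values()) for nxt, n in follow.items()}
                    for chord, follow in counts.items()}

        if __name__ == "__main__":
            toy_corpus = ["I", "IV", "V7", "I", "ii", "V7", "I", "IV", "ii", "V7", "I"]
            probs = transition_probabilities(toy_corpus)
            print(probs.get("V7"))  # in this toy data, the tonic "I" always follows V7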

    An example of local dependencies in music captured by musical expectancy formation is the bigram table of chord transition probabilities extracted from a corpus of Bach chorales in a study by Rohrmeier and Cross (2008). That table, for example, showed that after a dominant seventh chord, the most likely chord to follow is the tonic. It also showed that a supertonic is nine times more likely to follow a tonic than a tonic following a supertonic. This is important, because the acoustic similarity of tonic and supertonic is the same in both cases, and therefore it is very difficult to explain this statistical regularity simply based on acoustic similarity. Rather, this regularity is specific for this kind of major–minor tonal music, and thus has to be learned (over an extended period of time) to be represented accurately in the brain of a
