What Does Music Sound Like For A Cochlear Implant User?

Like speech, music is built on acoustic parameters and elements that convey meaning and information. However, music arguably represents the most challenging class of auditory stimuli (1) and frequently demands resolution mechanisms that are not required for language (2,3). For some cochlear implant (CI) users, particularly those who are postlingually deafened, music is unpleasant because the auditory input is largely impoverished and frequently misrepresented by the constraints of electrical hearing. In contrast, present-day CI systems convey speech effectively enough that many CI users approach 80% on sentence recognition tests in quiet environments (4). For more complex auditory signals (1,5–12), such as pitch-dependent (tonal) languages, voice inflections, vocal emotion, and music, CI processing systems struggle to deliver the dynamic range and the fine temporal and spectral information that normal hearing (NH) listeners use. Consequently, it is common for CI users to describe music listening as unsatisfactory (13,14) and to perform poorly on music perception tasks (15,16). The objective of this study is to describe how music sounds to a CI user, with emphasis on postlingually deafened subjects, in light of several specific music processing deficits that have been identified.

HOW DO CI USERS PERCEIVE PITCH INFORMATION?

Cochlear Implants Are Out-of-Tune

Pitch is the perceptual correlate of frequency. In addition to absolute frequency, the qualities of pitch height and chroma are important attributes of pitch perception, often represented by a helical structure (17). Although pitch is conveyed by both spatial and temporal cues (18–21), pitch perception relies more on spatial cues, known as place-pitch cues (22,23). Previous reports of high-performing CI users suggest that pitch discrimination may be achieved by using overlapping center frequencies between individual filters (24–26) and, to a lesser degree, by temporal cues obtained from harmonic series processing. For complex-tone pitches and low-frequency tones (up to ∼2000 Hz), temporal encoding becomes particularly relevant in delivering the periodicity rate and its associated fundamental frequency (27). For the majority of CI users, both place-pitch and temporal rate-pitch information are disrupted (28).
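
The link between periodicity rate and fundamental frequency can be illustrated with a minimal sketch (Python with NumPy): a complex tone built from a fundamental and its harmonics repeats with period 1/F0, and that period appears as the first prominent non-zero-lag peak of the signal's autocorrelation, which is roughly the cue a temporal pitch code delivers. The sampling rate, tone, and lag limit below are illustrative assumptions, not parameters from any cited study.

```python
import numpy as np

fs = 16000                       # sampling rate (Hz), illustrative
f0 = 440.0                       # fundamental of a synthetic complex tone (A4)
t = np.arange(0, 0.1, 1 / fs)

# Complex tone built from the fundamental and its first few harmonics.
x = sum(np.sin(2 * np.pi * f0 * h * t) for h in range(1, 5))

# The autocorrelation of a periodic signal peaks at multiples of its period;
# the first prominent non-zero-lag peak gives the periodicity rate that a
# temporal pitch code would convey.
ac = np.correlate(x, x, mode="full")[len(x) - 1:]
min_lag = int(fs / 2000)         # ignore lags corresponding to rates above ~2000 Hz
peak_lag = min_lag + int(np.argmax(ac[min_lag:]))
print(f"periodicity rate ~ {fs / peak_lag:.1f} Hz (true f0 = {f0} Hz)")
```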

Music Lacks Bass Frequencies

In a recent study involving 436 temporal bones, the authors reported a mean cochlear length of 37.6 mm, with a range of 32 to 43.5 mm (29). However, the longest electrode array on the market is 31.5 mm, and several commonly used electrodes measure around 24 mm. Consequently, the electrode array does not stimulate certain regions of the cochlea. A theoretical solution to this problem is to insert the electrode array as deeply as possible into the cochlea; however, insertion angles greater than 400 degrees increase the likelihood of destroying any residual hearing through traumatic mechanical forces (30,31). This delicate balance in stimulating the apical turns of the cochlea, combined with factors such as electrode array stiffness, variations in cochlear anatomy, and intraoperative events, prevents full utilization of the cochlea for many users. As a result, many patients lack the low-frequency sounds associated with the most apical regions of the cochlea (32).
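
The consequence of a partial insertion can be made concrete with the Greenwood place-frequency map for the human cochlea, F = A(10^(a·x) − k) with A = 165.4 Hz, a = 2.1, and k = 0.88, where x is the fractional distance from the apex. The sketch below (Python) estimates the lowest characteristic frequency reached by electrode arrays of the lengths mentioned above; it assumes a 35 mm organ of Corti, a straight insertion, and a linear mapping from insertion depth to cochlear distance, so the numbers are purely illustrative.

```python
def greenwood_frequency(frac_from_apex: float) -> float:
    """Greenwood place-frequency map for the human cochlea.
    frac_from_apex: 0.0 at the apex, 1.0 at the base."""
    A, a, k = 165.4, 2.1, 0.88  # standard human parameters
    return A * (10 ** (a * frac_from_apex) - k)

def lowest_place_frequency(insertion_depth_mm: float,
                           cochlear_length_mm: float = 35.0) -> float:
    """Approximate characteristic frequency at the tip of an electrode
    inserted `insertion_depth_mm` from the base (illustrative model)."""
    frac_from_apex = max(0.0, (cochlear_length_mm - insertion_depth_mm) / cochlear_length_mm)
    return greenwood_frequency(frac_from_apex)

for depth in (24.0, 31.5):
    print(f"{depth:5.1f} mm insertion -> lowest place frequency ~ "
          f"{lowest_place_frequency(depth):.0f} Hz")
```

Under these assumptions, a 24 mm insertion leaves the place code stopping several hundred hertz above the lowest musical bass notes, which is one way to visualize the missing apical, low-frequency region.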

Pitch Range and Clarity Are Reduced

The input frequency range for CIs is often between 100 and 8500 Hz, further limiting pitch processing to a speech-oriented frequency spectrum. For many CI users, mapping broad frequency bands onto the tonotopic basilar membrane results in a normal acoustic pitch being perceived as a higher-pitched sound (33). Additionally, CI users perceive pitch in broad steps rather than as a smooth frequency gradient. A study by Zeng et al. (34) found that the pitch changes reported between adjacent electrodes were smaller than the changes reported by NH listeners across the same characteristic frequencies along the basilar membrane. These findings suggest that pitch perception in CI users may be affected by frequency compression secondary to sound processing and transmission constraints. Relatedly, the relatively broad electric fields used in CI-mediated hearing lack the precision with which inner hair cells excite particular auditory neurons. The resulting broad neural excitation degrades spatial fidelity and alters the pitch percept for CI users.
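
To see why pitch arrives in coarse steps, consider dividing the roughly 100 to 8500 Hz input range among a limited number of electrode channels. The sketch below (Python) uses an evenly log-spaced filterbank as a simplified stand-in for a real device map (actual frequency allocations are manufacturer and map specific): with 22 channels, each band spans roughly three and a half semitones, so notes more than a whole tone apart can land on the same electrode.

```python
import numpy as np

def log_band_edges(f_low=100.0, f_high=8500.0, n_channels=22):
    """Evenly log-spaced analysis-band edges; a simplified stand-in for a
    real CI filterbank, which varies by device and clinical map."""
    return np.geomspace(f_low, f_high, n_channels + 1)

edges = log_band_edges()
for channel, (lo, hi) in enumerate(zip(edges[:-1], edges[1:]), start=1):
    width_semitones = 12 * np.log2(hi / lo)   # musical width of the band
    print(f"channel {channel:2d}: {lo:7.1f}-{hi:7.1f} Hz  (~{width_semitones:.1f} semitones)")
```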

Melodic Contour Recognition Is Impaired

Melodic contour identification requires accurate perception of changes in pitch, rhythm, and timbre. Given the limits of temporal and spectral resolution in CI users, it is not surprising that CI users perform significantly worse than their NH counterparts on melodic contour tasks. In melody recognition tasks without rhythm cues, CI users’ performance was comparable to that of NH listeners using only 1 to 2 spectral channels (35). CI users struggle to extract melodic pitch, particularly when the timbre complexity of a piece is increased (36). Galvin et al. (3,22) found a significant correlation between melodic contour identification and vowel recognition performance, suggesting the importance of frequency allocation and harmonic relationships in melodic contour perception. Interestingly, these experiments demonstrate large intersubject variability with no clear superiority of any CI device or sound processing strategy. When CI users were trained on melodic contour recognition, however, their melodic contour identification performance significantly improved. CI subjects with music experience also tended to be higher performers within the CI cohort and were less susceptible to timbre effects on melodic pitch perception (36).

Polyphonic Pitches Are Perceived as Fused

Although purely monophonic music exists, most Western music is polyphonic. A previous study found that, when presented with free-field stimuli, CI subjects are severely impaired relative to their NH counterparts in differentiating between simultaneous pitches because multiple pitches perceptually fuse into a single tone (37). With direct electrical stimulation, however, CI subjects were able to separate polyphonic stimuli using place-pitch cues, rate-pitch cues, or a combination of both in pitch perception tasks. In a study involving 10 CI users, subjects were instructed to indicate whether a given stimulus consisted of 1, 2, or 3 simultaneous pitches (38,39). Despite the difficulty CI users encountered with the task, they were able to differentiate between polyphonic tone complexes of 2 and 3 pitches when the stimuli were presented electrically. In the 2-pitch condition, subjects identified the 2 pitches significantly more accurately when the distance between the electrode pairs increased, reducing the overlap of the stimulated neural populations. At the time of the study, subjects also reported that the stimuli sounded pleasant and reminded them of how they used to hear music before losing their hearing.

Consonance and Dissonance Are Indistinguishable

Consonance and dissonance are two important features derived from the harmonics, or pitch intervals, between musical notes. The concept of sensory dissonance is thought to be fundamental to the human auditory system and largely immune to external influence (40–42). Generally speaking, consonance is perceived as pleasant and dissonance as unpleasant (43). What is accepted today is that these fundamental features are based mainly on subtle relationships in pitch and frequency; the simpler the frequency ratio between two tones, the more consonant they sound (44,45). CI users lack fine pitch perception and accurate harmonic representation and, consequently, are deprived of the precise frequency representation required to perceive consonance and dissonance. Caldwell et al. (46) presented postlingually deafened CI users with dissonant stimuli structured on the basis of harmonic theories of dissonance and previous studies employing dissonant chords. CI participants ranked these permutations on a Likert scale from very unpleasant to very pleasant, and the results revealed that dissonant melodic stimuli with chord accompaniment did not affect CI users’ reported unpleasantness to the same extent that they did for NH controls. The degree to which CI participants appeared blunted in their ability to recognize and process dissonance suggests that CI users may face a major disadvantage in perceiving emotional content in music.
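
The frequency-ratio account of consonance can be illustrated with a few just-intonation intervals: the simpler the integer ratio between two tones, the more consonant the interval tends to sound. In the sketch below (Python), the 440 Hz reference tone and the use of numerator plus denominator as a crude "ratio simplicity" score are illustrative choices, not values taken from the cited studies.

```python
from fractions import Fraction

# Just-intonation frequency ratios for a few common intervals.
intervals = {
    "octave":        Fraction(2, 1),    # very consonant
    "perfect fifth": Fraction(3, 2),    # consonant
    "major third":   Fraction(5, 4),    # consonant
    "minor second":  Fraction(16, 15),  # dissonant
    "tritone":       Fraction(45, 32),  # dissonant
}

root = 440.0  # illustrative reference tone (A4)
for name, ratio in intervals.items():
    upper = root * float(ratio)
    simplicity = ratio.numerator + ratio.denominator  # smaller = simpler ratio
    print(f"{name:14s} {str(ratio):6s} -> {root:.0f} Hz + {upper:6.1f} Hz "
          f"(ratio complexity {simplicity})")
```

Resolving these ratios presumes access to precise frequency information for both tones, which is exactly what coarse place coding and fused pitches deny to most CI users.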

Music Emotion Recognition Is Limited

Numerous studies suggest that CI users, particularly children, have difficulty correctly identifying the intended musical emotion compared with their NH peers (47–49). This impairment may be due to the significant handicaps in pitch perception and spectro-temporal fine structure information faced by CI users (50); these limitations likely impair their ability to detect the nuances in frequency ratios and harmonic intervals, a skill critical to understanding intended emotion. One previous study (51) found that CI users rely more heavily on temporal cues than pitch cues when inferring musical emotion, which often leads them to incorrect conclusions about a piece's intended valence. This highlights how little accurate pitch information CI users have access to, and can therefore use, in music perception. Given that a primary purpose of music is to convey emotional information, this blunting of musical emotion, due largely to constraints in spectral processing, may help explain why CI users report lower levels of music enjoyment and participation after implantation (52,53).

HOW DO CI USERS PERCEIVE MUSICAL TIMBRE?

The spectral processing limitations observed in pitch perception also significantly impact timbre perception for CI users. When a musical note or other complex sound is played, it produces modes of vibration at several frequencies. At the core there is a fundamental frequency, of which all the other frequencies are integer multiples. This set of components is known as the harmonic series, and its perception depends on accurate representation of the integer relationships between these frequencies. In an NH individual, the lower, resolved harmonics dominate pitch perception even when multiple frequencies are being transmitted (54,55). Harmonic integrity is also a critical component of timbre identification and pitch perception, and when these overtones are misaligned or compromised, the listening experience is altered.

When a CI sound processor receives an audio signal, it splits the stimulus into several frequency bands spanning roughly 8000 Hz. Each electrode channel covers a relatively broad band of frequencies, making it difficult to resolve individual harmonics. Theoretically, place-pitch mismatch in CI users compromises the integrity of place coding by altering the intervals between the various frequencies (56). Currently, place coding is used to provide information about the spectral shape of an acoustic stimulus. As a result, the fundamental frequency of a complex sound is encoded by temporal fluctuations in the envelope of the electrical current presented by the electrode channels. Looi et al. (28) demonstrated that the fundamental frequency may be recovered from the amplitude envelope if the stimulation rate is at least approximately four times the fundamental frequency. Although single-electrode pulse trains can be sufficient for CI-mediated melody recognition, performance drops when fundamental frequencies exceed 300 Hz (57). Place-coding strategies may have potential for conveying fundamental frequency information at high frequencies where temporal cues are limited (58). Relatedly, reducing the harmonic series improves music enjoyment under CI conditions, whereas full harmonics are preferred by NH listeners, demonstrating that technical constraints in harmonic transmission persist in present-day CI processing systems (59).
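
A minimal sketch of this envelope-based fundamental-frequency coding follows, assuming a generic band-pass/rectify/low-pass analysis channel rather than any specific manufacturer's strategy: harmonics that fall unresolved within one band beat at the fundamental, so the channel envelope fluctuates at F0, and a per-electrode pulse train can convey that fluctuation only if its rate is sufficiently high (roughly four times F0 or more, per the criterion cited above). The band edges, filter orders, and 900 pps rate are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 16000                 # audio sampling rate (Hz), illustrative
f0 = 220.0                 # fundamental of a synthetic harmonic complex (A3)
t = np.arange(0, 0.5, 1 / fs)

# Harmonic complex tone: fundamental plus integer multiples (harmonics 1-8).
tone = sum(np.sin(2 * np.pi * f0 * h * t) / h for h in range(1, 9))

# One simplified analysis channel: band-pass, rectify, low-pass the envelope.
band = butter(4, [1000, 2000], btype="bandpass", fs=fs, output="sos")
env_lp = butter(2, 400, btype="lowpass", fs=fs, output="sos")
envelope = sosfilt(env_lp, np.abs(sosfilt(band, tone)))

# Harmonics 5-9 fall unresolved in this band, so the envelope beats at f0.
# A pulse train can convey that beat only if the per-electrode stimulation
# rate is high enough (roughly >= 4 x f0, per the criterion cited above).
stimulation_rate = 900     # pulses per second, an illustrative per-electrode rate
print("f0 representable in the pulse-train envelope:", stimulation_rate >= 4 * f0)
```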

Musical Instruments Are Difficult to Identify

Timbre is the perceptual quality of an acoustic stimulus that differentiates one sound from other sounds of the same pitch and amplitude, often described as tone color. For a complex tone, the perception of timbre relies on the spectral shape and the temporal envelope, especially at the onset of the sound (60). Although CIs can generally encode temporal envelope information accurately, they have more difficulty with spectral shape because of engineering limitations in spectral resolving power and dynamic range compression (61). These limitations blunt the overall performance of CI users in timbre discrimination tasks (62). For example, in a study involving 9 CI users and 25 NH controls, CI users had difficulty with timbre recognition (62) and tended to confuse instruments across instrumental families. The CI group was also more likely to recognize instruments with a rapid, strong attack. In general, CI recipients more readily identify instruments with percussive envelopes (e.g., piano) than brass or woodwind instruments (63–65). Although familiarity may play a role in timbre recognition tasks (66), general unstructured exposure to music has not been found to improve music performance in children with CIs (67).
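
Two simple descriptors capture the distinction drawn above: an attack time computed from the temporal envelope (a cue CIs transmit relatively well) and a spectral centroid computed from the shape of the spectrum (a cue that depends on the fine spectral detail CIs struggle to convey). The sketch below (Python) is a generic illustration of these descriptors, not a reimplementation of the cited studies' methods; the window length and thresholds are arbitrary choices.

```python
import numpy as np

def attack_time(signal: np.ndarray, fs: int) -> float:
    """Seconds for the smoothed amplitude envelope to rise from 10% to 90%
    of its peak; percussive instruments such as piano have short attacks."""
    env = np.abs(signal)
    win = max(1, int(0.005 * fs))                 # ~5 ms moving-average smoothing
    env = np.convolve(env, np.ones(win) / win, mode="same")
    peak = env.max()
    t10 = int(np.argmax(env >= 0.1 * peak))       # first sample above 10% of peak
    t90 = int(np.argmax(env >= 0.9 * peak))       # first sample above 90% of peak
    return (t90 - t10) / fs

def spectral_centroid(signal: np.ndarray, fs: int) -> float:
    """Amplitude-weighted mean frequency (Hz), a coarse summary of spectral shape."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    return float(np.sum(freqs * spectrum) / np.sum(spectrum))
```

Comparing these two numbers for, say, a recorded piano note and a trumpet note separates a largely envelope-borne cue from a largely spectrum-borne one, which mirrors why percussive instruments survive CI transmission better than sustained brass or woodwind tones.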

While there does not seem to be a clear consensus on aesthetically pleasing timbres among CI users, there is evidence of greater music enjoyment when the timbral demands on sound processing are lower. In a study of 15 CI users and 24 hearing aid users, both groups rated music involving several instruments as less pleasant than music played by a single instrument (68). Similarly, Kohlberg et al. (69) demonstrated that reengineering musical pieces to simplify their instrumental elements improved music listening in CI users.

In a study by Heng et al. (70), CI users were presented with musical “chimera” stimuli that combined the temporal envelope of one instrument with the fine structure of a second instrument. Subjects were asked to choose which source instrument the chimera most closely resembled. NH controls were able to use increasing amounts of either fine-structure or envelope information interchangeably as the basis for their judgments. In comparison, CI users relied exclusively on temporal envelope information in their judgments of musical timbre (70). These findings align with the understanding that timbre discrimination, particularly when the acoustic stimulus involves many instrumental elements at once, remains a major weakness of modern-day CI systems (62).

Musical Sound Quality Is Poor

Despite a number of studies indicating poor sound and music appraisal in CI users, that is, suggesting that CI users do not enjoy music (53,62,71–73), CI-mediated musical sound quality is not well studied. Existing studies on sound quality suggest that it is highly diminished in CI users relative to NH listeners (74–76). One CI user writes, “Often I have described what I hear… like when as children we communicated using two tin-cans connected by a long string–very tinny.” This “tinny” quality is commonly used by CI users to describe music (72). Further impairments of musical sound quality stem from limited access to low-frequency information (77) in addition to a compressed dynamic range (15).

HOW DO CI USERS PERCEIVE RHYTHM?

Basic Rhythm Patterns Are Preserved

Rhythm is described as a pattern of rests and beats that contributes to the structure of sound in time. CI users are generally found to perceive rhythm patterns at satisfactory levels (78–81), partly because acoustic onsets are detectable through the temporal envelopes delivered by CIs. In fact, when compared with NH controls, CI users usually perform at nearly comparable levels on rhythmic tasks (35,79,82,83). In a study conducted by Phillips-Silver et al. (84), a heterogeneous group of CI users were able to pick out the beat and move in time to Latin merengue music. Participants’ performance improved with unpitched drum tones, highlighting a nearly normal capacity to discriminate timing events in more difficult rhythmic tasks and the degree of synchrony that occurs between electrical pulses and nerve firing. Similarly, prelingually deaf children with CIs were able to identify familiar songs using pitch and timing cues, and performed marginally above chance with timing cues alone (85).

With regard to rhythmic clocking (as opposed to pattern identification), Kim et al. (86) studied the integrity of internal rhythmic clocking. To investigate the perception of time independently of rhythmic pattern, CI users were asked to indicate whether the final beat of a four-beat series presented at different tempos was isochronous or anisochronous (i.e., falling slightly before or after where an isochronous beat should fall). The results showed that CI users performed comparably to their NH counterparts, consistent with previous literature indicating that basic rhythm perception is largely intact with current CI processing strategies. However, it should be emphasized that studies of rhythm perception in CI users have generally relied on rhythmic information in isolation. In complex real-world music, overlapping streams of information, often separated by timbre or frequency rather than by time, are presented together; for example, both the drum track and the bass track may carry critical rhythmic information. Given CI users’ difficulty distinguishing musical timbres and the consequent limitations on auditory streaming, it is plausible that a rhythm task integrating timbre information would quickly reveal limitations in complex rhythm processing for CI users.

HOW DO CI USERS DESCRIBE THE SOUND OF MUSIC?

Due largely to a lack of objective and reliable measures, how these impairments impact the experience of music listening for a CI user remains difficult to parse out. To learn more about this, we administered an informal survey regarding listening experiences to CI recipients. Their subjective reflections are included throughout this section to supplement the scientific data presented above.

Enjoyment of Music Varies Widely Among CI Users

CI users’ enjoyment or appraisal of music is variable. While one CI recipient reports, “I get much enjoyment from music, it sounds good to me… I do seek out new music,” another stipulates, “Music is dissonant, out-of-tune, fuzzy, tinny… In general, music is very unpleasant for me now.” This variation stems from multiple sources. CI implantation and rehabilitation are accompanied by a number of varying factors, including but not limited to age at implantation, duration of profound deafness before implantation, musical engagement pre- and postimplantation, the type of device, the length of the electrode array, and the processing strategy. These variables can contribute to a range of hearing outcomes after activation, and hence to a range in musical sound quality.

Music Listening Is Usually More Enjoyable for Prelingually Deafened Individuals Than Postlingually Deafened Individuals

Relatedly, it is crucial to consider the vast differences in the listening experiences of prelingually and postlingually deafened CI recipients, given factors related to preimplantation hearing and neuroplasticity. Prelingually deafened CI users have never heard speech and music through normal auditory function. Many were also implanted at a young age, when the brain is highly plastic, and have had years for neural pathways to adapt to the novel auditory signal transmitted by the implant. Postlingually deafened CI users, however, often compare music heard through their CI with the music they heard when they had normal hearing. A CI user who reports enjoying music reflects, “My severe to profound hearing loss started when I was young… Thus, I do not have much musical memory to compare against.” Indeed, existing research suggests that prelingual CI users listen to and enjoy music more than postlingual CI recipients (87,88).

In situations where competing auditory cues are present, music is often described as more of a nuisance than a source of enjoyment. One CI user reports, “Sometimes music is just this cacophonous noise where it's… in the way… and I’m trying to hear other things that are going on around me and the music is just overpowering.”

Music Sounds Distorted

Unsurprisingly, many CI users report distortions of musical constructs related to pitch. Pitch is distorted for a large portion of the CI population, as exemplified by a lack of low-frequency information, multiple frequencies being perceived as single pitches, and difficulty following melodic contour information (34,74,77). One CI user stipulates, “I do not have a sense of tune (in-tune/out-of-tune).” This lack of adequate pitch perception can have detrimental effects on music listening, and for those who were musicians before going deaf, it can be especially harmful to quality of life:

“In all my musical growing up years… I swear I had perfect pitch. I still think I have perfect pitch, and yet I cannot distinguish between pitches… I’m sure that's a major factor in my appreciation of music. Losing music has been a major loss in my life.”

Lyrics Are Helpful

While NH listeners are generally able to enjoy a breadth of musical styles, CI recipients’ enjoyment of music can vary tremendously with instrumentation and genre. One salient example lies in the presence of vocals in music (53,72). When music is entirely vocal, solos seem easier to follow than ensembles. CI users largely report that hearing vocals in music is difficult, and that when music does contain vocals, it is significantly easier to follow if the lyrics are familiar or captioning is available:

“What I really enjoy most is listening to music where they’re singing songs, and I can actually see the lyrics. Being able to listen to the music with the words helped me… understand the background of the music, the meaning.”

CI Users Describe Preferences for Timbres

Both spectral distribution and temporal envelope representation are impaired in CI users (89,90). Thus, it is unsurprising that the CI population exhibits not only difficulty with instrument differentiation but also more negative ratings of commonly recognized orchestral instrument sounds compared with NH listeners (62,71,91). Certain instrumental tones may be perceived as more pleasant than others. In particular, CI users seem to dislike instrument sounds with a higher natural frequency range and instruments in the string family (violin, viola, etc.) (71). Percussive instruments seem to sound more pleasant, possibly corresponding to the preserved rhythm perception in CI-mediated listening (35,78). One CI user writes, “The piano sounds better than horn instruments. I can never… identify the instrument playing except for drums and piano.” This preference for the piano over other instruments is paralleled in the literature (53).

Genre Preferences Are Linked to Dominant Genre-specific Acoustic Features

Musical genres exhibit a range of melodic, instrumental, and rhythmic pattern combinations. For example, country music tends to have a consistent beat and a prominent melodic line sung by a soloist or ensemble, whereas classical music is usually instrumental and performed by an orchestra or a solo instrument. Similarly, hip hop or rap music has an extremely strong rhythmic and vocal lyric component, features preferred by CI users (72). Given the instrumental and pitch-related limitations in CI-mediated music listening, it is unsurprising that distinguishing between genres may be difficult for CI users, or that songs in certain genres are more easily recognizable than those in other genres (92). As one CI user writes, “It is hard for me to classify music into genres. Guess it is not part of my language.” The reference to a music-based “language” in this testimonial illuminates an indirect but crucial impact of music perception impairments; music not only sounds distorted through a CI, but these distortions may also partially impair its social and relational benefits.

Genres can also differ in musical complexity, defined as the variation and novelty of musical structure combined with the previous musical experiences of the listener (93,94). Prevalent theories about the appraisal-complexity relationship stipulate that there is an “optimal complexity” level at which music contains enough novelty to hold the listener's interest, but not so much that he or she becomes overwhelmed or unable to follow it (95). CI users and NH listeners differ in their appraisal of musical complexity. A study by Gfeller et al. (72) suggests that NH listeners enjoy a higher degree of musical complexity, while CI users prefer simpler pieces. In the same study, CI users gave lower likability ratings to classical pieces than to pop or country music, and also rated classical as the most complex (72). Similarly, music consisting of a solo instrument line is perceived more positively by CI users than instrumental ensemble music (68). One CI user remarks, “More complex music like a large band or orchestra is harder for me to relate to.” This preference for simpler music (prominent and repetitive rhythmic and melodic patterns, a simple harmonic structure, single instruments) suggests that musical complexity can create distortions in CI-mediated listening, making music difficult to follow and enjoy.

CONCLUSION

This article highlights the numerous challenges CI users face in music perception and attempts to describe how music sounds through the CI by combining a wide range of studies and sources (Fig. 1). As the literature suggests, CI development has largely focused on speech comprehension, and current devices lack the capacity to support the complex sound processing required for accurate music perception. Pitch and harmonic perception are among the most critical areas for improvement, as music continues to sound out-of-tune, dissonant, emotionless, indistinct, and weak in bass frequencies for most CI subjects. With more accurate pitch perception, timbre and harmonic deficits are likely to be partially addressed through truer representation of the acoustic stimulus and its pitch relationships. In the past decade, the scientific community has become more aware of the importance of excellent music perception in CI users; indeed, music perception may represent a higher level of auditory performance than speech perception. As such, a music-focused approach to CI engineering and development serves as an excellent tool to identify parameters that will lead to improvements in electrical hearing. In parallel, music rehabilitation is slowly gaining attention as a means of improving music performance postimplantation (96–103). While this growing interest in music perception has brought benefits to some CI users, the degree of improvement that remains to be achieved is vast, and further research is needed.

FIG. 1: How music sounds for a cochlear implant user, in light of several specific music processing deficits. The red circles (see online version for color) indicate musical qualities that are poorly represented by the cochlear implant. The green circles indicate qualities that are well preserved by the cochlear implant with respect to music perception.
