Timbre Space

by Amélie Bernier-Robert
Timbre Lingo | Timbre and Orchestration Writings

Published: October 23, 2023

A sound’s timbre can be represented by a point inside a conceptual timbre space, a space whose axes are usually defined by the principal attributes (e.g., spectral centroid, attack time) that one would implicitly use to perceive or conceive any timbre in a given context. Timbre spaces are obtained using a statistical method called multidimensional scaling analysis (MDS), which generates a simplified multidimensional map from a series of perceptual distances between pairs of timbres (as measured in perceptual experiments). The experimenter then deduces the attributes (acoustic descriptors) that best describe the axes obtained. Timbre spaces provide information that complements what we know about the neuro-cognitive mechanisms of timbre perception, and they can also be a very useful compositional tool.

__________________________

Have you ever wondered how you are able to differentiate, with ease, two different instruments (say, saxophone and flute) that express identical pitch, loudness, and duration? Timbre is an attribute that makes it possible to categorize or identify the sounding objects that surround us [1]. Timbre is often defined by what it is not—pitch, loudness or duration. But what is timbre? How can it be defined, represented, or quantified? Timbre spaces provide answers to these questions. Timbre—the “quality” or “color” of a sound—does not depend solely on one single parameter. While pitch is correlated with frequency (i.e., higher frequencies tend to elicit perceptions of higher pitches), timbre correlates with more than one parameter, or dimension. This means that instead of situating timbre on a unidimensional scale (like pitch can be situated with respect to frequency, for example), timbre must be placed inside a space represented by at least two axes—we thus understand timbre as a multidimensional attribute [2]. This also means that we can specify a timbre by its coordinates in a multidimensional timbre space.

This might seem a little abstract, but it makes more sense if we compare it to the Munsell color system, a 3-dimensional color space in which colors are specified according to their hue (dominant wavelength), chroma (“purity”), and value (lightness). In other words, any color is situated somewhere in the color space, depending on its values for these three parameters [3]. This concept is very similar to a 3-dimensional timbre space, although the parameters would correspond to acoustic descriptors. We could technically add a fourth dimension to the Munsell color system if we were to consider the opacity (transparency) of the colors over an arbitrary background. Any color in this 4-dimensional color space would then have to be specified in terms of hue, chroma, value, and opacity. Similarly, some timbre spaces can be represented by more than three dimensions.

Timbre spaces are obtained through studies using multidimensional scaling analysis (MDS), a statistical method that constructs a best-fitting map for a group of sounds displayed as points in the timbre space. To do so, participants typically listen to every possible pair of sounds from a group of samples (equalized in pitch, loudness, and duration) and rate how dissimilar the two sounds in each pair are. These dissimilarity ratings are treated as perceptual “distances” in the analysis and arranged in a square matrix (N×N if there are N samples in the study). This matrix forms the input of the MDS algorithm, which tries to build the geometric map that best fits the subjective distances in the dissimilarity ratings. Mathematically, the pairwise dissimilarity ratings of N samples give us the data to construct an N-dimensional space. Imagine such a study is conducted with 16 different instrumental sounds (oboe, clarinet, violin, etc., all playing the same note). In this case, the timbre space starts with 16 dimensions. The role of the MDS algorithm is then to achieve dimension reduction, positioning the timbres as accurately as possible in a space of typically 2 to 4 dimensions (the number of dimensions is chosen according to goodness-of-fit statistics). Once the sounds are displayed in the timbre space, the experimenter tries to interpret each dimension based on the acoustic properties of the samples [4], [5], [6].
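To make the MDS step concrete, here is a minimal sketch in Python using scikit-learn (an assumed tool, not one used in the cited studies) on a small, made-up dissimilarity matrix; the four instrument labels and the rating values are purely illustrative and do not come from any study cited here.

# Minimal sketch of the MDS step on a hypothetical dissimilarity matrix.
import numpy as np
from sklearn.manifold import MDS

sounds = ["oboe", "clarinet", "violin", "trumpet"]

# Symmetric N x N matrix of averaged pairwise dissimilarity ratings
# (0 = identical, larger = more dissimilar); values are illustrative only.
dissimilarity = np.array([
    [0.0, 0.4, 0.7, 0.5],
    [0.4, 0.0, 0.6, 0.7],
    [0.7, 0.6, 0.0, 0.3],
    [0.5, 0.7, 0.3, 0.0],
])

# Ask MDS for a 2-D configuration whose inter-point distances best match the
# ratings; dissimilarity="precomputed" tells it to use our matrix directly.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)

for name, (x, y) in zip(sounds, coords):
    print(f"{name:8s} -> ({x:+.2f}, {y:+.2f})")
print("stress (lower = better fit):", round(mds.stress_, 3))

The stress value printed at the end plays the role of the goodness-of-fit statistic mentioned above: in real studies, such statistics guide the choice of how many dimensions to keep.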


Figure 1: Overview of the MDS process. Participants compare the sound events with each other (in pairs) to map their perceptual relations (distances), which are displayed in a perceptual dissimilarity matrix. The MDS program then creates a geometric configuration for the samples with respect to the dissimilarity ratings, using a lower number of dimensions. The resulting map is interpreted by the experimenter to find the acoustic descriptors that best correlate with the dimension axes [1].


The nature and number of dimensions in a timbre space depend on the context of the underlying study (Western orchestral instruments, impulsive sounds, synthetic sounds, dyads of instrument tones, etc.). We also find individual differences in the salience of the acoustic descriptors that form the basis of a timbre space. For instance, some people tend to favor temporal information over spectral information, and vice versa [4]. These individual differences are included as weights in some MDS algorithms (acting as “stretching” coefficients for the timbre space in the direction of an axis) [7], which makes it possible to compare results between individuals or groups (referred to as “latent classes”). This is how researchers found that, surprisingly, there was no significant difference between musicians and non-musicians in the resulting timbre spaces at constant pitch [8]. Some other MDS algorithms also include specificities, representing the amount of a unique feature that distinguishes a sound from all others [1]. They act as an additional dimension for the timbre in question, which increases its perceptual distance from the other sounds (this can be thought of as adding localized “branches” to the spatial model) [8].
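To illustrate the weighting idea, here is a small Python sketch of a weighted distance model with specificities, loosely in the spirit of the extended MDS algorithms mentioned above; the exact parameterization differs between algorithms, and the function name, coordinates, and weights below are assumptions made for the example only.

import numpy as np

def model_distance(x_i, x_j, weights, spec_i=0.0, spec_j=0.0, spec_weight=1.0):
    # x_i, x_j    : coordinates of the two timbres in the shared space
    # weights     : per-dimension weights that "stretch" the space for one group
    # spec_i/_j   : specificities, each sound's unique extra "branch"
    # spec_weight : how strongly this group attends to specificities
    diff = np.asarray(x_i) - np.asarray(x_j)
    weighted_sq = np.sum(np.asarray(weights) * diff ** 2)
    return np.sqrt(weighted_sq + spec_weight * (spec_i + spec_j))

# Illustrative values: the same pair of sounds is farther apart for a group
# that weights dimension 1 (say, a temporal axis) more heavily than dimension 2.
print(model_distance([0.2, 0.9], [0.8, 1.0], weights=[2.0, 0.5]))
print(model_distance([0.2, 0.9], [0.8, 1.0], weights=[0.5, 2.0]))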

Regardless of the context, there is consistency in the timbre spaces that are found in psychoacoustic research. The most important descriptors are often related to spectral centroid (the amplitude-weighted mean frequency of a spectrum, its “center of gravity”) and attack time (distinguishing blown/bowed from struck/plucked instruments), followed by a dimension that is correlated either with spectro-temporal variations (e.g., spectral flux, which measures the amount of variation through time in the spectrum of a sound) or with spectral fine structure (e.g., spectral irregularity, often measured using an odd-to-even harmonics ratio) [1], [2], [4], [9].
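As an illustration of how such descriptors are computed, here is a minimal Python (NumPy) sketch of two of them, spectral centroid and spectral flux, applied to a synthetic harmonic tone; the test signal, frame length, and normalization choices are simple assumptions for the example and do not reproduce the exact descriptor definitions used in the cited studies.

import numpy as np

sr = 44100
t = np.arange(int(0.5 * sr)) / sr
# Synthetic harmonic tone: fundamental at 220 Hz with decaying partials.
tone = sum((1.0 / k) * np.sin(2 * np.pi * 220 * k * t) for k in range(1, 11))

def spectral_centroid(frame, sr):
    # Amplitude-weighted mean frequency of the spectrum ("center of gravity").
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    return np.sum(freqs * spectrum) / np.sum(spectrum)

def spectral_flux(frames):
    # Average frame-to-frame change in the normalized magnitude spectrum.
    spectra = np.abs(np.fft.rfft(frames, axis=1))
    spectra /= spectra.sum(axis=1, keepdims=True)
    return np.mean(np.sqrt(np.sum(np.diff(spectra, axis=0) ** 2, axis=1)))

frame_len = 2048
n_frames = len(tone) // frame_len
frames = tone[: n_frames * frame_len].reshape(n_frames, frame_len)

print("spectral centroid (Hz):", round(spectral_centroid(tone, sr), 1))
print("spectral flux:", round(spectral_flux(frames), 5))

For a steady synthetic tone like this one, the flux stays close to zero; a sound whose spectrum evolves over time would yield a larger value.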

This table summarizes results from well-known studies on timbre space: 


Figure 2: Example of a timbre space [1].


Interestingly, one study has shown that exchanging the spectral envelopes of two synthesized instruments reverses their positions on a spectral energy distribution axis [11]. This highlights the importance of spectral envelope properties in timbre perception, while validating the authors’ timbre space model. Additionally, another study has shown that the perception of timbre dyads can be predicted by a vector sum of the positions of the individual timbres (i.e., if we represent each timbre’s position as an arrow going from the origin to its location and then place the two arrows tip to tail, the resulting arrow tends to point to the perceived position of their dyad) [8].
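The vector model described above can be illustrated in a few lines of Python; the 2-D coordinates are made up for the example and are not taken from the cited study.

import numpy as np

# Hypothetical positions of two individual timbres in a 2-D timbre space.
timbre_a = np.array([0.8, -0.3])
timbre_b = np.array([-0.2, 0.6])

# Tip-to-tail (vector) addition predicts where the dyad of the two tends to sit.
predicted_dyad = timbre_a + timbre_b
print("predicted dyad position:", predicted_dyad)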

Finding acoustic correlates behind perceptual dimensions can usefully complement neuro-cognitive studies that aim to understand the biological mechanisms behind sound perception [12]. From a practical standpoint, timbre spaces can be especially useful for composers. As discussed in this article describing blend, they can be found driving computer-aided orchestration programs, helping to predict the perceptual results of instrumental combinations [1]. Some composers have also been working on algorithms that can generate timbre spaces based on the analysis of input sounds, making it possible to produce new timbres from coordinates in the newly generated timbre space [13]. This is the basis of composer Dominic Thibault’s current work, which uses concatenative synthesis to generate new sound events based on the segmentation and analysis of input sounds. He is developing a tool that uses artificial intelligence modules to automatically place these small segments in a very intuitive timbre space. Newly generated timbre spaces can also help composers create dynamics and structure for their musical pieces [14]. In addition, they can be found behind some more recent explanations of Klangfarbenmelodien (tone-color melodies), which are the subject of another article in this series. In summary, timbre spaces are a powerful explanatory tool for the perceptual aspect of timbre, and they can be mobilized for a variety of practical functions in music research and creation.

REFERENCES

[1] McAdams, S. (2013). Musical Timbre Perception. In Deutsch, D. (ed.), The Psychology of Music (3rd ed., p. 35-67). Academic Press.

[2] Lembke, S.-A. (2014). When timbre blends musically: perception and acoustics underlying orchestration and performance [PhD Dissertation, McGill University]. eScholarship. https://escholarship.mcgill.ca/concern/theses/6h440w776

[3] Munsell colour system. (2018, February 1). In Encyclopaedia Britannica. https://www.britannica.com/science/Munsell-color-system

[4] Caclin, A., McAdams, S., Smith, B. K. and Winsberg, S. (2005). Acoustic correlates of timbre space dimensions: A confirmatory study using synthetic tones. Journal of the Acoustical Society of America, 118(1), 471-482. https://doi.org/10.1121/1.1929229

[5] Grey, J. (1977). Multidimensional perceptual scaling of musical timbres. Journal of the Acoustical Society of America, 61(5), 1270-1277. https://doi.org/10.1121/1.381428

[6] Ramsay, J. O. (1997). Multiscale Manual (Extended Version). McGill University.

[7] Carroll, J. D. and Arabie, P. (1998). Multidimensional Scaling. In M. H. Birnbaum (ed.), Measurement, Judgment and Decision Making. Handbook of Perception and Cognition (2nd ed., p. 179-250). Academic Press.

[8] McAdams, S., Winsberg, S., Donnadieu, S., De Soete, G. and Krimphoff, J. (1995). Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological Research, 58, 177-192.

[9] Wei, Y., Gan, L., and Huang, X. (2022). A Review of Research on the Neurocognition for Timbre Perception. Frontiers in Psychology, 13(869475). https://doi.org/10.3389/fpsyg.2022.869475
