B&O Tech: Near… Far…

#27 in a series of articles about the technology behind Bang & Olufsen loudspeakers

Introduction

To begin with, please watch the following video.

One thing to notice is how they made Grover sound near and far. Two things change in his voice (yes, yes, I know. It’s not ACTUALLY Grover’s voice. It’s really Yoda’s). The first change is the level – but if you focus on only that, you’ll notice that it doesn’t really change very much. Grover is a little louder when he’s near than when he’s far. However, there’s another change that’s more important – the level of the reverberation relative to the level of the “dry” voice (what recording engineers sometimes call the “wet/dry mix”). When Grover is near, the sound is quite “dry” – there’s very little reverberation. When Grover is far, you hear much more of the room (more likely actually a spring or a plate reverb unit, given that this was made in the 1970’s).

This is a trick that has been used by recording engineers for decades. You can simulate distance in a mix by adding reverb to the sound. For example, listen to the drums and horns in the studio version of Penguins by Lyle Lovett. Then listen to the live version of the same tune, played by the same people. Of course, there are lots of things (other than reverb) that are different between these two recordings – but it’s a good start for a comparison. As another example, compare this recording to this recording. Of course, these are different recordings of different people singing different songs – but the thing to listen for is the wet/dry mix and the perception of distance in the mix. Another example is this recording compared to this recording.

So, why does this trick work? The answer lies inside your brain – so we’ll have to look there first.

Distance Perception in the Mix

If you’re in a room with your eyes closed, and someone in the room starts talking to you, you’ll be pretty good at estimating where they are in the room – both in terms of angular location (you can point at them) and distance. This is true, even if you’ve never been in the room before. Very generally speaking, what’s going on here is that your brain is automatically comparing:

  • the two sounds coming into your two ears – the difference between these two signals tells you a lot about which direction the sound is coming from, AND
  • the direct sound from the source to the reflected sound coming from the room. This comparison gives you lots of information about a sound source’s distance and the size and acoustical characteristics of the room itself.

If we do the same thing in an anechoic chamber (a room where there are no echoes, because the walls absorb all sound) you will still be good at estimating the angle to the sound source (because you still have two ears), but you will fail miserably at the distance estimation (because there are no reflections to help you figure this out).

If you want to try this in real life, go outside (away from any big walls), close your eyes, and try to focus on how far away the sound sources appear to be. You have to work a little to force yourself to ignore the fact that you know where they really are – but when you do, you’ll find that things sound much closer than they are. This is because outdoors is relatively anechoic. If you go to the middle of a frozen lake that’s covered in fluffy snow, you’ll come as close as you’ll probably get to an anechoic environment in real life (unless you do this as a hobby).

So, the moral of the story here is that, if you’re doing a recording and you want to make things sound far away, add reflections and reverberation – or at least make them louder and the direct sound quieter.
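If you want to play with this effect yourself, below is a little Python sketch of the idea. To be clear: this is my own illustration, not anything from a real mixing session – the “reverb” is just exponentially-decaying noise standing in for a real room (or spring, or plate), and all of the level choices are invented.

```python
import numpy as np
from scipy.signal import fftconvolve

fs = 48000  # sampling rate in Hz

# A stand-in "dry" source (use a real, dry vocal recording in practice)
rng = np.random.default_rng(0)
dry = rng.standard_normal(fs // 2)

# A crude synthetic room impulse response: exponentially decaying noise
# that decays by 60 dB over 1.5 seconds (an "RT60" of 1.5 s)
rt60 = 1.5
t = np.arange(int(fs * rt60)) / fs
ir = rng.standard_normal(t.size) * 10.0 ** (-3.0 * t / rt60)

wet = fftconvolve(dry, ir)[: dry.size]  # the reverberant version of the source

def place_source(dry, wet, wet_dry_db):
    """Mix the direct and reverberant signals. Raising the wet/dry ratio
    (more room, less direct sound) pushes the source farther away."""
    gain = 10.0 ** (wet_dry_db / 20.0)
    mix = dry + gain * wet
    return mix / np.max(np.abs(mix))  # normalise so overall level doesn't give the trick away

near_grover = place_source(dry, wet, -20.0)  # mostly dry: sounds near
far_grover = place_source(dry, wet, +6.0)    # mostly wet: sounds far
```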

Distance Perception in the Listening Room

Let’s go back to that example of the studio recording of Penguins by Lyle Lovett. If you sit in your listening room and play that recording out of a pair of loudspeakers, how far away do the drums and horns sound relative to you? Now, we’re not talking about whether one sounds further away than the other within the mix. I’m asking, “If you close your eyes and try to guess how far away the snare drum is from your listening position – what would you guess?”

For many people, the answer will be: approximately as far away as the loudspeakers. So, if your loudspeakers are 3 m from the listening position, the horns (in that recording) will sound about 3 m away as well. However, this is not necessarily the case. Remember that the perception of distance is dependent on the relative levels of the direct and reflected sounds at your ears. So, if you listen to that recording in an anechoic chamber, the horns will sound closer than the loudspeakers (because there are no reflections to tell you how far away things are). The more reflective the room’s surfaces, the farther away the horns will sound (but probably no further than the loudspeakers, since the recording is quite dry).

This effect can also be the result of the width of the loudspeaker’s directivity. For example, a loudspeaker that emits a very narrow beam (like a laser, assuming that were possible) would not send any sound towards the walls – only towards the listening position. So, this would have the same effect as having no reflection (because there is no sound going towards the sidewalls to reflect). In other words, the wider the dispersion of the sound from the loudspeaker (in a reflective room) the greater the apparent distance to the sound (but no greater than the distance to the loudspeakers, assuming that the recording is “dry”).

Loudspeaker directivity

So, we’ve established that the apparent distance to a phantom image in a recording is, in part, and in some (perhaps most) cases, dependent on the loudspeaker’s directivity. So, let’s concentrate on that for a bit.

Let’s build a very simple loudspeaker. It’s a model that has been used to simulate the behaviour of a real loudspeaker, so I don’t feel too bad about over-simplifying here. We’ll build an infinite wall with a piston in it that moves in and out. For example:

Here, you can see the piston (in red) moving in and out of the wall (in grey) with the resulting sound waves (the expanding curves) moving outwards in the air (in white).

The problem with this video is that it’s a little too simple. We also have to consider how the sound radiation off the front of the piston will be different at different frequencies. Without getting into the physics of “why” (if you’re interested in that, you can look here or here or here for an explanation), a piston has a general behaviour with respect to the radiation pattern of the sound wave it generates. Generally, the higher the frequency, the narrower the “beam” of sound. At low frequencies, there is basically no beam – the sound is emitted in all directions equally. At high frequencies, the beam becomes very narrow.

The question then is “how high a frequency is ‘high’?” The answer to that lies in the diameter of the piston (or the diameter of the loudspeaker driver, if we’re interested in real life). For example, take a look at Figure 1, below.

Fig 1: Radiation of 100 Hz (blue) and 1.5 kHz (green) from a 10″ diameter piston (i.e. a woofer).

Figure 1 shows how loud a signal will be if you measure it at different directions relative to the face of a piston that is 10″ (25.4 cm) in diameter. Two frequencies are shown – 100 Hz (the blue curve) and 1.5 kHz (the green curve). Both curves have been normalised to be the same level (100 dB SPL – although the actual value really doesn’t matter) on axis (at 0°). As you can see in the plot, as you move off to the side (either to 90° or 270°) the blue curve stays at 100 dB SPL. So, no matter what your angle relative to the woofer’s axis, 100 Hz will be the same level (assuming that you maintain your distance). However, look at the green curve in comparison. As you move off to the side, the 1.5 kHz tone drops by more than 20 dB. This also means that, if the loudspeaker is pointing at you and the sidewall is to the side of the loudspeaker, 100 Hz and 1.5 kHz will both get to you directly at the same level, but the reflection off the wall will be 20 dB louder at 100 Hz than at 1.5 kHz. This also means, generally, that there is more energy in the room at 100 Hz than there is at 1.5 kHz because, if you consider the entire radiation of the loudspeaker averaged over all directions at once, the lower frequency is louder in more places.

This, in turn, means that, if all you have is a 10″ woofer and you play music, you’ll notice that the high frequency content sounds closer to you in the room than the low frequency content.
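If you’d like to check the numbers in Figure 1 yourself, the curves come from the textbook model of an ideal circular piston in an infinite baffle, whose off-axis response is 2·J1(x)/x with x = k·a·sin(θ). Here’s a minimal Python sketch of that model – note that it describes the idealised piston above, not any real driver:

```python
import numpy as np
from scipy.special import j1  # Bessel function of the first kind, order 1

C = 343.0  # speed of sound in m/s

def piston_level_db(freq_hz, diameter_m, angle_deg):
    """Far-field level of an ideal baffled circular piston, in dB relative
    to its on-axis level: 20*log10|2*J1(x)/x| with x = k*a*sin(theta)."""
    a = diameter_m / 2.0                       # piston radius
    k = 2.0 * np.pi * freq_hz / C              # wavenumber
    x = k * a * np.sin(np.radians(angle_deg))
    if abs(x) < 1e-9:                          # on-axis: no attenuation by definition
        return 0.0
    return 20.0 * np.log10(abs(2.0 * j1(x) / x))

# A 10" (0.254 m) piston, measured at 90 degrees off-axis:
print(piston_level_db(100.0, 0.254, 90.0))   # ~ -0.04 dB: effectively omnidirectional
print(piston_level_db(1500.0, 0.254, 90.0))  # ~ -22 dB: strongly beaming
```

Those two printed values are the far ends of the blue and green curves in Figure 1.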

If the loudspeaker driver is smaller, the effect is the same; the only difference is that it happens at a higher frequency. For example, Figure 2, below, shows the off-axis response for two frequencies emitted by a 1″ (2.54 cm) diameter piston (i.e. a tweeter).

Fig 2: Radiation of 1.5 kHz (blue) and 15 kHz (green) from a 1″ diameter piston (i.e. a tweeter).

Notice that the effect is identical; however, now 1.5 kHz is the “low frequency” region for the small piston, so it radiates in all directions equally (seen as the blue curve). The high frequency (now 15 kHz) becomes lower and lower in level as you move off to the side of the driver, going as low as -20 dB at 90°.

So, again, if you’re listening to music through that tweeter, you’ll notice that the frequency content at 1.5 kHz sounds further away from the listening position than the content at 15 kHz. Again, the higher the frequency, the closer the image.

Same information, shown differently

If you trust me, figures 1 and 2, above, show you that the sound radiating off the front of a loudspeaker driver gets narrower with increasing frequency. If you don’t trust me (and you shouldn’t – I’m very untrustworthy…) then you’ll be saying “but you only showed me the behaviour at two frequencies… what about the others?” Well, let’s plot the same basic info differently, so that we can see more data.

Figure 3, below, shows the same 10″ woofer, although now showing all frequencies from 20 Hz to 20 kHz, and all angles from -90° to +90°. However, now, instead of showing all levels (in dB) we’re only showing 3 values, at -1 dB, -3 dB, and -10 dB. (These plots are a little tougher to read until you get used to them. However, if you’re used to looking at topographical maps, these are the same.)

Fig 3: A contour plot showing the directivity of a 10″ piston (i.e. a woofer). The red area has a magnitude between 0 and -1 dB. The orange area has a magnitude of -1 down to -3 dB. The blue area has a magnitude of -3 down to -10 dB. The white area is lower than -10 dB.

Now you can see that, as you go up in frequency, the range of angles within 1 dB of the on-axis response gets narrower, starting at about 400 Hz. This means that a 10″ diameter piston (which we are pretending is a woofer) is “omnidirectional” up to 400 Hz, and then gets increasingly directional as you go up.

Figure 4 shows the same information for a 1″ diameter piston. Now you can see that the driver is omnidirectional up to about 4 kHz. (This is not a coincidence – the frequency is 10 times that of the woofer because the diameter is one tenth.)

Fig 4: A contour plot showing the directivity of a 1″ piston (i.e. a tweeter). The red area has a magnitude between 0 and -1 dB. The orange area has a magnitude of -1 down to -3 dB. The blue area has a magnitude of -3 down to -10 dB. The white area is lower than -10 dB.

Normally, however, you do not make a loudspeaker out of either a woofer or a tweeter – you put them together to cover the entire frequency range. So, let’s look at a plot of that behaviour. I’ve put together our two pistons using a 4th-order Linkwitz-Riley crossover at 1.5 kHz. I have also not included any weirdness caused by the separation of the drivers in space. This is a theoretical world where the tweeter and the woofer are in the same place – an impossible coaxial loudspeaker.

Fig 5: A contour plot showing the directivity of a two-way loudspeaker made of a 1″ and a 10″ piston. The red area has a magnitude between 0 and -1 dB. The orange area has a magnitude of -1 down to -3 dB. The blue area has a magnitude of -3 down to -10 dB. The white area is lower than -10 dB.

In Figure 5 you can see the woofer starting to beam below the crossover; then the tweeter takes over and spreads the radiation wide again before it, too, narrows.
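In case you’d like to see how such a crossover behaves, here’s a small scipy sketch of the 4th-order Linkwitz-Riley filters used above. An LR4 is two identical 2nd-order Butterworth filters in cascade, so I simply square the Butterworth responses. Note that this only models the electrical crossover, not the drivers’ acoustical behaviour:

```python
import numpy as np
from scipy.signal import butter, freqz

FS = 48000   # sampling rate in Hz
FC = 1500.0  # crossover frequency in Hz

# 2nd-order Butterworth sections; cascading each with itself gives LR4
b_lo, a_lo = butter(2, FC, btype="low", fs=FS)
b_hi, a_hi = butter(2, FC, btype="high", fs=FS)

f = np.logspace(np.log10(20.0), np.log10(20000.0), 512)
_, h_lo = freqz(b_lo, a_lo, worN=f, fs=FS)
_, h_hi = freqz(b_hi, a_hi, worN=f, fs=FS)

lr4_low = h_lo**2    # to the woofer: -6 dB at FC
lr4_high = h_hi**2   # to the tweeter: -6 dB at FC

# The two LR4 sections are in phase at all frequencies, so their sum is allpass:
summed_db = 20.0 * np.log10(np.abs(lr4_low + lr4_high))
print(summed_db.min(), summed_db.max())  # both ~ 0 dB: flat summed magnitude
```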

So what?

Why should you care about understanding the plot in Figure 5? Well, remember that the narrower the radiation of a loudspeaker, the closer the sound will appear to be to you. This means that, for the imaginary loudspeaker shown in Figure 5, if you’re playing a recording without additional reverberation, the low frequency stuff will sound far away (the same distance as the loudspeakers). So will a narrow band between 3 kHz and 4 kHz (where the tweeter pulls the radiation wider). However, the material in the band around 700 Hz – 2 kHz and in the band above 7 kHz will sound much closer to you.

Another way to express this is to show a graph of the resulting level of the reverberant energy in the listening room relative to the direct sound, an example of which is shown in Figure 6. (This is a plot copied from “Acoustics and Psychoacoustics” by David Howard and Jamie Angus).

Fig 6: Reverberant energy from the room relative to the direct sound from a two-way loudspeaker. (from Howard and Angus, 2000)

This shows a slightly different loudspeaker with a crossover just under 3 kHz. This is easy to see in the plot, since it’s where the tweeter starts putting more sound into the room, thus increasing the amount of reverberant energy.

What does all of this mean? Well, if we simplify a little, it means that things like voices will pull apart in terms of apparent distance. Consonant sounds like “s” and “t” will appear to be closer than vowels like “ooh”.

So, whaddya gonna do about it?

All of this is why one of the really important concerns of the acoustical engineers at Bang & Olufsen is the directivity of the loudspeakers. In a previous posting, I mentioned this a little – but then it was with regards to identifying issues related to diffraction. In that case, directivity is more of a method of identifying a basic problem. In this posting, however, I’m talking about a fundamental goal in the acoustical design of the loudspeaker.

For example, take a look at Figures 7 and 8 and compare them to Figure 9. It’s important to note here that these three plots show the directivities of three different loudspeakers with respect to their on-axis response. The way this is done is to measure the on-axis magnitude response, and call that the reference. Then you measure the magnitude response at a different angle, and then calculate the difference between that and the reference. In essence, you’re pretending that the on-axis response is flat. This is not to be interpreted that the three loudspeakers shown here have the same on-axis response. They don’t. Each is normalised to its own on-axis response. So we’re only considering how the loudspeaker compares to itself.

Fig 7: A contour plot showing the directivity of a commercially available 3-way loudspeaker. The red area has a magnitude between 0 and -1 dB. The orange area has a magnitude of -1 down to -3 dB. The blue area has a magnitude of -3 down to -10 dB. The white area is lower than -10 dB.

Figure 7, above, shows the directivity behaviour of a commercially-available 3-way loudspeaker (not from Bang & Olufsen). You can see that the woofer is increasingly beaming (the directivity gets narrower) up to the 3 – 5 kHz area. The midrange is beaming up above 10 kHz or so. So, a full-band signal will sound distant in the low end, in the 6-7 kHz range, and around 15 kHz. By comparison, signals at 2-4 kHz and 10-15 kHz will sound quite close.

Fig 8: A contour plot showing the directivity of a traditionally-designed 3-way loudspeaker. The red area has a magnitude between 0 and -1 dB. The orange area has a magnitude of -1 down to -3 dB. The blue area has a magnitude of -3 down to -10 dB. The white area is lower than -10 dB.

Figure 8, above, shows the directivity behaviour of a 3-way loudspeaker we made as a rough prototype. This is just a woofer, midrange and tweeter, each in its own MDF box – nothing fancy – except that the tweeter box is not as wide as the midrange box, which is narrower than the woofer box. You can see that the woofer is beaming (the directivity gets narrower) just above 1 kHz – although it has a very weird, wide directivity at around 650 Hz for some reason. The midrange is beaming from 5 kHz up to 10 kHz, and then the tweeter gets wide. So, this loudspeaker will have the same problem as the commercial loudspeaker.

Fig 9: A contour plot showing the directivity of a BeoLab 5. The red area has a magnitude between 0 and -1 dB. The orange area has a magnitude of -1 down to -3 dB. The blue area has a magnitude of -3 down to -10 dB. The white area is lower than -10 dB.

As you can see, the directivity of the loudspeaker shown in Figure 9 (the BeoLab 5) is much more constant as you change frequency (in other words, the lines are more parallel). It’s not perfect, but it’s a lot better than the other two – assuming that constant directivity is your goal. You can also see that the range of angles that is within 1 dB of the on-axis response is quite wide compared with the loudspeakers in Figures 7 and 8. The loudspeaker in Figure 7 not only beams in the high frequencies, but also has some strange “lobes” where things are louder off-axis than they are on-axis (the red lines).

When you read B&O’s marketing materials about the reason why we use Acoustic Lenses in our loudspeakers, the main message is that they’re designed to spread the sound – especially the high frequencies – wider than a normal tweeter, so that everyone on the sofa can hear the high hat. This is true. However, if you ask one of the acoustical engineers who worked on the project, they’ll tell you that the real reason is to maintain constant directivity as well as possible, in order to ensure that the direct-to-reverberant ratio in your listening room does not vary with frequency. That’s a difficult concept to explain in 1 or 2 sentences, so you won’t hear it mentioned often. However, if you read this paper (which was published just after the release of the BeoLab 5), for example, you’ll see that it was part of the original thinking of the engineers on the project.

Addendum 1.

I’ve been thinking more about this since I wrote it. One thing that I realised that I should add was to draw a comparison to timbre. When you listen to music on your loudspeakers in your living room, in a best-case scenario, you hear the same timbral balance that the recording engineer and the mastering engineer heard when they worked on the recording. In theory, you should not hear more bass or less midrange or more treble than they heard. The directivity of the loudspeaker has a similar influence – but on the spatial performance of the loudspeakers instead of the timbral performance. You want a loudspeaker that doesn’t alter the relative apparent distances to sources in the mix – just like you don’t want the loudspeakers to alter the timbre by delivering too much high frequency content.

Addendum 2.

One more thing… I made the plot below to help simplify the connection between directivity and Grover. Hope this helps.

A contour plot showing the directivity of a commercially available 3-way loudspeaker. The wider the plot (vertically), the farther the image.

Bang & Olufsen BeoVision Avant Reviews


I was responsible for the final sound design (aka tonal balance) of the loudspeakers built into the BeoVision Avant. So, I’m happy to share some of the blame for some of the comments (at least on the sound quality) from the reviews.

 

from Home Cinema Choice magazine

“Where the Avant really gets space-age, though, is with … its ability to drive 21 audio channels”

“This effort by the integrated speakers, together with its unprecedented audio flexibility, makes the Avant the finest sounding TV I’ve ever heard.”

 

from TrustedReviews.com

“Even a high-end sound bar would struggle to match the gorgeous finesse the Avant combines with its raw power. The speakers reproduce soundtrack subtleties more precisely and elegantly than any other TV we’ve heard. And they do so no matter how dense the soundstage becomes, and without so much as a hint of treble harshness.”

“Then there’s that rear-mounted subwoofer. We had worried that the way this angled subwoofer fires up and out through an actually quite narrow vent could cause boominess or distortion, but not a bit of it. Instead very impressive and well-rounded amounts of bass meld immaculately into the bottom end of the wide mid-range delivered by those terrific left, right and centre speakers.”

 

from flatpanelshd.com

“Compared to all other TVs on the market (non-B&O) there is no competition. Sound is so much better. However, we also have to point out that the TV did not receive the best conditions for a proper audio demonstration.”

Bang & Olufsen BeoLab 18 Reviews


I was part of the development team, and one of the two persons who decided on the final sound design (aka tonal balance) of the B&O BeoLab 18 loudspeakers. So, I’m happy to share some of the blame for some of the comments (at least on the sound quality) from the reviews.

 

Bernard Dickinson at Live Magazines said:

“The sound reproduction is flawless”

 

Lyd & Billede’s August 2014 review said:

“Lydkvaliteten er rigtig god med en åben, distinkt og fyldig gengivelse, som ikke gør højopløste lydformater til skamme.” (The sound quality is very good, with an open, clear and detailed reproduction which does not put high-resolution audio formats to shame.)

and “Stemmerne er lige klare og tydelige, hvad enten vi sidder lige i smørhullet eller befinder os langt ude i siden. Det er faktisk ret usædvanligt og gør, at BeoLab 18 egner sig lige godt til både baggrundsmusik og aktiv lytning.” (The voices are equally crisp and clear, whether we are sitting right in the sweet spot or far off to the side. That is actually quite unusual, and makes the BeoLab 18 equally suited to both background music and active listening.)

B&O Tech: Combinatorics

#26 in a series of articles about the technology behind Bang & Olufsen loudspeakers

 

I occasionally drop in to read and comment on the fora at www.beoworld.org. This is a group of Bang & Olufsen enthusiasts who, for the most part, are a great bunch of people and are very supportive of the brand (and yet, like any good family member, are not afraid to offer constructive criticism when it’s warranted…). Anyways, during one discussion about Speaker Groups in the BeoVision Avant, BeoVision 11, BeoPlay V1 and BeoSystem 4, the following question came up:

 

Speaker Groups

Would be interesting to know how many different ‘Speaker Group constellations’ you actually could make with the new engine. Of course you are limited to 10, if you want to save them. But I guess that should be enough for most of us.

 

This got me thinking about exactly how many possible combinations of parameters there are in a single Speaker Group in our video products. As a result, I answered with the response copied-and-pasted below:

Multiply the number of audio channels you have (internal + external) by 17 (the total number of possible speaker roles not including subwoofers, but including NONE as an option in case you want to use a subset of your loudspeakers) or 22 (the total number of possible speaker roles including subwoofers) to get the total number of Loudspeaker Constellations.

If you want to include Speaker Level and Speaker Distance, then you will have to multiply the previous result by 301 (possible distances for each loudspeaker) and multiply again by 61 (the total number of Speaker Levels) to get the total number of possible Configurations.
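In case you’d like to check my arithmetic before reading the results, here’s that multiplication as a few lines of Python (the channel counts per product are the ones listed below):

```python
# Audio channels per product (internal + external + wireless, where fitted)
products = {
    "BeoPlay V1": 2 + 6,
    "BeoVision 11": 2 + 10,
    "BeoVision 11 with Wireless": 2 + 10 + 8,
    "BeoVision Avant": 3 + 10 + 8,
}

ROLES_NO_SUB = 17    # speaker roles excluding subwoofers (NONE included)
ROLES_WITH_SUB = 22  # speaker roles including subwoofers
DISTANCES = 301      # possible Speaker Distance values per output
LEVELS = 61          # possible Speaker Level values per output

for name, channels in products.items():
    no_sub = channels * ROLES_NO_SUB
    with_sub = channels * ROLES_WITH_SUB
    configurations = with_sub * DISTANCES * LEVELS
    print(f"{name}: {no_sub} / {with_sub} constellations, "
          f"{configurations:,} Speaker Group Configurations")
```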

This means:

If you have a BeoPlay V1 (with 2 internal + 6 external outputs) the answers are

  • 136 constellations without a subwoofer, or
  • 176 constellations with subwoofers, and
  • a maximum of 3,231,536 possible Speaker Group Configurations (including Levels and Distances)

If you have a BeoVision 11 without Wireless (2+10 channels), then the totals are

  • 204 constellations without a subwoofer, or
  • 264 constellations with subwoofers, and
  • 4,847,304 possible Speaker Group Configurations (including Levels and Distances)

If you have a BeoVision 11 with Wireless (2+10+8 channels), then the totals are

  • 340 constellations without a subwoofer, or
  • 440 constellations with subwoofers, and
  • 8,078,840 possible Speaker Group Configurations (including Levels and Distances)

If you have a BeoVision Avant (3+10+8 channels), then the totals are

  • 357 constellations without a subwoofer, or
  • 462 constellations with subwoofers, and
  • 8,482,782 possible Speaker Group Configurations (including Levels and Distances)

Note that these numbers are FOR EACH SPEAKER GROUP. So you can multiply each of those by 10 (for the number of Speaker Groups you have available in your TV). The reader is left to do this math on his/her own.

Note as well that I have not included the Speaker Roles MIX LEFT, MIX RIGHT – for those of you who are using your Speaker Groups to make headphone outputs – you know who you are… ;-)

Note as well that I have not included the possibilities for the Bass Management control.

 

Sound Modes

 

This also got me thinking about the total number of possible combinations of settings there are for the Sound Modes in the same products. In order to calculate this, you start with the list of the parameters and their possible values, shown below:

  • Frequency Tilt: 21
  • Sound Enhance: 21
  • Speech Enhance: 11
  • Loudness On/Off: 2
  • Bass Mgt On/Off: 2
  • Balance: 21
  • Fader: 21
  • Dynamics (off, med, max): 3
  • Listening Style: 2
  • LFE Input on/off: 2
  • Loudness Bass: 13
  • Loudness Treble: 13
  • Spatial Processing: 3
  • Spatial Surround: 11
  • Spatial Height: 11
  • Spatial Stage Width: 11
  • Spatial Envelopment: 11
  • Clip Protection On/Off: 2

Multiply all those together and you get 1,524,473,211,092,832 different possible combinations for the parameters in a Sound Mode.

Note that this does not include the global Bass and Treble controls which are not part of the Sound Mode parameters.

However, it’s slightly misleading, since some parameters don’t work in some settings of other parameters. For example:

All four Spatial Controls are disabled when the Spatial Processing is set to either “1:1” or “downmix” – so that takes away 29,282 combinations.

If Loudness is set to Off, then the Loudness Bass and Loudness Treble are irrelevant – so that takes away 169 combinations.

So, that reduces the total to only 1,524,473,211,063,381 total possible parameter configurations for a single Sound Mode.
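Here’s that whole calculation as a little Python sketch, in case you want to check it (or see what happens when a firmware update adds a parameter):

```python
from math import prod

# Number of possible values for each Sound Mode parameter (from the list above)
parameter_values = {
    "Frequency Tilt": 21, "Sound Enhance": 21, "Speech Enhance": 11,
    "Loudness On/Off": 2, "Bass Mgt On/Off": 2, "Balance": 21, "Fader": 21,
    "Dynamics": 3, "Listening Style": 2, "LFE Input On/Off": 2,
    "Loudness Bass": 13, "Loudness Treble": 13, "Spatial Processing": 3,
    "Spatial Surround": 11, "Spatial Height": 11, "Spatial Stage Width": 11,
    "Spatial Envelopment": 11, "Clip Protection On/Off": 2,
}

total = prod(parameter_values.values())
print(f"{total:,}")  # 1,524,473,211,092,832

# Subtract the redundant settings noted above: 2 * 11**4 = 29,282 disabled
# spatial combinations, and 13**2 = 169 irrelevant loudness combinations
print(f"{total - 2 * 11**4 - 13**2:,}")  # 1,524,473,211,063,381
```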

Finally, this calculation assumes that you have all output Speaker Roles in use. For example, if you don’t have any height loudspeakers, then the Spatial Height control won’t do anything useful.

 

 

If you’d like more information on what these mean, please check out the Technical Sound Guide for Bang & Olufsen video products downloadable from this page.

B&O Tech: Where great sound starts

#25 in a series of articles about the technology behind Bang & Olufsen loudspeakers

 

You’ve bought your loudspeakers, you’ve connected your player, your listening chair is in exactly the right place. You sit down, put on a new recording, and you don’t like how it sounds. So, the first question is “who can I blame!?”

Of course, you can blame your loudspeakers (or at least, the people that made them). You could blame the acoustical behaviour of your listening room (that could be expensive). You could blame the format that you chose when you bought the recording (was it 128 kbps MP3 or a CD?). Or, if you’re one of those kinds of people, you could blame the quality of the AC mains cable that provides the last meter of electrical current supply to your amplifier from the hydroelectric dam 3000 km away. Or you could blame the people who made the recording.

In fact, if the recording quality is poor (whatever that might mean) then you can stop worrying about your loudspeakers and your room and everything else – they are not the weakest link in the chain.

So, this week, we’ll talk about who those people are that made your recording, how they did it, and what each of them was supposed to look after before someone put a CD on a shelf (or, if you’re a little more current, put a file on a website).

 

Recording Engineer

The recording engineer is the person you picture when you think about a recording session. You have the musicians in the studio or the concert hall, singing and playing music. That sound travels to microphones that were set up by a Recording Engineer, who then sits behind a mixing console (if you’re American – a “mixing desk” if you’re British) and fiddles with knobs obsessively.

Fig 1. A recording engineer (the gentleman on the right) engineering a recording. This is actually probably a staged shot – but it could easily have been taken either during the tracking or mixing part of the process.

There’s a small detail here that we should not overlook. Generally speaking, a “recording engineer” has to do two things that happen at different times in the process of making a recording. The first is called “tracking” and the second is called “mixing”.

 

Tracking

Normally, bands don’t like playing together – sometimes because they don’t even like to be in the same room as each other.  Sometimes schedules just don’t work out. Sometimes the orchestra and the soloist can’t be in the same city at the same time.

In order to circumvent this problem, the musicians are recorded separately in a process called “tracking”. During tracking, each musician plays their part, with or without other members of the band or ensemble. For example, if you’re a rock band, the bass and the drummer usually arrive first, and they play their parts. In the old days, they would have been recorded to separate tracks on a very wide (2″!) magnetic tape (hence the term “tracking”) where each instrument is recorded on a separate track. That way, the engineer has a separate recording of the kick drum and the snare drum and each tom-tom and each cymbal, and so on and so on. Nowadays, most people don’t record to magnetic tape because it’s too expensive. Instead, the tracks are recorded on a hard disc on a computer. However, the process is basically the same.

Once the bass player and the drummer are done, then the guitarist comes into the studio to record his or her parts while listening to the previously-recorded bass and drum parts over a pair of headphones. Then the singer comes in and listens to the bass, drums and guitar and sings along. Then the backup vocalists come in, and so on and so on, until everyone has recorded their part.

During the tracking, the recording engineer sets up and positions the microphones to get the optimal sound for each instrument. He or she will make sure that the gain that is applied to each of those microphones is correct – meaning that it’s recorded at a level that is high enough to stay well above the noise floor of the electronics and the recording medium, but not so high that it distorts. In the old days, this was difficult because the dynamic range of the recording system was quite small – so they had to stay quite close to the ceiling all the time – sometimes hitting it. Nowadays, it’s much easier, since the signal paths have much wider dynamic ranges, so there’s more room for error.

 

In the case of a classical recording, it might be a little different for the musicians, but the technical side is essentially the same. For example, an orchestra will play (so you don’t bring in the trombone section first – everyone plays together) with a lot of microphones in the room. Each microphone will be recorded on its own individual track, just like with the rock band. The only difference is that everyone is playing at the same time.

Fig 2. A typical orchestra recording session. Note that all the musicians are there, and there are a lot of microphones in the room. Each of those microphones is probably being recorded on its own independent track on a hard disc somewhere so that they can be mixed together later.

 

Once all the tracking is done the musicians are finished. They’ve all been captured, each on their own track that can be played back later in isolation (for example, you can listen to just the snare drum, or just the microphone above the woodwind section). Sometimes, they will even have played or sung their part more than once – so we have different versions or “takes” to choose from later. This means that there may be hundreds of tracks that all need to be mixed together (or perhaps just left out…) in order to make something that normal people can play on their stereo.

 

Mixing

Now that all the individual tracks are recorded, they have to be combined into a pretty package that can be easily delivered to the customers. This means that all of those individual tracks that have been recorded have to be assembled or “mixed” together into a version that has, say, only two channels – one for the left loudspeaker and one for the right loudspeaker. This is done by feeding each individual track to its own input on a mixing console and listening to them individually to see how they best fit together. This is called the “mixing” process. During this stage, basic decisions are made like “how loud should the vocals be relative to the guitars (and everything else)”. However, it’s a little more detailed than that. Each track will need its own processing or correction (maybe changing the equalisation on the snare drum – or altering the attack and decay of the bass guitar using a dynamic range compressor – or the level of the vocal recording is changed throughout the tune to compensate for the fact that the singer couldn’t stay the same distance from the microphone whilst singing…) that helps it to better fit into the final mix.

Fig 3. A mixing console that has been labelled with the various tracks for the input strips. This is a very typical look for a console during a mixing session – although the surroundings are not.

 

If you walk into the control room of a recording studio during a mixing session, you’d see that it looks almost exactly like a recording session – except that there are no musicians playing in the studio. This is because what you usually see on videos like this one is a tracking session – but the recording engineer usually does a “rough mix” during tracking – just to get a preliminary idea of how the puzzle will fit together during mixing.

Once the mixing session for the tune is finished, then you have a nearly-finished product. You at least have something that the musicians can take home to have a listen to see if they’re satisfied with the overall product so far.

 

Editing

In classical music there is an extra step that happens here. As I said above, with classical recordings, it’s not unusual for all the musicians to play in the same room at the same time when the tracking is happening. However, it is unusual that they are able to play all the way through the piece without making any mistakes or having some small issues that they want to fix. So, usually, in a classical recording, the musicians will play through the piece (or the movement) all the way through 2 or 3 times. While that happens, a Recording Producer is sitting in the control room, listening and making notes on a copy of the score. Each time there is a mistake, the producer makes a note of it – usually with a red mark indicating the Take Number in which the mistake was made. If, after 2 or 3 full takes of the piece, there are points in the piece that have not been played correctly, then they go back and fix small bits. The ensemble will be asked to play, say, 5 bars leading up to the point that needs fixing – and to continue playing for another 5 bars or so.

Later, those different takes (either full recordings, or bits and pieces) will be cut and spliced together in a process called editing. In the old days, this was done using a razor blade to cut the magnetic tape and stick it back together. For example, if you listen to some of Glenn Gould’s recordings, you can hear the piano playing along while the tape hiss in the background changes suddenly. This is the result of a splice between two different recordings – probably made on different days or with different brands of tape. Nowadays, the “splicing” is done on a computer, where you fade out of one take and fade into another gradually, over 10 ms or so.

Fig 4. A “crossfade” on a modern digital audio workstation. The edit point (what used to be a “tape splice”) is where the recording on the top right is faded out and a different recording (bottom right) of the same music is faded in.

 

If the editing was perfect, then you’ll never hear that it happened. Sometimes, however, it’s possible to hear the splice. For example, listen to this recording and pay attention to the overall level and general timbre of the piano. It changes to a quieter, duller sound from about 0′27″ to about 0′31″. This is a rather obvious tape splice to a different recording than the rest of the track.

 

Mastering Engineer

The final stage of creating a recording is performed by a Mastering Engineer in a mastering studio. This person gets the (theoretically…) “finished” product and makes it better. He or she will sit in a room that has very little gear in it, listening to the mixed song to hear if there are any small things that need fixing. For example, perhaps the overall timbre of the tune needs a little brightening or some control of the dynamic range.

Another basic role of the mastering engineer is to make sure that all of the tracks on a single album sound about the same level – since you don’t want people sitting at home fiddling with the volume knob from tune to tune.

When the mastering engineer is done, and the various other people have approved the final product, then the recording is finished. All that is left to do is to send the master to a plant to be pressed as a CD – or uploaded to the iTunes server – or whatever.

 

Fig 5. A mastering engineer sitting at a mastering console. Notice that, unlike a mixing console, a mastering console does not have a massive number of faders and knobs because it doesn’t have a lot of inputs. Also note that the mastering engineer looks better rested and more cleanly shaven than the recording engineer (above) because he doesn’t have to talk to musicians every day at work. Okay, okay, I’m joking… sort of…

 

In other words, the Mastering Engineer is the last person to make decisions about how a recording should sound before you get it.

This is why, when I’m talking to visitors, I say that our goal at Bang & Olufsen is to build loudspeakers that perform so that you, in your listening room, hear what the mastering engineer heard – because the ultimate reference of how the recording should sound is what it sounded like in the mastering studio.

 

Appendices

What’s a producer?

The title of Recording Producer means different things for different projects. Sometimes, it’s the person with the money who hires everyone for the recording.

Sometimes (usually in a pop recording) it’s the person sitting in the control room next to the recording engineer who helps the band with the arrangement – suggesting where to put a guitar solo or where to add backup vocals. Some pop producers will even do good ol’ fashioned music arrangements.

A producer for a classical recording usually acts as an extra set of ears for the musicians through the recording process. This person will also sit with the recording engineer in the control room, following the score to ensure that all sections of the piece have been captured to the satisfaction of the performers. He or she may also make suggestions about overall musical issues like tempi, phrasing, interpretation and so on.

 

But what about film?

The basic procedure for film mixing is the same – however, the “mixing engineer” in a film world is called a “re-recording engineer”. The work is similar, but the name is changed.

 

So, what’s a “Tonmeister”?

A tonmeister is a person who can act simultaneously as a Recording Engineer and a Recording Producer. It’s a person who has been trained to be equally competent in issues about music (typically, tonmeisters are also musicians), acoustics, electronics, as well as recording and studio techniques.

 

B&O Tech: How loud are my headphones?

As you may already be aware, Bang & Olufsen makes headphones: the BeoPlay H6 over-ear headphones and the BeoPlay H3 earbuds.

If you read some reviews of the H6 you’ll find some reviewers like them very much and say things like  “…excellent clarity and weight, well-defined bass and a sense of openness and space unusual in closed-back headphones. The sound is rich, attractive and ever-so-easy to enjoy.” and “… by no means are these headphones designed only for those wanting a pounding bass-line and an exciting overall balance: as already mentioned the bass extension is impressive, but it’s matched with low-end definition and control that’s just as striking, while a smooth midband and airy, but sweet, treble complete the sonic picture.” (both quotes are from Gramophone Magazine’s April 2014 issue). However, some other reviewers say things like “My only objection to the H6s is their volume level is not quite as loud as I normally would expect.” (a review from an otherwise-satisfied customer on Amazon.com). And, of course, there are the people whose tastes have been influenced by the unfortunate trend of companies selling headphones with a significantly boosted low-frequency range, and who now believe that all headphones should behave like that. (I sometimes wonder if the same people believe that, if it doesn’t taste like a Big Mac, it’s not a good burger… I also wonder why they don’t know that it’s possible to turn up the bass on most playback devices… But I digress…)

For this week’s posting, I’ll just deal with the first “complaint” – how loud should a pair of headphones be able to play?

Part 1: Sensitivity

One of the characteristics of a pair of headphones, like a passive loudspeaker, is its sensitivity. This is basically a measurement of how efficient the headphones are at converting electrical energy into acoustical output (although you should be careful to not confuse “Sensitivity” with “Efficiency” – sensitivity is a measure of the sound pressure level or SPL output for the voltage at the input whereas efficiency is a measure of the SPL output for the power in milliwatts). The higher the sensitivity of the headphones, the louder they will play for the same input voltage.

So, if you have a pair of headphones that are not very sensitive, and you plug them into your smartphone playing a tune at full volume, it might be relatively quiet. By comparison, a pair of very sensitive headphones plugged into the same smartphone playing the same tune at the same volume might be painfully loud. For example, let’s look at the measured data for three not-very-randomly selected headphones at https://www.stereophile.com/content/innerfidelity-headphone-measurements.

Brand        Model    Vrms to produce 90 dB SPL    dBV to produce 90 dB SPL
Sennheiser   HD600    0.230                        -12.77
Beoplay      H6       0.044                        -27.13
Etymotic     ER4PT    0.03                         -30.46
If we do a little math, this means that, for the same input voltage, the Etymotic’s will be 3.3 dB louder than the H6’s and the Sennheiser’s will be 14.4 dB quieter. This is a very big difference. (The Etymotic’s are 7.7 times louder than the Sennheisers!)
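That “little math” is just differences of the dBV values in the table. Here’s a minimal sketch of it, using only the numbers above:

```python
# dBV required to produce 90 dB SPL, from the table above
dbv_for_90_spl = {
    "Sennheiser HD600": -12.77,
    "Beoplay H6": -27.13,
    "Etymotic ER4PT": -30.46,
}

def level_difference_db(phones_a, phones_b):
    """How much louder phones_a plays than phones_b for the same input voltage.
    Needing fewer dBV to reach 90 dB SPL means higher sensitivity."""
    return dbv_for_90_spl[phones_b] - dbv_for_90_spl[phones_a]

print(level_difference_db("Etymotic ER4PT", "Beoplay H6"))    # ~ +3.3 dB
print(level_difference_db("Sennheiser HD600", "Beoplay H6"))  # ~ -14.4 dB

# As a voltage ratio: the Etymotics play ~7.7 times "louder" than the
# Sennheisers for the same input voltage
diff_db = level_difference_db("Etymotic ER4PT", "Sennheiser HD600")
print(10.0 ** (diff_db / 20.0))  # ~ 7.7
```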

So, in other words, different headphones have different sensitivities. Some will be quieter than others – some will be louder.

Side note: If you want to compare different pairs of headphones for output level, you could either look them up at the stereophile.com site I mentioned above, or you could compare their data sheet specifications using the Sensitivity to Efficiency converter on this page.

The moral of this first part of the story is that, when someone says “these headphones are not very loud” – the question is “compared to what?”

Part 2: The Source

I guess it goes without saying, but if you want more out of your headphones, the easiest solution is to turn up the volume of your source. The question then is: how much output can your source deliver? This answer also varies greatly from product to product. For example, if I take four not-very-randomly selected measurements that I did myself, I can see the following maximum output levels for a 500 Hz, 0 dB FS sine tone at maximum volume sent to a 31 ohm load (a resistor pretending to be a pair of headphones):

Brand    Model            Vrms     dBV
Lenovo   ThinkPad T420    0.317    -9.98
Apple    iPhone 3Ds       0.89     -1.01
Apple    MacBook Pro      2.085    +6.38
Sony     CDP-D500         6.69     +16.51

In other words, the Sony is more than 26 dB (or 21 times) louder than the ThinkPad, if we’re just measuring voltage. This is a very big difference.

So, as you can see, turning the volume all the way up to 11 on different products results in very different output levels. This is even true if you compare iPod Nanos of different generations, for example – no two products are the same.

The moral of the story here is: if your headphones aren’t loud enough, it might not be the headphones’ fault.

Part 3: The Details, French Law, and How to Cheat

So much for the obvious things – now we are going to get a little ugly.

Let’s begin the ugliness with a little re-hashing of a previous posting. As I talked about in this posting, your ears behave differently at different listening levels. More specifically, you don’t hear bass and treble as well when the signal is quiet. The louder it gets, the more flat your “frequency response”. This means that, when acoustical consultants are making measurements of quiet things, they usually have to make the microphone signal as “bad” as your hearing at low levels. For example, when you’re measuring air conditioning noise in an office space, you want to make your microphone less sensitive to low frequencies; otherwise you’ll get a reading of a high noise level when you can’t actually hear anything. In order to do this, we use something called a “weighting filter”, which is an attempt to simulate your frequency response. There are many different weighting curves – but the one we’ll talk about in this posting is the “A-weighting” curve. This is a filter that attenuates the low and high frequencies and has a small boost in the mid-band – just like your hearing does at quiet listening levels. The magnitude response of that curve is shown below in Figure 1. At higher levels (like measuring the noise level at the end of a runway while a plane is taking off over your head), you might want to use a different weighting curve like a “C-weighting” filter – or none at all.

Fig 1. The magnitude response of an A-weighting filter.
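If you’d like to generate that curve yourself, the analogue A-weighting magnitude response is defined by a standard formula (the one in IEC 61672), normalised to 0 dB at 1 kHz. Here it is as a Python function:

```python
import numpy as np

def a_weighting_db(f):
    """A-weighting in dB (the standard analogue-response formula, IEC 61672),
    normalised so that the gain at 1 kHz is 0 dB."""
    f = np.asarray(f, dtype=float)
    ra = (12194.0**2 * f**4) / (
        (f**2 + 20.6**2)
        * np.sqrt((f**2 + 107.7**2) * (f**2 + 737.9**2))
        * (f**2 + 12194.0**2)
    )
    return 20.0 * np.log10(ra) + 2.0

print(a_weighting_db([100.0, 1000.0, 10000.0]))  # ~ [-19.1, 0.0, -2.5] dB
```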

So, let’s say that you get enough money on Kickstarter to create the Fly-by-Night Headphone Company and you’re going to make a flagship pair of headphones that will sweep the world by storm. You do a little research and you start coming across something called “BS EN 50332-1” and “BS EN 50332-2“. Hmmmm… what are these? They’re international standards that define how to measure how loudly a pair of headphones plays. The procedure goes something like this:

  1. get some pink noise
  2. filter it to reduce the bass and the treble so that it has a spectrum that is more like music (the actual filter used for this is quite specific)
  3. reduce its crest factor so your measurement doesn’t jump around so much (this basically just gets rid of the peaks in the signal)
  4. do a quick check to make sure that, by limiting the crest factor, you haven’t changed the spectrum beyond the acceptable limits of the test procedure
  5. play the signal through the headphones and measure the sound pressure level using a dummy head
  6. apply an A-weighting to the measurement
  7. calculate how loud it is (averaged over time, just to be sure)

So, now you know how loud your headphones can play using a standard measurement procedure. Then you find out that, according to another international standard called EN 60065 or EN 60950-1 there are maximum limits to what you’re permitted to legally sell… in France… for now… (Okay, okay, these are European standards, but Europe has more than one country in it, so I think that I can safely call them international…)

So, you make your headphones, you make them sound like you want them to sound (I’ll talk about the details of this in a future posting), and then you test them (or have them tested) to see if they’re legal in France. If not (in other words, if they’re too sensitive), then you’ll have to tweak the sensitivity accordingly.

Okay – that’s what you have to do – but let’s look at that procedure a little more carefully.

Step 1 was to get some pink noise. This is nothing special – you can get or make pink noise pretty easily.

Step 2 was to filter the noise so that its spectrum ostensibly better matches the average spectrum of all recorded and transmitted music and speech in the world. The details of this filter are in another international standard called IEC 60268-1. The people who wrote this step mean well – there’s no point in testing your headphones with frequencies that are outside the range of anything you’ll ever hear in them. However, this means that there is probably some track somewhere that includes something that is not represented by the spectral characteristics of the test signal we’re using here. For example, Figure 2, below, shows the spectral curve of the test signal that you are supposed to send to the headphones for the test.

Fig 2. The magnitude response of the filter applied to the pink noise before sending it to the headphones.

Compare that to Figure 3, which shows an analysis of a popular Lady Gaga tune that I use as part of my collection of tunes to make a woofer unhappy. This is a commercially-available track that has not been modified in any way.

Fig 3. The spectrum of a Lady Gaga tune. Compare this with the noise filter plot from Figure 2 (plotted in red for your convenience).

As you can see, there is more energy in the music example than there is in the test signal around the 30 – 60 Hz octave – particularly noticeable due to the relative “hole” in the response that ranges between about 70 and 700 Hz.

Of course, if we took LOTS of tunes and analysed them, and averaged their analyses, we’d find out that the IEC test signal shown in Figure 2 is actually not too bad. However, every tune is different from the average in some way.

So, the test signal used in the EN 50332 test is not going to push headphones as hard as some kinds of music (specifically, music that has a lot of bass content).

We’ll skip Step 3, Step 4, and Step 5.

Step 6 is a curiosity. We’re supposed to take the signal that we recorded coming out of the headphones and apply an A-weighting filter to it. Now, remember from above that an A-weighting filter reduces the low and high frequencies in an effort to simulate your bad hearing characteristics at quiet listening levels. However, what we’re measuring here is how loud the headphones can go. So, there is a bit of a contradiction between the detail of the procedure and what it’s being used for. However, to be fair, many people mis-use A-weighting filters when they’re making noise measurements. In fact, you see A-weighted measurements all the time – regardless of the overall level of the noise that’s being measured. One possible reason for this is that people want to be able to compare the results from the loud measurements to the results from their quiet ones – so they apply the same weighting to both – but that’s just a guess.

Let’s, just for a second, consider the impact of combining Steps 2 and 6. Each of the filters in these two steps reduces the sensitivity of the test to the low- and high-frequency behaviour of the headphones. If we combine their effects into a single curve, it looks like the one in Figure 4, below.

Fig 4. The magnitude response of the combination of the A-weighting filter and the filter applied to the pink noise signal.

At this point, you may be asking “so what?” Here’s what.

Let’s take two different pairs of headphones and pretend that we measured them using the procedure I described above. The first pair of headphones (we’ll call it “Headphone A”) has a completely flat frequency response +/- < 0.000001 dB from 20 Hz to 20 kHz. The second pair of headphones has a bass boost such that anything below about 120 Hz has a 20 dB gain applied to it (we’ll call that “Headphone B”). The two unweighted measurements of these two simulated headphones are shown in Figure 5.

Fig 5. The magnitude responses of the two simulated headphones. The blue curve is “Headphone A”. The red curve is “Headphone B”.

After filtering these measurements with the weighting curves from Steps 2 and 6 (above), the way our measurement system “hears” these headphone responses is slightly different – as you can see in Figure 6, below.

Fig 6. The magnitude responses of the two simulated headphones as “seen” by a EN 50332 measurement. The blue curve is “Headphone A”. The red curve is “Headphone B”.

So, what happens when we measure the sound pressure level of the pink noise through these headphones?

Well, if we did the measurements without applying the two weighting curves, but just using good ol’ pink noise and no messin’ around, we’d see that Headphone B plays 13.1 dB louder than Headphone A (because of the 20 dB bass boost). However, if we apply the filters from Steps 2 and 6, the measured difference drops to only 0.46 dB.

This is interesting, since the standard measurement “thinks” that a 20 dB boost in the entire low frequency region corresponds to only a 0.46 dB increase in overall level.
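If you want to reproduce this kind of number yourself, here’s a rough sketch of the method in Python. To keep it short, it models “Headphone B” as an idealised brick-wall 20 dB boost below 120 Hz (rather than the real shelving filter), and it applies only the A-weighting from Step 6 – not the programme-simulation filter from Step 2 – so the numbers won’t exactly match mine, but the effect is the same: a huge bass boost nearly vanishes from the weighted measurement.

```python
import numpy as np

f = np.logspace(np.log10(20.0), np.log10(20000.0), 20000)  # 20 Hz - 20 kHz grid
pink_psd = 1.0 / f  # pink noise: power spectral density proportional to 1/f

def a_weighting_db(freq):
    """A-weighting magnitude in dB (analytic form from IEC 61672)."""
    f2 = freq ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / ((f2 + 20.6 ** 2)
         * np.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2)) * (f2 + 12194.0 ** 2))
    return 20.0 * np.log10(ra) + 2.0

def level_db(gain_db):
    """Overall level of pink noise seen through a frequency-dependent gain."""
    y = pink_psd * 10.0 ** (gain_db / 10.0)
    return 10.0 * np.log10(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(f)))  # trapezoid rule

flat = np.zeros_like(f)                    # "Headphone A": perfectly flat
boosted = np.where(f < 120.0, 20.0, 0.0)   # "Headphone B": +20 dB below 120 Hz (idealised)
aw = a_weighting_db(f)

print(level_db(boosted) - level_db(flat))            # ~14 dB difference, unweighted
print(level_db(boosted + aw) - level_db(flat + aw))  # well under 1 dB once A-weighted
```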

Figure 7 shows the relationship between the bass boost applied below 120 Hz and the increase in overall level as measured using the EN 50332 standard.

Fig 7. The relationship between the increase in SPL as measured using the EN 50332 standard vs. the gain of a bass boost applied to the headphones. (filter characteristics are Low shelving, fc=120 Hz, Q=0.707)

So, let’s go back to you, the CEO of the Fly-by-Night Headphone Company. You want to make your headphones louder, but you also need to obey the law in France. What’s a sneaky way to do this? Boost the bass! As you saw above, you can crank up the bass by 20 dB and the regulators will only see a 0.46 dB change in output level. You can totally get away with that one! Some people might complain that you have too much bass in your headphones, but hey – kids love bass. Plus, your competitors will get complaints about how quiet their headphones are compared to yours. All because people listening to children’s records at high listening levels hear much more bass than the EN 50332 measurement can detect!

Of course, one other way is to just ignore the law and make the headphones louder by increasing their sensitivity… but no one would do that because it’s illegal. In France.

Appendix 1: Listen to your Mother!

My mother always told me “Turn down that Walkman! You’re going to go deaf!” The question is “Was my mother right?” Of course, the answer is “yes” – if you listen to a loud enough sound for a long enough time, you will damage your hearing – and hearing damage, generally speaking, is permanent. The louder the sound, the less time it takes to cause the damage. The question then is “how loud and how long?” The answer is different for everyone; however, you can find some recommendations for what’s probably safe for you at sites that deal with occupational health and safety. For example, this site lists the Canadian recommendations for maximum exposure times to noise in the workplace. This site shows a graph of the US recommendations for the same thing – I’ve used the formula on that site to make the graph in Figure 8, below.

Fig 8. Recommendations for the maximum time of exposure to noise sources according to The National Institute for Occupational Safety and Health (NIOSH) in the USA.
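The NIOSH recommendation plotted in Figure 8 boils down to a one-line rule: 8 hours at 85 dBA, with the allowable time halved for every 3 dB above that (the so-called 3 dB exchange rate). Here’s that formula as a little sketch:

```python
def niosh_max_exposure_hours(level_dba):
    """NIOSH recommended maximum daily exposure: 8 hours at 85 dBA,
    halved for every 3 dB above that (the '3 dB exchange rate')."""
    return 8.0 / (2.0 ** ((level_dba - 85.0) / 3.0))

for level in (85, 94, 100, 115):
    print(level, "dBA:", niosh_max_exposure_hours(level), "hours")
# 85 dBA: 8.0 hours, 94 dBA: 1.0 hour, 100 dBA: 0.25 hours (15 minutes),
# 115 dBA: ~0.008 hours (about 28 seconds)
```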

How do these noise levels compare with what comes out of my headphones? Well, let’s go back to the numbers I gave in Part 1 and Part 2. If we take the measured maximum output levels of the 4 devices listed in Part 2, and calculate what the output level in dB SPL would be through the measured sensitivities of the headphones listed in Part 1 (assuming that everything else was linear and nothing distorted or clipped or became unhappy – and ignoring the fact that the headphones do not have the same impedance as the one I used to do the measurements of the 4 devices… and assuming that the measurements of the headphones are unweighted on that website), then the maximum output levels you can get from those devices are shown in Figure 9.

Fig 9. Calculated maximum output levels in dB SPL for the four devices and three headphones listed in Parts 1 and 2, above. Blue is the Etymotic ER4PT, red is the BeoPlay H6, and black is the Sennheiser HD600.
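By the way, the arithmetic behind Figure 9 is simple if you know a headphone’s sensitivity (in dB SPL for 1 mW of input) and its impedance: convert the device’s maximum output voltage to electrical power, then add that power (in dB) to the sensitivity. Here’s a sketch with made-up example values – the real numbers are the ones in Parts 1 and 2:

```python
import math

def max_spl_db(v_rms_max, impedance_ohms, sensitivity_db_spl_per_mw):
    """Rough maximum output level of a headphone driven by a given source voltage.
    Assumes everything stays linear - no clipping, limiting, or distortion."""
    power_mw = 1000.0 * (v_rms_max ** 2) / impedance_ohms  # electrical power in mW
    return sensitivity_db_spl_per_mw + 10.0 * math.log10(power_mw)

# Hypothetical example: a device that can deliver 1.0 V rms into a 32-ohm
# headphone rated at 100 dB SPL / mW (invented values, just for illustration):
print(round(max_spl_db(1.0, 32.0, 100.0), 1))  # ~114.9 dB SPL
```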

So, if you take the calculations shown in Figure 9 and compare them to the recommendations shown in Figure 8, then you might reach the conclusion that, if you set your volume to maximum (and your tune is full-band pink noise mastered to a constant level of 0 dB FS, and we do a small correction for the A-weighting based on the assumption that the 90 dB SPL headphone measurements listed above are unweighted), then the maximum recommended time that you should listen to your music, according to the federal government in the USA, is as shown in Figure 10.

Fig 10. Recommended maximum exposure time for 3 different headphones connected to 4 different sources playing at maximum volume (based on a large number of assumptions that may or may not be correct).

So, if I only listen to full-bandwidth pink noise at 0 dB FS at maximum volume on my MacBook Pro over my BeoPlay H6’s, then the American government thinks that after 0.1486 seconds, I am damaging my hearing.
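(If you’d like to sanity-check that scary number, just run the NIOSH formula backwards to see what level burns through a whole day’s allowance in 0.1486 seconds:)

```python
import math

# Invert the NIOSH rule: which level uses up the full daily "dose" in 0.1486 s?
t_seconds = 0.1486
level_dba = 85.0 + 3.0 * math.log2(8.0 * 3600.0 / t_seconds)
print(round(level_dba, 1))  # ~137.7 dB - which is why the allowed time is so absurdly short
```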

It seems that my mother was right.

Appendix 2: Why does France care about how loud my headphones are?

This is an interesting question, the answer to which makes sense to some people, and doesn’t make any sense at all to other people – with a likely correlation to your political beliefs and your allegiance to the Destruction of the Tea in Boston. The best answer I’ve read was discussed in this forum, where one poster very astutely pointed out that, in France, the state pays for your medical care. So, France has a right to prevent the French from making themselves go deaf by listening to this week’s top hit on Spotify at (literally) deafening levels. If you live in a place where you have to pay for your own medical care, then you have the right to self-harm and induce your own hearing impairment in order to appreciate the subtle details buried within the latest hip-hop remix of Mr. Achy-Breaky Heart’s daughter “singing” Wrecking Ball while you’re riding on a bus. In France, you don’t.

Appendix 3: Additional Reading

ISVR Consulting’s page

Rohde & Schwarz’s PDF manual for running the EN 50332 test on their equipment

B&O Tech: Visual Analogies to Problems in Audio

#23 in a series of articles about the technology behind Bang & Olufsen loudspeakers

 

Audio people throw words around like “frequency” and “distortion” and “resolution” without wondering whether anyone else in the room (a) understands or (b) cares. One of the best ways to explain things to people who do not understand but do care is to use analogies and metaphors. So, this week, I’d like to give some visual analogies of common problems in audio.

 

Let’s start with a photograph. Assuming that your computer monitor is identical to mine, and the background light in your room is exactly the same as it is in mine, then you’re seeing what I’m seeing when you look at this photo.

[Image: the original photograph]

Let’s say that you, sitting there, looking at this photo is analogous to you, sitting there, listening to a recording on a pair of loudspeakers or over headphones. So what happens when something in the signal path messes up the signal?

 

Perhaps, for example, you have a limited range in your system. That could mean that you can’t play the very low and/or high frequencies because you are listening through a smaller set of loudspeakers instead of a full-range model. Limiting the range of brightness levels in the photo is similar to this problem – so nothing is really deep black or bright white. (We could have an argument about whether this is an analogy to a limited dynamic range in an audio system, but I would argue that it isn’t – since audio dynamic range is limited by a noise floor and a clipping level, which we’ll get to below…) So, the photo below “sounds” like an audio system with a limited range:

[Image: the photo with a limited range of brightness levels]

Of course, almost everything is there – sort of – but it doesn’t have the same depth or sparkle as the original photo.

 

 

Or what if you have a noisy device in your signal chain? For example, maybe you’re listening to a copy of the recording on a cassette tape – or the air conditioning is on in your listening room. Then the result will “sound” like this:

[Image: the photo with added noise]

As you can see, you still have the original recording – but there is an added layer of noise with it. This is not only distracting, but it can obscure some of the more subtle details that are on the same order of magnitude as the noise itself.

 

 

In audio, the quietest music is buried in the noise of the system (either the playback system or the recording system). On the other extreme is the loud music, which can only go so loud before it “clips” – meaning that the peaks get chopped off because the system just can’t go up enough. In other words, the poor little woofer wants to move out of the loudspeaker by 10 mm, but it can only move 4 mm because the rubber holding on to it just can’t stretch any further. In a photo, this is the same as turning up the brightness too much, resulting in too many things just turning white because they can’t get brighter (in the old days of film, this was called “blowing out” the photo), as is shown below.

[Image: the photo with clipped (“blown-out”) highlights]

 

This “clipping” of the signal is what many people mean when they say “distorted” – however, distortion covers a much broader range of problems than just clipping. To be really pedantic, any time the output of a system is not identical to its input, the signal is distorted.
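If you’d like to see the difference between clipping and that broader definition in actual numbers, here’s a small sketch that hard-clips a sine wave. The output is no longer identical to the input (so, pedantically, it’s distorted), and the chopped-off peaks show up as added harmonics:

```python
import numpy as np

fs = 48000                        # sampling rate in Hz
t = np.arange(fs) / fs            # one second of time
x = np.sin(2 * np.pi * 100 * t)   # a clean 100 Hz sine wave

# The "woofer" wants to swing out to +/-1.0, but the system can only manage +/-0.4:
y = np.clip(x, -0.4, 0.4)         # hard clipping: the peaks get chopped off

# Any difference between output and input is, pedantically, distortion:
print(np.max(np.abs(y - x)))      # 0.6 - the part of each peak that was chopped off

# The squared-off peaks appear as harmonics at odd multiples of 100 Hz:
spectrum = np.abs(np.fft.rfft(y)) / len(y)
for harmonic in (1, 2, 3, 5):
    level = 20 * np.log10(spectrum[100 * harmonic] + 1e-12)
    print(100 * harmonic, "Hz:", round(level, 1), "dB")  # 200 Hz stays near silence
```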

 

 

A more common problem that many people face is a modification of the frequency response. In audio, the frequency is (very generally speaking) the musical pitch of the notes you’re hearing. Low notes are low frequencies, high notes are high frequencies. Large engines emit low frequencies, tiny bells emit high frequencies. With light, the frequency of the light wavicle hitting your eyeball determines the colour that you see. Red is a low frequency and violet is a high frequency (see the table on this page for details). So, if you have a pair of headphones that, say, emphasises bass (the low frequencies) more than the other areas, then it’s the same as making the photo more red, as shown below.

[Image: the photo with a red colour shift]

 

 

 

Of course, not all impairments to the audio signal are accidental. Some are the fault of the user who makes a conscious decision to be more concerned with convenience (i.e. how many songs you can fit on your portable player) than audio quality. When you choose to convert your CD’s to a “lossy” format (like MP3, for example), then (as suggested by the description) you’re losing something. In theory, you are losing things that aren’t important (in other words, your computer thinks that you can’t hear what’s thrown away, so you won’t miss it). However, in practice, that debate is up to you and your computer (and your bitrate, and the codec you’ve chosen, and the quality of the rest of your system, and how you listen to music, and what kind of music you’re listening to, and whether or not there are other things to listen to at the same time, and a bunch of other things…) However, if we’re going to make an analogy, then we have to throw away the details in our photo, keeping enough information to be moderately recognisable.

[Image: the photo with reduced resolution]

As you can see, all the colours are still there. And, if you stand far enough away (or if you take off your glasses) it might just look the same. But, if you look carefully enough, then you might notice that something is missing… Keep looking… you’ll see it…

 

 

So, as you can see, any impairment of the “signal” is a disruption of its quality – but we should be careful not to confuse this with reality. There are lots of people out there who have a kind of weird religious belief that, when you sit and listen to a recording of an orchestra, you should be magically transported to a concert hall as if you were there (or as if the orchestra were sitting in your listening room). This is silly. That’s like saying that when you sit and watch a re-run of Friends on your television, you should feel like you’re actually in the apartment in New York with a bunch of beautiful people. Or that when you watch a movie, you should feel like you’re actually in a car chase or a laser battle in space. Music recordings are no more of a “virtual reality” experience than a television show or a film. In all of these cases (the music recording, the TV episode and the film), what you’re hearing and seeing should not be life-like – they should be better than life. You never have to wait for the people in a film to look for a parking space or go out to pee. Similarly, you never hear a mistake in the trumpet solo in a recording of the Berlin Philharmonic, and you always hear Justin Bieber singing in tune. Even the spatial aspects of an “audiophile” classical recording are better-than-reality. If you sit in a concert hall, you can either be close (and hear the musicians much louder than the reverberation) or far (and hear much more of the reverberation). In a recording, you are sitting both near and far – so you have the presence of the musicians and the spaciousness of the reverb at the same time. Better than real life!

So, what you’re listening to is a story. A recording engineer attended a music performance, and that person is now recounting the story of what happened in his or her own style. If it’s a good recording engineer, then the storytelling is better than being there – it’s more than just a “police report” of a series of events.

To illustrate my point, below is a photo of what that sinking WWII bunker actually looked like when I took the photo that I’ve been messing with.

[Image: the unprocessed photo of the sinking WWII bunker, as the scene actually looked]

 

Of course, you can argue that this is a “better” photo than the one at the top – that’s a matter of your taste versus mine. Maybe you prefer the sound of an orchestra recorded with only two microphones and played through two loudspeakers. Maybe you prefer the sound of the same orchestra recorded with lots of microphones and played through a surround system. Maybe you like listening to singers who can sing. Maybe you like listening to singers who need auto-tuners to clean up the mess. This is just personal taste. But at least you should be choosing to hear (or see) what the artist intended – not a modified version of it.

This means that the goal of a sound system is to deliver, in your listening room, the same sound as the recording engineer heard in the studio when he or she did the recording. Just like the photos you are looking at on the top of this page should look exactly the same as what I see when I see the same photo.


B&O Tech: Listening Tips & Tricks

#22 in a series of articles about the technology behind Bang & Olufsen loudspeakers

 

Let’s say that you go to the store and you listen to a pair of loudspeakers with some demo music they have on a shelf there, and you decide that you like the loudspeakers, so you buy them.

Then, you take them home, you set them up, you put on one of your recordings, and you change your mind – you don’t like the loudspeakers.

What happened? Well, there could be a lot of reasons behind this.

 

Tip #1: Loudness

In the last article, I discussed why a “loudness” function is necessary when you change the volume setting while listening to your system. That article opened with the issue of Equal Loudness Contours, which are shown again as a refresher in Figure 1, below.

Fig 1: The Equal Loudness contours for 0 phons (bottom curve) to 90 phons (top curve) in 10 phon increments, according to ISO226.

Let’s say that, when you heard the loudspeakers at the store, the volume knob was set so that, if you had put in a -20 dB FS, 1 kHz sine wave, it would have produced a level of 70 dB SPL at the listening position in the store. Then, let’s say that you go home and set the volume such that it’s about 10 dB quieter than it was when you heard it at the store. This means that, even if you listen to exactly the same recording, and even if your listening room at home were exactly the same as the room at the store, and even if the placement of the loudspeakers and the listening position in your house were exactly the same as at the store, the loudspeakers would sound different at home than at the store.

Figure 2 below shows the difference between the 70 phon curve from Figure 1 (sort of what you heard at the store) and the 60 phon curve (sort of what you hear at home, because you turned down the volume). (To find out which curve is which in Fig 1, the phon value of the curve is its value at 1 kHz.)

 

Fig 2: The normalised difference between the 60 phon curve and the 70 phon curve from Fig 1.

 

As you can see in Figure 2, by turning down the volume by 10 dB, you’ve changed your natural sensitivity to sound – you’re as much as 5 or 6 dB less sensitive to low frequencies and also less sensitive to the high end by a couple of dB. In other words, by turning down the volume, even though you have changed nothing else, you’ve lost bass and treble.

In fact, even if you only turned down the volume by 1 dB, you would get the same effect, just by a different amount, as is shown in Figure 3.

 

Fig 3: The difference between the 67 (red), 68 (blue), and 69 (black) phon curve and the 70 phon curve. Note that these have been normalised to remove the frequency-independent gain differences. Only frequency-dependent sensitivity differences are shown.

 

So, as you can see here, even by changing the volume knob by 1 dB, you change the perceived frequency response of the system by about 0.5 dB in a worst case. The quieter you listen, the less bass and treble you have (relative to the perceived level at 1 kHz).

So, this means that, if you’re comparing two systems (like the loudspeakers at the store and the loudspeakers at home, or two different DAC’s or your system before and after you connect the fancy new speaker wire that you were talked into buying), if you are not listening at exactly the same level, your hearing is not behaving the same way – so any differences you hear may not be a result of the system.

Looking at this a different way, if you were to compare two systems (let’s say a back-to-back comparison of two DAC’s) that had frequency response differences like the ones shown in Figure 3, I would expect that you could hear the difference between them. However, this is YOUR frequency response difference, just by virtue of the fact that you are not comparing them at the same listening level. The kicker here is that, if the difference in level is only 1 dB, you might not immediately hear one as being louder than the other – but you might hear the timbral differences between them… So, unless you’ve used a reliable SPL meter to ensure that they’re the same level, then they’re probably not the same level – unless you’re being REALLY careful – and even then, I’d recommend being more careful than that.
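As a practical aside: if the two things you’re comparing can be captured as files (or recorded with a measurement microphone), you don’t even need an SPL meter to check the match – comparing RMS levels takes a few lines. A sketch (the file names are placeholders, and it assumes scipy is available):

```python
import numpy as np
from scipy.io import wavfile

def rms_dbfs(path):
    """RMS level of a WAV file in dB relative to full scale."""
    _, x = wavfile.read(path)
    if np.issubdtype(x.dtype, np.integer):
        x = x / np.iinfo(x.dtype).max  # normalise integer samples to +/-1.0
    x = np.asarray(x, dtype=np.float64)
    return 20.0 * np.log10(np.sqrt(np.mean(x ** 2)))

# Placeholder file names - substitute your own captures of the two systems:
difference = rms_dbfs("system_a.wav") - rms_dbfs("system_b.wav")
print(f"Level difference: {difference:.2f} dB")  # aim for well under 1 dB before comparing
```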

This is why, when researchers are doing real listening tests, they have to pay very careful attention to the listening level. And, if the purpose of the listening test is to compare two things, then they absolutely must be at the same level. If they aren’t, then the results of the entire listening test can be thrown out the window – they’re worthless.

It’s also why professionals who work in the audio industry like recording engineers, mastering engineers, and re-recording engineers always work at the same listening level. This, in part, ensures that they have consistency in their mixes – in other words, they have the same bass-midrange-treble balance in all their recordings, because they were all monitored at the same listening level.

Tip #2: Recordings

If you were selling your house, and you got a call from your real estate agent that there were some potential buyers coming tomorrow to see your place, you would probably clean up. If you were really keen, not only would you clean up, but you would put out some fresh flowers in a vase, and, half an hour before your “guests” arrived, you’d be pulling a freshly-baked loaf of bread out of the oven (because there’s nothing more welcoming than walking into a house that smells like freshly-baked bread…) You would NOT leave the bathroom in a mess, your bed unmade, dirty dishes in the sink, and yesterday’s dirty socks on the floor. In short, you want your house to look its best – otherwise you won’t get people through the front door (unless the price is REALLY good…)

Similarly, if you worked in a shop selling loudspeakers, part of your job is to sell loudspeakers. This means that you spend a good amount of time listening to a lot of different types of music on the loudspeakers in your shop. Over time, you’ll find that some recordings sound better than others for some reason that has something to do with the interactions between the recordings, the loudspeakers, the room’s acoustics, and your preferences. If you were a smart salesperson, you would make a note of the recordings that sound “bad” (for whatever reason) and you would not play them for potential customers that come into your store. Doing so would be the aural equivalent of leaving your dirty socks on the floor.

So, this means that, if you are the customer in the shop, listening to a pair of loudspeakers that you may or may not buy, you should remember that you’re probably going to be presented with a best-case scenario. At the very least, you should not expect the salesperson to play a recording that makes the loudspeakers sound terrible. Of course, this might mean many things. For example, it might mean that the loudspeakers are GREAT – but if they’re being used to play a really bad recording that you’ve never heard before, then you might think that the reason it sounds bad is the loudspeakers, and not the recording. So, you’ll walk out of the shop hating the loudspeakers instead of the recording.

So, the moral of the story here is simple: if you’re going to a shop to listen to a pair of loudspeakers, bring your own recordings. That way, you know what to expect – and you’ll test the loudspeakers on music that you like. Even if you bring just one CD and listen to just one song – as long as the song is one that you’ve heard A LOT, then you’re going to get a much better idea of how the loudspeakers are behaving than if you let the salesperson choose the demo music. In a perfect reality, you’ll put on your song, and your jaw will drop while you think “I’ve NEVER heard it sound this good!”.

 

Tip #3: Room Acoustics

It goes without saying that the acoustical behaviour of a room has a huge effect on how a loudspeaker sounds (I talked about this a lot in this posting). So does the specific placement of the loudspeakers and the listening position within a room. (I talked about this a lot in this posting). So, this also means that a pair of loudspeakers in a shop’s showroom will NOT sound the same as the same loudspeakers in your house – not even if you’ve aligned the listening levels and you’re playing the same recording. Maybe you have a strong sidewall reflection in your living room that they didn’t have in the showroom. Maybe the showroom is smaller than your living room, so the room modes are at higher frequencies and “helping out” the upper bass  instead of the lower bass. Maybe, in the showroom, the loudspeakers were quite far from the wall behind them, but in your house, you’re going to push the loudspeakers up against the wall. Any of these differences will have massive effects on the sound at the listening position.

Of course, there is only one way around this problem. If you’re buying a pair of loudspeakers, then you should talk to the salesperson about taking a demo pair home for a week or so – so that you can hear how they sound in your room. If you’re buying some other component in the audio chain that might have an impact on the sound, you should ask to take it home and try it out with your loudspeakers.

If you were buying a car, you would take it for a test drive – and you would probably get out of the parking lot of the car dealer when you did so. You have to take it out on the road to see how it feels. The same is true for audio equipment – if you can’t take it home to try it out, make sure that the shop has a good return policy. Just because it sounds good in the shop doesn’t mean that it’s going to sound good in your living room.

 

 

Tip #4: Personal Taste

I like single-malt scotch. Personally, I really like peaty, smoky scotch – other people like other kinds of scotch. There’s a good book by Michael Jackson (no, not that Michael Jackson – another Michael Jackson) that rates scotches. Personally, this is a good book for me, because, not only does he give a little background for each of the distilleries, and a description of the taste of each of the scotches in there – but he scores them according to his own personal ranking system. Luckily for me, Michael Jackson and I share a lot of the same preferences – so if he likes a scotch, chances are that I will too. So, his ranking scores are a pretty good indicator for me. However, if he and I had different preferences, then his ranking system would be pretty useless.

One of my favourite quotations is from Duke Ellington who said “If it sounds good, it is good.” I firmly believe that this is true. If you like the sound of a pair of loudspeakers, then they’re good for you. Any measurement or review in a magazine that disagrees is wrong. Of course, a measurement or a reviewer might be able to point you to something that you haven’t noticed about your loudspeakers (which may make you like them a little more or a little less…) but if you like the way they sound, then there’s no need to apologise for that.

However, remember that, when you read a review, that you are reading the words of someone who also has personal taste. Yes, he/she may have heard many more loudspeakers than you have in many more listening rooms – but they still have personal preference. And, part of that personal preference is a ranking of the categories in which a loudspeaker should perform. Personally, I divide an audio system’s (or a recording’s) qualities into 5 broad categories: 1. Timbral (tone colour), 2. Spatial (i.e. imaging and spaciousness), 3. Temporal (i.e. punch, transient response), 4. Dynamics (not just total dynamic range, but also things like short term “dynamic contrast”) and 5. Distortion & Noise. Each of these has sub-headings in my head – but the relative importance of these 5 qualities are really an issue of my personal preference (and my expectations for a given product – nobody expects an iThing dock to deliver good imaging, for example…). If your personal preference of the weighting of these 5 categories (assuming that you agree with my 5 categories) is different from mine, then we’re going to like different audio systems. And that’s okay. No problem – apart from the minor issue that, if I were a reviewer working for an audio magazine, you shouldn’t buy anything I recommend. I like sushi – you like steak – so if I recommend a good restaurant, you should probably eat somewhere else.

Of course, the fact that I will listen to different music played at a different level in a different listening room than you will might also have an effect on the difference between our opinions.

 

Tip #5: Close your eyes

This one is a no-brainer for people who do “real” listening tests for scientific research – but it still seems to be a mystery to people who review audio gear for a living. If you want to make a fair comparison between two pieces of audio gear, you cannot, under any circumstances, know what it is that you’re comparing. There was a perfect proof of this done by Kristina Busenitz at an Audio Engineering Society convention one year. Throughout the convention, participants were invited to do a listening test on two comparable automotive audio systems. Both were installed in identical cars, parked side-by-side. The two cars were aligned to have identical reproduction levels, and you listened to exactly the same materials to make exactly the same judgements about the systems. You had to sit in the same seat (i.e. front passenger side) for both tests, and you had to do the two evaluations back to back. One car had a branded system in it, the other was unbranded – made obvious by the posters hanging on the wall next to one of the cars. The cars were evaluated by lots of people over the 3 or 4 days of the convention. At the end, the results were processed and it was easily proven that the branded system was judged by a vast majority of the participants to be better than the unbranded system.

There was just one catch – every couple of hours, the staff running the test would swap the posters to the opposite wall. The two cars were actually identical. The only difference was the posters that hung outside them.

So, the vast majority of professional audio engineers agreed, in a completely “fair” test, that the car with the posters (which was the opposite car every couple of hours) sounded better than the one that didn’t.

Of course, what Kristina proved was that your eyes have a bigger effect on your opinion than your ears. If you see more expensive loudspeakers, they’ll probably sound better. This is why, when we’re running listening tests internally at Bang & Olufsen, we hide the loudspeakers behind an acoustically transparent, but visually opaque curtain. We can’t help but be influenced by our pre-formed opinions of products. We’ve even seen that a packing box for a (competitor’s) loudspeaker sitting outside the listening room will influence the results of a blind listening test on a loudspeaker that has nothing to do with the label on the box. (the box was a plant – just to see what would happen).

Tip #6: Are you sure?

One last thing that really should go without saying: If you’re doing a back-to-back comparison of two different aspects of an audio system, be absolutely sure that you’re only comparing what you think you’re comparing. For example, I’ve read of people who do comparisons of things like the following:

  • sending a “bitstream” vs. “PCM” from a Blu-ray or DVD player to an AVR/Surround Processor
  • PCM vs. DSD
  • “normal” resolution vs. “high-resolution” recordings

If you’re making such a comparison, and you plan on making some conclusions, be absolutely sure that the only thing that’s changing in your comparison is what you think you’re comparing. In the three examples I gave above, there are potentially lots of other things changing in your signal path in addition to the thing you’re changing. If you’re not absolutely sure that you’re only changing one thing, then you can’t be sure that the reason you might hear a difference in the things you’re comparing is due to the thing you’re comparing. (Did that make sense?) For example, given the three examples above:

  • Some AVR’s apply different processing to bitstream vs. PCM signals. Some players use the metadata in a bitstream when they convert to PCM, and some don’t. So, the REASON the bitstream and the PCM signals might sound different is not necessarily the signals themselves, but how the gear treats them. (see this posting for more information on this)
  • Some DAC’s (meaning the chip on the circuit board inside the box that you have in your gear) apply different filters to a DSD signal than a PCM signal. Some have a different gain on the DSD signal (some “high resolution” software-based players also apply different gains to DSD and PCM signals). So, don’t just switch from DSD to PCM and think that, because you can hear a difference, that difference is the difference in DSD and PCM. It might just be your Equal Loudness Contours playing tricks on you.
  • Some DAC’s (see previous point for my current definition of “DAC”) apply different filters to signals at different sampling rates. Don’t judge two recordings you bought at different sampling rates and think that the only difference is the sampling rate. The gear that you’re using to play the files might behave differently at different rates.

And so on.

A good analogy to this is to go to a coffee shop and buy two cups of coffee – one medium roast and one dark roast. While you’re not looking, I’ll add a little sugar to the dark roast cup – and I’ll bribe the person that made your coffee to make the medium one a couple of degrees colder than the dark one. You taste both, and you decide that dark roast is better than medium roast. But is your decision valid? No. Maybe you like sugar in your coffee. Maybe you prefer hotter coffee.  Be careful how you make up your mind…

 

Summary

So, to wrap up, there are (at least) six things to remember when you’re shopping for audio gear:

  1. If you’re comparing systems, make sure that you’re listening at the same level.
  2. Always listen to a system using a recording with which you’re familiar – even if it’s a bad recording. Better something you know than something you don’t.
  3. Evaluate a system that you’re planning on buying in your own listening room.
  4. Don’t let anyone tell you what sounds good or bad. Ask them what they are listening to and for in a recording or a sound system – but decide for yourself what you like.
  5. If the listening test isn’t blind, it’s not worth much. Don’t even trust your own ears if you know what you’re listening to – your ears are easily fooled by your eyes and your pre-conceived notions. And you’re not alone.
  6. Be very sure that if you’re comparing two things, then the things you think you’re comparing are the only things that you’re comparing.

 

B&O Tech: What is “Loudness”?

#21 in a series of articles about the technology behind Bang & Olufsen loudspeakers

Part 1: Equal Loudness Contours

Let’s start with some depressing news: You can’t trust your ears. Sorry, but none of us can.

There are lots of reasons for this, and the statement is actually far more wide-reaching than any of us would like to admit. However, in this article, we’re going to look at one small aspect of the statement, and what we might be able to do to get around the problem.

We’ll begin with a thought experiment (although, for some of you, this may be an experiment that you have actually done). Imagine that you go into the quietest room that you’ve ever been in, and you are given a button to press and a pair of headphones to put on. Then you sit and wait for a while until you calm down and your ears settle in to the silence… While that’s happening you read the instructions of the task with which you are presented:

Whenever you hear a tone in the headphones in either one of your ears, please press the button.

Simple! Hear a beep, press the button. What could be more difficult to do than that?

Then, the test begins: you hear a beep in your left ear and you press the button. You hear another, quieter beep and you press the button again. You hear an even quieter beep and you press the button. You hear nothing, and you don’t press the button. You hear a beep and you press the button. Then you hear a beep at a lower frequency and so on and so on. This goes on and on at different levels, at different frequencies, in your two ears, until someone comes in the room and says “thank you, that will be all”.

While this test seems like it would be pretty easy to do, it’s a little unnerving. This is because the room that you’re sitting in is so quiet and the beeps are also so quiet that, sometimes you think you hear a beep – but you’re not sure, because things like the sound of your heartbeat, and your breathing, and the “swooshing” of blood in your body, and that faint ringing in your ears, and the noise you made by shifting in your chair are all, relatively speaking VERY loud compared to the beeps that you’re trying to detect.

Anyways, when you’re done, you might be presented with a graph that shows something called your “threshold of hearing”. This is a map of how loud a particular frequency has to be in order for you to hear it. The first thing that you’ll notice is that you are less sensitive to some frequencies than others. Specifically, a very low frequency or a very high frequency has to be much louder for you to hear it than if you’re listening to a mid-range frequency. (There are evolutionary reasons for this that we’ll discuss at the end.) Take a look at the bottom curve in Figure 1, below:

Fig 1: The threshold of hearing (bottom curve) and the Equal Loudness contours for 70 phons (red curve) and 90 phons (top curve) according to ISO226.

The bottom curve on this plot shows a typical result for a threshold of hearing test for a person with average hearing and no serious impairments or temporary issues (like wax build-up in the ear canal).  What you can see there is that, for a 1 kHz tone, your threshold of hearing is 0 dB SPL (in fact, this is how 0 dB SPL is defined…) As you go lower in frequency from there, you will have to turn up the actual signal level just in order for you to hear it. So, for example, you would need to have approximately 60 dB SPL at 30 Hz in order to be able to detect that something is coming out of your headphones or loudspeakers. Similarly, you would need something like 10 dB SPL at 10 kHz in order to hear it. However, at 3.5 kHz, you can hear tones that are quieter than 0 dB SPL! It stands to reason, then, that a 30 Hz tone at 60 dB SPL and a 1 kHz tone at 0 dB SPL and a 3.5 kHz tone at about -10 dB SPL and a 10 kHz tone at about 10 dB SPL would all appear to have the same loudness level (since they are all just audible).

Let’s now re-do the test, but we’ll change the instructions slightly. I’ll give you a volume knob instead of a button and I’ll play two tones at different frequencies. The volume knob only changes the level of one of the two tones, and your task is to make the two tones the same apparent level. If you do this over and over for different frequencies, and you plot the results, you might wind up with something like the red or the top curves in Fig 1. These are called “Equal Loudness Contours” (some people call them “Fletcher-Munson Curves” because the first two researchers to talk about them were Fletcher and Munson) because they show how loud different frequencies have to be in order for you to think that they have the same loudness. So, (looking at the red curve) a 40 Hz tone at 100 dB SPL sounds like it’s the same loudness as a 1 kHz tone at 70 dB SPL or a 7.5 kHz tone at 80 dB SPL. The loudness level that you think you’re hearing is measured in “phons” – and the phon value of the curve is its value in dB SPL at 1 kHz. For example, the red curve crosses the 1 kHz line at 70 dB SPL, so it’s the “70 phon” curve. Any tone that has an actual level in dB SPL that corresponds to a point on that red line will have an apparent loudness of 70 phons. The top curve is the 90 phon curve.

Figure 2 shows the Equal Loudness Contours from 0 phons (the Threshold of Hearing) to 90 phons in steps of 10 phons.

Fig 2: The Equal Loudness contours for 0 phons (bottom curve) to 90 phons (top curve) in 10 phon increments, according to ISO226.

There are two important things to notice about these curves. The first is that they are not “flat”. In other words, your ears do not have a flat frequency response. In fact, if you were measured the same way we measure microphones or loudspeakers, you’d have a frequency response specification that looked something like “20 Hz – 15 kHz ±30 dB” or so… This isn’t something to worry about, because we all have the same problem. So, this means that the orchestra conductor asked the bass section to play louder because he’s bad at hearing low frequencies, and the recording engineer balancing the recording adjusted the bass-to-midrange-to-treble relative levels using his bad hearing, and, assuming that the recording system and your playback system are reasonably flat-ish, then hopefully, your hearing is identically bad to the conductor and recording engineer, so you hear what they want you to.

However, I said that there are two things to notice – that was just the first thing. The second thing is that the curves are different at different levels. For example, if you look at the 0 phon curve (the bottom one) you’ll see that it rises a lot more in the low frequency region than, say, the 90 phon curve (the top one), relative to their mid-range values. This means that, the quieter the signal, the worse your ability to hear bass (and treble). For example, let’s take the curves and assume that the 70 phon line is our reference – so we’ll make that one flat, and adjust all of the others accordingly and plot them so we can see their differences. That’s shown in Figure 3.

Fig 3: The Equal Loudness contours for 0 phons (bottom curve) to 90 phons (top curve) in 10 phon increments, according to ISO226. These have all been normalised to the 70 phon curve and subsequently inverted.

What does Figure 3 show us, exactly? Well, one way to think of it is to go back to our “recording engineer vs. you” example. Let’s say that the recording engineer who did the recording set the volume knob in the recording studio so that (s)he was hearing the orchestra with a loudness at the 70 phon line. In other words, if the orchestra was playing a 1 kHz sine tone, then the level of the signal was 70 dB SPL at the listening position – and all other frequencies were balanced by the conductor and the engineer to appear to sound the same level as that. Then you take the recording home and set the volume so that you’re hearing things at the 30 phon level (because you’re having a dinner party and you want to hear the conversation more than you want to hear Beethoven or Justin Bieber, depending on your taste or lack thereof). Look at the curve that intersects the -40 dB line at 1 kHz (the 4th one from the bottom) in Figure 3. This shows you your sensitivity difference relative to the recording engineer’s in this example. The curve slopes downwards – meaning that you can’t hear bass as well – so, your recording playing in the background will appear to have a lot less bass and a little less treble than what the recording engineer heard – just because you turned down the volume. (Of course, this may be a good thing, since you’re having dinner and you probably don’t want to be distracted from the conversation by thumpy bass and sparkly high frequencies.)

Part 2: Compensation

In order to counter-act this “misbehaviour” in your hearing, we have to change the balance of the frequency bands in the opposite direction to what your ears are doing. So if we just take the curves in Figure 3 and flip each of them upside down, you have a “perfect” correction curve showing that, when you turn down the volume by, say 40 dB (hint: look at the value at 1 kHz) then you’ll need to turn up the low end by lots to compensate and make the overall balance sound the same.

Fig 4: The Equal Loudness contours for 0 phons (bottom curve) to 90 phons (top curve) in 10 phon increments, according to ISO226. These have all been normalised to the 70 phon curve.

Of course, these curves shown in Figure 4 are normalised to one specific curve – in this case, the 70 phon curve. So, if your recording engineer was monitoring at another level (say, 80 phons) then your “perfect” correction curves will be wrong.

And, since there’s no telling (at least with music recordings) what level the recording and mastering engineers used to make the recording that you’re listening to right now (or the one you’ll hear after this one), then there’s no way of predicting what curve you should use to  do the correction for your volume setting.

All we can really say is that, generally, if you turn down the volume, you’ll have to turn up the bass and treble to compensate. The more you turn down the volume, the more you’ll have to compensate. However, the EXACT amount by which you should compensate is unknown, since you don’t know anything about the playback (or monitoring) levels when the recording was done. (This isn’t the same for movies, since re-recording engineers are supposed to work at a fixed monitoring level which should be the same as all the cinemas in the world… in theory…)

This compensation is called “loudness” – although in some cases it would be better termed “auto-loudness”. In the old days, a “loudness” switch was one that, when engaged, increased the bass and treble levels for quiet listening. (Of course, what most people did was hit the “loudness” switch and leave it on forever.) Nowadays, however, this is usually automatically applied and has different amounts of boost for different volume settings (hence the “auto-” in “auto-loudness”). For example, if you look at Figure 5 you’ll see the various amounts of boost applied to the signal at different volume settings of the BeoPlay V1 / BeoVision 11 / BeoSystem 4 / BeoVision Avant when the default settings have not been changed. The lower the volume setting, the higher the boost.

Fig 5: The equalisation applied by the “Loudness” function at different volume settings in the BeoPlay V1, BeoVision 11, BeoSystem 3 and BeoVision Avant. Note that these are the default settings and are customisable by the user.
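Just to illustrate the principle (this is NOT the actual filter design behind Figure 5 – those curves are product-specific, and the numbers below are invented), an auto-loudness function can be sketched as a low-shelving boost whose gain grows as the volume drops below some reference. A sketch, assuming scipy is available:

```python
import numpy as np
from scipy.signal import lfilter

def low_shelf(gain_db, f0, fs, q=0.707):
    """Low-shelving biquad coefficients (RBJ 'Audio EQ Cookbook' formulas)."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    cosw = np.cos(w0)
    sq = 2.0 * np.sqrt(a_lin) * alpha
    b = np.array([a_lin * ((a_lin + 1) - (a_lin - 1) * cosw + sq),
                  2 * a_lin * ((a_lin - 1) - (a_lin + 1) * cosw),
                  a_lin * ((a_lin + 1) - (a_lin - 1) * cosw - sq)])
    a = np.array([(a_lin + 1) + (a_lin - 1) * cosw + sq,
                  -2 * ((a_lin - 1) + (a_lin + 1) * cosw),
                  (a_lin + 1) + (a_lin - 1) * cosw - sq])
    return b / a[0], a / a[0]

def bass_boost_db(db_below_reference):
    """Toy auto-loudness rule: 0.4 dB of bass shelf per dB below the
    reference volume, capped at +15 dB. (Invented numbers.)"""
    return min(15.0, 0.4 * max(0.0, db_below_reference))

fs = 48000
attenuation_db = 30.0                            # listening 30 dB below the reference level
b, a = low_shelf(bass_boost_db(attenuation_db), f0=120.0, fs=fs)

x = np.random.randn(fs)                          # one second of test noise
y = lfilter(b, a, x) * 10.0 ** (-attenuation_db / 20.0)  # boost the bass, then turn down
```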

Of course, in a perfect world, the system would know exactly what the monitoring level was when they did the recording, and the auto-loudness equalisation would change dynamically from recording to recording. However, until there is metadata included in the recording itself that can tell the system information like that, there will be no way of knowing how much to add (or subtract).

Historical Note

I mentioned above that the extra sensitivity we have in the 3 kHz region is there due to evolution. In fact, it’s a natural boost applied to the signal hitting your eardrum as a result of the resonance of the ear canal. We have this boost (I guess, more accurately, we have this ear canal) because, if you snap a twig or step on some dry leaves, the noise that you hear is roughly in that frequency region. So, once-upon-a-time, when our ancestors were something else’s lunch, the ones with the ear canals and the resulting mid-frequency boost were more sensitive to the noise of a sabre-toothed tiger trying to sneak up behind them, stepping on a leaf, and had a little extra head start when they were running away. (It’s like the T-shirt that you can buy when you’re visiting Banff, Alberta says: “I don’t need to run faster than the bear. I just need to run faster than you.”)

As an interesting side note to this: the end result is that our language has evolved to use this sensitive area. The consonants in our speech – the “s” and “t” sounds, for example – sit right in that sensitive region to make ourselves easiest to understand.

Warning note

You might come across some YouTube video or a downloadable file that lets you “check your hearing” using a swept sine wave. Don’t bother wasting your time with this. Unless the headphones that you’re using (and everything else in the playback chain) are VERY carefully calibrated, you can’t trust anything about such a demonstration. So don’t bother.

Warning note #2 – Post script…

I just saw on another website here that someone named John Duncan made the following comment about what I wrote in this article: “Having read it a couple of times now, tbh it feels like it is saying something important, I’m just not quite sure what. Is it that a reference volume is the most important thing in assessing hifi?” The answer to this is “Exactly!” If you compare two sound systems (say, two different loudspeakers, or two different DAC’s, or two different amplifiers, and so on), the moral of the stuff I talk about above is that, not only do you have to make sure that you only change one thing in the system (for example, don’t compare two DAC’s using a different pair of loudspeakers connected to each one), but you absolutely must ensure that the two things you’re comparing are at EXACTLY the same listening level. A difference of 1 dB will have an effect on your “frequency response” and make the two things sound like they have different timbral balances – even when they don’t.

For example, when I’m tuning a new loudspeaker at work, I always work at the same fixed listening level. (For me, this is two channels of -20 dB FS full-band uncorrelated pink noise producing 70 dB SPL, C-weighted, at the listening position.) Before I start tuning, I set the level to match this so that I don’t get deceived by my own ears. If I tuned loudspeakers quieter than this, I would push up the bass to compensate. If I tuned louder, then I would reduce the bass. This gives me some consistency in my work. Of course, I check to see how the loudspeakers sound at other listening levels, but, when I’m tuning, it’s always at the same level.
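For what it’s worth, a calibration signal like that is easy to generate: synthesise pink noise in the frequency domain (random phase, with the magnitude falling as 1/√f so that the power falls as 1/f), scale it to -20 dB FS RMS, and make two independent draws for the two uncorrelated channels. A sketch (my own, not B&O’s tool):

```python
import numpy as np

def pink_noise(n_samples, fs=48000, seed=None):
    """Full-band pink noise via frequency-domain synthesis:
    random phase, magnitude proportional to 1/sqrt(f), so power falls as 1/f."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(n_samples, 1.0 / fs)
    magnitude = np.zeros_like(freqs)
    magnitude[1:] = 1.0 / np.sqrt(freqs[1:])      # skip the DC bin
    phase = rng.uniform(0.0, 2.0 * np.pi, len(freqs))
    return np.fft.irfft(magnitude * np.exp(1j * phase), n_samples)

def scale_to_dbfs(x, target_dbfs=-20.0):
    """Scale a signal so that its RMS level sits at the target, in dB FS."""
    target_rms = 10.0 ** (target_dbfs / 20.0)
    return x * target_rms / np.sqrt(np.mean(x ** 2))

fs = 48000
left = scale_to_dbfs(pink_noise(10 * fs, fs))    # two independent draws give
right = scale_to_dbfs(pink_noise(10 * fs, fs))   # two uncorrelated channels
# Play these, then trim the playback gain until a C-weighted SPL meter
# reads 70 dB SPL at the listening position - and then start tuning.
```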