Traditions

One of my favourite pithy quotes is ‘Tradition is just peer pressure from dead people’. When you start looking at some of the things we ‘just do’, you start asking yourself ‘why, exactly?’

For example, when you attend a wedding, you’ll see the bride standing on the groom’s left. This is so that he can use his sword to fight off her family as he carries away from the town over his left shoulder.

Another example is the story that’s often told about how the distance between railway tracks is related to the width of a horse’s ass.

There’s a similar thing that happens in multichannel audio systems. When people ask me what I would recommend for loudspeakers when building a multichannel (or ‘surround’) system, I always start with the ITU’s Recommendation BS.775 which says that you should use matching loudspeakers all-round. Of course, almost no one does this, so the next best thing is to say something like the following:

  • use big loudspeakers for your Left Front and Right Front
  • use smaller (but matched) loudspeakers for your surround channels (including back and height channels)
  • make some intelligent choice about your Centre Front loudspeaker
    (which is not terribly helpful, but there are many issues to consider when thinking about your centre front loudspeaker)

This raises a question:

‘Why is it okay to use smaller, less capable loudspeakers for the surround channels?’

The simple answer to this is that for most materials, there isn’t as much signal in the surround channels, and there’s certainly less low-frequency, high-level content.

However, let’s keep asking questions:

‘Why isn’t there more content (in terms of both bandwidth and level) in the surround and height channels?’

The answer to this is that surround sound (like stereo, which is in effect the same thing) originated with movies. The first big blockbuster that was released in Dolby Stereo (later re-branded as Dolby Surround) was Star Wars in 1977 or so. Dolby Stereo was a 4-2-4 ‘encoding’ system that relied heavily on M-S encoding and decoding. If I over-simplify this a little, then the basic idea was:

  • The Centre channel was sent to both the Left and Right channels on the film
  • Left channel was sent to Left
  • Right channel was sent to Right
  • The Surround channel (there was only one) was mixed into the Left and Right channels in opposite polarity (aka ‘out of phase’)

So, the re-recording engineer (the film world’s version of a recording engineer) mixed in a 4-channel world: Left, Right, Centre, and Surround, but the film only contained two channels: Left Total and Right Total (with the Centre and Surround content mixed in them).

When the film was shown in a theatre with a Dolby Stereo decoder, the two channels on the film were ‘decoded’ to the original four channels and send out to the loudspeakers in the cinema.

This was a great concept based on an old idea, since M-S processing was part of Blumlein’s original patent for stereo back in 1931. When you’re looking at a two-channel stereo signal, you can think of it as independent Left and Right channels. However, usually, if you look at the content, the two channels contain related information. For example, the lead vocal of almost every pop tune is identical in the Left and Right channels so that its phantom image appears in the centre.

So, another way of thinking of the same two-channel stereo signal is by considering the two channels as

  • ‘M’ (for Mid or Mono, depending on which book you read):
    the signal that is identical in the Left and Right
  • ‘S’ (for Side or Stereo):
    the signal that is identical except in opposite polarity in the Left and Right

For example, FM Stereo is not sent as Left and Right channels, it’s sent as M and S channels. There’s less bandwidth and less level in the S component, so when you lose the FM signal to your antenna, the first thing to go is the S, and you’re left with a Mono radio station.

Wait… there’s that ‘less bandwidth and less level in the S component’ again – just like what I said above about the surround channels in a surround system.

Let’s back up a little to vinyl records. A groove of a vinyl record is a 90º cut, with the needle resting gently on both walls of that trough. If the left wall moves up and down (on a 45º angle to the surface of the vinyl) then the needle bounces up and down with it, but only for that left channel. In other words, it slides along the right wall of the tough.

When a signal is the same in the left and right channels on a vinyl record (the M-component, like the lead vocal) then, when one side of the groove pushes the needle UP, the other side drops DOWN. This means that the M-component signal results in the needle moving horizontally (or laterally), in parallel with the surface of the disc. Signals in the S-component (when the Left and Right channels are ‘out of phase’) result in the two walls moving upwards and downwards together, pushing the needle vertically.

The reason for this is that the old mono shellac discs used laterally-cut grooves, and the reason for this was (apparently) that Emile Berliner was getting around a Thomas Edison patent. Also, by making the needle sensitive to lateral movements, it was less sensitive to vibrations caused by footsteps, which primarily cause the gramophone to vibrate vertically. When they made the first two-channel discs, it was smart to make the format backwards-compatible with Berliner’s existing gramophones.

So, if you have a lot of level and a lot of low-frequency content in the signal on a vinyl record, it causes the needle to jump up and down, and it will likely get thrown out of the groove and cause the record to skip. This is why the bass on vinyl records has to be monophonic, even though the record itself is two-channel stereo. Mono bass causes the needle to wiggle left-right, but not up-down.

So, the historically-accurate answer to explain why it’s okay to use smaller loudspeakers for most of the outputs in a modern 7.1.4 system is that we are maintaining compatibility with a format from 1892.

The original Loudness War

The June, 1968 issue of Wireless World magazine includes an article by R.T. Lovelock called “Loudness Control for a Stereo System”. This article partly addresses the issue of resistance behaviour one or more channels of a variable resistor. However, it also includes the following statement:

It is well known that the sensitivity of the ear does not vary in a linear manner over the whole of the frequency range. The difference in levels between the threshold of audibility and that of pain is much less at very low and very high frequencies than it is in the middle of the audio spectrum. If the frequency response is adjusted to sound correct when the reproduction level is high, it will sound thin and attenuated when the level is turned down to a soft effect. Since some people desire a high level, while others cannot endure it, if the response is maintained constant while the level is altered, the reproduction will be correct at only one of the many preferred levels. If quality is to be maintained at all levels it will be necessary to readjust the tone controls for each setting of the gain control

The article includes a circuit diagram that can be used to introduce a low- and high-frequency boost at lower settings of the volume control, with the following example responses:

These days, almost all audio devices include some version of this kind of variable magnitude response, dependent on volume. However, in 1968, this was a rather new idea that generated some debate.

In the following month’s issue The Letters to the Editor include a rather angry letter from John Crabbe (Editor of Hi-Fi News) where he says

Mr. Lovelock’s article in your June issue raises an old bogey which I naively thought had been buried by most British engineers many years ago. I refer, not to the author’s excellent and useful thesis on achieving an accurate gain control law, but to the notion that our hearing system’s non-linear loudness / frequency behaviour justifies an interference with response when reproducing music at various levels.

Of course, we all know about Fletcher-Munson and Robinson-Dadson, etc, and it is true that l.f. acuity declines with falling sound pressure level; though the h.f. end is different, and latest research does not support a general rise in output of the sort given by Mr. Lovelock’s circuit. However, the point is that applying the inverse of these curves to sound reproduction is completely fallacious, because the hearing mechanism works the way it does in real life, with music loud or quiet, and no one objects. If `live’ music is heard quietly from a distant seat in the concert hall the bass is subjectively less full than if heard loudly from the front row of the stalls. All a `loudness control’ does is to offer the possibility of a distant loudness coupled with a close tonal balance; no doubt an interesting experiment in psycho-acoustics, but nothing to do with realistic reproduction.

In my experience the reaction of most serious music listeners to the unnaturally thick-textured sound (for its loudness) offered at low levels by an amplifier fitted with one of these abominations is to switch it out of circuit. No doubt we must manufacture things to cater for the American market, but for goodness sake don’t let readers of Wireless World think that the Editor endorses the total fallacy on which they are based.

with Lovelock replying:

Mr. Crabbe raises a point of perennial controversy in the matter of variation of amplifier response with volume. It was because I was aware of the difference in opinion on this matter that a switch was fitted which allowed a variation of volume without adjustment of frequency characteristic. By a touch of his finger the user may select that condition which he finds most pleasing, and I still think that the question should be settled by subjective pleasure rather than by pure theory.

and

Mr. Crabbe himself admits that when no compensation is coupled to the control, it is in effect a ‘distance’ control. If the listener wishes to transpose himself from the expensive orchestra stalls to the much cheaper gallery, he is, of course, at liberty to do so. The difference in price should indicate which is the preferred choice however.

In the August edition, Crabbe replies, and an R.E. Pickvance joins the debate with a wise observation:

In his article on loudness controls in your June issue Mr. Lovelock mentions the problem of matching the loudness compensation to the actual sound levels generated. Unfortunately the situation is more complex than he suggests. Take, for example, a sound reproduction system with a record player as the signal source: if the compensation is correct for one record, another record with a different value of modulation for the same sound level in the studio will require a different setting of the loudness control in order to recreate that sound level in the listening room. For this reason the tonal balance will vary from one disc to another. Changing the loudspeakers in the system for others with different efficiencies will have the same effect.

In addition, B.S. Methven also joins in to debate the circuit design.

The debate finally peters out in the September issue.

Apart from the fun that I have reading this debate, there are two things that stick out for me that are worth highlighting:

  • Notice that there is a general agreement that a volume control is, in essence, a distance simulator. This is an old, and very common “philosophy” that we forget these days.
  • Pickvance’s point is possibly more relevant today than ever. Despite the amount of data that we have with respect to equal loudness contours (aka “Fletcher and Munson curves”) there is still no universal standard in the music industry for mastering levels. Now that more and more tracks are being released in a Dolby Atmos-encoded format, there are some rules to follow. However, these are very different from 2-channel materials, which have no rules at all. Consequently, although we know how to compensate for changes in response in our hearing as a function of level, we don’t know what the reference level should be for any given recording.

3-channel vinyl

Another gem of historical information from the Centennial Issue of the JAES in 1977.

This one is from the article titled “The Recording Industry in Japan” by Toshiya Inoue of the Victor Company of Japan. In it, you can find the following:

Notice that this describes a 3-channel system developed by the Victor Company using FM with a carrier frequency of 24 kHz and a modulation of ±4kHz to create a third channel on the vinyl. The resulting signal had a bandwidth of 50 Hz to 5 kHz and a SNR of 47 dB.

Interestingly, this was developed from 1961-1965: starting 9 years before CD-4 quadraphonic was introduced to the market, which used the same basic principle of FM modulation to encode the extra channels.

Phantom imaging

The July 1968 issue of Wireless World Magazine contains a description of an early, but interesting analysis of the relationship between phantom image placement in a 2-channel stereo system and interchannel level differences. This is an old favourite topic of mine, originally inspired by the work of Michael Williams and his “Stereophonic Zoom”, and extending to my first AES paper in 1999.

If you, like me, are interested in this (for example, if you’re making a panning algorithm or you’re testing the veracity of headphone-based “virtual” systems), some important figures from that article are shown below.

The typical way of showing the relationship between IAD and phantom image placement.
This one is interesting because it shows the different results in different rooms, (which would also be influenced by loudspeaker directivity.)

Note that, for the plots above and below, the x-axes show the position of the image in the stereo sound stage, where 0 is the centre point between the two loudspeakers and 0.5 is a position in one of the two loudspeakers. This is 0.5 because it’s one-half of the total angular distance between the two loudspeakers. So, you can consider the loudspeaker aperture as ±0.5.

The relationship between image WIDTH and position. This is something I’ve not seen expressed so clearly before.

For more information similar to this, see these links as a start:

Some predictions come true

In the April, 1968 issue of Wireless World, there is a short article titled “P.C.M. Copes with Everything”

It’s interesting reading the 57-year old predictions in here. One has proven to be not-quite-correct:

While 27 levels are quite adequate for telephonic speech,
211 or 212 need to be used for high quality music.

I doubt that anyone today would be convinced that 11- or 12-bit PCM would deserve the classification of “high quality”. Although some of my earliest digital recordings were made on a Sony PCM 2500 DAT machine, with an ADC that was only reliable down to about 12 or 13 bits, I wouldn’t try to pass those off as “high quality” recordings.

But, towards the end of the article, it says:

The closing talk was given by A. H. Reeves, the inventor of p.c.m. Letting his imagination take over, he spoke of a world in the not too distant future where communication links will permit people to carry out many jobs from the comfort of their homes, conferences using closed-circuit television etc. For this, he said, reliable links capable of bit rates of the order of 109 or 1010 bits will be required. Light is the most probable answer.

Impressive that, in 1968, Reeves predicted fibre optic connections to our houses and the ability to sit at home on Teams meetings (or Facetime or Zoom or Skype, or whatever…)

B&O Tech: Reading Spec’s – Part 2

#96 in a series of articles about the technology behind Bang & Olufsen loudspeakers

Introduction

It’s been a long time (about 11 years or so…) since I wrote Part 1 in this “series”, so it’s about time that I came out with a Part 2. This one is about the ‘Maximum Sound Pressure Level (SPL)’ and the ‘Bass Capability’ values that are shown for each loudspeaker model on the Bang & Olufsen website.

Before I explain either of those numbers, we need to discuss the fact that B&O loudspeakers are fully-active. This means that all the signal processing, including simple things like the volume control and more complicated things like filtering and crossovers for the loudspeaker drivers happen in a digital signal processing (DSP) chain before the amplifiers, which are individually connected to the loudspeaker drivers. (In other words, if you see a woofer and a tweeter, then there are two amplifiers inside the loudspeaker, one for each.)

That DSP chain includes even more complicated features that help to protect the loudspeaker from abuse. This means that, even if you’re playing a signal that’s been mastered at a high level, and you’ve cranked up the volume control, the processor prevents things like:

  • letting the loudspeaker drivers exceed their maximum excursions
  • letting the amplifiers go beyond their voltage or current capabilities
  • letting the power supply try to deliver more current than it can to the entire system
  • letting the loudspeaker’s internal components get so hot that things start to melt.

(None of this means that it’s impossible to break the loudspeaker. It just means that you’d have to try a lot harder than you would with a lot of other companies’ loudspeakers.)

One important side-effect of all of those protection algorithms is that, when you play a loud signal at maximum volume, the loudspeaker will be constantly trying to protect itself. Therefore its maximum output level will vary over time, depending on the signal you’re playing and things like the temperatures of its various individual components.

This, in turn, makes it difficult to state what the “Maximum Sound Pressure Level” will be, since it will change over time with different conditions.

On the other hand, it’s necessary to give a number that states the maximum Sound Pressure Level of each loudspeaker for lots of reasons. It’s also necessary that we use the same procedure to do the measurement so that the values can be compared from loudspeaker to loudspeaker.

The method

So, how do we balance these two things? The answer is to make the measurement short enough that we show the maximum output of the loudspeakers when it’s hitting its limits without being affected by a build-up of heat. This can give you an idea of how loud a short-term signal (like the punch of a kick drum or a snare drum hit) can play: the Maximum SPL. Whatever that number is, the loudspeaker definitely can’t play louder than it (since the amplifier can’t deliver more current and the loudspeaker drivers can’t move in and out any further) but that doesn’t necessarily mean it can play at that level continuously.

The way we measure both the Maximum SPL and the Bass Capability is by placing a microphone 1 m in front of the loudspeaker, and then putting in a short ‘burst’ of 5 periods of a sinusoidal tone at a given frequency. The sound pressure level of the output is measured at the microphone’s position, we wait long enough for everything to cool down, the level of the incoming signal is increased, and then we do the measurement again. This is repeated until the output signal’s level is being automatically reduced by the loudspeaker’s protection algorithms by a pre-determined amount (-6 dB).

If we were a company that made passive loudspeakers, a normal way to do this would be to increase the level until we reached a pre-determined level of total harmonic distortion (say, 10% or 20% THD, for example). However, this won’t work for a B&O loudspeaker because the protection algorithms probably won’t allow the product to distort enough to have a usable threshold.

What’s the difference?

Generally, the method of measuring both the Maximum SPL and the Bass Capability values are the same. The only difference is the range of frequencies that are used for each.

The Bass Capability shows the maximum SPL of the loudspeaker when the input signal is a 50 Hz sinusoidal wave.*

The Maximum Sound Pressure Level is an average of the maximum SPL of the loudspeaker when it is measured using a number of sinusoidal signals ranging from 200 Hz to 2 kHz. Each frequency is measured individually, and the resulting maxima are averaged to produce a single value. (If you’re read Part 1, then the frequency range of this measurement will look familiar.)

How does this correspond to real life?

This is a difficult question to answer, since the measurement is done on-axis to (or ‘directly in front of’) the loudspeaker in the measurement room. This measurement room is different from a ‘normal’ living room, where more of the total power of the loudspeaker that’s radiated in all three dimensions is reflected back to the listening position. This is the reason why some companies list the maximum output level of their loudspeakers with two numbers: one in a ‘free field’ (a room or ‘field’ that is ‘free’ of reflections) and the other in a ‘listening room’ (which may or may not be like your listening room). You’ll probably see that the ‘listening room’ SPL is higher than the ‘free field’ SPL because the room is reflecting more energy back to the measurement microphone, if nothing else…

In other words ‘results may vary’. So, the maximum SPL of a loudspeaker in your living room may not be the same as the Maximum SPL that B&O lists on its website. Time frames are different, signals are different, and rooms are different: and all of these have significant effects on the result.

What happens when I have more than one loudspeaker?

Generally speaking, if your loudspeakers are reasonably far apart, then you can use a simple rule to calculate the maximum SPL if you add more loudspeakers.

+ 3 dB per doubling

In other words, if you have a loudspeaker that can hit 100 dB SPL, and you add a second loudspeaker, then you’ll hit 103 dB SPL. If you then add two more loudspeaker (another doubling of the total number) you’ll hit 106 dB SPL.

This rule is based on a number of assumptions:

  • the loudspeakers are all the same type
  • the loudspeakers are in the same room, but fairly far apart
  • the loudspeakers are all playing their maximum output levels at the same time
  • I’m ignoring room modes, which might make things louder or quieter, depending on the frequency that you’re playing, the placements of the loudspeakers, and the location of the listening position
  • we’re ignoring other protection algorithms like thermal protection

The reason that this rule is a basic one: we’re assuming that every time you double the number of speakers, you double the total power at the listening position (which is a reasonable assumption if the list of assumptions above are true). Two times the acoustic power is the same as an increase of +3 dB SPL (because 10 log10(2) = 3).

If, however, the frequency was very low, and the loudspeakers were very close together, and they were playing exactly the same signals at exactly the same time, you might make the argument that you can say that there is a +6 dB increase for every doubling of loudspeakers, because it’s their amplitudes (and not their acoustic powers) that are added.

Neither of these two basic assumptions is correct, and so the real number is probably between +3 and +6 dB per doubling of loudspeakers, and it will be different for different frequency bands and different loudspeaker separations. However, it’s best to err on the safe side.

One last thing

This should help to explain why, when you compare the Bass Capabilities and the Maximum Sound Pressure Levels of different loudspeakers, the former has much bigger differences than the latter.

For example:

Beosound Explore< difference >Beolab 50
Max SPL @ 1 m91 dB SPL26 dB117 dB SPL
Bass capability59 dB SPL52 dB111 dB SPL

The table above shows a direct comparison of two VERY different loudspeaker models using data taken directly from bang-olufsen.com on 2025 04 01. I’ve converted the numbers for the Beolab 50 to a ‘per loudspeaker’ instead of ‘per pair’ by subtracting 3 dB from the published numbers.

As you can see there:

  • the difference in Bass Capability between a Beosound Explore and a Beolab 50 is (111-59) = 52 dB
  • the difference in Max SPL between a Beosound Explore and a Beolab 50 is (117-91) = 26 dB

Speaking VERY generally, the difference in Max SPL values is less than the difference in Bass Capabilities because the difference in size and power of the drivers producing the midrange frequency band of the two loudspeakers is smaller than the difference in size and power of the woofers. In other words, there is a difference between the different differences. (I think that I got that right…)


* To get the Bass Capability measurement, it looks like I said that we do the measurement of a single 50 Hz sinusoidal tone. This isn’t really true. We do a number of measurements at different frequencies ranging from 20 Hz to 100 Hz and then calculate the equivalent value for a 50 Hz tone using some averaging.

If the loudspeaker is comprised of a single low-frequency driver in a closed cabinet, then the resulting number would be the same as if we just measured using a 50 Hz tone. However, if the loudspeaker is ported or has a passive driver with a resonant frequency in the 20 – 100 Hz range, then this method will probably produce a slightly different number than measuring only with a 50 Hz tone.

I wrote this

Everything that you read on this website was written by me without the help of AI. I promise.

I, for one, am sick and tired of the word salads that are getting put on websites by people who feed some keywords in a large language model, and then copy-and-paste the results without any kind of critical thinking, or even a modicum of copy editing. (Even worse are the automatically-generated webpages that are created from the keywords in your search, where the only human in the process is you.) The result is usually something that looks like it should make sense, but winds up being confusing because you’re actually trying to navigate in a sea of errors.

I’ve also debated for a long time whether to try to block LLMs from scraping this site to feed their salad bowls. For now, I’ve decided that I will give it a try, using the Block AI Crawlers plugin for WordPress.

The Sound of Music

This episode of The Infinite Monkey Cage is worth a listen if you’re interested in the history of recording technologies.

There’s one comment in there by Brian Eno that I COMPLETELY agree with. He mentions that we invented a new word for moving pictures: “movies” to distinguish them from the live equivalent, “plays”. But we never really did this for music… Unless, of course, you distinguish listening to a “concert” from listening to a “recording” – but most of us just say “I’m listening to music”.

100-year old upmixer

I had a little time at work today waiting for some visitors to show up and, as I sometimes do, I pulled an old audio book off the shelf and browsed through it. As usually happens when I do this, something interesting caught my eye.

I was reading the AES publication called “The Phonograph and Sound Recording After One-Hundred Years” which was the centennial issue of the Journal of the AES from October / November 1977.

In that issue of the JAES, there is an article called “Record Changers, Turntables, and Tone Arms – A Brief Technical History” by James H. Kogen of Shure Brothers Incorporated, and in that article he mentions US Patent Number 1,468,455 by William H. Bristol of Waterbury, CT, titled “Multiple Sound-Reproducing Apparatus”.

Before I go any further, let’s put the date of this patent in perspective. In 1923, record players existed, but they were wound by hand and ran on clockwork-driven mechanisms. The steel needle was mechanically connected to a diaphragm at the bottom of a horn. There were no electrical parts, since lots of people still didn’t even have electrical wiring in their homes: radios were battery-powered. Yes, electrically-driven loudspeakers existed, but they weren’t something you’d find just anywhere…

In addition, 3- or 2-channel stereo wasn’t invented yet, Blumlein wouldn’t patent a method for encoding two channels on a record until 1931: 8 years in the future…

But, if we look at Bristol’s patent, we see a couple of astonishing things, in my opinion.

If you look at the top figure, you can see the record, sitting on the gramophone (I will not call it a record player or a turntable…). The needle and diaphragm are connected to the base of the horn (seen on the top right of Figure 3, looking very much like my old Telefunken Lido, shown below.

But, below that, on the bottom of Figure 3 are what looks a modern-ish looking tonearm (item number 18) with a second tonearm connected to it (item number 27). Bristol mentions the pickups on these as “electrical transmitters”: this was “bleeding edge” emerging technology at the time.

So, why two pickups? First a little side-story.

Anyone who works with audio upmixers knows that one of the “tricks” that are used is to derive some signal from the incoming playback, delay it, and then send the result to the rear or “surround” loudspeakers. This is a method that has been around for decades, and is very easy to implement these days, since delaying audio in a digital system is just a matter of putting the signal into a memory and playing it out a little later.

Now look at those two tonearms and their pickups. As the record turns, pickup number 20 in Figure 3 will play the signal first, and then, a little later, the same signal will be played by pickup number 26.

Then if you look at Figure 6, you can see that the first signal gets sent to two loudspeakers on the right of the figure (items number 22) and the second signal gets sent to the “surround” loudspeakers on the left (items number 31).

So, here we have an example of a system that was upmixing a surround playback even before 2-channel stereo was invented.

Mind blown…

NB. If you look at Figure 4, you can see that he thought of making the system compatible with the original needle in the horn. This is more obvious in Figures 1 and 2, shown below.