Mixing closed and ported cabinets: Part 3

Before starting on this portion of the series, I’ll ask you to think about how little energy (or movement) it takes to get a resonant system oscillating. For example, if you have a child on a swing, a series of very gentle pushes at the right times can result in them swinging very high. Also, once the child is swinging back and forth, it takes a lot of effort to stop them quickly.

Moving onwards…

So far, we’ve seen that a loudspeaker driver in a closed cabinet can be thought of as just a mass on a spring, and, as a result, it has some natural resonance where it will oscillate at some frequency.

The driver is normally moved by sending an electrical signal into its voice coil. This causes the coil to produce a magnetic field and, since it’s already sitting in the magnetic field of a permanent magnet, it moves. The surround and spider prevent it from moving sideways, so it can only move outwards (if we send electrical current in one direction) or inwards (if we send current in the other direction).

When you try to move the driver, you’re working against a number of things:

  • the inertia of the mass of the moving parts
    Pick up a heavy book, for example, and try to push and pull it back and forth. It’s hard work!
  • the inertia of the air directly in front of and behind the driver
    Pick up a big sheet of stiff plastic (like the thing you put on the floor under an office chair) and try to push it back and forth. It’s also hard work!
  • the compliance (springiness) of the surround, spider, and air trapped in the cabinet behind the driver
    Blow up a balloon, and use your two hands to squeeze it repeatedly. It’s also hard work!

These three things can be considered separately from each other as a static effect. In other words:

  • It’s hard work to pick up a book or push a car that’s broken down (forget about pushing-and-pulling – just push OR pull)
  • It’s hard work to run into a headwind with that big piece of stiff plastic
  • It’s hard work to squeeze a balloon and keep it compressed

But, if you’re pushing AND pulling the loudspeaker driver there is another effect that’s dynamic.

When you’re moving the driver at a VERY low frequency, you’re mostly working against the “spring” which is probably quite easy to do. So, at a low frequency, the driver is pretty easy to move, and it’s moving so slowly that it doesn’t push back electrically. So, it does not impede the flow of current through the voice coil.

When you’re moving the driver at a VERY high frequency, you’re mostly working against the inertia of the moving parts and the adjacent air molecules. The higher the frequency, the harder it is to move the driver.

However, when you’re trying to move the driver at exactly its resonant frequency, you don’t need much energy at all because it “wants” to move at that same rate. At that frequency, though, the voice coil is moving in the magnetic field of the permanent magnet, and it generates electricity that is trying to move current in the opposite direction of what your amp is doing. In other words, at the driver’s resonant frequency, when you’re trying to push current into the voice coil, it generates a current that pushes back. When you try to pull current out of the voice coil, it generates a current that pulls back.

In other words, at the driver’s resonant frequency, your amplifier “sees” the driver as a thing that is trying to impede the flow of electrical current. This means that you get a lot of movement with only a little electrical current; just like the child on the swing gets to go high with only a little effort – but only at one frequency.
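If you’d like to see that “pushing back” as numbers, here is a minimal sketch using the usual lumped model of a moving-coil driver, where the motion of the coil adds a “motional” term to the electrical impedance. This is an illustration rather than a measurement: the five parameter values are hypothetical numbers for a generic woofer, the names are mine, and the inductance of the voice coil is ignored.

```python
import math

# Hypothetical parameters for a generic woofer (assumptions, not measurements)
RE = 6.0        # DC resistance of the voice coil, ohms
BL = 7.0        # force factor of the motor, tesla-metres
MMS = 0.020     # moving mass, kg
CMS = 0.0008    # compliance of the total "spring", metres per newton
RMS = 1.5       # mechanical losses (friction), kg/s

def resonant_frequency():
    """Frequency at which the effects of the mass and the spring cancel."""
    return 1.0 / (2.0 * math.pi * math.sqrt(MMS * CMS))

def impedance_magnitude(freq_hz):
    """Magnitude of the impedance the amplifier sees (coil inductance ignored)."""
    w = 2.0 * math.pi * freq_hz
    mechanical = complex(RMS, w * MMS - 1.0 / (w * CMS))
    return abs(RE + BL ** 2 / mechanical)

if __name__ == "__main__":
    f0 = resonant_frequency()
    for f in (f0 / 4, f0, f0 * 4):
        print(f"{f:7.1f} Hz: {impedance_magnitude(f):6.2f} ohms")
```

With these made-up numbers, the impedance well below and well above resonance stays close to the resistance of the coil, and it rises to a peak at the resonant frequency, where the coil is moving the most for the least current.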

This is a nice, simple case where you have a moving mass (the moving parts of the driver) and a spring (the surround, spider, and air in the sealed box). But what happens when the speaker has a port?

On to Part 4…

Mixing closed and ported cabinets: Part 2

Let’s look at a typical moving coil loudspeaker driver like the woofer shown in Figure 1.

Figure 1.

If I were to draw a cross-section of this and display it upside-down, it would look like Figure 2.

Figure 2.

Typically, if we send a positive voltage/current signal to a driver (say, the attack of a kick drum to a woofer) then it moves “forwards” or “outwards” (from the cabinet, for example). It then returns to the rest position. If we send it a negative signal, then it moves “backwards” or “inwards”. This movement is shown in Figure 3.

Figure 3.

Notice in Figure 3 that I left out all of the parts that don’t move: the basket, the magnet and the pole piece. That’s because those aren’t important for this discussion.

Also notice that I used only two colours: red for the moving parts that don’t move relative to each other (because they’re all glued together) and blue for the stretchy parts that act as a spring. These colours relate directly to the colours I used in Part 1, because they’re doing exactly the same thing. In other words, if you hold a woofer by the basket or magnet, and tap it, it will “bounce” up and down because it’s just a mass suspended by a spring. And, just like I talked about in Part 1, this means that it will oscillate at some frequency that’s determined by the relationship of the mass to the spring’s compliance (a fancy word for the “springiness” of a spring: the more compliant it is, the less stiff it is). In other words, I’m trying to make it obvious that Figure 3 above is exactly the same as Figures 3 and 5 in Part 1.

However, it’s very rare to see a loudspeaker where the driver is suspended without an enclosure. Yes, there are some companies that do this, but that’s outside the limits of this discussion. So, what happens when we put a loudspeaker driver in a sealed cabinet? For the purposes of this discussion, all it means is that we add an extra spring attached to the moving parts.

Figure 4

I’ve shown the “spring” that the air provides as a blue coil attached to the back of the dust cap. Of course, this is not true; the air is pushing against all surfaces inside the loudspeaker. However, from the outside, if you were actually pushing on the front of the driver with your fingers, you would not be able to tell the difference.

This means that the spring that pushes or pulls the loudspeaker diaphragm back into position is some combination of the surround (typically made of rubber nowadays), the spider (which might be made of different things…) and the air in the sealed cabinet. Those three springs are in parallel, so if you make one REALLY stiff (or lower its compliance) then it becomes the important spring, and the other two make less of a difference.

So, if you make the cabinet too small, then you have less air inside it, and it becomes the predominant spring, making the surround and spider irrelevant. The bigger the cabinet, the more significant a role the surround and spider play in the oscillation of the system.

Sidebar: If you are planning on making a lot of loudspeakers on a production line, then you can use this to your advantage. Since there is some variation in the compliance of the surround and spider from driver-to-driver, then your loudspeakers will behave differently. However, if you make the cabinet small, then it becomes the most important spring in the system, and you get loudspeakers that are more like each other because their volumes are all the same.

Remember from part 1 that if you increase the stiffness of the spring, then the resonant frequency of the oscillation will increase. It will also ring for longer in time. In practical terms, if you put a woofer in a big sealed cabinet and tap it, it will sound like a short “thump”. But if the cabinet is too small, then it will sound like a higher-pitched and longer-ringing “bonnnnnnnggggg”.
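As a rough sketch of how the cabinet size changes the balance between the three springs: stiffnesses of springs in parallel simply add, and the stiffness of the air in a sealed box (as felt by the cone) is commonly approximated as ρc²S²/V, where S is the cone area and V is the box volume. The driver values below are hypothetical and the function names are mine, so treat the output as an illustration only.

```python
import math

RHO = 1.2     # density of air, kg/m^3 (approximate, room temperature)
C = 343.0     # speed of sound, m/s (approximate, room temperature)

def air_spring_stiffness(cone_area_m2, box_volume_m3):
    """Stiffness (N/m) of the air in a sealed box, as felt by the cone."""
    return RHO * C ** 2 * cone_area_m2 ** 2 / box_volume_m3

def total_stiffness(k_surround, k_spider, k_air):
    """Springs in parallel: their stiffnesses add."""
    return k_surround + k_spider + k_air

def resonance_hz(stiffness_n_per_m, mass_kg):
    """Resonant frequency of a mass on a spring."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)

if __name__ == "__main__":
    # Hypothetical driver: all four numbers are made up for illustration
    k_surround, k_spider = 400.0, 850.0   # N/m
    cone_area, mass = 0.022, 0.020        # m^2, kg
    for litres in (100.0, 20.0, 5.0):
        k_air = air_spring_stiffness(cone_area, litres / 1000.0)
        k = total_stiffness(k_surround, k_spider, k_air)
        print(f"{litres:5.0f} L: air is {100 * k_air / k:4.1f}% of the spring, "
              f"resonance {resonance_hz(k, mass):5.1f} Hz")
```

As the box gets smaller, the air accounts for more and more of the total stiffness and the resonance moves upwards, which is the “bonnnnnnnggggg” end of the scale.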

So far, we’ve only been talking about physical things: masses and springs. In the next part, we’ll connect the loudspeaker driver to an amplifier and try to push and pull it with electrical signals.

Mixing closed and ported cabinets: Part 1

I commented on a forum this week that, if you mix loudspeakers with closed cabinets with loudspeakers with ported cabinets (or slave drivers), the end result can be a reduction in total output: less sound from more loudspeakers. Of course, the question is “why?” and the short answer is “due to the phase mismatching of the loudspeakers”.

This is the long answer.

Before we begin, we have to get an intuitive understanding of what a ported loudspeaker is. (Note that I’ll keep saying “ported loudspeaker”, but the principle also applies to loudspeakers with slave drivers, as I’ll explain later.) Before we get to a ported loudspeaker, we need to talk about Helmholtz resonators.

Take a block that’s reasonably heavy and hang it using a spring so that it looks like this:

Figure 1.

The spring is a little stretched because the weight of the block (which is the result of its mass and the Earth’s gravity) is pulling downwards. (We’ll ignore the fact that the spring is also holding up its own weight. Let’s keep this simple…) However, it doesn’t fall to the floor because the spring is pulling upwards.

Now pull downwards on the block, so it will look like the example on the right in the figure below.

Figure 2.

The spring is stretched because we’re pulling down on the block. The spring is also pulling upwards more, since it’s pulling against the weight of the block PLUS the force that you’re adding in a downwards direction.

Now you let go of the block. What happens?

The spring is pulling “too hard” on the block, so the block starts rising back to where it started (we’ll call that the “resting position”). However, when it gets there, it has inertia (a body in motion tends to stay in motion… until it hits something big…) so it doesn’t stop. As a result, it moves upwards, higher than the resting position. This squeezes the spring until it gets to some point, at which time the block stops, and then starts going back downwards. When it returns to the resting position, it still has inertia, so it passes that point and goes too far down again. I’ve shown this as a series of positions from left to right in the figure below.

Figure 3

If there were no friction, no air around the block, and no friction within the metal molecules of the spring, then this would bounce up and down forever.

However, there is friction, so some of the movement (“kinetic energy”) is turned into heat and lost. So, each bounce gets smaller and smaller and the maximum velocity of the block (as it passes the resting position) gets lower and lower, until, eventually, it stops moving (at the resting position, where it started).

Notice that I changed the colour of the spring to show when it’s more stretched (lighter blue) and when it’s more compressed (darker blue).

If everything were behaving perfectly, then the RATE at which the bounce repeats wouldn’t change. Only its amplitude (or the excursion of the block, or the height of the bounce) would reduce over time. That bounce rate (let’s say 1 bounce per second, and by “bounce” I mean a full cycle of movement down, up, and back down to where it started again) is the frequency of the repetition (or oscillation).
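Here is a small simulation sketch of that behaviour. It assumes that the friction is a force proportional to the block’s velocity (a common simplification, and my assumption here), and the mass, stiffness and friction values are hypothetical. It steps the block forward in time and records each upper turning point.

```python
def simulate_peaks(mass=1.0, stiffness=40.0, friction=0.4,
                   x0=0.1, dt=1e-4, t_end=5.0):
    """Return (time, height) of each upper turning point of a damped bounce."""
    x, v, t = x0, 0.0, 0.0
    peaks = []
    while t < t_end:
        a = (-stiffness * x - friction * v) / mass   # spring force plus friction
        v_new = v + a * dt                           # semi-implicit Euler step
        x += v_new * dt
        t += dt
        if v > 0.0 >= v_new:                         # stopped rising: a peak
            peaks.append((t, x))
        v = v_new
    return peaks

if __name__ == "__main__":
    previous_time = 0.0
    for time_s, height in simulate_peaks():
        print(f"peak at {time_s:.3f} s, height {height * 1000:.1f} mm, "
              f"{time_s - previous_time:.3f} s since the last one")
        previous_time = time_s
```

The heights of the peaks shrink from one bounce to the next, but the time between them stays the same.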

If you make the weight lighter, then it will bounce faster (because the spring can pull the weight more “easily”). If you make the spring stiffer, then it will also bounce faster (because the spring pulls harder on the same weight). So, we can change the frequency of the oscillation by changing the weight of the block or the stiffness of the spring.
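For an ideal mass on a spring (no friction), the relationship is the textbook formula f = √(k/m) / 2π, where k is the stiffness of the spring and m is the mass. A quick sketch, with a hypothetical spring and block:

```python
import math

def bounce_frequency_hz(stiffness_n_per_m, mass_kg):
    """Natural frequency of an ideal (frictionless) mass on a spring."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)

if __name__ == "__main__":
    base = bounce_frequency_hz(40.0, 1.0)        # hypothetical spring and block
    lighter = bounce_frequency_hz(40.0, 0.25)    # a quarter of the mass
    stiffer = bounce_frequency_hz(160.0, 1.0)    # four times the stiffness
    print(f"{base:.3f} Hz, {lighter:.3f} Hz, {stiffer:.3f} Hz")
```

Notice the square root: to double the bounce rate, you need a quarter of the mass or four times the stiffness.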

Now take a look at the same weight on a spring next to an upside-down wine bottle that (sadly) has been emptied of wine.

Figure 4.

Notice that I’ve added some colours to the air inside the bottle. The air in the bottle itself is blue, just like the colour of the spring. This is because, if we pull air out of the bottle (downwards), the air inside it will pull back (upwards; just like the metal spring pulling back upwards on the block). I’ve made the small cylinder of air in the neck of the bottle red, just like the block. This is because that air has some mass, and it’s free to move upwards (into the bottle) and downwards (out of the bottle) just like the block.

If I were somehow able to pull the “plug” of air out of the neck of the bottle, the air inside would try to pull it back in. If I then “let go”, the plug would move inwards, go too far (because it also has inertia), squeezing (or compressing) the air inside the bottle, which would then push the plug back out. This is shown in the figure below.

Figure 5.

At the level we’re dealing with, this behaviour is practically identical to the behaviour of the block on the spring. In other words, although the block and the plug are made of different materials, and although the metal spring and the air inside the bottle are different materials, Figures 3 and 5 show the same behaviour of the same kind of system.

How do you pull the plug of air out of the bottle? It’s probably easier to start by pushing it inwards instead, by blowing across the top.

When you do this, a little air leaks into the opening, pushing the plug inwards. The “spring” in the bottle then pushes the plug outwards, and your cycle has started. If you wanted to do the same thing with the block, you’d lift it and let go to start the oscillation.

However, you don’t need to blow across the bottle to make it oscillate. You can just tap it with the palm of your hand, for example. Or, if you put the bottle next to your ear and listen carefully, you’ll hear a note “singing along” with the sound in the room. This is because the air in the bottle resonates; it moves back and forth very easily at the frequency that’s determined by the mass of the air in the neck and the volume of air in the bottle (the spring).
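To put a rough number on that note, the textbook approximation for a Helmholtz resonator is f = (c / 2π) √(A / (V·L)), where A is the cross-sectional area of the neck, L is its length, and V is the volume of the bottle. The sketch below uses that formula without any end correction for the neck, and the bottle dimensions are my guesses, so the result is a ballpark figure only.

```python
import math

def helmholtz_frequency_hz(neck_area_m2, neck_length_m, volume_m3,
                           speed_of_sound=343.0):
    """Textbook Helmholtz resonance, ignoring end corrections for the neck."""
    return (speed_of_sound / (2.0 * math.pi)) * math.sqrt(
        neck_area_m2 / (volume_m3 * neck_length_m))

if __name__ == "__main__":
    # Hypothetical wine bottle: 0.75 litres, 18.5 mm neck diameter, 80 mm neck
    neck_area = math.pi * (0.0185 / 2.0) ** 2
    print(f"{helmholtz_frequency_hz(neck_area, 0.080, 0.00075):.0f} Hz")
```

A bigger bottle (a softer spring) or a longer neck (a heavier plug) gives a lower note, just like the block on the spring.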

However, remember that friction can make the oscillation decay (or die away) faster, by turning the movement into heat.

One last thing…

There’s another way to get either the block or the wine bottle oscillating:

You can move the TOP of the spring (for example, if you pull it up, then the spring will pull the block upwards, and it’ll start bouncing). Or, you could tap the bottom of the wine bottle (which is on the top in my drawings).

This method of starting the oscillation will come in handy in part 2.

Dynamic Styli Correlator Pt. 3

I thought that I was finished talking about (and even thinking about) the RCA Dynagroove Dynamic Styli Correlator as well as tracking and tracing distortion… and then I got an email about the last two postings pointing out that I didn’t mention two-channel stereo vinyl, and whether there was something to think about there.

My first reaction was: “There’s nothing interesting about that. It’s just two channels with the same problem, and since (at least in a hypothetical world) the two axes of movement of the needle are orthogonal, then it doesn’t matter. It’ll be the same problem in both channels. End of discussion.”

Then I took the dog out for a walk, and, as often happens when I’m walking the dog, I re-think thoughts and come home with the opposite opinion.

So, by the time I got home, I realised that there actually is something interesting about that after all.

Starting with Emil Berliner, record discs (originally lacquer, then vinyl) have been cut so that the “mono” signal (when the two channels are identical) causes the needle to move laterally instead of vertically. This was originally (ostensibly) to isolate the needle’s movement from vibrations caused by footsteps (the reality is that it was probably a clever manoeuvring around Edison’s patent).

This meant that, when records started supporting two audio channels, a lateral movement was necessary to keep things backwards-compatible.

What does THIS mean? It means that, when the two channels have the same signal (say, on the lead vocal of a pop tune, for example), then when the left wall of the groove goes up, the right wall goes down by the same amount. That causes the needle to move sideways, as shown below in Figure 1.

Figure 1. A two-channel groove with identical information in the two channels.

What are the implications of this on tracing distortion? Remember from the previous posting that the error in the movement of the needle is different on a positive slope (where the needle is moving upwards) than a negative slope (downwards). This can be seen in a one-channel representation in Figure 2.

Figure 2. The grey line is the groove wall. The blue line shows the actual movement of the needle and the red line shows the difference between the two – the error contained in the output signal.

Since the two groove walls have an opposite polarity when the audio signals are the same, then the resulting movement of the two channels with the same magnitude of error will look like Figure 3.

Figure 3. The physical movement of the two channels, and their independent errors.

Notice that, because the two groove walls are moving in opposite polarity (in other words, one is going up while the other is going down) this causes the two error signals to shift by 1/2 of a period.

However, Figure 3 doesn’t show the audio’s electrical signals. It shows the physical movement of the needle. In order to show the audio signals, we have to flip the polarity of one of the two channels (which, in a real pickup would be done electrically). That means that the audio signals will look like Figure 4.

Figure 4. The electrical outputs of the two audio channels and their error components.

Notice in Figure 4 that the original signals are identical (that’s why it looks like there’s only one sine wave) but their actual outputs are different because their error components are different.

But here’s the cool thing:

One way to think of the actual output signals is to consider each one as the sum of the original signal and the error signal. Since (for a mono signal like a lead vocal) their original signals are identical, then, if you sit in the right place with a properly configured pair of loudspeakers (or a decent pair of headphones) then you’ll hear that part of the signal as a phantom image in the middle. However, since the error signals are NOT correlated, they will not be localised in the middle with the voice. They’ll move to the sides. They’re not negatively correlated, so they won’t sound “phase-y” but they’re not correlated either, so they won’t be in the same place as the original signal.

So, although the distortion exists (albeit not NEARLY on the scale that I’ve drawn here…) it could be argued that the problem is attenuated by the fact that you’ll localise it in a different place than the signal.

Of course, if the signal is only in one channel (like Aretha Franklin’s backup singers in “Chain of Fools” for example) then this localisation difference will not help. Sorry.

SNR vs DNR

When you look at the datasheet of an audio device, you may see a specification that states its “signal to noise ratio” or “SNR”. Or, you may see the “dynamic range” or “DNR” (or “DR”) listed as well, or instead.

These days, even in the world of “professional audio” (whatever that means), these two things are similar enough to be confused or at least confusing, but that’s because modern audio devices don’t behave like their ancestors. So, if we look back 30 years ago and earlier, then these two terms were obviously different, and therefore independently usable. So, in order to sort this out, let’s take a look at the difference in old audio gear and the new stuff.

Let’s start with two basic concepts:

  1. All audio devices (or storage media or transmission systems) make noise. If you hold a resistor up in the air and look at the electrical difference across its two terminals, you’ll see noise. There’s no way around this. So, an amplifier, a DAC, magnetic tape, a digital recording stored on a hard drive… everything has some noise floor at the bottom that’s there all the time.
  2. All audio devices have some maximum limit that cannot be exceeded. A woofer can move in and out until it goes so far that it “bottoms out” on the magnet or rips the surround. A power amplifier can deliver some amount of current, but no higher. The headphone output on your iPhone cannot exceed some voltage level.

So, the goal of any recording or device that plays a recording is to try and make sure that the audio signal is loud enough relative to that noise that you don’t notice it, but not so loud that the limit is hit.

Now we have to look a little more closely at the details of this…

If we take the example of a piece of modern audio equipment (which probably means that it’s made of transistors doing the work in the analogue domain, and there’s lots of stuff going on in the digital domain) then you have a device that has some level of constant noise (called the “noise floor”) and a maximum limit that is at a very specific level. If the level of your audio signal is just a weeee bit (say, 0.1 dB) lower than this limit, then everything is as it should be. But once you hit that limit, you hit it hard – like a brick wall. If you throw your fist at a brick wall and stop your hand 1 mm before hitting it, then you don’t hit it at all. If you don’t stop your hand, the wall will stop it for you.

In lots of older gear, this “brick wall” didn’t exist. Let’s take the example of analogue magnetic tape. It also has a noise floor, but the maximum limit is “softer”. As the signal gets louder and louder, it starts to reach a point where the top and bottom of the audio waveform get increasingly “squished” or “compressed” instead of chopping off the top and bottom.

I made a 997 Hz sine wave that starts at a very, very low level and increases to a very high level over a period of 10 seconds. Then, I put it through two simulated devices.

Device “A” is a simulation of a modern device (say, an analogue-to-digital converter). It clips the top and bottom of the signal when some level is exceeded.

Device “B” is a simulation of something like the signal that would be recorded to analogue magnetic tape and then played back. Notice that it slowly “eases in” to a clipped signal; but also notice that this starts happening before Device “A” hits its maximum. So, the signal is being changed before it “has to”.

Let’s zoom in on those two plots at two different times in the ramp in level.

Device “A” is the two plots on the top at around 8.2 seconds and about 9.5 seconds from the previous figure. Device “B” is the bottom two plots, zooming in on the same two moments in time (and therefore input levels).

Notice that when the signal is low enough, both devices have (roughly) the same behaviour. They both output a sine wave. However, when the signal is higher, one device just chops off the top and bottom of the sine wave whereas the other device merely changes its shape.
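A minimal sketch of the two kinds of behaviour, in case you want to play with them yourself. The hard clipper is a reasonable stand-in for Device “A”. For Device “B”, I’m using a tanh curve purely as an assumed, convenient “soft” shape; it is not a model of real tape, and not necessarily the curve used for the plots here.

```python
import math

def device_a(x, limit=1.0):
    """Hard clipper: untouched below the limit, chopped off above it."""
    return max(-limit, min(limit, x))

def device_b(x, limit=1.0):
    """Soft clipper: tanh bends the waveform gradually (an assumed curve)."""
    return limit * math.tanh(x / limit)

if __name__ == "__main__":
    # Instantaneous input values, from very quiet to well over the limit
    for x in (0.01, 0.5, 0.9, 1.5):
        print(f"in {x:4.2f}   A out {device_a(x):5.3f}   B out {device_b(x):5.3f}")
```

For small inputs, the two outputs are practically the same. As the input approaches the limit, Device “B” is already changing the signal while Device “A” is still leaving it alone.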

Now let’s think of this in terms of the signals’ levels in relationship to the levels of the noise floors of the devices and the distortion artefacts that are generated by the change in the signals when they get too loud.

If we measure the output level of a device when the signal level is very, very low, all we’ll see is the level of the inherent noise floor of the device itself. Then, as the signal level increases, it comes up above the noise floor, and the output level is the same as the level of the signal. Then, as the signal’s level gets too high, it will start to distort and we’ll see an increase in the level of the distortion artefacts.

If we plot this as a ratio of the signal’s level (which is increasing over time) to the combined level of the distortion and noise artefacts for the two devices, it will look like this:

On the left side of this plot, the two lines (the red for Device “A” and the black for Device “B”) are horizontal. This is because we’re just seeing the noise floor of the devices. No matter how much lower in level the signals were, the output level would always be the same. (If this were a real, correct Signal-to-THD+N ratio, then it would actually show negative values, because the signal would be quieter than the noise. It would really only be 0 dB when the level of the noise was the same as the signal’s level.)

Then, moving to the right, the levels of the signals come above the noise floor, and we see the two lines increasing in level.

Then, just under a signal level of about -20 dB, we see that the level of the signal relative to the artefacts in Device “B” reaches a peak, and then starts heading downwards. This is because as the signal level gets higher and higher, the distortion artefacts increase in level even more.

However, Device “A” keeps increasing until it hits a level of 0 dB, at which point a very small increase in level causes a very big jump in the amount of distortion, so the relative level of the signal drops dramatically (not because the signal gets quieter, but because the distortion artefacts get so loud so quickly).
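The general shape of these two curves can be reproduced with a short sketch. Everything in it is an assumption made for illustration: the noise floor is a fixed power 100 dB below a sine wave that just reaches the limit, Device “A” is a hard clipper, and Device “B” is a tanh curve (so its peak will not land in the same place as in the plot above). The measurement finds the level of the fundamental in the output and treats everything else as distortion.

```python
import math

# Assumed noise floor: 100 dB below the power of a sine that just reaches the limit
NOISE_POWER = 0.5 * 10 ** (-100 / 10)

def hard_clip(x):
    """Stand-in for Device A: nothing happens until the limit is hit."""
    return max(-1.0, min(1.0, x))

def soft_clip(x):
    """Stand-in for Device B: tanh is an assumed curve, not a tape model."""
    return math.tanh(x)

def signal_to_thdn_db(device, level_db, n=4096, cycles=7):
    """Ratio of the fundamental to distortion plus noise, in dB."""
    amp = 10 ** (level_db / 20)
    phase = [2 * math.pi * cycles * i / n for i in range(n)]
    out = [device(amp * math.sin(p)) for p in phase]
    # Amplitude of the fundamental, found by correlating with a sine and a cosine
    s = 2 / n * sum(y * math.sin(p) for y, p in zip(out, phase))
    c = 2 / n * sum(y * math.cos(p) for y, p in zip(out, phase))
    p_fundamental = (s * s + c * c) / 2
    p_total = sum(y * y for y in out) / n
    p_distortion = max(p_total - p_fundamental, 0.0)
    return 10 * math.log10(p_fundamental / (p_distortion + NOISE_POWER))

if __name__ == "__main__":
    for level in range(-60, 7, 6):
        print(f"{level:4d} dB in:  A {signal_to_thdn_db(hard_clip, level):6.1f} dB"
              f"   B {signal_to_thdn_db(soft_clip, level):6.1f} dB")
```

In this sketch, the hard clipper’s ratio climbs steadily until the input reaches 0 dB and then collapses, while the soft clipper’s ratio peaks at a lower input level and then rolls off gently.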

Now let’s think about how best to use those two devices.

For Device “A” (in red) we want to keep the signal as loud as possible without distorting. So, we try to make sure that we stay as close to that 0 dB level on the X-axis as we can most of the time. (Remember that I’m talking about a technical quality of audio – not necessarily something that sounds good if you’re listening to music.) HOWEVER: we must make sure that we NEVER exceed that level.

However, for Device “B”, we want to keep the signal as close to that peak around -20 dB as possible – but if we go over that level, it’s no big deal. We can get away with levels above that – it’s just that the higher we go, the worse it might sound because the distortion is increasing.

Notice that the red line and the black line cross each other just above the 0 dB line on the X-axis. This is where the two devices will have the same level of distortion – but the distortion characteristics will be different, so they won’t necessarily sound the same. But let’s pretend that the only measure of quality is that Y-axis – so they’re the same at about +2 dB on the X-axis.

Now the question is “What are the dynamic ranges of the two systems?” Another way to ask this question is “How much louder is the loudest signal relative to the quietest possible signal for the two devices?” The answer to this is “a little over 100 dB” for both of them, since the two lines have the same behaviour for low signals and they cross each other when the signal is about 100 dB above this (looking at the X-axis, this is the distance between where the two lines are horizontal on the left, and where they cross each other on the right). Of course, I’m over-simplifying, but for the purposes of this discussion, it’s good enough.

The second question is “What are the signal-to-noise ratios of the two systems?” Another way to ask THIS question is “How much louder is the average signal relative to the quietest possible signal for the two devices?” The answer to this question is two different numbers.

  • Device “A” has a signal-to-noise ratio of about 100 dB, because we’re going to use that device, trying to keep the signal as close to clipping as possible without hitting that brick wall. In other words, for Device “A”, the dynamic range and the signal-to-noise ratio are the same because of the way we use it.
  • Device “B” has a signal-to-noise ratio of about 80 dB because we’re going to try to keep the signal level around that peak on the black curve (around -20 dB on the X-axis). So, its signal-to-noise ratio is about 20 dB lower than its dynamic range, again, because of the way we use it.
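The arithmetic behind those numbers is just the difference between two levels in dB. In the sketch below, the noise floor of -100 dB is my assumption, inferred from the description above (the curves cross at about +2 dB, roughly 100 dB above the noise); the other values are the approximate ones already mentioned.

```python
def level_difference_db(upper_db, lower_db):
    """Difference between two levels that are already expressed in dB."""
    return upper_db - lower_db

# Approximate values; the noise floor is an assumption inferred from the text
NOISE_FLOOR_DB = -100.0
MAXIMUM_DB = 2.0        # where the two curves cross
NOMINAL_A_DB = 0.0      # Device A is used just below clipping
NOMINAL_B_DB = -20.0    # Device B is used around its peak

if __name__ == "__main__":
    print("Dynamic range:", level_difference_db(MAXIMUM_DB, NOISE_FLOOR_DB), "dB")
    print("SNR, Device A:", level_difference_db(NOMINAL_A_DB, NOISE_FLOOR_DB), "dB")
    print("SNR, Device B:", level_difference_db(NOMINAL_B_DB, NOISE_FLOOR_DB), "dB")
```

The noise floor is the same in all three lines; only the choice of the “upper” level changes, which is the whole difference between the two specifications.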

The problem is, these days, a lot of engineers aren’t old enough to remember the days when things behaved like Device “B”, so they interchange Signal to Noise and Dynamic Range all willy-nilly. Given the way we use audio devices today, that’s okay, except when it isn’t.

For example, if you’re trying to connect a turntable (which plays vinyl records that are mastered to behave more like Device “B”) to a digital audio system, then the makers of those two systems and the recordings you play might not agree on how loud things should be. However, in theory, that’s the problem of the manufacturers, not the customers. In reality, it becomes the problem of the customers when they switch from playing a record to playing a digital audio stream, since these two worlds treat levels differently, and there’s no right answer to the problem. As a result, you might need to adjust your volume when you switch sources.

Excursion vs. Frequency

Last week, I was doing a lecture about the basics of audio and I happened to mention one of the rules of thumb that we use in loudspeaker development:

If you have a single loudspeaker driver and you want to keep the same Sound Pressure Level (or output level) as you change the frequency, then if you go down one octave, you need to increase the excursion of the driver 4 times.

One of the people attending the presentation asked “why?” which is a really good question, and as I was answering it, I realised that it could be that many people don’t know this.

Let’s take this step-by-step and keep things simple. We’ll assume for this posting that a loudspeaker driver is a circular piston that moves in and out of a sealed cabinet. It is perfectly flat, and we’ll pretend that it really acts like a piston (so there’s no rubber or foam surround that’s stretching back and forth to make us argue about changes in the diameter of the circle). Also, we’ll assume that the face of the loudspeaker cabinet is infinite to get rid of diffraction. Finally, we’ll say that the space in front of the driver is infinite and has no reflective surfaces in it, so the waveform just radiates from the front of the driver outwards forever. Simple!

Then, we’ll push and pull the loudspeaker driver in and out using electrical current from a power amplifier that is connected to a sine wave generator. So, the driver moves in and out of the “box” with a sinusoidal motion. This can be graphed like this:

Figure 1: The excursion of a loudspeaker driver playing a 1 kHz sine wave at some output level.

As you can see there, we have one cycle per millisecond, therefore 1000 cycles per second (or 1 kHz), and the driver has a peak excursion of 1 mm. It moves to a maximum of 1 mm out of the box, and 1 mm into the box.

Consider the wave at Time = 0. The driver is passing the 0 mm line, going as fast as it can moving outwards until it gets to 1 mm (at Time = 0.25 ms) by which time it has slowed down and stopped, and then starts moving back in towards the box.

So, the velocity of the driver is the slope of the line in Figure 1, as shown in Figure 2.

Figure 2: The excursion and velocity of the same loudspeaker driver playing the same signal.

As the loudspeaker driver moves in and out of the box, it’s pushing and pulling the air molecules in front of it. Since we’ve over-simplified our system, we can think of the air molecules that are getting pushed and pulled as the cylinder of air that is outlined by the face of the moving piston. In other words, it’s a “can” of air with the same diameter as the loudspeaker driver, and the same height as the peak-to-peak excursion of the driver (in this case, 2 mm, since it moves 1 mm inwards and 1 mm outwards).

However, sound pressure (which is how loud sounds are) is a measurement of how much the air molecules are compressed and decompressed by the movement of the driver. This is proportional to the acceleration of the driver (neither the velocity nor the excursion, directly…). Luckily, however, we can calculate the driver’s acceleration from the velocity curve. If you look at the bottom plot in Figure 2, you can see that, leading up to Time = 0, the velocity has increased to a maximum (so the acceleration was positive). At Time = 0, the velocity is starting to drop (because the excursion is on its way up to where it will stop at maximum excursion at time = 0.25 ms).

In other words, the acceleration is the slope of the velocity curve, the line in the bottom plot in Figure 2. If we plot this, it looks like Figure 3.

Figure 3: The excursion, velocity and acceleration of the same loudspeaker driver playing the same signal.
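If you'd rather see the three curves as numbers, here is a minimal sketch. It assumes an ideal sinusoidal piston, and the function name driver_motion and its parameters are my own inventions for illustration. The velocity is the slope of the excursion, and the acceleration is the slope of the velocity.

```python
import math

def driver_motion(t, freq_hz=1000.0, peak_excursion_m=0.001):
    """Excursion, velocity and acceleration of an ideal sinusoidal piston at time t (seconds)."""
    w = 2.0 * math.pi * freq_hz                       # angular frequency in rad/s
    x = peak_excursion_m * math.sin(w * t)            # excursion (m)
    v = peak_excursion_m * w * math.cos(w * t)        # velocity = slope of the excursion (m/s)
    a = -peak_excursion_m * w * w * math.sin(w * t)   # acceleration = slope of the velocity (m/s^2)
    return x, v, a

# Time = 0: passing the 0 mm line, moving outwards at maximum velocity
x0, v0, a0 = driver_motion(0.0)

# Time = 0.25 ms: stopped at maximum excursion, with the acceleration pointing back inwards
x1, v1, a1 = driver_motion(0.25e-3)
```

With these values, the driver stops at 1 mm a quarter of a cycle after it was moving at its peak velocity, which is what Figures 1 to 3 show.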

Now we have something useful. Since the bottom plot in Figure 3 shows us the acceleration of the driver, then it can be used to compare to a different frequency. For example, if we get the same driver to play a signal that has half of the frequency, and the same excursion, what happens?

Figure 4: Comparing the excursion, velocity and acceleration of the same loudspeaker driver playing two different signals with the same excursion. (The red line is the same in Figure 4 as in Figure 3.)

In Figure 4, two sine waves are shown: the black line is 1/2 of the frequency of the red line, but they both have the same excursion. If you take a look at where both lines cross the Time = 0 point, then you can see that the slopes are different: the red line is steeper than the black. This is why the peak of the red line in the velocity curve is higher: the velocity is the slope of the excursion. Since the maximum slope of the red line in the middle plot is higher than the maximum slope of the black line, then its acceleration must be higher, which is what we see in the bottom plot.

Since the sound pressure level is proportional to the acceleration of the driver, then we can see in the top and bottom plots in Figure 4 that, if we halve the frequency (go down one octave) but maintain the same excursion, then the acceleration drops to 25% of the previous amount, and therefore, so does the sound pressure level (20*log10(0.25) = -12 dB, which is another way to express the drop in level…)
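The 25% can be checked with a few lines. This is only a sketch under the same assumption of a sinusoidal piston, where the peak acceleration is the peak excursion multiplied by the square of the angular frequency. The helper name peak_acceleration is made up for this example.

```python
import math

def peak_acceleration(freq_hz, peak_excursion_m):
    """Peak acceleration (m/s^2) of a sinusoidal motion with the given frequency and peak excursion."""
    w = 2.0 * math.pi * freq_hz
    return w * w * peak_excursion_m

# Same 1 mm excursion, one octave apart
ratio = peak_acceleration(500.0, 0.001) / peak_acceleration(1000.0, 0.001)

# Express the change as a level in dB
level_change_db = 20.0 * math.log10(ratio)
```

Here, ratio comes out as 0.25 and level_change_db as roughly -12 dB, matching the numbers in the paragraph above.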

This raises the question: “how much do we have to increase the excursion to maintain the acceleration (and therefore the sound pressure level)?” The answer is in the “25%” in the previous paragraph. Since maintaining the same excursion and multiplying the frequency by 0.5 resulted in multiplying the acceleration by 0.25, we’ll have to increase the excursion by a factor of 4 to maintain the same acceleration.

Figure 5: Comparing the excursion, velocity and acceleration of the same loudspeaker driver playing two different signals at two different excursions. Notice that some of the vertical scales in the plots have changed. (The red line is the same in Figure 5 as in Figures 4 and 3.)

Looking at Figure 5: The black line is 1/2 the frequency of the red line. Their accelerations (the bottom plots) have the same peak values (which means that they produce the same sound pressure level). This means that the slopes of their velocities are the same at their maxima, which, in turn, means that the peak velocity of the black line (the lower frequency) is higher. Since the peak velocity of the black line is higher (by a factor of 2) then the slope of the excursion plot is also twice as steep, which means that the peak of the excursion of the black line is 4x that of the red line. All of that is explained again in Figure 6.

Figure 6. A repeat of Figure 5 with some explanations that (hopefully) help.

Therefore, assuming that we’re using the same loudspeaker driver, we have to increase the excursion by a factor of 4 when we drop the frequency by a factor of 2, in order to maintain a constant sound pressure level.
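Turned around, the same relationship tells us how far the driver has to move at any other frequency. A small sketch, assuming the same driver and ideal sinusoidal motion (the function name excursion_for_same_level is hypothetical):

```python
def excursion_for_same_level(ref_excursion_m, ref_freq_hz, new_freq_hz):
    """Peak excursion needed at new_freq_hz to keep the same peak acceleration as at ref_freq_hz."""
    # Acceleration is proportional to frequency squared times excursion,
    # so the excursion has to scale with the inverse square of the frequency ratio.
    return ref_excursion_m * (ref_freq_hz / new_freq_hz) ** 2

one_octave_down = excursion_for_same_level(0.001, 1000.0, 500.0)    # 4 times the excursion
two_octaves_down = excursion_for_same_level(0.001, 1000.0, 250.0)   # 16 times the excursion
```

Every octave downwards costs another factor of 4, which gets out of hand quickly.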

However, we can play a little trick… what we’re really doing here is increasing the volume of our “cylinder” of air by a factor of 4. Since we don’t change the size of the driver, we have to move it 4 times farther.

However, the volume of a cylinder is

π r² * height

and we’re just playing with the “height” in that equation. A different way would be to use a different driver with a bigger surface area to play the lower frequency. For example, if we multiply the radius of the driver by 2, and we don’t change the excursion (the “height” of the cylinder) then the total volume increases by a factor of 4 (because the radius is squared in the equation, and 2*2 = 4).

Another way to think of this: if our loudspeaker driver was a square instead of a circle, we could either move it in and out 4 times farther OR we could make the width and the length of the square each twice as big to get a box with the same volume. That “r²” in the equation above is basically just the “width * length” of a circle…
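To compare the two options with numbers, here is a quick sketch. The radii and excursions are example values that I picked, and displaced_volume is a made-up helper name.

```python
import math

def displaced_volume(radius_m, peak_to_peak_excursion_m):
    """Volume of the "can" of air moved by a circular piston (cubic metres)."""
    return math.pi * radius_m ** 2 * peak_to_peak_excursion_m

base = displaced_volume(0.05, 0.002)      # 5 cm radius, 2 mm peak-to-peak
farther = displaced_volume(0.05, 0.008)   # same driver, 4 times the excursion
bigger = displaced_volume(0.10, 0.002)    # twice the radius, same excursion
```

Both farther and bigger are 4 times base, so the two options move the same amount of air.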

This is why woofers are bigger than tweeters. In a hypothetical world, a tweeter can play the same low frequencies as a woofer – but it would have to move REALLY far in and out to do it.

Fibre needles

Reading through some old magazines again…

This time, it’s The Gramophone magazine from October, 1930. In the editorial, Compton Mackenzie says

What caught my eye was the discussion of gramophone needles made of “hard wood”, and also the prediction that “the growth of electrical recording steps … to grapple with that problem of wear and tear.”

The fact that electrical (instead of mechanical) recording and playback was seen as a solution to “wear and tear” reminded me of my first textbook in Sound Recording where “Digital Audio” was introduced only within the chapter on Noise Reduction.

Later in that same issue, there is a little explanation of the “Electrocolor” and “Burmese” needles.

The March 1935 issue raises the point of wear vs. fidelity in the Editorial (which starts by comparing players with over-sized horns).

I like the comment about having to be in the “right mood” for Ravel. Some things never change.

What’s funny is that, now that I’ve seen this, I can’t NOT see it. There are advertisements for fibre, thorn, and wood needles all over the place in 1930s audio magazines.

What is a “virtual” loudspeaker? Part 3

#91.3 in a series of articles about the technology behind Bang & Olufsen

In Part 1 of this series, I talked about how a binaural audio signal can (hypothetically, with HRTFs that match your personal ones) be used to simulate the sound of a source (like a loudspeaker, for example) in space. However, to work, you have to make sure that the left and right ears get completely isolated signals (using earphones, for example).

In Part 2, I showed how, with enough processing power, a large amount of luck (using HRTFs that match your personal ones PLUS the promise that you’re in exactly the correct location), and a room that has no walls, floor or ceiling, you can get a pair of loudspeakers to behave like a pair of headphones using crosstalk cancellation.

There’s not much left to do to create a virtual loudspeaker. All we need to do is to:

  • Take the signal that should be sent to a right surround loudspeaker (for example) and filter it using the HRTFs that correspond to a sound source in the location that this loudspeaker would be in. REMEMBER that this signal has to get to your two ears since you would have used your two ears to hear an actual loudspeaker in that location.
  • Send those two signals through a crosstalk cancellation processing system that causes your two loudspeakers to behave more like a pair of headphones.
Figure 1: A block diagram of the system described above.

One nice thing about this system is that the crosstalk cancellation is only there to ensure that the actual loudspeakers behave more like headphones. So, if you want to create more virtual channels, you don’t need to duplicate the crosstalk cancellation processor. You only need to create the binaurally-processed versions of each input signal and mix those together before sending the total result to the crosstalk cancellation processor, as shown below.

Figure 2: You only need one crosstalk cancellation system for any number of virtual channels.
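The reason that one shared processor is enough can be sketched in code, assuming that the crosstalk canceller is a linear system (filtering and adding only). Everything below is invented for illustration: the filter taps, the signal values and the function names are stand-ins, not the processing used in any product.

```python
def convolve(signal, kernel):
    """Plain FIR filtering: convolve a signal with a filter's impulse response."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def mix(a, b):
    """Add two equal-length signals sample by sample."""
    return [x + y for x, y in zip(a, b)]

DIRECT = [1.0, 0.0, 0.25]   # made-up filter taps
CROSS = [0.0, -0.5, 0.0]    # made-up filter taps

def toy_crosstalk_canceller(left, right):
    """A stand-in 2-in, 2-out linear processor (NOT a real crosstalk canceller design)."""
    out_l = mix(convolve(left, DIRECT), convolve(right, CROSS))
    out_r = mix(convolve(right, DIRECT), convolve(left, CROSS))
    return out_l, out_r

# Binaurally-processed versions of two virtual channels: (left ear, right ear), made-up samples
left_surround = ([0.0, 1.0, 0.5, 0.0], [0.0, 0.0, 0.6, 0.3])
right_surround = ([0.0, 0.0, 0.4, 0.2], [0.0, 0.8, 0.4, 0.0])

# Option 1: mix first, then one shared canceller
mixed_l = mix(left_surround[0], right_surround[0])
mixed_r = mix(left_surround[1], right_surround[1])
shared_l, shared_r = toy_crosstalk_canceller(mixed_l, mixed_r)

# Option 2: one canceller per virtual channel, mix afterwards
a_l, a_r = toy_crosstalk_canceller(*left_surround)
b_l, b_r = toy_crosstalk_canceller(*right_surround)
separate_l, separate_r = mix(a_l, b_l), mix(a_r, b_r)
```

Both options give the same loudspeaker signals, but the first one only runs the expensive part once.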

This is good because it saves on processing power.

So, there are some important things to realise after having read this series:

  • All “virtual” loudspeakers’ signals are actually produced by the left and right loudspeakers in the system. In the case of the Beosound Theatre, these are the Left and Right Front-firing outputs.
  • Any single virtual loudspeaker (for example, the Left Surround) requires BOTH output channels to produce sound.
  • If the delays (aka Speaker Distance) and gains (aka Speaker Levels) of the REAL outputs are incorrect at the listening position, then the crosstalk cancellation will not work and the virtual loudspeaker simulation system won’t work. How badly it doesn’t work depends on how wrong the delays and gains are.
  • The virtual loudspeaker effect will be experienced differently by different people because it depends on how closely your actual personal HRTFs match those predicted in the processor. So, don’t get into fights with your friends on the sofa about where you hear the helicopter…
  • The listening room’s acoustical behaviour will also have an effect on the crosstalk cancellation. For example, strong early reflections will “infect” the signals at the listening position and may/will cause the cancellation to not work as well. So, the results will vary not only with changes in rooms but also speaker locations.

Finally, it’s worth noting that, in the specific case of the Beosound Theatre, by setting the Speaker Distances and Speaker Levels for the Left and Right Front-firing outputs for your listening position, you have automatically calibrated the virtual outputs. This is because the Speaker Distances and Speaker Levels are compensations for the ACTUAL outputs of the system, which are the ones producing the signals that simulate the virtual loudspeakers. This is the reason why the four virtual loudspeakers do not have individual Speaker Distances and Speaker Levels. If they did, they would have to be identical to the Left and Right Front-firing outputs’ values.

What is a “virtual” loudspeaker? Part 2

#91.2 in a series of articles about the technology behind Bang & Olufsen

In Part 1, I talked about how a binaural recording is made, and I also mentioned that the spatial effects may or may not work well for you for a number of different reasons.

Let’s go back to the free field with a single “perfect” microphone to measure what’s happening, but this time, we’ll send sound out of two identical “perfect” loudspeakers. The distances from the loudspeakers to the microphone are identical. The only difference in this hypothetical world is that the two loudspeakers are in different positions (measuring as a rotational angle) as shown in Figure 1.

Figure 1: Two identical, “perfect” loudspeakers in a free field with a single “perfect” microphone.

In this example, because everything is perfect, and the space is a free field, then the output of the microphone will be the sum of the outputs of the two loudspeakers. (In the same way that if your dog and your cat are both asking for dinner simultaneously, you’ll hear dog+cat and have to decide which is more annoying and therefore gets fed first…)

Figure 2: The output from the microphone is the sum of the outputs from the two loudspeakers. At any moment in time, the value of the top plot + the value of the middle plot = the value of the bottom plot.

IF the system is perfect as I described above, then we can play some tricks that could be useful. For example, since the output of the microphone is the sum of the outputs of the two loudspeakers, what happens if the output of one loudspeaker is identical to the other loudspeaker, but reversed in polarity?

Figure 3: If the output of Loudspeaker 1 is exactly the same as the output of Loudspeaker 2 except for polarity, then the sum (the output of the microphone) is always 0.

In this example, we’re manipulating the signals so that, when they add together, you get nothing at the output. This is because, at any moment in time, the value of Loudspeaker 2’s output is the value of Loudspeaker 1’s output * -1. So, in other words, we’re just subtracting the signal from itself at the microphone and we get something called “perfect cancellation” because the two signals cancel each other at all times.
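The same thing in a few lines of code, as a sketch of the idealised case only. The sampling rate, frequency and the helper name sine are my own choices for the example.

```python
import math

def sine(freq_hz, n_samples, sample_rate_hz=48000.0):
    """A sampled sine wave with a peak value of 1."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz) for n in range(n_samples)]

speaker_1 = sine(1000.0, 480)
speaker_2 = [-s for s in speaker_1]                        # same signal, multiplied by -1

# The "perfect" microphone just adds the two outputs together
microphone = [a + b for a, b in zip(speaker_1, speaker_2)]
loudest_sample = max(abs(m) for m in microphone)           # 0.0: perfect cancellation
```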

Of course, if anything changes, then this perfect cancellation won’t work. For example, if one of the loudspeakers moves a little farther away than the other, then the system is broken, as shown below.

Figure 4: A small shift in time in the output of Loudspeaker 2 causes the cancellation to stop working so well.
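To get a feeling for how sensitive this is, here is a rough sketch for a single sine wave. It assumes a speed of sound of 343 m/s, ignores the small level change caused by the extra distance, and uses the fact that the sum of a sine and a delayed, inverted copy of itself has a peak value of 2 * |sin(π * frequency * delay)|. The function name is made up.

```python
import math

def cancellation_residual_db(freq_hz, extra_distance_m, speed_of_sound=343.0):
    """Level of what is left after cancellation, relative to one loudspeaker alone (dB)."""
    delay_s = extra_distance_m / speed_of_sound
    residual = 2.0 * abs(math.sin(math.pi * freq_hz * delay_s))
    if residual == 0.0:
        return float("-inf")               # perfect cancellation
    return 20.0 * math.log10(residual)

perfect = cancellation_residual_db(1000.0, 0.0)
low_freq = cancellation_residual_db(100.0, 0.01)     # 1 cm too far away, at 100 Hz
high_freq = cancellation_residual_db(1000.0, 0.01)   # 1 cm too far away, at 1 kHz
worst_case = cancellation_residual_db(1000.0, 0.1715)  # half a wavelength at 1 kHz
```

The same 1 cm error does more damage at 1 kHz than at 100 Hz, and when the error reaches half a wavelength, the two signals add instead of cancelling, so the result is about 6 dB louder than one loudspeaker alone.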

Again, everything that I’ve said above only works when everything is perfect, and the loudspeakers and the microphone are in a free field; so there are no reflections coming in and ruining everything.

We can now combine these two concepts:

  1. using binaural signals to simulate a sound source in a location (although this would normally be done using playback over earphones to keep it simple) and
  2. using signals from loudspeakers to cancel each other at some location in space as a

to create a system for making virtual loudspeakers.

Let’s suspend our adherence to reality and continue with this hypothetical world where everything works as we want… We’ll replace the microphone with a person and consider what happens. To start, let’s just think about the output of the left loudspeaker.

Figure 5: The output of the left loudspeaker reaches both ears with different time/frequency characteristics caused by the HRTF associated with that sound source location.

If we plot the impulse responses at the two ears (the “click” sound from the loudspeaker after it’s been modified by the HRTFs for that loudspeaker location), they’ll look like this:

Figure 6: The impulse responses of the HRTFs for a sound source at 30º left of centre.

What if we were able to send a signal out of the right loudspeaker so that it cancels the signal from the left loudspeaker at the location of the right eardrum?

Figure 7: What if we could cancel the signal from the left loudspeaker at the right ear using the right loudspeaker?

Unfortunately, this is not quite as easy as it sounds, since the HRTF of the right loudspeaker at the right ear is also in the picture, so we have to be a bit clever about this.

So, in order for this to work we:

  • Send a signal out of the left loudspeaker.
    We know that this will get to the right eardrum after it’s been messed up by the HRTF. This is what we want to cancel…
  • …so we take that same signal, and
    • filter it with the inverse of the HRTF of the right loudspeaker
      (to undo the effects of the HRTF of the right loudspeaker’s signal at the right ear)
    • filter that with the HRTF of the left loudspeaker at the right ear
      (to match the filtering that’s done by your head and pinna)
    • multiply by -1
      (so that it will cancel when everything comes together at your right eardrum)
    • and send it out the right loudspeaker.
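The steps above can be sketched at a single frequency, where each HRTF is just one complex number (a gain and a phase shift). The values below are invented for illustration and are not measured HRTFs, and the names are my own.

```python
import cmath

# Made-up HRTF values at one frequency, written as (magnitude, phase in radians)
LEFT_SPEAKER_TO_RIGHT_EAR = cmath.rect(0.5, -1.2)    # the crosstalk path we want to cancel
RIGHT_SPEAKER_TO_RIGHT_EAR = cmath.rect(0.9, -0.7)   # the path the cancellation signal takes

def cancellation_filter(h_cross, h_direct):
    """Inverse of the direct HRTF, then the crosstalk HRTF, then multiply by -1."""
    return -1.0 * h_cross * (1.0 / h_direct)

signal = cmath.rect(1.0, 0.3)                         # whatever is sent to the left loudspeaker
c = cancellation_filter(LEFT_SPEAKER_TO_RIGHT_EAR, RIGHT_SPEAKER_TO_RIGHT_EAR)

# What arrives at the right eardrum: crosstalk from the left + cancellation from the right
at_right_ear = signal * LEFT_SPEAKER_TO_RIGHT_EAR + (signal * c) * RIGHT_SPEAKER_TO_RIGHT_EAR
```

The magnitude of at_right_ear is zero (to within rounding error), but only because the code knows both HRTFs exactly.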

Hypothetically, that signal (from the right loudspeaker) will reach your right eardrum at the same time as the unprocessed signal from the left loudspeaker and the two will cancel each other, just like the simple example shown in Figure 3. This effect is called crosstalk cancellation, because we use the signal from one loudspeaker to cancel the sound from the other loudspeaker that crosses to the wrong side of your head.

This then means that we have started to build a system where the output of the left loudspeaker is heard ONLY in your left ear. Of course, it’s not perfect because that cancellation signal that I sent out of the right loudspeaker gets to the left ear a little later, so we have to cancel the cancellation signal using the left loudspeaker, and back and forth forever.

If, at the same time, we’re doing the same thing for the other channel, then we’ve built a system where you have the left loudspeaker’s signal in the left ear and the right loudspeaker’s signal in the right ear; just like a pair of headphones!
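One common textbook way to handle both channels, including the “back and forth forever” part, is to treat the four loudspeaker-to-ear paths at one frequency as a 2x2 matrix and invert it. I’m not claiming that this is how any particular product does it; it’s a minimal sketch with invented HRTF values and a symmetrical head.

```python
import cmath

def invert_2x2(m):
    """Inverse of a 2x2 matrix of complex numbers."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul_2x2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

SAME_SIDE = cmath.rect(0.9, -0.7)       # made-up HRTF: loudspeaker to the nearer ear
OPPOSITE_SIDE = cmath.rect(0.5, -1.2)   # made-up HRTF: loudspeaker to the farther ear

# Rows are ears (left, right), columns are loudspeakers (left, right)
H = [[SAME_SIDE, OPPOSITE_SIDE],
     [OPPOSITE_SIDE, SAME_SIDE]]

C = invert_2x2(H)                        # the crosstalk canceller at this frequency
at_the_ears = matmul_2x2(H, C)           # canceller followed by the loudspeakers and head
```

The result is the identity matrix: the left input arrives only at the left ear and the right input only at the right ear, which is what a pair of headphones would do.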

However, if you get any of these elements wrong, the system will start to under-perform. For example, if the HRTFs that I use to predict your HRTFs are incorrect, then it won’t work as well. Or, if things aren’t time-aligned correctly (because you moved) then the cancellation won’t work.

on to Part 3

What is a “virtual” loudspeaker? Part 1

#91.1 in a series of articles about the technology behind Bang & Olufsen

Without connecting external loudspeakers, Bang & Olufsen’s Beosound Theatre has a total of 11 independent outputs, each of which can be assigned any Speaker Role (or input channel). Four of these are called “virtual” loudspeakers – but what does this mean? There’s a brief explanation of this concept in the Technical Sound Guide for the Theatre (you’ll find the link at the bottom of this page), which I’ve duplicated in a previous posting. However, let’s dig into this concept a little more deeply.

To begin, let’s put a “perfect” loudspeaker in a free field. This means that it’s in a space that has no surfaces to reflect the sound – so it’s an acoustic field where the sound wave is free to travel outwards forever without hitting anything (or at least appears as if this is the case). We’ll also put a “perfect” microphone in the same space.

Figure 1: A loudspeaker and a microphone (the circle) in a free field: an infinite space completely free of reflective surfaces.

We then send an impulse, a very short, very loud “click”, to the loudspeaker. (Actually a perfect impulse is infinitely short and infinitely loud, but this is not only inadvisable but impossible, and probably illegal.)

Figure 2: The “click” signal that’s sent to the input of the loudspeaker.

That sound radiates outwards through the free field and reaches the microphone which converts the acoustic signal back to an electrical one so we can look at it.

Figure 3: The “click” signal that is received at the microphone’s location and sent out as an electrical signal.

There are three things to notice when you compare Figure 3 to Figure 2:

  • The signal’s level is lower. This is because the microphone is some distance from the loudspeaker.
  • The signal is later. This is because the microphone is some distance from the loudspeaker and sound waves travel pretty slowly.
  • The general shapes of the signals are identical. This is because I said that the loudspeaker and the microphone were both “perfect” and we’re in a space that is completely free of reflections.
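The first two points can be estimated with a small sketch. It assumes a point source in a free field (so the level drops by about 6 dB for each doubling of distance) and a speed of sound of 343 m/s. The function name and the 1 m reference distance are my own choices.

```python
import math

def free_field_arrival(distance_m, speed_of_sound=343.0, ref_distance_m=1.0):
    """Delay (ms) and level relative to the reference distance (dB) for a point source in a free field."""
    delay_ms = 1000.0 * distance_m / speed_of_sound
    level_db = 20.0 * math.log10(ref_distance_m / distance_m)
    return delay_ms, level_db

delay_1m, level_1m = free_field_arrival(1.0)
delay_2m, level_2m = free_field_arrival(2.0)   # twice as late, and about 6 dB lower
```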

What happens if we take away the microphone and put you in the same place instead?

Figure 4: The microphone has been replaced by something more familiar.

If we now send the same click to the loudspeaker and look at the “outputs” of your two eardrums (the signals that are sent to your brain), these will look something like this:

Figure 5: The outputs of your two eardrums with the same “click” signal from the loudspeaker.

These two signals are obviously very different from the one that the microphone “hears” which should not be a surprise: ears aren’t microphones. However, there are some specific things of which we should take note:

  • The output of the left eardrum is lower than that of the right eardrum. This is largely because of an effect called “head shadowing” which is exactly what it sounds like. The sound is quieter in your left ear because your head is in the way.
  • The signal at the right eardrum is earlier than at the left eardrum. This is because the left eardrum is not only farther away, but the sound has to go around your head to get there.
  • The signal at the right eardrum is earlier than the output of the microphone (in Figure 3) because it’s closer to the loudspeaker. (I put the microphone at the location of the centre of the simulated head.) Similarly the left ear output is later because it’s farther away.
  • The signal at the right eardrum is full of spikes. This is mostly caused by reflections off the pinna (the flappy thing on the side of your head that you call your “ear”) that arrive at slightly different times, and all add together to make a mess.
  • The signal at the left eardrum is “smoother”. This is because the head itself acts as a filter reducing the levels of the high frequency content, which tends to make things less “spiky”.
  • Both signals last longer in time. This is the effect of the ear canal (the “hole” in the side of your head that you should NOT stick a pencil in) resonating like a little organ pipe.
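If you want a rough number for the difference in arrival time between the two ears, one commonly-used approximation is Woodworth’s spherical-head formula. This is a sketch with assumed values (a perfectly round head with a radius of 8.75 cm, a distant source, a speed of sound of 343 m/s), and it says nothing about the level differences or the spikes caused by the pinna.

```python
import math

def interaural_time_difference(angle_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate difference in arrival time at the two ears (seconds) for a source
    at angle_deg away from straight ahead (0 to 90 degrees), using a spherical head."""
    theta = math.radians(angle_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

itd_front = interaural_time_difference(0.0)    # straight ahead: no difference
itd_30 = interaural_time_difference(30.0)      # 30 degrees off to one side
itd_side = interaural_time_difference(90.0)    # directly to one side: the largest difference
```

With these assumptions, the largest difference is only about two-thirds of a millisecond.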

The difference between the signals in Figures 2 and 5 is a measurement of the effect that your head (including your shoulders, ears/pinnae) has on the transfer of the sound from the loudspeaker to your eardrums. Consequently, we geeks call it a “head-related transfer function” or HRTF. I’ve plotted this HRTF as a measurement of an impulse in time – but I could have converted it to a frequency response instead (which would include the changes in magnitude and phase for different frequencies).
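That conversion from an impulse response to a frequency response is done with a Fourier transform. Here is a deliberately slow but self-contained sketch using a plain DFT. The function name and the example impulse response (a unit click delayed by two samples) are invented for illustration.

```python
import cmath
import math

def frequency_response(impulse_response, sample_rate_hz):
    """Return (frequency in Hz, magnitude in dB, phase in radians) for each DFT bin up to Nyquist."""
    n = len(impulse_response)
    response = []
    for k in range(n // 2 + 1):
        bin_value = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(impulse_response))
        magnitude = abs(bin_value)
        magnitude_db = 20.0 * math.log10(magnitude) if magnitude > 0.0 else float("-inf")
        response.append((k * sample_rate_hz / n, magnitude_db, cmath.phase(bin_value)))
    return response

# A "perfect" click that arrives two samples late: same level at all frequencies,
# with a phase shift that grows with frequency
delayed_click = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
response = frequency_response(delayed_click, 48000.0)
```

A measured HRTF would have magnitudes that go up and down with frequency instead of the flat 0 dB that this delayed click gives.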

Here’s the cool thing: If I put a pair of headphones on you and played those two signals in Figure 5 to your two ears, you might be able to convince yourself that you hear the click coming from the same place as where that loudspeaker is located.
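In signal processing terms, “playing those two signals” for any sound (not just a click) means filtering the sound with the two impulse responses, one per ear. A minimal sketch, with short invented impulse responses standing in for measured ones (the real ones are much longer):

```python
def convolve(signal, impulse_response):
    """Filter a signal with an impulse response (plain convolution)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_binaural(mono, hrir_left, hrir_right):
    """Make the two earphone signals for one sound source."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Made-up impulse responses for a source on the right side
HRIR_RIGHT = [0.0, 0.9, -0.4, 0.3, -0.1, 0.0]   # earlier, louder, spikier
HRIR_LEFT = [0.0, 0.0, 0.0, 0.3, 0.2, 0.1]      # later, quieter, smoother

click = [1.0]
left_ear, right_ear = render_binaural(click, HRIR_LEFT, HRIR_RIGHT)
```

Sending a click through this just gives back the two impulse responses. Any other mono signal gets the same timing, level and “spikiness” differences applied to it.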

Although this sounds magical, don’t get too excited right away. Unfortunately, as with most things in life, reality tends to get in the way for a number of reasons:

  • Your head and ears aren’t the same shape as anyone else’s. Your brain has lived with your head and your ears for a long time, and it’s learned to correlate your HRTFs with the locations of sound sources. If I suddenly feed you a signal that uses my HRTFs, then this trick may or may not work, depending on how similar we are. This is just like borrowing someone else’s glasses. If you have roughly the same prescription, then you can see. However, if the prescriptions are very different, you’ll get a headache very quickly.
  • In reality, you’re always moving. So, even if the sound source is not moving, the specific details of the HRTFs are always changing (because the relative positions and angles to your ears are changing) but my system doesn’t know about this – so I’m simulating a system where the loudspeaker moves around you as you rotate your head. Since this never happens in real life, it tends to break the simulation.
  • The stuff I showed above doesn’t include reflections, which is how you determine distance to sources. If I wanted to include reflections, each reflection would have to have its own HRTF processing, depending on its angle relative to your head.

However, hypothetically, this can work, and lots of people have tried. The easiest way to do this is to not bother measuring anything. You just take a “dummy head” – a thing that is the same size as an average human head (maybe with an average torso) and average pinnae* – but with microphones where the eardrums are – and you plunk it down in a seat in a concert hall and record the outputs of the two “ears”. You then listen to this over earphones (we don’t use headphones because we want to remove your pinnae from the equation) and you get a “you are there” experience (assuming that the dummy head’s dimensions and shape are about the same as yours). This is what’s known as a binaural recording because it’s a recording that’s done with two ears (instead of two or more “simple” microphones).

If you want to experience this for yourself, plug a pair of headphones into your computer and do a search for the “Virtual Barber Shop” video. However, if you find that it doesn’t work for you, don’t be upset. It just means that you’re different: just like everyone else.* Typically, recordings like this have a strange effect of things sounding very close in the front, and farther away as sources go to the sides. (Personally, I typically don’t hear anything in the front. All of the sources sound like they’re sitting on the back of my neck and shoulders. This might be because I have a fat head (yes, yes… I know…) and small pinnae (yes, yes…. I know…) – or it might indicate some inherent paranoia of which I am not conscious.)

* Of course, depressingly typically, it goes without saying that the sizes and shapes of commercially-available dummy heads are based on averages of measurements of men only. Neither women nor children are interested in binaural recordings or have any relevance to such things, apparently…

on to Part 2