I just stumbled across this paper and it struck me as a brilliant idea – detecting symptoms of Parkinson’s disease by analysing frequency modulation of speech.
Acoustic measurements: Part 2
In Part 1, I talked about how any measurement of an audio device tells you something about how it behaves, but you need to know a LOT more than what you can learn from one measurement. This is especially true for a loudspeaker, where you have the extra dimensions of physical space to consider.
Thought experiment: Fridges vs. Mosquitos
Consider a situation where you’re sitting at your kitchen table, and you can hear the compressor in your fridge humming/buzzing over on the other side of the room. If you make a small movement in your chair, the hum from the fridge sounds the same to you. This is partly because the distance from the fridge to you is much bigger than the changes in that distance that result from you shifting your butt.
Now think about the times you’ve been trying to sleep on a summer night, and there’s a mosquito that is flying near your ear. Very small changes in the location of that mosquito result in VERY big changes in how it sounds to you. This is because, relative to the distance to the mosquito, the changes in distance are big.
In other words, in the case of the fridge (that’s, say, 3 m away), by moving 10 cm in your chair you were changing the distance by about 3%, but the mosquito was changing its distance by 100% by moving just from 1 cm to 2 cm away.
Put simply, a small change in distance makes a big change in sound when the distance is small to begin with.
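The fridge and mosquito numbers can be checked with a short sketch. The percentage is just arithmetic on the distances in the text. The level change in dB is an extra that I'm adding here, and it ASSUMES a point source in a free field (the inverse-distance law), which a real fridge in a real kitchen certainly is not. The function names are hypothetical.

```python
import math

def relative_change_percent(d_start_m, d_end_m):
    """Change in distance as a percentage of the starting distance."""
    return 100.0 * abs(d_end_m - d_start_m) / d_start_m

def level_change_db(d_start_m, d_end_m):
    """Level change in dB when moving from d_start_m to d_end_m.

    ASSUMPTION: point source in a free field (inverse-distance law).
    """
    return 20.0 * math.log10(d_start_m / d_end_m)

cases = {
    "fridge": (3.0, 3.1),      # 3 m away, listener shifts by 10 cm
    "mosquito": (0.01, 0.02),  # 1 cm away, mosquito moves to 2 cm
}

for name, (d0, d1) in cases.items():
    print(name, relative_change_percent(d0, d1), level_change_db(d0, d1))
```

Under that assumption, the fridge's level changes by roughly 0.3 dB, while the mosquito's changes by roughly 6 dB.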
The challenges of measuring headphones
The method we use for measuring the magnitude response of a pair of headphones is similar to that used for measuring a loudspeaker. We send a measurement signal to the headphones from a computer; that signal comes out and is received by a microphone that sends its output back to the computer. The computer is then used to determine the difference between what it sent out and what came back. Simple, right?
Wrong.
The problems start with the fact that there are some fundamental differences between headphones and loudspeakers. For starters, there’s no “listening room” with headphones, so we don’t put a microphone 3 m away from the headphones: that wouldn’t make any sense. Instead, we put the headphones on some kind of a device that either simulates an ear, or a head, or a head with ears (with or without ear canals), and that device has a microphone (roughly) where your eardrum would be. Simple, right?
Wrong.
The problem in that sentence was the word “simulates”. How do you simulate an ear or a head or a head with ears? My ears are not shaped identically to yours or anyone else’s. My head is a different size than yours. I don’t have any hair, but you might. I wear glasses, but you might not. There are many things that make us different physically, so how can the device that we use to measure the headphones “simulate” us all? The simple answer to this question is “it can’t.”
This problem is compounded by the fact that measurement devices are usually made out of plastic and metal instead of human skin, so the headphones themselves “see” a different “acoustic load” on the measurement device than they do when they’re on a human head. (The people I work with call this your acoustic impedance.)
However, if your day job is to develop or test headphones, you need to use something to measure how they’re behaving. So, we do.
Headphone measurement systems
There are three basic types of devices that are used to measure headphones.
- an artificial ear is typically a metal plate with a depression in the middle. At the bottom of the depression is a microphone. In theory, the acoustic impedance of this is similar to a human ear/pinna + the surrounding part of your head. In practice, this is impossible.
- a headphone test fixture looks like a big metal can lying on its side (about the size of an old coffee can, for example) on a base. It might have flat metal sides, or it could have rubber pinnae (the fancy word for ears) mounted on it instead. In the centre of each circular end is a microphone.
- a dummy head looks like a simplified model of a human head (typically a man’s head). It might have pinnae, but it might not. If it does, those pinnae might look very much like human ears, or they could look like simplified versions instead. There are microphones where you would expect them, and they might be at the bottom of ear canals, but you can also get dummy heads without ear canals where the microphones are flush with the side of the head.
The test system you use is up to you – but you have to know that they will all tell you something different. This is not only because each of them has a different acoustic response, but also because their different shapes and materials make the headphones themselves behave differently.
That last sentence is important to remember, not just for headphone measurement systems but also for you. If your head and my head are different from each other, AND your pinnae and my pinnae are different from each other, THEN, if I lend you my headphones, the headphones themselves will behave differently on your head than they do on my head. It’s not just our opinions of how they sound that are different – they actually sound different at our two sets of eardrums.
General headphone types
If I oversimplify headphone design, we can talk about two basic acoustical types of headphones: they can be closed (where the back of the diaphragm is enclosed in a sealed cabinet, and so the outside of the headphones is typically made of metal or plastic) or open (where the back of the diaphragm is exposed to the outside world, typically through a metal screen). I’d say that some kinds of headphones can be called semi-open, which just means that the screen has smaller (and/or fewer) holes in it, so there’s less acoustical “transparency” to the outside world.
Examples
To show that all these combinations are different, I took three pairs of headphones
- open headphones
- semi-open headphones
- closed headphones
and I measured each of them on three test devices
- artificial “simplified” ear
- test fixture with a flat plate
- dummy head
In addition, to illustrate an additional issue (the “mosquito problem”), I did each of these 9 measurements 5 times, removing and replacing the headphones between each measurement. I was intentionally sloppy when placing the headphones on the devices, but kept my accuracy within ±5 mm of the “correct” location. I also changed the clamping force of the headphones on the test devices (by changing the extension of the headband to a random place each time) since this also has a measurable effect on the measured response.
Do not bother asking which headphones I measured or which test systems I used. I’m not telling, since it doesn’t matter. Not to me, anyway…
The raw results
I did these measurements using a 10-second sinusoidal sweep from 2 Hz to Nyquist, on a system running at 96 kHz. I’m plotting the magnitude responses with a range from 10 Hz to 40 kHz. However, since the sweep starts at 2 Hz, you can’t really trust the results below 20 Hz (starting the sweep a decade below the lowest frequency of interest is a good rule of thumb when using sine sweeps).
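For anyone who wants to play with this, here is a minimal sketch of a sweep with the same start frequency, stop frequency, duration and sampling rate. The text only says "sinusoidal sweep", so the exponential (logarithmic) frequency trajectory used here is an ASSUMPTION, as are the function names. The second function just applies the "one decade" rule of thumb.

```python
import numpy as np

def exponential_sweep(f_start_hz, f_stop_hz, duration_s, fs_hz):
    """Sine sweep whose frequency rises exponentially from f_start_hz to f_stop_hz.

    ASSUMPTION: an exponential sweep; a linear sweep would also fit the description.
    """
    n = int(round(duration_s * fs_hz))
    t = np.arange(n) / fs_hz
    k = np.log(f_stop_hz / f_start_hz)
    # Phase is the integral of the instantaneous frequency f_start * exp(k * t / T).
    phase = 2.0 * np.pi * f_start_hz * duration_s / k * (np.exp(t * k / duration_s) - 1.0)
    return np.sin(phase)

def lowest_trusted_frequency(f_start_hz, decades=1.0):
    """Rule of thumb: trust results from `decades` decades above the sweep start."""
    return f_start_hz * 10.0 ** decades

fs = 96_000
sweep = exponential_sweep(2.0, fs / 2, 10.0, fs)
print(len(sweep), lowest_trusted_frequency(2.0))
```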
Looking at the results in the plots above, you can come to some very quick conclusions:
- All of the measurements are different from each other, even when you’re looking at the same headphones on the same measurement device. This is especially true in the high frequency bands.
- Each pair of headphones looks like it has a different response on each measurement system. For example, looking at Figure 3, the response of the headphones looks different when measured on a flat plate than on a dummy head.
- The differences between the results of the systems vary with the headphone type. For example, the three sets of plots for the “semi-open” headphones (Fig. 2) look more similar to each other than the three sets of plots for the “closed” headphones (Fig. 3).
- The scale of these differences is big. Notice that we have an 80 dB scale on all plots… We’re not dealing with subtleties here…
In Part 3 of this series, we’ll dig into those raw results a little to compare and contrast them and talk a little about why they are as different as they are.
Acoustic measurements: Part 1
People who work in the audio industry use all kinds of different measurements to evaluate the performance of equipment. In many cases, the measurements we do are chosen because they’re easy to do (or because they were easy to do in “The Old Days”), and not because they accurately represent how the equipment actually behaves.
Magnitude response
One simple example of this is what most people call a frequency response but what is actually a magnitude response. This is a measure of how the level of an audio signal is changed by the device under test (the “DUT”) as a function of frequency. For example, if you’re measuring a RIAA-spec preamplifier (used for converting a turntable’s pickup’s output to a “line” level signal), then it should have a magnitude response that looks like the red line in the plot in Figure 1.
This curve shows that, relative to a signal at 1 kHz, the lower the frequency, the more gain is applied to the signal and the higher the frequency, the more attenuation is applied to the signal. Note that this curve is normalised to the level at 1 kHz, which should actually be +40 dB higher if we were to include the frequency-independent gain of the system.
It’s important to remember that this plot shows us only one thing: the change in level caused by the DUT as a function of a change in frequency of the signal. What this plot does NOT show us is much, much more… For example:
- We don’t know anything about the behaviour of the system outside the boundaries of this plot.
- We don’t know anything about its phase response.
- We don’t know anything about how loud the noise of the DUT is.
- We don’t know if this plot is true if we were to measure the DUT at a different input level.
- We don’t know whether the DUT would have a different behaviour if the device that was feeding it had a different output impedance.
- We don’t know whether the DUT would have a different behaviour if the device that it was feeding had a different input impedance.
- We don’t know anything about whether the signal has any non-linear distortion artefacts.
(Notice that I didn’t say “…whether the signal is distorted” because we know it’s distorted, since the output of the DUT is not the same as the input of the DUT. Any change in the signal is a form of distortion of the signal.)
I’m not saying that a simple magnitude response plot of a DUT is not useful. I’m just saying that it’s not enough information. It’s like asking for the temperature of a cup of coffee. It’s useful information, but it doesn’t tell you enough to know whether you’re going to enjoy drinking it (unless, of course, you hate coffee…)
This problem gets even worse when you’re measuring the acoustic output of a device like a loudspeaker or a pair of headphones, for example. (The acoustic input of a microphone is a similar problem in the opposite direction.)
Let’s start by thinking about a loudspeaker’s output in real life.
- You have a device that radiates sound in space in all directions. Let’s look at that space from the loudspeaker’s perspective and say that this means an angle of rotation around the loudspeaker, and an angle of elevation above/below the loudspeaker. That makes two dimensions.
- If we’re talking about the loudspeaker’s magnitude response, then we’re looking at its output level (one dimension) as a function of frequency (one more dimension).
- That speaker is (usually) in a room, and you’re probably also there too. We can then say that this is in three-dimensional space when we talk about the walls, floor, ceiling, and your location inside that space.
- Since the surfaces in the room reflect the audio signal, then the time at which the signal arrives at the listening position must also be considered. The “sound” of a loudspeaker at a listening position before the first reflection arrives is different than after a bunch of reflections are coming in and the room has started resonating as well. So, time adds one more dimension to the problem.
- We’ll ignore the non-linear distortion artefacts produced by the loudspeaker and the fact that they radiate in different directions differently, since it’s already complicated enough… However, if we were to add things like changes in the response due to temperature of the voice coil or directionally-dependent distortion artefacts like breakup, this would wind up being a much longer discussion…
So, just looking at the small list of “usual suspects” above, we can see that evaluating the sound of a single loudspeaker in a listening room is at least an 8-dimensional problem. And this doesn’t even take things like 2-channel stereo or 7.1.4 multichannel or whether you’re listening to Aretha Franklin or Stockhausen into account…
In other words, it’s complicated. So, we use reductionism to try to start to get an idea of what’s going on. We put a microphone directly in front of a loudspeaker and measure its magnitude response at one level using one kind of test signal (e.g. a swept sine wave or an MLS) and we remove all the room’s reflections somehow. This reduces our 8-dimensional problem to a 2-dimensional version: we have level as a function of frequency and nothing else, since we’ve chosen to throw away everything else by the way we did the measurement.
For example, take a look at the magnitude response shown in Figure 2, which is a real measurement of a real loudspeaker. This measurement was performed using a swept-sine (a sinusoidal wave with a frequency that changes smoothly over time, typically from low to high) with a microphone on-axis to the loudspeaker at a distance of 3 m. The measurement was time-windowed to remove the room reflections, and therefore can be considered to be a “free field” (a sound field that is free of reflections) measurement. However, the roll-off in the low end is actually a combination of the actual response of the loudspeaker and the artefacts of using a shorter time window. (We would have needed to use a much bigger room to get less influence from the time windowing.)
So, this plot ONLY tells us how the loudspeaker behaves at one point in infinite space, when we’re ONLY asking “how does the level of the loudspeaker’s output vary with changes in frequency?” and we ONLY play sinusoidal signals at one level. This is all useful information, but we need to know more – otherwise, we’ll jump to conclusions about whether this loudspeaker sounds “good” or not.
Just like looking at ONLY the temperature of a cup of coffee, this doesn’t give us enough of the story to know how the loudspeaker will “sound” (no matter what a magazine reviewer will try and tell you…).
In other words, if we use reductionism to understand the problem, we simplify the question so much that the problem we wind up understanding is not the same as the thing we were trying to understand in the first place.
For example, if we measure that same loudspeaker at a different angle (by rotating the loudspeaker and leaving the microphone in place) we’ll see a magnitude response like the one shown in Figure 3.
This magnitude response is the output of the same loudspeaker at 90º off-axis, which might be what’s heading towards your side-wall. If your side wall is perfectly reflective, then this is the magnitude response of your first reflection, which might be a bad thing if you think that it’s important.
So, when you’re looking at any one measurement of anything, you don’t have enough information to know enough to make a general evaluation. However, unfortunately, many people will run with this information and make the evaluation anyway. It’s data, and data doesn’t lie, so this tells the truth, right?
Wrong. Because it’s only a portion of the total truth.
For example, you can say that “organic food is good for me” but I have an allergy to peanuts. So if I eat organic peanuts, I have about 20 minutes to get to a hospital. Much longer than that and I need a funeral home instead. “Organic” is true, but not enough information for me to know whether or not it’ll be an uneventful meal.
A Foundation for Electronic Music
I found this document from Roland, published in 1978. The information in here is still valuable – and presented as an excellent introduction.
Aliasing is weird: Part 3
After I posted the last two parts of this series (which I thought wrapped it up…) I received an email asking about whether there’s a similar thing happening if you remove the reconstruction (low-pass) filter in the digital-to-analogue part of the signal path.
The answer to this question turned out to be more interesting than I expected… So I wound up turning it into a “Part 3” in the series.
Let’s take a case where you have a 1 kHz signal in a 48 kHz system. The figure below shows three plots. The top plot shows the individual sample values as black circles on a red line, which is the analogue output of a DAC with a reconstruction filter.
The middle plot shows what the analogue output of the DAC would look like if we implemented a Sample-and-hold on the sample values, and we had an infinite analogue bandwidth (which means that the steps have instantaneous transitions and perfect right angles).
The bottom plot shows what the analogue output of the DAC would look like if we implemented the signal as a pulse wave instead, but if we still had an infinite analogue bandwidth. (Well… sort of…. Those pulses aren’t infinitely short. But they’re short enough to continue with this story.)
If we calculate the spectra of these three signals, they’ll look like the responses shown in Figure 2.
Notice that all three have a spike at 1 kHz, as we would expect. The outputs of the stepped wave and the pulsed wave have much higher “noise” floors, as well as artefacts in the high frequencies. I’ve indicated the sampling rate at 48 kHz as a vertical black line to make things easy to see.
We’ll come back to those artefacts below.
Let’s do the same thing for a 5 kHz sine wave, still in a 48 kHz system, seen in Figures 3 and 4.
Compare the high-frequency artefacts in Figure 4 to those in Figure 2.
Now, we’ll do it again for a 15 kHz sine wave.
There are three things to notice, comparing Figures 2, 4, and 6.
The first thing is that artefacts for the stepped and pulsed waves have the same frequency components.
The second thing is that those artefacts are related to the signal frequency and the sampling rate. For example, the two spikes immediately adjacent to the sampling rate are Fs ± Fc where Fs is the sampling rate and Fc is the frequency of the sine wave. The higher-frequency artefacts are mirrors around multiples of the sampling rate. So, we can generalise to say that the artefacts will appear at
n * Fs ± Fc
where n is an integer value.
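As a quick check on that formula, here's a small sketch that lists the artefact frequencies for the first few multiples of the sampling rate. The function name and the choice to start at n = 1 are mine; the frequencies match the three sine waves used in the plots.

```python
def image_frequencies(fc_hz, fs_hz, max_n):
    """Return the n * Fs - Fc and n * Fs + Fc components for n = 1..max_n."""
    images = []
    for n in range(1, max_n + 1):
        images.append(n * fs_hz - fc_hz)  # mirror just below the multiple of Fs
        images.append(n * fs_hz + fc_hz)  # mirror just above the multiple of Fs
    return images

for fc in (1_000, 5_000, 15_000):
    print(fc, image_frequencies(fc, 48_000, 2))
```

For the 1 kHz case, this gives 47 kHz and 49 kHz around the sampling rate, then 95 kHz and 97 kHz around twice the sampling rate.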
This is interesting because it’s aliasing, but it’s aliasing around the sampling rate instead of the Nyquist Frequency, which is what happens at the ADC and inside the digital domain before the DAC.
The third thing is a minor issue. This is the fact that the level of the fundamental frequency in the pulsed wave is lower than it is for the stepped wave. This should not be a surprise, since there’s inherently less energy in that wave (since, most of the time, it’s sitting at 0). However, the artefacts have roughly the same levels; the higher-frequency ones have even higher levels than in the case of the stepped wave. So, the “signal to THD+N” of the pulsed wave is lower than for the stepped wave.
Aliasing is Weird: Part 2
In Part 1, we looked at what happens when you try to record a signal whose frequency is higher than 1/2 the sampling rate (which, from now on, I’ll call the Nyquist Frequency, named after Harry Nyquist who was one of the people that first realised that this limit existed). You record a signal, but it winds up having a different frequency at the output than it had at the input. In addition, that frequency is related to the signal’s frequency and the sampling rate itself.
In order to prevent this from happening, digital recording systems use a low-pass filter that hypothetically prevents any signals above the Nyquist frequency from getting into the analogue-to-digital conversion process. This filter is called an anti-aliasing filter because it prevents any signals that would produce an alias frequency from getting into the system. (In practice, these filters aren’t perfect, and so it’s typical that some energy above the Nyquist frequency leaks into the converter.)
So, this means that if you put a signal that contains high frequency components into the analogue input of an analogue-to-digital converter (or ADC), it will be filtered. An example of this is shown in Figure 1, below. The top plot is a square wave before filtering. The bottom plot is the result of low-pass filtering the square wave, thus heavily attenuating its higher harmonics. This results in a reduction in the slope when the wave transitions between low and high states.
This means that, if I have an analogue square wave and I record it digitally, the signal that I actually record will be something like the bottom plot rather than the top one, depending on many things like the frequency of the square wave, the characteristics of the anti-aliasing filter, the sampling rate, and so on. Don’t go jumping to conclusions here. The plot above uses an aggressively exaggerated filter to make it obvious that we do something to prevent aliasing in the recorded signal. Do NOT use the plots as proof that “analogue is better than digital” because that’s a one-dimensional and therefore very silly thing to claim.
However…
… just because we keep signals with frequency content above the Nyquist frequency out of the input of the system doesn’t mean that they can’t exist inside the system. In other words, it’s possible to create a signal that produces aliasing after the ADC. You can either do this by
- creating signals from scratch (for example, generating a sine tone with a frequency above Nyquist)
- or by producing artefacts because of some processing applied to the signal (like clipping, for example).
Let’s take a sine wave and clip it after it’s been converted to a digital signal with a 48 kHz sampling rate, as is shown in Figure 2.
When we clip a signal, we generate high-frequency harmonics. For example, the signal in Figure 2 is a 1 kHz sine wave that I clipped at ±0.5. If I analyse the magnitude response of that, it will look something like Figure 3:
The red curve in Figure 2 is not a ‘perfect’ square wave, so the harmonics seen in Figure 3 won’t follow the pattern that you would expect for such a thing. But that’s not the only reason this plot will be weird…
Figure 3 is actually hiding something from you… I clipped a 1 kHz sine wave, which makes it square-ish. This means that I’ve generated harmonics at 3 kHz, 5 kHz, 7 kHz, and so on, up to ∞ Hz.
Notice there that I didn’t say “up to the Nyquist frequency”, which, in this example with a sampling rate of 48 kHz, would be 24 kHz.
Those harmonics above the Nyquist frequency were generated, but then stored as their aliases. So, although there’s a new harmonic at 25 kHz, the system records it as being at 48 kHz – 25 kHz = 23 kHz, which is right on top of the harmonic just below it.
In other words, when you look at all the spikes in the graph in Figure 3, you’re actually seeing at least two spikes sitting on top of each other. One of them is the “real” harmonic, and the other is an alias (there are actually more, but we’ll get to that…). However, since I clipped a 1 kHz sine wave in a 48 kHz world, this lines up all the aliases to be sitting on top of the lower harmonics.
So, what happens if I clip a sine wave with a frequency that isn’t nicely related to the sampling rate, like 900 Hz in a 48 kHz system, for example? Then the result will look more like Figure 4, which is a LOT messier.
A 900 Hz square wave will have harmonics at odd multiples of the fundamental, therefore at 2.7 kHz, 4.5 kHz, and so on up to 22.5 kHz (900 Hz * 25).
The next harmonic is 24.3 kHz (900 Hz * 27), which will show up in the plots at 48 kHz – 24.3 kHz = 23.7 kHz. The next one will be 26.1 kHz (900 Hz * 29) which shows up in the plots at 21.9 kHz. This will continue back DOWN in frequency through the plot until you get to 900 Hz * 53 = 47.7 kHz which will show up as a 300 Hz tone, and now we’re on our way back up again… (Take a look at Figure 7, below for another way to think of this.)
The next harmonic will be 900 Hz * 55 = 49.5 kHz which will show up in the plot as a 1.5 kHz tone (49.5 kHz – 48 kHz).
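The bookkeeping in the last few paragraphs can be sketched in a few lines. This is only the arithmetic of where each odd harmonic lands, not a simulation of the clipped signal, and it reports only the apparent frequency (it ignores the "backwards in time" sign). The function names are hypothetical.

```python
def folded_frequency(f_hz, fs_hz):
    """Frequency at which a component at f_hz appears in a system sampled at fs_hz."""
    f_mod = f_hz % fs_hz               # the pattern repeats every fs_hz
    return min(f_mod, fs_hz - f_mod)   # mirror the upper half back below Nyquist

def folded_odd_harmonics(f0_hz, fs_hz, highest_order):
    """(order, true frequency, apparent frequency) for odd harmonics up to highest_order."""
    return [(k, k * f0_hz, folded_frequency(k * f0_hz, fs_hz))
            for k in range(1, highest_order + 1, 2)]

for order, true_hz, seen_hz in folded_odd_harmonics(900, 48_000, 55):
    if order >= 25:
        print(order, true_hz, seen_hz)
```

Running the same function with a 1 kHz fundamental shows every folded harmonic landing exactly on top of an existing one, which is why that plot looks "clean".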
Depending on the relationship between the square wave’s frequency and the sampling rate, you either get a “pretty” plot, like for the 6 kHz square wave in a 48 kHz system, as shown in Figure 5.
Or, it’s messy, like the 7 kHz square wave in a 48 kHz system in Figure 6.
The moral of the story
There are three things to remember from this little pair of posts:
- Some aliased artefacts are negative frequencies, meaning that they appear to be going backwards in time as compared to the original (just like the wheel appearing to rotate backwards in Part 1).
- Just because you have an antialiasing filter at the input of your ADC does NOT protect you from aliasing, because it can be generated internally, after the signal has been converted to the digital domain.
- Once this aliasing has happened (e.g. because you clipped the signal in the digital domain), then the aliases are in the signal below the Nyquist frequency and therefore will not be removed by the reconstruction low-pass filter in the DAC. Once they’re mixed in there with the signal, you can’t get them out again.
One additional, but smaller problem with all of this is that, when you look at the output of an FFT analysis of a signal (like the top plot in Figure 7, for example), there’s no way for you to know which components are “normal” harmonics, and which are aliased artefacts that are actually above the Nyquist frequency. It’s another case proving that you need to understand what to expect from the output of the FFT in order to understand what you’re actually getting.
Aliasing is weird: Part 1
One of the best-known things about digital audio is the fact that you cannot record a signal that has a frequency that is higher than 1/2 the sampling rate.
Now, to be fair, that statement is not true. You CAN record a signal that has a frequency that is higher than 1/2 the sampling rate. You just won’t be able to play it back properly, because what comes out of the playback will not be the original frequency, but an alias of it.
If you record a one-spoked wheel with a series of photographs (in the old days, we called this ‘a movie’), the photos (the frames of the movie) might look something like this:
As you can see there, the wheel happens to be turning at a speed that results in it rotating 45º every frame.
The equivalent of this in a digital audio world would be if we were recording a sine wave that rotated (yes…. rotated…) 45º every sample, like this:
Notice that the red lines indicating the sample values are equivalent to the height of the spoke at the wheel rim in the first figure.
If we speed up the wheel’s rotation so that it rotated 90º per frame, it looks like this:
And the audio equivalent would look like this:
Speeding up even more to 135º per frame, we get this:
and this:
Then we get to a magical speed where the wheel rotated 180º per frame. At this speed, it appears when we look at the playback of the film that the wheel has stopped, and it now has two spokes.
In the audio equivalent, it looks like the result is that we have no output, as shown below.
However, this isn’t really true. It’s just an artefact of the fact that I chose to plot a sine wave. If I were to change the phase of this to be a cosine wave (at the same frequency) instead, for example, then it would definitely have an output.
At this point, the frequency of the audio signal is 1/2 the sampling rate.
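Here's a minimal numerical sketch of that special case, using a 48 kHz sampling rate as an example value of my own choosing. The sine samples all come out as (numerically) zero, whereas the cosine at the same frequency alternates between +1 and -1.

```python
import numpy as np

fs = 48_000
f = fs / 2                 # signal frequency at exactly half the sampling rate
n = np.arange(16)          # sample indices

sine_samples = np.sin(2 * np.pi * f * n / fs)
cosine_samples = np.cos(2 * np.pi * f * n / fs)

# The sine values are zero apart from floating-point rounding error.
print(np.round(sine_samples, 6))
print(np.round(cosine_samples, 6))
```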
What happens if the wheel goes even faster (and the audio signal’s frequency goes above this)?
Notice that the wheel is now making more than a half-turn per frame. We can still record it. However, when we play it back, it doesn’t look like what happened. It looks like the wheel is going backwards like this:
Similarly, if we record a sine wave that has a frequency that is higher than 1/2 the sampling rate like this:
Then, when we play it back, we get a lower frequency that fits the samples, like this:
Just a little math
There is a simple way to calculate the frequency of the signal that you get out of the system if you know the sampling rate and the frequency of the signal that you tried to record.
Let’s use the following abbreviations to make it easy to state:
- Fs = Sampling rate
- F_in = frequency of the input signal
- F_out = frequency of the output signal
IF
F_in < Fs/2
THEN
F_out = F_in
IF
Fs > F_in > Fs/2
THEN
F_out = Fs/2 – (F_in – Fs/2) = Fs – F_in
Some examples:
If your sampling rate is 48 kHz, and you try to record a 25 kHz sine wave, then the signal that you will play back will be:
48000 – 25000 = 23000 Hz
If your sampling rate is 48 kHz, and you try to record a 42 kHz sine wave, then the signal that you will play back will be:
48000 – 42000 = 6000 Hz
So, as you can see there, as the input signal’s frequency goes up, the alias frequency of the signal (the one you hear at the output) will go down.
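The two cases above translate directly into a small function. This is a sketch of just those two cases, so it refuses inputs at or above the sampling rate. The function name is hypothetical, and how to treat an input at exactly Fs/2 (which the two cases don't cover) is my own choice.

```python
def playback_frequency(f_in_hz, fs_hz):
    """Output frequency for an input below the sampling rate, following the two cases above."""
    if not 0 <= f_in_hz < fs_hz:
        raise ValueError("this sketch only covers 0 <= F_in < Fs")
    if f_in_hz < fs_hz / 2:
        return f_in_hz           # below Nyquist: comes out unchanged
    # Above Nyquist: mirrored around Fs/2.
    # (At exactly Fs/2, which neither case covers, this returns Fs/2.)
    return fs_hz - f_in_hz

for f_in in (1_000, 25_000, 42_000):
    print(f_in, playback_frequency(f_in, 48_000))
```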
There’s one more thing…
Go back and look at that last figure showing the playback signal of the sine wave. It looks like the sine wave has an inverted polarity compared to the signal that came into the system (notice that it starts on a downwards-slope whereas the input signal started on an upwards-slope). However, the polarity of the sine wave is NOT inverted. Nor has the phase shifted. The sine wave that you’re hearing at the output is going backwards in time compared to the signal at the input, just like the wheel appears to be rotating backwards when it’s actually going forwards.
In Part 2, we’ll talk about why you don’t need to worry about this in the real world, except when you REALLY need to worry about it.
Distortion effects on Linear measurements, Part 4
In Part 3, I showed that the magnitude responses calculated from impulse responses produced by the MLS and swept sine methods differ from each other when the measurement signals themselves are distorted.
In this posting, I’ll focus on the swept sine method, where the apparent magnitude response of the system looked like a strange version of a low shelving filter. There’s a really easy explanation for this that goes back to something I wrote in Part 1.
The way these systems work is to cross-correlate the signal that comes back from the DUT with the signal that was sent to it. Cross-correlation (in this case) is a bit of math that tells you how similar two signals are when they’re compared over a change in time (sort of…). So, if the incoming signal is identical to the outgoing signal at one moment in time but no other, then the result (the impulse response) looks like a spike that hits 1 (meaning “identical”) at one moment, and is 0 (meaning “not at all alike in any way…”) at all other times.
However, one important thing to remember is that both an MLS signal and a swept sine wave take some time to play. So, on the one hand, it’s a little weird to think of a 10-second sweep or MLS signal being converted to a theoretically-infinitely short impulse. On the other hand, this can be done if the system behaves linearly and its behaviour doesn’t change over time: something we call a Linear Time-Invariant (or LTI) system.
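To make the "long signal in, short spike out" idea a bit more concrete, here's a minimal sketch. It is NOT the measurement system described in these posts: the test signal is plain white noise standing in for an MLS or sweep, and the "DUT" is a hypothetical one that does nothing but delay the signal by 25 samples and halve its level. Since that DUT is LTI, the cross-correlation collapses to a single spike at the delay.

```python
import numpy as np

rng = np.random.default_rng(0)
sent = rng.standard_normal(4096)       # stand-in measurement signal (white noise)

delay = 25                             # hypothetical DUT: pure delay plus a gain of 0.5
received = np.zeros(len(sent) + delay)
received[delay:] = 0.5 * sent

# Cross-correlate what came back with what was sent,
# scaled by the energy of the sent signal so that a perfect match gives the gain.
xcorr = np.correlate(received, sent, mode="full") / np.dot(sent, sent)
lags = np.arange(-(len(sent) - 1), len(received))

peak = np.argmax(np.abs(xcorr))
print(lags[peak], xcorr[peak])
```

The location of the spike gives the delay of the DUT and its height gives the gain; everything else in the result is close to zero.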
But what happens if the DUT’s behaviour DOES change over time? Then things get weird.
At the end of Part 1, I said
For both the MLS and the sine sweep, I’m applying a pre-emphasis filter to the signal sent to the DUT and a reciprocal de-emphasis filter to the signal coming from it. This puts a bass-heavy tilt on the signal to be more like the spectrum of music. However, it’s not a “pinking” filter, which would cause a loss of SNR due to the frequency-domain slope starting at too low a frequency.
Then, in Part 2 I said that, to distort the signals, I
look for the peak value of the measurement signal coming into the DUT, and then clip it.
It’s the combination of these two things that results in the magnitude response of the system measured using a swept sine wave looking the way it does.
If I look at the signal that I actually send to the input of the DUT, it looks like this:
I’m normalising this to have a maximum value of 1 and then clipping it at some value like ±0.5, for example, like this:
So, it should be immediately obvious that, by choosing to clip the signal at 1/2 of the maximum value of the whole sweep, I’m not clipping the entire thing. I’m only distorting signals below some frequency that is related to the level at which I’m clipping. The harder I clip, the higher the frequency I mess up.
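To put rough numbers on this, here is a minimal sketch. The shelf in `preemphasis_gain` (corner frequencies of 100 Hz and 1000 Hz) is a hypothetical stand-in for the real pre-emphasis filter, whose details are not given here. It looks only at the peak level of the sweep at each frequency and finds the highest frequency that still pokes above the clip level.

```python
import math


def preemphasis_gain(f, f_lo=100.0, f_hi=1000.0):
    """Hypothetical bass-heavy shelf: gain 1 at DC, f_lo / f_hi at the top."""
    return (f_lo / f_hi) * math.sqrt((f * f + f_hi * f_hi) / (f * f + f_lo * f_lo))


def highest_clipped_frequency(clip_level, f_start=20.0, f_stop=20000.0, steps=2000):
    """Highest sweep frequency whose normalised peak exceeds clip_level.

    The sweep is normalised so that its largest peak (at f_start) is 1.
    Returns None if nothing is clipped.
    """
    reference = preemphasis_gain(f_start)
    highest = None
    for k in range(steps + 1):
        f = f_start * (f_stop / f_start) ** (k / steps)  # log-spaced frequencies
        peak = preemphasis_gain(f) / reference
        if peak > clip_level:
            highest = f
    return highest


f_half = highest_clipped_frequency(0.5)      # clip at 1/2 of the maximum
f_quarter = highest_clipped_frequency(0.25)  # clip harder, at 1/4

print(round(f_half), round(f_quarter))
```

With this made-up shelf, clipping at 0.5 affects the sweep up to somewhere around 180 Hz, and clipping at 0.25 pushes that limit up to a few hundred Hz. The exact numbers depend entirely on the assumed filter; the point is only the direction of the trend.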
This is why, when we look at the magnitude response, it looks like this:
In the very low frequencies, the magnitude response is flat, but lower than expected, because the signal is clipped by the same amount. In the high frequencies, the signal is not clipped at all, so everything is behaving. In between these two bands, there is a transition between “not-behaving” and “behaving”.
This means that
- if the signal I was sending into the system was clipped by the same amount at all frequencies, OR
- if the bass-boosting pre-emphasis wasn’t applied to the signal
then the magnitude response would look almost flat, but lower than expected (by an amount that is related to how much it’s clipped). In other words, we would (mostly) see the linear response of the system, even though it was behaving non-linearly – almost as if we had only sent a click through it.
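As a rough way to see how much lower, here is a minimal sketch. It assumes that the level reported by the measurement is approximately the level of the fundamental that survives the clipping, which is a simplification. It clips a 1 kHz sine wave at 0.5 of its peak, measures the remaining fundamental with a single DFT bin, and compares that to the textbook expression for a symmetrically clipped sine wave.

```python
import math

fs = 48000
f0 = 1000.0
n_samples = 480          # exactly 10 periods of 1 kHz at 48 kHz
clip_fraction = 0.5      # clip at half of the peak value

sine = [math.sin(2 * math.pi * f0 * n / fs) for n in range(n_samples)]
clipped = [max(-clip_fraction, min(clip_fraction, s)) for s in sine]

# Amplitude of the 1 kHz component, from a single DFT bin
re = sum(x * math.cos(2 * math.pi * f0 * n / fs) for n, x in enumerate(clipped))
im = sum(x * math.sin(2 * math.pi * f0 * n / fs) for n, x in enumerate(clipped))
measured = 2.0 * math.hypot(re, im) / n_samples

# Closed-form fundamental of a unit sine wave clipped symmetrically at c
c = clip_fraction
analytic = (2.0 / math.pi) * (math.asin(c) + c * math.sqrt(1.0 - c * c))

measured_db = 20 * math.log10(measured)
analytic_db = 20 * math.log10(analytic)   # about -4.3 dB

print(round(analytic_db, 1))
```

So, under these assumptions, clipping at half the peak costs roughly 4.3 dB at the fundamental, and it would cost the same at every frequency if every frequency were clipped by the same amount.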
However, if we chose to not apply the pre-emphasis to the signal, then the DUT wouldn’t be behaving the way it normally does, since the pre-emphasised signal is very roughly equivalent to the spectral balance of music. For example, if you send a swept sine wave from 20 Hz to 20,000 Hz to a loudspeaker without applying that bass boost, you’ll either get almost nothing out of your woofer, or you’ll burn out your tweeter (depending on how loudly you’re playing the sweep).
How does the result look without the pre-emphasis filter applied to the swept sine wave? For example, if we sent this to the DUT:
… and then we clipped it at 1/2 the maximum value, so it looks like this:
(notice that everything is clipped)
then the impulse response and magnitude response look like this instead:
… which is more similar to the results when we clip the MLS measurement signal in that we see the effects on the top end of the signal. However, it’s still not a real representation of how the DUT “sounds” whatever that means…
Distortion effects on Linear measurements, Part 3
This posting will just be some more examples of the artefacts caused by symmetrical clipping of the measurement signal for the MLS and swept-sine methods, clipping at different levels.
Remember that the clip level is relative to the peak level of the measurement signal.
MLS
Swept Sine
The take-home message here is that, although both the MLS and the swept sine methods suffer from showing you strange things when the DUT is clipping, the swept sine method is much less cranky…
In the next posting, I’ll explain why this is the case.
Distortion effects on Linear measurements, Part 2
Let’s make a DUT with a simple distortion problem: It clips the signal symmetrically at 0.5 of the peak value of the signal, so if I send in a sine wave at 1 kHz, it looks like this:
Now, to be fair, what I’m REALLY doing here is looking for the peak value of the measurement signal coming into the DUT, and then clipping it. This would be equivalent to doing a measurement of the DUT and adjusting the input gain so that it looks like a peak level of -6 dB relative to maximum is coming in.
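A minimal sketch of that clipping step, assuming a 48 kHz sampling rate and using a made-up function name (`clip_relative_to_peak`), could look like this. It also shows where the -6 dB figure comes from.

```python
import math


def clip_relative_to_peak(signal, clip_fraction=0.5):
    """Symmetrically clip a signal at clip_fraction of its own peak value."""
    peak = max(abs(s) for s in signal)
    limit = clip_fraction * peak
    return [max(-limit, min(limit, s)) for s in signal]


fs = 48000  # assumed sampling rate in Hz
sine = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(480)]  # 10 ms of 1 kHz

clipped = clip_relative_to_peak(sine, 0.5)

# A clip level of 0.5 of the peak, expressed in decibels
clip_db = 20 * math.log10(0.5)
print(round(clip_db, 2))  # -6.02
```

Because the clip level follows the peak of whatever signal comes in, the same function can be applied to the MLS and swept sine signals without changing anything else.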
Also, because what I’m about to do through this series is going to have radical effects on the level after processing, I’m normalising the levels. So, some things won’t look right from a perspective of how-loud-it-appears-to-be.
If I measure that DUT using the three methods, the results look like this:
As can easily be seen above, the three systems show very different responses. So, unlike what I claimed in this post (which is admittedly over-simplified, although intentionally so to make a point…), the fact that they are measuring the impulse response does not mean that we can’t see the effects of the non-linear response. We can obviously see artefacts in the linear response that are caused by the distortion, but those artefacts don’t look like distortion, and they don’t really show us the real linear response.