Wednesday, May 18, 2022

Thinking about EQ

In the last few days I've been re-examining my upper midrange EQ.  So far, it seems as if something like what I've dialed in for years, my extended version of a "Linkwitz-Gundry" dip, is needed, though I'm not exactly sure what form it should take, what process should be used to optimize it, or how it should be implemented.

But there are many ways of creating this dip.  The actual center frequencies could be anywhere within a wide range of possibilities, from 2kHz to 12kHz.  There has to be more than one Parametric EQ (PEQ) or Graphic EQ slider in use.  It has to start falling around 2-3kHz and keep attenuating until at least 7kHz, if not 12kHz or 16kHz.  The resulting response curve should be fairly smooth and more or less monotonically falling at higher frequencies.  Those are about the only things I think I know for sure.  There are infinite combinations of PEQs that would meet that general sketch.  The smooth and monotonic response criteria are about the hardest to achieve.  I'm not sure the monotonic part actually matters at high enough frequencies (above about 12kHz).  I've long used a supertweeter that peaks around 22kHz and thought that was dandy; however, perhaps the supertweeter doesn't make any difference at all (I haven't bothered to test in double blind experiments...I suspect it would not be easy to prove, as I believe nobody else has!...the widely known "proof" of the audibility of ultrasonics currently comes from brain wave experiments...and even those results are uncertain if not speculative...but I do ultrasonics because I can, it always seems to make an important difference in sighted testing, and it's nice to believe I'm ahead of the curve in this way).
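
For what it's worth, here is a minimal sketch (in Python with numpy/scipy, which I'm assuming here) of how a dip like that could be built from a few overlapping PEQ cuts and then checked for smoothness and monotonic fall.  The center frequencies, gains, and Q values below are placeholders for illustration, not my actual settings.

import numpy as np
from scipy.signal import sosfreqz

def peaking_sos(f0, gain_db, q, fs):
    """RBJ audio-EQ-cookbook peaking filter as one second-order section."""
    A = 10 ** (gain_db / 40.0)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    sos = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A,
                    1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return sos / sos[3]          # normalize so a0 = 1

fs = 48000
# Hypothetical dip: three overlapping broad cuts from about 3 kHz on up
bands = [(3000, -2.0, 1.0), (6000, -3.0, 0.8), (11000, -2.5, 0.7)]
sos = np.vstack([peaking_sos(f0, g, q, fs) for f0, g, q in bands])

w, h = sosfreqz(sos, worN=2 ** 14, fs=fs)
mag_db = 20 * np.log10(np.abs(h) + 1e-12)

# Sanity checks: net depth, and whether the combined curve ever creeps back up
band = (w >= 3000) & (w <= 12000)
print(f"net cut at 7 kHz: {np.interp(7000, w, mag_db):.1f} dB")
print(f"largest upward step, 3-12 kHz: {np.max(np.diff(mag_db[band])):.4f} dB")

If the second number comes out above zero, the combined curve isn't truly monotonic in that range, which is exactly the kind of thing that's hard to see by staring at slider positions.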

But at audible frequencies, roughness and lack of monotonicity often seem to be especially audible, about as much as the actual response levels involved.  So I think it's reasonable to try to minimize those even lacking proof that they are, specifically, a problem.

I have to be very clear in saying that Real Time Analyzer (RTA) response curves, like the many pictures included in the previous post, are not the truth in and of themselves.  This is true regardless of resolution, accuracy, repeatability, or other qualitative factors...though shortcomings in those factors can make the RTA even less true.

It's like taking a picture of a crowd.  Every picture is going to be different.  It could happen that even every pixel in every picture will be different, and yet it's the same crowd.  Meanwhile, the information in any one picture isn't very complete.  Some people, for example, might be turned away from the camera and you don't see their faces at all, yet their faces may appear in other pictures.  Either way you know everyone has a face, etc, even if you can't see it.

So it is with RTAs, and in fact with any other kind of measurement.  It's true of sighted and blind listening tests as well.  They are all incomplete.  Make that very incomplete.  There are deeper truths that no one measurement, no one listening test, etc, no matter how good, will capture.  And most measurements we make are very very incomplete.  As with a snapshot of a crowd.

We know, for example, that along with the frequency response variation shown in an RTA there is also time delay variation, loosely called phase and more precisely characterized as Group Delay, and this is NOT shown in the RTA.

You can get the Group Delay from comparison measurements, where you are able to compare the input and output signals.  The measurement could be made with simple impulses, maximum length sequences (MLS), or even gated noise, so long as an input-to-output comparison is being made, and typically this is done with Fast Fourier Transforms (FFT) or something similar.  I have several programs that do this, and they show both the frequency and the phase response.
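
As a sketch of that idea (not any particular program's method; the "unknown system" here is just a stand-in filter, and scipy is assumed), an input-to-output comparison via FFT yields both the magnitude response and the group delay:

import numpy as np
from scipy.signal import butter, lfilter

fs = 48000
n = 2 ** 16
x = np.zeros(n)
x[0] = 1.0                                  # stimulus: a simple impulse
# Stand-in for the unknown loudspeaker/room system: 2nd-order low-pass at 5 kHz
b, a = butter(2, 5000, fs=fs)
y = lfilter(b, a, x)                        # the "measured" output

# Compare output to input in the frequency domain
H = np.fft.rfft(y) / np.fft.rfft(x)
f = np.fft.rfftfreq(n, 1 / fs)

mag_db = 20 * np.log10(np.abs(H) + 1e-12)
phase = np.unwrap(np.angle(H))
group_delay = -np.gradient(phase, 2 * np.pi * f)   # seconds

print(f"level at 5 kHz:       {np.interp(5000, f, mag_db):6.2f} dB")
print(f"group delay at 5 kHz: {np.interp(5000, f, group_delay) * 1e6:6.1f} microseconds")

With a real measurement you would substitute the recorded stimulus and the microphone signal for x and y, and noise and room reflections would make the phase much harder to read cleanly.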

That's just one tiny way in which an RTA is incomplete, and it may not even be that important anyway, for one of several reasons:

1) It appears we are far less sensitive to variations in time delay than to frequency amplitude response. 

2) We don't even have any simple ways to describe it.  We could, hypothetically, describe distortions in the depth or other aspects of imaging that might result from it (though in some cases it might not even be relevant, imaging depth being more dependent on the reverb inherent in a recorded voice...and reproducing that might be mostly a matter of low noise and distortion, not linear phase reproduction per se).

3) If we are to fix the frequency response, in nearly all cases we will also fix the group delay response.  This is because most physical phenomena are minimum phase, which means they can be inverted by an equal and opposite minimum phase process that perfectly cancels them.

I'm not entirely clear that all room-response-related variation is minimum phase, but at least one audio reviewer and mathematician, REG, has said so.  It does in fact seem you can model the room system with linear equations.

Non-minimum phase processes become possible once you go beyond simple physical resonances...for example with the delays and discrete sampling of digital processing.  Thus, we can have non-minimum phase processes in digital audio; essentially we can model or approximate anything.  We can make the phase and/or amplitude change as we wish, and not only in minimum phase ways.
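
For instance (a sketch, scipy assumed), a symmetric FIR filter, which is trivial to make digitally but unnatural for physical systems, has constant group delay, something no minimum phase system with the same magnitude response would give you:

import numpy as np
from scipy.signal import firwin, group_delay

fs = 48000
numtaps = 255
# Symmetric (linear-phase) FIR low-pass at 8 kHz
taps = firwin(numtaps, 8000, fs=fs)

w, gd = group_delay((taps, [1.0]), w=2048, fs=fs)
passband = (w > 500) & (w < 6000)
print("group delay in the passband, in samples; should sit at (numtaps-1)/2 = 127:")
print(np.round(gd[passband][:6], 3))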

Now, imagine a series of non-interacting frequency response deviations a system might have.  You could imagine them as a list of unplanned/unwanted parametric EQs (UEQs).  For each UEQ, we create an equal and opposite PEQ, and all these deviations are corrected.
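
Here is a toy version of that idea (scipy again, with a made-up UEQ at 2 kHz).  Because an RBJ-style peaking boost and the equal and opposite cut are exact inverses, the cascade comes out flat in both magnitude and phase, down to arithmetic precision:

import numpy as np
from scipy.signal import sosfreqz

def peaking(f0, gain_db, q, fs=48000):
    """RBJ audio-EQ-cookbook peaking EQ as one second-order section."""
    A = 10 ** (gain_db / 40.0)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    sos = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A,
                    1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return sos / sos[3]

ueq = peaking(2000, +6.0, 1.4)   # hypothetical unwanted minimum-phase peak
peq = peaking(2000, -6.0, 1.4)   # the equal and opposite correction
w, h = sosfreqz(np.vstack([ueq, peq]), worN=8192, fs=48000)

print(f"worst magnitude error: {np.max(np.abs(20 * np.log10(np.abs(h)))):.2e} dB")
print(f"worst phase error:     {np.degrees(np.max(np.abs(np.angle(h)))):.2e} degrees")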

But there are many issues here.  For one thing, we don't really know the parameters of each UEQ.  We don't even know how many UEQs there are!  Judging by the complexity of response curves, which get more complex the better you can measure them, I'd guess the answer is way more than we can easily count.

So, probably even in the best of cases, we are only correcting to a simplification of the actual problem.  But imagine the difference between a system we could perfectly correct, and one we can't quite correct perfectly, only approximately.

Imagine the true UEQ has a full octave bandwidth at 1kHz.  But what we have, instead, is a 1/3 octave graphic equalizer that gives us 800, 1000, 1250, and 1600 Hz sliders.  We can push the sliders up to approximate the inverse of the full octave bandwidth UEQ, but it will never perfectly match (though Behringer had a trademarked method which was supposed to remove the response ripples from using sliders like this, something like "True Response").  Instead, compared with the original, there will be small ripples in the amplitude response at the transitions, and ripples in the phase response as well.
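
Here is a rough sketch of that mismatch (scipy assumed; the RBJ-style filter shapes, the gains, and the octave-wide 6 dB UEQ at 1 kHz are all assumptions for illustration, and a real graphic EQ's band shapes will differ, which is presumably what Behringer's scheme was addressing):

import numpy as np
from scipy.signal import sosfreqz

FS = 48000

def peaking_bw(f0, gain_db, bw_oct, fs=FS):
    """RBJ peaking EQ with its bandwidth specified in octaves."""
    A = 10 ** (gain_db / 40.0)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) * np.sinh(np.log(2) / 2 * bw_oct * w0 / np.sin(w0))
    sos = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A,
                    1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return sos / sos[3]

# The imagined problem: a one-octave-wide 6 dB dip centered at 1 kHz
ueq = peaking_bw(1000, -6.0, 1.0)
# The imperfect fix: three 1/3-octave graphic-EQ boosts near it (gains are guesses)
fix = [peaking_bw(800, +3.0, 1 / 3),
       peaking_bw(1000, +5.0, 1 / 3),
       peaking_bw(1250, +3.0, 1 / 3)]

w, h = sosfreqz(np.vstack([ueq] + fix), worN=2 ** 14, fs=FS)
mag_db = 20 * np.log10(np.abs(h) + 1e-12)
phase = np.unwrap(np.angle(h))
gd_ms = -np.gradient(phase, 2 * np.pi * w) * 1e3

band = (w > 400) & (w < 2500)
print(f"residual amplitude ripple, 400 Hz to 2.5 kHz: {np.ptp(mag_db[band]):.2f} dB")
print(f"residual group delay ripple:                  {np.ptp(gd_ms[band]):.3f} ms")

The printed residuals are the ripples in question: whatever doesn't cancel shows up in both the amplitude and the group delay.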

Should we go ahead with the equalization anyway?  I believe the answer would be yes: even if we can't (and in fact we never can) correct the response perfectly, our approximation helps, and it may in fact help in both the frequency response AND the group delay domains, restoring a better approximation of the original.

I'm a bit shaky, however, about the ripples.  If the ripples we create with our imperfect, approximate corrections are too big, then the answer might be no.

So I can see the wiggles in the frequency response RTA, to at least 1/6 octave resolution, and get some idea of whether my alterations are making things better or worse (but each measurement is fairly ad hoc: I have no precise positioning mechanism, various kinds of household noises are included, there's random junk in the living room, etc etc).  But I can't see the changes in the ripples in the phase response.  Given the points made above, though, I think that's a second order concern at best.  I merely need to keep the resulting frequency response fairly smooth, and most likely the group delay ripples won't be too big either.  And we especially have to avoid jaggedness or lack of monotonicity.

So this justifies a kind of "correct to the measurements" approach I've long abhorred, as compared with "correct to the model."  The problem is, with anything other than the bass response, I don't really have a very good model yet.  These aren't simply "response peaks" caused by room modes, though they may be caused or influenced by room modes.  They could also be affected by the geometries and materials used in the Acoustat speakers, including their compound transformer system.

Now a couple of days ago, I was saying how wonderful it was that I could boost two dips around 1kHz (at 1013Hz and 857 Hz specifically) for a better sounding midrange, and how that was better than using 1/3 octave sliders (actually, it can't very well be done with sliders...maybe...now I'm not sure I did it very well with two PEQs, because I keep seeing that 900 Hz dip in some if not all RTAs, which just tells you how variable RTAs can be; sometimes 900 Hz is even higher than 1000 Hz, etc.  But I'm now thinking my magic didn't fix the problem that well).

I'm about ready to re-do the whole thing in graphic EQ and see how that works.  But I've come to appreciate the digital accuracy and control of the DEQ 2496 as compared with an analog Graphic EQ.  I need to liberate one DEQ from the bedroom, where it has long been slated to be replaced by a new miniDSP which has been in my inventory for about a year without being set up yet.  I need that to do more convenient tests; I've been straining my body getting down on the floor to adjust the DEQ currently in line for the panels (and repeating every minute or two to readjust, then back up again for the measuring).

A fairly reasonable strategy would be to do most of the correction using the 1/3 octave graphic, then fix remaining irregularities with the PEQ, which can do very narrow corrections down to 1/10 octave.  Though I'm still tempted to tame the big 4.5 kHz rise with a one-octave-or-wider PEQ, just for starters.
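
For reference when going back and forth between the graphic's fixed 1/3 octave bands and a PEQ's bandwidth setting, the usual bandwidth-to-Q conversion is a quick bit of arithmetic (exact definitions vary a little between equalizers, so treat these as ballpark figures):

import math

def q_from_octaves(bw_oct):
    """Approximate Q of a peaking filter with the given bandwidth in octaves."""
    return math.sqrt(2 ** bw_oct) / (2 ** bw_oct - 1)

for bw in (2.0, 1.0, 1 / 3, 1 / 10):
    print(f"{bw:5.2f} octave  ->  Q of about {q_from_octaves(bw):5.1f}")

So a one octave correction is roughly Q 1.4, a 1/3 octave band roughly Q 4.3, and a 1/10 octave touch-up is up around Q 14.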

Now about the Linkwitz-Gundry dip I remain uncertain.  But it also seems to me that a planar radiator is radiating A LOT more highs into the room than a traditional point source tweeter, which has very reduced dispersion at high frequencies and is aimed right at the listener for best effect.  As a result, more of the highs are indeed reflected from the room into the side of the head and straight down the ear canal, much more than in a real venue or in a room with a point source tweeter.  So I think it's very reasonable to believe planar speakers require a falling high frequency response.  Possibly point source tweeters do also.

I need to do sweeps on my other systems to get a clearer idea too.

*** Update

I forgot to make a key point that's been on my mind.  There is, it turns out, only one way to determine the frequency components that make up an unwanted resonance, and that is by actually cancelling it out.  The proof is in the pudding, so to speak.  Measurements alone don't get deep enough into the mix, or so it has always seemed to me; besides, the measurements themselves can vary quite a bit depending on technological choices and random factors.

FFT measurements are limited by numerous factors, including the size of the windows, the windowing function, etc, etc.  Then, what we get in an RTA compounds this further with averaging.

I've never seen anything like an FFT magically pulling out the node frequencies that need to be cancelled.  FFT analysis systems seem particularly weak in the bass, especially if they are trying to do a full spectrum analysis, where the bass gets relatively few points to begin with.
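
For a rough sense of why, here is the bin spacing versus FFT length at a 48 kHz sample rate (just arithmetic, nothing specific to any particular analyzer):

fs = 48000
for n_fft in (4096, 16384, 65536):
    bin_hz = fs / n_fft
    print(f"{n_fft:6d}-point FFT: {bin_hz:5.2f} Hz per bin, "
          f"{int(100 / bin_hz):4d} bins below 100 Hz, "
          f"window of {1000 * n_fft / fs:7.1f} ms")

Even the 65536-point window, already well over a second long, puts only a hundred-odd bins below 100 Hz, while the same transform hands tens of thousands of bins to the treble.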

Meanwhile, I've seen very little discussion of how a program like REW computes the required equalizer settings, or even how well it does that.  People seem to think it just works, but in my experience, my time-honored methods, including slowly sweeping with an oscillator, seem better.  (Slow sweeping with an oscillator is very revealing in the bass, anyway; above the bass it often seems useless, unless there is some little thing physically rattling.  Above the bass, RTAs and FFTs work better than oscillator sweeping, it seems.)

It has long been known by experts that simple pulse-type transients (such as the on/off "Dirac" pulse) don't really give enough information to see much.  An FFT of a single pulse looks very revealing, but there may be a lot buried under its noise floor.  So for decades now different stimuli have been used, including "maximum length sequences" (pseudo-random binary sequences, which may have seemed maximally long decades ago), swept sines, and even canned truncated bits of pink noise.
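
For the curious, an MLS itself is easy to generate (scipy assumed); a 16-bit sequence already runs well over a second at 48 kHz:

import numpy as np
from scipy.signal import max_len_seq

# 16-bit MLS: 2**16 - 1 = 65535 samples, about 1.37 s at 48 kHz,
# a deterministic pseudo-random binary sequence with an essentially flat spectrum
seq, _ = max_len_seq(16)
stimulus = 2.0 * seq - 1.0      # map {0, 1} to {-1, +1}
print(len(stimulus), stimulus[:8])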

Real music may contain long bass tones, long enough to undergo many reflections and build up a nodal response, whereas Dirac pulses do not.  So using an oscillator you can show 40dB peaks and nulls where an FFT spectrum of the bass looks like a nice roll of flubber.



The higher frequency resonances seem to be something like nodal




