Sunday, June 22, 2014

Audiophile Information Model

A musical event, whether live or the sum total of pre-recorded takes, contains a very large amount of information, though perhaps much of it is redundant.  A produced album is a huge reduction of that original information, presumably selected or translated so as to separate the musical information from the redundant, and to give it a particular style.

Listening to a recording, an audiophile gathers information of many kinds: information originating from the recording, information constructed (interpolated or stylized) mentally from what is in the recording, and information caused by the interaction between the recording and the reproducing system.

No single listening can gather all the information originating from the recording.  Each listening gathers only a subset of it, and the subset gathered on successive listenings differs, though usually with a large central overlap.  But while the subset gathered from the recording may vary little, the impact on mental reconstruction may be discontinuously huge, because the construction process is itself highly discontinuous and non-linear.  The information gathered from interaction with the reproducing system may likewise vary only slightly, yet have a large, discontinuous impact upon mental reconstruction.

Now many audio reproducing systems are simplifying systems.  They simplify the information available from the recording.  A classic and ubiquitous way is by limiting frequency response.  Nearly all reproducing systems restrict frequency response in some way, most commonly in the very deep bass.  So while we may hear to 20 Hz and feel to 16 Hz, few reproduction systems do a good job of reproducing the octave below 32 Hz.  Meanwhile, systems with 32 Hz response may have high end response to 22 kHz, which is actually a violation of a longstanding rule of frequency response bandwidth limiting: to sound balanced, low frequency restriction should be matched by high frequency restriction, and response to 20 kHz corresponds with low frequency response to 20 Hz.
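As a rough illustration of that balance rule, here is a minimal sketch.  It assumes the rule is read as a constant product of the low and high cutoffs, with the constant (400,000) taken from the 20 Hz / 20 kHz pairing above; the function name and the constant-product form are assumptions for illustration, not a definitive statement of the rule.

```python
# Assumption: "balanced" bandwidth means low_cutoff * high_cutoff stays constant.
# The constant is derived from the 20 Hz / 20 kHz pairing mentioned in the text.
BALANCE_PRODUCT = 20.0 * 20_000.0  # 400,000 (Hz squared)


def matching_high_cutoff_hz(low_cutoff_hz: float) -> float:
    """Return the high cutoff that balances a given low cutoff (hypothetical helper)."""
    if low_cutoff_hz <= 0:
        raise ValueError("low cutoff must be positive")
    return BALANCE_PRODUCT / low_cutoff_hz


for low in (20.0, 32.0, 40.0):
    high = matching_high_cutoff_hz(low)
    print(f"{low:g} Hz low end -> {high:,.0f} Hz high end")
```

Under this reading, a system that reaches only 32 Hz would balance with a top end near 12.5 kHz, well short of 22 kHz.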

But simplification in other forms is actually more serious.  Most of these forms simplify by obscuring.  Resonances, for example, draw attention to themselves and to their intermodulations with the music, but obscure adjacent details.

Systems involving dynamic compression are another type.  The worst of those can simplify by reduction.  MP3 is an example of such a system: the missing information is simply gone.

I maintain that DSD is such a dynamic compression system, as are all PWM and delta-sigma systems.  They often rely on reduced human sensitivity to high frequency dynamic range.  Canonical DSD has 1 bit of resolution at 64fs (64 times the 44.1 kHz rate), which means little more than 65^2 possible states at 20 kHz.  Redbook 16 bit 44.1 kHz digital has 65536^2 possible states at that frequency.
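The two figures can be reproduced with a short sketch.  It rests on assumptions that are not spelled out above: that one 20 kHz cycle spans about two 44.1 kHz sample periods, and that DSD's 64 one-bit samples per period are counted by how many of them are ones (0 through 64, so 65 levels).  This counts averaged levels rather than distinct bit patterns, and the variable names are illustrative only.

```python
import math

CD_RATE_HZ = 44_100
DSD_OVERSAMPLING = 64   # canonical DSD runs at 64 x 44.1 kHz
TONE_HZ = 20_000

# Assumption: whole 44.1 kHz sample periods per 20 kHz cycle (44100 // 20000 = 2).
periods_per_cycle = CD_RATE_HZ // TONE_HZ

# Levels available in one 44.1 kHz sample period.
pcm_levels = 2 ** 16                 # 16-bit Redbook PCM
dsd_levels = DSD_OVERSAMPLING + 1    # assumption: count of ones in 64 bits, 0..64

pcm_states = pcm_levels ** periods_per_cycle
dsd_states = dsd_levels ** periods_per_cycle

print(f"Redbook states per cycle: {pcm_states:,} ({math.log2(pcm_states):.1f} bits)")
print(f"DSD states per cycle:     {dsd_states:,} ({math.log2(dsd_states):.1f} bits)")
```

Counted this way, Redbook comes to 4,294,967,296 states per cycle and DSD to 4,225.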

You can now see how vast an amount of information is potentially available in just one cycle of the highest frequency we can hear.  We can never hear it all, only a subset.  But if the recording itself carries vastly reduced information, then the subsets heard on successive listenings overlap more, ultimately making it boring.

