In the world of digital audio recording, we hear regular conversations (or arguments) about how to most accurately capture a waveform. Two of the larger topics are bit depth and sample rate. At this point, I don't think anyone in their right mind would argue that recording at a 24-bit word length isn't vastly better than 16 bits. The low noise floor we get in a 24-bit environment is reason enough, and if you've been at it for a while you'll fondly remember the first time you mixed a song in 24-bit. That said, the biggest debates I've read and heard over the past 10 years have been about sample rate.
I’ve pondered this question for years, and for a time I was convinced that recording at a higher sample rate yields better results – the higher, the better. This belief was justified by the simple fact that a waveform, when sampled at 88.2K, more accurately represents the original waveform than when sampled at 44.1K. It made logical sense – twice the sampled data is twice as accurate and equals a twice-as-smooth waveform. All well and good, but with the higher sample rates come downsides: twice the file size, twice the CPU demand, twice the storage space required, and ultimately half the effectiveness of your DAW – and the engineer.
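That doubling of cost is easy to quantify. Here is a minimal sketch of the arithmetic for uncompressed PCM audio (`mb_per_minute` is just an illustrative helper name, not part of any DAW's API):

```python
def mb_per_minute(sample_rate, bit_depth=24, channels=1):
    """Approximate storage for one minute of uncompressed PCM audio, in MB.

    bytes = sample_rate * (bit_depth / 8) * channels * seconds
    """
    return sample_rate * (bit_depth // 8) * channels * 60 / 1_000_000

for rate in (44_100, 48_000, 96_000, 192_000):
    print(f"{rate} Hz: {mb_per_minute(rate):.1f} MB/min per mono 24-bit track")
# 48_000 Hz comes out to about 8.6 MB/min; 96_000 Hz doubles that to about 17.3
```

Multiply by a realistic track count and session length and the difference between 48K and 192K sessions becomes very real disk space, very quickly.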
So, is such an increase in information completely necessary for high fidelity? Is it really worth the added expense in equipment necessary to operate at the highest sample rates allowed by our DAWs? Or have we all just been fooled by strategic marketing campaigns designed to mislead us into thinking that we were recording utter crap at 44.1K, and now we must use 192K (or higher) gear to maintain professionalism? I've been on a decade-long quest to decide once and for all whether higher sample rates truly equal a more accurate reproduction of audio. After dozens of technical papers, dozens of books, working with top-notch music producers who can mix me into the ground, an A.S. in Recording Arts and 16 years of digital recording experience, I have finally (for now) made a decision on the matter – 48K is my standard. Anything else is a waste of my resources. Before you ream me out, hear me out – you probably have the same ideas I had before I settled down.
If you're going to argue, you'd better first understand the Nyquist Theorem. The Nyquist Theorem tells us that a given sample rate can capture frequencies up to half that rate, so if we record at the Red Book audio CD standard of 44.1K, we can capture any audio that occurs at 22.05K and below. This seems safe and acceptable, knowing that only a rare freak of nature can hear a 20K tone, let alone a 22K tone. Therefore, 44.1K is surely capable of capturing every possible sound perceivable by the average human ear. Great – but just because we can't hear something doesn't mean it's not there.
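The half-the-sample-rate limit can be illustrated with a few lines of folding arithmetic: any tone above the Nyquist frequency doesn't vanish, it "folds" back down into the audible band as an alias. This is a hypothetical illustration (`alias_frequency` is my own helper name); real converters put an anti-aliasing filter in front of the sampler precisely to stop this from happening.

```python
def alias_frequency(f, fs):
    """Fold a tone at f Hz, sampled at fs Hz, into the representable
    band [0, fs/2] – the frequency the sampled data actually encodes."""
    f = f % fs
    return fs - f if f > fs / 2 else f

# A 20K tone sampled at 44.1K sits below Nyquist (22.05K): captured as-is.
print(alias_frequency(20_000, 44_100))  # 20000
# A 25K tone sampled at 44.1K folds back to 19.1K – a very audible error.
print(alias_frequency(25_000, 44_100))  # 19100
```

This is why the anti-aliasing filter, not the sample rate alone, determines what ends up in the recording: anything above Nyquist must be removed before sampling, or it reappears somewhere you can hear it.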
“We do not resonate with how many samples were used to capture that magic vocal – we respond to the emotion and feeling it portrays.”
In the quest to capture waveforms perfectly and reproduce them perfectly, it stands to reason that there could be imperceptible audio energy at frequencies above 22K that affects the energy of the audio we do perceive, harboring a certain quality that is lost if not recorded at a very high sample rate (88.2K, 96K, 192K or even higher). I have seen evidence to suggest that instruments produce sounds at ultra high frequencies that affect what we do hear by way of harmonic resonance. Therefore, shouldn't we be trying to capture all those ultra high frequencies so as to accurately reproduce the effects of harmonic resonance from the recorded medium? I used to think so.
When I first considered this harmonic resonance concept, I thought it was sufficiently logical to support the argument that higher sample rates are better. Then I started really thinking. There is a fatal flaw in that theory: if harmonic resonance from ultra high frequencies affects what we actually can hear, aren’t we already capturing that quality when we record using at least a 40K sample rate to satisfy the Nyquist Theorem requirements of our human ears? The answer is a resounding, “Yes we are”. Moreover, our microphones don’t typically pick up anything above 20K anyway! It is generally impossible to capture and reproduce the ultra high frequencies that 192K can record. Even on the most expensive Pro Tools system, we’re still monitoring on speakers that don’t reproduce anything over 20K! What’s the deal?
The quest for perfection has made people susceptible to devious marketing tactics, designed solely to make you feel inadequate and compelled to spend unnecessary money on the latest and greatest doohickey. In my experience, there is nothing meaningful to be gained by recording at any rate above 48K. I like 48K because it's a more even number than 44.1K. It is also the DVD audio standard and allows you to capture tones up to 24K (still far above what we can actually hear with our ears). Even better, it allows me a higher track count in my sessions, many more DAW plug-ins, substantially less storage space, and fewer CPU cycles. In the end, I have a much more efficient machine with which to craft my vision. No one has ever said to me, "Hey man, I wish you would have recorded that song at 192K instead of 48K – it would sound so much better…", and I don't ever expect to hear that. In the end, we are not robots. We do not resonate with how many samples were used to capture that magic vocal – we respond to the emotion and feeling it portrays.
For further reading, Dan Lavry has penned the most amazing technical paper covering this subject that I have ever feasted my eyes upon. If you want some hard science, I strongly recommend that you devour it.