 

How do computers process audio data? [closed]

Tags:

audio

I have used several audio programs such as SDL mixer, Audacity, etc., but I want to see what's inside these little audio toys: how audio data gets processed and so on. I've also seen some sample code for an MP3 player in C++ that uses void* for the audio data.

But none of these help me understand, in general, how audio works in a computer. So could somebody explain to me (or recommend some books on) how computers store and process digital audio data? (For instance, if you store a triangle waveform in a .wav file, how does that waveform get stored as a bit pattern?)

asked Mar 26 '11 by Karl


3 Answers

How Waveforms are represented

There is a more detailed explanation of how audio is represented in the Audacity manual:

[Waveform diagram from the Audacity manual]

...the height of each vertical line is represented as a signed number.


More about Digital Audio

  • The Audacity wiki has some information about how algorithms in Audacity work. If there is a specific audio effect in Audacity that you want to know more about, that isn't already covered, you can leave a question there.
  • If you are looking at source code, the echo effect is a good place to start (a minimal sketch of the idea appears at the end of this answer).
  • For much, much more about digital audio, click the Wikipedia buttons on this page for the links that interest you. The ones at the foot of that page are particularly useful for digging deeper into the different audio file formats that are out there.

You may notice that all these links come from the Audacity project. That's not a coincidence.
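
If you'd like to see the shape of such an effect in code before diving into Audacity's sources, here is a minimal, hypothetical sketch of a feedback echo in C++. The function name and parameters are invented for illustration; this is not Audacity's actual implementation:

    #include <cstddef>
    #include <vector>

    // Minimal echo sketch: each output sample is the input sample plus a
    // decayed copy of the output from `delaySamples` earlier.
    std::vector<float> applyEcho(const std::vector<float>& input,
                                 std::size_t delaySamples,   // e.g. 0.25 s * 44100
                                 float decay)                // e.g. 0.5f
    {
        std::vector<float> output(input.size(), 0.0f);
        for (std::size_t n = 0; n < input.size(); ++n) {
            output[n] = input[n];
            if (n >= delaySamples) {
                output[n] += decay * output[n - delaySamples];  // feed the echo back
            }
        }
        return output;
    }

Because the delayed term is read from the output rather than the input, each echo carries its own echoes, which is what gives the repeating, decaying tail.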

answered by James Crook


Digital audio is stored as a sequence of numbers, called samples. Example:

5, 18, 6, -4, -12, -3, 7, 14, 4

If you plot these numbers as points on a Cartesian graph, the sample value determines the position along the Y axis, and the sample's sequence number (0, 1, 2, 3, etc.) determines the position along the X axis. The X axis is just a monotonically increasing number line.

Now trace a line through the points you've just plotted.

Congratulations, you have just rendered the waveform of your digital audio. :-)

The Y axis is amplitude and the X axis is time.

"Sample rate" determines how quickly the playback device (e.g. soundcard) advances through the samples. This is the "time value" of a sample. For example CD quality digital audio traverses 44,100 samples every second, reading the amplitude (Y axis value) at every sample point.

† The discussion above ignores compression. Compression changes little about the essential nature of digital audio, much like zipping up a bitmap image doesn't change the core nature of the bitmap. (The topic of audio compression is a rich one; I don't mean to oversimplify it. It's just that all compressed audio is eventually uncompressed before it is rendered, that is, played as audible sound or drawn as a waveform, at which point its compressed origins are of little consequence.)

answered by Mike Clark


Taking your WAV file example:

A WAV file will have a header, which tells a player or audio processor the number of channels, sample rate, bit depth, length of the data, and so on. After the header comes the raw bit pattern, which stores the audio samples (I'm assuming you know what sampling is; if not, see Wikipedia). Each sample is made up of a number of bytes (specified in the header) and gives the amplitude of the waveform at a given point in time. Samples may be stored in signed or unsigned form (also specified in the header).
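
To tie this back to the triangle-wave example in the question, here is a minimal C++ sketch that generates a triangle wave and writes it out as a mono 16-bit WAV file. It assumes a little-endian host, uses a packed struct for the canonical 44-byte PCM header, and all names and values are chosen purely for illustration:

    #include <cmath>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Sketch: write one second of a 440 Hz triangle wave to "triangle.wav"
    // as mono, 16-bit, 44.1 kHz PCM.
    #pragma pack(push, 1)
    struct WavHeader {
        char          riff[4]       = {'R', 'I', 'F', 'F'};
        std::uint32_t chunkSize     = 0;              // 36 + dataSize, filled in below
        char          wave[4]       = {'W', 'A', 'V', 'E'};
        char          fmt[4]        = {'f', 'm', 't', ' '};
        std::uint32_t fmtSize       = 16;             // size of the PCM format chunk
        std::uint16_t audioFormat   = 1;              // 1 = uncompressed PCM
        std::uint16_t numChannels   = 1;              // mono
        std::uint32_t sampleRate    = 44100;
        std::uint32_t byteRate      = 44100 * 1 * 2;  // sampleRate * channels * bytesPerSample
        std::uint16_t blockAlign    = 2;              // channels * bytesPerSample
        std::uint16_t bitsPerSample = 16;
        char          data[4]       = {'d', 'a', 't', 'a'};
        std::uint32_t dataSize      = 0;              // number of sample bytes, filled in below
    };
    #pragma pack(pop)

    int main()
    {
        const int    sampleRate = 44100;
        const double frequency  = 440.0;
        const int    numSamples = sampleRate;            // one second of audio

        // Build the triangle waveform: a linear ramp up then down, once per period.
        std::vector<std::int16_t> samples(numSamples);
        const double period = sampleRate / frequency;    // samples per cycle
        for (int n = 0; n < numSamples; ++n) {
            double phase = std::fmod(n / period, 1.0);       // 0.0 .. 1.0 within the cycle
            double value = phase < 0.5 ? 4.0 * phase - 1.0   // rising edge: -1 -> +1
                                       : 3.0 - 4.0 * phase;  // falling edge: +1 -> -1
            samples[n] = static_cast<std::int16_t>(value * 32000); // scale into 16-bit range
        }

        WavHeader header;
        header.dataSize  = numSamples * sizeof(std::int16_t);
        header.chunkSize = 36 + header.dataSize;

        std::FILE* f = std::fopen("triangle.wav", "wb");
        if (!f) return 1;
        std::fwrite(&header, sizeof(header), 1, f);                            // 44-byte header
        std::fwrite(samples.data(), sizeof(std::int16_t), samples.size(), f);  // raw samples
        std::fclose(f);
        return 0;
    }

Reading a WAV file is the reverse: parse the header fields first, then interpret the remaining bytes as samples according to the bit depth and channel count the header describes.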

answered by tw39124