PCM audio amplitude values?

Tags: android, iphone, audio, audio-recording, pcm

I am starting out with audio recording using my Android smartphone.

I successfully saved voice recordings to a PCM file. When I parse the data and print out the signed, 16-bit values, I can create a graph like the one below. However, I do not understand the amplitude values along the y-axis.

1. What exactly are the units for the amplitude values? The values are signed 16-bit, so they must range from -32K to +32K. But what do these values represent? Decibels?
2. If I use 8-bit values, then the values must range from -128 to +128. How would that get mapped to the volume/"loudness" of the 16-bit values? Would you just use a 16-to-1 quantisation mapping?
3. Why are there negative values? I would think that complete silence would result in values of 0.

If someone can point me to a website with information on what's being recorded, I would appreciate it. I found webpages on the PCM file format, but not what the data values are.

[Image: graph of the recorded amplitude values, https://i.stack.imgur.com/W6SL8.png]

asked May 04 '11 22:05

stackoverflowuser2010

2 Answers

Think of the surface of the microphone. When it's silent, the surface is motionless at position zero. When you talk, the air around your mouth vibrates. Vibrations are spring-like and move in both directions: back and forth, up and down, or in and out. The vibrations in the air make the microphone surface vibrate as well, so it moves up and down. When it moves down, that might be sampled as a positive value. When it moves up, that might be sampled as a negative value. (Or it could be the opposite.) When you stop talking, the surface settles back to the zero position.
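To illustrate why samples swing both ways around zero, here is a minimal sketch that synthesizes a pure tone as signed 16-bit values. The function name `sine_samples` and its default parameters (440 Hz tone, 8000 Hz sample rate, half of full scale) are assumptions chosen for illustration; nothing here comes from a real recording.

```python
import math

def sine_samples(freq_hz=440.0, sample_rate=8000, n=8000, amplitude=0.5):
    """Return n signed 16-bit sample values of a sine tone.

    amplitude is a fraction of full scale (0.0 = silence, 1.0 = maximum).
    All names and defaults are hypothetical, for illustration only.
    """
    peak = amplitude * 32767
    return [
        int(round(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n)
    ]

tone = sine_samples()
silence = sine_samples(amplitude=0.0)

# The tone visits both positive and negative values; silence stays at 0.
print(min(tone), max(tone))
print(set(silence))
```

The tone spends as much time below zero as above it, so its average is close to zero, which is the resting position described above.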

The numbers you get from your PCM recording data depend on the gain of the system. With common 16-bit samples, the range is from -32768 to 32767, which covers the largest possible excursion of a vibration that can be recorded without distortion, clipping or overflow. Usually the gain is set a bit lower so that the maximum values aren't right on the edge of distortion.
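As a rough sketch of how the raw bytes become those numbers, the snippet below decodes headerless PCM data and reports how close the loudest sample comes to the 16-bit limit. It assumes little-endian, signed 16-bit, mono data; your recording may differ. The helper names `decode_pcm16le` and `peak_fraction` are hypothetical.

```python
import struct

def decode_pcm16le(raw):
    """Decode raw bytes into a list of signed 16-bit sample values.

    Assumes little-endian, headerless PCM; a trailing odd byte is ignored.
    """
    count = len(raw) // 2
    return list(struct.unpack("<%dh" % count, raw[:count * 2]))

def peak_fraction(samples):
    """Return the largest excursion as a fraction of full scale (0.0 to 1.0)."""
    if not samples:
        return 0.0
    return max(abs(s) for s in samples) / 32768.0

# Three hand-written samples: zero, the maximum, and the minimum.
samples = decode_pcm16le(b"\x00\x00\xff\x7f\x00\x80")
print(samples)
print(peak_fraction(samples))
```

A peak fraction at or very near 1.0 suggests the recording touched the limit and may have clipped; a lower gain leaves some headroom.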

ADDED:

8-bit PCM audio is often an unsigned data type with a range of 0 to 255, where a value of 128 indicates "silence". So to convert between 8-bit and 16-bit PCM waveforms you have to add or subtract this bias, as well as scale by about 256.
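The bias-and-scale conversion described above could be sketched like this. It assumes the 8-bit data is unsigned with 128 as silence and uses a scale factor of exactly 256; the function names are hypothetical, and a real converter might also add dither or rounding.

```python
def u8_to_s16(sample_u8):
    """Convert one unsigned 8-bit sample (0..255) to signed 16-bit."""
    if not 0 <= sample_u8 <= 255:
        raise ValueError("8-bit sample out of range")
    # Remove the 128 bias, then scale up by 256.
    return (sample_u8 - 128) * 256

def s16_to_u8(sample_s16):
    """Convert one signed 16-bit sample to unsigned 8-bit (0..255)."""
    if not -32768 <= sample_s16 <= 32767:
        raise ValueError("16-bit sample out of range")
    # Scale down by 256 (arithmetic shift keeps the sign), then add the bias.
    return (sample_s16 >> 8) + 128

print(u8_to_s16(128))   # silence in 8-bit maps to 0
print(s16_to_u8(0))     # silence in 16-bit maps to 128
```

Going from 16-bit to 8-bit discards the low 8 bits of each sample, so the round trip 8-bit to 16-bit and back is exact, but the reverse is not.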

answered Sep 30 '22 20:09

hotpaw2

The raw numbers are an artefact of the quantization process used to convert an analog audio signal into digital. It makes more sense to think of an audio signal as a vibration around 0, extending as far as +1 and -1 for maximum excursion of the signal. Outside that, you get clipping, which distorts the harmonics and sounds terrible.

However, computers work more naturally with integers than with fractions, so that range is mapped onto 65,536 discrete integer values; for signed 16-bit samples these run from -32768 to +32767. In most applications like this, +32767 is considered the maximum positive excursion of the microphone's or speaker's diaphragm. A sample value by itself does not correspond to any particular sound pressure level; to relate the two you have to factor in the characteristics of the recording (or playback) circuits.
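A minimal sketch of that mapping: dividing a signed 16-bit sample by 32768 gives a value in the -1 to +1 range, and a level can then be expressed in decibels relative to full scale (dBFS). This is a relative measure only, not a sound pressure level. The function names and the choice of 32768 as the divisor are assumptions for illustration.

```python
import math

FULL_SCALE = 32768.0  # magnitude of the most negative 16-bit sample

def normalize(sample_s16):
    """Map a signed 16-bit sample onto the range -1.0 to just under +1.0."""
    return sample_s16 / FULL_SCALE

def peak_dbfs(samples):
    """Return the peak level in dB relative to full scale (0 dBFS = maximum)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(peak / FULL_SCALE)

print(normalize(-32768))
print(peak_dbfs([16384, -100]))  # half of full scale, roughly -6 dBFS
```

Two recordings of the same sound made with different gain settings would give different dBFS figures, which is why the sample values alone say nothing about absolute loudness.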

(BTW, 16-bit audio is very standard and widely used. It is a good balance of signal-to-noise ratio and dynamic range. 8-bit is noisy unless you do some non-linear scaling, i.e. companding.)

answered Sep 30 '22 18:09

staticsan
