I've been playing around with Ruby recently, and I decided to start a simple project: a Ruby script that records line-in sound to a .wav file. I discovered that Ruby doesn't provide very good access to hardware devices (and it probably shouldn't), but that PortAudio does, and I found a great wrapper for PA here (it is not a gem, I think because it uses Ruby's ffi to attach to PortAudio, and the PA library could be in a variety of places). I've been muddling through PortAudio's documentation and examples to figure out how PA works. I haven't written or read C in years.
I'm running into difficulty with which parameters I should be passing when creating a stream and a buffer. For example, what exactly is a frame, and how is it related to other parameters like channels and sample rate? I'm totally new to audio programming in general as well, so if anyone could point me to some general tutorials about device-level audio, I'd appreciate it.
ruby-portaudio provides a single example that creates a stream and a buffer, writes a sine wave to the buffer, then sends the buffer to the stream to be played. I'm having trouble with some of the Ruby in the example, specifically the loop block.
PortAudio.init

block_size = 1024
sr   = 44100
step = 1.0 / sr
time = 0.0

stream = PortAudio::Stream.open(
  :sample_rate => sr,
  :frames => block_size,
  :output => {
    :device => PortAudio::Device.default_output,
    :channels => 1,
    :sample_format => :float32
  })

buffer = PortAudio::SampleBuffer.new(
  :format => :float32,
  :channels => 1,
  :frames => block_size)

playing = true
Signal.trap('INT') { playing = false }
puts "Ctrl-C to exit"

stream.start

loop do
  # Refill the buffer one sample at a time, then push it to the output stream.
  stream << buffer.fill { |frame, channel|
    time += step
    Math.cos(time * 2 * Math::PI * 440.0) * Math.cos(time * 2 * Math::PI)
  }
  break unless playing
end

stream.stop
If I'm going to be recording, I should be reading from a stream into a buffer, then manipulating that buffer and writing it to a file, right?
Also, if I'm barking up the wrong tree here and there is an easier way to do this (in Ruby), some direction would be nice.
Let's first clarify the terms you were asking about. For this purpose I will try to explain the audio pipeline in a simplified way. When you are generating a sound, as in your example, your sound card periodically requests frames (= buffers = blocks) from your code, which you fill with your samples. The sampling rate defines how many samples you provide per second and thus the speed at which your samples are played back. The frame size (= buffer size = block size) determines how many samples you provide in one request from the sound card. A buffer is typically quite small, because the buffer size directly affects the latency (large buffer => high latency) and large arrays can be slow (Ruby arrays especially).
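To make the relationship concrete, here is the arithmetic for the numbers used in your example (44100 Hz sample rate, 1024-frame buffer). Nothing here is library-specific; the per-buffer latency is just the ratio of the two values:

sample_rate = 44_100                          # samples per second, per channel
block_size  = 1024                            # frames delivered per request
latency     = block_size.to_f / sample_rate   # seconds of audio held in one buffer
puts "%.1f ms per buffer" % (latency * 1000)  # => 23.2 ms per buffer

Halving the block size halves that latency, but doubles how often your code has to come up with fresh samples.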
Similar things happen when you are recording sound from your sound card: your function gets called at regular intervals, and the samples from the microphone are typically passed in as an argument to the function (or even just a reference to such a buffer). You are then expected to process these samples, e.g. by writing them to disk.
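That is essentially what you described in your question: read from the stream into a buffer, then write the buffer's contents to a file. The exact read call depends on the wrapper (your example only shows the output side, so I won't guess at ruby-portaudio's input API), but the file-writing half needs nothing beyond Ruby's standard library. As a minimal sketch, here is how an array of float samples (standing in for whatever you collect from the input stream) could be packed into a 16-bit PCM .wav file:

SAMPLE_RATE = 44_100
CHANNELS    = 1

def write_wav(path, samples, sample_rate = SAMPLE_RATE, channels = CHANNELS)
  bits_per_sample = 16
  block_align     = channels * bits_per_sample / 8
  byte_rate       = sample_rate * block_align

  # Clamp floats to [-1.0, 1.0] and scale them to signed 16-bit integers.
  pcm  = samples.map { |s| ([[s, -1.0].max, 1.0].min * 32_767).round }
  data = pcm.pack('s<*')                         # little-endian 16-bit samples

  File.open(path, 'wb') do |f|
    f.write('RIFF')
    f.write([36 + data.bytesize].pack('V'))      # remaining chunk size
    f.write('WAVE')
    f.write('fmt ')
    f.write([16, 1, channels, sample_rate,       # subchunk size, format 1 = PCM
             byte_rate, block_align,
             bits_per_sample].pack('VvvVVvv'))
    f.write('data')
    f.write([data.bytesize].pack('V'))
    f.write(data)
  end
end

# Example: one second of the 440 Hz tone from your snippet, written to disk.
samples = (0...SAMPLE_RATE).map { |i| Math.cos(2 * Math::PI * 440.0 * i / SAMPLE_RATE) }
write_wav('sine.wav', samples)

If you open the resulting file in any audio player you should hear the tone, which is a handy way to check the file-writing plumbing before wiring up the actual input stream.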
I know that the thought of "doing everything in Ruby" is quite tempting, because it is such a beautiful language. If you are planning on doing audio processing in real time, though, I would recommend switching to a compiled language (C, C++, Obj-C, ...). These handle audio much better, because they are much closer to the hardware than Ruby and thus generally faster, which can be quite an issue in audio processing. This is probably also the reason there are so few Ruby audio libraries around, so maybe Ruby just isn't the right tool for the job.
By the way, I tried out ruby-portaudio, ffi-portaudio as well as ruby-audio, and none of them were working properly on my MacBook (I tried to generate a sine wave), which sadly shows again how Ruby is not capable of handling this stuff (yet?).