
Fast display of waveform in C/C++

I'm interested in implementing an audio editor in C or C++ on Windows and Linux. I can't figure out how to display the waveform quickly enough in its fully zoomed out view. I'm not looking for information about fast frame buffer techniques. This is a question about algorithms and data-structures to efficiently determine what to display.

Say I want to be able to edit a 5 channel, 48 KHz, 24-bit sound that is 2 hours long. That's 5 gigabytes of sample data. I want to be able to zoom out from one-pixel-per-sample all the way until all the sample data is visible at once. I want the application to feel responsive, even on a slow machine, say, for argument's sake, a 1 GHz Atom. By responsive, I mean that GUI updates should generally occur within 1/30th of a second of the user input.

A naive implementation would scan every sample in the whole waveform when deciding what to render for the fully zoomed-out view - it needs to find the max and min sample values for all samples "covered" by each pixel width of the display. I wrote a simple app to test the speed of this approach. I tested with a 1 hour long, mono, 16-bit, 44.1 KHz sample on my 2015 3.5 GHz Xeon. It takes 0.12 seconds. This is hundreds of times too slow.
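For concreteness, the naive scan can be sketched like this (a minimal, untested sketch; `naive_min_max` is an illustrative name, not from any library):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Naive zoomed-out view: for each pixel column, scan every sample it
// covers and record the (min, max) pair to draw as a vertical line.
std::vector<std::pair<int16_t, int16_t>>
naive_min_max(const std::vector<int16_t>& samples, int width_px)
{
    std::vector<std::pair<int16_t, int16_t>> columns(width_px);
    const std::size_t n = samples.size();
    for (int x = 0; x < width_px; ++x) {
        std::size_t begin = std::size_t(x) * n / width_px;
        std::size_t end   = std::size_t(x + 1) * n / width_px;
        int16_t lo = samples[begin];
        int16_t hi = samples[begin];
        for (std::size_t i = begin + 1; i < end; ++i) {
            lo = std::min(lo, samples[i]);
            hi = std::max(hi, samples[i]);
        }
        columns[x] = {lo, hi};
    }
    return columns;
}
```

Every sample is touched on every redraw, which is why the cost scales linearly with the length of the sound.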

You can imagine maintaining a cache of zoomed out data, but I can't see how to avoid having to recalculate the entire cache after most inserts or deletes. It feels like there must be a better way.

Here's a diagram showing what I want to achieve:

(diagram not shown)

This is how the display in most currently available audio editors works. Users are likely to expect this behaviour. I tested with Audacity, and it works this way (although it also shows something like the mean of the samples in a lighter colour too). It can handle arbitrary inserts into large sounds, seemingly instantly. I'm not going to read the 75 megabytes of source code to find out how it does it.

EDIT:

Various people have suggested schemes that involve only considering a subset of the samples when showing the zoomed out view. I've come to the conclusion that I don't want to do that because it loses too much useful information. For example, including all the samples is important if you are looking for a glitch in the sound, like a click in a vinyl conversion. In the worst case, if the glitch is only one sample long, I still want a guarantee that it is shown in the fully zoomed out view.

Asked by Andrew Bainbridge, May 31 '16


2 Answers

After reading Peter Stock's answer, I've come up with the following scheme. I think it will make the display calculation about 500 times faster than the naive scheme, and it shouldn't add any noticeable cost to inserts or deletes. The memory overhead is less than 1%.

The sound data will be allocated in blocks of 131072 samples, so that inserts and deletes don't require the entire sound to be reallocated and copied. When the sound is first loaded, each block will be completely filled (except probably the last one). Inserts and deletes will lead to a kind of fragmentation. For simplicity, I will arrange for the start of each block to always contain valid sample data, and any gaps will be at the end of the block.

Each block has two look-up tables associated with it, one for max values and one for min. Each item in the look-up tables corresponds to 1024 samples.

The diagram below shows how to calculate the max value for one pixel width of the display. It shows a few blocks relevant to the calculation. It assumes there is no "fragmentation".

Display calculation for one pixel width (no fragmentation)

After an insert, the situation is slightly more complicated. Two blocks now have invalid regions at their ends. Some entries in the max look-up table now correspond to a part-empty region of samples. The values for these entries are found by simply taking the max of the samples that are present.

Display calculation for one pixel width (with fragmentation)
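The scheme above can be sketched as follows (an untested sketch with illustrative names — `Block`, `rebuild_luts`, `range_max` — not taken from Audacity or any library). Each block keeps min and max look-up tables with one entry per 1024 samples; a per-pixel query uses the tables for fully covered groups and falls back to raw samples only at the edges:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlockSize = 131072; // samples per block
constexpr std::size_t kGroupSize = 1024;   // samples per look-up-table entry

struct Block {
    std::vector<int16_t> samples;  // valid data packed at the front
    std::size_t used = 0;          // valid sample count (any gap is at the end)
    std::vector<int16_t> max_lut;  // one entry per kGroupSize samples
    std::vector<int16_t> min_lut;

    // Rebuild both look-up tables. A part-empty final group just takes
    // the min/max of whatever samples are present.
    void rebuild_luts() {
        std::size_t groups = (used + kGroupSize - 1) / kGroupSize;
        max_lut.assign(groups, INT16_MIN);
        min_lut.assign(groups, INT16_MAX);
        for (std::size_t i = 0; i < used; ++i) {
            std::size_t g = i / kGroupSize;
            max_lut[g] = std::max(max_lut[g], samples[i]);
            min_lut[g] = std::min(min_lut[g], samples[i]);
        }
    }
};

// Max over samples [begin, end) of one block: consult the LUT for fully
// covered groups, scan raw samples only at the two partial edges.
// (The edge scans may overlap when the range lies inside a single group,
// which is harmless for a max.)
int16_t range_max(const Block& b, std::size_t begin, std::size_t end) {
    int16_t best = INT16_MIN;
    std::size_t first_full = (begin + kGroupSize - 1) / kGroupSize;
    std::size_t last_full  = end / kGroupSize;
    for (std::size_t i = begin; i < std::min(end, first_full * kGroupSize); ++i)
        best = std::max(best, b.samples[i]);
    for (std::size_t g = first_full; g < last_full; ++g)
        best = std::max(best, b.max_lut[g]);
    for (std::size_t i = std::max(begin, last_full * kGroupSize); i < end; ++i)
        best = std::max(best, b.samples[i]);
    return best;
}
```

With this layout, one pixel width costs a handful of LUT reads plus at most two partial-group raw scans, and an insert or delete only requires rebuilding the tables of the blocks it touches.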

Answered by Andrew Bainbridge, Nov 10 '22


When the zoom is at the point where you have multiple samples per pixel, it isn't worth accurately calculating the mean sample value for each pixel. The user can't align the GUI tooling accurately at that level of zoom, so there's no benefit. The user just needs a qualitative view.

I would just select one sample per screen pixel for the window area, skipping over the unnecessary samples.

Something like this completely untested code:

#include <cstddef>
#include <vector>

std::vector<double> samples(1024 * 1024); // normalised samples: -1.0 < s < 1.0

int window_x = 1024; // window width in pixels
int window_y = 768;  // window height in pixels

// visit every window pixel column
for (int x = 0; x < window_x; ++x)
{
    // select the one sample that maps to screen column x
    double s = samples[(std::size_t(x) * samples.size()) / window_x];

    // scale the sample value to half the window height
    int y = static_cast<int>((window_y / 2) * s);

    // draw a vertical line centred on the horizontal axis
    // (gd stands in for whatever drawing context you are using)
    gd.draw_line(x, (window_y / 2) - y, x, (window_y / 2) + y);
}

Obviously you also need to account for window scrolling etc...

Answered by Galik, Nov 10 '22