How to make waveform rendering more interesting?

Tags: algorithm, graphics, rendering, audio, waveform

I wrote a waveform renderer that takes an audio file and creates something like this:

[image: waveform produced by the renderer]

The logic is pretty simple. I calculate the number of audio samples required for each pixel, read those samples, average them and draw a column of pixels according to the resulting value.

Typically, I render a whole song into about 600-800 pixels, so the wave is heavily compressed. Unfortunately, this usually looks unappealing because almost the entire song is rendered at nearly the same height. There is no variation.

Interestingly, if you look at the waveforms on SoundCloud almost none of them are as boring as my results. They all have some variation. What could be the trick here? I don't think they just add random noise.

asked Oct 19 '14 15:10

Jawap

1 Answer

I don't think SoundCloud is doing anything particularly special. Plenty of the songs on their front page look very flat. It has more to do with how detail is perceived and with the overall dynamics of the song. The main difference is that SoundCloud draws the absolute value of the signal. (The negative side of the image is just a mirror.)
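To illustrate why the absolute value matters, here is a minimal sketch that compares a plain signed average with an average of absolute values over one window. The class and method names are made up for this illustration, and the sine wave is only an assumed stand-in for real audio. If samples are averaged with their sign, the positive and negative halves largely cancel, so quiet and loud passages can end up looking alike.

```java
// Illustrative sketch only; the names here are hypothetical.
final class WindowAverages {

    // Plain signed mean of a window: positive and negative halves cancel.
    static double signedMean(float[] window) {
        double sum = 0;
        for (float s : window) {
            sum += s;
        }
        return sum / window.length;
    }

    // Mean of absolute values: follows how loud the window is.
    static double meanAbs(float[] window) {
        double sum = 0;
        for (float s : window) {
            sum += Math.abs(s);
        }
        return sum / window.length;
    }

    // Generates a whole number of sine cycles with the given peak amplitude.
    static float[] sine(int length, int cycles, float amplitude) {
        float[] out = new float[length];
        for (int i = 0; i < length; i++) {
            out[i] = (float) (amplitude * Math.sin(2 * Math.PI * cycles * i / length));
        }
        return out;
    }

    public static void main(String[] args) {
        float[] quiet = sine(1000, 10, 0.2f);
        float[] loud = sine(1000, 10, 0.8f);
        System.out.println("signed: " + signedMean(quiet) + " vs " + signedMean(loud));
        System.out.println("abs:    " + meanAbs(quiet) + " vs " + meanAbs(loud));
    }
}
```

With these inputs both signed means come out close to zero, while the absolute means differ by about the same factor of four as the amplitudes.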

For demonstration, here is a basic white noise plot with straight lines:

regular plot

Now, typically a fill is used to make the overall outline easier to see. This already does a lot for the appearance:

fill

Larger waveforms ("zoomed out" in particular) typically use a mirror effect because the dynamics become more pronounced:

wrap

Bars are another way to visualize and can give an illusion of detail:

step

A pseudo routine for a typical waveform graphic (average of abs and mirror) might look like this:

for (each pixel in width of image) {
    var sum = 0

    for (each sample in subset contained within pixel) {
        sum = sum + abs(sample)
    }

    var avg = sum / length of subset

    draw line(avg to -avg)
}

This effectively compresses the time axis by replacing each window with its average magnitude, which is close to taking the RMS of the window. (RMS could be used instead; the two give almost the same picture.) The waveform now shows the overall dynamics of the song.
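As a rough runnable version of that routine, the sketch below computes one value per pixel column using either the mean of absolute values or the RMS. The class name, the method name and the `useRms` flag are assumptions made for this example. For a pure sine wave the RMS is about 1.11 times the mean absolute value, a constant factor that disappears once the result is normalized to the image height.

```java
// Illustrative sketch only; names and the useRms switch are hypothetical.
final class EnvelopeColumns {

    // Returns one non-negative value per pixel column.
    // Column x covers samples [x * n / width, (x + 1) * n / width).
    static float[] columns(float[] samples, int width, boolean useRms) {
        float[] out = new float[width];
        for (int x = 0; x < width; x++) {
            int start = (int) ((long) x * samples.length / width);
            int end = (int) ((long) (x + 1) * samples.length / width);
            int n = end - start;
            if (n == 0) {
                continue; // fewer samples than columns: leave this column at 0
            }
            double sum = 0;
            for (int i = start; i < end; i++) {
                double v = samples[i];
                sum += useRms ? v * v : Math.abs(v);
            }
            double mean = sum / n;
            out[x] = (float) (useRms ? Math.sqrt(mean) : mean);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two windows: a louder first half and a quieter second half.
        float[] samples = {0.5f, -0.5f, 0.5f, -0.5f, 0.25f, -0.25f, 0.25f, -0.25f};
        float[] abs = columns(samples, 2, false);
        float[] rms = columns(samples, 2, true);
        System.out.println("mean abs: " + abs[0] + ", " + abs[1]);
        System.out.println("rms:      " + rms[0] + ", " + rms[1]);
    }
}
```

Each returned value can then be drawn as a vertical line from `avg` to `-avg` around the center of the image, as in the pseudo routine.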

That is not too different from what you are already doing: just add abs, mirror and fill. For boxes like the ones SoundCloud uses, you would draw rectangles instead of lines.

Just as a bonus, here is an MCVE written in Java to generate a waveform with boxes as described. (Sorry if Java is not your language.) The actual drawing code is near the top. This program also normalizes, i.e., the waveform is "stretched" to the height of the image.

This simple output is the same as the above pseudo routine:

normal output

This output with boxes is very similar to SoundCloud:

box waveform

import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
import java.awt.image.*;
import java.io.*;
import javax.sound.sampled.*;

public class BoxWaveform {
    static int boxWidth = 4;
    static Dimension size = new Dimension(boxWidth == 1 ? 512 : 513, 97);

    static BufferedImage img;
    static JPanel view;

    // draw the image
    static void drawImage(float[] samples) {
        Graphics2D g2d = img.createGraphics();

        int numSubsets = size.width / boxWidth;
        int subsetLength = samples.length / numSubsets;

        float[] subsets = new float[numSubsets];

        // find average(abs) of each box subset
        int s = 0;
        for(int i = 0; i < subsets.length; i++) {

            double sum = 0;
            for(int k = 0; k < subsetLength; k++) {
                sum += Math.abs(samples[s++]);
            }

            subsets[i] = (float)(sum / subsetLength);
        }

        // find the peak so the waveform can be normalized
        // to the height of the image
        float normal = 0;
        for(float sample : subsets) {
            if(sample > normal)
                normal = sample;
        }

        // normalize and scale
        normal = 32768.0f / normal;
        for(int i = 0; i < subsets.length; i++) {
            subsets[i] *= normal;
            subsets[i] = (subsets[i] / 32768.0f) * (size.height / 2);
        }

        g2d.setColor(Color.GRAY);

        // convert to image coords and do actual drawing
        for(int i = 0; i < subsets.length; i++) {
            int sample = (int)subsets[i];

            int posY = (size.height / 2) - sample;
            int negY = (size.height / 2) + sample;

            int x = i * boxWidth;

            if(boxWidth == 1) {
                g2d.drawLine(x, posY, x, negY);
            } else {
                g2d.setColor(Color.GRAY);
                g2d.fillRect(x + 1, posY + 1, boxWidth - 1, negY - posY - 1);
                g2d.setColor(Color.DARK_GRAY);
                g2d.drawRect(x, posY, boxWidth, negY - posY);
            }
        }

        g2d.dispose();
        view.repaint();
        view.requestFocus();
    }

    // handle most WAV and AIFF files
    static void loadImage() {
        JFileChooser chooser = new JFileChooser();
        int val = chooser.showOpenDialog(null);
        if(val != JFileChooser.APPROVE_OPTION) {
            return;
        }

        File file = chooser.getSelectedFile();
        float[] samples;

        try {
            AudioInputStream in = AudioSystem.getAudioInputStream(file);
            AudioFormat fmt = in.getFormat();

            // only signed PCM is decoded by the loop below
            if(!AudioFormat.Encoding.PCM_SIGNED.equals(fmt.getEncoding())) {
                throw new UnsupportedAudioFileException("only signed PCM is supported");
            }

            boolean big = fmt.isBigEndian();
            int chans = fmt.getChannels();
            int bits = fmt.getSampleSizeInBits();
            int bytes = bits + 7 >> 3;

            int frameLength = (int)in.getFrameLength();
            int bufferLength = chans * bytes * 1024;

            samples = new float[frameLength];
            byte[] buf = new byte[bufferLength];

            int i = 0;
            int bRead;
            while((bRead = in.read(buf)) > -1) {

                for(int b = 0; b < bRead;) {
                    double sum = 0;

                    // (sums to mono if multiple channels)
                    for(int c = 0; c < chans; c++) {
                        if(bytes == 1) {
                            sum += buf[b++] << 8;

                        } else {
                            int sample = 0;

                            // (quantizes to 16-bit)
                            if(big) {
                                sample |= (buf[b++] & 0xFF) << 8;
                                sample |= (buf[b++] & 0xFF);
                                b += bytes - 2;
                            } else {
                                b += bytes - 2;
                                sample |= (buf[b++] & 0xFF);
                                sample |= (buf[b++] & 0xFF) << 8;
                            }

                            final int sign = 1 << 15;
                            final int mask = -1 << 16;
                            if((sample & sign) == sign) {
                                sample |= mask;
                            }

                            sum += sample;
                        }
                    }

                    samples[i++] = (float)(sum / chans);
                }
            }

        } catch(Exception e) {
            problem(e);
            return;
        }

        // always start from a blank image so that a previously
        // loaded waveform is not left underneath the new one
        img = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_ARGB);

        drawImage(samples);
    }

    static void problem(Object msg) {
        JOptionPane.showMessageDialog(null, String.valueOf(msg));
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(new Runnable() {
            @Override
            public void run() {
                JFrame frame = new JFrame("Box Waveform");
                JPanel content = new JPanel(new BorderLayout());
                frame.setContentPane(content);

                JButton load = new JButton("Load");
                load.addActionListener(new ActionListener() {
                    @Override
                    public void actionPerformed(ActionEvent ae) {
                        loadImage();
                    }
                });

                view = new JPanel() {
                    @Override
                    protected void paintComponent(Graphics g) {
                        super.paintComponent(g);

                        if(img != null) {
                            g.drawImage(img, 1, 1, img.getWidth(), img.getHeight(), null);
                        }
                    }
                };

                view.setBackground(Color.WHITE);
                view.setPreferredSize(new Dimension(size.width + 2, size.height + 2));

                content.add(view, BorderLayout.CENTER);
                content.add(load, BorderLayout.SOUTH);

                frame.pack();
                frame.setResizable(false);
                frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                frame.setLocationRelativeTo(null);
                frame.setVisible(true);
            }
        });
    }
}

Note: for the sake of simplicity, this program loads the entire audio file into memory. With long files, some JVMs may throw OutOfMemoryError. To work around this, run with an increased heap size (for example, using the JVM's -Xmx option).
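Another way to keep memory use down, which is not part of the program above, is to fold each sample into its box as it is read instead of storing all samples first. The following is a minimal sketch of that idea. The class `StreamingBuckets` and its methods are hypothetical, and it assumes the total number of mono samples is known before reading starts (for instance from the stream's frame length, when the file reports one).

```java
// Illustrative sketch only; this class is hypothetical.
final class StreamingBuckets {
    private final double[] sums;   // running sum of |sample| per box
    private final long[] counts;   // number of samples seen per box
    private final long totalSamples;
    private long position = 0;

    StreamingBuckets(int numBuckets, long totalSamples) {
        if (numBuckets <= 0 || totalSamples <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.sums = new double[numBuckets];
        this.counts = new long[numBuckets];
        this.totalSamples = totalSamples;
    }

    // Adds one mono sample to the box that covers its position in the file.
    void add(float sample) {
        if (position >= totalSamples) {
            return; // ignore samples beyond the announced length
        }
        int bucket = (int) (position * sums.length / totalSamples);
        sums[bucket] += Math.abs(sample);
        counts[bucket]++;
        position++;
    }

    // Returns the average of |sample| for every box (0 for empty boxes).
    float[] averages() {
        float[] out = new float[sums.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = counts[i] == 0 ? 0f : (float) (sums[i] / counts[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        StreamingBuckets buckets = new StreamingBuckets(2, 4);
        for (float s : new float[] {1f, -1f, 0.5f, -0.5f}) {
            buckets.add(s);
        }
        float[] avg = buckets.averages();
        System.out.println(avg[0] + ", " + avg[1]);
    }
}
```

With this approach only one small array per image is kept, so memory use depends on the image width rather than on the length of the song. The averages can then be normalized and drawn in the same way as the `subsets` array in `drawImage`.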

answered Oct 22 '22 21:10

Radiodef
