
How to generate noisy mock time series or signal (in Python)

Quite often I have to work with a bunch of noisy, somewhat correlated time series. Sometimes I need some mock data to test my code, or to provide some sample data for a question on Stack Overflow. I usually end up either loading some similar dataset from a different project, or just adding a few sine functions and noise and spending some time to tweak it.

What's your approach? How do you generate noisy signals with certain specs? Have I just overlooked some blatantly obvious standard package that does exactly this?

The features I would generally like to get in my mock data:

  • Varying noise levels over time
  • Some history in the signal (like a random walk?)
  • Periodicity in the signal
  • Being able to produce another time series with similar (but not exactly the same) features
  • Maybe a bunch of weird dips/peaks/plateaus
  • Being able to reproduce it (some seed and a few parameters?)

I would like to get a time series similar to the two below [A]:

[Image: Real time series 1] [Image: Real time series 2]

I usually end up creating a time series with a bit of code like this:

import numpy as np

n = 1000
limit_low = 0
limit_high = 0.48
# Sum of white noise, sine-modulated noise, and two squared sines
my_data = np.random.normal(0, 0.5, n) \
          + np.abs(np.random.normal(0, 2, n) \
                   * np.sin(np.linspace(0, 3*np.pi, n)) ) \
          + np.sin(np.linspace(0, 5*np.pi, n))**2 \
          + np.sin(np.linspace(1, 6*np.pi, n))**2

# Rescale the result to the range [limit_low, limit_high]
scaling = (limit_high - limit_low) / (max(my_data) - min(my_data))
my_data = my_data * scaling
my_data = my_data + (limit_low - min(my_data))

Which results in a time series like this:

[Image: Mock time series]

Which is something I can work with, but still not quite what I want. The problem here is mainly that:

  1. it doesn't have the history/random walk aspect
  2. it's quite a bit of code and tweaking (this is especially a problem if I want to share a sample time series)
  3. I need to retweak the values (freq. of sines etc.) to produce another similar but not exactly the same time series.

[A]: For those wondering, the time series depicted in the first two images show the traffic intensity at two points along one road over three days (midnight to 6 am is clipped), in cars per second (moving Hanning-window average over 2 min), resampled to 1000 points.

asked Mar 29 '16 13:03 by Swier



1 Answer

Have you looked into TSimulus? By using Generators, you should be able to generate data with specific patterns, periodicity, and cycles.

The TSimulus project provides tools for specifying the shape of a time series (general patterns, cycles, importance of the added noise, etc.) and for converting this specification into time series values.


Otherwise, you can try "drawing" the data yourself and exporting those data points using Time Series Maker.
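If neither tool fits, a NumPy-only generator may be enough. The following is a minimal sketch rather than a standard recipe: the function name `mock_series` and all of its parameters are made up for illustration, and the particular mix of components (random walk, squared sines, a sine-shaped noise envelope, rectangular dips) is just one assumption about what "similar" data could look like. It tries to cover the features listed in the question: history via a cumulative sum, periodicity, a noise level that varies over time, a few dips, and reproducibility through a seed.

```python
import numpy as np


def mock_series(n=1000, seed=0, walk_scale=0.02, periods=(2.5, 1.5),
                noise_low=0.02, noise_high=0.15, n_dips=3,
                limit_low=0.0, limit_high=0.48):
    """Return a noisy, periodic, random-walk-like series of length n.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n)

    # History: a random walk is the cumulative sum of small Gaussian steps.
    walk = np.cumsum(rng.normal(0.0, walk_scale, n))

    # Periodicity: squared sines, each with a random phase.
    periodic = np.zeros(n)
    for cycles in periods:
        phase = rng.uniform(0.0, 2.0 * np.pi)
        periodic += np.sin(np.pi * cycles * t + phase) ** 2

    # Varying noise level: the standard deviation follows a smooth envelope
    # between noise_low and noise_high.
    envelope = np.sin(np.pi * rng.uniform(1.0, 3.0) * t) ** 2
    sigma = noise_low + (noise_high - noise_low) * envelope
    noise = rng.normal(0.0, 1.0, n) * sigma

    data = walk + periodic + noise

    # Weird dips: lower a few randomly placed windows by a random amount.
    for _ in range(n_dips):
        start = rng.integers(0, n)
        width = rng.integers(max(1, n // 100), max(2, n // 20))
        data[start:start + width] -= rng.uniform(0.2, 0.8)

    # Rescale to [limit_low, limit_high], as in the question's code.
    data = (data - data.min()) / (data.max() - data.min())
    return limit_low + data * (limit_high - limit_low)


series_a = mock_series(seed=1)  # reproducible: same seed, same series
series_b = mock_series(seed=2)  # same parameters, different realization
```

Changing only the seed gives another series with the same parameters but different details, and sharing a sample comes down to posting the function together with a seed. Because the phases are random, two seeds will not line up in time. If the series should be correlated, one option is to generate the periodic part once and add separate walk and noise terms to each copy.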

answered Sep 24 '22 20:09 by PeterWhy