So say I'm trying to create a 100-sample dataset that follows a certain line, maybe 2x+2. And I want the values on my X-axis to range from 0-1000. To do this, I use the following.
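import numpy as np
X = np.random.random((100, 1)) * 1000  # the shape must be passed as a tuple
Y = (2*X) + 2
data = np.hstack((X, Y))  # hstack takes a single sequence of arrays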
The hstack gives me the array with corresponding x and y values. That part works. But if I want to inject noise into it in order to scatter the datapoints further away from that 2x+2 line...that's what I can't figure out.
Say for example, I want that Y array to have a standard deviation of 20. How would I inject that noise into the y values?
numpy.random.normal creates an array of normally distributed random numbers. The loc argument is the mean and the scale argument is the standard deviation. Adding such an array to your Y values is one way to generate a white noise series.
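As a rough illustration of the loc and scale arguments described above, here is a minimal sketch. The seed and the sample count of 10000 are assumptions chosen only so that the sample statistics land close to the requested values; they are not part of the question.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only so the run is repeatable

# loc is the mean of the noise, scale is its standard deviation
noise = np.random.normal(loc=0.0, scale=20.0, size=10000)

print(noise.shape)   # one-dimensional array with 10000 entries
print(noise.mean())  # close to 0
print(noise.std())   # close to 20
```

With only 100 samples, as in the question, the sample standard deviation will wander a bit further from 20.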
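Maybe I'm missing something, but have you tried adding numpy.random.normal(scale=20, size=Y.shape) to Y? (With your (100, 1) column, a flat size=100 array would broadcast to a (100, 100) result, so match the shape of Y.) You can even write Y = numpy.random.normal(2*X+2, 20) and do it all at once (and without repeating the array size).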
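The two variants suggested above can be sketched side by side, assuming X is the (100, 1) column from the question. The variable names Y_added, Y_direct and Y_wrong are made up for this illustration. The last lines show what happens if the noise array is flat instead of matching the shape of X.

```python
import numpy as np

X = np.random.random((100, 1)) * 1000  # column vector, shape (100, 1)
line = (2 * X) + 2

# Variant 1: add zero-mean noise with the same shape as the line
Y_added = line + np.random.normal(scale=20, size=line.shape)

# Variant 2: use the line itself as the mean (loc); the output takes its shape
Y_direct = np.random.normal(line, 20)

data = np.hstack((X, Y_direct))
print(Y_added.shape, Y_direct.shape, data.shape)

# Pitfall: a flat (100,) noise array broadcasts against (100, 1)
Y_wrong = line + np.random.normal(scale=20, size=100)
print(Y_wrong.shape)  # (100, 100), not what was intended
```

Both variants draw each y value from a normal distribution centred on 2x+2 with standard deviation 20, so either one should do.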
To simulate noise, use a normally distributed random number generator such as np.random.randn.
Is this what you are trying to do?
X = np.linspace(0, 1000, 100)
Y = (2*X) + 2 + 20*np.random.randn(100)
data = np.hstack((X.reshape(100,1),Y.reshape(100,1)))
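One way to sanity-check the result is to fit a straight line back through the noisy points and look at the spread of the residuals. This is only a sketch: the use of np.polyfit and the fixed seed are my own additions, not part of the answer above. Note that it is the scatter around the line that has a standard deviation of about 20, not the Y array as a whole, which also varies because X does.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only so the run is repeatable

X = np.linspace(0, 1000, 100)
Y = (2 * X) + 2 + 20 * np.random.randn(100)

# Least-squares fit of a degree-1 polynomial: returns [slope, intercept]
slope, intercept = np.polyfit(X, Y, 1)

# Scatter of the points around the fitted line
residuals = Y - (slope * X + intercept)
resid_std = residuals.std(ddof=2)  # two fitted parameters

print(slope, intercept, resid_std)
```

The slope should come out very close to 2 and the residual standard deviation roughly 20. The intercept is estimated far less precisely with this much noise.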