Difference Between numpy.genfromtxt and numpy.loadtxt, and Unpack

I would like to know the difference between the two functions named in the title. The documentation says, "numpy.loadtxt [is] [an] equivalent function when no data is missing." What exactly does this mean? For instance, if I have a CSV file with a blank column between two columns containing data, should I avoid numpy.loadtxt?

Also, what does this mean,

"unpack : bool, optional If True, the returned array is transposed, so that arguments may be unpacked using x, y, z = loadtxt(...)" 

I am not quite certain as to what this means.

I'd appreciate your help, thank you!

Mack asked Nov 27 '13 14:11



1 Answer

You are correct. np.genfromtxt provides options such as the missing_values and filling_values parameters, which help when dealing with an incomplete CSV file. Example:

1,2,,,5
6,,8,,
11,,,,

Could be read with:

import numpy as np

filling_values = (111, 222, 333, 444, 555)  # one fill value for each column
np.genfromtxt(filename, delimiter=',', filling_values=filling_values)
# array([[   1.,    2.,  333.,  444.,    5.],
#        [   6.,  222.,    8.,  444.,  555.],
#        [  11.,  222.,  333.,  444.,  555.]])
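To make the contrast with np.loadtxt concrete, here is a minimal sketch that reads the same data with both functions. It uses an in-memory io.StringIO buffer in place of a real file, and the names csv_text, with_nan, and loadtxt_failed are made up for this illustration. It assumes the default float dtype, in which case genfromtxt fills empty fields with nan when no filling_values are given, while loadtxt cannot convert an empty field and raises an error.

```python
import io

import numpy as np

# Hypothetical in-memory CSV with empty fields, standing in for a real file.
csv_text = "1,2,,,5\n6,,8,,\n11,,,,\n"

# genfromtxt tolerates the empty fields; with the default float dtype
# they come back as nan when no filling_values are supplied.
with_nan = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# loadtxt cannot convert an empty field to a float, so it raises ValueError.
try:
    np.loadtxt(io.StringIO(csv_text), delimiter=",")
    loadtxt_failed = False
except ValueError:
    loadtxt_failed = True

print(with_nan)
print("loadtxt failed on missing data:", loadtxt_failed)
```

So for a file with blank cells, genfromtxt is the safer choice. When every field is present, both functions should return the same array.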

The unpack parameter is useful when you want to put each column of the text file in a different variable. For example, if the text file has three columns x, y, and z:

x, y, z = np.loadtxt(filename, unpack=True) 

Note that this works the same as

x, y, z = np.loadtxt(filename).T 

By default, iterating over a 2-D array iterates over its rows (the lines of the file), which is why you have to transpose or use unpack=True in this example.
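The row-versus-column behaviour can be checked with a small sketch. The data here is a made-up two-row, three-column table held in an io.StringIO buffer rather than a real file, and the variable names are chosen only for this illustration.

```python
import io

import numpy as np

# Hypothetical data: two rows, three columns, whitespace-delimited.
text = "1 2 3\n4 5 6\n"

# Plain iteration over the loaded 2-D array yields one row per step.
rows = [row.tolist() for row in np.loadtxt(io.StringIO(text))]

# unpack=True transposes the result, so each variable receives one column.
x, y, z = np.loadtxt(io.StringIO(text), unpack=True)

# Transposing manually gives the same columns.
xt, yt, zt = np.loadtxt(io.StringIO(text)).T

print(rows)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(x, y, z)
```

With this data, writing x, y, z = np.loadtxt(...) without unpack=True or .T would fail, because the array has only two rows to assign to three names.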

Saullo G. P. Castro answered Sep 22 '22 14:09