I'm trying to follow along with this TensorFlow tutorial, which uses a load_csv
function: TUTORIAL_LINK
One of the two lines in question is:
IRIS_TEST = "iris_test.csv"
test_set = tf.contrib.learn.datasets.base.load_csv(
    filename=IRIS_TEST, target_dtype=np.int)
where "iris_test.csv" looks like:
30,4,setosa,versicolor,virginica
5.9,3.0,4.2,1.5,1
6.9,3.1,5.4,2.1,2
5.1,3.3,1.7,0.5,0
6.0,3.4,4.5,1.6,1
5.5,2.5,4.0,1.3,1
6.2,2.9,4.3,1.3,1
5.5,4.2,1.4,0.2,0
6.3,2.8,5.1,1.5,2
5.6,3.0,4.1,1.3,1
6.7,2.5,5.8,1.8,2
7.1,3.0,5.9,2.1,2
4.3,3.0,1.1,0.1,0
I'm pretty sure the target of the machine learning exercise is the virginica
column, but I have no idea how it's specified as such.
Is it implied to be the last column?
From the code:
def load_csv(filename, target_dtype, target_column=-1, has_header=True):
    """Load dataset from CSV file."""
The default for target_column is -1, so the target is the last column. Good to know.
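To make the effect of target_column=-1 concrete, here is a minimal sketch using only the standard library. It is not the TensorFlow implementation: load_csv_sketch is a hypothetical name, and it simply skips the first row instead of interpreting it. Judging from the sample, that first row looks like a header (row count, feature count, class names) rather than column labels, but that reading is an assumption.

```python
import csv
import io


def load_csv_sketch(csv_text, target_column=-1, has_header=True):
    """Hypothetical stand-in for load_csv: split each row into features and target."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader)  # skip the first row; this sketch does not interpret it
    data, target = [], []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        # pop() removes the target value, so the rest of the row is the features
        target.append(int(row.pop(target_column)))
        data.append([float(value) for value in row])
    return data, target


# First rows of the sample file from the question
SAMPLE = """30,4,setosa,versicolor,virginica
5.9,3.0,4.2,1.5,1
6.9,3.1,5.4,2.1,2
5.1,3.3,1.7,0.5,0
"""

data, target = load_csv_sketch(SAMPLE)
print(target)   # [1, 2, 0]
print(data[0])  # [5.9, 3.0, 4.2, 1.5]
```

With the default of -1, the last value in each row becomes the label and the remaining four values become the features. Passing a different target_column would pull the label from that position instead.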