I am following the Iris example from TensorFlow.
In my case, all of the data is in a single CSV file, not split into training and test sets, and I want to apply k-fold cross-validation to it.
I have:
data_set = tf.contrib.learn.datasets.base.load_csv(filename="mydata.csv", target_dtype=np.int)
How can I perform k-fold cross-validation on this dataset with a multi-layer neural network, as in the Iris example?
K-fold cross-validation has a single parameter, k, which is the number of groups (folds) the dataset is split into. First split the dataset into k groups. Then, for each group in turn, use that group as the test set and the remaining k-1 groups as the training set, so that every group is used for testing exactly once.
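To make the splitting step concrete, here is a minimal sketch in plain Python. The function name k_fold_indices and the seed parameter are made up for illustration and are not part of TensorFlow or scikit-learn. It shuffles the row indices once and then hands out each fold as the test set in turn.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists, one pair per fold.

    Hypothetical helper for illustration only.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once so data sorted by class is mixed across folds.
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

The index lists can then be used to select rows from the features and labels loaded from the CSV file. In practice, scikit-learn's KFold does the same job and is probably the better choice when it is available.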
I know this question is old but in case someone is looking to do something similar, expanding on ahmedhosny's answer:
The TensorFlow Dataset API (tf.data) can create dataset objects from Python generators, so one option is to use scikit-learn's KFold and create a dataset from the KFold.split() generator:
import numpy as np
from sklearn.model_selection import LeaveOneOut, KFold

import tensorflow as tf
import tensorflow.contrib.eager as tfe
tf.enable_eager_execution()

from sklearn.datasets import load_iris
data = load_iris()
X = data['data']
y = data['target']

def make_dataset(X_data, y_data, n_splits):

    def gen():
        for train_index, test_index in KFold(n_splits).split(X_data):
            X_train, X_test = X_data[train_index], X_data[test_index]
            y_train, y_test = y_data[train_index], y_data[test_index]
            yield X_train, y_train, X_test, y_test

    return tf.data.Dataset.from_generator(gen, (tf.float64, tf.float64, tf.float64, tf.float64))

dataset = make_dataset(X, y, 10)
Then one can iterate through the dataset either in graph mode or with eager execution. With eager execution:
for X_train, y_train, X_test, y_test in tfe.Iterator(dataset):
    ...
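The loop body is left open above. As a rough sketch of what usually goes there, the example below trains a fresh model on each fold and averages the test accuracy. Note the assumptions: it uses scikit-learn's MLPClassifier as a stand-in for the TensorFlow network, so it is not the code from this answer. The layer sizes, max_iter, and random_state values are arbitrary illustrative choices. It also shuffles before splitting, because the Iris rows are ordered by class and unshuffled folds would each contain mostly one class.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)

# Iris rows are ordered by class, so shuffle before splitting.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

scores = []
for train_index, test_index in kfold.split(X):
    # Build a fresh model for every fold so nothing learned on one
    # fold's training data carries over to the next.
    model = MLPClassifier(
        hidden_layer_sizes=(10, 20, 10),  # illustrative choice
        max_iter=2000,
        random_state=0,
    )
    model.fit(X[train_index], y[train_index])
    # score() returns the mean accuracy on the held-out fold.
    scores.append(model.score(X[test_index], y[test_index]))

print("per-fold accuracy:", np.round(scores, 3))
print("mean accuracy:", np.mean(scores))
```

The same pattern should carry over to a TensorFlow model: create the model inside the loop, fit it on the training part of the fold, evaluate it on the test part, and report the mean over all folds.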