numpy: How can I select specific indexes in an np array for k-fold cross validation?

I have a training data set in matrix form with dimensions 5000 x 3027 (CIFAR-10 data set). Using array_split in numpy, I partitioned it into 5 parts, and I want to hold out one part as the cross-validation fold and train on the rest. However, my problem arises when I use something like XTrain[[Indexes]], where Indexes is an array like [0,1,2,3]: this gives me a 3D tensor of dimensions 4 x 1000 x 3027 rather than a matrix. How do I collapse the "4 x 1000" into 4000 rows, to get a matrix of 4000 x 3027?

for fold in range(len(X_train_folds)):
    indexes = np.delete(np.arange(len(X_train_folds)), fold) 
    XTrain = X_train_folds[indexes]
    X_cv = X_train_folds[fold]
    yTrain = y_train_folds[indexes]
    y_cv = y_train_folds[fold]

    classifier.train(XTrain, yTrain)
    dists = classifier.compute_distances_no_loops(X_cv)
    y_test_pred = classifier.predict_labels(dists, k)

    num_correct = np.sum(y_test_pred == y_test)
    accuracy = float(num_correct/num_test)
    k_to_accuracy[k] = accuracy

asked May 22 '16 03:05

kwotsin

1 Answer

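Perhaps you can try this instead (I am new to numpy, so if I am doing something inefficient or wrong, I would be happy to be corrected). Since np.array_split returns a plain Python list of arrays, you can drop fold i with list slicing and join the remaining folds row-wise with np.concatenate, which gives a 2D matrix directly: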

import numpy as np

# array_split returns a Python list of arrays, one per fold
X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)
k_to_accuracies = {}

for k in k_choices:
    k_to_accuracies[k] = []
    for i in range(num_folds):
        # Join every fold except fold i along axis 0; fold i is held out
        training_data = np.concatenate(X_train_folds[:i] + X_train_folds[i+1:])
        training_labels = np.concatenate(y_train_folds[:i] + y_train_folds[i+1:])
        test_data, test_labels = X_train_folds[i], y_train_folds[i]
        classifier.train(training_data, training_labels)
        predicted_labels = classifier.predict(test_data, k)
        # The mean of a boolean array is the fraction of correct predictions
        # (this also avoids integer division under Python 2)
        k_to_accuracies[k].append(np.mean(predicted_labels == test_labels))
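
To see the collapsing step in isolation, here is a minimal sketch on toy data. The 10 x 3 matrix, the variable names, and the choice of held-out fold are assumptions for illustration, not the CIFAR-10 setup from the question. It compares concatenating the list of folds with the reshape route the question was heading towards, which only works when every fold has the same number of rows.

```python
import numpy as np

# Toy stand-in for the training matrix: 10 rows, 3 features (assumed shapes).
X_train = np.arange(30).reshape(10, 3)
num_folds = 5
held_out = 1  # index of the fold used for validation (arbitrary choice)

folds = np.array_split(X_train, num_folds)  # list of 5 arrays, each (2, 3)

# Option 1: concatenate the remaining folds along axis 0.
# This works even when the folds have unequal sizes.
train_concat = np.concatenate(folds[:held_out] + folds[held_out + 1:])

# Option 2: stack the folds into a 3D array, fancy-index, then reshape.
# This is only valid when every fold has the same number of rows.
stacked = np.array(folds)                            # shape (5, 2, 3)
indexes = np.delete(np.arange(num_folds), held_out)  # [0, 2, 3, 4]
train_reshaped = stacked[indexes].reshape(-1, X_train.shape[1])

print(train_concat.shape)                            # (8, 3)
print(np.array_equal(train_concat, train_reshaped))  # True
```

With the shapes from the question, the same reshape call would turn the 4 x 1000 x 3027 tensor into a 4000 x 3027 matrix. The concatenate version is probably the safer default, because array_split may produce folds of different sizes when the row count is not divisible by the number of folds.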

answered Sep 27 '22 16:09

Abhas Sinha

Tags:

python

arrays

machine-learning

numpy

cross-validation
