
Pulling random files out of a folder for sampling

I need a way to pull 10% of the files in a folder, at random, for sampling after every "run." Luckily, my files are currently numbered sequentially. So my current method is to list the file names, parse the numerical portion, find the max and min values, count the number of files and multiply by 0.1, then use random.sample to get a random 10% sample. I also write these names to a .txt file, then use shutil.copy to copy the actual files.

Obviously, this does not work if I have an outlier, e.g. a file 345.txt among other files from 513.txt to 678.txt, because numbers sampled between the min and max may not correspond to any existing file. Is there a direct way to simply pull a number of files from a folder at random? I have looked it up and cannot find a better method.

Thanks.

physlexic asked Mar 14 '18 14:03



2 Answers

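Using numpy.random.choice(array, N, replace=False) you can select N distinct items at random from an array. Without replace=False the function samples with replacement, so the same file could be picked more than once.

import numpy as np
import os

# list all files in the current directory
files = [f for f in os.listdir('.') if os.path.isfile(f)]

# select 10% of the files at random, without repeats
random_files = np.random.choice(files, int(len(files) * .1), replace=False)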
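The same selection can probably be done with the standard library alone, since random.sample draws without replacement. The following is a minimal sketch rather than part of the original answer: `sample_files`, `fraction`, and `seed` are hypothetical names introduced here for illustration. Because it samples from the actual file names instead of a numeric range, an outlier such as 345.txt should not matter.

```python
import os
import random


def sample_files(folder, fraction=0.1, seed=None):
    """Return a random subset of the file names in `folder`, without repeats.

    Hypothetical helper: the name and parameters are assumptions, not from the thread.
    """
    # Keep regular files only; join with the folder so isfile() checks the right path.
    # Sorting makes the result repeatable for a given seed, since listdir order is arbitrary.
    files = sorted(
        f for f in os.listdir(folder) if os.path.isfile(os.path.join(folder, f))
    )
    # int() rounds down, so a folder with fewer than 10 files yields an empty sample.
    k = int(len(files) * fraction)
    rng = random.Random(seed)
    return rng.sample(files, k)
```

Calling `sample_files('.')` would return roughly 10% of the file names in the current directory; passing a fixed `seed` should make the draw repeatable between runs.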
Karl Anka answered Sep 18 '22 14:09


I was unable to get the other methods to work easily with my code, but I came up with this.

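import os
import shutil
from random import sample

# files and subdir are assumed to be defined earlier in the script
output_folder = 'C:/path/to/folder'
for to_copy in sample(files, int(len(files) * .1)):  # sample() never repeats a file
    shutil.copy(os.path.join(subdir, to_copy), output_folder)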
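For completeness, the whole workflow described in the question (list the files, pick 10%, write the names to a .txt, copy the files) could be sketched as one function. This is an illustration under stated assumptions, not the asker's actual script: `copy_random_sample`, `manifest`, and the default file name `sample.txt` are hypothetical, and writing the name list into the destination folder is a choice made here, not something the thread specifies.

```python
import random
import shutil
from pathlib import Path


def copy_random_sample(src, dst, fraction=0.1, manifest="sample.txt"):
    """Copy a random `fraction` of the files in `src` into `dst`.

    Hypothetical helper; returns the list of copied file names.
    """
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)

    # Only regular files directly inside src are candidates.
    files = sorted(p.name for p in src.iterdir() if p.is_file())

    # random.sample draws without replacement, so each file is picked at most once.
    chosen = random.sample(files, int(len(files) * fraction))

    # Record the chosen names, one per line (assumed location: inside dst).
    (dst / manifest).write_text("".join(name + "\n" for name in chosen))

    for name in chosen:
        shutil.copy(src / name, dst / name)
    return chosen
```

A call such as `copy_random_sample('runs/latest', 'runs/sample')` (both paths are placeholders) would copy about 10% of the files and leave a `sample.txt` listing them next to the copies.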
physlexic answered Sep 22 '22 14:09