Everything is in the title. I'm wondering if anyone knows a quick way, with reasonable memory demands, to randomly shuffle all the lines of a 3-million-line file. I guess it is not possible with a simple vim command, so any simple script in Python would do. I tried Python with a random number generator, but did not manage to find a simple solution.
Using the shuf Command
The shuf utility is part of the GNU Coreutils package. It outputs a random permutation of the input lines. The shuf command loads all input data into memory while shuffling, so it can fail if the input file is larger than the free memory.
To shuffle strings or tuples, use random.sample(), which creates a new object instead of modifying the original. random.sample() returns a list even when a string or tuple is passed as the first argument, so the result has to be converted back to a string or tuple.
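As a minimal sketch of the random.sample() approach described above (the variable names and sample values are only illustrative):

```python
import random

text = "abcdef"
numbers = (1, 2, 3, 4, 5)

# random.sample() returns a new list; passing len(...) as the sample size
# yields a full permutation of the input.
shuffled_chars = random.sample(text, len(text))

# Convert the list back to the original type.
shuffled_text = "".join(shuffled_chars)
shuffled_numbers = tuple(random.sample(numbers, len(numbers)))

print(shuffled_text)
print(shuffled_numbers)
```

The originals, text and numbers, are left unchanged, since both types are immutable.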
Python Random shuffle() Method
The shuffle() method takes a mutable sequence, such as a list, and reorders its items in place. Note: this method changes the original list; it does not return a new list.
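A short illustration of the in-place behaviour, with made-up sample data:

```python
import random

items = [1, 2, 3, 4, 5]

# shuffle() reorders the list in place and returns None,
# so do not write `items = random.shuffle(items)`.
result = random.shuffle(items)
print(result)   # None
print(items)    # same elements, in a random order

# To keep the original order, shuffle a copy instead.
original = ["a", "b", "c", "d"]
shuffled_copy = original[:]
random.shuffle(shuffled_copy)
```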
Takes only a few seconds in Python:
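import random

# Read every line into memory, shuffle the list in place, then overwrite the file.
with open('3mil.txt') as f:
    lines = f.readlines()
random.shuffle(lines)
with open('3mil.txt', 'w') as f:
    f.writelines(lines)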
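The snippet above holds the whole file in memory. If that is a concern, one possible alternative (not part of the original answer) is to record only the byte offset of each line, shuffle the offsets, and copy the lines out in that order. The function name shuffle_lines_low_memory and the output path are hypothetical, and this is only a sketch: it trades memory for many random seeks, so it may be noticeably slower on large files.

```python
import random
from array import array


def shuffle_lines_low_memory(src_path, dst_path, seed=None):
    """Write the lines of src_path to dst_path in random order.

    Only one 8-byte offset per line is kept in memory, not the line text.
    """
    # Pass 1: record the byte offset at which each line starts.
    offsets = array('Q')
    with open(src_path, 'rb') as src:
        pos = 0
        for line in src:
            offsets.append(pos)
            pos += len(line)

    # Shuffle the offsets in place (seed is optional, for reproducibility).
    random.Random(seed).shuffle(offsets)

    # Pass 2: seek to each offset in shuffled order and copy that line.
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        for offset in offsets:
            src.seek(offset)
            line = src.readline()
            # A final line without a newline would otherwise merge with the next one.
            if not line.endswith(b'\n'):
                line += b'\n'
            dst.write(line)
```

It could be called as shuffle_lines_low_memory('3mil.txt', '3mil_shuffled.txt'). Unlike the snippet above, it writes to a separate file, because the source is still being read while the output is written.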