I have a file with ~1000 lines that looks like this:
ABC C5A 1
CFD D5G 4
E1E FDF 3
CFF VBV 1
FGH F4R 2
K8K F9F 3
... etc
I would like to select 100 random lines, but with 10 of each third column value (so random 10 lines from all lines with value "1" in column 3, random 10 lines from all lines with value "2" in column 3, etc).
Is this possible using bash?
First grep all the lines with a certain number in the last column, then shuffle them and pick 10 at random using shuf -n 10.
for i in {1..10}; do
grep " ${i}$" file | shuf -n 10
done > randomFile
If you don't have shuf, use sort -R to randomly sort them instead:
for i in {1..10}; do
grep " ${i}$" file | sort -R | head -n 10
done > randomFile
If you can use awk, you can do the same with a one-liner:
sort -R file | awk '{if (count[$3] < 10) {count[$3]++; print $0}}'
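To try the awk approach end to end, here is a minimal sketch. It assumes GNU shuf is available and that the columns are whitespace-separated; sample.txt and its contents are made up purely for illustration (1000 lines, 100 for each value 1 to 10 in column 3). It swaps sort -R for shuf and uses the shorter awk idiom count[$3]++ < 10, which keeps a line only while fewer than 10 lines of its group have been printed.

```shell
# Build a hypothetical 1000-line sample file: two dummy columns plus a group value 1..10.
seq 1 1000 | awk '{ printf "A%d B%d %d\n", $1, $1, ($1 % 10) + 1 }' > sample.txt

# Shuffle once, then keep a line only while its group (column 3) has fewer than 10 picks.
shuf sample.txt | awk 'count[$3]++ < 10' > randomFile

# Sanity check: count the kept lines per group value.
awk '{ n[$3]++ } END { for (k in n) print k, n[k] }' randomFile | sort -n
```

Unlike the loop versions, this does not hard-code the values 1 to 10: it takes up to 10 lines for whatever values show up in column 3, and fewer if a group has less than 10 lines.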