Linux: create random directory/file hierarchy

To test a tool, I need a directory containing a large number of different Office files in a deeply nested structure. I already have the files in one directory, but now need to create some random nested subdirectories and spread the files across them.

I could sit down and write a proper program in a programming language of my choice, but I wonder if there might be a clever combination of Linux command line tools + Bash to achieve what I want.

Edit: to clarify, my input is a directory with about 200 files. The output should be a directory hierarchy containing these files, spread more or less evenly. Directory names should be longer than a single letter, vary randomly in length, and use various allowed characters (UTF-8 filesystem).

Andreas Gohr, asked Nov 15 '12 15:11


4 Answers

None of these solutions was fast enough for me, since they all rely on Bash, so I created a Rust crate that generates pseudo-random directory hierarchies: https://crates.io/crates/ftzz.

Note that I only cared about the hierarchy itself, not its contents, so this program creates empty files or files filled with random data.

SUPERCILEX, answered Sep 22 '22 05:09


This script generates a random directory structure:

#!/bin/bash

# Decimal ASCII codes (see man ascii)
ARR=( {48..57} {65..90} {97..122} )

# Array count
arrcount=${#ARR[@]}

# print a random name of 1 to 29 characters
# (the length is picked once, so the name is never empty)
get_rand_dir(){
    local i len=$((1 + RANDOM % 29))
    for ((i=0; i<len; i++)) {
        printf \\$(printf '%03o' ${ARR[RANDOM%arrcount]});
    }
}

dir=/tmp/

# appending random characters to make a hierarchy
for ((i=0; i<$((RANDOM%100)); i++)) {
    dir+="$(get_rand_dir)/"
}

echo "$dir"
mkdir -p "$dir"

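# top of the generated tree, i.e. /tmp/<first random name>
oldir=$(echo "$dir" | cut -d '/' -f1-3)

# walk back up the branch, adding random siblings at every level;
# stop at $oldir so nothing is created in /tmp itself or in $HOME
dir=${dir%/}
while [[ $dir == "$oldir"* ]]; do
    cd "$dir" || exit 1
    for ((i=0; i<$((RANDOM%100)); i++)) {
        mkdir &>/dev/null -p "$(get_rand_dir)"
    }
    dir=${dir%/*}
done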

tree "$oldir"

OUTPUT

/tmp/x
├── egeDVPW
├── iOkr
├── l
├── o1gye8uF
├── q
│   ├── 4Dlrfagv
│   ├── 4Yxmoqf
│   ├── 8LkyIrXA
│   ├── 8m9kse8s
│   ├── aV
│   ├── in
│   │   ├── 12zdLso68HWlPK
│   │   │   ├── C
│   │   │   ├── DOYt8wUW
│   │   │   ├── FXP
│   │   │   ├── hFLem8
│   │   │   ├── hhHIv
│   │   │   ├── iD87kxs54x04
│   │   │   ├── oFM
│   │   │   ├── OjFT

Now you can create an array of the directories:

shopt -s globstar # requires bash 4
dirs=( /tmp/x/**/ ) # the trailing slash restricts the match to directories
printf '%s\n' "${dirs[@]}"

and populate the directories with files at random. You have enough examples to do that part yourself; I've done the hardest work.
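
For the remaining step, here is a minimal sketch of one way to spread the files over the directories. It is not part of the original answer: the paths `src` and `dest` are hypothetical and are created in a temporary directory so the example runs on its own, and it assumes GNU `find` and `shuf` (for `-print0` and `-z`). Dealing the shuffled files out round-robin gives a roughly even spread, which is what the question asks for.

```shell
#!/usr/bin/env bash
# Sketch: spread the files of one flat directory over an existing tree.
# Assumption: src and dest are hypothetical demo paths, created here in a
# temporary directory so that the example is self-contained.
set -euo pipefail
shopt -s globstar   # requires bash 4+

work=$(mktemp -d)
src=$work/files
dest=$work/tree
mkdir -p "$src" "$dest"/{a,b}/{c,d}
for n in {1..20}; do : > "$src/file $n.docx"; done

# The trailing slash makes the glob match directories only.
dirs=( "$dest"/**/ )

# Shuffle the files, then deal them out round-robin so that every
# directory receives about the same number of files.
i=0
while IFS= read -r -d '' f; do
    mv -- "$f" "${dirs[i % ${#dirs[@]}]}"
    i=$((i + 1))
done < <(find "$src" -maxdepth 1 -type f -print0 | shuf -z)

moved=$(find "$dest" -type f | wc -l)
echo "moved $moved files into ${#dirs[@]} directories"
```

To use it on real data, point `src` at the flat directory holding the Office files and `dest` at the generated tree (for example /tmp/x), and drop the lines that create the demo data.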

Gilles Quenot, answered Nov 15 '22 23:11


You can use bash brace-expansion:

mkdir -p {a,b}/{e,f,g}/{h,i,j}

├───a
│   ├───e
│   │   ├───h
│   │   ├───i
│   │   └───j
│   ├───f
│   │   ├───h
│   │   ├───i
│   │   └───j
│   └───g
│       ├───h
│       ├───i
│       └───j
└───b
    ├───e
    │   ├───h
    │   ├───i
    │   └───j
    ├───f
    │   ├───h
    │   ├───i
    │   └───j
    └───g
        ├───h
        ├───i
        └───j
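
As a small illustration (the temporary directory and the `level`/`sub` names are only for the demo), the sketch below shows that brace expansion produces the full cross product of the lists, so every branch ends up with the same child names. It also shows numeric ranges, which can save typing but only accept literal bounds.

```shell
#!/usr/bin/env bash
# Sketch: brace expansion builds the full cross product of the lists.
# The temporary directory and the names are illustrative only.
set -euo pipefail

root=$(mktemp -d)
mkdir -p "$root"/{a,b}/{e,f,g}/{h,i,j}

# 2 first-level + 2*3 second-level + 2*3*3 third-level = 26 directories
count=$(find "$root" -mindepth 1 -type d | wc -l)
echo "created $count directories"

# Ranges work too, but the bounds must be literals: brace expansion
# happens before variable expansion, so {1..$n} is not expanded.
mkdir -p "$root"/level{1..3}/sub{01..05}
```

Because the names are fixed in the command, this gives a regular tree rather than a random one; random names have to be generated first and then pasted into the brace lists.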

kev, answered Nov 15 '22 22:11


Thanks to all who posted here. It turned out that escaping filenames with special characters was not really trivial, so I built my own script based on the ones above. Here is how it behaves with special characters in the names:

$ ~/rndtree.sh ./rndpath 0 3 1
Warning: will create random tree at: ./rndpath
Proceed (y/n)? y
Removing old outdir ./rndpath
mkdir -p ./rndpath/";"/{")?DxVBBJ{w2","L,|+","^VC)Vn.6!"}/"D+,IFJ( LN"
> > > > > > > > > > > 
./rndpath
└── [       4096]  ;
    ├── [       4096]  )?DxVBBJ{w2
    │   ├── [       4096]  D+,IFJ( LN
    │   │   └── [        929]  r2.bin
    │   ├── [       8557]  %3fsaG# Rl;ffXf.bin
    │   └── [      19945]  Dzk .bin
    ├── [       4096]  L,|+
    │   ├── [       4096]  D+,IFJ( LN
    │   │   ├── [       2325]  6Qg#pe5j'&ji49oqrO.bin
    │   │   ├── [      16345]  #?.bin
    │   │   └── [       2057]  Uz-0XtLVWz#}0lI.bin
    │   ├── [       2543]  bbtA-^s22vdTu.bin
    │   └── [      10848]  K46+kh7L9.bin
    ├── [       4096]  ^VC)Vn.6!
    │   ├── [       4096]  D+,IFJ( LN
    │   ├── [      10502]  8yY,MqZ ^5+_SA^.r4{.bin
    │   └── [      17628]  ipO"|69.bin
    └── [      12376]  a2Y% }G1.qDir.bin

7 directories, 11 files
total bytes: 136823 ./rndpath

and here with a safe subset of ASCII:

$ ~/rndtree.sh ./rndpath 1 3 1
Warning: will create random tree at: ./rndpath
Proceed (y/n)? y
Removing old outdir ./rndpath
mkdir -p ./rndpath/"48SLS"/{"nyG","jIC6vj"}/{"PSLd5tMn","V R"}
> > > > > > > 
./rndpath
├── [       4096]  48SLS
│   ├── [       4096]  jIC6vj
│   │   ├── [       4096]  PSLd5tMn
│   │   ├── [       4096]  V R
│   │   │   ├── [        922]  lg.bin
│   │   │   └── [       9600]  VVYG.bin
│   │   ├── [      10883]  B7nt06p.bin
│   │   └── [      19339]  g5uT i.bin
│   ├── [       4096]  nyG
│   │   ├── [       4096]  PSLd5tMn
│   │   └── [       4096]  V R
│   │       └── [       6128]  1tfLR.bin
│   └── [       5448]  Jda.bin
└── [      18196]  KwEXu2O2H9s.bin

Spaces should be handled in both cases. Note, however, that subdirectory names repeat across branches (while filenames do not).
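
The repeated names are a consequence of building the tree with a single brace-expanded `mkdir -p`, which pairs every name on one level with every name on the next. If distinct names per directory matter, one possible alternative is to create the tree level by level. The following is only a sketch under that assumption: `rand_name` and `make_tree` are hypothetical helpers that are not part of rndtree.sh, and names are limited to alphanumerics to keep the example short.

```shell
#!/usr/bin/env bash
# Sketch: build the tree level by level so that every directory gets
# freshly generated child names instead of a repeated brace list.
# rand_name and make_tree are hypothetical helpers, not part of rndtree.sh.
set -euo pipefail

# print a random alphanumeric name of 1 to 12 characters
rand_name() {
    local chars=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
    local len=$((1 + RANDOM % 12)) name='' i
    for ((i = 0; i < len; i++)); do
        name+=${chars:RANDOM % ${#chars}:1}
    done
    printf '%s' "$name"
}

# make_tree DIR DEPTH MAXCHILDREN: create 1 to MAXCHILDREN subdirectories
# in DIR, then recurse until DEPTH levels have been created
make_tree() {
    local dir=$1 depth=$2 max=$3 n i child
    (( depth > 0 )) || return 0
    n=$((1 + RANDOM % max))
    for ((i = 0; i < n; i++)); do
        child=$dir/$(rand_name)
        mkdir -p -- "$child"
        make_tree "$child" $((depth - 1)) "$max"
    done
}

root=$(mktemp -d)
make_tree "$root" 3 3
find "$root" -mindepth 1 -type d
```

This trades the single `mkdir` call for one call per directory, so it is slower on large trees, but it needs no `eval` and therefore no escaping of the generated names.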

The script rndtree.sh:

#!/usr/bin/env bash

# http://stackoverflow.com/questions/13400312/linux-create-random-directory-file-hierarchy
# Decimal ASCII codes (see man ascii); added space
AARR=( 32 {48..57} {65..90} {97..122} )
# Array count
aarrcount=${#AARR[@]}

# positional parameters, each falling back to a default when unset or empty
OUTDIR="${1:-./rndpath}"
ASCIIONLY="${2:-1}"
DIRDEPTH="${3:-3}"
MAXFIRSTLEVELDIRS="${4:-2}"
MAXDIRCHILDREN="${5:-4}"
MAXDIRNAMELEN="${6:-12}"
MAXFILECHILDREN="${7:-4}"
MAXFILENAMELEN="${8:-20}"
MAXFILESIZE="${9:-20000}"

MINDIRNAMELEN=1
MINFILENAMELEN=1
MINDIRCHILDREN=1
MINFILECHILDREN=0
MINFILESIZE=1
FILEEXT=".bin"
VERBOSE=0 #1

get_rand_dirname() {
  # pick the length once, so the loop bound is not re-rolled on every iteration
  local i len=$((MINDIRNAMELEN + RANDOM % MAXDIRNAMELEN))
  if [ "$ASCIIONLY" == "1" ]; then
    for ((i=0; i<len; i++)) {
      printf \\$(printf '%03o' ${AARR[RANDOM%aarrcount]});
    }
  else
    # printable ASCII minus characters that would break the eval'd mkdir;
    # double quotes are backslash-escaped because the name ends up inside "..."
    cat /dev/urandom | tr -dc '[ -~]' | tr -d '[$></~:`\\]' | head -c"$len" | sed 's/\(["]\)/\\\1/g'
  fi
  #echo -e " " # debug last dirname space
}

get_rand_filename() {
  local i len=$((MINFILENAMELEN + RANDOM % MAXFILENAMELEN))
  if [ "$ASCIIONLY" == "1" ]; then
    for ((i=0; i<len; i++)) {
      printf \\$(printf '%03o' ${AARR[RANDOM%aarrcount]});
    }
  else
    # no need to escape double quotes for filename
    cat /dev/urandom | tr -dc '[ -~]' | tr -d '[$></~:`\\]' | head -c"$len"
  fi
  printf "%s" "$FILEEXT"
}


echo "Warning: will create random tree at: $OUTDIR"
[ "$VERBOSE" == "1" ] && echo "  MAXFIRSTLEVELDIRS $MAXFIRSTLEVELDIRS ASCIIONLY $ASCIIONLY DIRDEPTH $DIRDEPTH MAXDIRCHILDREN $MAXDIRCHILDREN MAXDIRNAMELEN $MAXDIRNAMELEN MAXFILECHILDREN $MAXFILECHILDREN MAXFILENAMELEN $MAXFILENAMELEN MAXFILESIZE $MAXFILESIZE"

read -p "Proceed (y/n)? " READANS
if [ "$READANS" != "y" ]; then
  exit
fi

if [ -d "$OUTDIR" ]; then
  echo "Removing old outdir $OUTDIR"
  rm -rf "$OUTDIR"
fi

mkdir "$OUTDIR"

if [ $MAXFIRSTLEVELDIRS -gt 0 ]; then
  NUMFIRSTLEVELDIRS=$((1+RANDOM%MAXFIRSTLEVELDIRS))
else
  NUMFIRSTLEVELDIRS=0
fi



# create directories
for (( ifl=0;ifl<$((NUMFIRSTLEVELDIRS));ifl++ )) {
  FLDIR="$(get_rand_dirname)"
  FLCHILDREN="";
  for (( ird=0;ird<$((DIRDEPTH-1));ird++ )) {
    DIRCHILDREN=""; MOREDC=0;
    for ((idc=0; idc<$((MINDIRCHILDREN+RANDOM%MAXDIRCHILDREN)); idc++)) {
      CDIR="$(get_rand_dirname)" ;
      # make sure comma is last, so brace expansion works even for 1 element? that can mess with expansion math, though
      if [ "$DIRCHILDREN" == "" ]; then DIRCHILDREN="\"$CDIR\"" ;
      else DIRCHILDREN="$DIRCHILDREN,\"$CDIR\"" ; MOREDC=1 ; fi
    }
    if [ "$MOREDC" == "1" ] ; then
      if [ "$FLCHILDREN" == "" ]; then FLCHILDREN="{$DIRCHILDREN}" ;
      else FLCHILDREN="$FLCHILDREN/{$DIRCHILDREN}" ; fi
    else
      if [ "$FLCHILDREN" == "" ]; then FLCHILDREN="$DIRCHILDREN" ;
      else FLCHILDREN="$FLCHILDREN/$DIRCHILDREN" ; fi
    fi
  }
  DIRCMD="mkdir -p $OUTDIR/\"$FLDIR\"/$FLCHILDREN"
  eval "$DIRCMD"
  echo "$DIRCMD"
}

# now loop through all directories and create random files inside;
# "$D" is always used quoted, so spaces in names survive without needing a
# printf '%q'-escaped copy of the path (backslashes, which can trip up '%q',
# are already excluded from the generated names)
find "$OUTDIR" -type d | while IFS= read -r D ; do
  [ "$VERBOSE" == "1" ] && echo "$D"
  # pick the number of files once per directory
  NUMFILES=$((MINFILECHILDREN + RANDOM % MAXFILECHILDREN))
  for ((ifc=0; ifc<NUMFILES; ifc++)) {
    CFILE="$(get_rand_filename)" ;
    echo -n '> '
    [ "$VERBOSE" == "1" ] && echo "$D"/"$CFILE"
    head -c$((MINFILESIZE + RANDOM % MAXFILESIZE)) /dev/urandom \
    > "$D"/"$CFILE"
  }
done

echo
tree -a --dirsfirst -s "$OUTDIR"
echo "total bytes: $(du -bs "$OUTDIR")"

sdaau, answered Nov 15 '22 22:11