I have input files structured like this:
a1
  b1
    c1
    c2
    c3
  b2
    c1
      d1
      d2
  b3
  b4
a2
a3
  b1
  b2
    c1
    c2
Each level is indented by 2 spaces. The required output is:
a1/b1/c1
a1/b1/c2
a1/b1/c3
a1/b2/c1/d1
a1/b2/c1/d2
a1/b3
a1/b4
a2
a3/b1
a3/b2/c1
a3/b2/c2
It works like a filesystem: if the next line has greater indentation, the current line is a "directory"; if the next line has the same or smaller indentation, the current line is a "file". I need to print the full paths of the "files".
I'm trying to solve this without any high-level language such as python or perl, using only basic bash commands.
My current code is based on a recursive function call and a stack, but I have a problem with the logic. It currently outputs:
a1 b1 c1
a1 b1
a1
DD: line 8: [0-1]: bad array subscript
Only the first line is correct, so my handling of the recursion is wrong. The code:
input="ifile.tree"
#stack array
declare -a stack
#stack manipulation
pushstack() { stack+=("$1"); }
popstack() { unset stack[${#stack[@]}-1]; }
printstack() { echo "${stack[*]}"; }
#recursive function
checkline() {
    local uplev=$1
    #read line - if no more lines - print the stack and return
    read -r level text || (printstack; exit 1) || return
    #if the current line level is largest than previous level
    if [[ $uplev < $level ]]
    then
        pushstack "$text"
        checkline $level #recurse
    fi
    printstack
    popstack
}
# MAIN PROGRAM
# change the input from indented spaces to
# level_number<space>text
(
    #subshell - change IFS
    IFS=,
    while read -r spaces content
    do
        echo $(( (${#spaces} / 2) + 1 )) "$content"
    done < <(sed 's/[^ ]/,&/' < "$input")
) | ( #pipe to another subshell
    checkline 0 #recurse by levels
)
Sorry for the long code. Can anybody help?
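A side note on the code above, separate from the recursion logic: inside `[[ ]]`, the `<` operator compares strings lexicographically, so the level test would likely misbehave once a level reaches two digits. A minimal sketch of the difference, assuming bash; `is_deeper` is a hypothetical helper name, not something from the question:

```shell
#!/bin/bash
# [[ a < b ]] is a string comparison; (( a < b )) is an arithmetic one.
# `is_deeper` is a made-up helper name used only for this illustration.
is_deeper() {    # usage: is_deeper PREV CUR; succeeds if CUR is numerically greater
    (( $1 < $2 ))
}

# String comparison: "9" sorts after "10", so this takes the else branch.
if [[ 9 < 10 ]]; then echo "string: 9 < 10"; else echo "string: 9 >= 10"; fi

# Arithmetic comparison: 9 is less than 10, so this takes the then branch.
if is_deeper 9 10; then echo "arithmetic: 9 < 10"; else echo "arithmetic: 9 >= 10"; fi
```

Using `(( uplev < level ))` or `[ "$uplev" -lt "$level" ]` in `checkline` would avoid that particular pitfall; it does not by itself fix the stack handling.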
Interesting question.
This awk command (it could be a one-liner) does the job:
awk -F'  ' 'NF<=p{for(i=1;i<=p;i++)printf "%s%s", a[i],(i==p?RS:"/")
    if(NF<p)for(i=NF;i<=p;i++) delete a[i]}
{a[NF] =$NF;p=NF }
END{for(i=1;i<=NF;i++)printf "%s%s", a[i],(i==NF?RS:"/")}' file
As you can see above, the printing loop is duplicated; you can extract it into a function if you like.
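The refactoring mentioned above could look roughly like the sketch below. It keeps the same idea (a two-space field separator, so `NF` is the depth of the line) but moves the print loop into an awk function. The names `tree2paths` and `emit` are made up for this example, and the sketch assumes each child line is exactly one level deeper than its parent, which is why it skips the `delete` step:

```shell
#!/bin/bash
# Sketch only: `tree2paths` and `emit` are hypothetical names.
# Assumes 2-space indentation and that a child is exactly one level deeper
# than its parent, so a[1..p] always holds the current path.
tree2paths() {
    awk -F'  ' '
        # print a[1]..a[n] joined by "/"
        function emit(n,    i) {
            for (i = 1; i <= n; i++)
                printf "%s%s", a[i], (i == n ? "\n" : "/")
        }
        # same or shallower level than the previous line: previous line was a leaf
        NF <= p { emit(p) }
        # remember the name at this depth and the depth itself
        { a[NF] = $NF; p = NF }
        # the last line of the input is always a leaf
        END { if (p) emit(p) }
    ' "$@"
}
```

It can be called as `tree2paths file` or with the tree on standard input. Using `p` instead of `NF` in the `END` block avoids relying on `NF` keeping its value after the last record.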
Test with your data:
kent$ cat f
a1
  b1
    c1
    c2
    c3
  b2
    c1
      d1
      d2
  b3
  b4
a2
a3
  b1
  b2
    c1
    c2
kent$ awk -F'  ' 'NF<=p{for(i=1;i<=p;i++)printf "%s%s", a[i],(i==p?RS:"/")
    if(NF<p)for(i=NF;i<=p;i++) delete a[i]}
{a[NF] =$NF;p=NF }END{for(i=1;i<=NF;i++)printf "%s%s", a[i],(i==NF?RS:"/")} ' f
a1/b1/c1
a1/b1/c2
a1/b1/c3
a1/b2/c1/d1
a1/b2/c1/d2
a1/b3
a1/b4
a2
a3/b1
a3/b2/c1
a3/b2/c2
I recently had to do something similar enough that with a few tweaks I can post my script here:
#!/bin/bash
prev_level=-1
# Index into node array
i=0
# Regex to screen-scrape all nodes
tc_re="^(( )*)(.*)$"
while IFS= read -r ln; do
    if [[ $ln =~ $tc_re ]]; then
        # folder level indicated by spaces in preceding node name
        spaces=${#BASH_REMATCH[1]}
        # 2 space characters per level
        level=$(($spaces / 2))
        # Name of the folder or node
        node=${BASH_REMATCH[3]}
        # get the rest of the node path from the previous entry
        curpath=( ${curpath[@]:0:$level} $node )
        # increment i only if the current level is <= the level of the previous
        # entry
        if [ $level -le $prev_level ]; then
            ((i++))
        fi
        # add this entry (overwrite previous if $i was not incremented)
        tc[$i]="${curpath[@]}"
        # save level for next iteration
        prev_level=$level
    fi
done
for p in "${tc[@]}"; do
    echo "${p// //}"
done
Input is taken from STDIN, so you'd have to do something like this:
$ ./tree2path.sh < ifile.tree
a1/b1/c1
a1/b1/c2
a1/b1/c3
a1/b2/c1/d1
a1/b2/c1/d2
a1/b3
a1/b4
a2
a3/b1
a3/b2/c1
a3/b2/c2
$
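For large inputs it may be preferable to print each path as soon as it is known to be a leaf, rather than collecting everything in an array first. The following is a minimal sketch of that variant, not the script above: `tree2paths_stream` and `print_path` are hypothetical names, and it assumes 2-space indentation and node names without `/`.

```shell
#!/bin/bash
# Sketch only: `print_path` and `tree2paths_stream` are made-up names.
# Assumes 2 spaces per level and node names that contain no "/".

# Join all arguments with "/" and print them on one line.
print_path() { local IFS=/; printf '%s\n' "$*"; }

tree2paths_stream() {
    local -a path=()
    local prev=-1 line indent level
    # The `|| [[ -n $line ]]` part also handles a final line without a newline.
    while IFS= read -r line || [[ -n $line ]]; do
        [[ -z ${line// /} ]] && continue    # skip blank lines
        indent=${line%%[! ]*}               # keep only the leading spaces
        level=$(( ${#indent} / 2 ))
        # Not deeper than the previous line: the previous line was a leaf.
        if (( prev >= 0 && level <= prev )); then
            print_path "${path[@]}"
        fi
        # Truncate the stack to the current depth, then push this node.
        path=( "${path[@]:0:level}" "${line:${#indent}}" )
        prev=$level
    done
    # The last line of the input is always a leaf.
    if (( prev >= 0 )); then
        print_path "${path[@]}"
    fi
}
```

Usage would be along the lines of `tree2paths_stream < ifile.tree`. Because the array elements are quoted, names containing single spaces should survive intact, which is one difference from joining on spaces and replacing them afterwards.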