
Extracting directory name from an absolute path using sed or awk

Tags: bash, sed, awk

I want to trim this path

/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh

to

/home/edwprod/abortive_visit/bin

using sed or awk. Could you help with this?

AruM asked Dec 19 '11 10:12


6 Answers

dirname

kent$  dirname "/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh"
/home/edwprod/abortive_visit/bin

sed

kent$  echo "/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh"|sed 's#/[^/]*$##'
/home/edwprod/abortive_visit/bin

grep

kent$  echo "/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh"|grep  -oP '^/.*(?=/)'
/home/edwprod/abortive_visit/bin

awk

kent$  echo "/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh"|awk -F'/[^/]*$' '{print $1}'
/home/edwprod/abortive_visit/bin
Kent answered Oct 21 '22 21:10


Maybe the dirname command is what you are looking for?

dirname /home/edwprod/abortive_visit/bin/abortive_proc_call.ksh

Or, if you want sed, here is my solution:

echo /home/edwprod/abortive_visit/bin/abortive_proc_call.ksh | sed 's/\(.*\)\/.*/\1/'
4ndrew answered Oct 21 '22 21:10


dirname is now available on most platforms and in most Unix/Linux shells:

dirname /home/edwprod/abortive_visit/bin/abortive_proc_call.ksh

Using dirname is the simplest way, but it is not recommended for cross-platform scripting, for example in the latest version of the autoconf documentation: http://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.69/html_node/Limitations-of-Usual-Tools.html#Limitations-of-Usual-Tools

So here is my full-featured sed-based alternative to dirname:

str="/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh"
echo "$str" | sed -n -e '1p' | sed -e 's#//*#/#g' -e 's#\(.\)/$#\1#' -e 's#^[^/]*$#.#' -e 's#\(.\)/[^/]*$#\1#'

Examples:

It works much like dirname:

  • For a path like /aa/bb/cc it will print /aa/bb
  • For a path like /aa/bb it will print /aa
  • For a path like /aa/bb/ it will print /aa too.
  • For a path like /aa/ it will print /aa (dirname itself prints / here)
  • For a path like / it will print /
  • For a path like aa it will print .
  • For a path like aa/ it will print .

That is:

  • It works correctly with a trailing /
  • It works correctly with paths that contain only a base name, like aa and aa/
  • It works correctly with paths starting with / and with the path / itself.
  • It works correctly whether or not $str ends with \n, even with several \n
  • It uses the cross-platform sed command
  • It collapses every run of / (// ///) to a single /
  • It cannot work correctly with paths that contain newlines or characters invalid in the current locale.
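
The cases listed above can be checked with a small wrapper. This is only a sketch: the function name sed_dirname is made up for illustration, and it assumes sed reads standard input when no file operand is given.

```shell
#!/bin/sh
# sed_dirname is a hypothetical helper name, not part of any tool.
# It wraps the sed expressions from this answer; sed reads standard input
# here because no file operand is passed.
sed_dirname() {
    printf '%s\n' "$1" |
        sed -n -e '1p' |
        sed -e 's#//*#/#g' \
            -e 's#\(.\)/$#\1#' \
            -e 's#^[^/]*$#.#' \
            -e 's#\(.\)/[^/]*$#\1#'
}

# Print each sample path from the list next to the computed result.
for p in /aa/bb/cc /aa/bb /aa/bb/ /aa/ / aa aa/; do
    printf '%-10s -> %s\n' "$p" "$(sed_dirname "$p")"
done
```

The four expressions run in order: collapse repeated slashes, drop one trailing slash (unless the path is just /), turn a slash-free name into ., and finally cut the last component.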

Note: an alternative to basename may also be useful:

echo "$str" | awk -F"/" '{print $NF}' -
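
A quick way to try that basename alternative is sketched below. The name awk_basename is hypothetical, and the trailing-slash case is included because the last field is empty there, which differs from what basename itself prints.

```shell
#!/bin/sh
# awk_basename is a hypothetical wrapper name used only for this sketch.
# It prints the last "/"-separated field of its argument.
awk_basename() {
    printf '%s\n' "$1" | awk -F"/" '{print $NF}'
}

# Ordinary path: the last field is the file name.
awk_basename /home/edwprod/abortive_visit/bin/abortive_proc_call.ksh

# Trailing slash: the last field is empty, so an empty line is printed.
awk_basename /home/edwprod/abortive_visit/bin/
```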
Роман Коптев answered Oct 21 '22 20:10


awk with a for loop:

echo "/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh" | awk 'BEGIN{res=""; FS="/";}{ for(i=2;i<=NF-1;i++) res=(res"/"$i);} END{print res}'
wangzhengyi answered Oct 21 '22 20:10


I believe this awk code behaves the same as dirname.

It is simple and cheap to run. Good luck.

Code

$ foo=/app/java/jdk1.7.0_71/bin/java
$ echo "$foo" | awk -F "/*[^/]*/*$" '
{ print ($1 == "" ? (substr($0, 1, 1) == "/" ? "/" : ".") : $1); }'

Result

/app/java/jdk1.7.0_71/bin

Test

  • foo=/app/java/jdk1.7.0_71/bin/java -> /app/java/jdk1.7.0_71/bin
  • foo=/app/java/jdk1.7.0_71/bin/ -> /app/java/jdk1.7.0_71
  • foo=/app/java/jdk1.7.0_71/bin -> /app/java/jdk1.7.0_71
  • foo=/app/ -> /
  • foo=/app -> /
  • foo=fighters/ -> .

More

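If your awk does not support such a field separator, try it this way (gensub requires GNU awk).

$ echo "$foo" | awk '{
 # Remove the last path component, with any slashes around it (first match only).
 dirname = gensub("/*[^/]*/*$", "", 1, $0);
 print (dirname == "" ? (substr($0, 1, 1) == "/" ? "/" : ".") : dirname);
 }'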
caret answered Oct 21 '22 20:10


In addition to Kent's answer, an alternative awk solution is:

awk 'BEGIN{FS=OFS="/"}{NF--}1'

which has the same weakness as the one presented by Kent. The following, somewhat longer awk script corrects all the flaws:

awk 'BEGIN{FS=OFS="/"}{gsub("/+","/")}
     {s=$0~/^\//;NF-=$NF?1:2;$0=$0?$0:(s?"/":".")};1' <file>

The following table shows the difference:

| path       | dirname | awk full | awk short |
|------------+---------+----------+-----------|
| .          | .       | .        |           |
| /          | /       | /        |           |
| foo        | .       | .        |           |
| foo/       | .       | .        | foo       |
| foo/bar    | foo     | foo      | foo       |
| foo/bar/   | foo     | foo      | foo/bar   |
| /foo       | /       | /        |           |
| /foo/      | /       | /        | /foo      |
| /foo/bar   | /foo    | /foo     | /foo      |
| /foo/bar/  | /foo    | /foo     | /foo/bar  |
| /foo///bar | /foo    | /foo     | /foo//    |

Note: dirname is the real way to go, unless you have to process a large number of paths stored in a file.
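
As a rough sketch of that file-based case, the longer awk program can be run once over a whole list of paths, whereas dirname needs one process per line. The scratch file and the function name awk_dirnames are assumptions made up for this example.

```shell
#!/bin/sh
# Hypothetical scratch file holding one path per line.
paths_file=$(mktemp)
printf '%s\n' foo/bar /foo/bar/ /foo '/foo///bar' foo > "$paths_file"

# awk_dirnames is a made-up name; the awk program is the longer one
# from this answer, applied to the file given as the first argument.
awk_dirnames() {
    awk 'BEGIN{FS=OFS="/"}{gsub("/+","/")}
         {s=$0~/^\//;NF-=$NF?1:2;$0=$0?$0:(s?"/":".")};1' "$1"
}

# A single awk process handles every line of the file.
awk_dirnames "$paths_file"

# The same job with dirname starts one process per path.
while IFS= read -r p; do
    dirname "$p"
done < "$paths_file"

rm -f "$paths_file"
```

Both passes should print the same list for these sample paths. The awk version relies on assigning to NF to rebuild the record, which is an assumption about the awk in use.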

kvantour answered Oct 21 '22 20:10