
Bash split string on delimiter, assign segments to array

Tags:

bash

In bash, I would like to turn a PATH-like environment variable, whose colon-separated elements may contain spaces, into an array, making sure that elements containing spaces are not word-split into multiple elements.


Let PATH_VARIABLE be the variable in question.

Let un:dodecaedro:per:tirare:per:i danni be the content of the variable.

The desired array should have 6 elements, not 7:

0) un
1) dodecaedro
2) per
3) tirare
4) per
5) i danni

The tricky entry is the one that contains a space: i danni.

I am looking for the absolute most elegant and correct way to achieve this.

Limitation: it must work with my bash version: v3.2.48(1)-release


In Python this is done beautifully, like so:

>>> v='un:dodecaedro:per:tirare:per:i danni'
>>> len(v.split(':'))
6

Works. Shows what I am looking for.


What's the best way to do this in our beloved bash?

Can you specifically improve on my attempt 4?

Here are my attempts:


#!/bin/bash

PATH_VARIABLE='un:dodecaedro:per:tirare:per:i danni'

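# WRONG: the unquoted $(...) is word-split on all whitespace, so "i danni" becomes two elements
a1=($(echo $PATH_VARIABLE | tr ':' '\n'))

# WRONG: same problem; the loop's output is still word-split by the unquoted $(...)
a2=($(
  while read path_component; do
    echo "$path_component"
  done < <(echo "$PATH_VARIABLE" | tr ':' '\n')
))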

# WORKS, it is elegant.. but I have no bash 4!
# readarray -t a3 < <(echo "$PATH_VARIABLE" | tr ':' '\n')

# WORKS, but it looks "clunky" to me :(
i=0
while read line; do
  a4[i++]=$line
done < <(echo "$PATH_VARIABLE" | tr ':' '\n')

n=${#a4[@]}
for ((i=0; i < n; i++)); do
  printf '%2d) %s\n' "$i" "${a4[i]}"
done

My environment

bash v3.2.48(1)-release

OS X v10.8.3 (build 12D78)


Robottinosino asked Apr 03 '13 02:04


2 Answers

Consider:

$ foo='1:2 3:4 5:6'
$ IFS=':'; arr=($foo)
$ echo "${arr[0]}"
1
$ echo "${arr[1]}"
2 3
$ echo "${arr[2]}"
4 5
$ echo "${arr[3]}"
6

Oh well - took me too long to format an answer... +1 @kojiro.
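One side effect of the snippet above is that `IFS` stays set to `:` for the rest of the shell session, and the unquoted `$foo` is also subject to pathname expansion. A minimal sketch of one way to contain both effects follows. It assumes a helper named `split_colon`, which is hypothetical and not part of the answer, and it uses only features available in bash 3.2.

```shell
#!/bin/bash
# split_colon is a hypothetical helper name used only for this illustration.
# It fills the global array "arr" with the ':'-separated fields of its argument.
split_colon() {
    local IFS=':'   # the change is undone automatically when the function returns
    set -f          # disable pathname expansion while the value is expanded unquoted
    arr=($1)        # deliberate unquoted expansion: split on ':' only
    set +f          # note: re-enables globbing unconditionally
}

foo='1:2 3:4 5:6'
split_colon "$foo"

printf '%d elements\n' "${#arr[@]}"
printf '[%s]\n' "${arr[@]}"
```

With the sample value this gives four elements, the second being `2 3`, and `IFS` outside the function keeps its previous value.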

jim mcnamara answered Sep 20 '22 23:09


# Right. Add -d '' if PATH members may contain newlines.
IFS=: read -ra myPath <<<"$PATH"

# Wrong!
IFS=: myPath=($PATH)

# Wrong!
IFS=:
for x in $PATH; do ...

# How to do it wrong right...
# Works around some but not all word split problems
# For portability, some extra wrappers are needed and it's even harder.
function stupidSplit {
    if [[ -z $3 ]]; then
        return 1
    elif [[ $- != *f* ]]; then
        trap 'trap RETURN; set +f' RETURN
        set -f
    fi
    IFS=$3 command eval "${1}=(\$${2})"
}

function main {
    typeset -a myPath
    if ! stupidSplit myPath PATH :; then
        echo "Don't pass stupid stuff to stupidSplit" >&2
        return 1
    fi
}

main
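Applied to the variable from the question, the `read -ra` form at the top of this answer might look like the sketch below. The array name `parts` is an arbitrary choice for illustration; everything used here should be available in bash 3.2.

```shell
#!/bin/bash
PATH_VARIABLE='un:dodecaedro:per:tirare:per:i danni'

# IFS=: applies only to this one read command; the shell's IFS is untouched.
# -r keeps backslashes literal, -a stores the resulting fields in an array.
IFS=: read -r -a parts <<<"$PATH_VARIABLE"

# Print each index and element, in the same format the question uses.
for i in "${!parts[@]}"; do
    printf '%2d) %s\n' "$i" "${parts[i]}"
done
```

This yields 6 elements, with `i danni` kept intact as the last one. No subshell, `tr`, or explicit loop counter is needed to build the array.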

Rule #1: Don't cram a compound data structure into a string or stream unless there's no alternative. PATH is one case where you have to deal with it.

Rule #2: Avoid word / field splitting at all costs. There are almost no legitimate reasons to apply word splitting on the value of a parameter in non-minimalist shells such as Bash. Almost all beginner pitfalls can be avoided by just never word splitting with IFS. Always quote.
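To make the hazard behind Rule #2 concrete, here is a small sketch comparing unquoted word splitting with `read -a` on a value that contains a glob character. The value `a:*:b` and the scratch directory are assumptions made up for this demonstration.

```shell
#!/bin/bash
# Work in a scratch directory so the result of the glob is predictable.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
touch file1 file2

value='a:*:b'   # hypothetical value containing a glob character

# Unquoted expansion: after splitting on ':', the '*' field undergoes
# pathname expansion against the current directory.
oldIFS=$IFS
IFS=:
split=($value)
IFS=$oldIFS

# read -a splits on IFS but never performs pathname expansion.
IFS=: read -r -a safe <<<"$value"

printf 'unquoted: %d elements\n' "${#split[@]}"
printf 'read -a:  %d elements\n' "${#safe[@]}"

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

In this setup the unquoted expansion produces four elements (`a`, `file1`, `file2`, `b`), while `read -a` keeps the literal `*` and produces three.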

ormaaj answered Sep 20 '22 23:09