read in bash on whitespace-delimited file without empty fields collapsing

Question

I'm trying to read a multi-line tab-separated file in bash. The format is such that empty fields are expected. Unfortunately, the shell collapses adjacent field separators, like so:

# IFS=$'\t'
# read one two three <<<$'one\t\tthree'
# printf '<%s> ' "$one" "$two" "$three"; printf '\n'
<one> <three> <>

...as opposed to the desired output of <one> <> <three>.

Can this be resolved without resorting to a separate language (such as awk)?

Dennis Williamson · Accepted Answer

It's not necessary to use tr, but it is necessary that IFS is a non-whitespace character (otherwise multiples get collapsed to singles as you've seen).

$ IFS=, read -r one two three <<<'one,,three'
$ printf '<%s> ' "$one" "$two" "$three"; printf '\n'
<one> <> <three>

$ var=$'one\t\tthree'
$ var=${var//$'\t'/,}
$ IFS=, read -r one two three <<< "$var"
$ printf '<%s> ' "$one" "$two" "$three"; printf '\n'
<one> <> <three>

$ idel=$'\t' odel=','
$ var=$'one\t\tthree'
$ var=${var//$idel/$odel}
$ IFS=$odel read -r one two three <<< "$var"
$ printf '<%s> ' "$one" "$two" "$three"; printf '\n'
<one> <> <three>
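Since the question mentions a multi-line file, the substitute-then-split idea above can be sketched as a loop. This is a minimal sketch, not part of the original answer: the helper name print_tsv_fields and the sample rows are made up for illustration, and it assumes the unit separator byte (0x1f) never occurs in the data, so it is safe to use as the stand-in delimiter.

```bash
#!/usr/bin/env bash
# Assumption: 0x1f (ASCII unit separator) never appears in the input.
sep=$'\x1f'

# print_tsv_fields (hypothetical helper): read tab-separated rows from stdin
# and print the first three fields of each row, keeping empty fields.
print_tsv_fields() {
    local line one two three
    # IFS= keeps leading tabs intact; || [[ -n $line ]] handles a final
    # line that lacks a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line//$'\t'/$sep}                 # tab -> non-whitespace delimiter
        IFS=$sep read -r one two three <<<"$line"
        printf '<%s> <%s> <%s>\n' "$one" "$two" "$three"
    done
}

print_tsv_fields <<<$'a\t\tc\n\tb\t\nx\ty\tz'
# <a> <> <c>
# <> <b> <>
# <x> <y> <z>
```

A comma works just as well as the stand-in delimiter when the data is known not to contain commas; the control character is only a guess at something less likely to collide with real field contents.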

DigitalRoss · Answer

Sure

IFS=,
echo $'one\t\tthree' | tr '\11' , | (
  read one two three
  printf '<%s> ' "$one" "$two" "$three"; printf '\n'
)

I've rearranged the example just a bit, but only to make it work in any POSIX shell.

Update: Yeah, it seems that white space is special, at least if it's in IFS. See the second half of this paragraph from bash(1):

   The shell treats each character of IFS as a delimiter, and  splits  the
   results of the other expansions into words on these characters.  If IFS
   is unset, or its value is exactly <space><tab><newline>,  the  default,
   then  any  sequence  of IFS characters serves to delimit words.  If IFS
   has a value other than the default, then sequences  of  the  whitespace
   characters  space  and  tab are ignored at the beginning and end of the
   word, as long as the whitespace character is in the value  of  IFS  (an
   IFS whitespace character).  Any character in IFS that is not IFS white-
   space, along with any adjacent IFS whitespace  characters,  delimits  a
   field.   A  sequence  of IFS whitespace characters is also treated as a
   delimiter.  If the value of IFS is null, no word splitting occurs.
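The rule quoted from the manual can be checked directly by splitting the same kind of input once with an IFS whitespace character and once with a non-whitespace one. The following is a small sketch for illustration; show_fields is a hypothetical helper, not something from bash itself.

```bash
#!/usr/bin/env bash
# show_fields DELIM STRING (hypothetical helper): split STRING with IFS set to
# DELIM, then print the number of fields followed by each field in <>.
show_fields() {
    local -a fields
    IFS=$1 read -r -a fields <<<"$2"
    printf '%d:' "${#fields[@]}"
    printf ' <%s>' "${fields[@]}"
    printf '\n'
}

show_fields $'\t' $'one\t\tthree'   # tab is IFS whitespace: the run collapses
# 2: <one> <three>
show_fields ','   'one,,three'      # comma is not: the empty field survives
# 3: <one> <> <three>
```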

Charles Duffy · Answer

I've written a function which works around this issue. This implementation is specific to tab-separated columns and newline-separated rows, but that limitation could be removed as a straightforward exercise:

read_tdf_line() {
    local default_ifs=$' \t\n'
    local n line element at_end old_ifs
    old_ifs="${IFS:-${default_ifs}}"
    IFS=$'\n'

    if ! read -r line ; then
        IFS="$old_ifs"  # restore IFS before the early return
        return 1
    fi
    at_end=0
    while read -r element; do
        if (( $# > 1 )); then
            printf -v "$1" '%s' "$element"
            shift
        else
            if (( at_end )) ; then
                # replicate read behavior of assigning all excess content
                # to the last variable given on the command line
                printf -v "$1" '%s\t%s' "${!1}" "$element"
            else
                printf -v "$1" '%s' "$element"
                at_end=1
            fi
        fi
    done < <(tr '\t' '\n' <<<"$line")

    # if other arguments exist on the end of the line after all
    # input has been eaten, they need to be blanked
    if ! (( at_end )) ; then
        while (( $# )) ; do
            printf -v "$1" '%s' ''
            shift
        done
    fi

    # reset IFS to its original value (or the default, if it was
    # formerly unset)
    IFS="$old_ifs"
}

Usage as follows:

# read_tdf_line one two three rest <<<$'one\t\tthree\tfour\tfive'
# printf '<%s> ' "$one" "$two" "$three" "$rest"; printf '\n'
<one> <> <three> <four       five>
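One possible way to drop the tab-only limitation is to split with parameter expansion instead of IFS, which also avoids saving and restoring IFS. This is a rough sketch rather than a drop-in replacement for read_tdf_line: the function name split_fields and the global array SPLIT_FIELDS are invented here, and it handles a single line that has already been read.

```bash
#!/usr/bin/env bash
# split_fields DELIM STRING (hypothetical helper): split STRING on every
# occurrence of DELIM and store the pieces, including empty ones, in the
# global array SPLIT_FIELDS. IFS is never touched.
split_fields() {
    local delim=$1 rest=$2
    SPLIT_FIELDS=()
    while [[ $rest == *"$delim"* ]]; do
        SPLIT_FIELDS+=("${rest%%"$delim"*}")   # text before the first delimiter
        rest=${rest#*"$delim"}                 # drop that text and the delimiter
    done
    SPLIT_FIELDS+=("$rest")                    # last field (may be empty)
}

split_fields $'\t' $'one\t\tthree\tfour'
printf '<%s> ' "${SPLIT_FIELDS[@]}"; printf '\n'
# <one> <> <three> <four> 
```

Because the delimiter is quoted inside the patterns, it is matched literally, so multi-character delimiters and characters such as * should behave as plain text. An empty input string yields a single empty field.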
