How can I extract the first two characters of a string in shell scripting?


Probably the most efficient method, if you're using the bash shell (and you appear to be, based on your comments), is to use the sub-string variant of parameter expansion:

pax> long="USCAGol.blah.blah.blah"
pax> short="${long:0:2}" ; echo "${short}"
US

This will set short to be the first two characters of long. If long is shorter than two characters, short will be identical to it.

This in-shell method is usually better if you're going to be doing it a lot (like 50,000 times per report as you mention) since there's no process creation overhead. All solutions which use external programs will suffer from that overhead.
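As a rough sketch of what the in-shell approach looks like when applied to many records, assuming they arrive one per line on standard input (the function name first_two_of_each and the sample data are made up for illustration):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first two characters of every input line
# using only Bash parameter expansion, so no external process is started
# for each line.
first_two_of_each() {
    local line
    # IFS= and -r keep leading whitespace and backslashes intact; the
    # extra test handles a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line:0:2}"
    done
}

# Sample data, invented for this example.
result=$(printf '%s\n' "USCAGol.blah" "GBLondon.blah" "A" | first_two_of_each)
printf '%s\n' "$result"
```

The one-character record comes back unchanged, matching the behaviour described above for strings shorter than two characters.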

If you also wanted to ensure a minimum length, you could pad it out beforehand with something like:

pax> long="A"
pax> tmpstr="${long}.."
pax> short="${tmpstr:0:2}" ; echo "${short}"
A.

This would ensure that anything less than two characters in length was padded on the right with periods (or something else, just by changing the character used when creating tmpstr). It's not clear that you need this but I thought I'd put it in for completeness.
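If the padding is needed in more than one place, it could be wrapped in a small function. This is only a sketch: the name first_two_padded and its optional second argument are hypothetical, and the pad is assumed to be a single character.

```shell
#!/usr/bin/env bash
# Hypothetical helper: first two characters of $1, padded on the right
# with the single character given in $2 (a period if $2 is missing or empty).
first_two_padded() {
    local str=$1 pad=${2:-.}
    # Append two pad characters so the result is always at least two long.
    local tmpstr="${str}${pad}${pad}"
    printf '%s\n' "${tmpstr:0:2}"
}

first_two_padded "A"          # A.
first_two_padded "" "_"       # __
first_two_padded "USCAGol"    # US
```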


Having said that, there are any number of ways to do this with external programs (such as if you don't have bash available to you), some of which are:

short=$(echo "${long}" | cut -c1-2)
short=$(echo "${long}" | head -c2)
short=$(echo "${long}" | awk '{print substr($0, 1, 2)}')
short=$(echo "${long}" | sed 's/^\(..\).*/\1/')

The first two (cut and head) are identical for a single-line string: both simply give you back the first two characters. They differ in that cut gives you the first two characters of each line, whereas head gives you the first two characters (strictly, bytes) of the entire input.

The third one uses the awk sub-string function to extract the first two characters (awk numbers characters from 1, and the result of a start position of 0 is not portable across implementations). The fourth uses sed capture groups (using \(\) and \1) to capture the first two characters and replace the entire line with them. They're both similar to cut: they deliver the first two characters of each line in the input.

None of that matters if you are sure your input is a single line; in that case they all have an identical effect.
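To see the per-line versus whole-input difference, here is a small sketch run on a two-line input. The sample text and variable names are invented for the example, and it assumes a head that supports -c, which is common but not required by POSIX.

```shell
# Two-line sample input, invented for this example.
input=$(printf 'abc\ndef')

# cut, awk and sed work line by line; head -c counts from the start of
# the whole input.
by_cut=$(printf '%s\n' "$input" | cut -c1-2)
by_head=$(printf '%s\n' "$input" | head -c2)
by_awk=$(printf '%s\n' "$input" | awk '{print substr($0, 1, 2)}')
by_sed=$(printf '%s\n' "$input" | sed 's/^\(..\).*/\1/')

printf 'cut:\n%s\n'  "$by_cut"    # ab and de, one per line
printf 'head:\n%s\n' "$by_head"   # ab only
printf 'awk:\n%s\n'  "$by_awk"    # ab and de, one per line
printf 'sed:\n%s\n'  "$by_sed"    # ab and de, one per line
```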


The easiest way is:

${string:position:length}

This extracts a substring of $length characters from $string, starting at $position (positions count from 0).

This is built into Bash itself, so awk or sed is not required.
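A short illustration of how position and length interact, using a made-up sample value; positions are counted from 0, and leaving out the length takes everything to the end of the string:

```shell
#!/usr/bin/env bash
string="USCAGoleta"       # sample value, invented for this example

country="${string:0:2}"   # position 0, length 2
state="${string:2:2}"     # position 2, length 2
city="${string:4}"        # position 4, length omitted: rest of the string

printf '%s %s %s\n' "$country" "$state" "$city"   # US CA Goleta
```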


You've gotten several good answers and I'd go with the Bash builtin myself, but since you asked about sed and awk and (almost) no one else offered solutions based on them, I offer you these:

echo "USCAGoleta9311734.5021-120.1287855805" | awk '{print substr($0,1,2)}'

and

echo "USCAGoleta9311734.5021-120.1287855805" | sed 's/\(^..\).*/\1/'

The awk one ought to be fairly obvious, but here's an explanation of the sed one:

  • substitute "s/"
  • the group "()" of two of any characters ".." starting at the beginning of the line "^" and followed by any character "." repeated zero or more times "*" (the backslashes are needed to escape some of the special characters)
  • by "/" the contents of the first (and only, in this case) group (here the backslash is a special escape referring to a matching sub-expression)
  • done "/"
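The pieces described above can be tried out as follows. This is a sketch only: the variable names are invented, the anchor is written outside the group, and the second command assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do), where the group needs no backslashes.

```shell
sample="USCAGoleta9311734.5021-120.1287855805"

# Basic regular expression: the group is written \( \).
bre=$(printf '%s\n' "$sample" | sed 's/^\(..\).*/\1/')

# Extended regular expression (-E): the group is written ( ) unescaped.
ere=$(printf '%s\n' "$sample" | sed -E 's/^(..).*/\1/')

# A one-character line has no two-character prefix to match, so the
# substitution does not apply and the line passes through unchanged.
short_line=$(printf '%s\n' "A" | sed 's/^\(..\).*/\1/')

printf '%s %s %s\n' "$bre" "$ere" "$short_line"   # US US A
```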

Just grep (the -P option requires GNU grep):

echo 'abcdef' | grep -Po "^.."        # ab

If you're in bash, you can say:

bash-3.2$ var=abcd
bash-3.2$ echo ${var:0:2}
ab

This may be just what you need…