
Using regular expressions in shell script


What is the correct way to parse a string using regular expressions in a Linux shell script? I wrote the following script to print my SO rep on the console using curl and sed (not solely because I'm rep-crazy - I'm trying to learn some shell scripting and regex before switching to Linux).

json=$(curl -s http://stackoverflow.com/users/flair/165297.json)
echo "$json" | sed 's/.*"reputation":"\([0-9,]\{1,\}\)".*/\1/' | sed 's/,//'

But somehow I feel that sed is not the proper tool to use here. I heard that grep is all about regex and explored it a bit. But apparently it prints the whole line whenever a match is found - I am trying to extract a number from a single line of text. Here is a downsized version of the string that I'm working on (returned by curl).

{"displayName":"Amarghosh","reputation":"2,737","badgeHtml":"\u003cspan title=\"1 silver badge\"\u003e\u003cspan class=\"badge2\"\u003e●\u003c/span\u003e\u003cspan class=\"badgecount\"\u003e1\u003c/span\u003e\u003c/span\u003e"}

I guess my questions are:

  • What is the correct way to parse a string using regular expressions in a Linux shell script?
  • Is sed the right thing to use here?
  • Could this be done using grep?
  • Is there any other command that's easier or more appropriate?
asked by Amarghosh, Oct 28 '09 10:10




1 Answer

The grep command will select the desired line(s) from many but it will not directly manipulate the line. For that, you use sed in a pipeline:

someCommand | grep 'Amarghosh' | sed -e 's/foo/bar/g' 
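On the "could this be done using grep?" part: grep cannot rewrite a line, but GNU and BSD grep accept -o, which prints only the matched text instead of the whole line. That flag is not in POSIX, so take this as a rough sketch that assumes such a grep is installed. The json variable is a shortened copy of the sample line from the question, standing in for the curl output.

```shell
# Shortened sample line from the question; stands in for the curl output.
json='{"displayName":"Amarghosh","reputation":"2,737","badgeHtml":""}'

# -o (GNU/BSD extension, not POSIX) prints only the matching text.
# The first grep isolates the key with its quoted value, the second pulls
# out the run of digits and commas, and tr drops the thousands separators.
rep=$(printf '%s\n' "$json" \
  | grep -o '"reputation":"[0-9,]*"' \
  | grep -o '[0-9][0-9,]*' \
  | tr -d ',')

echo "$rep"
```

It still takes a second tool (tr here) to strip the comma, which is why sed or awk tends to be a better fit once you need to change the text rather than just find it.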

Alternatively, awk (or perl if available) can be used. It's a far more powerful text processing tool than sed in my opinion.

someCommand | awk '/Amarghosh/ { do something }' 
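To make the "do something" placeholder concrete, here is one possible way to fill it in for the string from the question. It is only a sketch: the json variable is a shortened copy of the sample line, and the choice of match(), substr() and gsub() is mine, not something the question requires.

```shell
# Shortened sample line from the question; stands in for the curl output.
json='{"displayName":"Amarghosh","reputation":"2,737","badgeHtml":""}'

rep=$(printf '%s\n' "$json" | awk '/Amarghosh/ {
  # match() sets RSTART and RLENGTH for the first occurrence of the pattern.
  if (match($0, /"reputation":"[0-9,]+"/)) {
    value = substr($0, RSTART, RLENGTH)   # "reputation":"2,737"
    gsub(/[^0-9]/, "", value)             # keep the digits only
    print value
  }
}')

echo "$rep"
```

The line selection (/Amarghosh/) and the extraction happen in a single process, so no separate grep is needed.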

For simple text manipulations, just stick with the grep/sed combo. When you need more complicated processing, move on up to awk or perl.

My first thought is to just use:

echo '{"displayName":"Amarghosh","reputation":"2,737","badgeHtml"' | sed -e 's/.*tion":"//' -e 's/".*//' -e 's/,//g'

which keeps the number of sed processes to one (you can give multiple commands with -e).
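If you want to reuse this, the same single-process idea can be wrapped in a small function. This is a sketch under a few assumptions: extract_rep is a hypothetical name, the input arrives on stdin with the whole JSON object on one line (as in the sample), and -n with an explicit p is used so that lines without a reputation field print nothing.

```shell
# Hypothetical helper: reads the flair JSON on stdin, prints the bare number.
extract_rep() {
  sed -n \
    -e '/"reputation":"[0-9,]*"/!d' \
    -e 's/.*"reputation":"\([0-9,]*\)".*/\1/' \
    -e 's/,//g' \
    -e 'p'
}

# Shortened sample line from the question; stands in for the curl output.
json='{"displayName":"Amarghosh","reputation":"2,737","badgeHtml":""}'
rep=$(printf '%s\n' "$json" | extract_rep)
echo "$rep"
```

With live data you would pipe the curl command from the question into extract_rep instead of the sample string.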

answered by paxdiablo, Oct 22 '22 08:10