Bash Script Regular Expressions...How to find and replace all matches?

I am writing a bash script that reads a file line by line.

The file is a .csv file which contains many dates in the format DD/MM/YYYY but I would like to change them to YYYY-MM-DD.

I would like to match the dates using a regular expression and replace them so that every date in the file is formatted as YYYY-MM-DD.

I believe this regular expression would match the dates:

([0-9][0-9]?)/([0-9][0-9]?)/([0-9][0-9][0-9][0-9])

But I do not know how to find regex matches and replace them with the new format, or if this is even possible in a bash script. Please help!

Josh asked Apr 14 '11 03:04


3 Answers

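Pure Bash:

infile='data.csv'
# Matches one comma-delimited date per line; groups 2-4 are day, month, year.
re='^(.*),([0-9]{1,2})/([0-9]{1,2})/([0-9]{4}),(.*)$'

# IFS= and -r keep whitespace and backslashes in each line intact.
while IFS= read -r line ; do
  if [[ $line =~ $re ]] ; then
    # Reassemble the line with the date rewritten as YYYY-MM-DD.
    printf '%s\n' "${BASH_REMATCH[1]},${BASH_REMATCH[4]}-${BASH_REMATCH[3]}-${BASH_REMATCH[2]},${BASH_REMATCH[5]}"
  else
    printf '%s\n' "$line"
  fi
done < "$infile"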

The input file

xxxxxxxxx,11/03/2011,yyyyyyyyyyyyy
xxxxxxxxx,10/04/2011,yyyyyyyyyyyyy
xxxxxxxxx,10/05/2012,yyyyyyyyyyyyy
xxxxxxxxx,10/06/2011,yyyyyyyyyyyyy

gives the following output:

xxxxxxxxx,2011-03-11,yyyyyyyyyyyyy
xxxxxxxxx,2011-04-10,yyyyyyyyyyyyy
xxxxxxxxx,2012-05-10,yyyyyyyyyyyyy
xxxxxxxxx,2011-06-10,yyyyyyyyyyyyy
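
The anchored pattern above rewrites at most one date per line. If a line may contain several dates, one possible pure-Bash variant is to consume the line match by match. This is only a sketch: the function name convert_dates and the zero-padding of single-digit days and months are assumptions added for illustration, not part of the answer above.

```bash
#!/usr/bin/env bash
# Sketch: convert every DD/MM/YYYY on a line to YYYY-MM-DD using only Bash.
# convert_dates is a hypothetical helper name chosen for this example.

convert_dates() {
  local rest=$1 out=''
  local re='([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})'
  local match prefix day month

  # Each pass handles the leftmost remaining date, then continues after it.
  while [[ $rest =~ $re ]]; do
    match=${BASH_REMATCH[0]}
    prefix=${rest%%"$match"*}                 # text before the matched date
    # 10# forces base 10 so that "08" and "09" are not read as octal.
    printf -v day   '%02d' "$((10#${BASH_REMATCH[1]}))"
    printf -v month '%02d' "$((10#${BASH_REMATCH[2]}))"
    out+="${prefix}${BASH_REMATCH[3]}-${month}-${day}"
    rest=${rest#*"$match"}                    # text after the matched date
  done

  printf '%s\n' "${out}${rest}"
}

# Optional file mode: pass the CSV path as the first argument.
if [[ -n ${1-} ]]; then
  while IFS= read -r line || [[ -n $line ]]; do
    convert_dates "$line"
  done < "$1"
fi
```

Saved under a name of your choosing, it could be run as `bash convert.sh data.csv > converted.csv`; both file names here are placeholders.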
Fritz G. Mehner answered Sep 30 '22 12:09


Try this using sed:

line='Today is 10/12/2010 and yesterday was 9/11/2010'
echo "$line" | sed -r 's#([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})#\3-\2-\1#g'

OUTPUT:
Today is 2010-12-10 and yesterday was 2010-11-9

PS: On macOS (BSD sed), use sed -E instead of sed -r. GNU sed accepts -E as well, so -E is the more portable choice.
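
Note that the command above leaves a single-digit day as is (2010-11-9). If a strict YYYY-MM-DD result is wanted, one option is to zero-pad the day and month before swapping the fields. The following is a sketch under that assumption; to_iso_dates is a hypothetical helper name, and the rule that a date must not be preceded by another digit is an extra restriction introduced here.

```bash
#!/usr/bin/env bash
# Sketch: zero-pad, then reorder DD/MM/YYYY into YYYY-MM-DD with sed -E.
# to_iso_dates is a hypothetical helper: it reads stdin and writes stdout.

to_iso_dates() {
  # 1st expression: pad a single-digit day   (9/11/2010  -> 09/11/2010)
  # 2nd expression: pad a single-digit month (09/1/2010  -> 09/01/2010)
  # 3rd expression: reorder the padded date  (09/01/2010 -> 2010-01-09)
  # "\10" in a replacement is group 1 followed by a literal 0.
  sed -E \
    -e 's#(^|[^0-9])([0-9])/([0-9]{1,2})/([0-9]{4})#\10\2/\3/\4#g' \
    -e 's#(^|[^0-9])([0-9]{2})/([0-9])/([0-9]{4})#\1\2/0\3/\4#g' \
    -e 's#([0-9]{2})/([0-9]{2})/([0-9]{4})#\3-\2-\1#g'
}

# Example with the same sample line as above:
# echo 'Today is 10/12/2010 and yesterday was 9/11/2010' | to_iso_dates
```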

anubhava answered Sep 30 '22 11:09


You can do it using sed. The trailing g makes it replace every date on a line, not just the first:

echo "11/12/2011" | sed -E 's/([0-9][0-9]?)\/([0-9][0-9]?)\/([0-9][0-9][0-9][0-9])/\3-\2-\1/g'
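
Since the question is about a whole .csv file, the same substitution can be pointed at the file directly instead of at echo. This is a minimal sketch: convert_csv and both file names are placeholders made up for the example, and the output goes to a new file so the original stays untouched.

```bash
#!/usr/bin/env bash
# Sketch: run the date substitution over a whole file.
# convert_csv is a hypothetical wrapper; src and dst are caller-chosen paths.

convert_csv() {
  local src=$1 dst=$2
  # The g flag converts every date on each line, not only the first one.
  sed -E 's#([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})#\3-\2-\1#g' "$src" > "$dst"
}

# Example (placeholder file names):
# convert_csv data.csv data.iso.csv
```

Using # as the delimiter avoids having to escape the slashes inside the pattern.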
sverre answered Sep 30 '22 12:09