
What does grep -Po '...\K...' do? How else can that effect be achieved?

Tags: linux, bash, shell

I have this script script.sh:

#!/bin/bash
file_path=$1
result=$(grep -Po 'value="\K.*?(?=")' $file_path)
echo $result

and this file text.txt:

value="a"
value="b"
value="c"

When I run ./script.sh /file/directory/text.txt, the output in the terminal is:

a b c

I understand what the script does, but I don't understand HOW it works, so I need a detailed explanation of this part of the command:

-Po 'value="\K.*?(?=")'

If I understood correctly, \K is a Perl command. Can you give me an alternative in shell (for example, with awk)?


Thank you in advance.

Ordinary User asked Jun 13 '17 14:06


1 Answer

  • grep -P enables PCRE (Perl-compatible regular expression) syntax. This is a non-standard extension: not even all builds of GNU grep support it, because it depends on the optional libpcre library, and whether to link that in is a compile-time option.
  • grep -o prints only the matched text instead of the entire line containing it. (This too is nonstandard, though more widely available than -P.)
  • \K is a PCRE extension to regex syntax. Everything matched before \K must still be present in the input, but it is dropped from the reported match, so only the text matched after \K appears in the output.
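To see these pieces working together, here is a small sketch. It assumes a grep built with PCRE support; the function names are made up for illustration. The rest of the pattern works as follows: .*? is a non-greedy match (as few characters as possible), and (?=") is a lookahead that requires a closing quote without including it in the match. A lookbehind gives the same result as \K in this case.

```shell
#!/usr/bin/env bash
# Sample input mirroring text.txt from the question.
sample='value="a"
value="b"
value="c"'

# Without -o, grep prints every matching line in full.
whole_lines() { printf '%s\n' "$sample" | grep -P 'value="\K.*?(?=")'; }

# With -o, only the matched text is printed; \K drops value=" from it,
# and the lookahead (?=") keeps the closing quote out of it.
with_k() { printf '%s\n' "$sample" | grep -Po 'value="\K.*?(?=")'; }

# Equivalent here: a lookbehind in place of \K.
with_lookbehind() { printf '%s\n' "$sample" | grep -Po '(?<=value=").*?(?=")'; }

with_k            # one match per line: a, b, c

# The question's script prints "a b c" on a single line because the
# unquoted $result undergoes word splitting, which turns the newlines
# between matches into argument separators for echo.
result=$(with_k)
echo $result
```

Quoting the variable (echo "$result") would keep one match per line instead.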

Since your shell is bash, you have ERE (POSIX extended regular expression) support built in, through the =~ operator of [[ ]]. As an alternative that uses only built-in functionality (no external tools, whether grep, awk or otherwise):

#!/usr/bin/env bash
regex='value="([^"]*)"'                        # store regex (w/ match group) in a variable
results=( )                                    # define an empty array to store results
while IFS= read -r line || [[ -n $line ]]; do  # iterate over input lines (incl. an unterminated last one)
  if [[ $line =~ $regex ]]; then               # ...and, when one matches the regex...
    results+=( "${BASH_REMATCH[1]}" )          # ...put the group's contents in the array
  fi
done <"$1"                                     # with stdin coming from the file named in $1
printf '%s\n' "${results[*]}"                  # combine array results with spaces and print

See http://wiki.bash-hackers.org/syntax/ccmd/conditional_expression for a discussion of =~, and http://wiki.bash-hackers.org/syntax/shellvars#bash_rematch for a discussion of BASH_REMATCH. See BashFAQ #1 for a discussion of reading files line-by-line with a while read loop.
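Since the question asked specifically about awk, here is a hedged sketch of two alternatives using external tools that need no PCRE support. The function names extract_with_sed and extract_with_awk are made up for illustration. Both versions assume at most one value="..." per line: the sed version reports the last one on a line, the awk version the first.

```shell
#!/usr/bin/env bash
# sed: capture the text between the quotes and print only lines where
# the substitution succeeded (-n suppresses default output, p prints).
extract_with_sed() {
  sed -n 's/.*value="\([^"]*\)".*/\1/p' "$@"
}

# awk: match() sets RSTART and RLENGTH for the first value="..." on the line.
# Skip the 7 characters of value=" and drop the closing quote as well,
# so the extracted length is RLENGTH - 8.
extract_with_awk() {
  awk 'match($0, /value="[^"]*"/) { print substr($0, RSTART + 7, RLENGTH - 8) }' "$@"
}

# Both read the named files, or stdin when no file is given.
printf 'value="a"\nvalue="b"\nvalue="c"\n' | extract_with_awk
```

Called as extract_with_awk /file/directory/text.txt, either function prints one extracted value per line, just as grep -Po does.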

Charles Duffy answered Oct 19 '22 10:10