How to use sed to extract substring

I have a file containing the following lines:

  <parameter name="PortMappingEnabled" access="readWrite" type="xsd:boolean"></parameter>
  <parameter name="PortMappingLeaseDuration" access="readWrite" activeNotify="canDeny" type="xsd:unsignedInt"></parameter>
  <parameter name="RemoteHost" access="readWrite"></parameter>
  <parameter name="ExternalPort" access="readWrite" type="xsd:unsignedInt"></parameter>
  <parameter name="ExternalPortEndRange" access="readWrite" type="xsd:unsignedInt"></parameter>
  <parameter name="InternalPort" access="readWrite" type="xsd:unsignedInt"></parameter>
  <parameter name="PortMappingProtocol" access="readWrite"></parameter>
  <parameter name="InternalClient" access="readWrite"></parameter>
  <parameter name="PortMappingDescription" access="readWrite"></parameter>

I want to run a command on this file that extracts only the parameter names, producing the following output:

$ sedcommand file.txt
PortMappingEnabled
PortMappingLeaseDuration
RemoteHost
ExternalPort
ExternalPortEndRange
InternalPort
PortMappingProtocol
InternalClient
PortMappingDescription

What command would do this?

MOHAMED asked May 21 '13 16:05


2 Answers

grep was born to extract things:

grep -Po 'name="\K[^"]*' 

test with your data:

kent$ echo '<parameter name="PortMappingEnabled" access="readWrite" type="xsd:boolean"></parameter>
<parameter name="PortMappingLeaseDuration" access="readWrite" activeNotify="canDeny" type="xsd:unsignedInt"></parameter>
<parameter name="RemoteHost" access="readWrite"></parameter>
<parameter name="ExternalPort" access="readWrite" type="xsd:unsignedInt"></parameter>
<parameter name="ExternalPortEndRange" access="readWrite" type="xsd:unsignedInt"></parameter>
<parameter name="InternalPort" access="readWrite" type="xsd:unsignedInt"></parameter>
<parameter name="PortMappingProtocol" access="readWrite"></parameter>
<parameter name="InternalClient" access="readWrite"></parameter>
<parameter name="PortMappingDescription" access="readWrite"></parameter>' | grep -Po 'name="\K[^"]*'
PortMappingEnabled
PortMappingLeaseDuration
RemoteHost
ExternalPort
ExternalPortEndRange
InternalPort
PortMappingProtocol
InternalClient
PortMappingDescription
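
In the grep command, -o prints only the matched text, -P enables Perl-compatible regular expressions, and \K discards everything matched so far (here name=") from the reported match. Since -P is a GNU grep feature and may not be available in other grep builds, the following is a rough sketch of an alternative that avoids it by combining grep -o with cut. The variable names input and names and the shortened sample data are only illustrative assumptions.

```shell
# Illustrative input: two parameters on a single line (shortened sample data).
input='<parameter name="PortMappingEnabled" access="readWrite" type="xsd:boolean"></parameter>   <parameter name="RemoteHost" access="readWrite"></parameter>'

# grep -o prints each match of name="..." on its own line;
# cut then keeps the second double-quote-delimited field, i.e. the value.
names=$(printf '%s\n' "$input" | grep -o 'name="[^"]*"' | cut -d '"' -f 2)

printf '%s\n' "$names"
```

Because grep -o reports every match, this also copes with several parameters on one line, which a single sed substitution per line does not.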
Kent answered Sep 29 '22 17:09


sed 's/[^"]*"\([^"]*\).*/\1/'

does the job.

Explanation of the part inside ' ':

  • s - tells sed to substitute.
  • / - start of the regex to search for.
  • [^"]* - any character that is not ", any number of times. (matching <parameter name=)
  • " - just a literal ".
  • \([^"]*\) - anything inside \( \) is saved as a group that can be referenced later. The backslashes are there so the parentheses are treated as grouping rather than as literal characters to search for. [^"]* means the same as above. (matching RemoteHost, for example)
  • .* - any character, any number of times. (matching " access="readWrite"></parameter>)
  • / - end of the search regex, and start of the replacement string.
  • \1 - reference to the string captured by the group above.
  • / - end of the replacement string.

Basically it is s/search for this/replace with this/, but here we tell sed to replace the whole line with just the piece of it that was captured earlier.
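
One thing to keep in mind is that this substitution captures the first quoted string on each line, whatever attribute it belongs to, and that sed prints lines with no match unchanged. The sketch below shows two possible hedges: adding -n with the p flag so only substituted lines are printed, and anchoring the pattern on name=". The sample data (with a reordered attribute and a comment line) and the variable names are hypothetical and not taken from the question's file.

```shell
# Hypothetical sample: a normal line, a line without quotes,
# and a line where name= is not the first attribute.
sample='  <parameter name="RemoteHost" access="readWrite"></parameter>
  <!-- a line without any quotes -->
  <parameter access="readWrite" name="ExternalPort"></parameter>'

# Original pattern, with -n and the p flag: only lines where the
# substitution succeeded are printed, so the comment line is dropped.
first_quoted=$(printf '%s\n' "$sample" | sed -n 's/[^"]*"\([^"]*\).*/\1/p')

# Variant anchored on name=": captures the value of the name attribute
# regardless of where it appears on the line.
by_name=$(printf '%s\n' "$sample" | sed -n 's/.*name="\([^"]*\)".*/\1/p')

printf 'first quoted string:\n%s\n' "$first_quoted"
printf 'anchored on name=:\n%s\n' "$by_name"
```

For the file in the question, where name is always the first attribute and every line matches, both forms should give the same result as the original command.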

unxnut answered Sep 29 '22 18:09