Grep all instances of strings that start with certain characters

Tags: linux, grep, bash, cat

I would like to extract every string that starts with the characters 'rs' from a single file and write the full strings to a new file. I managed to get a count of the lines that contain 'rs', but I don't know how to get the strings themselves into the new file:

grep -c rs < /home/Stephanie/this.txt
698572

Example lines from the file:

1203823    forward   efjdhgv   rs124054t8 dhdfhfhs
12045345    back   efjdkkjf   rs12445368 dhdfhfhs

I just want to grab the rs string and move it to a new file. Can someone help me out with the piping? I read around a bit, but what I found wasn't particularly helpful. Thanks.

Stephopolis asked Feb 22 '13 04:02




1 Answer

I'd suggest something like this:

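grep -oE '(\s(rs\S+))' data.txt | cut -d " " -f 2 > newfile.txt

-o prints only the part of each line that matches the pattern, not the whole line (-E enables extended regular expressions; egrep is the older, deprecated spelling of grep -E).

\s matches a single whitespace character, so the string has to be preceded by whitespace. An rs string at the very start of a line is therefore not matched.

(rs\S+) then matches "rs" followed by one or more non-whitespace characters.

Each match still includes the leading whitespace character, which we don't want, so we "cut" it off before the content gets written to the new file. Note that cut -d " " only handles a space; if the columns are separated by tabs, use cut -f 2 instead, since tab is cut's default delimiter.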
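As a possible variant, here is a minimal sketch that avoids the extra cut step by letting grep match whole words with -w. The file names sample.txt and rs_only.txt and the sample rows are made up for illustration (the first row uses tabs, the second spaces), and [^[:space:]] is used as the POSIX spelling of \S.

```shell
# Hypothetical sample data modelled on the two lines from the question.
# The first row is tab-separated, the second is space-separated.
printf '1203823\tforward\tefjdhgv\trs124054t8\tdhdfhfhs\n' > sample.txt
printf '12045345    back   efjdkkjf   rs12445368 dhdfhfhs\n' >> sample.txt

# -o prints only the matched text instead of the whole line.
# -w accepts a match only if it forms a whole word, so "rs" must begin
#    at a word boundary and the leading whitespace is never captured.
# [^[:space:]]* extends the match to the next whitespace or end of line.
grep -ow 'rs[^[:space:]]*' sample.txt > rs_only.txt

# Show the extracted strings, one per line.
cat rs_only.txt
```

Because -w only requires a non-word character in front of the match, a token such as x-rs123 would also yield rs123. If that matters for your data, a field-based tool such as awk may be a better fit.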

biophonc answered Sep 22 '22 21:09