Regex: match everything before FIRST underscore and everything in between AFTER

Tags:

regex

I have an expression like

test_abc_HelloWorld_there could be more here.

I'd like a regex that takes the first word before the first underscore. So get "test"

I tried [A-Za-z]{1,}_ but that didn't work.

Then I'd like to get "abc" or anything in between the first 2 underscores.

I want two separate regular expressions, not one combined expression.

Any help is very appreciated!

Example:

for 1) the regex would match the word test

for 2) the regex would match the word abc

so any other match for either case would be wrong. As in, if I were to replace what I matched on then I would get something like this:

for case 1) match "test" and replace "test" with "Goat".

'Goat_abc_HelloWorld_there could be more here'

I don't want a replace, I just want a match on a word.

215

asked May 12 '11 23:05

EKet

2 Answers

In both cases you can use lookaround assertions.

^[^_]+(?=_)

will get you everything up to the first underscore of the line, and

(?<=_)[^_]+(?=_)

will match any run of non-underscore characters located between two underscores; in the question's example, the first such match is abc.
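As a rough illustration, here is how the two patterns behave in Ruby, the language used elsewhere on this page. The constant and method names (FIRST_SEGMENT, MIDDLE_SEGMENT, first_segment, second_segment) are hypothetical and not part of the answer. The last line shows why the attempt from the question did not work: that pattern includes the underscore in the match.

```ruby
# Sketch only: the names below are hypothetical helpers, not from the answer.
FIRST_SEGMENT  = /^[^_]+(?=_)/       # everything before the first underscore
MIDDLE_SEGMENT = /(?<=_)[^_]+(?=_)/  # text enclosed by two underscores

# Returns the text before the first underscore, or nil if there is no underscore.
def first_segment(str)
  str[FIRST_SEGMENT]
end

# Returns the first segment enclosed by two underscores, or nil if none exists.
def second_segment(str)
  str[MIDDLE_SEGMENT]
end

input = "test_abc_HelloWorld_there could be more here."

puts first_segment(input)      # test
puts second_segment(input)     # abc
p input.scan(MIDDLE_SEGMENT)   # ["abc", "HelloWorld"]
p input[/[A-Za-z]{1,}_/]       # "test_" (the original attempt keeps the underscore)
```

Because the second pattern is not anchored, it matches every segment enclosed by underscores; taking only the first match gives the text between the first two underscores.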

200

answered Nov 15 '22 19:11

Thomas Hupkens

Step back and consider that maybe you're overengineering the solution here. Ruby has a split method for this, and other languages probably have their own equivalents.

Given something like "AAPL_annual_i.xls", you could just do this and take advantage of the fact that your data is already structured:

string_object = "AAPL_annual_i.xls"
ary = string_object.split("_")
#=> ["AAPL", "annual", "i.xls"]
# the file name with its extension is the last element, at index 2
extension = ary[2].split(".")[1]
#=> "xls"
filetype = ary[2].split(".")[0] #etc
#=> "i"

'doh!

But seriously, I've found that leaning on the split method is not only easier on me, it's easier on my associates who have to read my code and understand what it does.
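Applied to the string from the question, the same idea might look like the sketch below. The helper name leading_segments is hypothetical; the limit argument of 3 simply stops splitting after the second underscore so the rest of the string stays in one piece.

```ruby
# Sketch only: `leading_segments` is a hypothetical helper name.
# Splits on underscores into at most three pieces and returns the first two.
def leading_segments(str)
  first, second, _rest = str.split("_", 3)
  [first, second]
end

first, second = leading_segments("test_abc_HelloWorld_there could be more here.")
puts first   # test
puts second  # abc
```

One difference from the lookahead patterns: when the string contains no underscore at all, split returns the whole string as the first piece instead of reporting no match.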

answered Nov 15 '22 19:11

boulder_ruby
