I'm looking for a .NET regular expression to extract all the URLs from a webpage, but I haven't found one comprehensive enough to cover all the different ways a link can be specified. A side question: is there one regex to rule them all, or am I better off using a series of less complicated regular expressions and making multiple passes against the raw HTML? (Speed vs. maintainability)

Regular expression for parsing links from a webpage?

1 Answer

((mailto\:|(news|(ht|f)tp(s?))\://){1}\S+)

I took this from regexlib.com

[editor's note: the {1} has no real function in this regex; see this post]
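
For readers who want to try the pattern from .NET, here is a minimal sketch of one way to apply it with `Regex.Matches`. The names `LinkExtractor` and `ExtractUrls` are hypothetical, chosen only for illustration. The sketch drops the redundant `{1}` and, as an assumption not found in the original answer, replaces `\S+` with `[^\s"'<>]+`, because `\S+` keeps matching past the closing quote of an `href` attribute until it reaches whitespace. Note that the pattern only finds absolute URLs with one of the listed schemes; relative links such as `/about.html` are not covered.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // The answer's pattern without the redundant {1}.
    // Assumption of this sketch: \S+ is swapped for [^\s"'<>]+ so that a match
    // stops at the quote or angle bracket that ends an HTML attribute value.
    private static readonly Regex UrlPattern = new Regex(
        @"(mailto:|(news|(ht|f)tps?)://)[^\s""'<>]+",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns every absolute URL found in the raw HTML, in document order.
    public static List<string> ExtractUrls(string html)
    {
        var urls = new List<string>();
        foreach (Match match in UrlPattern.Matches(html))
        {
            urls.Add(match.Value);
        }
        return urls;
    }
}

public static class Program
{
    public static void Main()
    {
        string html =
            "<a href=\"http://example.com/a\">x</a> " +
            "<a href='mailto:me@example.com'>mail</a>";

        // Print each URL the pattern finds in the sample markup.
        foreach (string url in LinkExtractor.ExtractUrls(html))
        {
            Console.WriteLine(url);
        }
    }
}
```

A single pass like this trades completeness for simplicity. If relative links matter as well, a second, separate pattern targeting `href` attribute values may be easier to maintain than one larger expression.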

187

answered Sep 19 '22 21:09

csmba

Tags:

html

.net

regex

Chris Smith
