I've tried a number of things but nothing seems to be working properly. I have an Access DB and am writing code in VBA. I have a string of HTML source code that I am interested in stripping all of the HTML code and Tags out of so that I just have plain text string with no html or tags left. What is the best way to do this?
Thanks
One way that's as resilient as possible to bad markup;
with createobject("htmlfile")
    .open
    .write "<p>foo <i>bar</i> <u class='farp'>argle </zzzz> hello </p>"
    .close
    msgbox "text=" & .body.outerText
end with
                            Function StripHTML(cell As Range) As String  
 Dim RegEx As Object  
 Set RegEx = CreateObject("vbscript.regexp")  
 Dim sInput As String  
 Dim sOut As String  
 sInput = cell.Text  
 With RegEx  
   .Global = True  
   .IgnoreCase = True  
   .MultiLine = True  
.Pattern = "<[^>]+>" 'Regular Expression for HTML Tags.  
 End With  
 sOut = RegEx.Replace(sInput, "")  
 StripHTML = sOut  
 Set RegEx = Nothing  
End Function  
This might help you, Good luck.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With