I've tried a number of things but nothing seems to be working properly. I have an Access DB and am writing code in VBA. I have a string of HTML source code that I am interested in stripping all of the HTML code and Tags out of so that I just have plain text string with no html or tags left. What is the best way to do this?
Thanks
One way that's as resilient as possible to bad markup;
with createobject("htmlfile")
.open
.write "<p>foo <i>bar</i> <u class='farp'>argle </zzzz> hello </p>"
.close
msgbox "text=" & .body.outerText
end with
Function StripHTML(cell As Range) As String
Dim RegEx As Object
Set RegEx = CreateObject("vbscript.regexp")
Dim sInput As String
Dim sOut As String
sInput = cell.Text
With RegEx
.Global = True
.IgnoreCase = True
.MultiLine = True
.Pattern = "<[^>]+>" 'Regular Expression for HTML Tags.
End With
sOut = RegEx.Replace(sInput, "")
StripHTML = sOut
Set RegEx = Nothing
End Function
This might help you, Good luck.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With