Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Function to trim leading and trailing whitespace in vba

I have checked quite a few suggestions re trimming leading & trailing whitespace in vba (excel, incidentally).

I have found this solution, but it also trims å ä ö (also caps) and I am too weak in regex to see why:

Function MultilineTrim (Byval TextData)
    Dim textRegExp
    Set textRegExp = new regexp
    textRegExp.Pattern = "\s{0,}(\S{1}[\s,\S]*\S{1})\s{0,}"
    textRegExp.Global = False
    textRegExp.IgnoreCase = True
    textRegExp.Multiline = True

    If textRegExp.Test (TextData) Then
      MultilineTrim = textRegExp.Replace (TextData, "$1")
    Else
      MultilineTrim = ""
    End If
End Function

(this is from an answer here at SO, where the useraccount seems inactive:

https://stackoverflow.com/a/1606433/3701019 )

So, I would love if anyone could help with either (a) an alternative solution to the problem or (b) a version of the regexp / code that would not strip out (single) åäö characters.

Thanks for any help!

Details: Problem

  • Trim functions in vba do not consider all whitespace chars (tabs, for instance). Some custom trim is needed
  • The best solution I found is above, but it also removes single å ä ö characters.

My context is a xmlparser in vba, where it gets chunks of xml to parse. It sometimes just gets a character from the stream, which may be å ä ö, which then this function strips away completely.

I would be happy to clarify or edit this question, of course.

FYI: I have shared exactly what I did based on the answers, see below.

like image 252
Marcus Widerberg Avatar asked Jun 04 '14 22:06

Marcus Widerberg


2 Answers

For a regex I would use:

^[\s\xA0]+|[\s\xA0]+$

This will match the "usual" whitespace characters as well as the NBSP, commonly found in HTML documents.

VBA Code would look something like below, where S is the line to Trim:

Dim RE as Object, ResultString as String
Set RE = CreateObject("vbscript.regexp")
RE.MultiLine = True
RE.Global = True
RE.Pattern = "^[\s\xA0]+|[\s\xA0]+$"
ResultString = RE.Replace(S, "")

And an explanation of the regex:

Trim whitespace at the start and the end of each line
-----------------------------------------------------

^[\s\xA0]+|[\s\xA0]+$

Options:  ^$ match at line breaks

Match this alternative (attempting the next alternative only if this one fails) «^[\s\xA0]+»
   Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
   Match a single character present in the list below «[\s\xA0]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      A “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
      The character with position 0xA0 (160 decimal) in the character set «\xA0»
Or match this alternative (the entire match attempt fails if this one fails to match) «[\s\xA0]+$»
   Match a single character present in the list below «[\s\xA0]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      A “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
      The character with position 0xA0 (160 decimal) in the character set «\xA0»
   Assert position at the end of a line (at the end of the string or before a line break character) «$»

Created with RegexBuddy
like image 171
Ron Rosenfeld Avatar answered Nov 14 '22 21:11

Ron Rosenfeld


Try this:

Function MultilineTrim (Byval TextData)
    Dim textRegExp
    Set textRegExp = new regexp
    textRegExp.Pattern = "(^[ \t]+|[ \t]+$)"
    textRegExp.Global = True
    textRegExp.IgnoreCase = True
    textRegExp.Multiline = True

    MultilineTrim = textRegExp.Replace (TextData, "")
End Function
like image 39
AnotherParker Avatar answered Nov 14 '22 22:11

AnotherParker