
Escaping strings for gsub

Tags: gsub, lua

I read a file:

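local logfile = assert(io.open("log.txt", "r"))  -- fail loudly if the file is missing
local data = logfile:read("*a")                  -- read the whole file into one string
logfile:close()
print(data)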

output:

...
"(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S
...

Yes, the log file looks awful because it is full of assorted commands.

How can I call gsub to remove a line such as "(\.)\n(\w)", r"\1 \2" from the data variable?

The snippet below does not work:

s='"(\.)\n(\w)", r"\1 \2"'
data=data:gsub(s, '')

I guess some escaping needs to be done. Any easy solution?
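One way to sidestep escaping entirely, sketched here as an aside and not taken from the post, is plain (non-pattern) search: string.find accepts a fourth argument that turns pattern matching off. The helper name remove_plain is hypothetical, and the sample strings are the two log lines shown above, written with long brackets so the backslashes stay literal.

```lua
-- Hypothetical helper: removes every literal occurrence of `needle` from `s`.
-- It relies on plain search, so no pattern escaping is needed.
local function remove_plain(s, needle)
  if needle == "" then return s end
  local parts = {}
  local pos = 1
  while true do
    -- The fourth argument `true` disables pattern matching.
    local first, last = s:find(needle, pos, true)
    if not first then break end
    parts[#parts + 1] = s:sub(pos, first - 1)
    pos = last + 1
  end
  parts[#parts + 1] = s:sub(pos)
  return table.concat(parts)
end

local data = [["(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S]]
local s = [["(\.)\n(\w)", r"\1 \2"]]

-- Prints an empty first line, then: "\n[^\t]", "", x, re.S
print(remove_plain(data, s))
```

The trade-off is a hand-written loop instead of a single gsub call.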


Update:

local data = [["(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S]]

local s = [["(\.)\n(\w)", r"\1 \2"]]

local function esc(x)
   return (x:gsub('%%', '%%%%')
            :gsub('^%^', '%%^')
            :gsub('%$$', '%%$')
            :gsub('%(', '%%(')
            :gsub('%)', '%%)')
            :gsub('%.', '%%.')
            :gsub('%[', '%%[')
            :gsub('%]', '%%]')
            :gsub('%*', '%%*')
            :gsub('%+', '%%+')
            :gsub('%-', '%%-')
            :gsub('%?', '%%?'))
end

print(data:gsub(esc(s), ''))

This seems to work fine, except that I also need to escape the escape character % itself, since the pattern won't work if % appears in the string to match. I tried :gsub('%%', '%%%%') and :gsub('\%', '\%\%'), but neither works.


Update 2:

OK, % can be escaped this way as long as it is handled first in the chain of gsub calls above, which I have just corrected.
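A small sketch of why the order matters, assuming only two magic characters for brevity. The names percent_first and percent_last are hypothetical. If % is escaped last, it also doubles the % characters that the earlier steps inserted as escapes.

```lua
-- Hypothetical helpers for illustration; only % and . are handled.
local function percent_first(x)
  return (x:gsub('%%', '%%%%'):gsub('%.', '%%.'))
end

local function percent_last(x)
  return (x:gsub('%.', '%%.'):gsub('%%', '%%%%'))
end

local text = '50%.'

print(percent_first(text))             --> 50%%%.
print(percent_last(text))              --> 50%%%%.

-- Only the first pattern matches the original text literally.
print(text:find(percent_first(text)))  --> 1 4
print(text:find(percent_last(text)))   --> nil
```

In the second result the trailing dot is no longer escaped, so the pattern means "50", two literal percent signs, then any character.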

:terrible experience:

Update 3:

Escaping of ^ and $

As stated in the Lua manual (5.1, 5.2, 5.3):

A caret ^ at the beginning of a pattern anchors the match at the beginning of the subject string. A $ at the end of a pattern anchors the match at the end of the subject string. At other positions, ^ and $ have no special meaning and represent themselves.

So a better idea would be to escape ^ and $ only when they are found at the beginning or the end of the string, respectively.
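A quick check of that rule, as a sketch. The helper esc_anchors is a hypothetical name that applies only the two anchor-related steps.

```lua
-- In the middle of a pattern, ^ and $ match themselves.
print(("a^b"):find("a^b"))   --> 1 3
print(("b$c"):find("b$c"))   --> 1 3

-- At the start, ^ is an anchor, so it has to be escaped to match literally.
print(("a^b"):find("^b"))    --> nil
print(("a^b"):find("%^b"))   --> 2 3

-- Hypothetical helper: escape ^ only at the start and $ only at the end.
local function esc_anchors(x)
  return (x:gsub('^%^', '%%^'):gsub('%$$', '%%$'))
end

print(esc_anchors("^a^b$"))  --> %^a^b%$
```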

Lua 5.1 - 5.2+ incompatibilities

string.gsub now raises an error if the replacement string contains a % followed by a character other than the permitted % or digit.

There is no need to double every % in the replacement string. See lua-users.
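For the replacement side, here is a minimal sketch of two ways to insert text that contains a literal %. The helper esc_repl and the sample strings are made up for illustration. In a replacement string only % is special, so it is the only character that needs doubling.

```lua
-- Hypothetical helper: doubles every % so gsub inserts the text literally.
local function esc_repl(r)
  return (r:gsub('%%', '%%%%'))
end

local subject = "discount: N"
local repl = "100% off"

-- Option 1: escape the replacement string.
local a = subject:gsub("N", esc_repl(repl))

-- Option 2: pass a function; its return value is inserted as-is,
-- with no % processing.
local b = subject:gsub("N", function() return repl end)

print(a)  --> discount: 100% off
print(b)  --> discount: 100% off
```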

theta asked Mar 20 '12 16:03



3 Answers

According to Programming in Lua:

The character `%´ works as an escape for those magic characters. So, '%.' matches a dot; '%%' matches the character `%´ itself. You can use the escape `%´ not only for the magic characters, but also for all other non-alphanumeric characters. When in doubt, play safe and put an escape.

Doesn't this mean that you can simply put % in front of every non-alphanumeric character and be fine? This would also be future-proof (in case new special characters are introduced). Like this:

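function escape_pattern(text)
    -- Prefix every non-alphanumeric character with %.
    -- The outer parentheses drop gsub's second return value (the match count).
    return (text:gsub("([^%w])", "%%%1"))
end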

It worked for me on Lua 5.3.2 (only rudimentary testing was performed). Not sure if it will work with older versions.
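As a sketch, here is that function applied to the data from the question. The long-bracket strings are copied from the question's update; the function is repeated so the snippet runs on its own, with parentheses so that only the escaped string is returned.

```lua
local function escape_pattern(text)
  -- Prefix every non-alphanumeric character with %.
  return (text:gsub("([^%w])", "%%%1"))
end

local data = [["(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S]]
local s = [["(\.)\n(\w)", r"\1 \2"]]

local cleaned, count = data:gsub(escape_pattern(s), "")

print(count)    --> 1
-- Prints an empty first line, then: "\n[^\t]", "", x, re.S
print(cleaned)
```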

FSMaxB answered Oct 06 '22 17:10


Why not:

local quotepattern = '(['..("%^$().[]*+-?"):gsub("(.)", "%%%1")..'])'
string.quote = function(str)
    return str:gsub(quotepattern, "%%%1")
end

to escape and then gsub it away?
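To see what this builds, here is a sketch that prints the generated character class and escapes a sample string. A local function named quote is used here instead of string.quote, purely so the example does not modify the global string table; that name is an assumption of this sketch.

```lua
-- Build a character class containing every magic character, each one escaped.
local quotepattern = '([' .. ("%^$().[]*+-?"):gsub("(.)", "%%%1") .. '])'

-- Local stand-in for string.quote; the parentheses drop the match count.
local function quote(str)
  return (str:gsub(quotepattern, "%%%1"))
end

print(quotepattern)     --> ([%%%^%$%(%)%.%[%]%*%+%-%?])
print(quote("a.b*c"))   --> a%.b%*c
print(quote("50%"))     --> 50%%

-- The quoted text can then be used as a literal pattern.
print((("x = a.b*c;"):gsub(quote("a.b*c"), "")))  --> x = ;
```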

Qix - MONICA WAS MISTREATED answered Oct 06 '22 17:10


try

line = '"(\.)\n(\w)", r"\1 \2"'
rx =  '\"%(%\.%)%\n%(%\w%)\", r\"%\1 %\2\"'
print(string.gsub(line, rx, ""))

Escape special characters with %, and quotes with \.
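One caveat: sequences such as \. and \w inside ordinary quoted strings are not valid escape sequences in Lua 5.2 and later, so the snippet above may only load on Lua 5.1. A variant that avoids this, sketched here with long brackets so the backslashes stay literal (this is an adaptation, not the answerer's code):

```lua
-- Long brackets keep backslashes as-is, so only pattern magic needs %.
local line = [["(\.)\n(\w)", r"\1 \2"]]
local rx   = [["%(\%.%)\n%(\w%)", r"\1 \2"]]

local result, count = string.gsub(line, rx, "")

print(count)         --> 1
print(result == "")  --> true
```

The backslash is not a magic character in Lua patterns, so inside long brackets only the parentheses and the dot need a % in front.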

Mike Corcoran answered Oct 06 '22 15:10