
Tell a raw string (r'') from a regular string ('')?

I'm currently building a tool that will have to match filenames against a pattern. For convenience, I intend to provide both lazy matching (in a glob-like fashion) and regexp matching. For example, the following two snippets would eventually have the same effects:

@mylib.rule('static/*.html')
def myfunc():
    pass

@mylib.rule(r'^static/([^/]+)\.html')
def myfunc():
    pass

AFAIK r'' is only useful to the Python parser and it actually creates a standard str instance after parsing (the only difference being that it keeps the \).

Is anybody aware of a way to tell one from another?

I would hate to have to provide two alternate decorators for the same purpose or, worse, resort to manually parsing the string to determine whether it's a regexp or not.

saalaa asked May 06 '11 19:05



1 Answer

The term "raw string" is confusing because it sounds like it is a special type of string - when in fact, it is just a special syntax for literals that tells the compiler to do no interpretation of '\' characters in the string. Unfortunately, the term was coined to describe this compile-time behavior, but many beginners assume it carries some special runtime characteristics.

I prefer to call them "raw string literals", to emphasize that what makes them "raw" is the literal syntax, which tells the parser not to interpret backslashes. Both raw string literals and normal string literals create strings (strs), and the resulting objects are strings like any other. The string created by a raw string literal is equivalent in every way to the same string defined non-raw-ly using escaped backslashes.
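A minimal sketch of that equivalence, reusing the pattern from the question (the variable names `raw`, `escaped`, and `captured` are only illustrative): once the literal has been parsed, nothing on the resulting object records which syntax produced it.

```python
import re

# The same pattern written with both literal syntaxes.
raw = r'^static/([^/]+)\.html'
escaped = '^static/([^/]+)\\.html'

# Both literals produce equal, ordinary str objects.
print(raw == escaped)                    # True
print(type(raw) is type(escaped) is str) # True

# The only difference is how the parser reads backslashes in the source text.
print(len(r'\n'))  # 2: a backslash followed by 'n'
print(len('\n'))   # 1: a single newline character

# Either form behaves identically when used as a regular expression.
captured = re.match(raw, 'static/index.html').group(1)
print(captured == re.match(escaped, 'static/index.html').group(1))  # True
```

Since the two objects are indistinguishable at runtime, a function receiving one cannot tell which literal form the caller typed.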

PaulMcG answered Sep 25 '22 07:09