
How many backslashes are required to escape regexps in emacs' "Customize" mode?

I'm trying to use emacs' customize-group packages to tweak some parts of my setup, and I'm stymied. I see things like this in my .emacs file after I make changes with customize:

'(tramp-backup-directory-alist (quote (("\\\\`.*\\\\'" . "~/.emacs.d/autobackups"))))

This was the result of putting the following into the customize text field:

Regexp matching filename: \\`.*\\'

This is a representative sample: I'm actually trying to change several things that want a regexp, and they all show this same problem. How many layers of quoting are there, really? I can't seem to find the magic number of backslashes to get the gosh-dang thing to do what I'm asking it to, even for the simplest regular expressions like .*. Right now, the given customization produces - nothing. It makes no change from emacs' default behavior.

Better yet, where on earth is this documented? It's a little difficult to Google for, but I've been trying quite a few things there as well as in the official documentation and the Emacs wiki. Where is an authoritative source for how many dang backslashes one needs to make a regular expression in customize-mode actually work - or at the very least, to fail with some kind of warning instead of failing silently?


EDIT: As so often happens with questions asked in anger, I was asking the wrong question. Fortunately the answers below led me to the answer to the question that I needed, which was about quoting rules. I'm going to try to write down what I learned here, because I find the documentation and Googleable resources to be maddeningly obscure about this. So here are the quoting rules I found by trial and error, and I hope that they help someone else, inspire correction, or both.

When an emacs customize-mode buffer asks you for a "Regexp matching filename", it is being, as emacs often is, both terse and idiosyncratic (how often the creator's personality is imparted to the creation!). It means, for one thing, a regexp that will be compared to the whole path of the file in search of a match, not just to the name of the file itself as you might assume from the term "filename". This is the same sense of "filename" used in emacs' buffer-file-name function, for example.

Further, although if you put foo in the field, you'll see "foo" (with double-quotes) written to the actual file, that's not enough quoting and not the right quoting. You need to quote your regexp with the quoting style that, as far as I can tell, only emacs uses: the `foo' (backtick-foo-single-quote) scheme. And then you need to escape that, making it \`foo\' (backslash-backtick-foo-backslash-single-quote; and if you think that's a headache to input in Markdown, it's more so in emacs).

On top of this, emacs appears to have a rule that the . regexp special character does not match a / at the beginning of filenames, so, as was happening to me above, the classic .* pattern will appear to match nothing: to match "all files", you actually need the regexp /.*, which then you stuff into the quote format of customize-mode to produce \`/.*\', after which customize paints another layer of escaping onto it and writes it to the customization file.

The final result for one of my efforts - a setting such that #autosave# files don't gunk up the directory you're working in, but instead all live in one place:

(custom-set-variables
  '(auto-save-file-name-transforms (quote (
    ("\\`/[^/]*:\\([^/]*/\\)*\\([^/]*\\)\\'" "~/.emacs.d/autobackups/\\2" t)
    ("\\`/.*/\\(.*?\\)\\'" "~/.emacs.d/autobackups/\\1" t)
))))
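
To see what the two regexps above capture, here is a minimal sketch that applies them with replace-regexp-in-string to two made-up paths (the paths and the demo- names are assumptions for illustration). It shows only the regexp substitution step; with the third element set to t, emacs processes the name further before it builds the real auto-save file name, so the final name on disk will differ.

```elisp
;; Hypothetical names; the regexps and replacements are copied from above.
(defvar demo-remote-regexp "\\`/[^/]*:\\([^/]*/\\)*\\([^/]*\\)\\'")
(defvar demo-local-regexp "\\`/.*/\\(.*?\\)\\'")

;; Remote-style path: group 2 is the last path component.
(defvar demo-remote-result
  (replace-regexp-in-string demo-remote-regexp
                            "~/.emacs.d/autobackups/\\2"
                            "/ssh:host:/etc/hosts"))
;; => "~/.emacs.d/autobackups/hosts"

;; Local path: group 1 is everything after the last slash.
(defvar demo-local-result
  (replace-regexp-in-string demo-local-regexp
                            "~/.emacs.d/autobackups/\\1"
                            "/home/user/notes.txt"))
;; => "~/.emacs.d/autobackups/notes.txt"
```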

Backslashes in elisp are a far greater threat to your sanity than parentheses.


EDIT 2: Time for me to be wrong again. I finally found the relevant documentation (through reading another Stack Overflow question, of course!): Regexp Backslash Constructs. The crucial point of confusion for me: the backtick and single quote are not quoting in this context: they're the equivalent of perl's ^ and $ special characters. The backslash-backtick construct matches an empty string anchored at the beginning of the string being checked for a match, and the backslash-single-quote construct matches the empty string at the end of the string-under-consideration. And by "string under consideration," I mean "buffer, which just happens to contain only a file path in this case, but you need to match the whole dang thing if you want a match at all, since this is elisp's global regexp behavior."

Swear to god, it's like dealing with an alien civilization.


EDIT 3: In order to avoid confusing future readers -

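  • \` is the emacs regex for "the beginning of the buffer" (or of the string being matched). (cf Perl's \A)
  • \' is the emacs regex for "the end of the buffer" (or of the string being matched). (cf Perl's \z; Perl's \Z also matches just before a final newline)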
  • ^ is the common-idiom regex for "the beginning of the line." It can be used in emacs.
  • $ is the common-idiom regex for "the end of the line." It can be used in emacs.

Regex searches across multi-line bodies of text are more common in emacs than elsewhere (e.g. M-x occur), which is why emacs has the backtick and single-quote special characters at all. As best I can tell, they're used in the context of customize-mode because generic unknown input to a customize-mode field could contain newlines: the beginning and end of the input are not guaranteed to be the beginning and end of a line, so you want the beginning-of-buffer and end-of-buffer special characters instead.
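
The difference between the two pairs of anchors only shows up when the input has more than one line. A small sketch, using a made-up two-line string (demo-text is a hypothetical name):

```elisp
;; Hypothetical two-line input: "foo", newline, "bar".
(defvar demo-text "foo\nbar")

;; ^ and $ anchor at line boundaries, so "bar" on line 2 matches at index 4.
(string-match-p "^bar$" demo-text)           ;; => 4

;; \` and \' anchor at the very start and end of the whole string,
;; so the same pattern with those anchors does not match.
(string-match-p "\\`bar\\'" demo-text)       ;; => nil

;; Matching the whole input, newline included, succeeds at index 0.
(string-match-p "\\`foo\nbar\\'" demo-text)  ;; => 0
```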

I am not sure whether to regret hijacking my own Stack Overflow question and essentially turning it into a blog post.

Brighid McDonnell asked Nov 12 '11 07:11



1 Answer

In the customize field, you'd enter the regexp according to the syntax described here. When customize writes the regexp into a string, any backslashes or double-quote chars in the regexp will be escaped, as per regular string escaping conventions.

So in short, just enter single backslashes in the regexp field, and they'll get correctly doubled up in the resulting custom-set-variables clause written to your .emacs.
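
To make the layers concrete, here is a small sketch that can be evaluated in the *scratch* buffer. It compares the string customize writes when you type \`.*\' in the field with the string it writes when you type \\`.*\\' (the variable names and the test path are made up for illustration):

```elisp
;; Typed in the field: \`.*\'    Written to .emacs: "\\`.*\\'"
(defvar demo-single "\\`.*\\'")
;; Typed in the field: \\`.*\\'  Written to .emacs: "\\\\`.*\\\\'"
(defvar demo-doubled "\\\\`.*\\\\'")

;; The reader strips one layer: these are the actual regexp lengths.
(length demo-single)    ;; => 6
(length demo-doubled)   ;; => 8

;; Single backslashes: \` and \' are anchors, so any one-line path matches.
(string-match-p demo-single "/tmp/foo")          ;; => 0

;; Doubled backslashes: the regexp now demands a literal backslash and
;; backtick at the start and a literal backslash and quote at the end,
;; which no ordinary file path contains.
(string-match-p demo-doubled "/tmp/foo")         ;; => nil
(string-match-p demo-doubled "\\`/tmp/foo\\'")   ;; => 0
```

If this reading is right, it would also explain why the over-escaped setting in the question fails silently: it is a perfectly valid regexp, it just never matches a real path.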

Also: since your regexp is for matching filenames, you might try opening up a directory containing files you'd like to match, and then running M-x re-builder RET. You can then enter the regexp in string-escaped format to confirm that it matches those files. By typing % m in a dired buffer, you can enter a regexp in unescaped format (i.e. just like in the customize field), and dired will mark matching filenames.
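
A non-interactive way to run the same kind of check is to hand the regexp to directory-files. This is only a sketch: demo-files-matching is a hypothetical helper, not something emacs ships, and directory-files tests each name without its directory part, so it will not behave like a setting that matches against the whole path.

```elisp
;; Hypothetical helper, not part of emacs: list the names in DIR
;; whose file name (without the directory part) matches REGEXP.
(defun demo-files-matching (dir regexp)
  "Return the file names in DIR that match REGEXP, sorted alphabetically."
  (directory-files dir nil regexp))

;; Example call in string-escaped form; the directory is just an illustration.
;; (demo-files-matching "~/.emacs.d/" "\\.el\\'")
```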

sanityinc answered Sep 20 '22 04:09
