Consequences of Inserting Positive Lookbehind into Arbitrary Regex to Simulate Byte Offset

What would be the consequences of inserting a positive lookbehind for n bytes, (?<=\C{n}), at the beginning of an arbitrary regular expression, particularly when the expression is used for replacement operations?

At least within PHP, the regex match functions, preg_match and preg_match_all, allow matching to begin at a given byte offset. None of PHP's other PCRE functions has a corresponding feature: preg_replace, for instance, lets you limit the number of replacements, but not require that the matches being replaced occur at or after byte n.

There would obviously be some (let's call them trivial) consequences for performance and readability, but would there be any non-trivial impacts, such as matches becoming non-matches (other than those that begin before byte n, which are meant to be excluded) or replacements becoming malformed?

Some examples:

/some expression/ becomes /(?<=\C{4})some expression/ for a 4-byte offset

/(this) has (groups)/i becomes /(?<=\C{2})(this) has (groups)/i for a 2-byte offset
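
A minimal sketch of how such a rewrite might be automated in PHP. The helper name withByteOffset is hypothetical. It assumes a single-character delimiter (such as / or #) that does not appear in the inserted prefix, no leading whitespace or (*VERB) items in the pattern, and no /u modifier. The (?:...) wrapper is an addition that the examples above do not show: without it, a top-level alternation such as /f|b/ would apply the lookbehind to its first branch only.

```php
<?php
// Hypothetical helper: prefix a delimited PCRE pattern with an n-byte lookbehind.
// Assumptions: single-character delimiter, no leading whitespace, no leading
// (*VERB) items, no /u modifier.
function withByteOffset(string $pattern, int $offset): string
{
    if ($offset <= 0) {
        return $pattern; // nothing to skip
    }
    $delimiter = $pattern[0];
    $end       = strrpos($pattern, $delimiter); // closing delimiter; modifiers follow it
    $body      = substr($pattern, 1, $end - 1);
    $modifiers = substr($pattern, $end + 1);

    // Wrap the original body in a non-capturing group so that every
    // alternative stays behind the lookbehind; group numbers are unchanged.
    return $delimiter . '(?<=\C{' . $offset . '})(?:' . $body . ')' . $delimiter . $modifiers;
}

// Only the "o" characters at byte 4 or later are replaced.
$replaced = preg_replace(withByteOffset('/o/', 4), '0', 'foo boo');   // "foo b00"

// ^ still refers to the start of the full subject, so nothing matches.
$anchored = preg_replace(withByteOffset('/^b/', 4), 'X', 'foo boo');  // "foo boo"

// Both branches of the alternation are subject to the offset.
$branches = preg_replace(withByteOffset('/f|b/', 4), 'X', 'boo foo'); // "boo Xoo"
```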

As far as I can tell from the limited tests I've run, adding this lookbehind effectively simulates the offset parameter and doesn't interfere with any other lookbehinds, substitutions, or other control patterns; but I'm not an expert on regex.

I'm trying to determine whether there are any likely consequences to building replace/filter function extensions by inserting the n-byte lookbehind into patterns. It should behave exactly as the match functions' offset parameter does, so simply running the expression against substr( $subject, $offset ) won't work, for the same reasons it doesn't for preg_match: most notably, it cuts off the text that any lookbehinds need to see, and ^ then incorrectly matches the start of the substring rather than the start of the original string.
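
To make the difference concrete, here is a small sketch comparing preg_match's offset parameter with substr(). The subject string and patterns are made up for illustration.

```php
<?php
$subject = 'foo boo';
$offset  = 4; // byte index of the "b"

// Offset parameter: the pattern still sees the whole subject.
$anchorWithOffset = preg_match('/^b/', $subject, $m, 0, $offset);       // 0: ^ is tied to byte 0
$behindWithOffset = preg_match('/(?<=\s)b/', $subject, $m, 0, $offset); // 1: the space at byte 3 is visible

// substr(): everything before the offset is gone.
$anchorWithSubstr = preg_match('/^b/', substr($subject, $offset));       // 1: ^ matches the start of "boo"
$behindWithSubstr = preg_match('/(?<=\s)b/', substr($subject, $offset)); // 0: nothing left for the lookbehind
```

The two approaches give opposite answers for both patterns, which is the behavior the inserted lookbehind is meant to preserve for the replace functions.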
