
PCRE - offset vector, multiple of 3?

Tags: c, regex, pcre

I'm learning PCRE and I don't understand why the size of the offset vector has to be a multiple of 3. This is from pcredemo.c (rc is the result of pcre_exec()):

/* The output vector wasn't big enough */

if (rc == 0) {
    rc = OVECCOUNT / 3;
    printf("ovector only has room for %d captured substrings\n", rc - 1);
}

/* Show substrings stored in the output vector by number. Obviously, in a real
 * application you might want to do things other than print them. */

for (i = 0; i < rc; i++) {
    char *substring_start = subject + ovector[2 * i];
    int substring_length = ovector[2 * i + 1] - ovector[2 * i];
    printf("%2d: %.*s\n", i, substring_length, substring_start);
}

To me it seems that ovector stores str1_start, str1_end, str2_start, str2_end, ..., so the array could hold OVECCOUNT/2 substrings. Why is it OVECCOUNT/3?

Thank you.

woky asked Aug 16 '12 19:08


1 Answer

From the pcreapi manual page:

The first two-thirds of the vector is used to pass back captured substrings, each substring using a pair of integers. The remaining third of the vector is used as workspace by pcre_exec() while matching capturing subpatterns, and is not available for passing back information. The number passed in ovecsize should always be a multiple of three. If it is not, it is rounded down.
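To make the arithmetic concrete, here is a minimal sketch of how the quoted rule splits the vector. The helper names (max_pairs, result_slots, max_captures) are hypothetical and not part of the PCRE API, and the sizes are arbitrary example values.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of PCRE: how many (start, end) offset
 * pairs pcre_exec() can pass back for a given ovecsize. Integer division
 * also models the "rounded down to a multiple of three" rule. */
int max_pairs(int ovecsize)
{
    assert(ovecsize >= 0);
    return ovecsize / 3;
}

/* Hypothetical helper: number of leading ints that hold returned offsets
 * (the first two-thirds). The rest is workspace for pcre_exec(). */
int result_slots(int ovecsize)
{
    return 2 * max_pairs(ovecsize);
}

/* Hypothetical helper: pair 0 is the whole match, so the number of
 * captured substrings that fit is one less than the number of pairs. */
int max_captures(int ovecsize)
{
    int pairs = max_pairs(ovecsize);
    return pairs > 0 ? pairs - 1 : 0;
}

/* Print the layout for one example size. */
void describe_ovector(int ovecsize)
{
    printf("ovecsize %d: %d result ints, %d pairs, %d captures\n",
           ovecsize, result_slots(ovecsize), max_pairs(ovecsize),
           max_captures(ovecsize));
}
```

With an ovecsize of 30, for example, the first 20 ints carry 10 pairs (the whole match plus 9 captured substrings) and the last 10 are workspace. That is why the demo sets rc to OVECCOUNT / 3 and reports rc - 1 captured substrings.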

themel answered Sep 27 '22 15:09