
Get PCRE2 capture group name -if any- associated to a capture group number

Tags: c, pcre

Once pcre2_match() is successfully executed I use pcre2_get_ovector_pointer() to get the ovector and then build a data structure containing (1) the matched string (using ovector[0] and ovector[1]); and (2) the matched capture groups (using ovector[2*i] and ovector[2*i+1], for i in [1..rv), where rv is the return value of pcre2_match()).
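For reference, the extraction described above might look roughly like the sketch below. The `struct capture` record, the `collect_captures()` helper and the `CAPTURE_UNSET` macro are hypothetical names used only for illustration; the sketch assumes the 8-bit library, where `PCRE2_SIZE` is `size_t` and a group that did not take part in the match has both offsets set to `PCRE2_UNSET` (all bits set).

```c
#include <stddef.h>

/* Hypothetical record for one capture group (illustrative only). */
struct capture {
    const char *start;  /* points into the subject; NULL if the group is unset */
    size_t      length; /* length in code units (bytes in the 8-bit library) */
};

/* Same value as PCRE2_UNSET, redefined here so the sketch compiles without
   pcre2.h. */
#define CAPTURE_UNSET (~(size_t)0)

/* Fill out[0..n) from the ovector pairs, where n = min(rc, out_size) and rc
   is the positive return value of pcre2_match(). Entry 0 is the whole match,
   entries 1..n-1 are the capture groups. Returns the number of entries
   written (0 if rc is not positive). */
static int collect_captures(const char *subject, const size_t *ovector,
                            int rc, struct capture *out, int out_size)
{
    int n = rc < out_size ? rc : out_size;
    if (n < 0) n = 0;

    for (int i = 0; i < n; i++) {
        size_t begin = ovector[2 * i];
        size_t end   = ovector[2 * i + 1];

        if (begin == CAPTURE_UNSET || end < begin) {
            /* Unset group (or an inverted range, which this sketch simply
               treats as unset). */
            out[i].start  = NULL;
            out[i].length = 0;
        } else {
            out[i].start  = subject + begin;
            out[i].length = end - begin;
        }
    }
    return n;
}
```

With the real library, `ovector` would be the pointer returned by `pcre2_get_ovector_pointer()` and `subject` the buffer passed to `pcre2_match()`. The records only point into the subject, so they stay valid only as long as the subject does.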

For each capture group number, I'd like to include in the data structure the matched string (no problem, that's in ovector), the length of the match (likewise, it can be derived from ovector), and, this being the difficult part, the name of the capture group (only if the matched group has a name, obviously).

Helper functions are available to fetch matched capture groups by name. In particular, pcre2_substring_number_from_name() can be used to translate a capture group name into a group number (i.e. name-to-number translation). What I need is the exact opposite behaviour: given a group number, get its associated group name, if any, or NULL otherwise (i.e. number-to-name translation). I assume I'm missing something obvious here, but I'm not able to find a way to do that using the PCRE2 API. Is it possible?

Carlos Abalde asked Oct 20 '25 23:10


1 Answer

This is not the simple number-to-name API function I was looking for, but the following snippet from https://www.pcre.org/current/doc/html/pcre2demo.html contains enough inspiration to implement what I need :)

(void)pcre2_pattern_info(
  re,                   /* the compiled pattern */
  PCRE2_INFO_NAMECOUNT, /* get the number of named substrings */
  &namecount);          /* where to put the answer */

if (namecount == 0) printf("No named substrings\n"); else
  {
  PCRE2_SPTR tabptr;
  printf("Named substrings\n");

  /* Before we can access the substrings, we must extract the table for
  translating names to numbers, and the size of each entry in the table. */

  (void)pcre2_pattern_info(
    re,                       /* the compiled pattern */
    PCRE2_INFO_NAMETABLE,     /* address of the table */
    &name_table);             /* where to put the answer */

  (void)pcre2_pattern_info(
    re,                       /* the compiled pattern */
    PCRE2_INFO_NAMEENTRYSIZE, /* size of each entry in the table */
    &name_entry_size);        /* where to put the answer */

  /* Now we can scan the table and, for each entry, print the number, the name,
  and the substring itself. In the 8-bit library the number is held in two
  bytes, most significant first. */

  tabptr = name_table;
  for (i = 0; i < namecount; i++)
    {
    int n = (tabptr[0] << 8) | tabptr[1];
    printf("(%d) %*s: %.*s\n", n, name_entry_size - 3, tabptr + 2,
      (int)(ovector[2*n+1] - ovector[2*n]), subject + ovector[2*n]);
    tabptr += name_entry_size;
    }
  }
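Building on that snippet, a number-to-name lookup could be sketched as a linear scan over the same name table. `group_name_from_number()` is a hypothetical helper, not a PCRE2 API function. It assumes the 8-bit library, where each table entry is the group number in two bytes (most significant first) followed by the zero-terminated name, and it takes the table values as plain arguments so that it compiles without pcre2.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the PCRE2 API): return the name of capture
   group `number`, or NULL if that group has no name.

   table, count and entry_size are the values obtained from
   pcre2_pattern_info() with PCRE2_INFO_NAMETABLE, PCRE2_INFO_NAMECOUNT and
   PCRE2_INFO_NAMEENTRYSIZE respectively (8-bit library layout assumed). */
static const char *group_name_from_number(const uint8_t *table,
                                          uint32_t count,
                                          uint32_t entry_size,
                                          uint32_t number)
{
    /* An entry needs at least 2 bytes of number plus a terminating zero. */
    if (table == NULL || entry_size < 3) return NULL;

    for (uint32_t i = 0; i < count; i++) {
        const uint8_t *entry = table + (size_t)i * entry_size;
        uint32_t n = ((uint32_t)entry[0] << 8) | entry[1];

        /* The name starts right after the two number bytes. */
        if (n == number) return (const char *)(entry + 2);
    }
    return NULL;
}
```

The returned pointer refers to memory owned by the compiled pattern, so it should not be used after the pattern is freed. The table is sorted by name rather than by number, hence the linear scan; if lookups are frequent, building a number-indexed array once per compiled pattern may be a better fit. In the 16-bit and 32-bit libraries the group number occupies a single code unit, so the decoding step would need to change.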
Carlos Abalde answered Oct 23 '25 13:10


