Parse allowing nested parentheses in nom

Tags: rust, nom

I'm using nom. I'd like to parse a string that's surrounded by parentheses, allowing for additional nested parentheses within the string.

So (a + b) would parse as a + b, and ((a + b)) would parse as (a + b).

This works for the first case, but not the nested case:

use nom::{bytes::complete::{tag, take_until}, sequence::delimited, IResult};

pub fn parse_expr(input: &str) -> IResult<&str, &str> {
    // TODO: this will fail with nested parentheses, but `rest` doesn't seem to
    // be working.
    delimited(tag("("), take_until(")"), tag(")"))(input)
}

I tried using rest, but it consumes all of the remaining input, including the final ), so the closing tag never matches:

use nom::{bytes::complete::tag, combinator::rest, sequence::delimited, IResult};

pub fn parse_expr(input: &str) -> IResult<&str, &str> {
    delimited(tag("("), rest, tag(")"))(input)
}

Thanks!

Maximilian asked Jan 29 '26 08:01
1 Answer

I found a reference to this in the nom issue log: https://github.com/Geal/nom/issues/1253

I'm using this function from parse_hyperlinks, which is basically a hand-written parser for this case (https://docs.rs/parse-hyperlinks/0.23.3/src/parse_hyperlinks/lib.rs.html#41):

use nom::{error::{Error, ErrorKind, ParseError}, Err, IResult};

pub fn take_until_unbalanced(
    opening_bracket: char,
    closing_bracket: char,
) -> impl Fn(&str) -> IResult<&str, &str> {
    move |i: &str| {
        let mut index = 0;
        let mut bracket_counter = 0;
        while let Some(n) = &i[index..].find(&[opening_bracket, closing_bracket, '\\'][..]) {
            index += n;
            let mut it = i[index..].chars();
            match it.next().unwrap_or_default() {
                c if c == '\\' => {
                    // Skip the escape char `\`.
                    index += '\\'.len_utf8();
                    // Also skip the following char, if there is one. A trailing
                    // `\` must not push `index` past the end of the input.
                    if let Some(c) = it.next() {
                        index += c.len_utf8();
                    }
                }
                c if c == opening_bracket => {
                    bracket_counter += 1;
                    index += opening_bracket.len_utf8();
                }
                c if c == closing_bracket => {
                    // Closing bracket.
                    bracket_counter -= 1;
                    index += closing_bracket.len_utf8();
                }
                // Cannot happen: `find` only stops at one of the three chars above.
                _ => unreachable!(),
            };
            // We found the unmatched closing bracket.
            if bracket_counter == -1 {
                // We do not consume it.
                index -= closing_bracket.len_utf8();
                return Ok((&i[index..], &i[0..index]));
            };
        }

        if bracket_counter == 0 {
            Ok(("", i))
        } else {
            Err(Err::Error(Error::from_error_kind(i, ErrorKind::TakeUntil)))
        }
    }
}
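
To get the behaviour asked for in the question, this combinator should be able to take the place of take_until(")") in the original parser, i.e. delimited(tag("("), take_until_unbalanced('(', ')'), tag(")"))(input). The depth-counting idea behind it can also be sketched without nom. The helper below is a hypothetical, std-only illustration (parse_parens is a made-up name, not part of nom or parse_hyperlinks), and unlike the function above it does not handle backslash escapes:

```rust
/// Hypothetical std-only helper, for illustration only.
/// If `input` starts with '(', returns `Some((rest, inner))`, where `inner`
/// is the text between that '(' and its matching ')', and `rest` is whatever
/// follows the matching ')'. The tuple order mirrors nom's `IResult` success
/// value (remaining input first, parsed output second).
pub fn parse_parens(input: &str) -> Option<(&str, &str)> {
    // The input must begin with an opening parenthesis.
    let body = input.strip_prefix('(')?;
    // Number of nested '(' that are currently open inside `body`.
    let mut depth = 0usize;
    for (idx, ch) in body.char_indices() {
        match ch {
            '(' => depth += 1,
            // A ')' seen at depth 0 closes the outermost pair.
            // ')' is one byte long, so `idx + 1` is a valid char boundary.
            ')' if depth == 0 => return Some((&body[idx + 1..], &body[..idx])),
            ')' => depth -= 1,
            _ => {}
        }
    }
    // Ran out of input before the outer '(' was closed.
    None
}
```

With this sketch, parse_parens("(a + b)") yields Some(("", "a + b")) and parse_parens("((a + b))") yields Some(("", "(a + b)")), matching the two cases in the question. An unclosed input such as "((a + b)" yields None.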
Maximilian answered Jan 31 '26 23:01