I would like to parse binary files in Raku using its regex / grammar engine, but I haven't found out how to do it because the input is coerced to a string.
Is there a way to avoid this string coercion and use objects of type Buf or Blob?
I was thinking maybe it is possible to change something in the Metamodel?
I know that I can use unpack, but I would really like to use the grammar engine instead to have more flexibility and readability.
Am I hitting an inherent limit of Raku's capabilities here?
And before someone tells me that regexes are for strings and that I shouldn't do it, I should point out that Perl's regex engine can match bytes as far as I know, and I could probably use it with Regexp::Grammars, but I would prefer to use Raku instead.
Also, I don't see any fundamental reason why regexes should be reserved for strings only; an NFA from automata theory isn't intrinsically made for characters rather than bytes.
> Is there a way to avoid this string coercion and use objects of type Buf or Blob?
Unfortunately not at present. However, one can use the Latin-1 encoding, which assigns a character to every byte value, so any byte sequence can be decoded with it and then matched using a grammar.
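As a minimal sketch of that workaround: the record layout below (two magic bytes, one length byte, then a payload) and the `Record` grammar are made up for illustration, not taken from any real format. The blob is decoded as Latin-1, parsed, and the matched text is encoded back to bytes.

```raku
# Hypothetical record layout, invented for this example:
# magic bytes 0xCA 0xFE, one length byte, then the payload.
my $blob = Blob.new(0xCA, 0xFE, 0x03, 0x01, 0x02, 0xFF);

grammar Record {
    token TOP     { <magic> <length> <payload> }
    token magic   { \x[CA] \x[FE] }   # code points 0xCA and 0xFE
    token length  { . }               # a single character, i.e. one byte here
    token payload { .* }              # everything that remains
}

# Latin-1 maps byte value N to code point N, so decoding never rejects a byte.
my $text  = $blob.decode('latin-1');
my $match = Record.parse($text);

if $match {
    # The code point of the matched character is the original byte value.
    my $length  = $match<length>.Str.ord;
    # Encoding the matched text back as Latin-1 recovers the original bytes.
    my $payload = $match<payload>.Str.encode('latin-1');
    say "declared length: $length, payload bytes: { $payload.list }";
}
```

One caveat to keep in mind: Raku strings work at the grapheme level, so as far as I know a 0x0D 0x0A byte pair decodes to a single `"\r\n"` character, and `.` will then consume both bytes at once. Formats that can contain that pair need extra care with this approach.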
> Also, I don't see any fundamental reason why regexes should be reserved for strings only; an NFA from automata theory isn't intrinsically made for characters rather than bytes.
There isn't one. It's widely expected that the regex/grammar engine will be rebuilt at some point in the future (primarily to deal with performance limitations), and that would be a good point to also consider handling bytes, as well as codepoint-level strings (Uni).