We learned the basic concepts of nom yesterday when we wrote a parser for
HTTP headers. HTTP is by its nature a text protocol. nom, however, always
works on bytes (byte array slices, denoted in Rust with &[u8]). This makes it
perfectly suitable for parsing binary data as well.
There's already a selection of parsers using nom for binary formats such as
Redis dump files, MySQL protocol or tar archives. Today we are going
to build a simplified WebSocket frame parser.
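To give a feel for what "binary parsing" means here, the following is a minimal sketch of reading the fixed part of a WebSocket frame header (as laid out in RFC 6455) from a &[u8] slice. It uses only the standard library rather than nom, and the names FrameHeader and parse_frame_header are hypothetical, chosen for illustration; the masking key and payload are left untouched in the returned remainder.

```rust
/// Fixed part of a WebSocket frame header (hypothetical struct, for illustration only).
#[derive(Debug, PartialEq)]
struct FrameHeader {
    fin: bool,        // final fragment flag (highest bit of byte 0)
    opcode: u8,       // low 4 bits of byte 0
    masked: bool,     // highest bit of byte 1
    payload_len: u64, // 7-bit length, or the 16/64-bit extended length
}

/// Parses the header from the front of `input` and returns it together with
/// the remaining bytes. Returns `None` if the input is too short.
fn parse_frame_header(input: &[u8]) -> Option<(FrameHeader, &[u8])> {
    let (&b0, rest) = input.split_first()?;
    let (&b1, rest) = rest.split_first()?;

    let fin = (b0 & 0x80) != 0;
    let opcode = b0 & 0x0F;
    let masked = (b1 & 0x80) != 0;

    // A 7-bit length of 126 or 127 signals that an extended,
    // big-endian length field follows.
    let (payload_len, rest) = match b1 & 0x7F {
        126 => {
            if rest.len() < 2 {
                return None;
            }
            let (len, rest) = rest.split_at(2);
            (u16::from_be_bytes([len[0], len[1]]) as u64, rest)
        }
        127 => {
            if rest.len() < 8 {
                return None;
            }
            let (len, rest) = rest.split_at(8);
            let mut buf = [0u8; 8];
            buf.copy_from_slice(len);
            (u64::from_be_bytes(buf), rest)
        }
        n => (n as u64, rest),
    };

    Some((FrameHeader { fin, opcode, masked, payload_len }, rest))
}
```

Returning the unconsumed bytes alongside the parsed value is the same "remaining input plus result" shape that parser combinators build on, and slicing the input this way does not copy it.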
Continue reading »
It's entirely possible that you're walking the happy road of a programmer
who has never had to write a parser by hand. Unfortunately, that's not my case.
I remember incomprehensible indexing of hideous arrays of characters,
a maddening cascade of unmaintainable if-else statements, and futile,
indescribable attempts to abstract away parts of this unspeakable monstrosity.
If the above paragraph was hard to parse, good! Putting some Lovecraftian
adjectives into a description of events can be a way of coping with terrible
experiences. Those dark, eldritch (sorry, couldn't resist one more) days are
fortunately over. With parser combinators we can write composable and fast
parsers. Rust adds another adjective here: safe. nom is a parser combinator
library that works by generating parsing code at compile time with a bunch of
macros. It also tries to avoid allocation and work through input bytes without
copying.
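For readers new to the idea, here is a minimal hand-rolled sketch of what "parser combinator" means. It is not nom's macro-based API: the functions tag, take_while1 and pair below are plain closures written for illustration, and ParseResult is a hypothetical type alias. Each parser takes a byte slice and returns the remaining input plus a value that borrows from the original input, so nothing is copied.

```rust
/// Result of a parser: the remaining input plus the parsed value,
/// or `None` on failure (hypothetical alias for this sketch).
type ParseResult<'a, T> = Option<(&'a [u8], T)>;

/// Builds a parser that matches an exact byte sequence at the start of the input.
fn tag<'a>(expected: &'static [u8]) -> impl Fn(&'a [u8]) -> ParseResult<'a, &'a [u8]> {
    move |input: &'a [u8]| {
        if input.starts_with(expected) {
            let (matched, rest) = input.split_at(expected.len());
            Some((rest, matched))
        } else {
            None
        }
    }
}

/// Builds a parser that consumes the longest non-empty prefix whose bytes satisfy `pred`.
fn take_while1<'a>(pred: fn(u8) -> bool) -> impl Fn(&'a [u8]) -> ParseResult<'a, &'a [u8]> {
    move |input: &'a [u8]| {
        let end = input.iter().position(|&b| !pred(b)).unwrap_or(input.len());
        if end == 0 {
            None
        } else {
            let (matched, rest) = input.split_at(end);
            Some((rest, matched))
        }
    }
}

/// Combinator: runs `first`, then `second` on whatever is left, and returns both results.
fn pair<'a, A, B>(
    first: impl Fn(&'a [u8]) -> ParseResult<'a, A>,
    second: impl Fn(&'a [u8]) -> ParseResult<'a, B>,
) -> impl Fn(&'a [u8]) -> ParseResult<'a, (A, B)> {
    move |input: &'a [u8]| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Some((rest, (a, b)))
    }
}
```

Small parsers such as tag(b": ") can then be glued together with pair into larger ones, which is the composability mentioned above.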
I decided to split the nom article into two parts.
Today we're focused on parsing text (well, bytes that contain text),
while the next article in the series will cover binary parsing.
This is my first hands-on experience with parser combinators - I'm learning
nom as I write this. Feel free to let me know if the examples here could
be more nom-idiomatic.
Continue reading »