24 days of Rust - nom, part 2

We learned the basic concepts of nom yesterday when we wrote a parser for HTTP headers. HTTP is by its nature a text protocol. nom, however, always works on bytes (byte array slices, denoted in Rust with &[u8]). This makes it perfectly suitable for parsing binary data as well. There's already a selection of parsers using nom for binary formats such as Redis dump files, the MySQL protocol or tar archives. Today we are going to build a simplified WebSocket frame parser.
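To give a feel for what "parsing binary data from a &[u8]" means, here is a minimal sketch in plain Rust that decodes only the first two bytes of a WebSocket frame (the bit layout follows RFC 6455, section 5.2). It deliberately does not use nom, and the names FrameHeader and parse_header are made up for this illustration; extended payload lengths and masking keys are left out.

```rust
/// The first two bytes of a WebSocket frame (RFC 6455, section 5.2).
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct FrameHeader {
    fin: bool,       // bit 7 of byte 0: final fragment flag
    opcode: u8,      // low 4 bits of byte 0
    masked: bool,    // bit 7 of byte 1: payload is masked
    payload_len: u8, // low 7 bits of byte 1; 126 and 127 signal extended lengths
}

/// Returns the parsed header and the unconsumed rest of the input,
/// or None if fewer than two bytes are available.
/// The returned slice borrows from `input`, so nothing is copied.
fn parse_header(input: &[u8]) -> Option<(FrameHeader, &[u8])> {
    if input.len() < 2 {
        return None;
    }
    let header = FrameHeader {
        fin: input[0] & 0x80 != 0,
        opcode: input[0] & 0x0F,
        masked: input[1] & 0x80 != 0,
        payload_len: input[1] & 0x7F,
    };
    Some((header, &input[2..]))
}
```

Feeding it the bytes 0x81 0x05 followed by "Hello" (an unmasked text frame) should yield fin = true, opcode = 1, a payload length of 5, and the five payload bytes as the remaining input.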

Continue reading »
Written on Dec. 11, 2016

24 days of Rust - nom, part 1

It's entirely possible that you're walking the happy road of a programmer who never had to write a parser by hand. Unfortunately, that's not my case. I remember incomprehensible indexing of hideous arrays of characters, a maddening cascade of unmaintainable if-else statements, and futile, indescribable attempts to abstract away parts of this unspeakable monstrosity.

If the above paragraph was hard to parse, good! Putting some Lovecraftian adjectives into a description of events can be a way of coping with terrible experiences. Those dark, eldritch (sorry, couldn't resist one more) days are fortunately over. With parser combinators we can write composable and fast parsers. Rust adds another adjective here: safe. nom is a parser combinator library that works by generating parsing code at compile time with a bunch of macros. It also tries to avoid allocation and work through input bytes without copying.
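To make "composable" a bit more concrete, here is a rough, hand-rolled sketch of the parser combinator idea in plain Rust. This is not nom's API (nom builds its parsers with macros): the names tag and take_while1 are borrowed loosely from nom's vocabulary, while the signatures, the PResult alias, pair and header are my own assumptions for illustration. Every result is a subslice of the input, so nothing is allocated or copied.

```rust
/// On success: the remaining input and the parsed value. None means no match.
type PResult<'a, T> = Option<(&'a [u8], T)>;

/// Builds a parser that matches a fixed byte sequence at the start of the input.
fn tag<'a>(expected: &'static [u8]) -> impl Fn(&'a [u8]) -> PResult<'a, &'a [u8]> {
    move |input: &'a [u8]| {
        if input.starts_with(expected) {
            let (matched, rest) = input.split_at(expected.len());
            Some((rest, matched))
        } else {
            None
        }
    }
}

/// Builds a parser that consumes one or more bytes satisfying `pred`.
fn take_while1<'a>(pred: fn(u8) -> bool) -> impl Fn(&'a [u8]) -> PResult<'a, &'a [u8]> {
    move |input: &'a [u8]| {
        let n = input.iter().take_while(|&&b| pred(b)).count();
        if n == 0 {
            return None;
        }
        let (matched, rest) = input.split_at(n);
        Some((rest, matched))
    }
}

/// The combinator: runs two parsers in sequence and returns both results.
fn pair<'a, A, B>(
    first: impl Fn(&'a [u8]) -> PResult<'a, A>,
    second: impl Fn(&'a [u8]) -> PResult<'a, B>,
) -> impl Fn(&'a [u8]) -> PResult<'a, (A, B)> {
    move |input: &'a [u8]| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Some((rest, (a, b)))
    }
}

/// A "Name: value" header line parser assembled from the pieces above.
fn header<'a>(input: &'a [u8]) -> PResult<'a, (&'a [u8], &'a [u8])> {
    let name_p = take_while1(|b: u8| b.is_ascii_alphanumeric() || b == b'-');
    let value_p = take_while1(|b: u8| b != b'\r' && b != b'\n');
    let (rest, (name, (_sep, value))) = pair(name_p, pair(tag(b": "), value_p))(input)?;
    Some((rest, (name, value)))
}
```

Small parsers like tag and take_while1 know nothing about headers; pair glues them together, and header is just a composition of the three. Calling header on the bytes of "Host: example.com\r\n" should give back "Host" and "example.com" as slices of the original input, with "\r\n" left over.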

I decided to split the nom article into two parts. Today we're focused on parsing text (well, bytes that contain text), while the next article in the series will cover binary parsing.

This is my first hands-on experience with parser combinators - I'm learning nom as I write this. Feel free to let me know if the examples here could be more nom-idiomatic.

Continue reading »
Written on Dec. 9, 2016