Skip to content

Perf and binary-parse-stream/binary-parser #86

@BurtHarris

Description

@BurtHarris

@hildjj, @nathan7

node-cbor depends on some very clever work combining generators with readable streams that has cloned into the node-cbor/vendor/binary-parse-stream directory. The README.md is rather sparse, and says it's based on https://github.com/nathan7/binary-parser, but that's a broken link now. However, https://github.com/nathan7/binary-parse-stream still seems present, if a bit stale... (4 years since last commit.

None-the-less, the approach used permits a stream-based parser to be written with sync-looking code, which is very impressive. My gut feeling however is that the performance of node-cbor may be limited by this, in particular yielding once or more for each parsed data item (perhaps as short as one byte of incoming data.) All of these yields seem like a substantial source of overhead for large data sets. A chunk based approach where the parser can remain synchronous during processing of entire Buffer's seem like it could offer substantial performance gains.

ECMAScript async iterators seems like it might be a good forward-looking and chunk-oriented approach. TC 39 declared async iteration "finished" in January 2018, and V8 and Node v10 both seem to have adopted it, at lest experimentally.

I'm considering writing a binary parsing framework similar to binary-parse-stream but based instead on async iterators. Before I dig into alone, I thought it might be valuable to run the idea by you both, to see if what you think, and to see if you know of other implementations doing something like this.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions