[ANN] Release of `multipart_form.0.2.0`

dinosaure · April 19, 2021, 2:25pm

I am pleased to announce the release of multipart_form. Throughout the development of mrmime, we have gained a thorough knowledge of the RFCs about email. However, these RFCs also describe mechanisms that are found in HTTP/1.1.

Genesis

More specifically, a lot of work has been done on RFC 2045 & RFC 2046 (see RFC 7578 § 4) which describe the multipart format (found in emails and in HTTP/1.{0,1} requests when serializing a <form>).

From this work (~2 years), we decided to extract the parts that allow manipulating multipart/form-data content for HTTP/1.{0,1} responses (plus RFC 2183). This resulted in the creation of multipart_form.
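For readers who have not looked at the wire format, here is a minimal sketch of what a multipart/form-data body looks like when written out by hand. The helpers render_field and render_form are hypothetical names made up for this illustration; they are not the multipart_form API, they only handle plain text fields, and they do no escaping of field names.

```ocaml
(* Hypothetical helpers, not part of multipart_form: they render simple
   text fields as a multipart/form-data body (RFC 7578). *)
let crlf = "\r\n"

(* One part: a delimiter line, a Content-Disposition header (RFC 2183),
   an empty line, then the field value. No escaping is done on [name]. *)
let render_field ~boundary (name, value) =
  String.concat ""
    [ "--"; boundary; crlf;
      "Content-Disposition: form-data; name=\""; name; "\""; crlf;
      crlf;
      value; crlf ]

(* The whole body: every part, then the closing delimiter "--boundary--". *)
let render_form ~boundary fields =
  String.concat "" (List.map (render_field ~boundary) fields)
  ^ "--" ^ boundary ^ "--" ^ crlf

let () =
  print_string
    (render_form ~boundary:"frontier"
       [ ("user", "dinosaure"); ("lang", "OCaml") ])
```

Building the whole body as one string is fine for small text fields, but it is exactly what a streaming API avoids for file uploads.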

This project is a cross between what many users have been waiting for (for CoHTTP and http/af), a knowledge of what exists and its limitations, and finally a development in the spirit of MirageOS.

The result is an API that is “full stream”. Indeed, one question arose from the very beginning: how can we manipulate this format while

not having access to a file system (MirageOS)
not exploding memory usage for file uploads
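The two constraints above can be illustrated with a generic sketch of chunk-by-chunk processing. This is not the multipart_form API: the producer type and the pump and chunks functions are hypothetical names, and the sink stands in for whatever would consume the data (a file, a block device, a network socket).

```ocaml
(* Generic sketch of stream processing in bounded memory; the names below
   are hypothetical and are not the multipart_form API. *)

(* A producer yields the next chunk, or None at end of input. *)
type producer = unit -> string option

(* Pull chunks one at a time and hand each to [sink] immediately.
   Nothing is accumulated, so memory use does not grow with the total
   size of the input. Returns the number of bytes seen. *)
let pump (next : producer) (sink : string -> unit) : int =
  let rec go total =
    match next () with
    | None -> total
    | Some chunk ->
      sink chunk;
      go (total + String.length chunk)
  in
  go 0

(* Test producer: [count] chunks of [size] bytes, generated on demand. *)
let chunks ~count ~size : producer =
  let remaining = ref count in
  fun () ->
    if !remaining = 0 then None
    else begin
      decr remaining;
      Some (String.make size 'x')
    end

let () =
  (* 1024 chunks of 4 KiB flow through, one chunk at a time. *)
  let total = pump (chunks ~count:1024 ~size:4096) (fun _chunk -> ()) in
  Printf.printf "streamed %d bytes\n" total
```

Because the sink is just a function, the same loop works whether or not a file system exists, which is the situation on MirageOS.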

Memory bound implementation

With the help of @Armael and the memtrace tool, we were able to implement and extend multipart_form so that it is easier to use and really ensures our original assumption about memory consumption.

So we experimented with use cases like uploading very large files. Here is the result that memtrace gives us with a 100 MB file:

The application tries to save the games in files. We use opium (and thus http/af) but tests were also done with CoHTTP. The code is available here for people who want to reproduce.

Documentation & encoding

Finally, a major effort has been made in the documentation to explain in detail how to use multipart_form. Version 0.2.0 also adds a way to produce a multipart/form-data document (experimental) with the same constraints on memory usage.

I hope this work will be useful to a lot of people. The documentation is available here.

Armael · April 19, 2021, 7:12pm

I should add that while memtrace-viewer was already quite useful, I do have some complaints about it.

One thing that can be seen in the screenshot above is that the graph of used memory over time does not have a vertical scale (!!). So we’re actually missing the most important information here, which is that the peak memory usage for the file upload application stays below ~20 MB even when uploading >1 GB files.
And as far as I could tell there’s currently no way of obtaining this information using memtrace, even though it can most likely be computed from the data that’s already there.

Another thing is that I wasn’t able to build the git version of memtrace from source (to try and add the feature myself). It seems to rely on a number of unreleased janestreet libraries, and I wasn’t able to figure out the right sequence of opam pins to make everything compile.

vlaviron · April 20, 2021, 9:57am

For the peak memory usage, I think people usually use OCAMLRUNPARAM='v=0x400' ./my_program. It’s likely more reliable than memtrace, which has to rely on statistical estimations.
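The v=0x400 flag makes the runtime print GC statistics when the program exits. A similar figure can be read from inside the program with the standard Gc module; the sketch below is only an illustration (peak_heap_bytes is a made-up helper name), and it only accounts for the OCaml major heap, not for memory allocated outside of it.

```ocaml
(* Report the largest size reached by the major heap, using the standard
   Gc module. [peak_heap_bytes] is a hypothetical helper name. *)
let peak_heap_bytes () =
  let stat = Gc.quick_stat () in
  (* top_heap_words is counted in words; Sys.word_size is in bits. *)
  stat.Gc.top_heap_words * (Sys.word_size / 8)

let () =
  (* Print the figure on stderr when the program terminates. *)
  at_exit (fun () ->
    Printf.eprintf "peak major heap: %d KiB\n" (peak_heap_bytes () / 1024))
```

Unlike memtrace, this does not sample allocations, so it says nothing about where the memory was allocated, only how large the heap became.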

I can’t help you with memtrace building instructions, but we (at OCamlPro) developed an alternative viewer for the dumps generated by memtrace, called memthol (see the announcement here). It’s written in Rust, so not necessarily easier to patch, but I think it already has the scale for the memory graphs.

Armael · April 20, 2021, 12:27pm

For the peak memory usage, I think people usually use OCAMLRUNPARAM='v=0x400' ./my_program. It’s likely more reliable than memtrace, which has to rely on statistical estimations.

Ah, that’s good to know, thanks!

we (at OCamlPro) developed an alternative viewer for the dumps generated by memtrace called memthol

I have also tried memthol! It indeed has the expected vertical scale for the graphs, and nice features that memtrace-viewer doesn’t have (in particular the ability to graph at the same time several sub-parts of the trace, depending on a filter on the callstack).
However, I found memthol (in its current state) to be less usable than memtrace-viewer:

it is really missing a flamegraph view similar to the one in memtrace-viewer. Memthol has an expressive language for filtering the trace based on callstacks, but no way of actually displaying the callstacks of allocation points!
the UI is very clunky. One graph takes 90% of the screen (= extremely large on just a normal 24" monitor); the rounded colored buttons use almost unreadable black-on-black text; the horizontal scrolling bars inside UI elements are far from ideal; the bottom black panel is sluggish (it reacts with a ~1s delay).

These are my two main complaints. I also can’t help but wonder about the choice of Rust to implement an OCaml-specific tool. Rust is a fine language, but doesn’t this restrict the set of potential contributors to OCaml programmers who also know Rust, i.e. probably not many people?

Yaron_Minsky · April 24, 2021, 10:46pm

This is indeed an annoyance, but this repo should make it a lot easier to build:

That is a repo with all of the dependencies included.
