Stripping binaries

grayswandyr · July 18, 2018, 1:36pm

Hi, I’d like to get recommendations from the community regarding stripping binaries. My use case is a program (for Unix and Mac OS X), built with dune/ocamlopt, the source code of which does not rely on dynamic linking nor on the FFI. However it relies on several opam packages and on the Unix module too, so I wondered whether I should be wary of something here. Are there good practices to rely upon? Or can I just run the standard strip command on my binary?

perry · July 18, 2018, 4:26pm

Stripping binaries was very important back in the days when disk space was very limited. These days, I think it’s an antipattern. Why do you want to do it?

grayswandyr · July 18, 2018, 7:58pm

Because disk space is limited for some people, for socio-economic or regulatory reasons for instance. But please let’s not debate this topic here; I’m only asking for technical advice.

hcarty · July 18, 2018, 10:57pm

I haven’t tried to strip OCaml binaries recently, but the last time I tried everything was fine with ocamlopt-compiled binaries. They should act more or less like a stripped binary compiled from C.

hannes · July 19, 2018, 9:10am

This is what I do. Sometimes I need to transport binaries over the network, and strip reduces the binary size by roughly 33%.
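
For anyone who wants to measure this on their own program, here is a minimal sketch that strips a copy and reports both sizes, leaving the original untouched. The helper names `file_size` and `strip_copy` and the example path are hypothetical; the `chmod` is there on the assumption that build outputs may be read-only.

```shell
# Size of a file in bytes (tr removes the padding that BSD wc adds).
file_size() {
  wc -c < "$1" | tr -d ' '
}

# Copy SRC to DST, strip the copy, and print both sizes.
# SRC and DST are whatever paths you pass in; nothing is assumed about them.
strip_copy() {
  src=$1
  dst=$2
  cp "$src" "$dst" || return 1
  chmod u+w "$dst" || return 1   # assumption: the source may be read-only
  strip "$dst" || return 1
  echo "original: $(file_size "$src") bytes"
  echo "stripped: $(file_size "$dst") bytes"
}

# Example with a hypothetical dune output path:
#   strip_copy _build/default/bin/main.exe /tmp/main.stripped
```

Working on a copy means the unstripped binary is still around if you later need its symbols for debugging.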

keleshev · July 19, 2018, 9:38am

In 2016 I did a quick benchmark comparing binary sizes and start-up times of a hello-world OCaml program. I compared the original binary, the output of strip, and the output of the upx tool, which also compresses the executable to reduce the size even more. I also compared a bare-bones hello-world with one where Jane Street Core was linked.

Here’s what I’ve got:

| | Original | `strip` | `upx --best` |
|---|---|---|---|
| OCaml | 201K, 12ms | 113K, 12ms | 64K, 14ms |
| OCaml+Core | 14M, 35ms | 8.1M, 35ms | 2.3M, 75ms |
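
To put the sizes above in relative terms, the following sketch computes the percentage saved from a before/after pair. The function name `percent_saved` is hypothetical, and both arguments are assumed to be in the same unit.

```shell
# Percentage of size saved when going from BEFORE to AFTER (same unit),
# rounded to the nearest whole percent.
percent_saved() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}

percent_saved 201 113   # OCaml, strip           -> 44
percent_saved 14 8.1    # OCaml+Core, strip      -> 42
percent_saved 201 64    # OCaml, upx --best      -> 68
percent_saved 14 2.3    # OCaml+Core, upx --best -> 84
```

So in these measurements strip saves a bit over 40% in both cases, while upx saves considerably more at the cost of a longer start-up time.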

grayswandyr · July 19, 2018, 10:58am

Oh yeah even better, I had forgotten about this one. Unfortunately, my upx’ed program core dumps when launched?! The stripped program runs fine OTOH.

lindig · July 21, 2018, 10:33am

Binary sizes of 14M vs 201K are absurd. Can someone explain how linking in OCaml works? Does linking always include all modules, regardless of whether they are used? Or is Core so entangled that it is impossible to exclude certain modules?

Yaron_Minsky · July 21, 2018, 12:16pm

The key issue here is that OCaml overapproximates dependencies at the module level, which means that lots of functionality that isn’t used can nonetheless be linked in. There are some plans for better native dead-code elimination, but there’s nothing usable for that yet.

It’s worth noting that Base is already considerably lighter. My experiments show:

Base: 3.9M
Core_kernel: 8.5M
Core: 9.9M

In any case, as a general matter, if you don’t need the extra functionality of Core_kernel or Core, I think Base is considerably preferable, both because it’s lighter and because it enforces some better idioms, like hiding polymorphic comparison by default and encouraging better, more modular idioms for building things like comparison-based container types.

There are some changes coming to Base in the next release that should make it yet lighter, though I’m not sure how far those will go.

y

Yaron_Minsky · July 21, 2018, 12:17pm

To be clear, though: OCaml doesn’t include any module you don’t reference, but anything you do reference, and anything transitively referenced by the modules you directly reference, will be included, whether or not you ever touch that functionality in your code.

y

perry · July 21, 2018, 12:55pm

That’s unfortunate. There’s no good way to trace out what’s live and what isn’t as things stand?

samoht · July 21, 2018, 1:16pm

Yes it’s possible: see https://github.com/ocaml/ocaml/pull/608

xavierleroy · July 21, 2018, 5:16pm

The only case where strip should not be used is for bytecode executables produced by ocamlc in -custom mode. These are hybrid files composed of an ELF executable followed by OCaml bytecode, which confuses the strip command. But -custom mode is rarely used nowadays.

If your executable is produced by ocamlopt, nothing bad will happen if you run strip on it.
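
A cautious script can check for the bytecode case before stripping. The sketch below rests on an assumption that should be verified against your OCaml version: that bytecode executables, including -custom ones, end with a 12-byte trailer beginning with `Caml1999X`. The helper names `is_ocaml_bytecode` and `safe_strip` are hypothetical and not part of any OCaml tool.

```shell
# Succeed if FILE ends with what looks like an OCaml bytecode trailer.
# Assumption: the last 12 bytes are "Caml1999X" followed by a version number.
is_ocaml_bytecode() {
  tail -c 12 "$1" 2>/dev/null | LC_ALL=C grep -q '^Caml1999X'
}

# Strip FILE only when the bytecode trailer is absent.
safe_strip() {
  if is_ocaml_bytecode "$1"; then
    echo "skipping $1: looks like OCaml bytecode" >&2
    return 1
  fi
  strip "$1"
}
```

With this, `safe_strip` refuses to touch a file that carries the trailer and otherwise just runs the standard strip command.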

grayswandyr · July 21, 2018, 8:14pm

Thanks, good to know.

grayswandyr · July 21, 2018, 8:15pm

This looks rather compelling!

cfcs · August 4, 2018, 2:56pm

Just to chime in with a security angle: Stripped binaries also provide fewer useful gadgets for exploit writers, and can have a significant impact on the viability of ROP exploits, especially when the would-be stripped sections are mapped executable (? like if you’re using mirage on solo5, and maybe also the -xen target ?).

perry · August 4, 2018, 8:14pm

Stripped sections aren’t generally marked executable. (I am not sure they’re ever marked executable – if they were, how could you strip them? Indeed, if you can load it, it isn’t stripped.)

On your “especially”, if a stripped section isn’t loaded, how could it possibly be used for things like ROP exploits?

cfcs · August 5, 2018, 6:32pm

No, the sections affected by stripping are not generally marked as executable (this depends heavily on which parameters you pass to strip, for instance whether you strip relocations or not).
That’s why I gave an example of a case where the protection markings are not respected (solo5), and with my question marks I intended to convey that I don’t know how mirage-xen maps the sections, but I have a feeling it might be the same there.

Re: my “especially”: Sorry if I was a bit unclear. I’d argue that the impact of strip'ing is usually beneficial to security, so a stripped section (that’s not loaded) is one less section to be used in ROP attacks (when the section is mapped as executable), or data-only attacks (when the section is not mapped as executable).
Essentially the fewer bytes or sections mapped R/R+W/R+E/R+W+E in your process space, the better.

perry · August 5, 2018, 6:33pm

I see no evidence for that whatsoever. Stripped sections are not loaded. If a stripped section needs to be loaded, then the executable will stop working. If it isn’t loaded, ROP attacks can’t target it in the first place. If it is loaded, that’s probably because the binary needs it or it won’t work.
