Functors for dependency injection?

TL;DR this is gonna have some old war stories, and maybe those uninterested oughta just skip past it.

Ah, therein lies a tale. And I’ll start with the funny part, b/c it’s the part I like best. In the story below, there’s a point where Jim Colson (if I accidentally ran over him, I’d back up and run over him again, the fuckin’ plagiarist) took my code, broke it, open-sourced it, and forbade me to publish the original version (owned by IBM). I was so angry that when Paul Horn, the Director of IBM Research (and Colson’s “executive mentor”), called me up (by this time, it’s 2005) and told me I had to go clean up a ginormous WebSphere nightmare at a big custodial bank (turned out it was AIX, JVM, WebSphere, Portal Server, and customer code bugs; it took six months to hunt them down one by one), I got so goddamn angry I threw my IBM-issued laptop across my kitchen and destroyed it. But I’d already told him I’d go (b/c the alternative was quitting, and stupid me, I didn’t realize that that was the intelligent thing to do), so I went and cleaned up the mess. They never charged me for that laptop, and for all I know, my manager still has the busted hulk in his office.

(1) In C++, lots is possible using templates (which as I’m sure you know, were inspired by SML modules, “back in the day”).

(2) Even in C (hence in C++ also) you can use conditional compilation and linking to write a piece of code that uses some subsystem, and then run it in environments with different versions of that subsystem. It’s very commonly done. I have a (former) friend who wrote a complete SAN (storage-area-network)-on-IP system, and at every level from bottom to top, he used mocking of various kinds to allow him to test both subsystems and entire deployed systems in single processes, with pseudo-random event-clocks for reproducibility.

(3) But in Java, first you don’t have conditional compilation, and second you really don’t have “linking” in any real sense. And this was by design: you can look at the original writeups for Java from Sun, and see them boasting about the lack of #defines, linking, etc. Hell, they didn’t support #line directives, so even IF you wanted to write a preprocessor, you had to jump thru INSANE hoops to make it work (I know: I did it twice).

(4) But even worse, because Java was so damn BIG (and remember, we’re talking a time when a Gig was a LOT of memory on a server) you were forced to use “app-servers” which were meant to stay up for a long time. Even today, I’d bet that it’s rare to see toplevel programs written in Java: almost all Java code is “loaded/managed” by runtimes of some sort. Those runtimes enforce the linking, so you get what they give you.

All of this means that you really can’t use any of the tricks we’re used to from C to do mocking, or swap out implementations of some interface “at the last moment”. And since you don’t actually have a static modularity mechanism, you can’t use CLR/C++/ML’s tricks.

[I do take your point that C++ doesn’t have “interfaces”. So you don’t get the typechecking. But you -do- get the “visibility of names” and “access control” that you get with ML functors.]

So the runtime implementors started building DI as a way to get the value of those above techniques, for Java code.

OK, now here’s where I get on my soapbox. IT DIDN’T HAVE TO BE THIS WAY. AT THE TIME, it was clear that MSFT’s assemblies were a fine, fine approach. AT THE TIME (in fricken’ 2003), I wrote a paper[1] (with Corwin, Bacon, and Grove) and an implementation, applying it to Tomcat as a proof, that you could modularize Java servers in a way that allowed you to then apply the tricks above, without some dynamic insanity. But you see, technical correctness doesn’t matter in the Java world: at the time, Jim Colson and the “pervasive computing” idiots at IBM had built this thing called OSGI (R2) that was deeply broken, and they needed a solution. So they did two things: (a) took the implementation and modified it (breaking critical bits) to turn it into OSGI R3, which they open-sourced; (b) forbade me to open-source my implementation, since it was (of course) owned by IBM, and as senior architects, they could do that. It’s a feudal kingdom after all, not a well-run tech company.

I distinctly remember an email and conversation with Glyn Normington (then of IBM Hursley) who assured me that he’d done everything he could to convince them to keep static modularity, but well, them’s the breaks. Glyn went on to SpringSource, of course.

The upshot of this long tale is that Java still has no static modularity, not even of the kind that the CLR has enjoyed since its inception. So everybody uses DI, and DI is driven by runtimes that take various config-files (last I checked, it was XML) and build graphs of objects (“beans”, bleccch) in the runtime. It’s all dynamic, it all depends on the runtime getting things right, nothing is truly statically checked, and (haha) no wonder, to most programmers it’s a black box (as my friend at that streaming company described to me).

[1] Corwin, Bacon, Grove, Murthy: “MJ: a rational module system for Java and its applications”, OOPSLA 2003 (see dblp).
I don’t claim that this was revolutionary. C#'s assemblies were basically this. Rather, this was something that could be -done- in Java at the time. When you work in “the business” you try to propose solutions that can be implemented now, and don’t require boiling the ocean. That’s all this was.

3 Likes

Just did a quick read of libmonda. How would this be implemented without functors? First-class modules?

There’s another (historical) bit that might help explain. Java originally came out of “microservers” (intended to run in light-switches) but it rapidly became clear that its real niche was in transaction-processing servers. There had been a long-standing attempt to re-implement these, based on CORBA and C/C++ … which had failed miserably, and I mean -miserably-. One of the tenets of the entire school was the idea that you couldn’t extend the programming language: everything was done by IDL compilers and other add-ons. Interpreted config-files were necessary, since they could bridge the gap. And “frameworks” were all the rage.

An aside on frameworks: when architects [spit] propose frameworks, they’re telling you “here’s a set of nouns and verbs, and here are the rules for using them; OBTW, we provide no static checking that you’re obeying the rules: screw up and die a horrible death”. A framework is a substitute for a language, by a weakling who isn’t (wo)man enough to actually design a language (or extension, at least).

In large enterprises, this has a particular appeal, b/c it means that the armies of programmers you spent a lot of money to train can be repurposed to the next new technology, without having to learn a new language. [ignore that a sufficiently different framework -is- a new language … just ignore that.] So vendors learned “don’t design or offer new languages; instead, just offer new frameworks”.

Frameworks and config-file-driven initialization are almost-necessarily going to force you to runtime configuration. In the context of Java, which purposely (and bravely!) eschews any kind of preprocessing, linking, link-time control, etc, voila voila voila, you end up with something like DI. Because if “a new language” or even “a new extension to the current language” is off-the-table, and you can’t control the environment in which code executes statically at build-time, then runtime configuration is all you have left.

BTW, something else to think about: NOT ONE of the major commercial J2EE app-servers was ever open-sourced. WebLogic? Nope. WebSphere? Nope. There’s a good reason for that, even though at this point (and heck, 10yr ago) only suckers use these suckers. The code inside is horrendous. Just. Horrendous. Every nightmare you can imagine, and then some. And yet, the Java standards were and are driven to a great extent by “expert groups” convened from and funded by the big enterprise vendors. So the same people who built these monstrosities are the ones who come up with these various standards.

2 Likes

It’s not really like that anymore. In Android, all of this runtime configuration crap had to be thrown out the window, because reflection is far too slow and kills dead-code elimination. Both of these are quite important for resource-constrained environments. The end result is Dagger 2: the initialization of your object graph is now generated at compile time via annotation processing. As usual for Java, aesthetically, it’s a monster. But I’d like to point out that the Java world has moved a little forward since your last stint as an enterprise programmer a decade ago :wink:

Entertaining tales though.

I don’t know how first-class modules would help, but bigfunctors would solve this problem trivially.

Well, yes. Dalvik started off with “take a collection of classfiles and grind them down to a single executable”, right? IIRC the original Dalvik didn’t have classloaders, either. And Dalvik was the creation of a single company, a company known for (at the time) technical execution quality. Unlike Java and the Java standards. Is Dalvik today a fully J2SE-compliant platform? I’d guess “no”. [just to be clear: I don’t think this is an error on Dalvik’s part: quite to the contrary.]

This is precisely what I was wondering. Thank you for making it clearer!

I’ve just read up on these two (including your “Free Monads in the Wild” post) and it’s really interesting. I haven’t wrapped my head around tagless-final yet, but free monads definitely look like a nice tool to have on your belt. In the context of decoupling, is it correct that with free monads we would write different interpreters for different implementations (e.g. actual vs. mocks) operating on the same operations?
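To check my understanding, here is the kind of toy sketch I have in mind (a made-up key-value effect, not from your post): one set of operations, one program, and two interpreters, a mock and a “real” one.

```ocaml
(* A toy free monad over made-up key-value operations. *)
type 'a op =
  | Get of string * (string option -> 'a)
  | Put of string * string * 'a

type 'a free =
  | Pure of 'a
  | Free of 'a free op

let rec bind m f =
  match m with
  | Pure x -> f x
  | Free (Get (k, cont)) -> Free (Get (k, fun v -> bind (cont v) f))
  | Free (Put (k, v, next)) -> Free (Put (k, v, bind next f))

let get k = Free (Get (k, fun v -> Pure v))
let put k v = Free (Put (k, v, Pure ()))

(* One program, written once against the operations. *)
let program =
  bind (put "greeting" "hello") (fun () ->
      bind (get "greeting") (fun v ->
          Pure (Option.value v ~default:"<missing>")))

(* Mock interpreter: an in-memory association list. The "actual"
   interpreter would have the same shape, but do real I/O per case. *)
let rec run_mock env = function
  | Pure x -> x
  | Free (Get (k, cont)) -> run_mock env (cont (List.assoc_opt k env))
  | Free (Put (k, v, next)) -> run_mock ((k, v) :: env) next

let () = print_endline (run_mock [] program) (* prints "hello" *)
```

If that’s roughly the right picture, then I see how the program stays decoupled from any particular backend.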

I feel like this should be an obvious issue, but I’m still unsure; would you be able to point me to an example of how or where this might be a problem? And how quickly is “quickly” (i.e. is the actionable advice “avoid it at all costs” or “we can use it, but don’t go overboard”)?

1 Like

This twenty-year-old ICFP presentation by Benjamin Pierce makes the case quite well that a fully functorized style does not scale in practice (slides 45-48), and gives you a key insight into when to actually use a functor (slide 27-b).

5 Likes

The link doesn’t work for me. Could you post the URL directly? Thanks!

Best wishes,
Nicolás

Sorry, my bad markdown formatting. The link should be fixed.

That’s really informative, thanks! I think slides 45-48 indeed explain the problem really well. As I suspected, the problem is actually pretty obvious when you think about it (assuming this is the same problem @rgrinberg mentioned). If I understand correctly, this problem relates to signatures, and how fully functorized signatures may introduce coupling between signatures in the dependency hierarchy.

Relatedly, I have questions about the alternative mentioned there, parameterizing on types instead of modules: does it solve the main problem? I believe this is also available in OCaml via sharing constraints.
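To make sure I’m asking the right question, here is what I understand by the two styles (a toy set example, names made up):

```ocaml
(* Functorized style: the element type is abstract inside a module,
   and downstream signatures must share it via constraints. *)
module type ELT = sig
  type t
  val compare : t -> t -> int
end

module MakeSet (E : ELT) = struct
  type elt = E.t
  type t = elt list
  let mem x s = List.exists (fun y -> E.compare x y = 0) s
end

(* Parameterized-on-types style: ordinary polymorphism, with the one
   needed operation passed as a value. No sharing constraints arise. *)
type 'a set = 'a list

let mem compare x s = List.exists (fun y -> compare x y = 0) s
```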

1 Like

Hmm … do you mean this dune feature https://dune.readthedocs.io/en/stable/variants.html ?

It seems like the main problem with functors is their ability to introduce arbitrary types into the environment; this is what causes the combinatorial explosion. However, with interface discipline (limiting the number of new types introduced at functor application), it seems like scaling wouldn’t be such an issue. Am I wrong?

To give an ideal example, a series of functors A(B(C(D(E)))), where each one consumes a type and introduces a type, would not be difficult to deal with.
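Concretely, something like this (toy types, left unsealed for brevity):

```ocaml
(* Each functor consumes one type and introduces one, so the chain
   composes linearly instead of exploding. *)
module type T = sig type t end

module E = struct type t = int end
module D (X : T) = struct type t = X.t list end
module C (X : T) = struct type t = X.t option end
module B (X : T) = struct type t = X.t array end
module A (X : T) = struct type t = X.t * X.t end

module Chain = A (B (C (D (E))))
(* Chain.t = int list option array * int list option array *)
```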

Also, it seems like a ppx that assumes certain sane conventions for functors and applies boilerplate automatically could be really useful. Not sure what those conventions would be, but presumably people using them more heavily would have a better idea. I should try to write up a project with functors everywhere and see if I can come up with guidelines.

1 Like

Not to beat a dead horse, but the explosion of sharing constraints isn’t a reason to think that Java’s DI is better. Java’s DI is as if you defined all the types separately, in one global module, and then just -used- them in the modules/functors. And of course, two instances of a generative type in ML collapse down to a single type in the Java case.
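To make that concrete (toy names, nothing from the paper):

```ocaml
(* Two applications of a generative functor yield two distinct,
   incompatible abstract types. *)
module MakeId () : sig
  type t
  val make : int -> t
end = struct
  type t = int
  let make n = n
end

module UserId = MakeId ()
module OrderId = MakeId ()

(* let bad : UserId.t = OrderId.make 3    <- type error in ML.
   In Java DI, both "instances" are wired against one global class,
   so the two types collapse into one. *)
```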

You are undercounting the cost of using abstract types in functors. Consider functorizing an application over some promise-like monad 'a io. Now, the signature of any functor that consumes or produces such a value must include an abstract 'a io type. Any time you write a functor Make (A : sig type 'a io end) (B : sig type 'a io end), you must introduce an 'a A.io = 'a B.io equality. That’s simple enough with 1 abstract type, but it will quickly get out of hand if you structure your application like this.
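A sketch of how that grows (hypothetical names, with a second abstract type added to show the multiplication):

```ocaml
module type S = sig
  type 'a io
  type error
end

(* Every later parameter must be equated with the first, and every
   extra abstract type multiplies the number of equalities needed. *)
module Make
    (A : S)
    (B : S with type 'a io = 'a A.io and type error = A.error)
    (C : S with type 'a io = 'a A.io and type error = A.error) =
struct
  (* ... *)
end
```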

Writing signatures for such functors is also a pain because of all the sharing constraints. If you go ahead with your experiment, you should find that not writing any mli’s makes things easier, but at the cost of hideous type errors.

Code generation (via ppx or otherwise) can indeed solve this issue. See functoria.

4 Likes

It seems functoria is deprecated. Is mirage moving towards using dune’s virtual_modules feature?

AFAIK Functoria is very much not deprecated. It just moved over to the Mirage repo.

1 Like