Actual Performance Costs of OOP Objects

I know @bluddy started this discussion about the performance of OOP, but the discussion leading up to it was about a more expressive I/O layer in OCaml. OOP is a possible answer to that, but not necessarily the only one.

To focus on the I/O problem, it seems to me:

  1. If we used OOP for this, in many cases, the overhead of a method call would be completely dwarfed by the cost of performing I/O.
  2. Most common uses of I/O pull out, or put in, fairly large chunks of data from the channel to avoid making too many calls to that layer, even when buffered, and even when the underlying structure is an in-memory string. So the cost of a method call, again, seems unlikely to be a large overall cost in the program.
  3. There are various programmer sentiments that, rightly or wrongly, will affect one’s reaction to the I/O layer of OCaml being objects. Personally, I have a possibly irrational distaste for objects (in general I don’t like subtyping, as I believe it makes programs harder to understand), so a module-based implementation “feels” better to me even if it uses objects underneath. I have my own I/O library that I use for async code, implemented similarly to what @talex5 has done, except I use a record of functions underneath a module that handles the buffering.
  4. If the primary argument against a low-level OO layer with a Buffered module (and other modules) above it is that, for an in-memory representation, we end up paying a double-copy cost, that might be a reasonable price for an interface people like a bit more.
  5. I’m not sure how much any of this matters given that a lot of OCaml code lives in async frameworks, which have their own I/O primitives that, I don’t think, would work with this interface, since the types would be different.
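To make point 3 concrete, here is a minimal sketch of the pattern I described: a record of functions as the raw layer, with a module on top that handles buffering. All of the names (`raw_input`, `read_into`, `Buffered`) are illustrative, not from any existing library, and the "raw" source is just an in-memory string for demonstration.

```ocaml
(* The raw layer: a record of functions rather than an object.
   [read_into buf pos len] reads up to [len] bytes into [buf] at
   offset [pos], returning how many were read; 0 means end of input. *)
type raw_input = {
  read_into : bytes -> int -> int -> int;
}

(* A raw source backed by an in-memory string. *)
let of_string s =
  let off = ref 0 in
  { read_into = (fun buf pos len ->
      let n = min len (String.length s - !off) in
      Bytes.blit_string s !off buf pos n;
      off := !off + n;
      n) }

module Buffered = struct
  type t = { raw : raw_input; buf : Buffer.t }

  let create raw = { raw; buf = Buffer.create 4096 }

  (* Pull data from the raw layer in large chunks, so each dispatch
     through the record is amortized over many bytes (point 2 above). *)
  let read_all t =
    let chunk = Bytes.create 4096 in
    let rec loop () =
      let n = t.raw.read_into chunk 0 (Bytes.length chunk) in
      if n > 0 then (Buffer.add_subbytes t.buf chunk 0 n; loop ())
    in
    loop ();
    Buffer.contents t.buf
end

let () =
  let r = Buffered.create (of_string "hello, world") in
  print_string (Buffered.read_all r)
```

The record plays the same role a method table would, but the visible interface stays module-shaped.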

One final thought: I don’t know the cost of looking up a function through a first-class module, but this strikes me as possibly a good compromise, in that first-class modules always give you an escape hatch when performance matters. That is, imagine a Buffered_stream module backed by some kind of I/O layer built on a dispatch table, and it turns out you’re mostly doing I/O on an in-memory representation and that is too slow. For that performance-sensitive code, you could pass in a module with your reduced-copy implementation matching Buffered_stream’s signature, and pass in the standard Buffered_stream implementation for non-in-memory situations. Yes, it’s more verbose, but the amount of code that needs this is probably quite low, so maybe that is a fair balance.
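A minimal sketch of that escape hatch, under the assumption of a hypothetical `BUFFERED_STREAM` signature (none of these names come from an existing library): generic code takes a first-class module, and the caller chooses which implementation to pass in.

```ocaml
(* A hypothetical signature that both the standard buffered stream and
   any specialized implementation would satisfy. *)
module type BUFFERED_STREAM = sig
  type t
  val read_line : t -> string option
end

(* Generic code written against the first-class module: callers pick
   the implementation, so a reduced-copy one can be swapped in on the
   hot path without changing this function. *)
let count_lines (type a)
    (module S : BUFFERED_STREAM with type t = a) (stream : a) =
  let rec loop n =
    match S.read_line stream with
    | Some _ -> loop (n + 1)
    | None -> n
  in
  loop 0

(* An illustrative reduced-copy, in-memory implementation. *)
module String_stream = struct
  type t = { s : string; mutable pos : int }
  let of_string s = { s; pos = 0 }
  let read_line t =
    if t.pos >= String.length t.s then None
    else
      let j =
        match String.index_from_opt t.s t.pos '\n' with
        | Some j -> j
        | None -> String.length t.s
      in
      let line = String.sub t.s t.pos (j - t.pos) in
      t.pos <- j + 1;
      Some line
end

let () =
  let s = String_stream.of_string "a\nb\nc" in
  Printf.printf "%d\n" (count_lines (module String_stream) s)
```

The verbosity lives at the call site (the extra `(module ...)` argument), which matches the trade-off above: only the small amount of performance-sensitive code pays it.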
