What is a good function?

I’m writing an article together with a friend on what a good function is (in any programming language), and got curious what the OCaml community is thinking. So, what are your thoughts? What is a good function, and how do you keep it good?

Some aspects to consider:

  • State
  • Contract (post- and pre-condition)
  • Pure, referentially transparent
  • Testable, mockable
  • Documented
  • Name and how naming relates to the domain
  • Size
  • Number of arguments
  • How to behave in case of failure (assertions, exceptions, different return values)
  • Changeability
  • Empirical evidence for recommendations (e.g. correlation between size and fault density)

And so on. :slight_smile:

One draft of the article is available here: What is a good function?

If writing it followed the recipe presented by @sperber at 35c3 (see https://mro.name/a46hgkx), chances are it is a good one. A bonus: if it is pure.

P.S.: the recipe is 8 consecutive steps:

  1. make a short description (1 line)
  2. data analysis (what does it need)
  3. signature (name, what goes in, comes out)
  4. tests (what do I care about)
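A minimal sketch of how the four steps listed above could look in OCaml. The invoice-total task, the `line` type, and the name `invoice_total` are made up for illustration; they are not from the talk.

```ocaml
(* 1. Short description: compute the total of an invoice, in cents. *)

(* 2. Data analysis: an invoice line needs a unit price and a quantity.
   This record is hypothetical, chosen only for the example. *)
type line = { unit_price_cents : int; quantity : int }

(* 3. Signature: a list of lines goes in, a total in cents comes out. *)
let invoice_total (lines : line list) : int =
  List.fold_left
    (fun acc l -> acc + (l.unit_price_cents * l.quantity))
    0 lines

(* 4. Tests: the cases we care about. *)
let () =
  assert (invoice_total [] = 0);
  assert (
    invoice_total
      [ { unit_price_cents = 250; quantity = 2 };
        { unit_price_cents = 100; quantity = 3 } ]
    = 800)
```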

Slides: https://mirror.netcologne.de/CCC/congress/2018/slides-pdf/35c3-9800-how_to_teach_programming_to_your_loved_ones.pdf

Not easy to analyze how the function came to be. :slight_smile: Easier to see how it is right now.

I found this talk full of good advice on how to design good functions: Writing Quality Code in Erlang - YouTube

It’s ostensibly about Erlang, but I find the same principles are pretty universal, really.


Hm, no slides? Can you summarize the talk in a sentence or two, perhaps?

OK, saw it, but the end point is pretty trivial? Small is better.

Looked at after the fact, many solutions to difficult problems appear trivial once they have been found. That’s why they are solutions and not workarounds. Take E = m c^2. Or excavating Troy. Or planetary orbits. The nice thing is that trivial solutions are easy to adopt. You don’t need any fancy gear.

However, it’s often not trivial to make functions small.


A good function is one that does what it looks like it’s supposed to be doing. Good code in general is code that can be easily fixed or replaced when that day comes. Things that help toward these goals include:

  • evocative names.
  • comments explaining all the context that’s in the programmer’s mind at the time of writing the code. This should explain what the function is intended for.
  • understandability of the function’s code without jumping to other pieces of code. Using explicit arguments and not relying on external mutable objects helps. Types also help guarantee that function calls are correct without thinking too hard.
  • accompanying tests. Not only do tests help catch future regressions, but they also illustrate how to call the function.
  • avoiding unnecessary abstractions.
  • preferring familiar patterns over unfamiliar ones. Stay consistent with the project’s practices, and more generally with the language’s best practices.

None of this is specific to OCaml. OCaml just makes some of these properties easier to achieve than in some other languages.
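To illustrate the point about explicit arguments versus external mutable objects, here is a small sketch. The discount example and all names in it are hypothetical.

```ocaml
(* Harder to understand locally: the result depends on a mutable global
   that any other code may have changed. *)
let discount_rate = ref 0.10
let discounted_price_implicit price = price *. (1.0 -. !discount_rate)

(* Easier: everything the function needs is in its arguments. *)
let discounted_price ~rate price = price *. (1.0 -. rate)

let () =
  (* The test doubles as a usage example. *)
  assert (discounted_price ~rate:0.5 200.0 = 100.0);
  (* The implicit version only gives the same answer after the global
     has been set, which the call site does not show. *)
  discount_rate := 0.5;
  assert (discounted_price_implicit 200.0 = 100.0)
```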


This is one of the interesting parts, because it cannot be tested automatically. Only code-reviews can catch it, and maybe not even that is enough.

Is this possible? Often a function will apply on certain domain elements, and the knowledge about those elements might be in another file. Example: If you want to connect order with invoices and transactions, you might do this in the order module, but the definition of the invoice and transaction elements will be elsewhere.

Feel free to elaborate on this one, too. :slight_smile:

I think it’s not just about ‘small is better’ but more about the level of abstraction. Each function should deal with a consistent level of abstraction that makes sense for it. E.g. a function that combines data from two different REST API calls should not try to construct and make the HTTP calls directly. It should factor out the actual HTTP calls into helper functions and keep only the logic of combining the data.
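A rough sketch of that split, assuming two made-up endpoints for users and orders. The helpers `fetch_users` and `fetch_orders` are stubs returning canned data; in a real program they would be the only place that knows about HTTP.

```ocaml
(* Hypothetical types standing in for two REST API responses. *)
type user = { user_id : int; name : string }
type order = { order_id : int; buyer_id : int; amount_cents : int }

(* Low level: these helpers would build and send the HTTP requests.
   Here they are stubs so that the example stays self-contained. *)
let fetch_users () : user list = [ { user_id = 1; name = "Ada" } ]

let fetch_orders () : order list =
  [ { order_id = 10; buyer_id = 1; amount_cents = 500 };
    { order_id = 11; buyer_id = 2; amount_cents = 900 } ]

(* Higher level: only the logic of combining the data.
   Orders whose buyer is unknown are dropped. *)
let combine (users : user list) (orders : order list) : (string * int) list =
  List.filter_map
    (fun o ->
      List.find_opt (fun u -> u.user_id = o.buyer_id) users
      |> Option.map (fun u -> (u.name, o.amount_cents)))
    orders

(* The top-level function reads at a single level of abstraction. *)
let orders_with_buyer_names () = combine (fetch_users ()) (fetch_orders ())
```

Because `combine` takes plain lists, it can be tested without any network access.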


The following code is understandable without having to look elsewhere:

let print_invoice_summary (inv : Invoice.t) =
  printf "invoice #%s %s: %s\n"
    (Invoice_ID.to_string inv.id)
    (Date.to_string inv.creation_date)
    inv.description

It’s understandable thanks to the use of English identifiers. If we remove them, it becomes incomprehensible:

let f (x : A.t) =
  printf "invoice: #%s %s: %s\n"
    (B.to_string x.a)
    (C.to_string x.b)
    x.c

You may say nobody writes such code, and you’d be wrong.


This is about anything that’s more general than it needs to be now and in the foreseeable future. The “foreseeable future” is hard to determine, there’s no doubt about this. However, it’s often easier to make things more general or more abstract when the time comes. There’s often no need to make them abstract upfront “just in case”.

In OCaml, some abstractions are possible and even easy to write but can lead to code that’s harder to read. Some examples include:

  • creating higher-order functions that will be used only once in the application e.g. list_fold_left3 whose type would be ('a -> 'b -> 'c -> 'd -> 'a) -> 'a -> 'b list -> 'c list -> 'd list -> 'a (modeled after List.fold_left and List.fold_left2).
  • creating generic types that don’t need to be generic e.g. type 'key t = { id: 'key; creation_date: Date.t; description: string } instead of type t = { id: string; creation_date: Date.t; description: string }.
  • creating parametrized modules (OCaml functors) when that could be avoided.
  • creating badly named functions that are used only once and would benefit from being anonymous.
  • making a record type abstract (the module interface would expose type t instead of type t = { ... }) and providing a bunch of functions to access its fields, when accessing the fields directly would work just fine.
  • using jargon to describe data structures that don’t benefit from it. For example, calling something a “monad” when that is irrelevant to the user of the library and calling it a “wrapper” would be more insightful.
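As a small sketch of the generic-type item above, here are the two variants side by side. The field set is shortened, and the module names and the `summary` function are made up for the example.

```ocaml
(* More general than needed: the key type is a parameter that,
   in this hypothetical application, is only ever a string. *)
module Generic_invoice = struct
  type 'key t = { id : 'key; description : string }
end

(* The concrete version: less to keep in mind when reading. *)
module Invoice = struct
  type t = { id : string; description : string }
end

(* With the concrete, non-abstract record, fields are accessed directly
   and no accessor functions are needed. *)
let summary (inv : Invoice.t) : string =
  Printf.sprintf "invoice #%s: %s" inv.Invoice.id inv.Invoice.description

let () =
  assert (
    summary { Invoice.id = "42"; description = "books" }
    = "invoice #42: books")
```

If a second key type shows up later, the type parameter can still be added then.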

You’re making it easy for yourself by choosing fields that are very common, like id and date. :slight_smile: Let me yank some fields from our database that belong to the invoice table.

  • gateway_txn_id
  • pay_to_accept_quote
  • entity_id
  • delivery_small_print
  • tax_exemption_code
  • claim_tax_back

But yes, most of the others do make sense. Of course that’s because this domain is so well known by everyone. Other domains could be more obscure.


OK, yes, those are all good points.

The “foreseeable future” is hard to determine, there’s no doubt about this

Perhaps you can look at the previous 5 years of the code-base to get a picture of how it will evolve in the future. :slight_smile: Just a thought, I’m assuming there is research about this topic already, to track changes and then extrapolate future changes. You could also analyze change requests, and how they change over time.

As a counterpoint, the latter is a lot easier to understand at a glance, while the former requires at least a closer inspection to understand.

Maybe a good intermediate would be:

let print_invoice_summary (inv : Invoice.t) =
  let module I = Invoice_ID in
  let module D = Date in
  printf "invoice #%s %s: %s\n"
    (I.to_string inv.id)
    (D.to_string inv.creation_date)
    inv.description

which IMO makes it easier to see that the function simply converts its arguments to strings and prints them.

(Obviously, for such a simple function this might be overkill, but I’ve found such local module aliases to be of much help in more complex functions, where verbose module names can make the program harder to parse.)

:thinking: The definitions of I and D introduce a pointless indirection. They add cognitive burden and therefore I’m very much against this specific snippet.

Maybe for this particular (toy) example, but in a larger project, well chosen module aliases can help in forming a common language for concisely talking about the domain.

It likely comes down to coding-style preference. This feels closer to the Lisp-style development philosophy, where you write simple code to express complex concepts by extending your primitives to target the domain you’re working in. (In the case of Lisp this is more of a necessity, because you don’t have types as guard rails, but I still find it nice to adopt in OCaml projects.)
