What is a good function?

That’s a matter of taste. Alignment improves legibility; we all use it all the time.

If it improves legibility, it’s great. That’s all I was trying to communicate.

How’s this different from any other language?

It’s more a mindset shift rather than any kind of technical shift, so it’s hard to describe exactly.

When coding in OCaml normally, I don’t often think too much about whether a given piece of code “reads” like human language or not, whereas in lisp, this is more critical.

For example, the following might be more of an OCaml style of code:

let update state = 
    let time = Time.now () in
    List.iter (fun file -> file.age <- time) state.files

Which (for this example) is simple enough, but doesn’t necessarily read like a human language.

If I were writing this in lisp, I would write it as follows:

let update_file_age file ~to_ = file.age <- to_
let for_each ~in_ f = List.iter f in_

let update state =
    for_each ~in_:state.files (update_file_age ~to_:(Time.now ()))

While possibly more verbose than the original, IMO the above code can be easier to mentally check, because it reads like a sentence in English.

Admittedly this seems like a pretty simple idea, but I found that it was pretty much crucial for being able to practically write lisp code. Without types to help you, it’s very easy to make mistakes if you forget the intended behaviour of a particular snippet of code, so you want things to be as simple to understand as possible.


This is a really interesting perspective to me. Can you say more? Is it just about H-M typing vs dynamic typing, or is there more to it?

Hm, who are you replying to? Gopiandcode?

Sorry that wasn’t clear. Yes, I find the idea of the OCaml vs Lisp mindset intriguing. Not simply that different programming languages/paradigms might align with different ways of thinking (a common argument) but specifically the links to naming and typing.

I’ve been thinking about writing a blog post about this (my experiences getting to grips with lisp as an OCaml/Haskell programmer) but haven’t had the time to fully organise my thoughts yet, so here are a few preliminary words on it.

I don’t think this comes down to HM-typing/dynamic typing (I hope not, as I’d like to see some of the flexibility of macros in OCaml at some point (MetaOCaml?)), but more that the entirety of lisp encourages people to develop in a certain way.

Alongside the dynamic typing (which something like python has as well), I find that the syntax also encourages this style, for two reasons in particular:

  • Parens are noisy - I’ve found online that many say that the parentheses in lisp are an acquired taste.

    Now while I do think they become more palatable over time, I believe that the syntactic noise the parentheses impose places some inescapable constraints on the complexity of functions that you can easily write.

    In particular, because deeply nesting expressions causes parentheses to pile up, in order to write maintainable lisp code, developers are implicitly encouraged to encode their problem in as simple a form as possible (i.e. no deeply nested expressions).

    To allow programmers to do this in a manageable way without giving up expressiveness, lisp provides excellent macro support.

  • Code is data - This is probably the most commonly cited “difference” between lisp and other languages: you can manipulate the program AST as easily as you manipulate standard data structures in the language.

    The main benefit of this is that it’s easy to mould the language to your taste, and once you realise this, it can be incredibly empowering.

    To give an example of such moulding of the language, here’s a problem that I ran into when I was first starting out.

    In particular, often I would find myself in a situation where I wanted to check a condition that was a conjunction of several clauses, where subsequent conditions may use the results of prior ones:

    (let ((res (possibly-failing-comp)))
      (and res
           (possibly-failing-comp2 res)
           ... ;; more conditions using res
           ))

    So, just to handle this dependency, rather than using a monad as in OCaml, I had to add several program constructs, which would quickly pile up, leading to quite messy code.

    Naively, because I was so accustomed to languages with fixed syntax, I just internalised this annoyance as one of the “costs” of using the language, and for a long time wrote code with this construct, struggling to understand why people would praise lisp.

    This all changed once I found macros (in this case, the concept of anaphoric macros), and I realised that these limitations were not “a part of the language”, but could easily be circumvented.

    (a-and (possibly-failing-comp)
           (possibly-failing-comp2 it)
           ...)

    Anaphoric-and essentially implements a monadic bind, using the symbol it to refer to the result of the previous computation, making it simple to write these kinds of dependent conditions idiomatically.

    In this case, there was a particular generic macro concept that captured my usecase, but more generally, whenever I have a particular program construct that I repeat often (and thus want to make part of my domain language), I create a quick custom macro for it.

    The ease of writing macros means that this style of creating a DSL to represent your domain is encouraged.

As an example of how my lisp coding style has changed as I’ve adapted to writing ergonomic lisp code, below I’ll show two snippets of lisp code that I’ve written in the past - one from the start of my journey, and the other from the end:

Unlispy Lisp code:

(defun histogram/get-word-count ()
  (save-excursion
    (beginning-of-buffer)
    ;; string wordmap
    (let ((wmap (make-hash-table :test 'equal))
          (end (save-excursion (end-of-buffer) (point))))
      (while (< (point) end)
        ;; move forward to word
        (re-search-forward "\\w+\\W*" end t)
        (let* ((text-start (match-beginning 0))
               (text-end (match-end 0))
               (word (s-downcase (s-trim (buffer-substring-no-properties text-start text-end)))))
          (--> (gethash word wmap 0) (puthash word (+ it 1) wmap))))
      wmap)))

The above code calculates a histogram of words in the current buffer.

I wrote this when I was first learning lisp, and wrote the function almost exactly like I would have written it in OCaml. The program creates a local let binding to a hash map, and then loops over the words in the buffer and adds them to the table.

For the sake of the readers, here I’ve chosen a small function that isn’t too hard to read, but even in this small snippet, you can start to see some parts where I make use of quite deeply nested expressions that aren’t easy to parse (such as (s-downcase (s-trim (buffer-substring-no-properties text-start text-end)))). These kinds of expressions would have been perfectly fine in OCaml, but in Lisp, the lack of typing plus the syntactic noise means that they can quickly become challenging to handle.

Lispy lisp code:

(defun interactive-utop--handle-spawning-child (_process output)
  "Handle OUTPUT received from_PROCESS when spawning a child."
  ;; we're waiting for utop to get back with the spawned child pid
  (interactive-utop--append-output output)
  (let ((lines (split-string interactive-utop-output "\n"))
        line child-pid)
    (interactive-utop--for-each-complete-line line lines
      (interactive-utop--split-line-into line command argument
        (pcase command 
          ("stdout"
           ;; due to way that __interactive_utop_spawn_child is
           ;; implemented, parent will print out child's pid first
           (cond
            ((string-equal argument "- : unit = ()") nil)
            (t
             (unless child-pid (setq child-pid (string-to-number argument)))
             (interactive-utop--set-idle-state))))
          ("accept" nil)
          ("prompt" nil)
          (_
           (interactive-utop--handle-unknown-command command argument)))))))

This code is more recent, and was part of a small package to provide coq-style interactivity to utop.

The code receives output from the utop process, splits it into lines, each of which contains a command and argument, and then performs some behaviour based on the type of command.

Notice how the code reads almost like human language.

For example, interactive-utop--for-each-complete-line and interactive-utop--split-line-into are two macros that I created to abstract over some common patterns that were introducing excessive noise into the program (in particular, they hide a while loop with a temporary variable, and a regex match and extraction).

In this way, even though this program might be more complex than the histogram one, it is far easier to parse as a human, because each line/expression explains itself.


lol? That’s your brain removing noise for you subconsciously. ^^

I feel like you made so many strong claims that I want to answer.

Not sure about your brain, but mine can definitely read the first code faster than the second one.

It’s like reading any infix operator: you start from the middle if you want to put it in imperative terms (c = a + b: add a to b and put it in c), or you read from left to right in declarative terms (c is equal to a plus b). Lisp just follows another word order, where the subject comes last, similar to Verb–object–subject word order - Wikipedia

That doesn’t mean one is more readable than the other or that one is more natural than the other, just that one is more familiar to you than the other.

Or in a more OO manner you could do

let update state = 
    let time = Time.now () in
    state.files |> List.iter (fun file -> file.age <- time)

Which reads in a more SVO way. I personally prefer it like this, but it’s not about being similar to human language.

Yes, and I strongly disagree here: non-nested expressions don’t make things “simpler”, it just means they’re in separate places, which is the opposite of what I want. If a function F is just a helper to G, it is a lot better to keep it inside of G.

Coupling is not fundamentally bad, and I would argue that in general coupling is actually quite positive, as it reduces cognitive overhead. But there are trade-offs: coupling is not flexible.

Well, you can manipulate the OCaml AST as easily as any other data structure; it’s just that the structure is more complex, which shows the problem is a lot deeper. It’s not about the ease of manipulating the structure, it’s about the simplicity of the structure and the tools to do that - but nothing fundamental.

And now, for someone to learn how to use this macro, they need to “look at the documentation” (which they will not do), “read the implementation” (which means the macro is a bad abstraction), or “copy the code from somewhere else” without understanding it (which is the worst case).

This is another abstraction to learn, without any good guidance, probably without good documentation, and one that allows a lot of hacking. The thing with monads is that as soon as you know how one monad works, all of them work similarly, and you can be guided by the type system.
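
For instance, in OCaml the dependent-conditions pattern from the a-and example above is just the option monad with binding operators (available since OCaml 4.08). Here is a rough sketch - the possibly_failing_* functions are hypothetical stand-ins, not code from this thread:

(* Each let* step may use the previous result; any None
   short-circuits the rest, much like the anaphoric and. *)
let ( let* ) = Option.bind

let possibly_failing_comp () = Some 1  (* hypothetical *)
let possibly_failing_comp2 x = if x > 0 then Some (x + 1) else None

let check () : int option =
  let* res = possibly_failing_comp () in
  let* res2 = possibly_failing_comp2 res in
  Some res2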

There is a good reason for that: while the code “looks” like English to you, it’s not. Code is not human language, and as such making it artificially look “human” is actually a “false friend”. Code is a lot more strict, and that is intentional: we need to communicate with others, and it is not reasonable to expect someone to learn 1000 different ways to say something, as happens with human language. The only reason that works for human language is that kids are incredible at learning.

Overall, there is a reason why programming language design has spent a lot of time finding ways to constrain what the user can do, and why languages like Go exist (which I think is the wrong extreme).

In general I don’t know what a good function is, but I would not call either of your “lisp” examples a good function. While the second is better, that is not because it is good; it’s just that, as you mentioned, you wrote the first one while learning. Personally I wouldn’t write either of those.

A function that does only one thing, but does it well.


Maybe there’s been a misunderstanding here - as far as I was aware, I wasn’t making any objective claims, but rather explaining my experience learning how to write idiomatic lisp, and some takeaways from that experience.

To be clear, my post was responding to @cjr’s query about the differences in style between HM-typing versus dynamic typing:

At no point was I trying to make a judgement on which language is better, but rather to expound on the style of coding that lisp encourages - this wasn’t immediately clear to me at first, so I thought it would be helpful to discuss.

A lot of the responses seem to be along the spirit of “defending” an OCaml style of coding over a lisp one, suggesting that my post was received as trying to say OCaml < Lisp, which couldn’t have been further from my intention.

For the record: OCaml is my favourite programming language - and I’m not so brazen of a troll to go to one of the main OCaml discussion forums and claim that some other language is “better” than it.

That being said, let me respond to a few points.

Yes, mine too. I think my choice of example might not have been the best here, because the original was certainly simpler than the second version, and I probably wouldn’t suggest replacing standard combinators like iter and fold with “human sounding” variants.

I think a better summary of the point I was trying to make is: because writing functions is so easy in OCaml, it’s easy to come up with functions that do too many things at once and don’t have a cohesive explanation in terms of the domain logic. Obviously, sometimes this is unavoidable, but I find that the structure of code improves if one strives to keep functions cohesive.

In the case of lisp, the noise from the parentheses means that if you start writing functions like that it quickly sticks out like a sore thumb, and so I find that I am encouraged to simplify my functions, while in OCaml, the lack of noise (which is a good thing generally) means that I don’t have that same pressure.

Hence, since learning lisp, I’ve now adopted the habit of periodically reviewing my OCaml code and checking if I have made functions that stray from this path.

This is one place where I would objectively say that Lisp has a leg up on OCaml - the macro story for OCaml versus lisp is worlds apart. In lisp, you can write macros just as easily as writing another function, while in OCaml, adding a macro requires a significantly greater degree of effort: creating a dune file, learning the ppxlib api (which has pretty poor documentation, and requires scouring blog posts to get to grips with), and so on.

Granted, as you say, none of these are “fundamental” to the language, but they are effectively a part of the experience of using OCaml today, and they culminate in making users less likely to reach for macros unless they absolutely need to (this may even be a good thing, given the difficulty of reasoning about the semantics of macros).

Again, I’m not trying to make a judgement between the two, or trying to say that lisp’s macros are better than OCaml’s - in fact, I do prefer monads/lenses/etc. with their type support over the macro imitations that I find in lisp.

I believe this is too superficial a reading of my original point (admittedly a reading probably encouraged by my poor choice of examples and wording). Obviously, code is not English, and human language is ambiguous and a poor substitute for a programming language. My point was not that we should write code so that it can literally be read as a human language, but rather that we should approach writing code and creating abstractions from the explicit perspective of writing a DSL to express our semantics. In particular, when writing a function, I would suggest considering whether it makes sense in terms of the domain, and splitting it into multiple functions if it doesn’t. Code in this form might not read literally like a human language, but it reads abstractly like one, with the functions acting as words of the domain, and the program as a sentence.
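
To sketch what I mean, here is a made-up OCaml example - the domain and all the names are hypothetical, invented purely for illustration:

type invoice = { amount : float; paid : bool }

(* Each helper is a "word" of the domain... *)
let outstanding invoices = List.filter (fun i -> not i.paid) invoices
let total_of invoices = List.fold_left (fun acc i -> acc +. i.amount) 0. invoices

(* ...so the top-level function reads abstractly like the sentence
   "the amount owed is the total of the outstanding invoices". *)
let amount_owed invoices = total_of (outstanding invoices)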

While those code samples are still recent enough that I wouldn’t feel comfortable throwing them under the bus completely, they weren’t intended to be examples of the best possible lisp code - I intended them primarily to show what I consider the difference between lispy and non-lispy code.


Cross-cutting concerns. Also, it’s not always easy to define what a “thing” is. Or even “one”.

If you don’t understand what that means, then know that many unix command-line tools follow this principle.


A view I haven’t seen represented here yet, and which isn’t at all novel but probably worth considering: an operator in an algebra that correctly and elegantly formalizes some of the essential structure in the problem domain is a good function.

This pushes the problem back into the qualifiers “correctly” and “elegantly” applied to the formal representation of the system, which I take to be the right direction to push. Good functions end up being the operators of good algebras. (For that latter question, I like the way Joseph Goguen approaches the problem in terms of semiotic systems, but the more current work in this direction seems to go under the banner of applied category theory?)
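
To make that slightly more concrete, here is a minimal OCaml sketch (the names are my own, not from any particular library): a monoid, whose single operator is a “good function” precisely because it is constrained by laws rather than by ad-hoc intent.

module type Monoid = sig
  type t
  val empty : t
  val combine : t -> t -> t
  (* laws: combine is associative, and empty is its identity *)
end

(* One non-mathy instance: merging shopping carts of item ids. *)
module Cart : Monoid with type t = string list = struct
  type t = string list
  let empty = []
  let combine = ( @ )  (* list append is associative; [] is its identity *)
end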

Note that I don’t mean to suggest it is easy or trivial or even always possible to derive these kinds of representations. (I especially don’t mean to suggest I’m good at this.) There’s also the thorny question of how we cope with inherently non-compositional problem domains, as we’re likely to encounter in interactive and distributed systems. Perhaps functions aren’t the tool for this kind of job?

Questions around tests and documentation are very important for code quality and development practice. But it seems that those questions can be factored out and dealt with separately from the quality of the functions that they verify and explain (respectively).

I suspect that once you’ve tended to the formal and systemic issue, questions like length (in terms of LOC), purity, testability, and number of arguments are mostly neutralized.

Questions of naming and general legibility persist. I think this boils down to the profound and inescapable problems of communication, and I suspect they are inherently contextual and relative. Different communities have different common ground and use different idioms, and the best communications will be those that are responsive to the needs of the community. I’d imagine most of this part overlaps with practices around effective writing generally.


Nobody said it would be, I guess. Easy is always hard.


Yes yes, but it’s domain specific. Is “processOrder()” one thing? Or “installApplication()”?

Could you give an example of this approach? In a non-mathy domain, preferably. :stuck_out_tongue:


Both - you can have a good function at any level of granularity.
Look: even main(), the most top-level function of a program, is a good function.
It does one thing (launch the program, then return its exit status).

@shonfeder - Is this really about functions? Or signatures? When I’m focusing on operators and compositions, I’m working with my .mli file. But my .ml file may have a bunch of helper functions that are entirely internal and aren’t directly related to the algebra that I’m working on. Are helper functions “bad” functions?
