In code generation tasks, particularly where one is working with mutually recursive definitions, an option for forward declaration would be nice. I usually end up trying to sort things in dependency order as much as possible, but I inevitably end up with a big ball of let something ... and ... and somewhere near the end.
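For concreteness, here's a minimal sketch of the kind of block I mean (the emit_expr/emit_stmt names are invented, not from any real generator): whatever can't be sorted into dependency order gets lumped into one mutually recursive group.

(* Hypothetical generated code: the definitions that couldn't be ordered
   end up in a single "let rec ... and ..." group. *)
let rec emit_expr n =
  if n = 0 then "()" else "(" ^ emit_stmt (n - 1) ^ ")"
and emit_stmt n =
  if n = 0 then ";;" else "[" ^ emit_expr (n - 1) ^ "]"

let () = print_endline (emit_expr 3)   (* prints "([(;;)])" *)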
Some other languages that require this order:
- Idris
- Clojure, but you can declare names at the top of the file, and then definitions of the declared names can appear in any order.
For Clojure, I’ve seen it said that the standard order speeds up the compiler a little bit, since there is only one pass through the file. I’ve also seen Clojure programmers defend it for reasons given above: you always know that the functions defined in terms of others are at the end.
Sometimes I follow the default order in Clojure, but sometimes I prefer to put the most important definitions first (which might be the ones that use all of the others), as @grayswandyr suggested. The reason is that the first definitions can then give someone reading the code an outline of the structure of the computations to follow. This can be very important to someone else who needs to understand your code for the first time (or for me a couple of years later).
When I don’t put the most important definitions first, sometimes I add a very prominent comment at the top, recommending that one begin reading with such and such function definition. If definitions are in dependency order, i.e. defined in terms of ones above, that most-informative function might not be at the end of the file.
In OCaml I tend to think of interface files as fulfilling the role of forward declarations and, when possible, declaring the more important items earlier.
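For instance (a hand-written sketch; foo.mli/foo.ml and the values are made-up names, not from any particular project): the interface can present the important entry point first, even though the implementation has to define its helper before it.

(* foo.mli -- hypothetical interface: the main entry point is declared first. *)
val run : string -> int
val parse : string -> string list

(* foo.ml -- the implementation must still appear in dependency order. *)
let parse s = String.split_on_char ' ' s
let run s = List.length (parse s)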
I agree, code generation is a place where OCaml’s ordering constraints can be a massive pain. If you’re generating code from some format where order doesn’t matter, you end up having to implement a strongly-connected-component analysis, where other languages, like Rust, can emit code in a straightforward way.
It’s also a pain when one wants to have many types that are mutually recursive, and a real blocker if you want a type that is mutually recursive with a set of itself, or something like that[^1].
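To make the first half of that concrete, a small made-up sketch: mutually recursive types have to be declared together in one type ... and ... group, so a generator can't simply emit each type wherever it happens to be convenient.

(* A sketch: these two types refer to each other, so OCaml requires them to
   be declared in a single group rather than at separate points in the file. *)
type expr =
  | Lit of int
  | Block of stmt list
and stmt =
  | Expr of expr
  | If of expr * stmt * stmt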
It made sense for OCaml, and it’s consistent with having a REPL, but languages tend to be less order-dependent these days.
[^1]: please don’t say “recursive modules”.
Isn’t mutual recursion by default a different matter from “postponed” local definitions (which is what where is used for)?
Thanks for that detailed feedback
I’ll surely have a look at it, since it falls into the category of languages that “make you think differently”, so fun to play with.
To me, I feel the inverse is true. Reading the file top-down is akin to reading a book. You can only go as far as the knowledge you’ve actually acquired. If you skimmed too quickly, go back a step or two. Reaching the end, you “should” have understood everything.
Thanks! Interesting regarding Idris, since I understood it inherited quite a few Haskell idioms.
So I note that code generation can be painful (I have yet to play with this). In the other cases, I feel that restructuring one’s code may be an option to consider (feeling pain → time to refactor)
Basically all “scripting” languages? Python, Ruby, Clojure come to mind. Of course you could argue that they don’t do compilation, but e.g. in Python code does get compiled to bytecode so in many ways it is not that different from OCaml bytecode compilation.
You can even have fun errors about reading variables that you write to later, because the scoping rules are, well, not very good.
Even if this is true about Python globals (and I’m not sure it is so simple even in this case), because Python is OOP, most code is in class methods, and those definitely can forward-reference other methods.
Only in a statically typed language forum would someone call Clojure a “scripting language”! Clojure is definitely compiled (to the JVM, to JavaScript, and a few other targets), even though it’s not statically type checked.
I go back and forth between dynamically typed and statically typed ways of thinking. The other side always looks different to me from the point of view of the one perspective.
Ruby, Python and JavaScript definitely don’t. For instance, executing this Ruby code from a file will run fine and print “OK”.
def func1
  func2
end

def func2
  "OK"
end

puts func1
But this is not unlike most (?) compiled languages as far as I know. As an example, this Go code will compile and run fine:
package main

import "fmt"

func main() {
    fmt.Println(myFunc())
    fmt.Println(myGlobalVar)
}

func myFunc() string {
    return "OK"
}

var myGlobalVar = "OK2"
From what I understand, the thing that is resolved “statically” (during bytecode compilation) in Python is scope. So when the code is compiled, the bytecode compiler has to know which scope an identifier belongs to. But it does not need to know the actual definition, nor whether it actually exists. For instance this is fine in Python:
def f():
    g()
The bytecode compiler will happily generate some code for f that will, when invoked, look in the global scope to see whether g is bound to something. So you can define g above f, below it, or not at all. And if it’s not defined, f itself can alter the global scope:
def f():
    f.__globals__['g'] = lambda: print("I exist now!")

# g() called here would throw an error: 'g' is not defined
f()
g()  # prints "I exist now!"
What fun…
I don’t mean this to sound all pedantic, but … I think it might be worthwhile to think about this in terms of “what is the scope of a name?” Different languages make different choices about this, and sometimes they choose differently in different contexts. The question of “the visibility of names” is a … primordial one, in that it was heavily discussed and debated early on, but once these various modes of visibility were established, language designers would make these choices very, very, very early on.
- the original LISPs chose that a name was visible everywhere in the call-tree descending from its definition site. This was called “dynamic scoping”.
- later, languages like Scheme (and ML-family languages) chose that a name was visible in a structurally-defined textual scope: for local definitions, it was the body of the block within which the definition was made visible (“let … in …”), and for toplevel definitions, it was “the rest of the compilation unit”.
- O-O languages chose that a name was visible within the “scope” where it was defined: a “scope” could be a class (for instance). And, I guess, from there it became natural for names to be visible within an entire compilation unit, treating the compilation unit as a “scope”.
[obviously OCaml’s “objects” use the same scoping rules as O-O languages, but set that aside]
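(For what it's worth, a quick made-up sketch of that aside: inside an OCaml object, methods can refer to each other regardless of the order in which they're written.)

(* Hypothetical class: double refers to value even though value comes later. *)
class counter = object (self)
  method double = self#value * 2
  method value = 21
end

let () = print_int (new counter)#double   (* prints 42 *)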
Notice how Python’s decisions about visibility mean that when you define two functions in the toplevel, viz:
def foo(n):
    return bar(n)

def bar(n):
    return n
the definition of foo is incomplete until bar is defined. This is (again) a primordial issue: what is the supporting context needed to define the meaning of a bit of code? When names are visible within entire scopes, that context is the entire scope. When names are visible only in what follows in a compilation unit, that context is … “everything prior in the compilation unit”.
Very interesting point, thanks. “Scoping rule” does put a name to the behavior I’m describing.
So at the extreme end of the spectrum, dynamic scoping allows this kind of code to run just fine:
# Ruby
class Hello
  def foo
    "FOO"
  end

  def bar
    i_am_bogus(1, 2, 3, 4, 5)
  end
end

hello = Hello.new
x = hello.foo
puts(x)
// JavaScript
const Hello = {
  foo: () => "FOO",
  bar: () => i_am_bogus(1, 2, 3),
}

function i_am_bogus_too() {
  return wat();
}

x = Hello.foo();
console.log(x);
At the other extreme we have languages like OCaml, Haskell, Go, Java, Clojure, etc. that implement “lexical scoping” rules. But OCaml, by allowing shadowing at the top level (as noted by @bcc32 - thank you), imposes a stricter constraint on the programmer.
I have yet to understand what the top level is exactly, but I think I get it now. Please feel free to correct me if I’m wrong.
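A minimal toy sketch of what I understand that to mean (my own example, so take it with a grain of salt):

(* The second "let x" shadows the first for everything below it, but f keeps
   referring to the binding that was in scope where f was defined. *)
let x = 1
let f () = x
let x = 2
let () = Printf.printf "%d %d\n" (f ()) x   (* prints "1 2" *)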
Your two examples work for different reasons, maybe? [I could be wrong about this, obvs …]
- in the case of Ruby, it works b/c you never invoke the undefined function: if you did, I assume it would blow up. That’s b/c Ruby isn’t even trying to ensure that all names resolve to well-defined targets.
- in the case of JS, it’s straight-up “look in the rest of the scope for the name” (uh, I think).
The “scope” here is the entire compilation unit.
I do think that well-defined languages make clear, for a name, what will be the domain of search (and what order) for definitions of that name. That’s all I mean by “scope”.