Adding indexed variants of all functions in stdlib

I am thinking of adding index-aware variants of existing functions to the standard library. For instance, List.map has an indexed variant named List.mapi. Currently, only a minority of standard functions for which it would make sense have such a variant (there is no for_alli, existsi, findi…). Besides, which functions have an indexed variant is not entirely consistent across standard modules (fold_lefti exists only in Seq, filteri in List, find_mapi in Seq since 5.1…). This has bitten me many times. Hence I would like to make it systematic.

Affected standard modules: Seq, List, Array, Float.Array, String, Bytes (+ their labeled variants) and I hope I haven’t forgotten any.

Regarding naming: my current idea: because there are many functions involved, and also because it is not always obvious how to follow the existing convention of appending an i to the name (“should that function be called find_opti or findi_opt?”), instead of throwing all the new functions into the main namespace, I am thinking of adding a submodule Indexed to every affected standard module. Then, indexed functions in this submodule would have the same name as their non-indexed counterpart, e.g. List.mapi would be the same thing as List.Indexed.map.

Optional secondary goal: identify a small subset of indexed functions that should consistently be provided in the main namespace, with the i suffix convention. I believe it should include these: iteri, mapi (already exists in all affected modules), fold_lefti (among the most useful of these, and already exists in Seq), to_seqi (has a somewhat different status than the other indexed functions, and already exists in several modules).

This is a relatively easy but large addition to the standard library, so before I rush into code and dump everything at OCaml maintainers, I’d like to gather opinions from the broader community.

  1. What do you think of adding all these functions systematically?
  2. What do you think of the naming scheme?
    • What should be in the minimal subset mentioned above?
  3. Implementation: in order to de-duplicate code, should the non-indexed versions be re-implemented as wrappers for the indexed versions, knowing that this would make them slightly slower (one more indirection and a closure allocation)?
  4. Documentation (the hardest part): how to document all these?
    • The most concise and therefore my favorite option (which happens to be the laziest) would be to just write a paragraph at the level of the Foo.Indexed submodule, saying something like “these functions are variants of those found in Foo, which also provide the index as an argument to their callback”. (This style of documentation is found for Random.State.)
    • The opposite solution would be to fully document the indexed variants, and reduce the documentation of non-indexed variants to “Like Indexed.foo but with indexes omitted.” Unfortunately, that would break the flow of documentation in the main namespace of affected modules.
    • I don’t like duplicate explanations because that’s a waste of the reader’s time/brain, not mentioning having to maintain both in sync.

For the record, here is a candidate list of all indexed functions (here with the i suffix convention), according to my scan of current documentation. Commented are those that already exist as of OCaml 5.1.

(* Seq: *)

  (*! val iteri : (int -> 'a -> unit) -> 'a t -> unit !*)
  (*! val fold_lefti : ('b -> int -> 'a -> 'b) -> 'b -> 'a t -> 'b !*)
  val for_alli : (int -> 'a -> bool) -> 'a t -> bool
  val existsi : (int -> 'a -> bool) -> 'a t -> bool
  val findi : (int -> 'a -> bool) -> 'a t -> 'a option
  val findi_index : (int -> 'a -> bool) -> 'a t -> int option (* OCaml 5.1 *)
  (*! val find_mapi : (int -> 'a -> 'b option) -> 'a t -> 'b option !*)  (* OCaml 5.1 *)
  val iter2i : (int -> 'a -> 'b -> unit) -> 'a t -> 'b t -> unit
  val fold_left2i : ('a -> int -> 'b -> 'c -> 'a) -> 'a -> 'b t -> 'c t -> 'a
  val for_all2i : (int -> 'a -> 'b -> bool) -> 'a t -> 'b t -> bool
  val exists2i : (int -> 'a -> 'b -> bool) -> 'a t -> 'b t -> bool
  val unfoldi : ('b -> int -> ('a * 'b) option) -> 'b -> 'a t
  val iteratei : (int -> 'a -> 'a) -> 'a -> 'a t
  (*! val mapi : (int -> 'a -> 'b) -> 'a t -> 'b t !*)
  val filteri : (int -> 'a -> bool) -> 'a t -> 'a t
  val filter_mapi : (int -> 'a -> 'b option) -> 'a t -> 'b t
  val scani : (int -> 'b -> 'a -> 'b) -> 'b -> 'a t -> 'b t
  val take_whilei : (int -> 'a -> bool) -> 'a t -> 'a t
  val drop_whilei : (int -> 'a -> bool) -> 'a t -> 'a t
  val groupi : (int -> 'a -> 'a -> bool) -> 'a t -> 'a t t
  val flat_mapi : (int -> 'a -> 'b t) -> 'a t -> 'b t
  val concat_mapi : (int -> 'a -> 'b t) -> 'a t -> 'b t
  val map2i : (int -> 'a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t
  val map_producti : (int -> 'a -> int -> 'b -> 'c) -> 'a t -> 'b t -> 'c t
  val partitioni : (int -> 'a -> bool) -> 'a t -> 'a t * 'a t
  val partition_mapi : (int -> 'a -> ('b, 'c) Either.t) -> 'a t -> 'b t * 'c t

(* List: *)

  (*! val iteri : (int -> 'a -> unit) -> 'a t -> unit !*)
  (*! val mapi : (int -> 'a -> 'b) -> 'a t -> 'b t !*)
  val rev_mapi : (int -> 'a -> 'b) -> 'a t -> 'b t
  val filter_mapi : (int -> 'a -> 'b option) -> 'a t -> 'b t
  val flat_mapi : (int -> 'a -> 'b t) -> 'a t -> 'b t
  val concat_mapi : (int -> 'a -> 'b t) -> 'a t -> 'b t
  val fold_left_mapi : (int -> 'a -> 'b -> 'a * 'c) -> 'a -> 'b t -> 'a * 'c t
  val fold_lefti : ('a -> int -> 'b -> 'a) -> 'a -> 'b t -> 'a
  val fold_righti : (int -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b
  val iter2i : (int -> 'a -> 'b -> unit) -> 'a t -> 'b t -> unit
  val map2i : (int -> 'a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t
  val rev_map2i : (int -> 'a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t
  val fold_left2i : ('a -> int -> 'b -> 'c -> 'a) -> 'a -> 'b t -> 'c t -> 'a
  val fold_right2i : (int -> 'a -> 'b -> 'c -> 'c) -> 'a t -> 'b t -> 'c -> 'c
  val for_alli : (int -> 'a -> bool) -> 'a t -> bool
  val existsi : (int -> 'a -> bool) -> 'a t -> bool
  val for_all2i : (int -> 'a -> 'b -> bool) -> 'a t -> 'b t -> bool
  val exists2i : (int -> 'a -> 'b -> bool) -> 'a t -> 'b t -> bool
  val findi : (int -> 'a -> bool) -> 'a t -> 'a
  val find_opti : (int -> 'a -> bool) -> 'a t -> 'a option
  val findi_index : (int -> 'a -> bool) -> 'a t -> int option (* OCaml 5.1 *)
  (*! val find_mapi : (int -> 'a -> 'b option) -> 'a t -> 'b option !*)  (* OCaml 5.1 *)
  (*! val filteri : (int -> 'a -> bool) -> 'a t -> 'a t !*)
  val find_alli : (int -> 'a -> bool) -> 'a t -> 'a t
  val partitioni : (int -> 'a -> bool) -> 'a t -> 'a t * 'a t
  val partition_mapi : (int -> 'a -> ('b, 'c) Either.t) -> 'a t -> 'b t * 'c t
  val to_seqi : 'a t -> (int * 'a) Seq.t

(* Array: *)

  val fold_lefti : ('a -> int -> 'b -> 'a) -> 'a -> 'b t -> 'a
  val fold_left_mapi : (int -> 'a -> 'b -> 'a * 'c) -> 'a -> 'b t -> 'a * 'c t
  val fold_righti : (int -> 'b -> 'a -> 'a) -> 'b t -> 'a -> 'a
  val iter2i : (int -> 'a -> 'b -> unit) -> 'a t -> 'b t -> unit
  val map2i : (int -> 'a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t
  val for_alli : (int -> 'a -> bool) -> 'a t -> bool
  val existsi : (int -> 'a -> bool) -> 'a t -> bool
  val for_all2i : (int -> 'a -> 'b -> bool) -> 'a t -> 'b t -> bool
  val exists2i : (int -> 'a -> 'b -> bool) -> 'a t -> 'b t -> bool
  val findi_opt : (int -> 'a -> bool) -> 'a t -> 'a option
  val findi_index : (int -> 'a -> bool) -> 'a array -> int option (* OCaml 5.1 *)
  val find_mapi : (int -> 'a -> 'b option) -> 'a t -> 'b option
  (*! val to_seqi : 'a array -> (int * 'a) Seq.t !*)

(* String: *)

  (*! val mapi : (int -> char -> char) -> t -> t !*)
  val fold_lefti : ('a -> int -> char -> 'a) -> 'a -> t -> 'a
  val fold_righti : (int -> char -> 'a -> 'a) -> t -> 'a -> 'a
  val for_alli : (int -> char -> bool) -> t -> bool
  val existsi : (int -> char -> bool) -> t -> bool
  (*! val iteri : (int -> char -> unit) -> t -> unit !*)
  (*! val to_seqi : t -> (int * char) Seq.t !*)

(* Bytes: *)

  (*! val iteri : (int -> char -> unit) -> t -> unit !*)
  (*! val mapi : (int -> char -> char) -> t -> t !*)
  val fold_lefti : ('a -> int -> char -> 'a) -> 'a -> t -> 'a
  val fold_righti : (int -> char -> 'a -> 'a) -> t -> 'a -> 'a
  val for_alli : (int -> char -> bool) -> t -> bool
  val existsi : (int -> char -> bool) -> t -> bool
  (*! val to_seqi : t -> (int * char) Seq.t !*)

3 Likes

I think this suggestion belongs as an Issue in the ocaml github, where the core devs can comment on it.

Personally, I have not noticed the absence of these functions much, but I don’t have too strong an opinion on not adding them. I have the following observations about the proposed methods:

  • For folds you can already modify the accumulator to keep track of the index, so I don’t quite like the fold_*i - it’s not a signature I’m going to remember, so I’ll just use the regular fold
  • A lot of the functions provide “commonly used folds”, e.g. exists, for_all, I would argue that indexed versions are not so common as to warrant convenient shortcuts in the Stdlib

Note that these two are not just folds, because they interrupt the traversal as soon as the answer is known. BTW this is something I found was lacking too, some kind of interruptible folds. But that’s another matter.

The core dev is generally busy. I won’t waste their time and mine if the proposal proves unpopular.

Adding them all for the sake of it just because there are some that have been deemed useful enough to have is overkill. One should have a look at common standard library extensions (base, containers, batteries, etc?) and see which ones they have added based on actual needs.

Implementing non-indexed versions in terms of indexed ones yielding another closure is probably going to be unacceptable overhead. Not too long ago there was a PR to refactor Hashtbl a bit to reduce code duplication: Refactor hashtbl.ml by andrepd · Pull Request #12007 · ocaml/ocaml · GitHub. But that seems to have gotten stuck behind the significant overhead issue as well.

That’s not an inference I would make. I don’t think that seeing them in standard library extensions entails those being based on “actual needs”.

Functions may end up in these project based on other principles or events. For example uniformity of interfaces, symmetries, drive-by desires etc.

The Haskell ecosystem solved this problem by having a separate package with indexed versions of all functions:

I believe you can do the same today, encourage people to use the functions and get more feedback from users.


From my experience, I never used the Haskell package ilist despite knowing about it for years. I noticed that a single function that adds indexes to all elements in a list is enough.

In Haskell, this is zip [0..] myList.
And in OCaml, it would be something like List.mapi (fun i a -> (i, a))