Should we have a `String.rev` in `Stdlib`?

There have been several times over the last few weeks when I’ve wanted to reverse small strings, mostly for nicer printf-debugging. Was a reverse function ever part stdlib/pervasives at some point and removed for some reason or the other?

Base, containers, and batteries all have it in their string modules. As far as I could tell, it’s not there directly in Astring.

Will submit a PR if there is consensus here.

3 Likes

What are your use cases for String.rev?

1 Like

My use case for the last week is very specific, but I imagine others will have run into times they’ve wanted to reverse a string.

The past week, I was printing binary numbers using pp-binary-ints, but I was using the bits to represent states of an automata. The least significant bit represented state 0, and it would have been nice to be able to quickly reverse the string before printing, so that the active states of the automata could be read from right to left.

I’d also imagine people being surprised there’s no built-in string reverse when they’ve tried to solve the stupid palindrome checking problem in OCaml.

One point to keep in mind is that rev implementation is a one-line function starting from String.init.

If your only have one use case in mind which is very specific according to your own words, I think it is a sign that it is better to keep the rev function away from the String module. I would rather not mix generic functions with functions with a handful of known use cases.

It’s not exactly an easy one-liner though, there’s some index manipulation.

let rev x =
  let len = String.length x in
  String.init len (fun n -> String.get x (len - n - 1));;

But I agree that if no one else chimes in here to signal that String.rev is generic enough, it need not be in the String module.

2 Likes

Perhaps if it were included in an omnibus of new String functions, like say:

val replace : find:string -> rep:string -> string -> string
val replace_all : find:string -> rep:string -> string -> string
val rev : string -> string
val split_on : splitter:string -> string -> string list

And other handy functions.

Those are spicy topics! I would really prefer not to bring those into the discussion. See PR#10 and PR#626. I haven’t fully read through them, but you’ll see some discussion about Str vs String, and some of those are provided via Str.

2 Likes

My view on it is that if something is useful enough to be present in more than one stdlib alternative, it should be a candidate for the stdlib itself. It didn’t get to those alternative stdlibs by accident.

4 Likes

(Just the standard answer for anything involving strings:) Another thing to keep in mind is Unicode. The naive code breaks with UTF-8-encoded strings and, in fact, not sure whether reversing Unicode strings makes sense at all.

4 Likes

From my point of view, the absence of any direct arguments in this thread on why rev is generally useful for strings make the compared analysis with stdlib alternative mostly irrelevant.

1 Like

It’s useful because it’s often something you want to use for various reasons, algorithmic and otherwise. If we had comprehensions that could handle strings, I agree it wouldn’t be necessary. In python, I can write some_string[::-1] and get a reversed string due to slice notation ([start;end;step]). Maybe the better function to insert would be a slice function.

2 Likes

In python, I can write some_string[::-1] and get a reversed string

Note that, in Python, you get the UTF-8 reversed string, not the byte-reversed string, which this thread seems to suggest adding.

“Various reasons” is hand-waving.

Let me illustrate the differences with List.rev. Spontaneously List.rev is useful because:

  • building lists often creates list in reverse order, List.rev is necessary to produce
    a list in the right order after the list has been completely built.
  • considering a list as a representation for an ordered sequence of elements, rev allows to switch from an order relation to the reversed order.

Correct me if I’m wrong, but even python will reverse the order of codepoints, not grapheme clusters or whatever the actually useful notion of character is. So reversing a string in this way will probably be incorrect in the presence of multi-codepoint characters, which is more and more common in our :sparkles: emoji :dancer: world.

I’m not convinced a rev function is that useful either, in the end. It makes more sense on bytes because it allows one to serialize some stuff back-to-front (or, say, generate bytecode back-to-front) and then reverse to get the final result. Arguably that’s a very niche usage anyway.

2 Likes

You are right. For example, the accent gets moved to the other letter on the following example. (Hopefully, Discourse does not mess too much with it.)

>>> 'x̅y'[::-1]
'y̅x'
4 Likes

Overall, we’ve haven’t reached the kind of support for String.rev that I was hoping for, so I think we can put this to rest. Probably didn’t help that there’s a separate heated Stdlib conversation happening. If you find a good use for String.rev feel free to revive this thread. I’ll summarize the discussion from above, but I will take the liberty to add some arguments for both sides.

Arguments against String.rev

The main objection to String.rev is that it’s not generic enough, at least not to the same level as List.rev where it is necessary because lists are often created in a reverse order or often represent ordered sequences where it is useful to be able to reverse that order.

I was going to argue that you can sometimes consider a string to be a “representation of an ordered sequence” of bytes and sometimes you need to reverse that order but I don’t think it would have convincing enough for @octachron, especially since we have String.fold_right now. Plus @c-cube suggested Bytes.rev being more appropriate.

A second objection to String.rev is that it’s not meaningful if your bytes represent Unicode code-points. Unicode is honestly too messy, and I am not convinced that we shouldn’t have things that work with Latin1 strings just because they don’t work with Unicode strings.

A third objection to String.rev is that we can implement it in one-line, so do we really need it?

Arguments in support of String.rev

It is in most of the alternative standard libraries. All the Stdlib alternatives contain rev, but neither of the String module alternatives I know of do (astring and stringext). So there is some conflicting evidence there.

It’s just useful sometimes! E.g. as a quick hacks to check for palindromes, for printing bytes in reverse order for debugging, etc.

Alternatives Suggested

  • Bytes.rev
  • Slice notation.

Bytes.rev_in_place might also be useful. Additionally, String.of_list and String.to_list would allow for an easy one-liner with List.rev.

How to reverse a string on OCaml

If you found this thread looking for how to reverse a string, here are some examples.

let rev x =
  String.to_seq x |> List.of_seq |> List.rev |> List.to_seq |> String.of_seq

let rev x =
  let len = String.length x in
  String.init len (fun n -> String.get x (len - n - 1))
2 Likes

You don’t want to create the reversed string for that.

You also don’t want to create the reversed string to just iterate the string in reverse order.
Note that those two use case suggest that the function that you really want is to_rev_seq.

Also, your first rev implementation

is far from ideal since char list is really inefficient in term of memory use compared to a string (and the sequence of character is traversed twice).

Yes, but it is a one-liner where you don’t have to worry about indexes. Useful for quick-hack situations, e.g. in a REPL or when you’re going to delete the code later anyway.

Thanks @ifazk for taking the time to write a summary! That will constitute a good starting point, should the question be raised again.

I mentioned Unicode as the modern keyword for “text”, and because it makes things more technically intricate than with any other encoding, but the conceptual issue is not specific to Unicode. Text in human languages can be appended, cut, pasted, but can it be reversed? On the top of my head that doesn’t seem a natural operation to me (unless you are Da Vinci); but of course I may be wrong. Which brings me to:

I’m not sure palindrome testing ever comes up in real life, as you said it’s more a fun programming exercise than an actual need (or for DNA stuff maybe? but would you use String for that? even if you did, reversing is something you could easily leave to your DNA library).

I’ll play the devil’s advocate here: if that function was deemed useful (which I agree it isn’t at the moment), then I believe the fact that it is error-prone or inefficiency-prone would be an argument for including a carefully-crafted implementation into the library.

3 Likes

it shoud be the other way around:
String.get x (len - n - 1)