Time for an OCaml version of https://www.arewewebyet.org/?
(Actually I think we have libraries in quite a lot of these categories, but they are not publicised enough and it doesn’t feel like a newcomer could create a coherent stack out of them.)
Looks like there’s one in progress!
“Are we web yet” has, I think, two different interpretations: 1/ can we build client-side web applications, e.g. using Melange with React or js_of_ocaml, and 2/ could I build something in OCaml akin to what I’d build with Flask/Django, even if I just use htmx to power the frontend logic (IIRC ocaml.org uses htmx).
(Doing this out of order, 2, then 1)
For 2, “Leaving OCaml” still seems apt to me: i/ not having native cloud SDKs would be painful, not for everyone but definitely for a subset of possible folks; ii/ concerns that database drivers for Postgres aren’t as mature as in other languages would be a big worry for me; iii/ there might still be some web-application-related gaps even after Dream hits its goals (authentication came up in the blog post); iv/ for most businesses the benefits of having enough existing libraries in your ecosystem outweigh the benefits you get from using OCaml [1]. Paul eventually went with F# since it’s OCaml-like and can access .NET libraries.
I think 2 is more the actual problem than 1. But I will say that I’m pretty excited about the progress on 1 (as someone who writes OCaml nowadays). The goal seems more apt: JavaScript developers aren’t going to abandon full-stack TypeScript, so instead let’s focus on having OCaml developers be able to write both the client and the server in a single language. My understanding so far is that it mitigates some of the issues existing solutions have had: obfuscated, hard-to-debug code and large bundle sizes. I haven’t tried it yet though, so I don’t have first-hand experience to comment on.
[1] Imagine you had someone skilled at both ocaml and python, and they had a startup idea and they were doing a pro/cons analysis of what tech stack to use - I think they’d usually choose python.
Evidently from this thread, there are many different attitudes towards debuggers.
I learned a habit in the 90s of always single-stepping newly-written code under a debugger. I have broadly followed this practice for 30 years, in C, SML, assembler, Lisp, Python, and various Java-family languages. It’s using a debugger as a code-review tool, essentially, and can catch a significant proportion of defects at a very early stage. Quite independently of using a debugger to track down a bug, I would like the ability to use one for this code-review purpose on OCaml code.
This is putting it gently. Looking back now at the poll in Next priority for OCaml? - #14 by lukstafi, it seems that more respondents are interested in such things as stack allocation and unboxed types than are interested in an accessible debugging solution. I have no idea how to square that with my personal experience of (again, accessible) debuggers being absolutely essential to keeping a project healthy and responsive to users/customers over the span of years.
Obviously the poll is in a peculiar place on the forum, and respondents are self-selecting, so perhaps one could raise methodological concerns, but I find the outcome nonetheless really surprising.
Well, I suppose people who stick around long enough to answer polls on discuss are used to debugging without a debugger (via logging, assertions, etc.). I’m mostly one of them. They know how to live with that.
On the other hand, stack allocations, unboxed types, etc. push the boundary of what you can do at all; without them, some levels of performance are simply not reachable.
We probably just don’t see all the beginners or people from other languages who ragequit because gdb is subpar for OCaml. (Although it works a bit, and is useful from time to time.)
I don’t want to beat a dead horse, but felt it relevant to note that while web applications are an important application area, so is data science, and there the “bar” is numpy/scipy. Ad-hoc polymorphism (== modular implicits) is an important capability towards that goal.
Well, sure, if accessible debuggers are absent (replace with any other feature or aspect of community that benefits those unmotivated to persevere without), then those that remain aren’t going to be bothered by their continued absence. Sort of a tautological survivorship bias.
Sort of a tautological survivorship bias.
My point is that there’s evidence that you can actually maintain OCaml projects over a long period of time without a good debugger.
Now for some deep speculation: people I know in the community tend to either do networking/web stuff (highly concurrent, maybe even with event loops — how do you square that with a debugger??), or do heavy symbolic algorithmic stuff (Coq, Why3, a bunch of theorem provers) for which step debuggers are also not that great (you need the bigger picture for these).
No doubt! Alas, this is what I’ve found myself doing, sometimes with a great deal of attendant pain. That this is possible is hardly a good argument for any given tool’s deprioritization though (especially if an objective is to attract those who aren’t interested in enduring said pain).
I cannot disagree more strongly with the assertion that stepping debuggers aren’t useful in networking/web/concurrent/event loop/constraint satisfaction programs (however you’d like to circumscribe the context(s)). Concurrent and parallelized programs are where quality debugging tools have IME proven to be the most applicable. All I can offer by way of citation is my own experience, where I’ve found diagnosing and fixing various classes of problems in e.g. JVM applications to be trivial thanks to quality debugging tools, and much more difficult in other environments that lack them, across almost all types of applications, programming styles, fields of use, etc.
I think the disagreement around utility here really is rooted in different personal work histories and practical cultures. It may be that OCaml’s constituents (perhaps most importantly, its core contributors and their respective working styles and cultures) don’t see relative value provided by accessible debuggers; that that appears to have never really been a priority to date supports that speculation at least a little bit. And that’s okay! But it’s important IMO to acknowledge those cultural / personal differences as the driving factor, rather than getting caught up in debates about whether debuggers are actually useful or not in objective terms.
I use ocamldebug on Coq regularly and I’m glad I have the option, even though it can be a pain (e.g. you can’t print a foo BarMap.t by combining a polymorphic printer for BarMap.t with a printer for foo; random “connection was lost with the process” errors).
I’d rather have better debugging than web related stuff which is totally useless to me.
I’d love to hear/read about what debugging these kinds of programs looks like!
I personally do not use debuggers a lot when programming, but when working with a user to trace a hard-to-track bug, they are truly a lifesaver. Just being able to get a snapshot of all the thread call stacks with debugging symbols, even with the OCaml name mangling, helps a lot.
Debugger standards can also be a great way to ask for information from someone reporting a bug. I might not always be able to reproduce a bug, so it is fairly common for us to ask a user to install gdb and send us a stack trace during their issue. If I could get a stack trace/core dump from them that is actually useful for stepping into the OCaml code, that would definitely help.
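As a small aside, for anyone wanting a starting point on the OCaml side: even without gdb, you can make user-reported crashes much more actionable by recording backtraces (assuming the binary keeps debug info, e.g. dune’s default dev profile). A minimal sketch, where boom is just a hypothetical stand-in for real application code:

```ocaml
(* Minimal sketch: record backtraces so a crash report from a user
   includes OCaml-level stack frames, not just an exception name. *)
let boom () = failwith "simulated bug"

let () =
  Printexc.record_backtrace true;
  try boom () with
  | e ->
    Printf.eprintf "Uncaught exception: %s\n%s%!"
      (Printexc.to_string e)
      (Printexc.get_backtrace ())
```

Nowhere near a core dump, of course, but it is something a low-friction “run this and send me the output” request can rely on.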
Frankly, the more tools the better; they do come in handy. On that note, I also find that we could have improved tooling on the memory-allocation side of things. memtrace is pretty good, but that’s pretty much all I know about. Am I missing any other tool?
I’ll assume you’re talking about concurrent/parallel programs? It’s really a very ordinary thing, at least on the JVM and in .NET. There’s oodles of tutorials and documentation you can google on how to use the debuggers in mainstream IDEs, but I’ll tl;dr the experience for you:
All of the same techniques apply when debugging “algorithmic stuff”. I don’t see how they pose any special challenge in this context.
There are also fancier debugging capabilities on these platforms — time travel, specialized visualizations for certain facilities, framework-specific aids for tracking down UI bugs, hooks for supplying your own visualizations of data structures that aren’t just stringifying their contents, and things like debugging nested lazy streams — but I’ve used them only super occasionally.
In the end, I think debuggers are a lot like any other advanced idea or technology where there can be cultural or experiential barriers, like type systems or testing methodologies: if you “don’t like them”, or think you can’t benefit from them, it’s often because you haven’t used a good one yet, or used one enough to become comfortable with the headspace it requires.
Correct me if I’m wrong, but a subtext of your explanation is that this is all happening
Yes?
Some thoughts:
(1) In “the business” there’s a saying “we debug what we ship”. That is to say, the product, as shipped is ready to debug – you don’t need a different version (build) of the product to troubleshoot – the original production binaries suffice.
(2) I remember once I asked a friend who did a lot of Java web-app troubleshooting why he didn’t use debuggers, and he noted:
a. you can’t use 'em in production or UAT
b. so that means that you’ve already figured out how to reproduce the bug
c. but that’s the hard part – once you have a clean repro, it’s straightforward to debug/diagnose/fix.
and d. “hot code reloading” is nice in theory, but in practice for the complex applications he encountered in his daily work, it just didn’t work. Quite simply, there’s too much state that doesn’t get cleanly torn-down/rebuilt when you reload code.
So he (and I, and basically every troubleshooter I’ve ever known) learned and built a massive repertory of tricks and tools to debug/diagnose problems in the field, in production, because that’s where the problems always occur: a problem that is caught in development … isn’t an economically very significant problem.
I think it’s great to have debuggers, and sure, it’d be great to have excellent debuggers for OCaml. But the above considerations were all derived from >10yr production troubleshooting (and working with other troubleshooters) in Java and J2EE production enterprise deployments. All those products and the customer apps built using them, were developed by engineers using Eclipse, IntelliJ, etc, etc. They all used IDEs with debuggers in them. And sure, they used those debuggers.
But they’re useless in the face of a production bug, unless you can already isolate and reproduce that production bug in your dev environment with your Eclipse-based debugger.
It wasn’t lack of good debuggers that caused me and my peers to invent tools to help diagnose problems sans debuggers. It was that you cannot attach a debugger to a production system, and typically production systems don’t have the metadata required to get the information you need out of a debugger, even if you could attach one.
I appreciate your perspective; I think it perfectly illustrates the cultural impact on our respective views on things like this, especially since it sounds like we probably have similar backgrounds (maybe re: tech stacks, but perhaps even also re: types of organizations), yet emerged with very different perspectives due to non-technical factors.
You mentioned a number of constraints in almost axiomatic terms that have hardly been universal (or even common, in some cases) in my experience. Some thoughts, with hopefully-generous paraphrasings:
@c-cube is correct here in the sense that pure symbolic programming doesn’t require debugging that much. Most work in OCaml takes place where functional programming is strongest: taking input and processing it into immediate output, often using only immutable data.
Nevertheless, if you ever caught yourself adding printfs in a whole bunch of places to observe many things, that’s something that could have been done an order of magnitude faster with a debugger, which can observe a whole lot of stuff in one iteration. This is why if I offered my boss a language with no debugger support, he’d laugh me out of the room. Once you get used to the massive efficiency boost a debugger provides, there’s no going back.
Oh, symbolic programming requires plenty of debugging; it’s hard enough. The stuff I’m talking about is rarely purely functional anyway.
After these explanations, I think I’m convinced gdb support is even more important than I thought. I already use gdb some (notably for deadlocks and infinite loops), but you can’t print anything or set a conditional breakpoint, which is what I think could compete with printf-debugging.
A good feature of printf-debugging, btw, is the ability to use ripgrep and other text facilities on the output. It’s not uncommon for my symbolic code to output tens of thousands of lines, in which I then need to find the bug; I couldn’t do that with a debugger, but maybe conditional breakpoints (to stop at the exact point where the bug is happening) might suffice.
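A sketch of the kind of greppable output I mean (the simplify pass here is a made-up example): each line carries a stable tag, so something like rg 'DBG.simplify' filters tens of thousands of lines down to just the pass under suspicion.

```ocaml
(* Sketch of greppable printf-debugging. [debug] prefixes every line with
   a stable "DBG.<tag>:" marker so the output can be filtered by tag. *)
let debug tag fmt =
  Printf.ksprintf (fun s -> Printf.eprintf "DBG.%s: %s\n" tag s) fmt

(* Hypothetical pass we want to trace: clamp negatives to zero. *)
let simplify x =
  debug "simplify" "input = %d" x;
  let y = if x < 0 then 0 else x in
  debug "simplify" "output = %d" y;
  y

let () = ignore (simplify (-3))
```
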
A couple of times, deadlocks have been mentioned. In the JVM, back before people used “continuation-passing style” to code up “threads” in a lightweight manner (that is to say, when a “thread” mapped to a “java native thread”), a pretty standard way to diagnose deadlocks was via taking multiple “javacores” and looking for threads that “didn’t move”. A “javacore” was:
There was probably more, but that’s what I remember. Lots of people (me included) wrote various tools of varying complexity to analyze series of javacores and report on deadlocks. I have the distinct memory of writing GDB automation (using the “gdb MI (machine interface)”) to walk thru hung C processes and do the same sort of reporting. But having the “javacore” facility was pretty useful, b/c when a customer gets a hang (or something resembling a hang) in production, we’d tell 'em
“take 5 javacores at one-minute intervals and send 'em in”
And we could diagnose from that.
As OCaml gets used more and more in high-concurrency environments, I think such a facility would be valuable.
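As a rough sketch of what a poor man’s version of that facility could look like today (this is an assumption, not an existing OCaml feature: it dumps only the stack of whatever the handler interrupts, has no built-in all-threads dump, and SIGUSR1 is Unix-only):

```ocaml
(* Hedged sketch of a javacore-like facility: dump the current call stack
   when the process receives SIGUSR1, so an operator can run
   `kill -USR1 <pid>` a few times and ship the output back.
   Far from a real javacore: one stack, not all threads. *)
let install_dump_handler () =
  Sys.set_signal Sys.sigusr1
    (Sys.Signal_handle
       (fun _ ->
          Printf.eprintf "=== stack dump ===\n%s%!"
            (Printexc.raw_backtrace_to_string (Printexc.get_callstack 64))))
```

The appeal, as with javacores, is that the operator’s side of the procedure is a one-line instruction.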
This facility (and configurable logging, etc.) has the property that the operator doesn’t need to be highly skilled. That was always a big problem in debugging customer applications: the “hands on the console” were always low-skill, and often it was simply not possible to get my hands on the console, for either logistical (they’re in a different place and can’t allow me to remotely connect, for corporate-rule reasons) or legal (only bonded operators are allowed to access machines running certain services – a hard legal requirement that occurs not only at banks, but also at CDN operators like Akamai) reasons. So being able to give operators instructions they could execute, gathering files of diagnostic data that could later be shipped back to the troubleshooter, was often the only way to make progress.
The alternative to that was that the troubleshooter had to get drop-shipped to the customer site, and that is always unpleasant as all-get-out. The troubleshooter often cannot get hands-on, but at least they can watch as the operator does the work and correct errors in real time. But wow, so unpleasant, having to fly halfway across the country to diagnose something instead of being able to do it from your own office/home.
Is there no argument about the relocatable compiler?