It’s that time of the year again when open-source communities are preparing for the next round of Outreachy by looking for good projects, mentors and co-mentors! Hopefully the OCaml community will participate in this next round.
Next Round
The previous post for Summer 2022 contains a lot of great information about what it means to be a mentor. The dates for this round are available on the Outreachy website (under the December 2022 header). The most important of these is the final deadline for mentor signup, which will come around pretty quickly: September 23, 2022 at 4pm UTC.
If folks have ideas but not the time to mentor, please do post your thoughts about project ideas below. If you want to mentor a project but aren’t actively a contributor to it, I would still recommend posting below; hopefully the right people will see it (I think a great example of this would be improving ocaml.org). If you would like to co-mentor a project, please post below too so mentors can reach out (or reach out directly to a mentor who signs up).
Current Round
The OCaml community currently has two very successful Outreachy interns working on Multicore testing (Moazzam Moriani) and a TopoJSON library (Jay Dev Jha). There will be more information on these projects soon as they come to an end. In the meantime, I highly recommend listening to the Twitter space Moazzam participated in, and perhaps checking out the TopoJSON repository that Jay has been building too.
Great!
I was once an Outreachy intern (May - August, 2021) and I worked with the OCaml community which was an amazing experience for me.
I don’t have any project ideas for now, but I would love to co-mentor on a project in this round. Please, reach out if you would love a co-mentor.
Thank you
Dune is such a great tool but I found it incredibly difficult to go through its current docs as a complete beginner. Maybe a project to improve that and documentation for other essential OCaml tools like ocamldebug might be a good idea. I also wouldn’t even mind (co)mentoring such a project.
Somewhat along the same lines, I’m willing to (co)mentor for a project to add reStructuredText output to odoc and also (time permitting) to make a Sphinx plugin for that output. There is prior art for both parts. Proper Sphinx support would let people build cohesive, easy-to-maintain documentation across modules and tools. Would love a co-mentor!
This is a great idea; I think improving the Dune docs is probably one of the most impactful projects that an Outreachy intern could take up, and all in all a great service to the community (as well as a good way to learn about how the OCaml toolchain works!).
The current Dune docs are mostly an enumeration of all possible stanzas with some added commentary around them; they are closer to a reference manual than to a user guide. They are sorely lacking “tutorial” content for beginners. In this regard, I always fondly remember the SCons user guide, a good source of inspiration for a project like this in my opinion.
It is fantastic to see a good conversation happening around potential projects for December, thank you everyone for participating.
Yes! I think there’s a desire here for a really great set of tutorials combining Dune and Opam. I think it is hard to focus purely on Dune without bringing Opam into the picture as well. Are there any Dune maintainers who would consider mentoring such a project?
Hi all. Just like for Patrick, for me it’s also really nice to see this conversation happening between several different people!
About the idea of improving documentation in general and concretely of writing a dune tutorial/user guide: that sounds great also to me. And in fact, I’ve seen that writing documentation is a popular Outreachy project in other open-source communities. At the beginning, before writing/improving tutorial-like documentation, folks would need to get familiar with OCaml. So to add some more brainstorming: aside from working on issues on ocaml.org, they could also write examples / mdx code block API documentation for some easy-to-use libraries in the ecosystem that would benefit from more examples (I don’t have concrete libraries in mind though).
Btw, this also sounds great @jbeckford! (although I admit I’m not familiar with reStructuredText) I’m curious about that: Do you have a use case for that in mind? Something like hosting odoc-generated documentation on Read the Docs or similar?
Btw, from past experience with odoc Outreachy proposals, we know that it can be hard to find good good-first-issues on the odoc repo for the contribution/application period (let me know if you want me to explain what I’m referring to). @Juloo and/or @panglesd, do you think that’s currently easy/possible? If not, there are always other solutions for the contribution period (possibly something similar to what I’ve just mentioned above?).
The use case I have in mind is to produce documentation that is meant to be read linearly (top to bottom, with the option of skipping around) and that is written to explain topics to users. Tutorials, user guides and books follow the linear format, while reference manuals do not follow the linear format.
@nojb mentioned SCons as a good inspiration for build systems; I agree. IMHO you need both the linear format and reference manuals to provide good user documentation. I think Git’s documentation is a great example of a reference manual and a book, and numpy’s documentation is another good example of a reference manual and a user guide.
So tomorrow, after an Outreachy project, I hope OCaml-ers can publish explanatory, linear documentation like numpy’s “Indexing on ndarrays” that is full of cross-references to code interfaces from the autogenerated reference manual.
However, today Sphinx can only colorize OCaml code. Sphinx can’t cross-reference odoc autogenerated documentation. In addition, the odoc autogenerated documentation can’t be included in a Sphinx site without hackery, clashing styles and dangling references.
A few of my own OCaml-based Sphinx (Read the Docs) sites would greatly benefit from Sphinx having better OCaml integration:
Diskuv Research Security Protocols documentation was the first project I published, and after hacking its Sphinx + odoc HTML together, I never attempted to integrate Sphinx and odoc again. I worry that making explanatory documentation difficult to write will cause other people to abandon writing documentation for their own projects.
Talking about the complexity of Outreachy projects, does anyone have an idea how complex extending ocamlearlybird (hackwaly/ocamlearlybird on GitHub, an OCaml debug adapter) would be? So far it works only in OCaml 4.11 and 4.12. The lack of support for OCaml 4.13 and 4.14 is the primary reason why I’m sticking to 4.12.1 in Diskuv OCaml (although eventually I’ll provide an escape hatch). I suspect newcomers to OCaml, especially those coming from languages (ex. Java, C) or operating systems (ex. Windows) dominated by graphical debuggers, would welcome having a debugger. And I suspect extending a debugger is a great way for a student / new graduate to understand how a virtual machine works. I’m not volunteering to mentor! Hopefully someone else can mentor if adding features to earlybird is straightforward.
Why don’t you use .mld files for that? You get a decent edition language and checked cross-references into your API reference. Not to mention LaTeX maths, which was merged a few days ago.
.mld files have a zero-cost publishing experience for developers: just write the files and let the ecosystem do the rest (ocaml.org and odig). For end users it makes for an extremely reliable doc hunting (and viewing) experience: either look it up on ocaml.org or via odig doc for the version installed in your switch (works offline and with the theme of your choice).
It seems it has become some kind of hobby to add new backends to odoc, but meanwhile, after almost 8 years in existence, there are still problems in the base HTML renderer and important unresolved issues like a convention to be able to add images or code extraction. I think it would be a better idea to focus on what already exists in odoc and make it great rather than add new backends.
I agree: odoc and ocamldoc syntax already support fairly sophisticated documentation output. It doesn’t even need to be written in reference style; it can be in tutorial or user guide style. You can even run odoc (via dune etc.) on plain old .ml files which contain nothing but a single doc comment from top to bottom, i.e. (** ... *). There’s definitely room for improvement, but it works surprisingly well even now. I think it’s mostly a cultural issue that people don’t do it.
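As a rough illustration of that last point, a documentation-only .ml file might look something like the sketch below. The file name (say, tutorial.ml), the section titles and the example text are all hypothetical; only the markup constructs (`{1 ...}` and `{2 ...}` headings, `{[ ... ]}` code blocks, `{!...}` references and `-` list items) are existing odoc/ocamldoc syntax. Whether the reference resolves to a link depends on the standard library documentation being available to odoc, which is an assumption here.

```ocaml
(** {1 Getting started}

    This whole file is a single documentation comment. It defines no
    values; it exists only to be rendered as a tutorial-style page.

    {2 Transforming a list}

    Use {!Stdlib.List.map} to apply a function to every element:

    {[
      let doubled = List.map (fun x -> x * 2) [ 1; 2; 3 ]
    ]}

    {2 Next steps}

    - Read the API reference for the modules mentioned above.
    - Try the examples in a toplevel. *)
```

Assuming the file belongs to a public library in a Dune project, `dune build @doc` should render it alongside the rest of that library's documentation.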
Definitely use .mld files for API reference. You can even do some explanatory, linear documentation like I did with dirsp-exchange-kbb2017. But I haven’t found .mld files to be good substitutes for a linear documentation site like Diskuv OCaml documentation. Honestly, I doubt I could produce that site with .mld files (or the other two I mentioned), but I’m willing to be proven wrong.
The best counterpoint I know of is the odoc documentation site itself. It is well-written, linear and a good example of dog-fooding. It is also plain text combined with OCaml source code, and once you stray a bit from that (ex. diagrams, capturing interactive sessions, transclusion, etc.) you quickly see the limitations today. Continued below
No disagreement, especially for reference documentation where the user queries for a module/package/keyword they already know. But the assumption being made is that .mld is great for tutorials, user guides and books … great for both readers and authors … so why bother with anything else? Continued below
A good example for a tutorial / user guide / book written in .mld / .mli would be nice! Then it can be compared side-by-side with the examples I already provided.
IMHO part of the comparison should be how easy it is to write comprehensive documentation for newcomers (or we’ll never get good docs in the OCaml ecosystem!). I can readily attest to newcomers being able to write documentation in Sphinx (much of that can be browsed on https://readthedocs.org/), so real examples, especially from non-experts, would be useful in this discussion.
I also hear scope creep. Is odoc a tool for reference documentation (where I know what modules/keyword I am looking for, and can ask the tool to do a search) or for linear documentation (where the docs tell me about things I don’t already know), or both? Perhaps it would be better to focus on making odoc a great API reference document tool (which I think it is, alongside odig and the search bar of v3.ocaml.org), and delegate to mature tools for linear documentation that are designed to work with existing API reference tools? (hint: Sphinx)
Yikes; multiple things you mention are troubling.
Why is it either-or?
This thread is about what Outreachy can do with assigned mentors. I believe the Sphinx (Python) plugin portion can be done by Outreachy; I’m not 100% confident about the odoc side but it seems significantly simpler than the existing Markdown backend PR (which is why this would be done with a co-mentor). But … it sounds like the suggestion is to do an Outreachy project to solve 8 year old issues instead. Something doesn’t sound right. Perhaps someone can post a link to those issues?
It sounds like the suggestion is to stop all feature development of odoc to fix 8 year old issues. But that begs the question why the mentioned issues are there after 8 years. Perhaps they are not critical, too complex, etc. Or perhaps we really should drop everything and fix them.
I acknowledge that often adding new features increases the technical debt with the existing code. But the new backend I am suggesting delegates all of the rendering to the Sphinx side, so it should not increase the technical debt of odoc.
I question why we want to duplicate features from other documentation tools (Sphinx, or Markdown + static site generators) when we have limited resources. For example:
If I’m documenting an HTTP API, I am going to reach for the Sphinx httpdomain plugin instead of re-inventing the wheel. Ditto for graphviz and PlantUML diagrams. I don’t want to wait another 8 years!
I can type Markdown and Sphinx documentation in Visual Studio Code and get almost instantaneous visual feedback in the code editor preview. That makes writing documentation enjoyable and almost effortless.
So for now I’ll wait for some non-expert .ml{i,d} generated examples of tutorials, user guides or books.
I’m not saying .mld files are perfect, and that’s why I want them improved :-) The latest addition of LaTeX maths is an excellent step in that direction.
So maybe it would be good if you could articulate exactly what you feel is missing so that these points can be improved. A quick look suggests to me that it wouldn’t be particularly challenging to provide the diskuv documentation as a web of mld files.
I’m not sure I got this. For me the point is not for newcomers to be able to write documentation; it’s rather to have it easily found and comfortably read. As far as documentation production is concerned, you should aim for the working programmer, and as I mentioned, .mld files make that exceptionally easy.
In my opinion it should definitely be both. No distinction should be made between the two; I want a documentation system, not segregated documentation tools that each live on their own island. In particular, the experience of moving from API reference to reference manuals and howtos should be seamless, and this across packages. .mld files and their checked cross-reference capabilities (even across packages!) are in an excellent position to provide this.
It may have been understood that way, but note that I was not suggesting this for the Outreachy project here. I suggested something for the odoc project, which is to focus on improving the existing odoc experience rather than expanding its scope. Whether some of the tasks therein are suitable for Outreachy is not for me to judge, but for the Outreachy mentors.
There’s definitely a critical point about the goal of odoc covering both reference and linear documentation. Doing Sphinx may make it harder to achieve that goal. So I’ll continue this offline and spin up a new discuss thread (or post a conclusion) as needed.
After a short discussion at ICFP, @gasche and I are planning to submit a project to add identifiers to the compiler error messages, which should help in building more extensive documentation of all error cases. We are currently checking that no one has any major objections to the idea before submitting the project.