[ANN] soupault: a static website generator based on HTML rewriting


Soupault is the first (to my knowledge) website generator that exploits the fact that well-formed HTML is machine readable and transformable (and thanks to @aantron’s lambdasoup it’s quite easy to do).

It can do things like “use the first <h1> for the page title” or “insert the output of date -R into the <time> element, no matter where it is in the page”.
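
To make the “use the first <h1> for the page title” idea concrete, here is a minimal sketch in Python using only the standard library. It is not how soupault does it (soupault is written in OCaml on top of lambdasoup); the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class FirstH1(HTMLParser):
    """Collects the text of the first <h1> element and ignores later ones."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while we are inside the first <h1>
        self.done = False    # True once the first <h1> has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self.done:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "h1" and self.inside:
            self.inside = False
            self.done = True

    def handle_data(self, data):
        # Text inside nested tags such as <em> is collected too.
        if self.inside:
            self.parts.append(data)


def first_h1_text(html):
    """Return the whitespace-normalized text of the first <h1>, or None."""
    parser = FirstH1()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.parts).split())
    return text or None
```

Calling first_h1_text on a page yields a string that a generator could then place into <title>; a page without any <h1> yields None, so the caller can fall back to a default.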


  • No templates, no themes, no front matter. You tell it where to insert stuff or what to extract using CSS selectors.
  • Built-in ToC, footnotes, and breadcrumbs.
  • Directories are site sections and can be nested.
  • Extracted metadata can be exported to JSON and fed to external scripts for creating section indices or custom taxonomies.
  • Configurable preprocessors for pages in formats other than HTML.

Soupault can be a drop-in automation tool for existing websites: the directory structure is fully configurable, clean URLs are optional, and it can preserve paths down to file extensions.


With soupault, it’s possible to take advantage of the full HTML markup and even make every page on your website look different rather than built from the same template, and still have an automated workflow.

This is awesome. I have been looking for an SSG that can serve the unique needs of artists’/designers’ portfolio websites, and this finally sounds like the one. For this purpose I think a macOS binary would be pretty critical, since they (my clients) pretty much use Apple stuff exclusively.

I’ve been experimenting with using Travis CI for OS X builds. Here’s an artifact: https://baturin.org/tmp/soupault-1.0.1-osx.zip

I can confirm that it’s a Mach-O executable, but that’s about it: I can’t actually test it since I don’t have a working Apple machine. If you’ve got time, please test it.

Works here (macOS 10.14.6).

> ./soupault --version
[WARNING] Configuration file soupault.conf not found, using default settings
soupault 1.0.2
Copyright 2019 Daniil Baturin, licensed under MIT
Visit https://baturin.org/projects/soupault for documentation

Cool, then it passes the smoke test at least. Thanks for testing.

There’s nothing but basic POSIX stuff, so there’s little potential for incompatibilities… or so I hope.

Nice! I wrote a navigation injector (in Ruby) based on the machine-readability of HTML a few years ago.

Writing the final markup has quite some appeal.

@hanjiexi I’ve made a 1.1 release with some updates and “official” macOS binaries. https://github.com/dmbaturin/soupault/releases/tag/1.1

@mro That’s pretty much my motivation. If those things require an HTML parser anyway, why keep a Markdown-centric workflow?


I’ve made a 1.2 release, now with Lua plugin support thanks to Lua-ML: https://baturin.org/projects/soupault/#plugins


1.3 release with some improvements.

  • Invalid config options cause warnings now. There are also “did you mean” suggestions for mistyped options, thanks to @c-cube’s spelll library.
  • Footnotes now keep their original ids for handy hotlinking, and you can add a suffix/prefix to footnote ids to make a separate “namespace” for them.
  • Some minor bugfixes.
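
As a rough illustration of the “did you mean” behaviour for mistyped options, here is a sketch in Python. Soupault itself uses the spelll library in OCaml; this sketch swaps in difflib from the Python standard library, which uses a different similarity measure. The option list, function name, and warning wording are assumptions made for the example.

```python
import difflib

# Hypothetical set of valid option names, for illustration only.
KNOWN_OPTIONS = ["widget", "selector", "command", "action", "after"]


def check_option(name, known=KNOWN_OPTIONS):
    """Return None for a valid option, otherwise a warning string.

    The warning includes a suggestion when a sufficiently similar
    known option exists.
    """
    if name in known:
        return None
    matches = difflib.get_close_matches(name, known, n=1)
    hint = f' Did you mean "{matches[0]}"?' if matches else ""
    return f'[WARNING] Unknown option "{name}".{hint}'
```

A typo such as "selectr" gets a suggestion for "selector", while a name that resembles nothing in the list only produces the plain warning.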

Made a 1.7.0 release.

The first improvement is that you can now pipe the content of any element through any external program with the preprocess_element widget (PR by Martin Karlsson).
For example, insert inline SVG versions of all graphviz graphs from <pre class="language-graphviz"> and also highlight the Dot source itself with highlight (or any other tool of your choice):

  [widgets.graphviz-svg]
  widget = 'preprocess_element'
  selector = 'pre.language-graphviz'
  command = 'dot -Tsvg'
  action = 'insert_after'

  # The table name "highlight" is an assumed placeholder; any widget name works.
  [widgets.highlight]
  after = "graphviz-svg"
  widget = "preprocess_element"
  selector = '*[class^="language-"]'
  command = 'highlight -O html -f --syntax=$(echo $ATTR_CLASS | sed -e "s/language-//")'
  action = "replace_content" # default
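
For readers wondering what “pipe the content of an element through an external program” amounts to, here is a minimal Python sketch of the mechanism. It is not soupault’s implementation. The function name is made up, and exposing every attribute as ATTR_<NAME> is an assumption generalized from the $ATTR_CLASS variable used in the config above. It needs a POSIX shell.

```python
import os
import subprocess


def preprocess(content, command, attrs):
    """Feed element content to a shell command and return its stdout.

    Element attributes are exported as ATTR_<NAME> environment variables
    (an assumption based on $ATTR_CLASS in the example config).
    """
    env = dict(os.environ)
    for name, value in attrs.items():
        env["ATTR_" + name.upper()] = value
    result = subprocess.run(
        command,
        shell=True,           # the command string may use pipes and $(...)
        input=content,        # element content goes to the command's stdin
        capture_output=True,
        text=True,
        env=env,
        check=True,           # a failing command raises CalledProcessError
    )
    return result.stdout
```

What happens with the returned text (inserting it after the element or replacing the element’s content) corresponds to the action option and is left out of the sketch.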


Two other improvements are multiple index “views” and a default value option for custom index fields, for example:

  category = { selector = "span#category", default = "Misc" }

soupault 1.8.0 is released along with Lua-ML 0.9.1.
Lua-ML now raises Failure when Lua code execution fails. There’s much room for improvement in that area; for now I’ve just done something that is better than displaying errors on stderr while otherwise letting syntax and runtime errors pass silently.
If you have any ideas how perfect interpreter error reporting should work, please share!

As for improvements in soupault itself, there are now:

  • A way for plugins to specify their minimum supported soupault version like Plugin.require_version("1.8.0")
  • A TARGET_DIR environment variable and a target_dir Lua global that contain the directory where the rendered page will be written, to make it easier for plugins/scripts to place processed assets together with pages.
  • “Build profiles”: if you add profile = "production" or similar to widget config, that widget will be ignored unless you run soupault --profile production.
  • A bunch of new utility functions for plugins.
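
The idea behind a minimum-version check like Plugin.require_version can be sketched in a few lines of Python. This is only an illustration of the comparison, not soupault’s code; the function names and the error message are made up. The point worth showing is that versions have to be compared numerically, component by component, since a plain string comparison would rank "1.10.0" below "1.8.0".

```python
def parse_version(text):
    """Turn a dotted version string such as "1.8.0" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def require_version(minimum, running):
    """Raise RuntimeError if the running version is older than the minimum."""
    if parse_version(running) < parse_version(minimum):
        raise RuntimeError(
            f"plugin requires version {minimum} or newer, running {running}"
        )
```

With this, require_version("1.8.0", "1.10.0") passes silently, while require_version("1.8.0", "1.7.0") raises.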

1.9.0 release is now available.

  • An --index-only option that makes soupault dump the site metadata to JSON and stop there.
  • Metadata extraction and index generation can now be limited to specific pages, sections, or path regexes, just like widgets.
  • The preprocess_element widget now supports a list of selectors, e.g. selector = ["code", "pre code"].
  • Plugin API now has functions for running external programs, and some more element tree access functions.
  • CSS selector parse errors are now handled gracefully (lambdasoup PR#31).
  • The title widget now correctly removes HTML tags from the supposed title string and doesn’t add extra whitespace (fixes by Thomas Letan).

1.10.0 release is available.

Bug fixes:

  • Files without extensions are handled correctly.

New features:

  • Plugin discovery: if you save a plugin to plugins/my-plugin.lua, it’s automatically loaded as a widget named my-plugin. The list of plugin directories is configurable.
  • New plugin API functions: HTML.get_tag_name, HTML.select_any_of, HTML.select_all_of.
  • The HTML module is now “monadic”: giving a nil to a function that expects an element gives you a nil back, rather than causing a runtime error.
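
To show what “monadic” means here, the following Python sketch mimics the nil-propagation behaviour with None. It is an analogy, not the plugin API: the decorator, the helper functions, and the dict-based element tree are all invented for the example.

```python
import functools


def nil_safe(func):
    """Wrap func so that a None element short-circuits to None."""
    @functools.wraps(func)
    def wrapper(element, *args):
        if element is None:
            return None
        return func(element, *args)
    return wrapper


@nil_safe
def first_child(element, tag):
    """Return the first direct child with the given tag, or None."""
    for child in element["children"]:
        if child["tag"] == tag:
            return child
    return None


@nil_safe
def tag_name(element):
    return element["tag"]


# A toy element tree: <body><p></p></body>
page = {"tag": "body", "children": [{"tag": "p", "children": []}]}
```

Chained lookups then need no intermediate checks: tag_name(first_child(page, "h1")) simply evaluates to None because there is no <h1>, instead of failing on the missing element.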