Firewall-tree - demo using MirageOS is available, with overview of current progress

Hello everyone,

I’m very pleased to announce that the firewall-tree library has reached a presentable stage, and a demo which uses firewall-tree in MirageOS is available.

Demo

The demo code is available here.

The code itself is quite thoroughly commented, and the README in the folder contains build and run instructions as well as illustrations of the tree in graphical form, so I won’t dive into the details of the demo here.

In short, the demo shows how to separate ICMP traffic from TCP traffic. For TCP, it filters some of the traffic, then load balances the remaining TCP connections across multiple destination IPv4 addresses. For ICMP ping requests, it only replies to every other one.
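To give a rough feel for the shape of such a tree without opening the demo, here is a minimal self-contained sketch. It does not use the firewall-tree API: every type and name in it (`tree`, `eval`, `demo_tree`, the addresses, and the choice of port 23 as the filtered port) is a hypothetical stand-in, and a "connection" is approximated by the source address and source port.

```ocaml
(* Hypothetical toy model; none of these names come from firewall-tree. *)
type proto = Icmp_echo_request | Tcp of { src_port : int; dst_port : int }

type pdu = { src : string; proto : proto }

type decision = Drop | Echo_reply | Forward of string

type tree =
  | End of decision
  | Select_proto of { icmp : tree; tcp : tree }
  | Filter of (pdu -> bool) * tree (* drop when the predicate fails *)
  | Every_other of bool ref * tree (* let every second PDU through *)
  | Load_balance of {
      dsts : string array;
      mutable next : int; (* round-robin cursor *)
      assigned : (string * int, string) Hashtbl.t; (* connection -> dst *)
    }

let rec eval (tree : tree) (pdu : pdu) : decision =
  match tree with
  | End d -> d
  | Select_proto { icmp; tcp } -> (
      match pdu.proto with
      | Icmp_echo_request -> eval icmp pdu
      | Tcp _ -> eval tcp pdu)
  | Filter (keep, next) -> if keep pdu then eval next pdu else Drop
  | Every_other (pass, next) ->
      let allowed = !pass in
      pass := not allowed;
      if allowed then eval next pdu else Drop
  | Load_balance lb -> (
      match pdu.proto with
      | Icmp_echo_request -> Drop
      | Tcp { src_port; _ } -> (
          (* Keep PDUs of an already assigned connection on one destination. *)
          let key = (pdu.src, src_port) in
          match Hashtbl.find_opt lb.assigned key with
          | Some d -> Forward d
          | None ->
              let d = lb.dsts.(lb.next) in
              lb.next <- (lb.next + 1) mod Array.length lb.dsts;
              Hashtbl.add lb.assigned key d;
              Forward d))

(* Port 23 and the two addresses are arbitrary example values. *)
let demo_tree () =
  Select_proto
    {
      icmp = Every_other (ref true, End Echo_reply);
      tcp =
        Filter
          ( (fun pdu ->
              match pdu.proto with
              | Tcp { dst_port; _ } -> dst_port <> 23
              | Icmp_echo_request -> true),
            Load_balance
              {
                dsts = [| "10.0.0.2"; "10.0.0.3" |];
                next = 0;
                assigned = Hashtbl.create 16;
              } );
    }
```

The real demo works on actual MirageOS packets and uses the library's own selectors and translators, so treat this only as an illustration of the branching structure.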

The tree itself is only a couple of lines, but there is quite a bit of setup code for wrapping MirageOS stuff into a module and types usable by firewall-tree. However, it should be possible to package the setup code into a library supporting MirageOS as a firewall-tree backend, so in future the usage of the library should be much smoother and less cluttered than in the demo.

I’d like to note that the library is still very rough when it comes to more complex constructs. The connection tracker, which is used inside the translators, works fine for a very typical TCP connection, but fails to handle TCP retransmissions or TCP PDUs with RST and PSH flags.

Library overview

The library requires a “base”, which you can think of as an environment implementation (see here for module type signature), which it then extends into a full firewall module. The resulting module provides the following functions/constructs

The package provides another set of functors which take the resulting tree module as a parameter and derive commonly used constructs. See the demo for example use.
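The "base extended by functors" layering can be sketched as follows. This is only an illustration of the pattern under my own assumptions: the signatures `BASE` and `TREE`, the functors `Make` and `Make_filters`, and the toy base are all hypothetical and much smaller than the real module types in the library.

```ocaml
(* Hypothetical signatures; the real base module type is larger. *)
module type BASE = sig
  type pdu

  val is_tcp : pdu -> bool
end

module type TREE = sig
  type pdu
  type decision = Drop | Forward
  type tree = End of decision | Select of (pdu -> bool) * tree * tree

  val eval : tree -> pdu -> decision
  val if_tcp : tree -> tree -> tree
end

(* First layer: extend a base (environment) into a tree module. *)
module Make (B : BASE) : TREE with type pdu = B.pdu = struct
  type pdu = B.pdu
  type decision = Drop | Forward
  type tree = End of decision | Select of (pdu -> bool) * tree * tree

  let rec eval tree pdu =
    match tree with
    | End d -> d
    | Select (pred, yes, no) -> eval (if pred pdu then yes else no) pdu

  let if_tcp yes no = Select (B.is_tcp, yes, no)
end

(* Second layer: derive commonly used constructs from any tree module. *)
module Make_filters (T : TREE) = struct
  (* Forward TCP PDUs and drop everything else. *)
  let tcp_only = T.if_tcp (T.End T.Forward) (T.End T.Drop)
end

(* A toy base standing in for the MirageOS wrapper code. *)
module Toy_base = struct
  type pdu = { tcp : bool; payload : string }

  let is_tcp p = p.tcp
end

module Fw = Make (Toy_base)
module Filters = Make_filters (Fw)
```

With this layering, `Fw.eval Filters.tcp_only pdu` works for any base that satisfies `BASE`, which is the property that would let the MirageOS setup code be packaged separately as a backend.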

Moving onward

There is still a lot to be done. Here are a few of the items:

  • TCP connection tracking needs to be fixed
  • Implement more translators, make them more modular (e.g. swappable algorithms)
  • Couple translators to form NAT modules
  • Benchmark
    • I am hoping this library doesn’t slow down MirageOS too much for equivalent functionality, but I don’t have concrete benchmarks

Final thoughts

The progress so far has been quite time-consuming, with a lot of redesigns and rewrites in between. I am hoping the current design is sane, but I feel I’ve invested too much time in the last couple of weeks to have a clear view of things. So I would absolutely appreciate any feedback, whether it’s on API design, the architecture of the code, or any other aspect.

I have some experience with networking and firewalls, but as far as implementing a network stack goes, I have zero experience, so I’d also appreciate help with components that demand such expertise.

Thanks for reading!
Darren


This is just fantastic progress! You’re coding faster than I can test, but this is on my queue to read and try out as soon as I finish my Mirage review queue :slight_smile:


Haha, thanks! Your expression of interest has been a big motivator, as I can’t tell if it is a silly idea or not. I really appreciate your attention.

Hi Darren,

That’s interesting work. I haven’t read through all of your code, but you seem to have implemented 80% of a TCP stack, and you also mention that reset handling is missing (what about simultaneous open, IP fragmentation, and retransmissions?). Did you try using mirage-tcpip instead, which already implements some of the missing pieces (at least reset handling, AFAIR)? I’m pretty sure the API of mirage-tcpip could be revised if it is unsuitable for your use case. But given the complexity of TCP, I’m not sure how many partial TCP implementations in OCaml are sensible (NB: the linked paper and mirage-tcpip are unrelated, unfortunately, but I have a (partial :/) TCP implementation on my hard drive which follows the above paper).


Hello Hannes,

Thanks for reading through!

  • Simultaneous open
    • Uh…uh oh, I didn’t realise simultaneous open is a thing
    • This might require a redesign of the TCP connection tracker, as it currently assumes that only one side is the initiator and the other side must be the responder
  • IP fragmentation
    • Don’t think this matters too much unless the library moves toward supporting deep packet inspection?
    • But yeah this may be a problem if the library needs to support moving packets from a layer 2 segment of higher MTU to one with lower MTU? But that should be handled by the IP layer (of MirageOS) in this case
    • In any case I don’t know if it should be in the scope of the library
  • Retransmissions
    • From a firewall’s perspective, I think it only needs to remember having seen the PDU at some point and allow duplicates through up to some threshold, so collecting hashes should suffice, I hope

Ideally the library is network implementation agnostic, but I can’t tell if that’s a reachable goal exactly considering all these intricate things.

Right now the demo itself uses Tcp.Tcp_packet from tcpip to handle serialisation and deserialisation, but not flow or any connection establishing components.

I can’t tell if flow is suitable for a firewall, since that would mean the unikernel needs to add 65535 listeners to cover all ports? I might be being silly here or have misunderstood something, so please correct me if I have.

W.r.t. stack implementation, I think I might have picked the wrong wording. For the library atm, RST handling entails some adjustments to the state machine (which is not too much work, but will probably need a lot of debugging), and retransmission handling entails recording PDUs (most likely through hashes) and letting duplicates pass through when they are seen (up to N times etc). Both are much less work than a full stack implementation.
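The duplicate-tolerance idea could be sketched roughly like this. Nothing here exists in the library yet: `make_tracker`, `verdict`, and `max_duplicates` are hypothetical names, the PDU is modelled as a raw string, and the standard library's `Digest.string` (MD5) is used only as a convenient stand-in hash. A real tracker would also have to expire entries.

```ocaml
(* Hypothetical sketch; not part of firewall-tree. *)
type verdict = First_seen | Duplicate_allowed | Duplicate_refused

(* Returns a checker that remembers PDU hashes and lets each PDU be
   repeated at most [max_duplicates] times after its first appearance. *)
let make_tracker ~max_duplicates =
  (* Maps the digest of a PDU to the number of duplicates seen so far. *)
  let seen : (Digest.t, int) Hashtbl.t = Hashtbl.create 64 in
  fun (pdu : string) ->
    let h = Digest.string pdu in
    match Hashtbl.find_opt seen h with
    | None ->
        Hashtbl.replace seen h 0;
        First_seen
    | Some n when n < max_duplicates ->
        Hashtbl.replace seen h (n + 1);
        Duplicate_allowed
    | Some _ -> Duplicate_refused
```

For example, with `let check = make_tracker ~max_duplicates:2`, the same PDU is reported as `First_seen` once, `Duplicate_allowed` twice, and `Duplicate_refused` after that.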

A firewall does need to work from a slightly different perspective compared to a client, however. When a firewall sees a PDU, it’s not always the case that either client will see it, whereas a client that receives something knows for sure that it has received it. So the other adjustment required is that the state machine of the connection tracker needs to tolerate “partial” state transitions in some way (so just look ahead one state but don’t commit, or whatever).
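One possible reading of "look ahead one state but don't commit" is sketched below. It is an assumption-heavy toy, not the current tracker: the states, the `tracker` record, and `observe` are hypothetical, it only covers the three-way handshake, and it still assumes a single initiator (so no simultaneous open).

```ocaml
(* Hypothetical sketch of a tracker with a tentative look-ahead state. *)
type state = Closed | Syn_seen | Syn_ack_seen | Established
type side = Initiator | Responder
type flag = Syn | Syn_ack | Ack

type tracker = {
  mutable committed : state; (* confirmed by a reply from the peer *)
  mutable tentative : state option; (* seen by the firewall only *)
}

(* State the connection would reach if the peer really got this PDU. *)
let next state side flag =
  match (state, side, flag) with
  | Closed, Initiator, Syn -> Some Syn_seen
  | Syn_seen, Responder, Syn_ack -> Some Syn_ack_seen
  | Syn_ack_seen, Initiator, Ack -> Some Established
  | _ -> None

(* Returns [true] when the PDU should be forwarded. *)
let observe t side flag =
  let from_tentative =
    match t.tentative with Some s -> next s side flag | None -> None
  in
  match from_tentative with
  | Some s' ->
      (* A PDU that is valid after the tentative state shows that the
         earlier PDU arrived, so that state can now be committed. *)
      t.committed <- Option.get t.tentative;
      t.tentative <- Some s';
      true
  | None -> (
      match next t.committed side flag with
      | Some s' ->
          (* First sighting, or a retransmission of an unconfirmed PDU. *)
          t.tentative <- Some s';
          true
      | None -> false)
```

In this sketch a retransmitted SYN is accepted again because the committed state is still `Closed`, and the tracker only commits `Syn_seen` once the responder's SYN-ACK shows that the SYN got through.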

So overall it’s more of a partial stack implementation powerful enough to observe and forward things and that’s the end of the story.

Thanks for the link to the paper btw! I’m interested in formal verification in general and am currently researching protocol verification topics at uni, so I’ll definitely give it a thorough read later.
