Hi everyone,
I’m using OCaml for a personal project (as personal projects should be fun, and I find OCaml fun). Recently, though, I’ve been pulling my hair out over a “similar events” design. I’ve tried different approaches (e.g. tuples, variants, polymorphic variants, records, and recently looking into GADTs), but none of them feels right and I run into the same issues over and over.
Even though I prefer to solve my own puzzles, I’m out of ideas and steam, so I decided to look for guidance.
Problem context
I want to develop an event-based system. The initial prototype has ~30 events in approx. 5 families. They are quite similar to each other:
- Every single one has an ID and a Timestamp (with possible extra identifiers coming later)
- Every single one has Metadata and Event Data
- Event Data is event-specific: it will be typed per event, but I believe it’s not relevant to the problem itself, so I will stub it
- Metadata can be empty or hold data comparable to other events (e.g. `UserJoin` and `SessionNew` both have a `session` element that points to the same Session entity)
- Some events can have almost the same shape and structure but shouldn’t be treated or processed as the same (I stumbled upon this when modelling them as plain tuples)
- Metadata is the most important part and can be either a single data group or a product of different ones
Ultimately those events will come from a database or an API in JSON/Sexp form and will be serialized/deserialized. I expect the cardinality to be finite (the initial version maybe around 50-60) but constantly expanding.
An example in OCaml-ish code (written ad hoc, so I’m not sure it is valid; it is close to one of my attempts) should give a rough idea:
```ocaml
(* Identifiers shared by every event *)
type event_id = EventID of int
type event_ts = EventTS of int
type event_common = { id: event_id; ts: event_ts }

(* Stubs: the real payloads are typed per event/entity *)
type event_data = EventData
type session = SessionData
type user = UserData
type personal_config = PersonalConfig

(* One metadata type per combination of data groups.
   Field names repeat across records; OCaml accepts this,
   but the latest definition shadows earlier ones by default. *)
type meta_empty = EmptyMeta
type meta_with_user = { user: user }
type meta_with_session = { session: session }
type meta_with_user_and_personal_config = { user: user; config: personal_config }
type meta_with_user_and_session = { user: user; session: session }
type meta_with_user_and_session_and_personal_config = { user: user; session: session; config: personal_config }
(* More meta + different sets of meta data *)

(* Some example events *)
type user_event =
  | UserNew of (event_common * meta_empty * event_data)
  | UserLogin of (event_common * meta_with_user * event_data)
  | UserJoin of (event_common * meta_with_user_and_session * event_data)

type meta_event =
  | Tick of (event_common * meta_empty * event_data)
  | UpdateEvent of (event_common * meta_empty * event_data)

type session_event =
  | SessionNew of (event_common * meta_with_session * event_data)
  | SessionExport of (event_common * meta_with_user_and_session * event_data)
(* etc. *)
```
Problem itself
However I model this, I run into the same set of issues:
- Grabbing the ID from an Event is difficult
  - e.g. `get_id` requires an implementation for every case, or I can’t get the types to work
- It’s difficult to model and grab a subset of the data
  - for `get_user` in the above example I could mash all possible types together, but adding a new subset, for example `available_credits`, might explode other types and add to the boilerplate code
  - or I get a long chain of boxing like: `BusinessEvent WithUserData UserJoin ...`
- Lack of ability to have different groupings, e.g. `SessionExport` and `UserJoin` both have a session and thus could both be subject to a `get_session` function
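To make these accessor problems concrete, here is a minimal sketch of one direction that might be worth exploring. It is not something I have validated on the full design. It assumes the common fields are moved into a single envelope record and the event kinds become polymorphic variants, so the same tags can be regrouped freely. All names (`'kind event`, `any_kind`, `with_session`, `session_of_kind`, `session_opt`) and the `string` payloads are hypothetical.

```ocaml
(* Sketch only: all names below are hypothetical, payloads are stubs. *)
type event_id = EventID of int
type event_ts = EventTS of int
type event_data = EventData
type user = UserData of string        (* string payload is an assumption *)
type session = SessionData of string  (* string payload is an assumption *)

(* Common fields live in one envelope; only [kind] differs per event. *)
type 'kind event = {
  id : event_id;
  ts : event_ts;
  kind : 'kind;
  data : event_data;
}

(* Families: each tag carries just its own metadata. *)
type user_kind =
  [ `UserNew
  | `UserLogin of user
  | `UserJoin of user * session ]

type session_kind =
  [ `SessionNew of session
  | `SessionExport of user * session ]

type any_kind = [ user_kind | session_kind | `Tick | `UpdateEvent ]

(* Cross-cutting grouping: the same tags, regrouped by what they carry. *)
type with_session =
  [ `UserJoin of user * session
  | `SessionNew of session
  | `SessionExport of user * session ]

(* Written once, works for every event whatever its kind. *)
let get_id (e : _ event) : event_id = e.id

(* Accepts any subset of [with_session]; other kinds are a type error. *)
let session_of_kind : [< with_session ] -> session = function
  | `UserJoin (_, s) | `SessionNew s | `SessionExport (_, s) -> s

let get_session (e : [< with_session ] event) : session =
  session_of_kind e.kind

(* Total version for events whose kind is not known statically. *)
let session_opt (e : any_kind event) : session option =
  match e.kind with
  | #with_session as k -> Some (session_of_kind k)
  | _ -> None
```

With this shape, `get_id` needs no per-event case, and calling `get_session` on an event whose kind is `` `Tick `` should be rejected by the type checker. The trade-off is that polymorphic variants tend to produce longer error messages, and a new grouping such as `with_user` still has to list its tags by hand.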
I’m also aiming at the following design goals:
- I want to be able to type a long chain of Event progression, i.e. make sure that in a chain there is no `UserLeft` before `UserJoined`
- More importantly, have a single place to inspect (or type) all possible Event rules (e.g. a processor, after seeing `UserNew`, should spawn `IssueTrialPeriod`)
- Bag all of those in a list and process it in multiple phases (e.g. `UserNew`, `Tick`, `Tick`, `UserJoin` would be processed in two passes: the first pass would remove all `Tick` events (preprocessing phase) and the second would receive only `UserNew`, `UserJoin`)
- Need to ensure that no Event exists that cannot be handled by the existing system
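As an illustration of the multi-phase and "single place for rules" goals, here is a rough sketch using polymorphic variants, where each phase returns a narrower type than it receives. It assumes OCaml 4.10 or later (for `List.concat_map`). The names `noise`, `business`, `incoming`, `preprocess`, `spawned_by` and `process`, the plain `int` ids and timestamps, and the `string` payloads are all hypothetical simplifications. It does not attempt the compile-time ordering check (`UserLeft` before `UserJoined`).

```ocaml
(* Sketch only: names and simplified int ids/timestamps are assumptions. *)
type user = UserData of string
type session = SessionData of string

type 'kind event = { id : int; ts : int; kind : 'kind }

(* Housekeeping events that the preprocessing phase removes. *)
type noise = [ `Tick | `UpdateEvent ]

(* Events the second phase must handle. *)
type business =
  [ `UserNew
  | `UserJoin of user * session
  | `IssueTrialPeriod ]

type incoming = [ noise | business ]

(* Phase 1: the return type guarantees that no `Tick reaches phase 2. *)
let preprocess (events : incoming event list) : business event list =
  List.filter_map
    (fun (e : incoming event) ->
      match e.kind with
      | #noise -> None
      | #business as k -> Some { id = e.id; ts = e.ts; kind = k })
    events

(* Phase 2: the one place where rules live. There is no wildcard case, so
   adding a tag to [business] makes this match non-exhaustive (warning 8). *)
let spawned_by (e : business event) : business event list =
  match e.kind with
  | `UserNew ->
      (* The spawned event reuses the parent's id and ts for brevity. *)
      [ { e with kind = `IssueTrialPeriod } ]
  | `UserJoin _ | `IssueTrialPeriod -> []

(* Both phases chained: each surviving event is followed by what it spawns. *)
let process (events : incoming event list) : business event list =
  preprocess events |> List.concat_map (fun e -> e :: spawned_by e)
```

Because `spawned_by` matches on a closed type without a catch-all, the compiler's exhaustiveness check doubles as the "no unhandled Event" guarantee, as long as non-exhaustive matches are treated as errors in the build.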
I understand that’s a lot, so I’d like to mention that I’m not asking for working code or a complete solution. I’ve been struggling with this for a long time, though, and would be grateful for any pointers towards a solution, a discussion of a similar conundrum, or (I don’t exclude it) an error in my thinking or assumptions.