[ANN] Timere-parse 0.0.2, natural language parsing of date, time and duration

I’m happy to announce the release of Timere-parse 0.0.2, the natural language parsing component of Timere, a date and time handling and reasoning library. Both packages live in the Timere repo.

Timere-parse allows interpretation of common descriptions of date, time and duration.

Date time examples

Input strings are in double quotes; indented lines are the pretty-printed output.

"2020 jun 6 10am"
  Ok 2020-06-06T10:00:00Z
"2020 jun 6th 10:15"
  Ok 2020-06-06T10:15:00Z
"Australia/Sydney 2020 jun 6 10am"
  Ok 2020-06-06T10:00:00+10:00
"01-06-2020 10:10"
  Ok 2020-06-01T10:10:00Z
"2020/06/01 10am"
  Ok 2020-06-01T10:00:00Z
"jul 6 2021 9:15am"
  Ok 2021-07-06T09:15:00Z
"2020/06/01"
  Ok 2020-06-01T00:00:00Z

Duration examples

"24h"
  Ok 1 days 0 hours 0 mins 0 secs
"16.5 hours"
  Ok 16 hours 30 mins 0 secs
"1h20min"
  Ok 1 hours 20 mins 0 secs
"1 hour 2.5 minutes"
  Ok 1 hours 2 mins 30 secs
"100 seconds"
  Ok 1 mins 40 secs
"2.25 minutes 1 seconds"
  Ok 2 mins 16 secs
"5 days 6.5 hours"
  Ok 5 days 6 hours 30 mins 0 secs

Timere object examples

"2020 jun"
  Ok (pattern (years 2020) (months Jun))
"jan"
  Ok (pattern (months Jan))
"jan 6 12pm to 2pm"
  Ok (bounded_intervals whole (duration 366 0 0 0) (points (pick mdhms Jan 6 12 0 0)) (points (pick hms 14 0 0)))
"12th, 13 to 15, 20"
  Ok (pattern (month_days 12 13 14 15 20))
"16th 7:30am"
  Ok (pattern (month_days 16) (hours 7) (minutes 30) (seconds 0))
"16th 8am to 10am, 11am to 12pm"
  Ok (inter (pattern (month_days 16)) (union (bounded_intervals whole (duration 1 0 0 0) (points (pick hms 8 0 0)) (points (pick hms 10 0 0))) (bounded_intervals whole (duration 1 0 0 0) (points (pick hms 11 0 0)) (points (pick hms 12 0 0)))))
"2020 jun 16th 10am to jul 1 12pm"
  Ok (bounded_intervals whole (duration 366 0 0 0) (points (pick ymdhms 2020 Jun 16 10 0 0)) (points (pick mdhms Jul 1 12 0 0)))

Corpus

For the full corpus/examples, see corpus/ for code and corpus-outputs/ for generated outputs.

The demo site has been updated to use Timere-parse; you can now try interacting with Timere_parse.timere in a web browser at Time expression demo.

This is an impressive library for sure! However, I’ve found one pattern that is quite common in European countries but not recognized: 30.01.2020. Do you plan to support it?

Thanks! :D

Just finished adding - should be accessible via the demo now. It just never crossed my mind as I don’t see that date format very often.

(Adding it turned out to be surprisingly non-trivial, as the parser needs to distinguish yyyy.mm.dd and dd.mm.yyyy from X.Y (a float).)

EDIT: One extra detail on the problem, in case someone wants to do something similar in future: reconstructing the float x.y from the token sequence Nat x; Dot; Nat y is insufficient, because y has already lost its leading zeros, causing an incorrect parse (e.g. 1.05 would be read as 1.5). In other words, one would need to retain the original text of the Nat token to go the reconstruction route. Using a parser that yields all possible parse trees would remove the issue altogether.
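To make the leading-zero problem concrete, here is a minimal sketch. The token types and lexers are hypothetical and much simpler than Timere-parse's actual parser; they only assume an input of the form x.y. The lossy lexer stores naturals as `int`, while the faithful one keeps the original text.

```ocaml
(* Hypothetical token types for illustration; not Timere-parse's lexer. *)
type lossy = Nat of int | Dot
type faithful = Nat_text of string | Dot_text

(* Lexer that converts naturals to int immediately: "05" becomes 5. *)
let lex_lossy (s : string) : lossy list =
  match String.split_on_char '.' s with
  | [ x; y ] -> [ Nat (int_of_string x); Dot; Nat (int_of_string y) ]
  | _ -> invalid_arg "expected input of the form x.y"

(* Lexer that keeps the original digits of each natural. *)
let lex_faithful (s : string) : faithful list =
  match String.split_on_char '.' s with
  | [ x; y ] -> [ Nat_text x; Dot_text; Nat_text y ]
  | _ -> invalid_arg "expected input of the form x.y"

(* Rebuild the float from Nat x; Dot; Nat y. The leading zeros of y
   are already gone, so "1.05" is rebuilt as "1.5". *)
let float_of_lossy = function
  | [ Nat x; Dot; Nat y ] -> float_of_string_opt (Printf.sprintf "%d.%d" x y)
  | _ -> None

(* Rebuild the float from the retained text, which preserves "05". *)
let float_of_faithful = function
  | [ Nat_text x; Dot_text; Nat_text y ] -> float_of_string_opt (x ^ "." ^ y)
  | _ -> None

let () =
  assert (float_of_lossy (lex_lossy "1.05") = Some 1.5);
  assert (float_of_faithful (lex_faithful "1.05") = Some 1.05)
```

Keeping the text costs a little extra memory per token, but it lets the same token sequence be read later as either a date component or the fractional part of a float.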