(The following text has been copypasted to the SAT/SMT by example book.)
In NFA, we would just create 12 paths. Alas, this is DFA, so these paths are overlapped at some vertices:
'1' is also accepting state, but can also lead to '10', '11' and '12':
(Both hours and minutes are always has two digits.)
For the sake of simplicity, only years 19xx and 20xx are accepted:
But what if the dash is optional?
Can accept dates in DD-MM-YYYY or YYYY-MM-DD format, dash must be present:
Can accept dates in DD-MM-YYYY or YYYY-MM-DD format, a dash is optional. Now the DFA is too large: date3.png
Important note: using the standard RE, it's impossible to match only DD-MM-YYYY or DDMMYYYY strings without matching DD-MMYYYY or DDMM-YYYY, because it's impossible to represent this in DFA form.
But you can fix this with non-standard backreferences. However, this wouldn't translate to DFA.
Read more about backreferences in my book: Ctrl-F "backreferences".
As my reader comments...
Date: Thu, 19 Aug 2021 12:46:29 -0500 From: Tim Chase < yurichev.com (a) tim.thechases.com > To: blog (a) yurichev.com Subject: Regex accepting DD-MM-YYYY or DDMMYYYY without DD-MMYYYY or DDMM-YYYY Reading https://yurichev.com/news/20210819_RE2/ you wrote > Important note: using the standard RE, it's impossible to match > only DD-MM-YYYY or DDMMYYYY strings without matching DD-MMYYYY or > DDMM-YYYY, because it's impossible to represent this in DFA form. It seems like this should be possible with [0123][0-9](-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-(19|20)[0-9]{2}|(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(19|20)[0-9]{2}) It's not pretty on account of having all that duplication, but the result should accept DD-MM-YYYY or DDMMYYYY without accepting DD-MMYYYY or DDMM-YYYY (now whether one could *also* accept YYYY-MM-DD/YYYYMMDD format with the same regexps, it might require some sort of deeper magic and back-tracking) Anyway, thanks for the fun posts! -Tim
Yes, I know about these lousy Disqus ads. Please use adblocker. I would consider to subscribe to 'pro' version of Disqus if the signal/noise ratio in comments would be good enough.