Description
First, I am deeply indebted to tools-python as a library for ntia-conformance-checker, a tool that I've been contributing to and that has helped with my research. So thank you all.
A user of ntia-conformance-checker reported a bug in which the tool raised an error when ingesting an SBOM whose "creationInfo" field contained a timestamp with milliseconds. The error was: Invalid created value '2023-03-17T19:44:10.259Z' must be date in ISO 8601 format.
I have investigated and I believe that tools-python incorrectly implements the regular expression for ISO 8601. The current regular expression is below:
Lines 37 to 40 in 613982b
# Matches an iso 8601 date representation
DATE_ISO_REGEX = re.compile(
    r"(\d\d\d\d)-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)Z", re.UNICODE
)
This regular expression does not allow for decimal fractions of a second. My current understanding of ISO 8601 (enriched by reading here) is that fractions of a second, with any degree of precision and hence any number of digits, are valid ISO 8601. Hence a revision of the code to:
DATE_ISO_REGEX = re.compile(
    r"(\d\d\d\d)-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)(\.[\d]*)?Z", re.UNICODE
)
would make DATE_ISO_REGEX correct. This regular expression looks for an optional decimal point followed by any number of digits. This revision should fix the bug reported earlier. I'll submit a PR that changes one of the test cases to have a "creationInfo" field with a fraction of a second value (to demonstrate a failing test case) and then add another commit to demonstrate that the modified regular expression fixes that failing test case, and hence the bug.
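For anyone who wants to check the two patterns quickly, here is a minimal sketch that runs the current and the proposed regular expression against the timestamp from the bug report. The names CURRENT_REGEX, PROPOSED_REGEX, STRICT_REGEX, and accepts are only for illustration and are not part of tools-python, and I am assuming the whole string must match (fullmatch), which may differ from how the library applies the pattern. STRICT_REGEX is a possible variant, not something proposed above: because `[\d]*` also matches zero digits, the proposed pattern accepts a bare trailing point such as `10.Z`, whereas `\d+` would reject it.

```python
import re

# Pattern currently in the library, as quoted in this issue: whole seconds only.
CURRENT_REGEX = re.compile(
    r"(\d\d\d\d)-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)Z", re.UNICODE
)

# Pattern proposed in this issue: optional point followed by zero or more digits.
PROPOSED_REGEX = re.compile(
    r"(\d\d\d\d)-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)(\.[\d]*)?Z", re.UNICODE
)

# Hypothetical stricter variant (an assumption, not from the issue):
# the point, if present, must be followed by at least one digit.
STRICT_REGEX = re.compile(
    r"(\d\d\d\d)-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)(\.\d+)?Z", re.UNICODE
)


def accepts(pattern, value):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(value) is not None


if __name__ == "__main__":
    samples = [
        "2023-03-17T19:44:10.259Z",  # timestamp from the bug report
        "2023-03-17T19:44:10Z",      # whole seconds, already accepted today
        "2023-03-17T19:44:10.Z",     # point with no digits after it
    ]
    for value in samples:
        print(
            value,
            "current:", accepts(CURRENT_REGEX, value),
            "proposed:", accepts(PROPOSED_REGEX, value),
            "strict:", accepts(STRICT_REGEX, value),
        )
```

The added group sits at the end of the pattern, so the indices of the six existing capture groups do not change.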
Apologies if I've created more problems along the lines of this programmer joke:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Please do check my regular expression carefully. They're tricky :)
cc @goneall