I have a question concerning XML Schema's built-in type xsd:dateTime.
What are the exact semantics of xsd:dateTime without a timezone? Ex. 1970-01-01T00:00:00.
I've read through a number of XML Schema spec documents but could not find out how it should be processed.
Specifically, I want to understand how to convert xsd:dateTime to a Date object (like java.util.Date or JavaScript Date) correctly.
Side note: I am perfectly aware of Java util classes like DatatypeConverter or DatatypeFactory; I would like to find the XML Schema spec that defines how to do this conversion.
The problem with the Date class (in Java as well as in JavaScript) is that these classes do have timezones (defaulted to the local time zone). If I'm getting an xsd:dateTime without a time zone on input, then I have to decide somehow which time zone I should assume. Otherwise I just can't convert it to a timezoned value (like Date).
Now the question is what I should assume. I see the following options here:
Assume UTC.
Assume the local time zone.
I don't really like the second option. It is entirely random! On my machine, if I run
System.out.println(DATATYPE_FACTORY
.newXMLGregorianCalendar("1970-01-01T00:00:00")
.toGregorianCalendar().getTime().getTime());
I'll get -3600000, 0, 3600000 for GMT+1, GMT or GMT-1 (and even more variants depending on summer time). This is so arbitrary, I'm really not getting this. Does this mean that when we have an XML document with an element like
<date-time>1970-01-01T00:00:00</date-time>
we actually have no idea exactly which time instant was meant?
The first option (assuming UTC) seems more valid to me but this is apparently not what (at least) Java tools are doing.
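To make the machine dependence concrete, here is a minimal sketch that pins the assumed zone explicitly instead of relying on the JVM default. It uses the standard javax.xml.datatype API; the class name AssumedZoneDemo and the helper names hasTimezone and epochMillis are made up for illustration. The assumed zone is applied only when the lexical value carries no offset of its own.

```java
import java.util.TimeZone;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeConstants;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

// Illustrative helper class; the names are assumptions, not part of any library.
final class AssumedZoneDemo {

    // Parses the lexical value with the standard JAXP DatatypeFactory.
    static XMLGregorianCalendar parse(String lexical) {
        try {
            return DatatypeFactory.newInstance().newXMLGregorianCalendar(lexical);
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException("No DatatypeFactory available", e);
        }
    }

    // True if the lexical value carried an explicit offset (or Z).
    static boolean hasTimezone(String lexical) {
        return parse(lexical).getTimezone() != DatatypeConstants.FIELD_UNDEFINED;
    }

    // Epoch milliseconds; the assumed zone is used only when the value has none.
    static long epochMillis(String lexical, TimeZone assumed) {
        XMLGregorianCalendar cal = parse(lexical);
        if (cal.getTimezone() != DatatypeConstants.FIELD_UNDEFINED) {
            return cal.toGregorianCalendar().getTimeInMillis();
        }
        return cal.toGregorianCalendar(assumed, null, null).getTimeInMillis();
    }

    public static void main(String[] args) {
        String lexical = "1970-01-01T00:00:00";
        // Same lexical value, three different assumptions about the missing zone.
        System.out.println(epochMillis(lexical, TimeZone.getTimeZone("UTC")));
        System.out.println(epochMillis(lexical, TimeZone.getTimeZone("GMT+01:00")));
        System.out.println(epochMillis(lexical, TimeZone.getTimeZone("GMT-01:00")));
    }
}
```

With the zone passed in explicitly, the result no longer changes from machine to machine, although which zone to pass is still an assumption the caller has to make.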
So could someone please give me a pointer to a spec of some kind defining the semantics of the timezoneless xsd:dateTime?
Thank you.
Update:
Current findings are:
xsd:dateTime to a Date object which has a specific time zone, UNLESS an assumption about the absent time zone is somehow made.
My solution will be as follows:
context object which provides the XML processing context (analog of JAXB JAXBContext). I will extend this object with methods like getDefaultTimezoneOffset() and setDefaultTimezoneOffset(int timezoneOffset).
0 (UTC) at the moment. However, it can be the local time zone (like Java tools do) as well.
xsd:dateTime to Date, if the incoming value is missing a time zone, it will be assumed to be the context.getDefaultTimezoneOffset().
originalTimezoneOffset or something like that. This will not modify the value of the Date object but will provide some additional context information (for instance when the value should be printed again).
Date, the library would check for the originalTimezoneOffset and, if it is provided, consider it when rendering the lexical value.
The standard XSD DateTime and Date formats are CCYY-MM-DDThh:mm:ss and CCYY-MM-DD, respectively, because the underlying XSD schema of the DataSet maps the DateTime and Date columns of the database to the DateTime and XSD Date data types.
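The default-offset idea from the update could look roughly like the sketch below. It is only an illustration under stated assumptions: DateTimeContext is a hypothetical class, the accessor names mirror the ones mentioned in the update but take a java.time.ZoneOffset rather than an int, and parsing uses DateTimeFormatter.ISO_DATE_TIME, which covers common xsd:dateTime values but not the whole lexical space.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

// Hypothetical context object; not a real library API.
final class DateTimeContext {
    // UTC by default, as proposed in the update; callers may change it.
    private ZoneOffset defaultOffset = ZoneOffset.UTC;

    ZoneOffset getDefaultTimezoneOffset() {
        return defaultOffset;
    }

    void setDefaultTimezoneOffset(ZoneOffset offset) {
        this.defaultOffset = offset;
    }

    // Converts a lexical value to an instant, falling back to the
    // context's default offset when the value has no offset of its own.
    Instant toInstant(String lexical) {
        TemporalAccessor parsed = DateTimeFormatter.ISO_DATE_TIME.parseBest(
                lexical, OffsetDateTime::from, LocalDateTime::from);
        if (parsed instanceof OffsetDateTime) {
            return ((OffsetDateTime) parsed).toInstant();
        }
        return ((LocalDateTime) parsed).toInstant(defaultOffset);
    }

    public static void main(String[] args) {
        DateTimeContext context = new DateTimeContext();
        System.out.println(context.toInstant("1970-01-01T00:00:00"));
        context.setDefaultTimezoneOffset(ZoneOffset.ofHours(1));
        System.out.println(context.toInstant("1970-01-01T00:00:00"));
    }
}
```

Keeping the assumption in one configurable place makes it visible and testable, which is the main point of the context object; an explicit offset in the input always takes precedence over the default.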
ToString("yyyy-MM-dd HH:mm:ss");
The XML Schema definition language (XSD) enables you to define the structure and data types for XML documents. An XML Schema defines the elements, attributes, and data types that conform to the World Wide Web Consortium (W3C) XML Schema Part 1: Structures Recommendation for the XML Schema Definition Language.
An XML schema definition (XSD) is a framework document that defines the rules and constraints for XML documents. An XSD formally describes the elements in an XML document and can be used to validate the contents of the XML document to make sure that it adheres to the rules of the XSD.
Basically the timezone is absent information, and there are many ways of interpreting absent information; in the end it's up to you. Possible interpretations are:
the timezone is unknown
the timezone can be established from the context, e.g. an associated place
the timezone is UTC
The XPath/XQuery/XSLT family of specifications assume a context-defined timezone. The context here could be the locale of the user, or the timezone of the machine on which the software is running, or any number of other things.
In a sense it's no different from omitting the time and giving only a date. What exactly do you mean when you say you were born on 21 March 1973? What timezone are you talking about? The assumption is probably that you've left out the information because no-one is likely to care.
This is what I've used myself. It all starts from the dateTime spec:
"Local" or untimezoned times are presumed to be the time in the timezone of some unspecified locality as prescribed by the appropriate legal authority; currently there are no legally prescribed timezones which are durations whose magnitude is greater than 14 hours. The value of each numeric-valued property (other than timeOnTimeline) is limited to the maximum value within the interval determined by the next-higher property. For example, the day value can never be 32, and cannot even be 29 for month 02 and year 2002 (February 2002).
If that is confusing, then go to section 3.2.7.2 Order relation on dateTime
Excerpts (to meet posting criteria here):
The ordering between two dateTimes P and Q is defined by the following algorithm [...] A. Normalize P and Q. That is, if there is a timezone present, but it is not Z, convert it to Z [...]
These would be relevant:
C. Otherwise, if P contains a time zone and Q does not, compare as follows:
1. P < Q if P < (Q with time zone +14:00)
2. P > Q if P > (Q with time zone -14:00)
3. P <> Q otherwise, that is, if (Q with time zone +14:00) < P < (Q with time zone -14:00)
D. Otherwise, if P does not contain a time zone and Q does, compare as follows:
1. P < Q if (P with time zone -14:00) < Q
2. P > Q if (P with time zone +14:00) > Q
3. P <> Q otherwise, that is, if (P with time zone +14:00) < Q < (P with time zone -14:00)
The "magic number" 14, from 3.2.7:
[...]currently there are no legally prescribed timezones which are durations whose magnitude is greater than 14 hours.
Of course, you could run into indeterminate scenarios, that is, cases where the order cannot be ascertained:
2000-01-01T12:00:00 <> 1999-12-31T23:00:00Z
2000-01-16T12:00:00 <> 2000-01-16T12:00:00Z
2000-01-16T00:00:00 <> 2000-01-16T12:00:00Z
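The quoted comparison rules can be sketched in code as a partial order. This is a hedged illustration, not the spec's reference algorithm: the class DateTimeOrder and its Result enum are invented names, values are parsed with DateTimeFormatter.ISO_DATE_TIME (which does not cover every legal xsd:dateTime), and the pairs with both or neither time zone are simply compared directly.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

// Illustrative only; names are assumptions, not from the spec or a library.
final class DateTimeOrder {
    enum Result { LESS, EQUAL, GREATER, INDETERMINATE }

    private static final ZoneOffset PLUS_14 = ZoneOffset.ofHours(14);
    private static final ZoneOffset MINUS_14 = ZoneOffset.ofHours(-14);

    private static TemporalAccessor parse(String lexical) {
        return DateTimeFormatter.ISO_DATE_TIME.parseBest(
                lexical, OffsetDateTime::from, LocalDateTime::from);
    }

    private static Result fromSign(int c) {
        return c < 0 ? Result.LESS : c > 0 ? Result.GREATER : Result.EQUAL;
    }

    static Result compare(String p, String q) {
        TemporalAccessor a = parse(p);
        TemporalAccessor b = parse(q);
        boolean aZoned = a instanceof OffsetDateTime;
        boolean bZoned = b instanceof OffsetDateTime;
        if (aZoned && bZoned) {
            // Both normalized to the UTC timeline.
            Instant pi = ((OffsetDateTime) a).toInstant();
            Instant qi = ((OffsetDateTime) b).toInstant();
            return fromSign(pi.compareTo(qi));
        }
        if (!aZoned && !bZoned) {
            // Neither has a time zone: compare the local fields.
            return fromSign(((LocalDateTime) a).compareTo((LocalDateTime) b));
        }
        if (aZoned) {
            // Rule C: P has a time zone, Q does not.
            Instant pi = ((OffsetDateTime) a).toInstant();
            LocalDateTime ql = (LocalDateTime) b;
            if (pi.isBefore(ql.toInstant(PLUS_14))) return Result.LESS;
            if (pi.isAfter(ql.toInstant(MINUS_14))) return Result.GREATER;
            return Result.INDETERMINATE;
        }
        // Rule D: P has no time zone, Q does.
        LocalDateTime pl = (LocalDateTime) a;
        Instant qi = ((OffsetDateTime) b).toInstant();
        if (pl.toInstant(MINUS_14).isBefore(qi)) return Result.LESS;
        if (pl.toInstant(PLUS_14).isAfter(qi)) return Result.GREATER;
        return Result.INDETERMINATE;
    }

    public static void main(String[] args) {
        System.out.println(compare("2000-01-01T12:00:00", "1999-12-31T23:00:00Z"));
        System.out.println(compare("2000-01-16T12:00:00", "2000-01-16T12:00:00Z"));
        System.out.println(compare("2000-01-16T00:00:00", "2000-01-16T12:00:00Z"));
    }
}
```

For the three pairs listed above this sketch reports INDETERMINATE, because in each case the timezoned value falls inside the 28-hour window spanned by the +14:00 and -14:00 interpretations of the timezoneless one.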
It is really hard to tell what kind of assumption you should make. You need to chase down and understand how that value was captured and then passed on to you in XML, since both assumptions can be wrong! If this data is passed around and eventually sent back to systems in the same realm as the one that sent it, a safe practice is to make sure you always keep a "string" copy of that data.
I really don't think that the stuff you're getting is random. You just need to read a bit more on these specs. And I am not saying it is easy - it is the way it is; plus, this is not about XML or XSD, it is about timezones in general.