Time Format Conversion Made Easy

peterchen

11 Apr 2012, CPOL, 15 min read

Conversion of and musing about common Windows time formats

Download source - 27.2 KB

Introduction

Windows uses different time formats, with only selected conversion paths between them. This article provides comprehensive conversions between the most common ones, and a discussion of why date and time are complicated and what to do to preserve your sanity. Text representations of dates and times are not covered by this article.

Time is the simplest thing
Date & Time Handling Checklist (Nov 2011)
The Conversion Library
- Supported Types
- The particular nastiness of OLE Date/Time
- Selecting a Format for Storage (Apr 2012)
- API Reference (Apr 2012)
More Data
Links
Tests
History

For updates, I have added a note when the section was added or changed.

Time is the Simplest Thing

That was supposed to become the title for my article. Since Joseph M. Newcomer already has an article with that name, I'm stuck with the one you see above.

Date and Time handling is one of the areas that attracts bugs because the complexity of the problem is often underestimated. From the Zune bug to numerous unnamed oddities, our measurements of date and time carry a lot of conventions that resist simple calculations and suggest assumptions that don't always hold true. I'll start with the oddities:

Leap Years

Keeps the calendar year in sync with the solar year. A day, February 29, is inserted if the year is divisible by 4, unless it's divisible by 100 but not by 400.

C++

isLeapYear = (year % 400 == 0) || ( (year % 4 == 0) && !(year % 100 == 0) )

1996 and 2000 were leap years, but 1900 was not. Date/time APIs usually consider leap years, e.g., when calculating a time span, but do not always respect the century rules. Remember the full rule.

Leap Seconds

Keeps the calendar day in sync with the earth day. A second is inserted at the end of June 30 or December 31 (UTC). When this article was written, the most recent one was December 31, 2008, 23:59:60. This happens at the same instant worldwide, so in Japan that was January 1st, 08:59:60. Yes, these are valid times.
Since the underlying effects are not predictable, leap seconds are announced about six months in advance.

Most time APIs ignore leap seconds - i.e. continuous formats like time_t either jump back a second or have one second that lasts twice as long.

Time Zone

Helps everyone agree that 6:30 is too early.
The world is split into time zones that roughly correlate to longitude, corrected for political borders. That's good, otherwise it might be 6:30 in your bedroom, but 7:30 in the office already.

The "time zone free" standard time is Coordinated Universal Time (UTC). Time zones are described by an offset to this time.

The Prime or Zero Meridian goes through Greenwich near London, so London has no time zone offset to UTC. Berlin has +1 hour - that is, it's one hour later in Berlin than in London. Some regions have half-hour offsets (e.g. India +5 ½). China, which geographically spans about five time zones, just has a single time zone.

At the opposite of the prime meridian is the dateline which, when crossed, propels you one day into the future, or one day back. Doesn't help against the hair loss, though.

Daylight Saving Time (DST)

Don't get me started on this one - suffice to say, it was suggested by an insect collector so you get an hour of after-work daylight. Supported and opposed for many reasons, the clock is set forward in spring by one hour, and back again in autumn. When and at what time varies by location, and has changed frequently. Israel, for example, has settled the date and time, as well as whether to have DST at all in a given year, by parliamentary debate. This means that static definitions of DST rules can be outdated quickly. You best rely on your platform's update mechanisms for correctness.

UTC Offset

The UTC offset is the difference between local time and Coordinated Universal Time (UTC): local time = UTC + offset. The UTC offset usually consists of both the time zone offset and any additional offset like Daylight Saving Time.

Calendar

All that's said here is true only for one calendar - the Gregorian Calendar, fortunately the international de facto standard, even though other calendars are still in common use. The Gregorian calendar was introduced in 1582 as a correction to the then-common Julian Calendar, one of the changes being a direct jump from October 4th to October 15th, to correct the inaccuracies of the Julian calendar that had accumulated over the centuries. If you want to work with dates before that, additional trouble awaits.

Some calendars still in use are the Hebrew, Persian and Hindu calendars, featuring days of varying lengths, leap months and other features that make the above problems look simple.

What Does That Mean?

Common time span expressions like "two weeks" have different actual duration depending on when they take place.

Around DST changes, local time is unreliable. When the clock is set back, the same local time describes two points in time, one hour apart - and you can't say which one is meant. When the clock is set forward, there's an hour's worth of local timestamps that don't exist.

Most environments have problems keeping up with the changes, and with accurately storing and applying previous ones, so certain calculations - especially time spans between two points - suffer from inaccuracies.

Checklist

There are a few things that help preserve sanity:

Don't assume date & time handling is simple. Be aware of UTC vs. local time. Use tested APIs instead of implementing your own - unless you know exactly what you are doing. Test. Peer review. Test your tests. Test your peer reviewers. Check your pet for fleas. Whatever it takes.

Local vs. UTC awareness. Local timestamps are not comparable and some values are ambiguous. UTC timestamps are hard for users to relate to. A common mistake is to mix up local timestamp values with UTC. Document for variables, APIs and file formats whether a timestamp value is UTC or local time.

Compare UTC times. If you want to "sort by date", or find out which of two events took place earlier, store and use UTC. Even if your data occurs in a single time zone, DST can mess up the order.

Local time without location is meaningless. I recommend that you "know" the UTC offset for every local time you carry around. When persisting or transmitting timestamps, include the UTC time and the local offset that applied when and where the timestamp was generated. Associated with this are three time representations that may be relevant:

UTC for comparison and sorting
local time of the sender - e.g. the logs of your transatlantic partners reporting problems "always around midnight"
local time of the receiver - e.g. looking for an event that occurred "half an hour ago" on the other side of the world

Do not rely on constant fixed time spans. Not only the duration of a month varies, but also the duration of days, hours and minutes. More exactly: the conversion between units of time spans depends on the actual timestamp they are applied to. This needs to be considered when allowing users to enter colloquial time spans such as "two months".
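To make the "two months" case concrete, here is a minimal sketch in plain C++ using Gregorian rules only. The helper names (`IsLeapYear`, `DaysInMonth`, `TwoMonthsInDays`) are made up for this illustration and are not part of the library.

```cpp
// Hypothetical helpers for illustration; not part of the conversion library.
bool IsLeapYear(int year)
{
    return (year % 400 == 0) || ((year % 4 == 0) && (year % 100 != 0));
}

// month is 1..12; returns 0 for an invalid month.
int DaysInMonth(int year, int month)
{
    static const int days[12] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    if (month < 1 || month > 12)
        return 0;
    return (month == 2 && IsLeapYear(year)) ? 29 : days[month - 1];
}

// Length in days of "two months" starting on the first day of the given month.
int TwoMonthsInDays(int year, int month)
{
    int nextMonth = (month == 12) ? 1 : month + 1;
    int nextYear  = (month == 12) ? year + 1 : year;
    return DaysInMonth(year, month) + DaysInMonth(nextYear, nextMonth);
}
```

With these helpers, `TwoMonthsInDays(2011, 1)` is 59, `TwoMonthsInDays(2012, 1)` is 60, and `TwoMonthsInDays(2011, 7)` is 62, so the same colloquial time span covers three different durations.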

The Time Conversion Library

Supported Types

Type	Base type	Resolution¹	Epoch²	earliest date⁴	latest date⁴
time_t (32 bit)	32 bit signed integer	1 second	1.1.1970	Dec 13, 1901 20:45:52	Jan 19, 2038 03:14:07
time_t (64 bit)	64 bit signed integer	1 second	1.1.1970	time didn't exist	the sun is gone
struct tm	struct (36 bytes)	1 second	1.1.1900	1.1.1900	31.12.3000 23:59:59
SYSTEMTIME	struct (16 bytes)	1 millisecond		1.1.1601	31.12.30827 23:59:59.999
OLE Date/Time	double	0.5 seconds ⁵	30.12.1899	1.1.100 ⁵	31.12.9999
FILETIME	64 bit unsigned integer ³	100 nanoseconds	1.1.1601	1.1.1601	~year 60055

1) Smallest unit that can be represented. Methods that get the current time may rely on an underlying counter that increases in larger steps. E.g. GetSystemTimeAsFileTime may return current time in time slice increments of ~15ms.
2) Typically, the day when the calendar begins. For continuous formats, this is the date/time represented by 0, in UTC. Note that some allow negative values that are before the epoch.
3) In a struct of 2x32 bit unsigned integer
4) Some APIs may have more stringent limits, e.g. some conversion functions don't accept negative time_t values, and 64 bit time_t is frequently limited to Dec 31, 3000.
5) OLE Date/Time is a little crazy, see below.

The Particular Nastiness of OLE Dates

OLE Dates (VT_DATE) could have been great: they specify days since the epoch (Dec 30, 1899), the fractional part giving the fraction of the day (e.g. 0.5 for noon). The floating point representation allows you to trade off resolution near the epoch for long term calculations.

Like other linear formats, DST and leap seconds are glossed over. The first can be avoided by storing only UTC, the second is a problem of all formats. However, there are two oddities in the implementation that you need to be aware of:

Negative values need to be split up into integer and fractional part: e.g. -2.5 means two days back and half a day forward from the epoch (i.e. the sign is applied to the integer part, but not the fractional part). This makes numeric errors especially troublesome: -2 means "two days back from the epoch", ending up at Dec 28, 1899. -1.9999999 means go back one day and almost an entire day forward again, so you end up just a bit before Dec 30, 1899 - almost two days apart.

The epoch was chosen to make Excel compatible with a bug in Lotus 1-2-3, leading to some inaccurate conversions in January and February 1900. For details, see the link list below.
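The negative-value rule is easier to see in code. The following is a minimal sketch, not the library's implementation, and the function name `OleDateToLinearDays` is made up for this example. It turns an OLE date into an ordinary "linear" count of days relative to the epoch, where arithmetic behaves as expected.

```cpp
#include <cmath>

// Hypothetical helper, not part of the library: converts an OLE date value
// into a plain (linear) number of days relative to Dec 30, 1899, 00:00.
double OleDateToLinearDays(double oleDate)
{
    double wholeDays = 0.0;
    // modf splits the value; both parts carry the sign of oleDate
    double fraction = std::modf(oleDate, &wholeDays);
    // the time of day always counts forward, so drop the sign of the fraction
    return wholeDays + std::fabs(fraction);
}
```

For example, -2.5 becomes -1.5 (noon on Dec 28, 1899), -2.0 stays -2.0, and -1.9999999 becomes a tiny negative number just before the epoch. Note that -0.25 and 0.25 both map to 0.25, i.e. they name the same point in time.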

Selecting a Format for Storing Timestamps

FILETIME actually looks promising for typical needs: high resolution, a sufficient range and very simple calculations using an int64 for conversions and time spans. Portability is good enough with the simple conversion from/to time_t.

time_t (64 bit) is the default choice, but you get only one second resolution, which may not be enough for some applications. It's probably the most portable. The range is limited to the year 3000 by some APIs. Avoid 32 bit time_t at all costs.

I would avoid SYSTEMTIME because it is a platform-dependent struct. OLE Date/Time, while conceptually interesting, only offers half-second resolution, has broken arithmetic for negative values, and you have to be prepared for numeric precision issues.

Another option would be storing a string in a culture-neutral format. This requires variable-length storage, and the extra cost of formatting and parsing may be prohibitive for large amounts of data, but you are more independent of the actual date/time APIs and formats supported.

My current bet is on FILETIME. The API shows a way to pack a FILETIME and the UTC offset into an int64 without tangible losses, allowing for rather easy handling.

API Reference

`time_ref, time_const_ref`	Argument converters to implement methods that accept all of the supported types (You don't really need to know them, but then, the `TimeConvert` API looks weird)
`TimeConvert`	Converts between time formats
`FileTime`, `FileTime64`	Conversion between `FILETIME` structure and `unsigned __int64`
`GetUTCOffset`	returns the offset between local time and UTC (in seconds)
`FmtUTCOffset`	Formats a UTC offset as a string (e.g. -7 or +5:30)
`PackFileTime64`, `UnpackFileTime`, `GetPackedFileTimeNow`	Combine time and UTC offset in a 64 bit integer; get the current time and UTC offset as a 64 bit integer

time_ref, time_const_ref

This structure acts as an argument adapter, letting you write functions that accept references to any of the types listed above. It is typically used only in an argument list; you don't create instances of this type yourself. See TimeConvert for a method that uses it.

C++

struct time_ref
{
   ETimeFmt fmt;   // tells which union member is valid

   union
   {
      __time32_t * m_time32_t;
      __time64_t * m_time64_t;
      ...
   };

   // one converting constructor per supported type
   time_ref(__time32_t & m)       : fmt(tftUnix32), m_time32_t(&m) {}
   time_ref(__time64_t & m)       : fmt(tftUnix64), m_time64_t(&m) {}
   ...
};

time_const_ref is implemented similarly, but accepts const references and holds const pointers.

TimeConvert

C++

HRESULT TimeConvert(time_const_ref src, time_ref dst)

Converts the time passed as src into the time passed as dst. If the conversion fails, dst is unmodified.

Returns:

S_OK if the conversion succeeds
E_INVALIDARG if a custom conversion fails because the source value is out of range for the target value
Error code of a system conversion function, if that method fails

FileTime, FileTime64

C++

inline unsigned __int64 FileTime64(FILETIME ft)
inline FILETIME FileTime(unsigned __int64 ft64)

Converts between the FILETIME structure and a 64 bit integer.
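For readers without the download at hand, a conversion like this can be sketched as follows. This is an illustration only and not necessarily how the library does it. To keep the sketch compilable on any platform, `FileTimeParts` is a stand-in struct that mirrors the two members of the Windows FILETIME structure, and the function names are made up.

```cpp
#include <cstdint>

// Stand-in for the Windows FILETIME structure so the sketch compiles anywhere;
// the member names match the real structure.
struct FileTimeParts
{
    std::uint32_t dwLowDateTime;
    std::uint32_t dwHighDateTime;
};

// Combine the two 32 bit halves into one 64 bit value.
inline std::uint64_t ToFileTime64(FileTimeParts ft)
{
    return (static_cast<std::uint64_t>(ft.dwHighDateTime) << 32) | ft.dwLowDateTime;
}

// Split a 64 bit value back into the two 32 bit halves.
inline FileTimeParts ToFileTimeParts(std::uint64_t ft64)
{
    FileTimeParts ft;
    ft.dwLowDateTime  = static_cast<std::uint32_t>(ft64 & 0xFFFFFFFFu);
    ft.dwHighDateTime = static_cast<std::uint32_t>(ft64 >> 32);
    return ft;
}
```

Shifting and masking the halves avoids casting a FILETIME pointer to a 64 bit integer pointer, which Microsoft's documentation warns against because of possible alignment faults.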

GetUTCOffset

GetUTCOffset calculates the current difference between local time and UTC. This includes both time zone and DST, if applicable.

There is no official standard method to retrieve the data, so I am using the following algorithm:

get time_t as UTC
use gmtime to convert it to a struct tm
pass this struct to mktime which assumes it is a local time, subtracts UTC offset and converts back to time_t
the difference between this result and the UTC is the UTC offset, just with the wrong sign.

(I surely hope this algorithm doesn't have any problems.)
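The steps above can be sketched with the standard C library alone. This is not the library's `GetUTCOffset`: the name `UtcOffsetSketch` is made up, errors are simply reported as 0, and setting `tm_isdst` to -1 (so that `mktime` decides by itself whether DST applies) is my assumption about how to include DST in the result.

```cpp
#include <ctime>

// Hypothetical sketch of the steps above; not the library's GetUTCOffset.
// Returns "local time minus UTC" in seconds for the moment nowUtc,
// or 0 if one of the C library calls fails.
int UtcOffsetSketch(std::time_t nowUtc)
{
    std::tm* p = std::gmtime(&nowUtc);          // UTC broken-down time (not thread safe)
    if (!p)
        return 0;
    std::tm asUtc = *p;
    asUtc.tm_isdst = -1;                        // assumption: let mktime determine DST
    std::time_t shifted = std::mktime(&asUtc);  // treats the UTC fields as local time
    if (shifted == static_cast<std::time_t>(-1))
        return 0;
    // shifted == nowUtc - offset, so subtracting in this order fixes the sign
    return static_cast<int>(std::difftime(nowUtc, shifted));
}
```

Calling `UtcOffsetSketch(std::time(nullptr))` yields the current offset. Close to a DST switch, the offset that applies to the shifted time may differ from the one that applies to the real time, so treat the result as an approximation there.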

There is an annoying little problem with that, though: between getting UTC and the UTC offset, the UTC offset may change. I am not sure how relevant this is in practice. I can imagine there is a small race condition during a change of DST, but I am not taking any chances.

For this reason, GetUTCOffset can optionally return the UTC time for which this offset applies. To guarantee consistency, the calculation is repeated at least two times, more if the UTC offset changes between the last two calculations. There is no repetition when the time_t * pNow_utc argument is NULL.

What if you need the current time and the UTC offset, but in a different format? Use a loop, as demonstrated here for SYSTEMTIME:

SYSTEMTIME st = { 0 };
int utcOffset = 0, utcOffset2 = 0;

do
{
   utcOffset = GetUTCOffset();
   GetSystemTime(&st);           // returns void, so there is no result to check
   utcOffset2 = GetUTCOffset();
} while (utcOffset != utcOffset2);

// st and utcOffset now belong together: the offset did not change while st was read

FmtUTCOffset

Formats a UTC offset (given in seconds, as returned from GetUTCOffset) as a string. The string is empty when the offset is 0, and minutes are omitted when they are 0. Seconds are always ignored.

It is intended to be used together with a UTC prefix after a local time:

string.Format(_T("%s UTC%s"),  myDateTimeString, FmtUTCOffset(utcOffset));
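The formatting rules are simple enough to restate as code. The following is a hypothetical re-implementation using `std::string`, written from the description above rather than taken from the download. The name `FormatUtcOffsetSketch` and the two-digit minutes are my assumptions.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical re-implementation of the rules described above.
std::string FormatUtcOffsetSketch(int offsetSeconds)
{
    int totalMinutes = std::abs(offsetSeconds) / 60;   // seconds are dropped
    if (totalMinutes == 0)
        return std::string();                          // empty for a zero offset

    int hours   = totalMinutes / 60;
    int minutes = totalMinutes % 60;

    std::string s(offsetSeconds < 0 ? "-" : "+");
    s += std::to_string(hours);
    if (minutes != 0)                                  // minutes only when non-zero
    {
        s += ':';
        if (minutes < 10)
            s += '0';
        s += std::to_string(minutes);
    }
    return s;
}
```

For example, an offset of -25200 seconds gives "-7", 19800 seconds gives "+5:30", and 0 gives an empty string, so the result can be appended directly after "UTC".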

PackFileTime64, UnpackFileTime

What happens when a typical coder has to deal with time formats a few times? He invents a new one! It's the thing I wanted to avoid when starting the article.

PackFileTime64 combines a FILETIME value and a UTC offset into a single 64 bit unsigned integer:

FILETIME resolution is reduced from 100ns to 12.8µs (rounded to nearest value)
UTC offset is rounded to the nearest multiple of 15 minutes, and must be in the range -15..+16h (currently, time zones range from -12h to +14h).

The UTC offset is simply stored in the low bits, since the high resolution is rarely required or even available. (On my system, GetSystemTimeAsFileTime increments in 15.6ms steps).
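The numbers fit together as follows: 12.8µs is 128 FILETIME ticks, which frees the low 7 bits, and the range -15h..+16h in 15-minute steps needs 125 values. The sketch below shows one way such a packing could work. The bit layout, the bias and the names (`PackSketch`, `UnpackSketch`) are assumptions made for illustration; the library's actual layout may differ, so check the download before relying on it.

```cpp
#include <cstdint>

// Hypothetical layout for illustration only; the bit layout used by the
// library's PackFileTime64 may differ.
const std::uint64_t kOffsetMask = 127;   // low 7 bits hold the UTC offset
const int kOffsetBias = 15 * 4;          // -15h is stored as 0 (unit: quarter hours)

// Returns false if the rounded offset is outside -15h..+16h.
// Overflow for FILETIME values close to the 64 bit maximum is ignored here.
bool PackSketch(std::uint64_t ft64, int utcOffsetSeconds, std::uint64_t& packed)
{
    // round the offset to the nearest quarter hour (900 seconds)
    int quarters = (utcOffsetSeconds >= 0 ? utcOffsetSeconds + 450
                                          : utcOffsetSeconds - 450) / 900;
    if (quarters < -15 * 4 || quarters > 16 * 4)
        return false;

    // round the time to the nearest multiple of 128 ticks (12.8 microseconds)
    std::uint64_t rounded = (ft64 + 64) & ~kOffsetMask;
    packed = rounded | static_cast<std::uint64_t>(quarters + kOffsetBias);
    return true;
}

void UnpackSketch(std::uint64_t packed, std::uint64_t& ft64, int& utcOffsetSeconds)
{
    ft64 = packed & ~kOffsetMask;
    utcOffsetSeconds = (static_cast<int>(packed & kOffsetMask) - kOffsetBias) * 900;
}
```

Because the time occupies the high bits, packed values taken with the same UTC offset still sort by time, and values with different offsets sort correctly whenever the times differ by more than the 12.8µs step.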

GetPackedFileTimeNow

Retrieves the current UTC time and UTC offset as "packed int64" (as described above).

More Data

Typical Increments

The following table shows typical increments of the continuous types for common units of time (excluding leap seconds and leap days):

Type	second	minute	hour	day	week	non-leap-year ²⁾
time_t	1	60	3600	86400	604800	31536000
OLE Date/Time	¹/₈₆₄₀₀ (1.157407e-5)	¹/₁₄₄₀ (6.94e-4)	¹/₂₄ (0.0416...)	1	7	365
FILETIME	1e7	6e8	3.6e10	8.64e11	6.048e12	315360000000000 (3.1536e14)

1) Rule of thumb: there are roughly π × 10⁷ seconds in a year
2) For very long time ranges, you can calculate with 365.25 days/year to account for leap years

UTC vs. Local Time in Various Formats

Type	Current Time (UTC)	Current Time (local)	UTC to local	local to UTC
time_t	`time()`	`-`	`localtime`	`mktime`
OLE Date/Time	`-`	`-`	`-`	`-`
FILETIME	`-`	`-`	`FileTimeToLocalFileTime`	`LocalFileTimeToFileTime`
SYSTEMTIME	`GetSystemTime`	`GetLocalTime`	`SystemTimeToTzSpecificLocalTime`	`TzSpecificLocalTimeToSystemTime`

Dealing with DST Rule Changes

For dealing with the changing rules for DST, Windows offers the following additional APIs:

GetTimeZoneInformationForYear
GetDynamicTimeZoneInformation
SetDynamicTimeZoneInformation

I haven't evaluated how well they cover all changes, so don't expect magic.

More Information

Tests and Implementation Details

The worst was not the actual conversions but writing the tests, figuring out reliable conversion paths, and checking unreliable ones against them. This is the conversion matrix I used:

Conversion Matrix

The following conversion matrix was used. Green indicates the safe conversions (simple copy, or provided by a system library), red indicates custom calculations. The conversions between the continuous formats (time_t, FILETIME, OLE Date/Time) are linear, i.e.
timeB = timeA * factor + offset.
Thus, checks were made at two points for each conversion.
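As an illustration of the linear form, here is a sketch of the time_t to FILETIME and time_t to OLE Date/Time conversions. The function and constant names are made up and this is not the library's code. The constants are the well-known distances between the epochs listed in the table of supported types.

```cpp
#include <cstdint>

// Distances between the epochs (names are hypothetical, values are standard):
const std::int64_t kSecondsFrom1601To1970 = 11644473600LL; // FILETIME epoch to time_t epoch
const std::int64_t kTicksPerSecond        = 10000000LL;    // FILETIME counts 100 ns ticks
const double       kOleDateOf1970         = 25569.0;       // days from Dec 30, 1899 to Jan 1, 1970
const double       kSecondsPerDay         = 86400.0;

// timeB = (timeA + offset) * factor
std::int64_t UnixToFileTime64(std::int64_t unixSeconds)
{
    return (unixSeconds + kSecondsFrom1601To1970) * kTicksPerSecond;
}

// Inverse direction; the sub-second part of the FILETIME value is truncated.
std::int64_t FileTime64ToUnix(std::int64_t ft64)
{
    return ft64 / kTicksPerSecond - kSecondsFrom1601To1970;
}

// timeB = timeA * factor + offset; only meaningful for non-negative results,
// because negative OLE dates are not linear.
double UnixToOleDate(std::int64_t unixSeconds)
{
    return static_cast<double>(unixSeconds) / kSecondsPerDay + kOleDateOf1970;
}
```

Checking two points, e.g. time_t 0 (which must give FILETIME 116444736000000000 and OLE date 25569.0) and one later value, is enough to pin down both the factor and the offset of each conversion.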

(The table is also included as OpenCalc document in the download.)

Time Conversion overview

History

This code was triggered by running into a nasty bug that a unit test would have uncovered only in one half of each year (setting the tm_isdst member of a struct tm).

January 2011 - First release
November 2011 - fixed an incorrect value in the text, added support for UTC offset
December 2011 - cleaned up text flow a bit
April 2012 - what happens if you let a coder deal with half a dozen time formats? He invents a new one!
Added PackFileTime64, UnpackFileTime, GetPackedFileTimeNow

In Closing

Since this is a very dry topic, I give you a quote about time from a movie I happen to like (suitably redacted for international audiences).

Let me explain a couple of things.

Time is short. That's the first thing.

For the weasel, time is a weasel.
For the hero, time is heroic.
If you're gentle, your time is gentle.
If you're in a hurry, time flies.
Time is a servant if you are its master.
Time is your god if you are its dog.
We are the creators of time
the victims of time
and the killers of time.

Time is timeless.
That's the second thing.

You are the clock, Cassiel.
From: Faraway so close by Wim Wenders, 1993

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)