NAME
parsedate - date parsing function
LIBRARY
library “libutil”
SYNOPSIS
#include <util.h>
time_t
parsedate(const char *datestr, const time_t *time, const int *tzoff);
DESCRIPTION
The parsedate() function parses a date and time from datestr, described in English, relative to an optional time point given in time and an optional timezone offset (in minutes behind/west of UTC) specified in tzoff. If time is NULL, the current time is used. If tzoff is NULL, the current time zone is used.
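Because tzoff counts minutes west of UTC, its sign is the opposite of the familiar +HHMM/-HHMM notation. The following is a minimal sketch of that conversion. The helper name tzoff_from_hhmm is hypothetical and is not part of libutil.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of libutil): convert a signed +/-HHMM
 * zone offset, written as an integer such as -500 or 530, into the
 * "minutes west of UTC" value that the tzoff argument expects.
 */
static int
tzoff_from_hhmm(int hhmm)
{
    int magnitude = abs(hhmm);
    int minutes;

    assert(magnitude % 100 < 60);   /* the MM part must be a valid minute */
    minutes = (magnitude / 100) * 60 + (magnitude % 100);

    /* East of UTC (+HHMM) is negative in tzoff; west (-HHMM) is positive. */
    return hhmm > 0 ? -minutes : minutes;
}
```

Under these assumptions, an offset of -0500 becomes 300 and +0100 becomes -60. The address of an int holding the result would then be passed as the third argument of parsedate().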
The datestr is a sequence of white-space separated items. The white-space is optional if the concatenated items are not ambiguous. An empty datestr is equivalent to midnight today (the beginning of this day).
The following words have the indicated numeric meanings: last = -1, this = 0, first, next, or one = 1, two = 2 (second is unused so that it is not confused with “seconds”), third or three = 3, fourth or four = 4, fifth or five = 5, sixth or six = 6, seventh or seven = 7, eighth or eight = 8, ninth or nine = 9, tenth or ten = 10, eleventh or eleven = 11, twelfth or twelve = 12.
The following words are recognized in English only: AM, PM, a.m., p.m., midnight, mn, noon.
The months: january, february, march, april, may, june, july, august, september, october, november, december, and common abbreviations for them.
The days of the week: sunday, monday, tuesday, wednesday, thursday, friday, saturday, and common abbreviations for them.
Time units: year, month, fortnight, week, day, hour, minute, min, second, sec, tomorrow, yesterday.
Timezone names: gmt (+0000), ut (+0000), utc (+0000), wet (+0000), bst (+0100), wat (-0100), at (-0200), nft (-0330), nst (-0330), ndt (-0230), ast (-0400), adt (-0300), est (-0500), edt (-0400), cst (-0600), cdt (-0500), mst (-0700), mdt (-0600), pst (-0800), pdt (-0700), yst (-0900), ydt (-0800), hst (-1000), hdt (-0900), cat (-1000), ahst (-1000), nt (-1100), idlw (-1200), cet (+0100), met (+0100), mewt (+0100), mest (+0200), swt (+0100), sst (+0200), fwt (+0100), fst (+0200), eet (+0200), bt (+0300), it (+0330), zp4 (+0400), zp5 (+0500), ist (+0530), zp6 (+0600), ict (+0700), wast (+0800), wadt (+0900), awst (+0800), awdt (+0900), cct (+0800), sgt (+0800), hkt (+0800), jst (+0900), cast (+0930), cadt (+1030), acst (+0930), acdt (+1030), east (+1000), eadt (+1100), aest (+1000), aedt (+1100), gst (+1000), nzt (+1200), nzst (+1200), nzdt (+1300), idle (+1200).
The timezone names specify an offset from Coordinated Universal Time (UTC) and do not imply validating the time/date to be reasonable in any zone that happens to use the abbreviation specified.
A variety of unambiguous dates are recognized:
- 9/10/69: For years between 70 and 99 we assume 1900+, and for years between 0 and 69 we assume 2000+.
- 2006-11-17: An ISO-8601 date.
- 69-09-10: The year in an ISO-8601 date is always taken literally, so this is the year 69, not 2069.
- 10/1/2000: October 1, 2000; the common, but bizarre, US format.
- 20 Jun 1994, 23jun2001, 1-sep-06: Other common abbreviations.
- 1/11: The year can be omitted. This is the US month/day format.
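The two-digit year rule for non-ISO dates can be sketched as follows. This illustrates the documented behavior only and is not the parser's actual code. The function name expand_year is hypothetical, and treating values of 100 or more as literal years is an assumption drawn from the wording of the rule.

```c
#include <assert.h>

/*
 * Hypothetical helper illustrating the documented rule for non-ISO
 * dates: 0-69 map to 2000-2069 and 70-99 map to 1970-1999. Larger
 * values are assumed to be taken literally.
 */
static int
expand_year(int year)
{
    assert(year >= 0);      /* negative years are outside this sketch */
    if (year < 70)
        return 2000 + year;
    if (year < 100)
        return 1900 + year;
    return year;
}
```

With this rule, the year in 9/10/69 expands to 2069, while 70 expands to 1970.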
Standard e-mail formats (RFC 822, RFC 2822, etc.) and the output of date(1) and asctime(3) are all supported as input.
Times are recognized as well:
- 10:01
- 10:12pm
- 12:11:01.000012
- 12:21-0500
Relative items are also supported:
- -1 month
- last friday
- one week ago
- this thursday
- next sunday
- +2 years
Note that, as a special case for midnight with the name of a day only, “midnight tuesday” implies 00:00 at the beginning of Tuesday, whereas “Sat mn” implies 00:00 at the end of Saturday (i.e. early Sunday morning).
Seconds since the Epoch, UTC (also known as UNIX time), are also supported:
- @735275209 (equivalent to Tue Apr 20 03:06:49 UTC 1993)
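The correspondence between the two forms above can be checked with standard C alone. The sketch below renders a UNIX time as UTC text. It does not call parsedate(), and the function names format_utc and is_utc_rendering are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Render t as UTC in the "Tue Apr 20 03:06:49 UTC 1993" layout. */
static size_t
format_utc(time_t t, char *buf, size_t bufsize)
{
    const struct tm *tm;

    assert(buf != NULL && bufsize > 0);
    tm = gmtime(&t);
    if (tm == NULL)
        return 0;
    /* Day and month names come from the default "C" locale. */
    return strftime(buf, bufsize, "%a %b %d %H:%M:%S UTC %Y", tm);
}

/* Return 1 if text is exactly the UTC rendering of t, 0 otherwise. */
static int
is_utc_rendering(time_t t, const char *text)
{
    char buf[64];

    return format_utc(t, buf, sizeof(buf)) > 0 && strcmp(buf, text) == 0;
}
```

Calling is_utc_rendering((time_t)735275209, "Tue Apr 20 03:06:49 UTC 1993") returns 1, confirming that both inputs name the same instant.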
Text in datestr enclosed in parentheses ‘(’ and ‘)’ is treated as a comment, and ignored. Parentheses nest (the comment ends when there have been the same number of closing parentheses as there were opening parentheses). There is no escape character in comments; ‘)’ always ends (or decreases the nesting level of) the comment.
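The nesting rule can be illustrated with a small depth counter. This is a sketch of the described behavior, not the parser's own code. The name strip_comments is hypothetical, and keeping a stray ‘)’ that appears outside any comment is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy src to dst, dropping text inside nested
 * parentheses. Returns the final nesting depth, so a non-zero result
 * means a comment was left unterminated. Output that does not fit in
 * dst is silently truncated.
 */
static int
strip_comments(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;
    int depth = 0;

    assert(src != NULL && dst != NULL && dstsize > 0);
    for (; *src != '\0'; src++) {
        if (*src == '(') {
            depth++;            /* open a comment, or nest deeper */
        } else if (*src == ')' && depth > 0) {
            depth--;            /* no escapes: ')' always closes */
        } else if (depth == 0 && n + 1 < dstsize) {
            dst[n++] = *src;    /* outside any comment: keep */
        }
    }
    dst[n] = '\0';
    return depth;
}
```

For the input "a(b(c)d)e" the helper writes "ae" and returns 0. For "a(b" it writes "a" and returns 1, signalling an unterminated comment.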
RETURN VALUES
parsedate() returns the number of seconds passed since, or before (if negative), the Epoch, or -1 if the date could not be parsed properly. A non-error result of -1 can be distinguished from an error by setting errno to 0 before calling parsedate(), and checking the value of errno afterwards.
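The errno check described above might be wrapped as follows. So that the sketch compiles without libutil, the parser is passed in as a function pointer with the same parameters as parsedate(). The names parse_fn, parse_checked, and the two stub functions are hypothetical, and the EINVAL value set by the failing stub is an assumption made only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Function type with the same parameters and return type as parsedate(). */
typedef time_t (*parse_fn)(const char *, const time_t *, const int *);

/*
 * Hypothetical wrapper: returns 0 and stores the result in *out on
 * success, or -1 if the parser reported an error.
 */
static int
parse_checked(parse_fn parse, const char *datestr, time_t *out)
{
    time_t t;

    assert(parse != NULL && out != NULL);
    errno = 0;                  /* clear before the call */
    t = parse(datestr, NULL, NULL);
    if (t == (time_t)-1 && errno != 0)
        return -1;              /* genuine parse failure */
    *out = t;                   /* may legitimately be -1 */
    return 0;
}

/* Stand-ins for parsedate(), used only to exercise the wrapper. */
static time_t
stub_valid_minus_one(const char *s, const time_t *now, const int *tzoff)
{
    (void)s; (void)now; (void)tzoff;
    return (time_t)-1;          /* one second before the Epoch */
}

static time_t
stub_failure(const char *s, const time_t *now, const int *tzoff)
{
    (void)s; (void)now; (void)tzoff;
    errno = EINVAL;             /* assumed error code, for illustration */
    return (time_t)-1;
}
```

On a system that provides libutil, parsedate itself could be passed as the first argument after including <util.h>.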
ENVIRONMENT
If the tzoff parameter is given as NULL, then:
TZ
- The timezone to which the input is relative, when no zone information is otherwise specified in the datestr input.
SEE ALSO
HISTORY
The parser used in parsedate() was originally written by Steven M. Bellovin while at the University of North Carolina at Chapel Hill. It was later tweaked by a couple of people on Usenet. It was completely overhauled by Rich $alz and Jim Berets in August 1990.
The parsedate() function first appeared in NetBSD 4.0.
BUGS
1. The parsedate() function is not re-entrant or thread-safe.
2. The parsedate() function assumes years less than 0 mean -year, and in non-ISO formats, that years less than 70 mean 2000 + year, otherwise years less than 100 mean 1900 + year.
3. The parsedate() function accepts “12 am” where “12 midnight” is correct, and similarly “12 pm” for “12 noon”. The correct forms are also accepted.
4. There are various weird cases that are hard to explain, but are nevertheless considered correct.
5. It is very hard to specify years BC, and in any case, conversions of times before the commencement of the modern Gregorian calendar (when that occurred depends upon location, but late 16th century is a rough guide) are suspicious at best, and depending upon context, often just plain wrong.
6. Despite what is stated above, “next” is actually 2. The input “next January”, instead of producing a timestamp for January of the following year, produces one for January 2nd of the current year. Use caution with “next”; it rarely does what humans expect.