<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1"/>
<title>Natty Date Parser - documentation</title>
<link rel="stylesheet" type="text/css" href="index.css"/>
<style>
#syntax_list dt {
margin:20px 0 10px 20px;
color:#aaa;
}
#syntax_list dd {
margin-left:20px;
}
#syntax_list dd ul {
border:1px solid #aaa;
padding:10px;
list-style-type:none;
}
img#ast_diagram {
margin:40px 0 20px 40px;
}
</style>
<script async src="https://www.googletagmanager.com/gtag/js?id=G-QEMVC2XSD8"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-QEMVC2XSD8');
</script>
</head>
<body>
<div id="nav">
<ul>
<li class="first"><a href=".">about</a></li>
<li class=""><a href="try.html">try it out</a></li>
<li class="current"><a href="doc.html">documentation</a></li>
</ul>
</div>
<div id="left_labels">
<a href="https://central.sonatype.com/artifact/io.github.natty-parser/natty" rel="nofollow">
<img src="https://img.shields.io/maven-central/v/io.github.natty-parser/natty.svg?label=Maven%20Central"
alt="maven central" style="max-width: 100%;" />
</a>
<a href="https://javadoc.io/doc/io.github.natty-parser/natty" rel="nofollow">
<img src="https://javadoc.io/badge2/io.github.natty-parser/natty/javadoc.svg" alt="javadoc" style="max-width: 100%;" />
</a>
</div>
<div id="content">
<h1>some words of wisdom</h1>
<h2>how it works</h2>
<p>Natty makes heavy use of <a href="https://www.antlr3.org/">ANTLR</a> to tokenize and parse its input into a generic abstract syntax tree, and then walks that tree to determine the date(s) represented. Natty attempts to recognize a wide range of date formats, but because the grammar describing these formats is inherently ambiguous, some decisions have to be made arbitrarily. For example, the string '10/10 500' could be interpreted as "October 10th at 5 o'clock" or as "October 10th in the year 500". Given this choice, Natty chooses the former, on the reasoning that people don't typically reference dates that are centuries away from the present.
</p>
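<p>The preference described above can be sketched in plain <code>java.time</code>. This is only an illustration of the reasoning, not Natty's implementation (which is driven by the ANTLR grammar): the class <code>AmbiguitySketch</code>, the method <code>resolve</code>, and the "closest to the present wins" rule are all assumptions made for the example.</p>

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a hypothetical helper, not part of Natty.
final class AmbiguitySketch {

    // Resolves "<month>/<day> <trailing>", where the trailing number could be
    // a time of day (500 -> 05:00) or a year (500 -> the year 500).
    static LocalDateTime resolve(int month, int day, int trailing, LocalDateTime now) {
        List<LocalDateTime> candidates = new ArrayList<>();

        // Reading 1: trailing digits as HHmm in the current year.
        int hour = trailing / 100;
        int minute = trailing % 100;
        if (trailing >= 0 && hour <= 23 && minute <= 59) {
            candidates.add(LocalDateTime.of(now.getYear(), month, day, hour, minute));
        }

        // Reading 2: trailing digits as a year, at midnight.
        if (trailing >= 1) {
            candidates.add(LocalDateTime.of(trailing, month, day, 0, 0));
        }

        // Assumed rule: prefer whichever reading lies closest to the present.
        return candidates.stream()
                .min(Comparator.comparingLong(
                        (LocalDateTime c) -> Math.abs(ChronoUnit.DAYS.between(now, c))))
                .orElseThrow(() -> new IllegalArgumentException("no valid reading"));
    }
}
```

<p>With a reference time in 2024, <code>AmbiguitySketch.resolve(10, 10, 500, now)</code> picks October 10th of 2024 at 05:00 rather than October 10th of the year 500, because the year-500 reading is many centuries further from the present.</p>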
<p>
The abstract syntax tree (AST) approach allows us to minimize the actual code we need to write (as opposed to the code generated for us by ANTLR) to just a few date manipulation methods (see <a href="https://github.com/natty-parser/natty/blob/main/src/main/java/org/natty/WalkerState.java">WalkerState.java</a>). Another advantage of this approach is the theoretical ease with which Natty could be ported to another target language. For the more curious, here's a rough sketch of the AST structure used:
</p>
<img id="ast_diagram" alt="ast diagram" src="images/ast.png"/>
<h2>supported formats</h2>
The following is an attempt at cataloging the date formats that Natty recognizes. For the formal grammar definition, see <a href="https://github.com/natty-parser/natty/blob/main/src/main/antlr3/org/natty/generated/DateParser.g">DateParser.g</a>.
<dl id="syntax_list">
<dt>formal dates</dt>
<dd>
Formal dates are those in which the month, day, and year are represented as integers separated by a common separator character. The year is optional and may precede the month or follow the day of month. If a two-digit year is given, it must follow the day of month.
<ul>
<li>1978-01-28</li>
<li>1984/04/02</li>
<li>1/02/1980</li>
<li>2/28/79</li>
</ul>
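<p>For comparison, the four shapes listed above can be matched with <code>java.time</code> formatters tried in order. This is a sketch of the formats only, not of how Natty parses them (Natty uses its ANTLR grammar). The class name <code>FormalDateSketch</code> and the two-digit-year pivot of 1950 are assumptions made for the example, and Natty's own pivot may differ. Dates without a year are not covered.</p>

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative only: a hypothetical helper, not part of Natty.
final class FormalDateSketch {

    // Assumption: two-digit years map into 1950..2049.
    private static final int BASE_YEAR = 1950;

    // One formatter per shape: year first, then year last, then two-digit year last.
    private static final List<DateTimeFormatter> SHAPES = Arrays.asList(
            DateTimeFormatter.ofPattern("uuuu-M-d"),
            DateTimeFormatter.ofPattern("uuuu/M/d"),
            DateTimeFormatter.ofPattern("M/d/uuuu"),
            new DateTimeFormatterBuilder()
                    .appendPattern("M/d/")
                    .appendValueReduced(ChronoField.YEAR, 2, 2, BASE_YEAR)
                    .toFormatter());

    // Returns the first successful parse, or empty if no shape matches.
    static Optional<LocalDate> parse(String text) {
        for (DateTimeFormatter shape : SHAPES) {
            try {
                return Optional.of(LocalDate.parse(text, shape));
            } catch (DateTimeParseException ignored) {
                // Not this shape; try the next one.
            }
        }
        return Optional.empty();
    }
}
```

<p>Under the assumed pivot, <code>FormalDateSketch.parse("2/28/79")</code> yields 1979-02-28, and <code>parse("1/02/1980")</code> yields 1980-01-02.</p>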
</dd>
<dt>relaxed dates</dt>
<dd>
Relaxed dates are those in which the month, day of week, day of month, and year may be given in a loose, non-standard manner, with most parts being optional.
<ul>
<li>The 31st of April in the year 2008</li>
<li>Fri, 21 Nov 1997</li>
<li>Jan 21, '97</li>
<li>Sun, Nov 21</li>
<li>jan 1st</li>
<li>february twenty-eighth</li>
</ul>
</dd>
<dt>relative dates</dt>
<dd>
Relative dates are those that are relative to the current date.
<ul>
<li>next thursday</li>
<li>last wednesday</li>
<li>today</li>
<li>tomorrow</li>
<li>yesterday</li>
<li>next week</li>
<li>next month</li>
<li>next year</li>
<li>3 days from now</li>
<li>three weeks ago</li>
</ul>
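<p>As a rough guide to the arithmetic behind a few of these phrases, here is a sketch using <code>java.time</code>. It is not Natty's code: the class <code>RelativeDateSketch</code> is hypothetical, and it assumes that "next thursday" means the first Thursday strictly after today and "last wednesday" the most recent Wednesday strictly before today. Natty's interpretation may differ.</p>

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Illustrative only: a hypothetical helper, not part of Natty.
final class RelativeDateSketch {

    // "next thursday": first Thursday strictly after the reference date (assumed meaning).
    static LocalDate nextThursday(LocalDate today) {
        return today.with(TemporalAdjusters.next(DayOfWeek.THURSDAY));
    }

    // "last wednesday": most recent Wednesday strictly before the reference date (assumed meaning).
    static LocalDate lastWednesday(LocalDate today) {
        return today.with(TemporalAdjusters.previous(DayOfWeek.WEDNESDAY));
    }

    // "3 days from now"
    static LocalDate threeDaysFromNow(LocalDate today) {
        return today.plusDays(3);
    }

    // "three weeks ago"
    static LocalDate threeWeeksAgo(LocalDate today) {
        return today.minusWeeks(3);
    }
}
```

<p>Taking Monday 2024-01-15 as the reference date, these return 2024-01-18, 2024-01-10, 2024-01-18, and 2023-12-25 respectively.</p>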
</dd>
<dt>date alternatives</dt>
<dd>
Natty is able to recognize a list of date alternatives. This is why the <code>ParseResult</code> always contains a <code>List</code> of <code>Date</code> objects.
<ul>
<li>next wed or thurs</li>
<li>oct 3rd or 4th</li>
</ul>
</dd>
<dt>prefixes</dt>
<dd>
Most of the above date formats may be prefixed with a modifier.
<ul>
<li>day after</li>
<li>the day before</li>
<li>the monday after</li>
<li>the monday before</li>
<li>2 fridays before</li>
<li>4 tuesdays after</li>
</ul>
</dd>
<dt>time</dt>
<dd>
The above date formats may be prefixed or suffixed with time information.
<ul>
<li>0600h</li>
<li>06:00 hours</li>
<li>6pm</li>
<li>5:30 a.m.</li>
<li>5</li>
<li>12:59</li>
<li>23:59</li>
<li>8p</li>
<li>noon</li>
<li>afternoon</li>
<li>midnight</li>
</ul>
</dd>
<dt>relative times</dt>
<dd>
Relative times are those that are relative to the current time.
<ul>
<li>10 seconds ago</li>
<li>in 5 minutes</li>
<li>4 minutes from now</li>
</ul>
</dd>
<dt>time zones</dt>
<dd>
Any time may be suffixed with time zone information. Any arbitrary GMT
offset may be given in the form +05:00, -0600, etc. Common American
and Pacific time zone abbreviations may also be used. If you'd like
to help add common abbreviations for your locale, feel free to contact
us (via GitHub).
<ul>
<li>+0500</li>
<li>-08:00</li>
<li>UTC</li>
<li>EST</li>
<li>EDT</li>
<li>ET</li>
<li>CST</li>
<li>PST</li>
<li>PDT</li>
<li>PT</li>
<li>MST</li>
<li>AKST</li>
<li>HAST</li>
</ul>
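<p>To show what a GMT offset suffix means for the resulting instant, here is a small sketch using <code>java.time.ZoneOffset</code>, which accepts both the <code>+0500</code> and <code>-08:00</code> spellings. The class <code>OffsetSketch</code> is hypothetical and is not how Natty handles time zones internally. Abbreviations such as EST or PT are not covered here.</p>

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Illustrative only: a hypothetical helper, not part of Natty.
final class OffsetSketch {

    // Interprets a wall-clock time under a GMT offset such as "+0500" or "-08:00"
    // and returns the corresponding instant on the UTC timeline.
    static Instant toInstant(LocalDateTime local, String offsetText) {
        ZoneOffset offset = ZoneOffset.of(offsetText);
        return local.toInstant(offset);
    }
}
```

<p>For a wall-clock time of 06:00 on 2024-01-01, the suffix <code>+0500</code> gives 01:00 UTC and the suffix <code>-08:00</code> gives 14:00 UTC on the same day.</p>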
</dd>
</dl>
</div>
<div id="margin"></div>
</body>
</html>