Added support for fortnight and century #987
Added support for fortnight and century #987Mr-Sunglasses wants to merge 31 commits intoscrapinghub:masterfrom
Conversation
|
You seem to have accidentally deleted a file. |
|
ohk I'm fixing them |
|
@Gallaecio fixes the files , Now it seems like everything is Ok |
Codecov Report
@@ Coverage Diff @@
## master #987 +/- ##
==========================================
- Coverage 98.29% 98.29% -0.01%
==========================================
Files 234 234
Lines 2702 2700 -2
==========================================
- Hits 2656 2654 -2
Misses 46 46
Continue to review full report at Codecov.
|
Gallaecio
left a comment
There was a problem hiding this comment.
I see tests only use “century”, in singular. I think it would be good to also cover its plural, both the grammatically correct “centuries” and the supported typo “centurys”.
|
@Gallaecio ok I'll add them on tests , Thanks for your support and guidance |
| in \1 century: | ||
| - in (\d+) century? | ||
| \1 century ago: | ||
| - (\d+) century? ago | ||
| in \1 fortnight: | ||
| - in (\d+) fortnight? | ||
| \1 fortnight ago: | ||
| - (\d+) fortnight? ago |
There was a problem hiding this comment.
This is not correct. The ? indicates an optional letter.
Doing this it will accepts "in 1 century" but also "in 1 centur". You need to add an s before the ?.
Also, you should need to add support for centuries here.
You can check if both thigns work with this
dateparser.parse("in 3 centurys")and:
dateparser.parse("in 3 centuries")Now both are failing.
There was a problem hiding this comment.
@noviluni The tests are failing for centuries , even I added them on en.yaml.
| ], | ||
| "in \\1 century": [ | ||
| "in (\\d+) century?" | ||
| "in (\\d+) centurys?" |
There was a problem hiding this comment.
I think it needs to be something like centur(?:ys?|ies)
There was a problem hiding this comment.
@Gallaecio Just tried this , Tests are failing when I used it.
|
@Gallaecio - I have fixed all the changes that you were request 😊 |
tests/test_freshness_date_parser.py
Outdated
| param('last fortnight', ago={'days': 14}, period='day'), | ||
| param('14 fortnight', ago={'days': 196}, period='day'), | ||
| param('a fortnight ago', ago={'days': 14}, period='day'), | ||
| param('last fortnight', ago={'days': 14}, period='day'), |
There was a problem hiding this comment.
I believe this is repeated and the same as in line 61.
There was a problem hiding this comment.
@gutsytechster Done this change, Please Look into it 😊.
tests/test_freshness_date_parser.py
Outdated
| param('last fortnight', ago={'days': 14}, period='day'), | ||
| param('14 fortnight', ago={'days': 196}, period='day'), | ||
| param('a fortnight ago', ago={'days': 14}, period='day'), | ||
| param('last fortnight', ago={'days': 14}, period='day'), |
There was a problem hiding this comment.
Same as above. It is repeated as well.
There was a problem hiding this comment.
@gutsytechster Done this change, Please Look into it 😊.
| param('1 वर्ष, 8 महीने, 2 सप्ताह', ago={'years': 1, 'months': 8, 'weeks': 2}, period='week'), | ||
| param('1 वर्ष 7 महीने', ago={'years': 1, 'months': 7}, period='month'), | ||
| param('आज', ago={'days': 0}, period='day'), | ||
| param('1 दशक पहले', ago={'years': 10}, period='year'), |
There was a problem hiding this comment.
Can we add a few more test cases for the Hindi version here? For e.g.
- 1 दशक पूर्व
- दो दशक पहले
- 10 दशकों पहले
There was a problem hiding this comment.
@gutsytechster Done this change, Please Look into it 😊.
| param('17 सेकंड बाद', in_future={'seconds': 17}, period='day'), | ||
| param('1 वर्ष, 5 महीने, 1 सप्ताह में', | ||
| in_future={'years': 1, 'months': 5, 'weeks': 1}, period='week'), | ||
| param('1 दशक में', in_future={'years': 10}, period='year'), |
There was a problem hiding this comment.
Could we also add a few more test cases for the Hindi version here? For e.g.
- पांच दशक बाद
- दश दशक पश्चात
- 9 दशकों मे
There was a problem hiding this comment.
@gutsytechster Done this change, Please Look into it 😊.
| @parameterized.expand([ | ||
| # English dates | ||
| param('in a fortnight', in_future={'days': 14}, period='day'), | ||
| param('next fortnight', in_future={'days': 14}, period='day'), |
There was a problem hiding this comment.
I am not sure, but do we support coming fortnight?
There was a problem hiding this comment.
@gutsytechster Done this change, Please Look into it 😊.
| param('1 दशक', ago={'years': 10}, period='year'), | ||
| param('1 दशक पहले', ago={'years': 10}, period='year'), |
There was a problem hiding this comment.
I have added a few suggestions above to improve the Hindi test cases. Please have a look.
There was a problem hiding this comment.
@gutsytechster Done this change, Please Look into it 😊.
There was a problem hiding this comment.
I believe these test cases can be added here as well
- 1 दशक पूर्व
- दो दशक पहले
- 10 दशकों पहले
| december: | ||
| - दिसम्बर | ||
| decade: | ||
| - दशक |
There was a problem hiding this comment.
We can add the plural for this as well, e.g. दशकों
There was a problem hiding this comment.
@gutsytechster Thanks For Suggestions I'll be Implementing Them ASAP ....
There was a problem hiding this comment.
@gutsytechster Done this change, Please Look into it 😊.
|
@Gallaecio , @gutsytechster Please Look into it I have done all Changes. |
|
@Gallaecio Fixed the changes you requested and also add some improvements 👍🏻 |
|
|
||
| # Other | ||
| raw_data | ||
| raw_data No newline at end of file |
There was a problem hiding this comment.
💄 You removed the last empty line 🙂
| in \1 century: | ||
| - in (\d+) centurys? | ||
| - in (\d+) centur(?:ys?|ies) | ||
| - in (\d+) centuries? |
There was a problem hiding this comment.
You can remove the ?. And add "(\\d+) centuries ago" below.
gutsytechster
left a comment
There was a problem hiding this comment.
To confirm that you have added the following support
- fortnight support in English
- century support in English
- decade support in Hindi
| "(\\d+) दशक मे", | ||
| "(\\d+) दशकों मे", |
There was a problem hiding this comment.
Aren't these same as the first two expressions?
| "बाद" | ||
| "बाद", | ||
| "पश्चात", | ||
| "मे" |
There was a problem hiding this comment.
This is already present as the first item on this list.
| - में | ||
| - बाद | ||
| - पश्चात | ||
| - मे |
There was a problem hiding this comment.
This is repeated as well.
| - (\d+) दशक मे | ||
| - (\d+) दशकों मे |
There was a problem hiding this comment.
Repeated as the above two expressions.
| - (\d+) दशक पहले | ||
| - (\d+) दशकों पहले | ||
|
|
||
|
|
There was a problem hiding this comment.
just a nitpick: there is an extra line here.
| - दस: '10' | ||
| - दश: '10' | ||
| - ग्यारह: '11' | ||
| - बारह: '12' No newline at end of file |
There was a problem hiding this comment.
Need to add a new line here.
| param('1 दशक', ago={'years': 10}, period='year'), | ||
| param('1 दशक पहले', ago={'years': 10}, period='year'), |
There was a problem hiding this comment.
I believe these test cases can be added here as well
- 1 दशक पूर्व
- दो दशक पहले
- 10 दशकों पहले
Add support for other words ("decade", "fortnight", "century"...) #725