v0.29.0 2020-07-11T18:33:51+0800
    - Add a site-specific extractor for news.cctv.com

v0.28.0 2020-07-09T23:28:40+0800
    - Add a site-specific extractor for www.xinhuanet.com
    - Add a site-specific extractor for hk.on.cc

v0.27.0 2020-07-06T22:17:57+0800
    - Add a site-specific extractor for new.ctv.com.tw
    - Add a site-specific extractor for hk.crntt.com

v0.26.0 2020-07-03T18:13:05+0800
    - Adjust the dateline output

v0.25.0 2020-07-02T21:40:30+0800
    - Add a site-specific extractor for www.twreporter.org
    - Convert extracted datelines to ISO 8601 format (pts)

v0.24.0 2020-06-29T08:46:10+0800
    - Convert extracted datelines to ISO 8601 format (peopo, fountmedia)
    - Improve the extraction of journalist names on SETN
    - Handle an error when parsing the dateline on www.idn.com.tw

v0.23.0 2020-05-15T23:10:02+0800
    - Improve the extraction of journalist names on CNA, ETToday, turnnewsapp.com

v0.22.0 2020-05-11T23:40:22+0800
    - Improve the extraction of journalist names

v0.21.0 2020-05-10T07:53:41+0800
    - Improve the extraction of journalist names on news.tnn.tw
    - Improve the extraction of journalist names on NTDTV

v0.20.0 2020-05-05T21:22:48+0800
    - CTS: Rewritten for quicker extraction.

v0.19.0 2020-05-04T14:01:50+0800
    - UDN: Update the CSS ruleset for udn.com
    - Add a site-specific extractor for www.idn.com.tw
    - Niusnews: Extract journalist name
    - SETN: Non-human journalist names are now extracted too.
    - Properly handle non-article pages.

v0.18.0 2020-05-03T19:53:12+0800
    - Add a site-specific extractor for www.ttv.com.tw
    - Add a site-specific extractor for www.hkcnews.com
    - Add a site-specific extractor for www.thestandnews.com
    - Add a site-specific extractor for www.epochtimes.com

v0.17.0 2020-05-03T09:15:32+0800
    - Improve the extraction of journalist names on cnews, EBC, CTEE and SETN
    - Add a site-specific extractor for newnet.tw

v0.16.0 2020-04-26T21:36:41+0800
    - Improve the extraction of journalist names on CTS, CTEE and rti.fr

v0.15.0 2020-04-23T09:17:50+0800
    - Reduce the number of warnings.

v0.14.0 2020-04-08T00:00:54+0800
    - Improve the extraction of journalist names on ETToday and CNA

v0.13.0 2020-03-22T16:20:13+0800
    - Improve the extraction of journalist names on www.setn.com

v0.12.0 2020-03-09T09:57:13+0900
    - Improve the extraction of www.upmedia.mg
    - Improve the extraction of journalist names on www.setn.com

v0.11.0 2020-02-15T09:54:00+0900
    - Improve the accuracy of extracting www.taiwannews.com.tw
    - Improve the extraction of news.cts.com.tw
    - Add a site-specific extractor for estate.ltn.com.tw

v0.10.0 2020-02-04T09:01:00+0900
    - Improve the extraction of turnnewsapp.com
    - Improve coverage for various cases.

v0.9.0 2020-02-04T01:21:00+0900
    - Improve the extraction of news.tnn.tw
    - Improve the extraction of www.setn.com

v0.8.0 2020-02-03T09:52:00+0900
    - Improve the extraction of www.rti.org.tw
    - Improve the extraction of www.bcc.com.tw

v0.7.0 2020-02-02T10:28:00+0900
    - Improve the extraction of www.taipeitimes.com
    - Improve the extraction of udn.com
    - Improve the extraction of money.udn.com
    - Improve the extraction of stars.udn.com
    - Improve the extraction of house.udn.com

v0.6.0 2020-02-01T19:51:00+0900
    - Reject the extracted journalist name if it happens to be one of the known newspaper names.
    - Improve the extraction of https://www.storm.mg

v0.5.0 2020-01-29T10:30:00+0900
    - Improve the extraction of a few news sites.

0.4.0 2020-01-26T01:51:45+0900
    - Improve the extraction of a few news sites.

0.3.0 2020-01-26T01:51:45+0900
    - Improve the extraction of a few news sites.

0.2.0 2020-01-25T23:23:39+0900
    - Improve the extraction of a specific news site.

0.1.1 2020-01-24T11:13:11+0900
    - Fix a fat-finger mistake.

0.1.0 2020-01-24T09:16:11+0900
    - Improve the extraction of a few news sites.

0.0.9
    - Improve the recall of a specific news site.

0.0.8 2020-01-20T22:08:39+0900
    - Improve the extraction of journalist names on a few more news sites.

0.0.7
    - Remove a bunch of waste.

0.0.6
    - Handle "utf-8" charset correctly.

0.0.5
    - GenericExtractor can now extract directly from an HTML file.

0.0.4
    - Introduce JSONLD-based extractor.

0.0.3
    - Introduce CSS-based site-specific extractors.

0.0.2
    - Improvements on error-handling.

0.0.1
    - Initial release