v0.45.0 2022-03-16T09:45:28+0900
  - Remove in-article ads on ebc

v0.44.0 2021-11-01T09:04:01+0900
  - Minor tweaks of the extractor of www.rti.org.tw

v0.43.0 2021-07-07T21:18:11+0900
  - Update the extractor of www.eventsinfocus.org

v0.42.0 2021-06-24T09:23:46+0900
  - Add a site-specific extractor for yimedia.com.tw
  - Update the extractor of news.pts.org.tw to adapt to the updates of the website

v0.41.0 2021-03-31T21:26:10+0900
  - Update the extractor of ttv to catch up with the website updates.
  - Update the extractor of www.mdnkids.com to catch up with the website updates.

v0.40.0 2021-02-09T09:55:29+0900
  - Improve UDN extractor and let paragraphs be split correctly

v0.39.0 2020-08-23T22:19:18+0900
  - Improve the extraction of dateline of www.idn.com.tw
  - Add a site-specific extractor for www.bbc.com
  - Fix the format of extracted dateline of www.rti.org.tw

v0.38.0 2020-08-14T21:01:48+0800
  - Improve the accuracy of extraction of journalist and dateline on news.pts.gov.tw
  - Improve the recall of the extractor of news.tnn.tw
  - Add a site-specific extractor for www.aljazeera.com

v0.37.0 2020-08-08T13:11:45+0800
  - Add a site-specific extractor for www.penghutimes.com
  - The dateline is now formatted differently: the time component no longer defaults to 23:59:59

v0.36.0 2020-08-07T08:56:18+0800
  - Add a site-specific extractor for www.eventsinfocus.org
  - Add a site-specific extractor for m.news.cctv.com
  - Improve the extractor of newnet.tw

v0.35.0 2020-07-31T08:36:15+0800
  - Reformat the dateline extracted from www.thinkingtaiwan.com
  - Handle a few special cases on chinatimes and ebc.
  - Improve the extraction of www.5ch.com.tw
  - Improve the recall of dateline and journalist on www.mdnkids.com

v0.34.0 2020-07-26T22:57:38+0800
  - Add a site-specific extractor for www.nownews.com
  - Add a site-specific extractor for www.mdnkids.com
  - Add a site-specific extractor for www.ustv.com.tw

v0.33.0 2020-07-22T06:44:18+0800
  - Improve the extraction of dateline and journalist for opinion.udn.com
  - Start parsing dateline strings and reformatting them as ISO8601

v0.32.0 2020-07-18T07:23:04+0800
  - Add a site-specific extractor for www.digitimes.com.tw
  - Add a site-specific extractor for www.hkcna.hk
  - Add a site-specific extractor for www.cw.com.tw
  - Handle the English version of www.hkcnews.com

v0.31.0 2020-07-14T08:57:12+0800
  - Add a site-specific extractor for newtalk.tw
  - Add a site-specific extractor for talk.ltn.com.tw

v0.30.0 2020-07-12T17:10:32+0800
  - Add a site-specific extractor for focustaiwan.tw
  - Improve extraction of dateline and journalist for a few existing news sites.

v0.29.0 2020-07-11T18:33:51+0800
  - Add a site-specific extractor for news.cctv.com

v0.28.0 2020-07-09T23:28:40+0800
  - Add a site-specific extractor for www.xinhuanet.com
  - Add a site-specific extractor for hk.on.cc

v0.27.0 2020-07-06T22:17:57+0800
  - Add a site-specific extractor for new.ctv.com.tw
  - Add a site-specific extractor for hk.crntt.com

v0.26.0 2020-07-03T18:13:05+0800
  - Adjust the dateline output

v0.25.0 2020-07-02T21:40:30+0800
  - Add a site-specific extractor for www.twreporter.org
  - Convert extracted dateline to iso8601 format (pts)

v0.24.0 2020-06-29T08:46:10+0800
  - Convert extracted dateline to iso8601 format (peopo, fountmedia)
  - Improve the extraction of journalist names on SETN
  - Handle an error when parsing dateline on www.idn.com.tw

v0.23.0 2020-05-15T23:10:02+0800
  - Improve the extraction of journalist names on CNA, ETToday, turnnewsapp.com

v0.22.0 2020-05-11T23:40:22+0800
  - Improve the extraction of journalist names

v0.21.0 2020-05-10T07:53:41+0800
  - Improve the extraction of journalist names on news.tnn.tw
  - Improve the extraction of journalist names on NTDTV

v0.20.0 2020-05-05T21:22:48+0800
  - CTS: rewritten for quicker extraction.

v0.19.0 2020-05-04T14:01:50+0800
  - UDN: Update the CSS ruleset for udn.com
  - Add a site-specific extractor for www.idn.com.tw
  - Niusnews: Extract journalist name
  - SETN: Non-human journalist names are now extracted too.
  - Properly handle non-article pages.

v0.18.0 2020-05-03T19:53:12+0800
  - Add a site-specific extractor for www.ttv.com.tw
  - Add a site-specific extractor for www.hkcnews.com
  - Add a site-specific extractor for www.thestandnews.com
  - Add a site-specific extractor for www.epochtimes.com

v0.17.0 2020-05-03T09:15:32+0800
  - Improve the extraction of journalist names on cnews, EBC, CTEE and SETN
  - Add a site-specific extractor for newnet.tw

v0.16.0 2020-04-26T21:36:41+0800
  - Improve the extraction of journalist names on CTS, CTEE and rti.fr

v0.15.0 2020-04-23T09:17:50+0800
  - Reduce the amount of warnings.

v0.14.0 2020-04-08T00:00:54+0800
  - Improve the extraction of journalist names on ETToday and CNA

v0.13.0 2020-03-22T16:20:13+0800
  - Improve the extraction of journalist names on www.setn.com

v0.12.0 2020-03-09T09:57:13+0900
  - Improve the extraction of www.upmedia.mg
  - Improve the extraction of journalist names on www.setn.com

v0.11.0 2020-02-15T09:54:00+0900
  - Improve the accuracy of extracting www.taiwannews.com.tw
  - Improve the extraction of news.cts.com.tw
  - Add a site-specific extractor for estate.ltn.com.tw

v0.10.0 2020-02-04T09:01:00+0900
  - Improve the extraction of turnnewsapp.com
  - Improve coverage for various cases.

v0.9.0 2020-02-04T01:21:00+0900
  - Improve the extraction of news.tnn.tw
  - Improve the extraction of www.setn.com

v0.8.0 2020-02-03T09:52:00+0900
  - Improve the extraction of www.rti.org.tw
  - Improve the extraction of www.bcc.com.tw

v0.7.0 2020-02-02T10:28:00+0900
  - Improve the extraction of www.taipeitimes.com
  - Improve the extraction of udn.com
  - Improve the extraction of money.udn.com
  - Improve the extraction of stars.udn.com
  - Improve the extraction of house.udn.com

v0.6.0 2020-02-01T19:51:00+0900
  - Reject the extracted journalist name if it happens to be one of the known newspaper names.
  - Improve the extraction of https://www.storm.mg

v0.5.0 2020-01-29T10:30:00+0900
  - Improve the extraction of a few news sites.

0.4.0 2020-01-26T01:51:45+0900
  - Improve the extraction of a few news sites.

0.3.0 2020-01-26T01:51:45+0900
  - Improve the extraction of a few news sites.

0.2.0 2020-01-25T23:23:39+0900
  - Improve the extraction of a specific news site.

0.1.1 2020-01-24T11:13:11+0900
  - Fix a fat-finger mistake.

0.1.0 2020-01-24T09:16:11+0900
  - Improve the extraction of a few news sites.

0.0.9
  - Improve the recall of a specific news site.

0.0.8 2020-01-20T22:08:39+0900
  - Improve the extraction of journalist names on a few more news sites.

0.0.7
  - Remove a bunch of waste.

0.0.6
  - Handle "utf-8" charset correctly.

0.0.5
  - GenericExtractor can now extract directly from HTML file.

0.0.4
  - Introduce JSONLD-based extractor.

0.0.3
  - Introduce CSS-based site-specific extractors.

0.0.2
  - Improvements on error-handling.

0.0.1
  - Initial release