From: Lucian Mogosanu
Date: Tue, 26 Feb 2019 21:13:17 +0000 (+0200)
Subject: drafts: 087
X-Git-Tag: v0.11~90
X-Git-Url: https://git.mogosanu.ro/?a=commitdiff_plain;h=f5f7c14bf2d6bb6bd22b0d5f9b8c6ec0777bcb86;p=thetarpit.git

drafts: 087
---

diff --git a/drafts/087-feedparse.markdown b/drafts/087-feedparse.markdown
new file mode 100644
index 0000000..f6d00c7
--- /dev/null
+++ b/drafts/087-feedparse.markdown
@@ -0,0 +1,65 @@
+---
+postid: 087
+title: A feed parser for Common Lisp programs
+date: February 18, 2019
+author: Lucian Mogoșanu
+tags: tech, tmsr
+---
+
+This is the second part of [a series][tmsr-schedule-i] on building
+blocks for [Feedbot][feedbot].
+
+Now that we have an [XML parser][s-xml] on hand, we can use that to
+obtain [RSS][rss] and [Atom][atom] feeds in a structured format. Once
+again, as far as heathendom goes we are lucky -- one Kyle Isom has
+provided us with a feed parser, cl-feedparse, that is under three
+hundred lines of neat Lisp code, and quite importantly, it meets the
+spec, as can be seen below.
+
+The code is available as a patch on the [S-XML V tree][s-xml-v].
+
+The remainder of this post a. describes the structure and functionality
+of feedparse; and b. provides some usage examples.
+
+(cl-)feedparse provides the following simple structure for (RSS/Atom)
+feeds. A feed can be decomposed into its title, kind (RSS or Atom), URL
+and list of feed items. A feed item has the following elements: an ID, a
+title, a (publication) date, a link and a body/description.
+
+In order to obtain a feed, feedparse:
+
+* performs an HTTP request, obtaining the raw feed data;
+* parses the feed, using S-XML's `parse-xml-string`;
+* identifies the kind of feed (`parser-dispatch`); and
+* calls the appropriate parser, `parse-rss` or `parse-atom`.
+
+Both parsers are very easy to understand; the reader is encouraged to
+peruse `feedparse.lisp`.
+
+After we've pressed to `s-xml-feedparse`:
+
+~~~~
+TODO
+~~~~
+
+and imported all the dependencies[^1], we can now run our parser:
+
+~~~~
+TODO
+~~~~
+
+[^1]: Unfortunately, feedparse depends on Lisp code which is not
+    yet V-ified, namely [Drakma][drakma] and flexi-streams, which in
+    turn depend on other packages. The full set of dependencies is:
+    usocket, chipz, flexi-streams, trivial-gray-streams, chunga,
+    cl-base64, cl-puri and drakma.
+
+    TODO: describe these.
+
+[tmsr-schedule-i]: /posts/y05/082-tmsr-schedule-i.html#selection-63.114-63.159
+[feedbot]: http://btcbase.org/log-search?q=feedbot
+[s-xml]: TODO
+[rss]: https://archive.is/3rHl
+[atom]: https://archive.is/rdl5b
+[s-xml-v]: http://lucian.mogosanu.ro/src/s-xml
+[drakma]: TODO
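
The feed structure and the kind-identification step described in the draft could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `feed`, `feed-item` and `guess-feed-kind` are hypothetical and are not taken from `feedparse.lisp`, the sketch assumes that an RSS 2.0 document has an `rss` root element and an Atom document a `feed` root element, and it performs no HTTP request or XML parsing.

```lisp
;;; Hypothetical sketch: slot and function names are illustrative only,
;;; not copied from feedparse.lisp.

;; A feed item: ID, title, publication date, link and body/description.
(defstruct feed-item id title date link body)

;; A feed: title, kind (:rss or :atom), URL and a list of FEED-ITEMs.
(defstruct feed title kind url items)

(defun guess-feed-kind (root-tag)
  "Return :RSS or :ATOM based on the root element name, or NIL if unknown.
ROOT-TAG may be a string or a symbol; the comparison ignores case."
  (cond ((string-equal root-tag "rss") :rss)
        ((string-equal root-tag "feed") :atom)
        (t nil)))

;; Build a feed by hand, as a parser might after dispatching on the kind.
(defparameter *example-feed*
  (make-feed :title "Example"
             :kind (guess-feed-kind "rss")
             :url "http://example.com/rss.xml"
             :items (list (make-feed-item :id "1"
                                          :title "First post"
                                          :link "http://example.com/1"))))
```

In feedparse itself the corresponding step is `parser-dispatch`, which selects `parse-rss` or `parse-atom`; how it inspects the parsed document is not reproduced here.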