From 8ff8ca18eaebbab97b8322dec93053cfa38fb945 Mon Sep 17 00:00:00 2001
From: Lucian Mogosanu
Date: Mon, 18 Feb 2019 21:18:05 +0200
Subject: [PATCH] Move 085 -> drafts

---
 drafts/085-s-xml.markdown    |   86 ++++++++++++++++++++++++++++++++++++++++++
 posts/y05/085-s-xml.markdown |   86 ------------------------------------------
 2 files changed, 86 insertions(+), 86 deletions(-)
 create mode 100644 drafts/085-s-xml.markdown
 delete mode 100644 posts/y05/085-s-xml.markdown

diff --git a/drafts/085-s-xml.markdown b/drafts/085-s-xml.markdown
new file mode 100644
index 0000000..90b85e0
--- /dev/null
+++ b/drafts/085-s-xml.markdown
@@ -0,0 +1,86 @@
+---
+postid: 085
+title: An XML parser for Common Lisp programs
+date: February 18, 2019
+author: Lucian Mogoșanu
+tags: tech, tmsr
+---
+
+This post is part of a [series][tmsr-schedule-i] of published
+[artifacts][intellectual-ownership] that (will) represent components for
+The Republic's RSS bot, [Feedbot][feedbot]. The idea behind this series
+is to grow Feedbot piece by piece[^1], starting from the smallest
+elements that [fit in head][fits-in-head], then using them as building
+blocks for the actual product, which will flow downstream from the
+[botworks][botworks] V tree.
+
+The first item to be published is S-XML, an XML parser written in Common
+Lisp. Both the name and the code have been lifted from files
+[published][s-xml] by one [nonperson][nonperson] known as Sven Van
+Caekenberghe, who, fortunately, wrote a library that is relatively small
+(around a thousand LoC), is organized so that it can be grasped in one
+sitting, and is known to work[^2]. Unfortunately though, as with most
+(all?) heathen programs encountered, this one isn't without warts. Thus,
+in addition to providing a patch, this article discusses the structure
+of S-XML and its current problems.
+
+The patch for S-XML is available in my
+[V source repository][s-xml-v]. Now, as to the library itself, it is
+structured as follows.
+
+S-XML contains two layers of abstraction: a. the core parsing code, that
+reads characters from a stream and returns XML elements, stored in
+`xml.lisp`; b. a series of wrappers over the parser that take the parser
+results and give them a particular structure; and c. an interface
+between (a) and (b), stored in `dom.lisp`. In fact, one could say that
+the layering goes exactly the other way around: the code in (b) provides
+a set of functions for the parser (a), while the parser takes a
+string/stream, reads it and calls the functions provided by (b) so it
+can decide what to do once it has all the required data, e.g. tags,
+attributes etc. The wrappers in (b) are stored in `xml-struct-dom.lisp`,
+`lxml-dom.lisp` and `sxml-dom.lisp`[^3].
+
+The advantage of this design is that it doesn't constrain the user to
+any particular DOM tree representation. Personally, I find this
+so-called feature to be entirely useless and not much of an advantage
+after all, as I don't have any plans to parse XML files into multiple
+tree formats, not even mentioning the fact that "doesn't constrain the
+user" is the perfect recipe for
+[hallucinated freedom][hallucinated-freedom], and not even discussing
+the extra lines of (mostly dead) code this adds. This
+[cleverness][clever] is more for its own sake than anything of
+substance, and thus eventually some hero or another will surgically
+excise this particular tumour.
+
+Until then, however, the thing works, so it's all the better to publish
+it than to wait for the moment when said hero gets off his or her ass
+and makes the thing shine. Meanwhile, the more pressing matter for yours
+truly, and the next episode of this series, will involve publishing a
+small RSS/Atom parser based on S-XML.
+
+[^1]: This style, nowadays immediately recognizable by TMSR citizens as
+    the "FFA style", draws from [Asciilifeform][alf]'s
+    [Finite Field Arithmetic][loper-os] series.
+
+[^2]: It's been powering Feedbot for some months now.
+
+[^3]: Corresponding, respectively, to a defstruct-based format, Franz's
+    [LXML][lxml] format and the so-called [SXML][sxml]. The latter two
+    structure XML markup as S-expressions. Currently, Feedbot's RSS
+    parser uses LXML, for no reason in particular other than it being
+    the implicit option.
+
+[tmsr-schedule-i]: /posts/y05/082-tmsr-schedule-i.html#selection-63.114-63.159
+[intellectual-ownership]: /posts/y04/069-on-intellectual-ownership.html
+[feedbot]: http://btcbase.org/log-search?q=feedbot
+[alf]: http://wot.deedbot.org/17215D118B7239507FAFED98B98228A001ABFFC7.html
+[loper-os]: http://www.loper-os.org/?cat=49
+[fits-in-head]: http://btcbase.org/log-search?q=fits-in-head
+[botworks]: /posts/y05/080-botworks-regrind.html
+[s-xml]: https://archive.is/Ra6bA
+[nonperson]: http://btcbase.org/log-search?q=nonperson
+[s-xml-v]: http://lucian.mogosanu.ro/src/ TODO update this with actual dir
+[lxml]: https://web.archive.org/web/20080108082030/http://opensource.franz.com/xmlutils/xmlutils-dist/pxml.htm
+[sxml]: https://archive.is/Sl4Ev
+[hallucinated-freedom]: http://trilema.com/2017/the-practical-costs-of-hallucinated-freedom/
+[clever]: http://btcbase.org/log-search?q=clever
diff --git a/posts/y05/085-s-xml.markdown b/posts/y05/085-s-xml.markdown
deleted file mode 100644
index 1b4c475..0000000
--- a/posts/y05/085-s-xml.markdown
+++ /dev/null
@@ -1,86 +0,0 @@
----
-postid: 085
-title: An XML parser for Common Lisp programs
-date: February 18, 2019
-author: Lucian Mogoșanu
-tags: tech, tmsr
----
-
-This post is part of a [series][tmsr-schedule-i] of published
-[artifacts][intellectual-ownership] that (will) represent components for
-The Republic's RSS bot, [Feedbot][feedbot]. The idea behind this series
-is to grow Feedbot piece by piece[^1], starting from the smallest
-elements that [fit in head][fits-in-head], then using them as building
-blocks for the actual product, which will flow downstream from the
-[botworks][botworks] V tree.
-
-The first item to be published is S-XML, an XML parser written in Common
-Lisp. Both the name and the code have been lifted from files
-[published][s-xml] by one [nonperson][nonperson] known as Sven Van
-Caekenberghe, who, fortunately, wrote a library that is relatively small
-(around a thousand LoC), is organized so that it can be grasped in one
-sitting, and is known to work[^2]. Unfortunately though, as with most
-(all?) heathen programs encountered, this one isn't without warts. Thus,
-in addition to providing a patch, this article discusses the structure
-of S-XML and its current problems.
-
-The patch for S-XML is available in my
-[V source repository][s-xml-v]. Now, as to the library itself, it is
-structured as follows.
-
-S-XML contains two layers of abstraction: a. the core parsing code, that
-reads characters from a stream and returns XML elements, stored in
-`xml.lisp`; b. a series of wrappers over the parser that take the parser
-results and give them a particular structure; and c. an interface
-between (a) and (b), stored in `dom.lisp`. In fact, one could say that
-the layering goes exactly the other way around: the code in (b) provides
-a set of functions for the parser (a), while the parser takes a
-string/stream, reads it and calls the functions provided by (b) so it
-can decide what to do once it has all the required data, e.g. tags,
-attributes etc. The wrappers in (b) are stored in `xml-struct-dom.lisp`,
-`lxml-dom.lisp` and `sxml-dom.lisp`[^3].
-
-The advantage of this design is that it doesn't constrain the user to
-any particular DOM tree representation. Personally, I find this
-so-called feature to be entirely useless and not much of an advantage
-after all, as I don't have any plans to parse XML files into multiple
-tree formats, not even mentioning the fact that "doesn't constrain the
-user" is the perfect recipe for
-[hallucinated freedom][hallucinated-freedom], and not even discussing
-the extra lines of (mostly dead) code this adds. This
-[cleverness][clever] is more for its own sake than anything of
-substance, and thus eventually some hero or another will surgically
-excise this particular tumour.
-
-Until then, however, the thing works, so it's all the better to publish
-it than to wait for the moment when said hero gets off his or her
-ass. Until then, the more pressing matter for yours truly, and the next
-episode of this series, will involve publishing a small RSS/Atom parser
-based on S-XML.
-
-[^1]: This style, nowadays immediately recognizable by TMSR citizens as
-    the "FFA style", draws from [Asciilifeform][alf]'s
-    [Finite Field Arithmetic][loper-os] series.
-
-[^2]: It's been powering Feedbot for some months now.
-
-[^3]: Corresponding, respectively, to a defstruct-based format, Franz's
-    [LXML][lxml] format and the so-called [SXML][sxml]. The latter two
-    structure XML markup as S-expressions. Currently, Feedbot's RSS
-    parser uses LXML, for no reason in particular other than it being
-    the implicit option.
-
-[tmsr-schedule-i]: /posts/y05/082-tmsr-schedule-i.html#selection-63.114-63.159
-[intellectual-ownership]: /posts/y04/069-on-intellectual-ownership.html
-[feedbot]: http://btcbase.org/log-search?q=feedbot
-[alf]: http://wot.deedbot.org/17215D118B7239507FAFED98B98228A001ABFFC7.html
-[loper-os]: http://www.loper-os.org/?cat=49
-[fits-in-head]: http://btcbase.org/log-search?q=fits-in-head
-[botworks]: /posts/y05/080-botworks-regrind.html
-[s-xml]: https://archive.is/Ra6bA
-[nonperson]: http://btcbase.org/log-search?q=nonperson
-[s-xml-v]: http://lucian.mogosanu.ro/src/ TODO update this with actual dir
-[lxml]: https://web.archive.org/web/20080108082030/http://opensource.franz.com/xmlutils/xmlutils-dist/pxml.htm
-[sxml]: https://archive.is/Sl4Ev
-[hallucinated-freedom]: http://trilema.com/2017/the-practical-costs-of-hallucinated-freedom/
-[clever]: http://btcbase.org/log-search?q=clever
-- 
1.7.10.4