From 6f8954fa2e9c401e2e67aec2f7dc67123132c894 Mon Sep 17 00:00:00 2001 From: Lucian Mogosanu Date: Thu, 22 Aug 2019 18:17:26 +0300 Subject: [PATCH] drafts, hunchentoot-vi: Proofread --- drafts/000-hunchentoot-vi.markdown | 90 ++++++++++++++++++++---------------- 1 file changed, 50 insertions(+), 40 deletions(-) diff --git a/drafts/000-hunchentoot-vi.markdown b/drafts/000-hunchentoot-vi.markdown index 47a5c59..2f474cf 100644 --- a/drafts/000-hunchentoot-vi.markdown +++ b/drafts/000-hunchentoot-vi.markdown @@ -16,8 +16,9 @@ Hunchentoot. Other posts in the series include: * a review of [acceptors][hunchentoot-iv]; and * a review of [taskmasters][hunchentoot-v]. -This post will discuss the objects known as "requests" and "replies", -as they are part of the very same fundamental mechanism. +This post is a two-parter (see below why) that will discuss the +objects known as "requests" and "replies", as they are part of the +very same fundamental mechanism. The reader has probably noticed that [little][chunking] to nothing has been discussed about the core of this whole orchestra, the core being @@ -88,8 +89,9 @@ instantiation. a. an error handling context is defined; in which b. [script-name][ht-sn] and [query-string][ht-qs] are set based on the request URI[^4]; c. [get-parameters][ht-gp] are set[^5]; d. [cookies-in][ht-ci-acc] are set[^6]; e. [session][ht-sess] is set[^7]; -and finally, if everything fails, [log-message\*][ht-lms] is called to -log the error and [return-code\*][ht-rcs] is set to http-bad-request. +and finally, if everything fails, f. [log-message\*][ht-lms] is called +to log the error and [return-code\*][ht-rcs] is set to +http-bad-request. By the way, since HTTP hasn't escaped Unicode, URL decoding needs a character format, which is determined based on the content-type field @@ -109,8 +111,8 @@ convert the result into a flexi-streams "external-format" via checking on the parameters it receives, namely it only does something when: the content-type header is set; and the request method is POST; and [the "force" parameter is set; or the [raw-post-data][ht-rpd-slot] -slot is not set]; and the [raw-post-data][ht-rpd-slot] slot is not -true -- to quote from a comment, "can't reparse multipart posts, even +slot is not set]; and the [raw-post-data][ht-rpd-slot] slot is not set +to t -- to quote from a comment, "can't reparse multipart posts, even when FORCEd". Furthermore, for the function to do anything, the content-length header must be set or [input chunking][ht-icp] must be enabled; otherwise, a warning is logged and the function returns. @@ -152,13 +154,13 @@ its own, for which there already exists a CL "library"[^10]. Unfortunately, the coad written around said "library" is still kludgy. Let's see. -a. parse the content-type header; then b. look for a "boundary" +a\. parse the content-type header; then b. look for a "boundary" content-type parameter, and return empty-handed if that doesn't exist; otherwise c. for each MIME part; d. get the MIME headers; and -e. particularly, the content-disposition header; and f. particularly, +e\. particularly, the content-disposition header; and f. particularly, the "name" field of that header. -g. when the item at (f) exists, append the following to the result: +g\. when the item at (f) exists, append the following to the result: g1. the item at (f), converted using [convert-hack](#ch); and g2. the contents, converted using the same [convert-hack](#ch). However, mime-part-contents can return either[^11] g2i. a path to a local file, @@ -167,8 +169,8 @@ its content-type; or g2ii. a string, in which case the (converted) string is stored. [ch] [**convert-hack**][ht-ch]: You might -have been wondering what this does and why the fuck it exists in the -first place. Let's quote from the documentation itself: +wonder what this does and why it exists in the first place. Let's +quote from the documentation itself: > The rfc2388 package is buggy in that it operates on a character > stream and thus only accepts encodings which are 8 bit @@ -182,8 +184,9 @@ octets, then converts said vector (using [octets-to-string][flex-ots]) to a string of the encoding given by the external-format parameter. So this is just dancing around the previous [latin1](#pmfd) game -- yes, if you send a UTF-8-encoded file wrapped in a (ISO-8859-1-encoded) -POST request, the result will be mixed-encoded data, and whoever gets -this will have to make heads and tails of the resulting pile of shit. +POST request, the result will be mixed-encoding data, and whoever gets +said data will have to make heads and tails of the resulting pile of +shit. I *can't wait* for the moment when the ban on this multipart fungus comes into effect, it'll be a joyous day. @@ -192,26 +195,26 @@ comes into effect, it'll be a joyous day. data from the request stream and sets the [raw-post-data][ht-rpd-slot] slot: -a. if the want-stream argument is set, then the stream is converted to +a\. if the want-stream argument is set, then the stream is converted to latin-1-encoded (as per above) and the slot is set to this stream, bound by the content-length (if this field exists). -b. if content-length is set and it's greater than the already-read +b\. if content-length is set and it's greater than the already-read argument -- i.e. there is still data to be read from the stream, assuming the user has already read some of it -- then check whether -[chunking][ht-icp] is enabled and log a warning if so; and read the -content and let it be assigned. +[chunking][ht-icp] is enabled and, if so, log a warning; either way, +read the content and let it be assigned to raw-post-data. -c. if [chunking][ht-icp] is enabled, then c1. setup two arrays: an +c\. if [chunking][ht-icp] is enabled, then c1. setup two arrays: an adjustable "content" array and a buffer; c2. setup a position marker for the content array; c3. read into the buffer; then c4. adjust the content array to the new size; then c5. copy data from the buffer into the content array at the current position; and finally, c6. stop when there's no more content to be read. -I'm running out of space (and time) here, so contrary to [the -schedule][tmsr-work-iv] I'm splitting this into two pieces, the second -part to be published next week. Annoyingly enough, this is also +As you can well see, I am running out of space, so contrary to [the +schedule][tmsr-work-iv] I'm going to split this into two pieces, the +second part to be published next week. Annoyingly enough, this is also delaying [other work][logs-ttp-comments], including the fabled tarpitian-comment-server, so for now the venue for comments remains [#spyked][contact]. @@ -255,7 +258,7 @@ tarpitian-comment-server, so for now the venue for comments remains [^5]: The query-string is split by ampersands and passed to [form-url-encoded-list-to-alist][ht-fuelta], which takes this list - and splits each element by "equals". Thus the string + and splits each element by the equals sign. Thus the string p1=v1&p2=v2... ends up being represented as the association list: ~~~~ {.commonlisp} @@ -275,8 +278,8 @@ tarpitian-comment-server, so for now the venue for comments remains "content-type" field contains a type/subtype part (e.g. "text/plain", or "application/octet-stream" or whatever); but it also contains *parameters*, such as a charset, or - "multipart" markers, into which I won't get, because my blood - pressure is already going up. + "multipart" markers, into which I won't get just yet, because my + blood pressure is already going up. Anyway, [parse-content-type][ht-pct] reads all these and returns a type, a subtype and a charset, if it exists. @@ -288,7 +291,7 @@ tarpitian-comment-server, so for now the venue for comments remains [^9]: This looks confusing after all the previous "external-format" fudge. Note that *the content* has a user-provided format, while - the multipart-blah-blah thing is [by default][rfc-1945-3-6-1] + the multipart-blah-blah syntax is [by default][rfc-1945-3-6-1] Latin1. This is not fucking amusing, I know. [^10]: Written by one Janis Dzerins. There's also a @@ -300,8 +303,9 @@ tarpitian-comment-server, so for now the venue for comments remains libraries" problems in the future, which will require porting/adaptation work. No, I'm not gonna use a thousand libraries for XML parsing, there's [one][s-xml], and that's it. If - you're gonna complain, then you'd better have a good reason; where - the fuck were you when I needed that code? + you're gonna complain, then you'd better have a good reason, and + be ready to explain where the fuck you were when I needed that + code. [^11]: Y'know, I didn't set out to review *this* piece of code back when I started this, but it can't be helped. mime-part-contents is @@ -319,17 +323,6 @@ tarpitian-comment-server, so for now the venue for comments remains "write-content-to-file" parameter to nil. However, this is the *default* behaviour, and what Hunchentoot expects. Fuck my life. -[chunking]: /posts/y06/098-hunchentoot-iv.html#fn2 -[tcp]: /posts/y05/096-hunchentoot-ii.html#fn1 -[alf-cl-on-pc]: http://trilema.com/2019/trilema-goes-dark/#comment-130686 -[likbez]: http://logs.nosuchlabs.com/log-search?q=likbez&chan=trilema -[inca]: http://trilema.com/republican-thesaurus/?b=(cca%202016&e=femstate.#select -[rfc-1945]: https://tools.ietf.org/html/rfc1945#section-5.1.1 -[rfc-2616]: https://tools.ietf.org/html/rfc2616#section-5.1.1 -[http-rfcs]: https://www.w3.org/Protocols/ -[rfc-8615]: https://tools.ietf.org/html/rfc8615 -[rfc-1945-5]: https://tools.ietf.org/html/rfc1945#section-5 -[rfc-1945-6]: https://tools.ietf.org/html/rfc1945#section-6 [ht-c-req]: http://coad.thetarpit.org/hunchentoot/c-request.lisp.html#L31 [ht-ci-acc]: http://coad.thetarpit.org/hunchentoot/c-request.lisp.html#L68 [ht-gp]: http://coad.thetarpit.org/hunchentoot/c-request.lisp.html#L71 @@ -354,12 +347,29 @@ tarpitian-comment-server, so for now the venue for comments remains [ht-lms]: /posts/y06/098-hunchentoot-iv.html#lms [ht-rcs]: http://coad.thetarpit.org/hunchentoot/c-reply.lisp.html#L103 [ht-pct]: http://coad.thetarpit.org/hunchentoot/c-util.lisp.html#L283 -[flex-mef]: http://edicl.github.io/flexi-streams/#make-external-format [ht-hw]: http://coad.thetarpit.org/hunchentoot/c-conditions.lisp.html#L58 -[chunga-rnvp]: https://edicl.github.io/chunga/#read-name-value-pairs [ht-icp]: http://coad.thetarpit.org/hunchentoot/c-util.lisp.html#L342 [ht-shdefs]: http://coad.thetarpit.org/hunchentoot/c-specials.lisp.html#L275 [ht-arh]: /posts/y06/098-hunchentoot-iv.html#selection-762.0-762.5 +[cl-www]: /posts/y05/090-tmsr-work-ii.html#selection-108.0-108.17 +[hunchentoot-i]: /posts/y05/093-hunchentoot-i.html +[hunchentoot-ii]: /posts/y05/096-hunchentoot-ii.html +[hunchentoot-iii]: /posts/y06/097-hunchentoot-iii.html +[hunchentoot-iv]: /posts/y06/098-hunchentoot-iv.html +[hunchentoot-v]: /posts/y06/09b-hunchentoot-v.html +[chunking]: /posts/y06/098-hunchentoot-iv.html#fn2 +[tcp]: /posts/y05/096-hunchentoot-ii.html#fn1 +[alf-cl-on-pc]: http://trilema.com/2019/trilema-goes-dark/#comment-130686 +[likbez]: http://logs.nosuchlabs.com/log-search?q=likbez&chan=trilema +[inca]: http://trilema.com/republican-thesaurus/?b=(cca%202016&e=femstate.#select +[rfc-1945]: https://tools.ietf.org/html/rfc1945#section-5.1.1 +[rfc-2616]: https://tools.ietf.org/html/rfc2616#section-5.1.1 +[http-rfcs]: https://www.w3.org/Protocols/ +[rfc-8615]: https://tools.ietf.org/html/rfc8615 +[rfc-1945-5]: https://tools.ietf.org/html/rfc1945#section-5 +[rfc-1945-6]: https://tools.ietf.org/html/rfc1945#section-6 +[flex-mef]: http://edicl.github.io/flexi-streams/#make-external-format +[chunga-rnvp]: https://edicl.github.io/chunga/#read-name-value-pairs [rfc-1945-3-6-1]: https://tools.ietf.org/html/rfc1945#section-3.6.1 [clhs-prog1]: http://clhs.lisp.se/Body/m_prog1c.htm [rfc-2388]: https://tools.ietf.org/html/rfc2388 -- 1.7.10.4