posts: 042, 043

author Lucian Mogosanu <lucian.mogosanu@gmail.com>

Sun, 14 Feb 2016 17:17:17 +0000 (19:17 +0200)

committer Lucian Mogosanu <lucian.mogosanu@gmail.com>

Sun, 14 Feb 2016 17:17:17 +0000 (19:17 +0200)
diff --git a/posts/y02/042-category-theory-software-engineering.markdown b/posts/y02/042-category-theory-software-engineering.markdown
new file mode 100644
index 0000000..0f475a4
--- /dev/null
+++ b/posts/y02/042-category-theory-software-engineering.markdown
@@ -0,0 +1,310 @@
+---
+postid: 042
+title: Category theory and its application in software engineering
+date: January 30, 2016
+author: Lucian Mogoșanu
+tags: math, tech
+---
+
+I have touched on the subject of category theory in the past, motivated
+partly by my enthusiasm for working with a mathematical framework that is
+so simple yet so powerful, and partly by the usefulness of categorical
+models in software.  This essay draws from previous posts
+[on the old blog][bricks1] and from my previous experience with the
+subject, and I am posting it hoping that it will represent a starting
+point for other interesting writings.
+
+I am fairly sure that most of the work in this post is in no way
+original; that is, there are other publications where categorical
+approaches to software modeling, and matters pertaining to category
+theory in general, are already explained, and most probably better than
+they are here. For example, Steve Awodey has an excellent book providing
+an in-depth mathematical exploration of category theory[^1]; Robert
+Harper discusses the (major) impact of categories on type theory[^2],
+computation and computer programs; Brent Yorgey gives a really good
+overview of the relation between categories and Haskell type
+classes[^3]. There is much more material on the web and in books, and
+while you're not required to peruse it in order to read this, I
+certainly encourage you to have a look.
+
+## Category theory: introduction, definitions
+
+While mathematics is an exact "science"[^4], its methodology differs
+from that of, say, physics or biology, which have fundamentally
+different objectives, although the latter very often make use of
+mathematical means to make sense of the world. Instead, it'd be fairer
+to find the origins of mathematics in philosophy, which discusses
+concepts, or ideas, or essences, rather than objective experience.
+
+For the last century or so all mathematicians and philosophers have been
+in agreement on the fact that mathematics must have a philosophical and
+logical basis. For quite a long time, that basis was, and to some degree
+still is, set theory; the limitations of naïve set theory[^5] have been
+thoroughly explored in the 20th century and the need for a "more
+complete" theory of mathematics was and is still felt by
+mathematicians. Even though nowadays we prefer using computers to solve
+problems requiring mathematics, this has nothing to do with computers
+themselves, although it has everything to do with the theory of
+computation.
+
+Category theory was for a while believed to be this new, previously
+missing, foundation of mathematics. This doesn't seem to be the
+consensus among mathematicians anymore, but despite that, categories
+still play an important role in defining the new framework[^6]. Also
+note that in Harper's Holy Trinity, the categorical approach defines the
+so-called "universe of reasoning" in terms of mappings and structures, a
+view that is very much in sync with that of software architecture and
+software engineering.
+
+What is then a category? According to the definition, any category
+necessarily consists of the following three: *objects*, *morphisms* (or
+*arrows*) and a *composition law* bearing well-defined properties.
+
+Intuitively, any mathematical object could constitute an **object** in a
+category.  Category theory classes often provide sets as the most
+intuitive example of objects; that is, any set is an object in the
+category of sets. Note that the categorical view doesn't necessarily
+care about how an object is *defined*, but rather about its properties
+in relation to the given category's arrows and the overall category's
+structure. Formally, given a category $\mathcal{C}$, we can denote its
+set of objects as $\text{Ob}(\mathcal{C})$.
+
+Also intuitively, any mapping between two objects could constitute an
+**arrow** in a category. The canonical example here is represented by
+functions, i.e. mappings between sets, but many other binary relations
+fit this description. An interesting example is that of
+[partially-ordered sets][posets]. Formally, for a given category
+$\mathcal{C}$ and two objects $A, B \in \text{Ob}({\mathcal{C}})$,
+$\text{Hom}_{\mathcal{C}}(A, B)$ denotes the set of arrows from $A$ to
+$B$; the function notation $f : A \rightarrow B$ is also often used for
+an arrow $f \in \text{Hom}_{\mathcal{C}}(A, B)$.
+
+Finally, **composition** is denoted using the "$\circ$" operator or
+juxtaposition, and it represents a binary operation on two arrows in a
+category. Intuitively, one may see composition similarly to function
+composition: given a category $\mathcal{C}$, three arbitrary objects $A,
+B, C \in \text{Ob}(\mathcal{C})$ and two arrows $f \in
+\text{Hom}_{\mathcal{C}}(A, B)$ and $g \in \text{Hom}_{\mathcal{C}}(B,
+C)$, there exists an arrow $h \in \text{Hom}_{\mathcal{C}}(A, C)$,
+where $h \equiv g \circ f$. A good intuition is that the "path" from $A$
+to $C$ could be represented as another arrow in $\mathcal{C}$.
+
+Composition is *associative*; that is, given $f : A \rightarrow B$, $g :
+B \rightarrow C$ and $h : C \rightarrow D$, then:
+
+$(h \circ g) \circ f \equiv h \circ (g \circ f)$
+
+Intuitively, this tells us that a composition "path" is unique no matter
+how its arrows are grouped, although the order of the arrows still matters.
+
+Additionally, every object has an associated *identity* arrow; $\forall
+A \in \text{Ob}(\mathcal{C})$, then:
+
+$\exists 1_{A} \in \text{Hom}_{\mathcal{C}}(A, A)$
+
+which is neutral under composition. That is, $\forall A, B \in
+\text{Ob}(\mathcal{C})$, $\forall f : A \rightarrow B$,
+
+$1_{B} \circ f = f \circ 1_{A} = f$.
+
+These are all the elements defining a category. Intuitively, they
+naturally apply to sets and functions, giving rise to the category of
+sets, denoted **Set**: all sets are objects and all functions are
+arrows; functions may be composed associatively and every set has an
+identity function.
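
The two laws above can be checked by hand on a small fragment of **Set**. What follows is only a minimal sketch in Python, not a proof: the helpers `identity` and `compose` and the sample arrows `f`, `g` and `h` are names made up for this illustration, and the laws are tested on a handful of inputs rather than on all of them.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def identity(x: A) -> A:
    """The identity arrow 1_A: returns its argument unchanged."""
    return x


def compose(g: Callable[[B], C], f: Callable[[A], B]) -> Callable[[A], C]:
    """Return the arrow g . f, i.e. apply f first and g second."""
    return lambda x: g(f(x))


# Three hypothetical arrows: f : int -> int, g : int -> str, h : str -> int.
def f(n: int) -> int:
    return n + 1


def g(n: int) -> str:
    return "x" * n


def h(s: str) -> int:
    return len(s) * 2


samples = range(5)

# Associativity: (h . g) . f agrees with h . (g . f) on every sample.
associative = all(
    compose(compose(h, g), f)(n) == compose(h, compose(g, f))(n)
    for n in samples
)

# Identity: 1_B . f and f . 1_A both agree with f on every sample.
unital = all(
    compose(identity, f)(n) == f(n) == compose(f, identity)(n)
    for n in samples
)
```

Both `associative` and `unital` come out true here, which is what the definition demands of any choice of composable functions.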
+
+There are other examples of categories in the world of mathematics and
+computer science, which I advise you to explore on your own. The
+concepts of *functors* and *natural transformations* are also
+fundamental to category theory, but I will skip them for now due to lack
+of space. I will instead leave the remainder of this essay to a more
+interesting example and attempt to model version control systems as
+categories. This, to my knowledge, provides a new perspective on the
+subject, so I'm hoping it will prove to be interesting and maybe even a
+bit challenging.
+
+## Example: The Git category
+
+Those of you who are coming from software engineering should be familiar
+with version control systems (VCS). VCS have been devised as
+collaborative tools between programmers who want to share code and have
+a means to keep track of changes in the code base of some particular
+piece of software. They remain crucial to software development, although
+nowadays technical people are using them to maintain all sorts of other,
+usually text-based projects such as papers or web sites. The popularity
+of [GitHub][github] has also drawn less technical people to this world
+of programming, so everyone and their dog can keep a public project
+nowadays.
+
+One particular kind of version control system is the distributed version
+control system (DVCS). All VCS maintain a *repository* where code is
+stored and where the entire history of a project is maintained as a set
+of *commits*. In a DVCS in particular, every contributor to a project
+has their own offline copy of the repository, and they can keep
+their changes in sync with a remote repository by *pushing* their local
+copy. We're not particularly interested in this aspect at the moment,
+but it's interesting to note that our categorical model should also
+apply to distributed systems.
+
+Let's take the Linux kernel as an example: Linux is kept under version
+control using [Git][git]. It has multiple branches and forks (remote
+copies of a repository) and the code base of the kernel changes as new
+commits are added to the remote repository. The code is therefore in a
+particular **state** at a given point in time and its state changes with
+each commit, usually by applying a patch, or a **diff**, which holds as
+information the "difference", in lines of code (LOC) added or deleted,
+between the old state and the new one. So far, so good.
+
+Given that there are many possible modifications that could arise from a
+given state, the code might diverge into multiple **branches** which
+will later need to be **merged** or **rebased**. I won't go into detail
+regarding these concepts, but they should nevertheless prove to be
+interesting from a categorical point of view. For now we assume that the
+repository goes through a list (as opposed to a graph) of states as it
+changes, each change, or set of changes, being marked by a diff.
+
+Intuitively, it should be fairly obvious that repository states can be
+viewed as objects in a category: assuming for example that the commit
+hashes in a Git repository are unique[^7], each hash marks the
+identifier of a "version" of the code in that repository. If we wanted
+to prove an isomorphism between code revisions and mathematical sets, we
+would intuitively see each revision as a set comprising arbitrary
+strings, i.e. the actual code.
+
+Also intuitively enough, we could look at commit diffs in the same way
+we look at a categorical arrow, each diff providing a mapping between
+two states in the same way a function provides a mapping between two
+sets. For example, in Git, this difference is provided in terms of lines
+added and removed from a certain code base[^8].
+
+This representation gives rise to a small complication. In practice
+there is usually more than one way to go from one revision to
+another. Given for example a certain code base upon which various
+modifications have been made, the developer may choose to either create
+a big commit containing all the changes, or various smaller commits,
+each comprising a unit of their work[^9]. For the sake of making our
+model simpler, we can define a "minimal commit" unit, represented by the
+removal or addition of a certain line in a code base.
+
+We also note that commit diffs are composable most of the
+time[^10]. Given two successive commits, one may represent them as a
+single commit, e.g. by [squashing][git-squash] them in Git, or by simply
+applying git-diff between two commit hashes. This is fortunate for us,
+as it allows us to represent a possible commit as a chain of
+compositions of multiple "minimal commits". The possible compositions
+are conceptually very similar to a [Hasse diagram][hasse-diagram],
+which, interestingly enough, provides an analogy between commits and
+posets.
+
+Finally, we can look at the empty diff, i.e. the diff with no additions
+and no removals, as the canonical representation of an identity
+arrow. By default Git refuses empty commits (unless `--allow-empty` is
+passed), given that the new repository state would be (needlessly)
+identical to the old one, but we can model them anyway, as we know that a
+git-diff between an arbitrary commit hash and itself is always empty.
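
The pieces described so far (states as objects, diffs as arrows, squashing as composition, the empty diff as identity) can be put together in a toy model. This is a sketch of the essay's simplified view, not of Git's actual data model: the names `State`, `Edit`, `Diff`, `EMPTY`, `apply` and `compose` are invented for the illustration, and a repository is assumed to be a single list of lines.

```python
from dataclasses import dataclass
from typing import Tuple

State = Tuple[str, ...]  # an object: a repository state, reduced to its lines


@dataclass(frozen=True)
class Edit:
    """A 'minimal commit': add or remove a single line at a position."""
    op: str     # "add" or "remove"
    index: int  # position of the line in the state
    line: str   # content of the line


Diff = Tuple[Edit, ...]  # an arrow: a chain of minimal commits
EMPTY: Diff = ()         # the identity arrow: no additions, no removals


def apply(diff: Diff, state: State) -> State:
    """Follow the arrow `diff` from `state` to the state it leads to."""
    lines = list(state)
    for edit in diff:
        if edit.op == "add" and 0 <= edit.index <= len(lines):
            lines.insert(edit.index, edit.line)
        elif (edit.op == "remove" and 0 <= edit.index < len(lines)
              and lines[edit.index] == edit.line):
            del lines[edit.index]
        else:
            # The edit does not fit this state, much like a merge conflict.
            raise ValueError(f"cannot apply {edit} to {tuple(lines)}")
    return tuple(lines)


def compose(second: Diff, first: Diff) -> Diff:
    """Squash two successive diffs: `first` is applied before `second`."""
    return first + second


v0: State = ("int main() {", "}")
add_body: Diff = (Edit("add", 1, "    return 0;"),)
add_header: Diff = (Edit("add", 0, "#include <stdio.h>"),)

v1 = apply(add_body, v0)
v2 = apply(add_header, v1)
squashed = compose(add_header, add_body)
```

Applying `squashed` to `v0` leads to the same state as applying the two diffs one after the other, and `EMPTY` leaves any state unchanged. Since composition is plain tuple concatenation, associativity comes for free; the `ValueError` branch is where a diff fails to be an arrow out of a given state.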
+
+From all the above emerges the Git category. The usefulness of this
+representation is a whole different problem, but I am guessing that
+various operations, e.g. merging, rebasing, defining submodules or other
+useful operations that haven't yet been designed into state of the art
+DVCS, can be represented as monadic actions. This of course would
+involve answering deeper questions, such as what is an endofunctor in
+the Git category, but for the sake of brevity we will stop this train of
+thought here.
+
+## Exercise: The Blockchain category, analogy with DVCS
+
+The [blockchain][blockchain] is a database design coming from
+Bitcoin[^11]. Although the idea was conceived specifically for implementing
+a new way of [representing money][infrastructure-iii], its uses may
+theoretically go [beyond that][infrastructure-iv], into other
+distributed systems and applications.
+
+Simply put, the blockchain is a distributed chain of transactions. It is
+distributed in the sense that all the participants, e.g. in the Bitcoin
+system, should hold a copy of it. It contains transactions, that is,
+statements that a certain piece of information, e.g. money in Bitcoin's
+case, is transferred from one participant to the other, in the broad
+sense that a "participant" is the same thing as an
+account. Transactions, and more specifically parent transactions, are
+identified by their hashes.
+
+There is an immediate analogy between VCS and blockchains. The
+categorical likeness of the two follows from that directly: in both
+cases, system states are objects and transitions between states are
+arrows; in both cases, arrow composition is representable and both allow
+the existence of a conceptual identity transaction. This shows that the
+architectural differences between the two are very few.
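
To make the likeness concrete, here is the same kind of toy model for a ledger. It is a hedged sketch only: it leaves out blocks, hashes and the consensus protocol altogether, and the names `Ledger`, `Transaction`, `Arrow`, `NO_OP`, `apply` and `compose` are assumptions of this illustration rather than anything taken from Bitcoin.

```python
from typing import Dict, Tuple

Ledger = Dict[str, int]             # an object: account balances
Transaction = Tuple[str, str, int]  # (sender, receiver, amount)
Arrow = Tuple[Transaction, ...]     # an arrow: a chain of transactions
NO_OP: Arrow = ()                   # the conceptual identity transaction


def apply(arrow: Arrow, ledger: Ledger) -> Ledger:
    """Follow `arrow` from `ledger` to the ledger state it leads to."""
    balances = dict(ledger)  # copy, so the source state stays untouched
    for sender, receiver, amount in arrow:
        if amount < 0 or balances.get(sender, 0) < amount:
            raise ValueError(f"invalid transfer of {amount} from {sender}")
        balances[sender] = balances.get(sender, 0) - amount
        balances[receiver] = balances.get(receiver, 0) + amount
    return balances


def compose(second: Arrow, first: Arrow) -> Arrow:
    """Chain two arrows: `first` is applied before `second`."""
    return first + second


genesis: Ledger = {"alice": 10, "bob": 0}
pay_bob: Arrow = (("alice", "bob", 4),)
pay_carol: Arrow = (("bob", "carol", 1),)

step_by_step = apply(pay_carol, apply(pay_bob, genesis))
all_at_once = apply(compose(pay_carol, pay_bob), genesis)
```

Apart from the type of the states and the validity check, this code has the same shape as a line-based model of commits would: composition is concatenation and the identity is the empty chain.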
+
+The design and implementation differences are in the
+details. Transactions are inserted in the blockchain by a consensus
+protocol; in Git, the policy for insertion is determined by the
+computing systems where the bare repositories are stored. Git
+transactions are independent of their content, containing anything from
+source code to binary data; blockchain transactions have a more
+restrictive format, depending on their application.
+
+In theory one could generalize databases[^12] using categories. These
+examples show that category theory is or could be, among other
+mathematical abstractions, very useful to defining software both
+architecturally and at the implementation level. Given that software
+developers are faced with the pain of building robust and/or resilient
+systems in a context where software verification and specification
+don't scale, such abstractions are (arguably) needed now more than
+ever.
+
+[^1]: Awodey, Steve. Category theory. Vol. 49. Oxford University Press,
+    2006.
+
+[^2]: [The Holy Trinity][trinitarianism]
+
+[^3]: [Typeclassopedia][typeclassopedia]
+
+[^4]: In the broadest sense of the word "science", that coming from its
+Latin root, where its meaning overlaps with that of "knowledge".
+
+[^5]: [Russell's paradox][russell], for example.
+
+[^6]: Univalent Foundations Program. [Homotopy Type Theory: Univalent
+Foundations of Mathematics][hott]. Univalent Foundations, 2013.
+
+[^7]: Which, by the way, they aren't. Fortunately the basic properties
+of the SHA-1 hash make collisions [highly improbable][sha-1-git], and in
+theory one could devise a (D)VCS commit addressing scheme that
+completely avoids this problem.
+
+[^8]: I deliberately avoid treating repositories as collections of
+files, as this would make our definition a lot more complicated.
+
+[^9]: This is not an easy problem, as seen in [Commit Often, Perfect Later,
+Publish Once][git-best-practices].
+
+[^10]: There is an interesting mention to be made here regarding merge
+conflicts. In mathematical terms, this only tells us that the "minimal
+diff" doesn't provide a full mesh of mappings between repository states.
+
+[^11]: Nakamoto, Satoshi. "[Bitcoin: A peer-to-peer electronic cash
+system.][bitcoin]" Consulted 1.2012 (2008): 28.
+
+[^12]: Transactions are of particular interest to us in this post, but other
+aspects such as relational algebra could be seen as a particular case of
+categories. See "[Category Theory as a Unifying Database Formalism][database]"
+for more details.
+
+[bricks1]: http://lucian.mogosanu.ro/bricks/o-introducere-usor-neobisnuita-in-domeniul-arhitecturii-software/
+[trinitarianism]: http://existentialtype.wordpress.com/2011/03/27/the-holy-trinity/
+[typeclassopedia]: https://www.haskell.org/haskellwiki/Typeclassopedia
+[russell]: http://en.wikipedia.org/wiki/Russell%27s_paradox
+[hott]: http://homotopytypetheory.org/book/
+[posets]: http://en.wikipedia.org/wiki/Partially_ordered_set
+[github]: https://github.com/
+[git]: http://git-scm.com/
+[sha-1-git]: http://git-scm.com/book/es/v2/Git-Tools-Revision-Selection
+[git-best-practices]: https://sethrobertson.github.io/GitBestPractices/
+[git-squash]: http://git-scm.com/book/en/v2/Git-Tools-Rewriting-History#Squashing-Commits
+[hasse-diagram]: http://mathworld.wolfram.com/HasseDiagram.html
+[blockchain]: https://en.bitcoin.it/wiki/Block_chain
+[bitcoin]: https://bitcoin.org/bitcoin.pdf
+[infrastructure-iii]: /posts/y01/027-bitcoin-as-infrastructure-iii.html
+[infrastructure-iv]: /posts/y01/031-bitcoin-as-infrastructure-iv.html
+[database]: http://math.mit.edu/~dspivak/informatics/notes/unorganized/PODS.pdf
diff --git a/posts/y02/043-on-the-failure-of-marketing.markdown b/posts/y02/043-on-the-failure-of-marketing.markdown
new file mode 100644
index 0000000..630bc27
--- /dev/null
+++ b/posts/y02/043-on-the-failure-of-marketing.markdown
@@ -0,0 +1,159 @@
+---
+postid: 043
+title: On the failure of marketing (and civilization in general)
+date: February 14, 2016
+author: Lucian Mogoșanu
+tags: asphalt
+---
+
+"Marketing" is, or should be, in fact a bit of an umbrella term for at
+least two or three things.
+
+Firstly, marketing is, or should be, the science that studies the needs
+of the market, or more exactly the needs of the people that make up the
+market. This is the so-called "market research": what products do people
+*need* and what *can* they (afford to) buy?
+
+Secondly, marketing is, or should be, a set of techniques for making a market,
+or rather the people that make up a market, aware of the existence of some
+product, no more, no less. This is roughly the same as what people nowadays
+call "public relations".
+
+It happens, as history goes, that in the past few decades[^1] a slow
+but sure rupture between the term and its meaning occurred, among others,
+in marketing, and this phenomenon will, I am assuming, continue along
+its path towards a slow and painful death. The meaning of marketing has
+already inflated, or rather, it has become more and more diluted, as
+definitions such as the couple above have shifted more from is to
+should-be.
+
+To illustrate this, we will take a very simple example, that of the
+mobile phone[^2]. A first observation would be that today's mobile
+phones are no longer phones in the classic sense of the word, such that
+their stupid creators[^3] had to pollute the space of ideas with the new
+concept of smartphone. And this has been going on and on with the
+tablet, phablet and who knows what's next.
+
+Notice how these new products are not really innovative. Smartphones are
+in fact mobile phones with an integrated camera of lower quality than
+previous dedicated cameras[^4] and an integrated computer of much less
+power[^5] than the average desktop computer, among the other integrated
+products, usually of lower quality than their predecessors. Tablets are
+bigger smartphones that can pack a bit more hardware, while phablets
+are, I don't know, FSM[^6] knows what.  The next thing they'll do is try
+to put the same thing on the head unit of your car and in your fridge,
+in a desperate attempt to mix stuff together in the other new
+meta-buzzword called "the Internet of Things".
+
+What's more outrageous is that the mobile phone has an artificially
+induced lifespan of about one to two years[^7]. That is because most
+modern organizations impose on themselves this magical thing called
+time-to-market, which means that a given product must imperatively be
+released by some given date. It doesn't matter that it's unusable,
+that it has bugs or that
+[software engineering is a myth][software-engineering], they'll want it
+out by then and the armies of employees will have to work their asses
+off for that. That is, until the next iteration, when they'll ship with
+some other useless "features" and a set of new, shiny handicaps that'll
+make your life a nightmare. And I thought it was now long established
+that the only product worth buying once a year was the calendar, as per
+the ol' communist centralized planned economy model.
+
+<p style="text-align:center; font-weight:bold;">⁂</p>
+
+Although it doesn't look like it at first glance, marketing is failing
+because it doesn't inform people of the existence of things that they
+need to buy. What it does instead is to aggressively lure them into
+wanting, that is, into believing that they need to buy a certain
+product, regardless of whether they actually do; or, more importantly,
+not.
+
+How product owners do that is a whole different story. Branding is in
+fact not so harmful as one might believe. The introduction of jargon up
+to saturation is however a great source of confusion for clients, who
+don't feel safe delving into technical details, and thus they're given
+some weird term to cater to their naïveté. Returning to the smartphone
+example, tell me what Corning Gorilla Glass *actually* means and you win
+a prize. No, you don't know, you just trust[^8] what you're told, and
+the producers could give you a piece of post-processed horse manure as
+far as they're concerned, you'll still buy it.
+
+This and 24/7 mass propaganda are *the* things that make the
+market go round, "tech start-ups" gain billions of fake dollars[^9] and
+pop stars chill with their homies in their cribs.
+
+Now, why they do that is yet another different story. They do it because
+it's easy, first and foremost. It didn't use to be easy back in the day,
+but it's gotten progressively so as the generations got dumber[^10] and
+the dumb taught their children to be even dumber, so that they just
+returned to shopping shortly after the airplanes took down at least a
+part of the non-dumbness that was left in this otherwise dumb
+"civilized" world.
+
+<p style="text-align:center; font-weight:bold;">⁂</p>
+
+Of course, "it's not marketing's fault"[^11] that marketing is failing,
+or has failed.  The fault -- not a moral fault, but a deep, technical
+fault, in the sense of "failure" -- lies in a culture that found it
+easier to manipulate adults than to educate their children properly,
+where memes, tropes and quotes taken out of context hold more value than
+a book and where one must "do what they enjoy"[^12].
+
+The net effect of this marketing that is not a marketing is
+[post-religion][post-religion], transitioning into full-blown
+[fundamentalism][religiousness].
+
+Still think I'm full of shit? Here's what: take a popular video on
+YouTube, preferably one that you also like; look in the comments
+section, but promise you're going to read it in its entirety. If you
+don't see anything wrong with what's going on there, then there's the
+door, have fun with your Bieber and your tablet and stop wasting your
+time and my bandwidth.
+
+[^1]: About roughly the same time as my age. Is this a coincidence? I
+have no idea.
+
+[^2]: Although any product would do. Really. Go ahead, choose
+one. You'll be surprised by how most things have been twisted into
+useless junk by today's "marketing".
+
+[^3]: Yes, I am looking at you, rotten Steve Jobs.
+
+[^4]: Although the gap between the two has narrowed and it continues to
+    do so.
+
+[^5]: Not in terms of *raw* computing power, but in terms of what -- and
+this is a very broad "what" -- its master can do with it. You don't even
+own your smartphone, so you can only use it for whatever "apps" your
+master has designed for you. Oh, and the gap between *these* two will
+only continue to widen. Just look at your
+[average mobile operating system][android].
+
+[^6]: Flying Spaghetti Monster.
+
+[^7]: Nobody cares about the poor hardware. Most sane people can and will
+still make use of that old Nokia 3310, *and* break someone's head with
+it in self-defense. There, integration!
+
+[^8]: The takeaway message here is that trust does indeed mean
+something, only not on the mass market. No, not when you're one of the
+billion clueless consumers. So whatever you'd say, they tricked you into
+buying their latest and greatest.
+
+[^9]: Don't tell me you thought WhatsApp are *really* worth that
+much. Well, you'll be surprised, sooner rather than later.
+
+[^10]: Or maybe "the generations got dumber" is just bias? It might be,
+but this is a story for another time.
+
+[^11]: On the same note as "information wants to be free".
+
+[^12]: I'm probably a hedonist at least as much as anybody else, but the
+question is: if you look around you, can you easily spot the things that
+you don't enjoy? And moreover, what are you going to do to purge them out
+of your life?  Starting, say, yesterday.
+
+[android]: /posts/y02/03f-android-the-bad-and-the-ugly.html
+[software-engineering]: /posts/y02/03c-the-myth-of-software-engineering.html
+[post-religion]: /posts/y00/018-on-post-religion.html
+[religiousness]: /posts/y01/034-the-transition-back-into-religiousness.html