Lambda calculus

Originally posted on dev.to/gillchristian.

Lambda calculus

I figured for some of the days of #100DaysOfFP it makes more sense to write something longer than a tweet. So every now and then I'll post in dev.to instead of only a tweet. 😄

This week the Delft Haskell Study Group started again. I attended last year but didn't put in as much focus as I wanted, so I decided to give it another shot this year.

The goal of the study group is to read some chapter/s of Haskell Programming from First Principles and then go through the exercices together.

My goal for these series is to share a few things from the chapter. More than a blogpost this is a collection of notes that could serve as a recap of the chapter.

Back to the roots

Even the greatest mathematicians, the ones that we would put into our mythology of great mathematicians, had to do a great deal of leg work in order to get to the solution in the end.

― Daniel Tammett

The first chapter of the book doesn't go directly into Haskell and programming but instead introduces us to Lambda calculus. Why? Because it's the foundation on top of what Functional Programming, and thus Haskell, is based.

Lambda calculus is "a model of computation devised in the 1930s by Alonzo Church. [...] Functional programming languages are all based on the lambda calculus" ¹.

Functional programming: "a computer programming paradigm that relies on functions modeled on mathematical functions [...] programs are a combination of expressions".

Expressions: "include concrete values, variables, and also functions".

Functions: are "expressions that are applied to an argument or input, and once applied, can be reduced or evaluated". And they define a "relation between a set of possible inputs and a set of possible outputs".

Lambda expressions

In lambda calculus we have: expressions, variables and abstractions. Where expressions are a superset as they can be variables, abstractions or a combination of them.

Variables: names for potential inputs of functions.

Abstractions: functions. A lambda term consisting of a head (the lamda) and a body, and it's applied to an argument.

Argument: a value.

Application: applying a lambda function to an argument.

λx.x

Let's break that down ²:

λ x . x
^─┬─^
  └────── extent of the head of the lambda.

λ x . x
  ^────── the single parameter of the
          function. This binds any
          variables with the same name
          in the body of the function.

λ x . x
      ^── body, the expression the lambda
          returns when applied. This is a
          bound variable.

Functions in lambda calculus are anonymous. But we still call them abstractions as they generalize a problem and it's resolution.

Alpha equivalence:

λx.x
λd.d
λz.z

Beta reduction

"When we apply a function to an argument, we substitute the input expression for all instances of bound variables within the body of the abstraction. You also eliminate the head of the abstraction, since its only purpose was to bind a variable." ³

(λx.x) 2
[x ∶= 2]
2

The argument can be anything, even other lamda:

(λx.x)(λy.y)
[x ∶= (λy.y)]
λy.y

Applications in the lambda calculus are left associative:

(λx.x)(λy.y)z
[x := (λy.y)]
(λy.y)z
[y := z]
z

Some variables might be free, meaning they aren't bound in the head:

λx.xy

If we apply such lambda:

(λx.xy)z
[x := z]
zy

Multiple arguments

"Each lambda can only bind one parameter and can only accept one argument. Functions that require multiple arguments have multiple, nested heads. When you apply it once and eliminate the first (leftmost) head, the next one is applied and so on" ⁴. This is called currying.

λxy.xy === λx.(λy.xy)

Beta normal form: when you cannot beta reduce the terms any further.

λx.x    -- beta reduced, no arguments to apply

(λx.x)z -- not beta reduced, `z` hasn't been applied

Combinator: lambda term with no free variables.

λxy.x       -- Combinator
λxyz.xz(yz) -- Combinator

λy.x        -- Not combinator
λx.xz       -- Not combinator

Divergence: lambdas that never reduce to a normal form are said to diverge.

(λx.xx)(λx.xx)
[x := (λx.xx)]
(λx.xx)(λx.xx)
[x := (λx.xx)]
(λx.xx)(λx.xx)
[x := (λx.xx)]
(λx.xx)(λx.xx)
...

In programming, terms that diverge don't produce an actual answer, at least not a meaningful one.

But why?

After reading this chapter the first time it was not clear to me how all of this applied to Haskell, or Functional Programming in general.

Turns out that the process of reducing the expressions (beta reduction) is how Haskell code evaluates.

The following lambda expression that reduces to z

(λabc.cba)yz(λwv.w)
[a := y]
(λbc.cby)z(λwv.w)
[b := z]
(λc.czy)(λwv.w)
[c := (λwv.w)]
(λwv.w)zy
[w := z]
(λv.z)y
[v := y]
z

It's the equivalent of this in Haskell:

-- y = 1
-- z = 2

(\a b c -> c b a) 1 2 (\w v -> w)
(\b c -> c b 1) 2 (\w v -> w)
(\c -> c 2 1) (\w v -> w)
(\w v -> w) 2 1
(\v -> 2) 1
2

If we run it in GHCi we get the same result:

λ> (\a b c -> c b a) 1 2 (\w v -> w)
2

Haskell Programming from First Principles, page 2. ↩
Haskell Programming from First Principles, page 6. ↩
Haskell Programming from First Principles, page 7. ↩
Haskell Programming from First Principles, page 10. ↩