If you want to (or have to) learn Haskell, you can’t avoid the two functions foldl
and foldr
.
They are two of the most powerful tools in Haskell’s toolbox.
However, they continue to confuse beginners and experts alike, so here’s my take on making these functions a little more accessible.
What do foldl
and foldr
do?
They both fold, i.e., reduce, a list to a single value, albeit in a different way.^{1}
Using foldl
and/or foldr
, you can implement every other function.
In fact, many functions use foldl
or foldr
internally—that’s how they are implemented in the first place!
Both foldl
and foldr
are so mighty, you could merely think of them as extremely powerful lowlevel functions used by those functions you normally use.
You could wonder why you should bother understanding lowlevel code and learning the interna of your highlevel abstractions, when you can also use the highlevel abstractions themselves.
But that would do those functions injustice.
When you really understand foldl
and foldr
, you can use these mighty functions to implement your own highlevel abstractions and harness their power to make your Haskell code so much better and faster.
Disclaimer: I myself am only just beginning to learn Haskell, so there’s no guarantee that everything I state is actually correct.
Comparison Between Haskell and Python
If you know Python, Python’s reduce
function would be the analogon to a slight variation of Haskell’s foldl
.
I will now briefly go off at a tangent because I presume that you a more proficient in Python than you are in Haskell.
I figure that laying a groundwork first and building upon it will help you understand Haskell better.
If you’re not interested in this tangent, skip the rest of this section and jump directly to this section.
Let’s examine an example.
In Python, the function call sum([1,2,3,4,5])
of course calculates the value 15
.
You could achieve the same goal writing the Python code
reduce(lambda x, y: x+y, [1,2,3,4,5])
instead, which calculates ((((1+2)+3)+4)+5)
which is also 15
.
“Hold on a minute. How exactly does this work?”
First, lambda x, y: x+y
creates a function that simply adds two values x
and y
.
Let’s name this function add_two_values
.
Now we provide this function add_two_values
as an argument to reduce
, i.e., we call reduce(add_two_values, [1,2,3,4,5])
.
This is what’s known as higherorder functions.
One function serves as parameter of a different function.
After all, we could also want to calculate the product 1*2*3*4*5=120
and provide reduce
with a selfdefined function named multiply_two_values
.
In this case, however, we want to calculate the sum and thus provide our function add_two_values
.
According to the documentation, internally, a variable—let’s call it result
—would be initiated with the first list item, i.e., result=1
.
Then, in a forloop we would add list item for list item, i.e., 1+2
equals 3
.
Then 3+3
equals 6
, and so on, until we return result=15
.
This is almost like foldl
in Haskell works.
foldl and foldl'
Why do I keep saying almost and slight variation?
Because foldl
doesn’t immediately evaluate the intermediate results, i.e., it doesn’t add 1+2
to 3
and then 3+3
, et cetera, like Python’s reduce
does.
Instead, foldl
keeps 1+2
as 1+2
.
Adding 3
to that would then generate (1+2)+3
instead of 3+3
.
The foldl
function builds the socalled thunk chain until we get ((((1+2)+3)+4)+5)
(instead of (10+5)
).
This is because of a thing in Haskell called lazy evaluation.
And in contrast to Python’s reduce
function, foldl
first has to build the entire thunk chain before it can start reducing that thunk chain.
This means that we have to allocate memory on the heap for all elements of the list until the thunk chain is finally established.
Only then can we start reducing the thunk chain.
Whereas Python’s reduce
and Haskell’s foldl'
use an accumulator in order to immediately reduce these expressions.
More on that in a second.
Haskell’s equivalent to the way Python evaluates that function call would be foldl'
(notice the apostrophe).
In Haskell, we use the variable name together with an '
at the end when the implementation is so similar and only such a slight variation, that this new function doesn’t deserve a name of its own.
In Haskell, foldl'
is way more efficient than foldl
because you don’t have to first build up a huge thunk chain before you can finally start reducing the expression.
As I said, with foldl
you have to first allocate memory on the heap for every single list item until you have the finished thunk chain.
The heap is limited only by the amount of RAM your computer has available and the swap.
Basically, for large lists Haskell will eat all your RAM and for very large lists, your computer will become very slow because the content of your exhausted RAM has to be shoveled to the relatively slow SSD.
Only when the thunk chain is built in its entirety can you start reducing:
((((1+2)+3)+4)+5)
But wait! It get’s worse.
The foldl
function now has to work its way from the outside to the inside again.
That is, before you can reduce something something something plus 5, you first have to know what that something is.
To do that, Haskell has to push the value 5
to the stack, i.e., save that value for later, and examine what that unknown something is.
Now it’s the same all over again, only this time with the value 4
.
That 4
too gets pushed to the stack.
We repeat that same procedure and push every value we encounter to the stack for the time being until we finally arrive at 1+2
.
Finally the +
operator now has two known operands instead of an unknown something plus a known value.
Finally the +
operator can do it’s job and actually calculate a result.
Once we’ve arrived at this point, we can walk forward again, retrieve those values saved on the stack as we go and finally determine the ultimate result, in the same fashion reduce
and foldl'
would have done it from the getgo.
((((1+2)+3)+4)+5)
becomes
(((3+3)+4)+5)
becomes
((6+4)+5)
becomes
(10+5)
becomes
(15)
The catch is … if we arrived at that point 1+2
at all.
Another problem we’d encounter with foldl
is that for very large lists, we’d have to push many, many values to the stack before we reach the innermost expression 1+2
.
This can lead to a stack overflow before we even get to 1+2
.
That means two things:
 not only did Haskell eat all our RAM and thereby made our computer very sluggish while building the thunk chain in the first place
 when we traverse the thunk chain from its end back to its beginning in order to evaluate/reduce the chain, we sometimes have to save so many items for later that the entire Haskell program crashes due to a stack overflow
As we now know, foldl'
is much better than the regular foldl
.
So much so, that in reality, the decision is not between foldr
and foldl
but between foldr
and foldl'
.
We usually don’t use foldl
at all for this reason.
However, foldl'
also has its disadvantage.
Lazy evaluation is one of Haskell’s most powerful and elegant features, and foldl'
can’t make use of it.
However … while Haskell is extremely smart and while lazy evaluation is an extremely smart feature, in some very rare cases Haskell can be too smart for its own good.
This is one of those rare cases.
With foldl'
we tell Haskell to explicitly not use lazy evaluation.
If you want to know whether it’s bad to give up lazy evaluation or what the implications are, check out this site.
Example: let’s define a weird multiplication operator and call it ?
.
_ ? 0 = 0
x ? y = x*y
The important takeaway is that
undefined ? 0
returns0
; if the second operand is0
, we return0
, no matter what0 ? undefined
hits an exception; you can’t multiply0
withundefined
Now consider
foldl' (?) 1 [2,3,undefined,5,0]
== foldl' (?) 6 [undefined,5,0]
This leads to an exception, since the next step would be to multiply 6
with undefined
.
Whereas the regular foldl
wouldn’t even try to evaluate the undefined value and would just skip it until the entire thunk chain is built, or rather, until the time has come that Haskell absolutely needs to evaluate undefined
(that’s the power of lazy evaluation).
foldl (?) 1 [2,3,undefined,5,0]
== ((((1 ? 2) ? 3) ? undefined) ? 5) ? 0
As soon as the complete thunk chain is established, we can start reducing it.
And the very first step in this reduction is to apply our operator ?
to something and 0
.
Since the second operand of ?
is 0
, we return 0
.
(Notice that while with the addition, we had to push 0
to the stack and save it for later when we knew what that something was, with ?
we don’t need to push 0
to the stack. The implementation of ?
is clear: when the second operand is 0
, return 0
. We don’t need to first know what that something on the left side of ?
is.)
Therefore, if you thought of some clever trick that needs to make use of lazy evaluation in order to work, use foldl
, otherwise use foldl'
.
foldl and foldr
Well, this was a long prelude (Ha! See what I did there?) to foldl
and foldr
.
We wanted to talk Haskell, not Python.
We wanted to talk the difference between foldl
and foldr
.
You would surely like to hear easy ways to memorize their differences.
And we will do that.
Nonetheless, I think now that we laid the groundwork and have a very basic understanding of foldl
, we will have a much easier time grasping the things to come.
Differences Between foldl and foldr
How exactly does foldl
in Haskell work?
In Python’s reduce
, you can provide an optional initializer
value.
In Haskell, this initial value is nonoptional.
In Python, when you provide an initializer
, e.g., 100
, the result
variable will not be initialized with the first list item but with that initial value.
The call reduce(add_two_values, [1,2,3,4,5], 100)
would return 115
instead of 15
.
Similarly, reduce(add_two_values, [1,2,3,4,5], 0)
is equivalent to reduce(add_two_values, [1,2,3,4,5])
, i.e., to omitting the initial value.
In Haskell, being a functional language by design, we don’t need a lambda expression to express the function add_two_values
.
We can simply use the +
operator.
Therefore, in Haskell notation, the equivalent to our sum
implementation written in Python as reduce(add_two_values, [1,2,3,4,5])
(apart from that foldl
vs foldl'
thing) would be:
foldl (+) 0 [1,2,3,4,5]
Now, what’s the difference between this and foldr
, i.e.
foldr (+) 0 [1,2,3,4,5]
They both calculate 15
.
And since it’s only addition which is an associative and commutative operation, in this case it makes absolutely no matter whether we use foldl
or foldr
.
However, there are functions, e.g., subtraction or division, where the distinction between foldl
and foldr
is crucial.
More on that in a moment.
Here’s one very simple way to memorize the difference between foldl
and foldr
.
My professor would most certainly not approve of it, but between you and me it’s okay to be a little informal and toss the formalities aside to simplify things.
With that being said, imagine it this way:
With foldl
you go through the list from left to right.
With foldr
you come at the list from the right side, but you nevertheless write it down as if you were going lefttoright.
In both cases, the initial value is the first element.
This gives you the following ordering:
foldl: 0→1→2→3→4
foldr: 1←2←3←4←0
Now put brackets around those two numbers you read first. When you then treat this bracketed expression as one single value, you can think of it as always bracketing the first two values. You will then get this:
foldl: ((((0→1)→2)→3)→4)
foldr: (1←(2←(3←(4←0))))
Now replace the arrow with your function and you have the infix notation (e.g., 5 `mod` 2
rather than mod 5 2
; similarly, 5 + 2
rather than (+) 5 2
).
For example:
foldl (+) 0 [1,2,3,4] == ((((0+1)+2)+3)+4)
foldr (+) 0 [1,2,3,4] == (1+(2+(3+(4+0))))
Why the Differentiation Is So Important
Now, why and when is this important? As I said, for simple calculations that are associative as well as commutative such as addition or multiplication, there’s no mathematical difference. The result is the same, regardless whether we use lefttoright associativity or righttoleft associativity (and reposition the initial value). However, consider the subtraction or, in this case, the division:
foldl (/) 0 [1,2,3,4] == ((((0 / 1) / 2) / 3) / 4)
Nothing surprising here, foldl
returns 0.0
.
Now comes the interesting part:
foldr (/) 0 [1,2,3,4] == (1 / (2 / (3 / (4 / 0))))
Division by zero! 😱
What’s even more surprising: this doesn’t cause an error.
Instead, the result of 4/0
is treated as Infinity
.
And therefore, the result of that function call depends on the length of the list.
Wow!
The above foldr
with an even number of list elements returns 0.0
whereas foldr
with an odd number of list elements returns Infinity
… two completely opposite results!
A Simple Reordering
The infix notation is familiar.
We’re used to it since first grade.
Unfortunately, functions in Haskell typically don’t use the infix notation but the prefix notation.
So let’s rewrite the notation in the above example evaluations to make it more general and align it with how code would look like in production.
To keep it clear, let’s also replace (+)
with addTwoValues
or, more generally, f
.
Otherwise, the many (
and )
characters introduced by (+)
would only unnecessarily add visual clutter.
foldl f 0 [1,2,3,4] == (f (f (f (f 0 1) 2) 3) 4)
foldr f 0 [1,2,3,4] == (f 1 (f 2 (f 3 (f 4 0))))
This basically really only is a reordering of the above.
How to Memorize foldl and foldr
Now that we have these examples clear before our eyes, we can ponder an alternative way to memorize folding/reducing which is probably more formally correct and would more likely be accepted by my professor.
foldl
With foldl
, you walk through the list from left to right.
 The left argument is the result of the previous iteration. In the first iteration, it’s the initial value.
 The right argument is the current list element.
The result of the current iteration becomes the left argument of the next iteration.
In our example, we start with the leftmost list element, i.e., 1
.
The “result” of the (nonexisting) iteration prior to the first iteration is the initial value, i.e., 0
.
This means:
f 0 1
becomes
f (f 0 1) 2
becomes
f (f (f 0 1) 2) 3
becomes
f (f (f (f 0 1) 2) 3) 4
where f
here equals to (+)
.
foldr
With foldr
, you walk through the list from right to left.
 The left argument is the current list element.
 The right argument is the result of the previous iteration. In the first iteration, it’s the initial value.
The result of the current iteration becomes the right argument of the next iteration.
We start with the rightmost list element, i.e., 4
.
The “result” of the (nonexisting) iteration prior to the first iteration is the initial value, i.e., 0
.
This means:
f 4 0
becomes
f 3 (f 4 0)
becomes
f 2 (f 3 (f 4 0))
becomes
f 2 (f 3 (f 4 0))
becomes
f 1 (f 2 (f 3 (f 4 0)))
where f
here equals to (+)
.
Verification
If we compare this approach to the other approach presented earlier, we can see that the resulting structures of the evaluated expressions are identical.
Final Words
I hope you now have a better understanding of foldl
and foldr
in Haskell.
If you could learn something from this article, consider showing some appreciation for the time I invested to write this article.
Leave a comment and tell me what you think, or contact me on Twitter.
If you follow me, you will also be notified when I publish my next guide.
Thanks, and take care.

More precisely, not just lists but any data structure of the type
Foldable
(see documentation). ↩︎