Back

Haskell’s foldl and foldr Explained

If you want to (or have to) learn Haskell, you can’t avoid the two functions foldl and foldr. They are two of the most powerful tools in Haskell’s toolbox. However, they continue to confuse beginners and experts alike, so here’s my take on making these functions a little more accessible.

What do foldl and foldr do? They both fold, i.e., reduce, a list to a single value, albeit in a different way.1 Using foldl and/or foldr, you can implement every other function. In fact, many functions use foldl or foldr internally—that’s how they are implemented in the first place! Both foldl and foldr are so mighty, you could merely think of them as extremely powerful low-level functions used by those functions you normally use. You could wonder why you should bother understanding low-level code and learning the interna of your high-level abstractions, when you can also use the high-level abstractions themselves. But that would do those functions injustice. When you really understand foldl and foldr, you can use these mighty functions to implement your own high-level abstractions and harness their power to make your Haskell code so much better and faster.

Disclaimer: I myself am only just beginning to learn Haskell, so there’s no guarantee that everything I state is actually correct.

Comparison Between Haskell and Python

If you know Python, Python’s reduce function would be the analogon to a slight variation of Haskell’s foldl. I will now briefly go off at a tangent because I presume that you a more proficient in Python than you are in Haskell. I figure that laying a groundwork first and building upon it will help you understand Haskell better. If you’re not interested in this tangent, skip the rest of this section and jump directly to this section.

Let’s examine an example. In Python, the function call sum([1,2,3,4,5]) of course calculates the value 15. You could achieve the same goal writing the Python code

reduce(lambda x, y: x+y, [1,2,3,4,5])

instead, which calculates ((((1+2)+3)+4)+5) which is also 15. “Hold on a minute. How exactly does this work?” First, lambda x, y: x+y creates a function that simply adds two values x and y. Let’s name this function add_two_values. Now we provide this function add_two_values as an argument to reduce, i.e., we call reduce(add_two_values, [1,2,3,4,5]). This is what’s known as higher-order functions. One function serves as parameter of a different function. After all, we could also want to calculate the product 1*2*3*4*5=120 and provide reduce with a self-defined function named multiply_two_values. In this case, however, we want to calculate the sum and thus provide our function add_two_values. According to the documentation, internally, a variable—let’s call it result—would be initiated with the first list item, i.e., result=1. Then, in a for-loop we would add list item for list item, i.e., 1+2 equals 3. Then 3+3 equals 6, and so on, until we return result=15. This is almost like foldl in Haskell works.

foldl and foldl'

Why do I keep saying almost and slight variation? Because foldl doesn’t immediately evaluate the intermediate results, i.e., it doesn’t add 1+2 to 3 and then 3+3, et cetera, like Python’s reduce does. Instead, foldl keeps 1+2 as 1+2. Adding 3 to that would then generate (1+2)+3 instead of 3+3. The foldl function builds the so-called thunk chain until we get ((((1+2)+3)+4)+5) (instead of (10+5)). This is because of a thing in Haskell called lazy evaluation. And in contrast to Python’s reduce function, foldl first has to build the entire thunk chain before it can start reducing that thunk chain. This means that we have to allocate memory on the heap for all elements of the list until the thunk chain is finally established. Only then can we start reducing the thunk chain. Whereas Python’s reduce and Haskell’s foldl' use an accumulator in order to immediately reduce these expressions. More on that in a second. Haskell’s equivalent to the way Python evaluates that function call would be foldl' (notice the apostrophe). In Haskell, we use the variable name together with an ' at the end when the implementation is so similar and only such a slight variation, that this new function doesn’t deserve a name of its own. In Haskell, foldl' is way more efficient than foldl because you don’t have to first build up a huge thunk chain before you can finally start reducing the expression. As I said, with foldl you have to first allocate memory on the heap for every single list item until you have the finished thunk chain. The heap is limited only by the amount of RAM your computer has available and the swap. Basically, for large lists Haskell will eat all your RAM and for very large lists, your computer will become very slow because the content of your exhausted RAM has to be shoveled to the relatively slow SSD. Only when the thunk chain is built in its entirety can you start reducing:

((((1+2)+3)+4)+5)

But wait! It get’s worse. The foldl function now has to work its way from the outside to the inside again. That is, before you can reduce something something something plus 5, you first have to know what that something is. To do that, Haskell has to push the value 5 to the stack, i.e., save that value for later, and examine what that unknown something is. Now it’s the same all over again, only this time with the value 4. That 4 too gets pushed to the stack. We repeat that same procedure and push every value we encounter to the stack for the time being until we finally arrive at 1+2. Finally the + operator now has two known operands instead of an unknown something plus a known value. Finally the + operator can do it’s job and actually calculate a result. Once we’ve arrived at this point, we can walk forward again, retrieve those values saved on the stack as we go and finally determine the ultimate result, in the same fashion reduce and foldl' would have done it from the getgo.

((((1+2)+3)+4)+5)

becomes

(((3+3)+4)+5)

becomes

((6+4)+5)

becomes

(10+5)

becomes

(15)

The catch is … if we arrived at that point 1+2 at all. Another problem we’d encounter with foldl is that for very large lists, we’d have to push many, many values to the stack before we reach the innermost expression 1+2. This can lead to a stack overflow before we even get to 1+2. That means two things:

  • not only did Haskell eat all our RAM and thereby made our computer very sluggish while building the thunk chain in the first place
  • when we traverse the thunk chain from its end back to its beginning in order to evaluate/reduce the chain, we sometimes have to save so many items for later that the entire Haskell program crashes due to a stack overflow

As we now know, foldl' is much better than the regular foldl. So much so, that in reality, the decision is not between foldr and foldl but between foldr and foldl'. We usually don’t use foldl at all for this reason. However, foldl' also has its disadvantage. Lazy evaluation is one of Haskell’s most powerful and elegant features, and foldl' can’t make use of it. However … while Haskell is extremely smart and while lazy evaluation is an extremely smart feature, in some very rare cases Haskell can be too smart for its own good. This is one of those rare cases. With foldl' we tell Haskell to explicitly not use lazy evaluation.

If you want to know whether it’s bad to give up lazy evaluation or what the implications are, check out this site. Example: let’s define a weird multiplication operator and call it ?.

_ ? 0 = 0
x ? y = x*y

The important takeaway is that

  • undefined ? 0 returns 0; if the second operand is 0, we return 0, no matter what
  • 0 ? undefined hits an exception; you can’t multiply 0 with undefined

Now consider

foldl' (?) 1 [2,3,undefined,5,0]
== foldl' (?) 6 [undefined,5,0]

This leads to an exception, since the next step would be to multiply 6 with undefined. Whereas the regular foldl wouldn’t even try to evaluate the undefined value and would just skip it until the entire thunk chain is built, or rather, until the time has come that Haskell absolutely needs to evaluate undefined (that’s the power of lazy evaluation).

foldl (?) 1 [2,3,undefined,5,0]
== ((((1 ? 2) ? 3) ? undefined) ? 5) ? 0

As soon as the complete thunk chain is established, we can start reducing it. And the very first step in this reduction is to apply our operator ? to something and 0. Since the second operand of ? is 0, we return 0. (Notice that while with the addition, we had to push 0 to the stack and save it for later when we knew what that something was, with ? we don’t need to push 0 to the stack. The implementation of ? is clear: when the second operand is 0, return 0. We don’t need to first know what that something on the left side of ? is.)

Therefore, if you thought of some clever trick that needs to make use of lazy evaluation in order to work, use foldl, otherwise use foldl'.

foldl and foldr

Well, this was a long prelude (Ha! See what I did there?) to foldl and foldr. We wanted to talk Haskell, not Python. We wanted to talk the difference between foldl and foldr. You would surely like to hear easy ways to memorize their differences. And we will do that. Nonetheless, I think now that we laid the groundwork and have a very basic understanding of foldl, we will have a much easier time grasping the things to come.

Differences Between foldl and foldr

How exactly does foldl in Haskell work? In Python’s reduce, you can provide an optional initializer value. In Haskell, this initial value is non-optional. In Python, when you provide an initializer, e.g., 100, the result variable will not be initialized with the first list item but with that initial value. The call reduce(add_two_values, [1,2,3,4,5], 100) would return 115 instead of 15. Similarly, reduce(add_two_values, [1,2,3,4,5], 0) is equivalent to reduce(add_two_values, [1,2,3,4,5]), i.e., to omitting the initial value. In Haskell, being a functional language by design, we don’t need a lambda expression to express the function add_two_values. We can simply use the + operator. Therefore, in Haskell notation, the equivalent to our sum implementation written in Python as reduce(add_two_values, [1,2,3,4,5]) (apart from that foldl vs foldl' thing) would be:

foldl (+) 0 [1,2,3,4,5]

Now, what’s the difference between this and foldr, i.e.

foldr (+) 0 [1,2,3,4,5]

They both calculate 15. And since it’s only addition which is an associative and commutative operation, in this case it makes absolutely no matter whether we use foldl or foldr. However, there are functions, e.g., subtraction or division, where the distinction between foldl and foldr is crucial. More on that in a moment.

Here’s one very simple way to memorize the difference between foldl and foldr. My professor would most certainly not approve of it, but between you and me it’s okay to be a little informal and toss the formalities aside to simplify things. With that being said, imagine it this way:
With foldl you go through the list from left to right. With foldr you come at the list from the right side, but you nevertheless write it down as if you were going left-to-right. In both cases, the initial value is the first element. This gives you the following ordering:

foldl: 0→1→2→3→4

foldr: 1←2←3←4←0

Now put brackets around those two numbers you read first. When you then treat this bracketed expression as one single value, you can think of it as always bracketing the first two values. You will then get this:

foldl: ((((0→1)→2)→3)→4)

foldr: (1←(2←(3←(4←0))))

Now replace the arrow with your function and you have the infix notation (e.g., 5 `mod` 2 rather than mod 5 2; similarly, 5 + 2 rather than (+) 5 2). For example:

foldl (+) 0 [1,2,3,4] == ((((0+1)+2)+3)+4)
foldr (+) 0 [1,2,3,4] == (1+(2+(3+(4+0))))

Why the Differentiation Is So Important

Now, why and when is this important? As I said, for simple calculations that are associative as well as commutative such as addition or multiplication, there’s no mathematical difference. The result is the same, regardless whether we use left-to-right associativity or right-to-left associativity (and reposition the initial value). However, consider the subtraction or, in this case, the division:

foldl (/) 0 [1,2,3,4] == ((((0 / 1) / 2) / 3) / 4)

Nothing surprising here, foldl returns 0.0. Now comes the interesting part:

foldr (/) 0 [1,2,3,4] == (1 / (2 / (3 / (4 / 0))))

Division by zero! 😱 What’s even more surprising: this doesn’t cause an error. Instead, the result of 4/0 is treated as Infinity. And therefore, the result of that function call depends on the length of the list. Wow! The above foldr with an even number of list elements returns 0.0 whereas foldr with an odd number of list elements returns Infinity … two completely opposite results!

A Simple Reordering

The infix notation is familiar. We’re used to it since first grade. Unfortunately, functions in Haskell typically don’t use the infix notation but the prefix notation. So let’s rewrite the notation in the above example evaluations to make it more general and align it with how code would look like in production. To keep it clear, let’s also replace (+) with addTwoValues or, more generally, f. Otherwise, the many ( and ) characters introduced by (+) would only unnecessarily add visual clutter.

foldl f 0 [1,2,3,4] == (f (f (f (f 0 1) 2) 3) 4)
foldr f 0 [1,2,3,4] == (f 1 (f 2 (f 3 (f 4 0))))

This basically really only is a reordering of the above.

How to Memorize foldl and foldr

Now that we have these examples clear before our eyes, we can ponder an alternative way to memorize folding/reducing which is probably more formally correct and would more likely be accepted by my professor.

foldl

With foldl, you walk through the list from left to right.

  • The left argument is the result of the previous iteration. In the first iteration, it’s the initial value.
  • The right argument is the current list element.

The result of the current iteration becomes the left argument of the next iteration.

In our example, we start with the left-most list element, i.e., 1. The “result” of the (non-existing) iteration prior to the first iteration is the initial value, i.e., 0. This means:

f 0 1

becomes

f (f 0 1) 2

becomes

f (f (f 0 1) 2) 3

becomes

f (f (f (f 0 1) 2) 3) 4

where f here equals to (+).

foldr

With foldr, you walk through the list from right to left.

  • The left argument is the current list element.
  • The right argument is the result of the previous iteration. In the first iteration, it’s the initial value.

The result of the current iteration becomes the right argument of the next iteration.

We start with the right-most list element, i.e., 4. The “result” of the (non-existing) iteration prior to the first iteration is the initial value, i.e., 0. This means:

f 4 0

becomes

f 3 (f 4 0)

becomes

f 2 (f 3 (f 4 0))

becomes

f 2 (f 3 (f 4 0))

becomes

f 1 (f 2 (f 3 (f 4 0)))

where f here equals to (+).

Verification

If we compare this approach to the other approach presented earlier, we can see that the resulting structures of the evaluated expressions are identical.

Final Words

I hope you now have a better understanding of foldl and foldr in Haskell. If you could learn something from this article, consider showing some appreciation for the time I invested to write this article. Leave a comment and tell me what you think, or contact me on Twitter. If you follow me, you will also be notified when I publish my next guide. Thanks, and take care.


  1. More precisely, not just lists but any data structure of the type Foldable (see documentation). ↩︎

As an Amazon Associate I earn from qualifying purchases.
Built with Hugo
Theme Stack designed by Jimmy