I have written previously about solving the first problem on the Project Euler website using F# and Haskell.
The problem is to find the sum of the natural numbers that are divisible by 3 or 5 that are less than 1000. To solve this in F#, my first solution looked like this:
[0..999] |> List.filter (fun x -> x % 3 = 0 || x % 5 = 0) |> List.sum
This works. However, it was pointed out to me that using lists to generate the numbers below 1000 in my F# solution was memory hungry as the list will be created and stored in memory then piped to the filter and so on. The list is evaluated eagerly. This isn’t a problem with a small list like this, but it could create issues with more elements.
I took a look at sequences in F#, which are evaluated lazily. To get an idea of what this means, consider the sequence of natural numbers:
let countingNumbers = seq { for i in 1 .. System.Int32.MaxValue do printfn "Yielding: %d" i yield i }
If you are unfamiliar with lazy evaluation, it might look like this will create all the natural numbers up to System.Int32.MaxValue (2147483647) and print each number as it goes. However, if you run this with F# Interactive (aka fsi
), it creates a seq<int>
without printing anything out. At this point, the sequence is yet to be enumerated.
The ingenious thing about sequences is that we can work with them without enumerating them. Consider these lines:
let max = 999 let firstUpToMax = Seq.take max countingNumbers
The Seq.take
function allows us to take the first elements in a sequence. However, running these lines in fsi
does not cause the for
loop in the sequence above to be run. The print and yield statements are not run.
F#’s ability to delay enumeration is not limited to sequences. Queries also allow us to describe a sequence of numbers without enumerating them.
I define the following function that will be used to filter the sequence:
let isDivisibleBy3Or5 x = x % 3 = 0 || x % 5 = 0
and then create this query:
let divisibleBy3Or5 = query { for countingNumber in firstUpToMax do where (isDivisibleBy3Or5 countingNumber) select countingNumber }
Programmers who are familiar with SQL or LINQ should have no problem understanding this query. In plain English, we might say “we will go through the sequence and find the values that match the given criterion.” Note the use of the future tense. Running this code in fsi
does not result in enumeration of our sequence and the printing of the “Yielding…” lines.
I could find the sum of the numbers matching our criteria using a built in function like this:
let seqSum = Seq.sum divisibleBy3Or5
This will cause the sequence to be enumerated and finds the same sum as the list method above.
However, by defining the following function:
let noisySum total next = printfn "Adding %d to %d" next total total + next
I will be able to see the order in which the sequence is enumerated and when the values are pumped down the pipeline to be folded into the sum.
let seqFoldSum = divisibleBy3Or5 |> Seq.fold noisySum 0
At this point we get the following out put at fsi
:
Yielding: 1 Yielding: 2 Yielding: 3 Adding 3 to 0 Yielding: 4 Yielding: 5 Adding 5 to 3 Yielding: 6 ... Yielding: 996 Adding 996 to 231173 Yielding: 997 Yielding: 998 Yielding: 999 Adding 999 to 232169 val seqFoldSum : int = 233168
Which clearly demonstrates the order of execution and that only the numbers that are divisible by 3 or 5 are added to the sum.
Does this make much difference to the memory usage? For this problem, this is not the bottle neck. The integers storing the sum of the numbers overflow before the input lists become large enough to use a lot of memory. However, passing values down the pipeline as they are yielded is somehow more pleasing to this programmer. Also, lazy evaluation means that enumerating the numbers and performing the mathematics do not take place until the result is actually needed. By creating a sequence and query, the programmer can create the code and pass it around the program without doing any heavy computation or using up much memory.
The complete listing of this code can be found at GitHub:
https://github.com/robert-impey/CodingExperiments/blob/master/F%23/Loose/Euler1.fs