Optimize references in closures in F#

Stefan Savev 2

5.00/5 (2 votes)

21 Dec 2010CPOL2 min read

13.1K

Shows how to encode a closure as a class where the references are stored as mutable fields, thus removing a level of indirection

This quick tip shows how to encode a closure as a class in the F# programming language. The advantage is that references are stored as mutable fields in the class, thus removing a level of indirection. The disadvantage is that an F# closure/function is created indirectly with more work than necessary. If the closure uses a large number of references (e.g. more than 4) which are accessed a large number of times
(e.g. at least a few billion times), the described approach can improve the speed of the closure application.

let makeGen () =
    let i = ref 0L
    fun () -> i := !i + 1L
              !i
let g = makeGen ()
g() //returns 1L
g() //returns 2L
g() //returns 3L

Using Red Gates' Reflector we can see how that the F# closure is compiled as a .NET class:

[Serializable]
internal class makeGen@4 : FSharpFunc<Unit, long>
{
    // Fields
    public FSharpRef<long> i;
    // Methods
    internal makeGen@4(FSharpRef<long> i)
    {
        this.i = i;
    }
    public override long Invoke(Unit unitVar0)
    {
        this.i.contents += 1L;
        return this.i.contents;
    }
}

F# reference is implemented as a class that wraps a value and allows view and update of that value:

public sealed class FSharpRef<T> //:Omitted code
{
    // Fields
    public T contents;
    // Methods
    public FSharpRef(T contents)
    {
        this.contents = contents;
    }
    //Omitted code
    public T Value
    {
        get
        {
            return this.contents;
        }
        set
        {
            this.contents = value;
        }
    }
}

As we can see, the int64 value is updated through an extra level of indirection.

Following is a way to remove the indirection:

//convert FSharp<'a,'b> to 'a -> 'b  
let inline toFunc<'a,'b,'c when 'c :> FSharpFunc<'a,'b>> (x: 'c ): 
                ('a -> 'b) = 
    (# "" x : 'a -> 'b #)

//implement an F# function as a class
[<Class>]
type EfficientGen =
    val mutable i: int64
    new () = {i = 0L}
    inherit FSharpFunc<unit,int64> with
    override  this.Invoke(_) =  this.i <- this.i + 1L
                                this.i
let makeGen' () = EfficientGen() |> toFunc

Decompiling the above shown F# code with Reflector, we can make sure
that the extra indirection due to the use of the reference is removed.

public class EfficientGen : FSharpFunc<unit,long>
{
    // Fields
    public long i = 0L;

    // Methods
    public override long Invoke(Unit _arg1)
    {
        this.i += 1L;
        return this.i;
    }
}

When does it matter?
From my experiments the described transformation will have a noticeable effect if the closure uses a large number of references (for example more than 4) and is called an extremely large number of times (see below the code for the experiments).

module InEfficient = 
    let makeGen () = 
        let i = ref 0L
        let j = ref 0L
        let k = ref 0L
        let l = ref 0L
        fun () -> i := !i + 1L
                  j := !j + 1L
                  k := !k + 1L
                  l := !l + 1L
                  !i

module Efficient = 
    //convert FSharp<'a,'b> to 'a -> 'b
    let inline toFunc<'a,'b,'c when 'c :> FSharpFunc<'a,'b>> (x: 'c ): ('a -> 'b) = (# "" x : 'a -> 'b #)

[<Class>]
type EfficientGen =
    val mutable i: int64
    val mutable j: int64
    val mutable k: int64
    val mutable l: int64
    new () = {i = 0L; j = 0L;k = 0L;l = 0L}
    inherit FSharpFunc<unit,int64> with
    override  this.Invoke(_) =  this.i <- this.i + 1L
                                this.j <- this.j + 1L
                                this.k <- this.k + 1L
                                this.l <- this.l + 1L
                                this.i
let makeGen () = EfficientGen() |> toFunc

For example, I compared the above two versions by calling each generator 1,000,000,000 times and I got 05.85 seconds for the F# closure, and 03.76 seconds for the closure implemented as a class. The overhead of running an empty loop with a function call was about 02.8. So, the F# closure ran 55% slower than the class closure (including testing overhead).

This is due to the large number of references (four) that I used.
I did not see any noticeable improvement for the original example with one reference. If one captures object variables in the closure instead of references, there will be no improvement to F#'s automatic closure.

Additionally, my results depend on the .NET runtime. For the 32 bit .NET runtime, I got 15.27 seconds vs. 09.89 seconds, while for the 64 bit runtime, I got 05.85 vs. 03.76.

Conclusion

Closures are handy because they automatically capture variables outside the direct scope of the function. However, if we want to capture a mutable primitive value such as an integer, a float, etc. for the purpose of updating, we have to take penalty of one indirect access.
If the number of captured variables is large, that penalty can be substantial. In such cases, it might make sense to manually encode the closure as a class while keeping the F# function interface the same.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)