Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / F#

A Simple Overview on How Pattern Match Compiles

5.00/5 (4 votes)
4 Jan 2013CPOL2 min read 24K  
This article investigates how Pattern Match compiles under the hood in a number of simple common scenarios.

Introduction  

This article investigates how Pattern Match compiles under the hood in a number of simple common scenarios.  

Background  

A pattern match is compiled to an efficient automaton. There are two styles of automaton, "decision tree" and the "French style." Each style offers different advantages: the decision tree inspects the minimum number of values needed to make a decision, but a naive implementation may require exponential code space in the worst case. The French style offers a different time-space trade-off, with good but not optimal guarantees for both. 

The absolutely definitive work on this problem is Luc Maranget's excellent paper "Compiling Pattern Matching to Good Decisions Trees"from the 2008 ML Workshop. Luc's paper basically shows how to get the best of both worlds. 

However here we are not going to touch the deep end of the theory. So for a working programmer's point of view, is it valid that we assume the compiler is always smart? Would the programmers' coding styles affect the compiler's ability to detect redundant branches and resolve to an optimal approach at all? 

Source Code 

You can view the source code here

Test 1 

F# 

F#
let test1 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| 1; 2; _ |] -> A
    | [| 1; _; _ |] -> A

Decompiled C# 

C#
if (x != null && x.Length == 3)
{
    switch (x[0])
    {
    case 1:
        switch (x[1])
        {
        case 2:
            switch (x[2])
            {
            case 3:
                return Program.MyType.A;
            default:
                return Program.MyType.A;
            }
            break;
        default:
            return Program.MyType.A;
        }
        break;
    }
}
throw new MatchFailureException(...);

Decompiled IL 

Code size 107 

Conclusion 

  1. Pattern Match doesn't optimize based on the values after ->.
  2. Pattern Match is able to find the optimized approach for array decomposition under conclusion 1.
  3. Incomplete pattern matches always throw exceptions, so there is no harm to add a wildcard to catch the missing patterns and throw exceptions explicitly.  

Test 1a & 1b 

F# 

F#
let test1a x =
    match x with
    | [| 1; 2; 3 |] | [| 1; 2; _ |] | [| 1; _; _ |] -> A // redundancy
		
let test1b x =
    match x with
    | [| 4; 5; 6 |] & [| 4; 5; _ |] & [| 4; _; _ |] -> A // redundancy

Decompiled C# 

C#
public static Program.MyType test1a(int[] x)
{
	if (x != null && x.Length == 3)
	{
		switch (x[0])
		{
		case 1:
			switch (x[1])
			{
			case 2:
				switch (x[2])
				{
				}
				break;
			}
			return Program.MyType.A;
		}
	}
	throw new MatchFailureException(...);
}

public static Program.MyType test1b(int[] x)
{
	if (x != null && x.Length == 3)
	{
		switch (x[0])
		{
		case 4:
			switch (x[1])
			{
			case 5:
				switch (x[2])
				{
				case 6:
					if (x != null && x.Length == 3)
					{
						switch (x[0])
						{
						case 4:
							switch (x[1])
							{
							case 5:
								if (x != null && x.Length == 3)
								{
									switch (x[0])
									{
									case 4:
										return Program.MyType.A;
									}
								}
								break;
							}
							break;
						}
					}
					break;
				}
				break;
			}
			break;
		}
	}
	throw new MatchFailureException(...);
}   

Conclusion     

  1. The compiler is unable to check completeness / duplicity within And / Or Patterns, so it fails to find the optimal approach.  

Test 2  

F# 

F#
let test2 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| _; 2; 3 |] -> B
    | [| _; _; 3 |] -> C 

Decompiled C#  

C#
if (x != null && x.Length == 3)
{
    switch (x[0]) 
    {
    case 1:
        switch (x[1])
        {
        case 2:
            switch (x[2])
            {
            case 3:
                return Program.MyType.A;
            default:
                goto IL_49;
            }
            break;
        default:
            switch (x[2])
            {
            case 3:
                break;
            default:
                goto IL_49;
            }
            break; 
        }
        break; 
    default:
        switch (x[1])
        {
        case 2:
            switch (x[2])
            {
            case 3:
                return Program.MyType.B;
            default:
                goto IL_49;
            }
            break;
        default:
            switch (x[2])
            {
            case 3:
                goto IL_58;
            }
            goto IL_49;
        }
        break;
    }
    IL_58:
    return Program.MyType.C;
}
IL_49:
throw new MatchFailureException(...);  

Decompiled IL 

Code size 185 

Conclusion

  1. Pattern Match checks values from the beginning of an array to end. So it fails to find the optimal approach.
  2. Code size is 2x as much as an optimal one.

Test 3

F# 

F#
let test3 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| 1; 2; a |] when a <> 3 -> B
    | [| 1; 2; _ |] -> C 

Decompiled C#

C#
if (x != null && x.Length == 3)
{
    switch (x[0])
    {
    case 1: 
        switch (x[1])
        {
        case 2:
            switch (x[2])
            {
            case 3:
                return Program.MyType.A;
            default:
                if (x[2] != 3)
                {
                    int a = x[2];
                    return Program.MyType.B;
                }
                break;
            }
            break;
        }
        break;
    }
}
if (x != null && x.Length == 3)
{
    switch (x[0])
    {
    case 1:
        switch (x[1])
        {
        case 2:
            return Program.MyType.C;
        }
        break;
    }
}
throw new MatchFailureException(...);   

Conclusion

  1. The compiler isn't smart enough to see through Guard to check completeness / duplicity.
  2. Guard makes Pattern Match produce weird unoptimized code.

Test 4

F#

F#
let (| Is3 | IsNot3 |) x =
   if x = 3 then Is3 else IsNot3

let test4 x =
    match x with
    | [| 1; 2; 3      |] -> A
    | [| 1; 2; Is3    |] -> B
    | [| 1; 2; IsNot3 |] -> C
    | [| 1; 2; _      |] -> D // This rule will never be matched.

Decompiled C# 

C#
if (x != null && x.Length == 3)
{
    switch (x[0])
    {
    case 1:
        switch (x[1])
        {
        case 2:
            switch (x[2])
            {
            case 3:
                return Program.MyType.A;
            default:
            {
                FSharpChoice<Unit, Unit> fSharpChoice = Program.|Is3|IsNot3|(x[2]);
                if (fSharpChoice is FSharpChoice<Unit, Unit>.Choice2Of2)
                {
                    return Program.MyType.C;
                }
                return Program.MyType.B;
            }
            }
            break;
        }
        break;
    }
}
throw new MatchFailureException(...);

Conclusion

  1. Multiple cases Active Patterns compile to FSharpChoice.
  2. The compiler is able to check completeness / duplicity of active patterns, however it cannot compare them with normal patterns. 
  3. Unreachable patterns are not compiled. 

Test 5

F#

F#
let (| Equal3 |) x =
   if x = 3 then Equal3 1 else Equal3 0 // Equivalent to "then 1 else 0"

let test5 x =
    match x with
    | [| 1; 2; 3        |] -> A
    | [| 1; 2; Equal3 0 |] -> B
    | [| 1; 2; Equal3 1 |] -> C
    | [| 1; 2; _        |] -> D 

Decompiled C#

C#
if (x != null && x.Length == 3)
{
    switch (x[0])
    {
    case 1:
        switch (x[1])
        {
        case 2:
            switch (x[2])
            {
            case 3:
                return Program.MyType.A;
            default:
            {
                int num = x[2];
                switch ((num != 3) ? 0 : 1)
                {
                case 0:
                    return Program.MyType.B;
                case 1:
                    return Program.MyType.C;
                default:
                    return Program.MyType.D;
                }
                break;
            }
            }
            break;
        }
        break;
    }
}
throw new MatchFailureException(...);

Conclusion

  1. Single case Active Patterns compile to the return type. 
  2. The compiler sometimes auto inline the function. (Surprise!) 

Test 6

F# 

F#
let (| Partial3 | _ |) x =
   if x = 3 then Some (Partial3 true) else None // Equivalent to "then Some true"
let test6 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| 1; 2; Partial3 true |] -> B
    | [| 1; 2; Partial3 true |] -> C

Decompiled C#

C#
if (x != null && x.Length == 3)
{
    switch (x[0])
    {
    case 1:
        switch (x[1])
        {
        case 2:
            switch (x[2])
            {
            case 3:
                return Program.MyType.A;
            default:
            {
                FSharpOption<bool> fSharpOption = Program.|Partial3|_|(x[2]);
                if (fSharpOption != null && fSharpOption.Value)
                {
                    return Program.MyType.B;
                }
                break;
            }
            }
            break;
        }
        break;
    }
}
if (x != null && x.Length == 3)
{
    switch (x[0])
    {
    case 1:
        switch (x[1])
        {
        case 2:
        {
            FSharpOption<bool> fSharpOption = Program.|Partial3|_|(x[2]);
            if (fSharpOption != null && fSharpOption.Value)
            {
                return Program.MyType.C;
            }
            break;
        }
        }
        break;
    }
}
throw new MatchFailureException(...); 

Conclusion

  1. Partial Active Patterns compile to FSharpOption.
  2. The compiler is unable to check completeness / duplicity of partial active patterns. 

Test 7 

F#

F#
type MyOne =
    | AA
    | BB of int
    | CC

type MyAnother =
    | AAA
    | BBB of int
    | CCC
    | DDD

let test7a x =
    match x with
    | AA -> 2

let test7b x =
    match x with
    | AAA -> 2  

Decompiled C# 

C#
public static int test7a(Program.MyOne x)
{
    if (x is Program.MyOne._AA)
    {
        return 2;
    }
    throw new MatchFailureException(...);
}

public static int test7b(Program.MyAnother x)
{
    if (x.Tag == 0)
    {
        return 2;
    }
    throw new MatchFailureException(...);
} 

Conclusion  

  1. If there are more than 3 cases in the union, Pattern Match would use Tag property instead of is. (It also applies to Multiple cases Active Patterns.) 
  2. Often a Pattern Match would result in multiple is, which could degenerate performance greatly. 

Change History 

  • 03/Jan/2013 - First Edition
  • 04/Jan/2013 - Added Test 1a & 1b 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)