Introduction
This article investigates how Pattern Match compiles under the hood in a number of simple common scenarios.
Background
A
pattern match is compiled to an efficient automaton. There are two styles
of automaton, "decision tree" and the "French style." Each
style offers different advantages: the decision tree inspects the minimum
number of values needed to make a decision, but a naive implementation may require
exponential code space in the worst case. The French style offers a different
time-space trade-off, with good but not optimal guarantees for both.
The
absolutely definitive work on this problem is Luc Maranget's excellent paper
"Compiling
Pattern Matching to Good Decisions Trees"from the 2008 ML Workshop.
Luc's paper basically shows how to get the best of both worlds.
However
here we are not going to touch the deep end of the theory. So for a working
programmer's point of view, is it valid that we assume the compiler is always
smart? Would the programmers' coding styles affect the compiler's ability to
detect redundant branches and resolve to an optimal approach at
all?
Source Code
You can view the source code here.
Test 1
F#
let test1 x =
match x with
| [| 1; 2; 3 |] -> A
| [| 1; 2; _ |] -> A
| [| 1; _; _ |] -> A
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
return Program.MyType.A;
}
break;
default:
return Program.MyType.A;
}
break;
}
}
throw new MatchFailureException(...);
Decompiled IL
Code size 107
Conclusion
- Pattern Match doesn't optimize based on the values after
->
. - Pattern Match is able to find the optimized approach for array decomposition under conclusion 1.
- Incomplete pattern matches always throw exceptions, so there is no harm to add a wildcard to catch the missing patterns and throw exceptions explicitly.
Test 1a & 1b
F#
let test1a x =
match x with
| [| 1; 2; 3 |] | [| 1; 2; _ |] | [| 1; _; _ |] -> A
let test1b x =
match x with
| [| 4; 5; 6 |] & [| 4; 5; _ |] & [| 4; _; _ |] -> A
Decompiled C#
public static Program.MyType test1a(int[] x)
{
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
}
break;
}
return Program.MyType.A;
}
}
throw new MatchFailureException(...);
}
public static Program.MyType test1b(int[] x)
{
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 4:
switch (x[1])
{
case 5:
switch (x[2])
{
case 6:
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 4:
switch (x[1])
{
case 5:
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 4:
return Program.MyType.A;
}
}
break;
}
break;
}
}
break;
}
break;
}
break;
}
}
throw new MatchFailureException(...);
}
Conclusion
- The compiler is unable to
check completeness / duplicity within And / Or Patterns, so it fails to
find the optimal approach.
Test 2
F#
let test2 x =
match x with
| [| 1; 2; 3 |] -> A
| [| _; 2; 3 |] -> B
| [| _; _; 3 |] -> C
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
goto IL_49;
}
break;
default:
switch (x[2])
{
case 3:
break;
default:
goto IL_49;
}
break;
}
break;
default:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.B;
default:
goto IL_49;
}
break;
default:
switch (x[2])
{
case 3:
goto IL_58;
}
goto IL_49;
}
break;
}
IL_58:
return Program.MyType.C;
}
IL_49:
throw new MatchFailureException(...);
Decompiled IL
Code size 185
Conclusion
- Pattern Match checks values from the beginning of an array to end. So it fails to find the optimal approach.
- Code size is 2x as much as an optimal one.
Test 3
F#
let test3 x =
match x with
| [| 1; 2; 3 |] -> A
| [| 1; 2; a |] when a <> 3 -> B
| [| 1; 2; _ |] -> C
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
if (x[2] != 3)
{
int a = x[2];
return Program.MyType.B;
}
break;
}
break;
}
break;
}
}
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
return Program.MyType.C;
}
break;
}
}
throw new MatchFailureException(...);
Conclusion
- The compiler isn't smart enough to see through Guard to check completeness / duplicity.
- Guard makes Pattern Match produce weird unoptimized code.
Test 4
F#
let (| Is3 | IsNot3 |) x =
if x = 3 then Is3 else IsNot3
let test4 x =
match x with
| [| 1; 2; 3 |] -> A
| [| 1; 2; Is3 |] -> B
| [| 1; 2; IsNot3 |] -> C
| [| 1; 2; _ |] -> D
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
{
FSharpChoice<Unit, Unit> fSharpChoice = Program.|Is3|IsNot3|(x[2]);
if (fSharpChoice is FSharpChoice<Unit, Unit>.Choice2Of2)
{
return Program.MyType.C;
}
return Program.MyType.B;
}
}
break;
}
break;
}
}
throw new MatchFailureException(...);
Conclusion
- Multiple cases Active Patterns compile to
FSharpChoice
. - The compiler is able to check completeness / duplicity of active patterns, however it cannot compare them with normal patterns.
- Unreachable patterns are not compiled.
Test 5
F#
let (| Equal3 |) x =
if x = 3 then Equal3 1 else Equal3 0
let test5 x =
match x with
| [| 1; 2; 3 |] -> A
| [| 1; 2; Equal3 0 |] -> B
| [| 1; 2; Equal3 1 |] -> C
| [| 1; 2; _ |] -> D
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
{
int num = x[2];
switch ((num != 3) ? 0 : 1)
{
case 0:
return Program.MyType.B;
case 1:
return Program.MyType.C;
default:
return Program.MyType.D;
}
break;
}
}
break;
}
break;
}
}
throw new MatchFailureException(...);
Conclusion
- Single case Active Patterns compile to the return type.
- The compiler sometimes auto
inline
the function. (Surprise!)
Test 6
F#
let (| Partial3 | _ |) x =
if x = 3 then Some (Partial3 true) else None
let test6 x =
match x with
| [| 1; 2; 3 |] -> A
| [| 1; 2; Partial3 true |] -> B
| [| 1; 2; Partial3 true |] -> C
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
{
FSharpOption<bool> fSharpOption = Program.|Partial3|_|(x[2]);
if (fSharpOption != null && fSharpOption.Value)
{
return Program.MyType.B;
}
break;
}
}
break;
}
break;
}
}
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
{
FSharpOption<bool> fSharpOption = Program.|Partial3|_|(x[2]);
if (fSharpOption != null && fSharpOption.Value)
{
return Program.MyType.C;
}
break;
}
}
break;
}
}
throw new MatchFailureException(...);
Conclusion
- Partial Active Patterns compile to
FSharpOption
. - The compiler is unable to check completeness / duplicity of partial active patterns.
Test 7
F#
type MyOne =
| AA
| BB of int
| CC
type MyAnother =
| AAA
| BBB of int
| CCC
| DDD
let test7a x =
match x with
| AA -> 2
let test7b x =
match x with
| AAA -> 2
Decompiled C#
public static int test7a(Program.MyOne x)
{
if (x is Program.MyOne._AA)
{
return 2;
}
throw new MatchFailureException(...);
}
public static int test7b(Program.MyAnother x)
{
if (x.Tag == 0)
{
return 2;
}
throw new MatchFailureException(...);
}
Conclusion
- If there are more than 3 cases in the union, Pattern Match would use
Tag
property instead of is
. (It also applies to Multiple cases Active Patterns.) - Often a Pattern Match would result in multiple
is
, which could degenerate performance greatly.
Change History
- 03/Jan/2013 - First Edition
- 04/Jan/2013 - Added Test 1a & 1b