Introduction
Deep Copy is not implemented in C#, there is only function Object.MemberwiseClone()
on each object and it creates just a shallow copy. This article brings code of a very fast deep copy function implemented by Expression Trees (Linq.Expressions).
Background
General deep cloning function can be easily implemented by Serialization, with moderate effort by Reflection and with a lot of effort by Expression Trees. Serialization is very slow and may have some additional limitations (e.g. serialization by BinaryFormatter
needs Serializable
attribute, serialization by XmlSerializer
does not maintain circular references). Reflection is 5 times faster but still much slower than manually implemented copy functions. Expression Trees are of the same speed as manually implemented copy functions and therefore are the fastest.
For nice working codes by Serialization and Reflection see section Links to Alternatives at the bottom of the article.
Speed Comparison
Speed comparison was done with the mentioned Serialization and Reflection codes from the section "Links to Alternatives". Used object: ModerateClass
(included in the zip).
Table with exact numbers follows:
Number of Copied Items | 1 | 10 | 100 | 1000 | 10000 | 100000 |
Serialization (ms) | 6 | 7 | 29 | 189 | 1496 | 12969 |
Reflection (ms) | 5 | 5 | 10 | 36 | 242 | 2432 |
Expression Trees (ms) | 19 | 17 | 19 | 20 | 69 | 570 |
Why are Expression Trees So Fast?
Expression Trees enable a developer to create functions and methods generically in runtime. It is closely connected to Reflection and has almost the same possibilities (e.g. hidden members can be accessed or assigned, no public
constructor is needed, etc.). Difference from Reflection is that you have to write much longer code and also that a method made by Expression Trees needs to be compiled before its first use. Compilation in runtime takes some time (which explains why expression trees are slower by a constant for a small number of copied objects). But if program reuses the "in runtime compiled" method, it is as fast as compiled source code.
Advantages and Limitations
The code advantages are listed below:
- Copies both public and private members
- Works for readonly fields and properties
- (new) Works for both classes and structs
- Works also for circular references
- Objects with multiple references are copied only once
- Copies inherited types
- Copies generic types
- Works well in multithreading
- Works for classes without parameterless constructor
- Classes do not need
Serializable
attribute
The code limitations:
- Works for .NET 4 and higher (C# 4.0 and higher)
Shallow-copies struct
s (user-defined struct
s containing classes in its fields will have problem)
- Does not copy delegates and events (leaves
null
instead)
- Fails on
ComObjects
(e.g. on some WPF dispatcher subobjects or Excel Interop)
- Fails on any unmanaged object (e.g. from some external C++ library)
How to Use the Code
Copy the file DeepCopyByExpressionTrees.cs somewhere into your solution.
Then, you can use deep copy as an extension method on any object:
var copy = original.DeepCopyByExpressionTree();
or by implementing the DeepCopy
method directly on the object:
class ExampleClass
{
public ExampleClass DeepCopy()
{
return this.DeepCopyByExpressionTree();
}
}
Implementation Description
If you are not interested in the technical description of the code, you can omit this section.
Main Methods and the References Dictionary
The generic public static
method T DeepCopyByExpressionTree<T>(this T original,...)
calls the nongeneric private
method object DeepCopyByExpressionTreeObj(object original,...)
and passes to it newly created references dictionary of type Dictionary<object, object>
. References dictionary (or copy dictionary) stores original classes and their corresponding copies - they are reused based on ReferenceEquals
comparison and therefore we avoid multiple copies of the same object or problems with circular references.
Creating and Storing Compiled Lambda Functions
For each class type is created its specific dynamic deep copy function. It is created (or retrieved) by function GetOrCreateCompiledLambdaCopyFunction(Type type)
, which is made thread safe and stores the already compiled functions to the static
dictionary Dictionary<Type, Func<...>> CompiledCopyFunctionsDictionary
.
Workflow of Copy Functions
Type specific deep copy lambda functions are created with the following workflow:
- The object is first copied by
MemberwiseClone
function.
- In the case of
class
es the copy is stored to the references dictionary under key of original object.
- The need-deep-copy fields are obtained by Reflection (with care for inheritance) and copied one by one again through
DeepCopyByExpressionTreeObj
, which means that compiled lambda function for object Obj1
calls (different) compiled lambda function for its child Obj1.Obj2
.
Delegate
fields and event
s are set to null
.
- For arrays of classes or interfaces, the
copy
function is called for each index set by dynamically created for
loops. Arrays of value types are left untouched.
The FieldInfo
s of the need-deep-copy fields are obtained by reflection only once: before compilation of the copy function and therefore do not slow down the compiled function.
Note (Need-deep-copy fields): Those are fields of types which are either not value types or not primitive. Specifically omitted are string
s, enum
s and the Decimal
type. By this we get all class
es other than string
s and struct
s other than basic value types. Furthermore, from deep-copy are omitted also struct
types which do not contain any class
or interface
in its hierarchy.
Note 2 (Properties): We do not care for properies at all - each property has its declared or auto-generated field, so it is enough to copy just fields.
Note 3 (Readonly fields): Readonly fields are the only thing which slows down the copying, because expression trees do not work here and repeated reflection calls are required. Read more here.
How are Expression Trees Involved
By Expression Trees, we imitate the regular code dynamically (the same as Reflection). I illustrate the process by presenting a small sample code of MemberwiseClone
call, cast and assignment of newly created object.
In regular code, we would easily write one line:
ExampleClass output = (ExampleClass)input.MemberwiseClone();
but the code would be possible to use only from inside of the ExampleClass
since MemberwiseClone
is protected.
By using Reflection, it would be:
MethodInfo MemberwiseCloneMethod =
typeof(Object).GetMethod("MemberwiseClone", BindingFlags.Instance | BindingFlags.NonPublic);
ExampleClass output = (ExampleClass)MemberwiseCloneMethod.Invoke(input, null);
which is already possible to call from outside of the class
.
And by using the Expression Trees, we write:
ParameterExpression inputParameter = Expression.Parameter(typeof(ExampleClass));
ParameterExpression outputParameter = Expression.Parameter(typeof(ExampleClass));
MethodInfo MemberwiseCloneMethod =
typeof(Object).GetMethod("MemberwiseClone", BindingFlags.Instance | BindingFlags.NonPublic);
Expression lambdaExpression =
Expression.Assign(
outputParameter,
Expression.Convert(
Expression.Call(
inputParameter,
MemberwiseCloneMethod),
typeof(ExampleClass)));
Points of Interest (Structs and readonly fields)
Without deep copy of struct
s, deep copy does not work also for such basic classes as Dictionary<,>
, because dictionaries store key-value pairs into struct
s of the type Dictionary<,>.Entry
and these Entry
struct
s have to be therefore deep-copied. See Reference Source of Dictionary<,>. So we heavily need implementation of the struct
s copy.
Tricky part of struct
s copy code comes with writable and readonly fields. Copying of writable fields by Expression Trees must be performed directly on the struct type (without cast or boxing), but for copying of readonly fields we must use Reflection and reflection in structs must be done using boxing. With complexity of Expression Trees code and its limited possibility of debugging - this implementation was tough.
The following code (translated from Expression Trees code to regular C# code) ilustrates the obstacles.
ExampleStruct copy = (ExampleStruct)original.MemberwiseClone();
Object boxedCopy = (Object)copy;
FieldInfo fieldInfo = copy.GetType().GetField("field");
copy.field = value;
((ExampleStruct)boxedCopy).field = value
fieldInfo.SetValue(boxedCopy, value);
copy.field = value;
fieldInfo.SetValue(copy, value);
fieldInfo.SetValue((Object)copy, value);
fieldInfo.SetValue(boxedCopy, value);
Links to Alternatives
An alternative Expression Tree DeepCloner was written by Cygon (full article) and improved by Ymris (codeplex project). Another code (with some limitations) was written by Marcin Juraszek (codeplex project).
The best working and well tested alternative made by Reflection was coded by Alexey Burtsev (codeplex project). It passed all tests enclosed in the .zip.
The best working, tested and shortest by amount of code is Serialization through BinaryFormatter (code 1, code 2). Though it is the slowest possibility and requires the Serializable attribute, it clones very well everything what our solution does and furthermore does some copy of Delegates and Events. It passed all tests too.
Acknowledgement
To my former boss Roger Lord for the whole idea of using Expression Trees for fast deep copy.
History
List of changes:
- [2016-07-12]
- Added Implementation description
- Note about readonly properties
- Link to the alternative solution
- [2016-07-13]
- Corrected note about serialization: BinaryFormatter handles circular references
- [minor] Changed name ClassType to ExampleClass
- [2016-07-14]
- [major] New .zip uploaded: code handles structs (and therefore dictionaries)
- Uploaded also much better tests (in the same zip)
- Added section Points of Interest
- [minor] Edited section Workflow of Copy Functions
- [2016-07-15]
- New .zip uploaded: structs to deep-copy are filtered better - speed improvement in some cases
- [minor] Full Visual Studio Solution was uploaded
- [minor] Edited section Workflow of Copy Functions
- [2016-07-21]
- Acknowledgement
- [minor] Alternatives added
- [minor] "of Objects" removed from the Title, http address not changed
- [2016-08-03]
- Http address changed
- [minor] Minor changes in wording
- [2017-01-27]
- Repaired copy of Object-typed fields that are assigned strings (and value types)
- [2017-01-30]
- Added
ReferenceEqualityComparer
according to Alexey Burtsev solution