Introduction
I assume that readers are familiar with the basic concepts of covariance and contravariance. Still, I can't omit the definitions here, even though variance is not easy to grasp without examples. Don't worry: there are plenty of easy-to-grasp examples ahead.
A complex type F(T) is covariant on a type parameter T if the fact that A is a subtype of B implies that F(A) is a subtype of F(B).
A complex type F(T) is contravariant on a type parameter T if the fact that A is a subtype of B implies that F(B) is a subtype of F(A).
A complex type F(T) is invariant on a type parameter T if it is neither covariant nor contravariant on T.
Running-ahead-of-myself hint: a complex type F here can be an array, a generic type and more.
Table of contents
Here is a brief summary of the points discussed in the article, including notes on whether each language supports a particular variance feature:
| | C# | Java | Scala |
| --- | --- | --- | --- |
| Arrays covariance | + (unsafe at runtime) | + (unsafe at runtime) | - (arrays are invariant by design). Though, there is support for Java's "covariant" arrays, of course. |
| Arrays contravariance | - | - | - |
| Generics variance (covariance/contravariance) | + Defined by a generic type creator (definition-site). (Restricted to generic interfaces and generic delegates) | + Defined by clients of generic type using wildcards (use-site). | + Defined by a generic type creator (definition-site). Also, there are existential types that cover Java's wildcards functionality. |
| Overriding: return type covariance | - | + | + |
| Overriding: parameter type contravariance | - | - | - |
When reading the article, you can dive directly into the language you are interested in, but I recommend reading it all: that way you will grasp the general concepts better.
Comparing covariance/contravariance rules
Arrays covariance
C#
Let’s consider the following example:
Cat[] cats = new Cat[] { new Cat(), new Cat() };
Animal[] animals = cats; // *
animals[0] = new Dog(); // **
This code compiles without errors. It means that arrays are covariant in C#, because we can use a Cat[] array where an Animal[] array is expected (see the line marked *).
But the last line (**) clearly goes against common sense. And indeed, the code fails at execution time with ArrayTypeMismatchException. So, formally, C# supports array covariance, but it is not safe and is not fully enforced by the compiler. Support for this kind of covariance was added mainly because Java supported it. At that time it was important for C# to stay very close to Java in order to spread the new language across the Java community. The two languages have since diverged considerably, but support for "broken" array covariance is deeply rooted in the CLR and will probably never change.
Java
The same code behaves similarly in Java, except that we get Java's ArrayStoreException at runtime.
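For readers who want to see the Java behavior for themselves, here is a minimal runnable sketch. The wrapper class ArrayCovarianceDemo and the helper storeIsRejected are names invented for this illustration; only Animal, Cat and Dog come from the example above.

```java
// Hypothetical demo class; only the Animal/Cat/Dog names come from the article.
final class ArrayCovarianceDemo {
    static class Animal {}
    static class Cat extends Animal {}
    static class Dog extends Animal {}

    // Returns true if storing a Dog through the Animal[] alias is rejected at run time.
    static boolean storeIsRejected() {
        Cat[] cats = { new Cat(), new Cat() };
        Animal[] animals = cats;        // allowed: Java arrays are covariant
        try {
            animals[0] = new Dog();     // compiles, but the array is really a Cat[]
            return false;
        } catch (ArrayStoreException e) {
            return true;                // the JVM checks every reference store into an array
        }
    }

    public static void main(String[] args) {
        System.out.println("store rejected: " + storeIsRejected());
    }
}
```

The compiler accepts every line; the failure only shows up when the store is executed, which is exactly the "unsafe at runtime" note in the summary table.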
Scala
Java's arrays are (internally) represented not as a single type, but as nine different ones: one for arrays of references and eight more, one for each primitive type (int, short, float, etc.). For the Scala language designers it was a real challenge to support interoperation with Java and, at the same time, incorporate arrays into Scala's rich collections hierarchy. As a result, Scala's arrays are represented by the generic Array[T], which is mapped directly to Java's arrays T[]. They have the same representation in bytecode, which is why you can pass arrays between Java and Scala in either direction.
We'll discuss generics variance later, but for now let's try to understand why Scala's architects decided to make Array[T] invariant. Consider the following code in Scala:
val cats: Array[Cat] = Array[Cat](new Cat(), new Cat())
val animals: Array[Animal] = cats // *
animals.update(0, new Dog())
Thanks to the Scala compiler, we get a compile-time error here (line *). Otherwise it would be possible to break type safety as we could in Java (and C#). Interestingly, not all Scala collections behave the same way. Let's change Array to List in our example:
val cats:List[Cat] = List[Cat](new Cat(), new Cat())
val animals:List[Animal] = cats
val newAnimals = animals.updated(0, new Dog())
This code compiles in Scala and is also absolutely safe at execution time. How and why is it possible with List and impossible with Array? The answer is mutability. Arrays in Scala are mutable, so type safety cannot be guaranteed for the same reasons as with arrays in Java and C#, which are mutable as well. On the other hand, Scala's List is immutable (Scala has mutable lists too, but we are talking about the immutable one here). So, when we update an element in the list, we actually create a new List containing the elements of the old one (with the newly updated element). In other words, an immutable List is guaranteed never to be updated in place, so it cannot become inconsistent at runtime.
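The same "copy instead of mutate" idea can be sketched in Java. This is only an analogy to Scala's List.updated, not its implementation: the class ReadOnlyCovarianceDemo and the method updated are hypothetical names, and the sketch assumes Java 10 or later for List.copyOf.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; an analogy to Scala's immutable List.updated.
final class ReadOnlyCovarianceDemo {
    static class Animal {}
    static class Cat extends Animal {}
    static class Dog extends Animal {}

    // Copies the source elements into a new list and replaces one slot.
    // The source is only read, so a List<Cat> can safely be passed in.
    static List<Animal> updated(List<? extends Animal> source, int index, Animal value) {
        List<Animal> copy = new ArrayList<>(source);
        copy.set(index, value);
        return List.copyOf(copy);   // unmodifiable result
    }

    public static void main(String[] args) {
        List<Cat> cats = List.of(new Cat(), new Cat());
        List<Animal> newAnimals = updated(cats, 0, new Dog());
        // cats still holds two Cat objects; the Dog lives only in the new list.
        System.out.println(cats.size() + " cats, " + newAnimals.size() + " animals");
    }
}
```

Because the original list of cats is never written to, no Dog can ever end up in it, which is what makes the covariant view safe.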
A brief summary of array covariance:
Only read-only (immutable) arrays can be truly covariant. But arrays in these languages are not immutable: when we update something in an array we don't get a new array, we update the target array in place. That's why it's impossible to make arrays truly covariant while also providing safety at runtime. Language designers face a hard choice here. And it's a matter of dispute which is better: to support "broken" covariance for arrays (as C# and Java do), or to deliberately make them invariant (Scala).
Arrays contravariance
C#/Java/Scala
None of the three languages supports array contravariance. And even though covariance (not safe at runtime) is supported for arrays in C# and Java, it would have been impractical to allow contravariance for arrays. Let's try to find out why.
Let's imagine for a minute that the following code works (in reality it does not):
Animal[] animals = new Animal[] { new Cat(), new Cat() };
Dog[] dogs = animals;
dogs[0] = new Dog();
The actual type of the array is Animal[] (it can contain both cats and dogs), so, from the data-changing point of view, there is nothing terrible about updating one element to contain a Dog instead of a Cat, even via the dogs variable. But how can we read an element from the Animal[] array via the dogs variable if we have no compile-time guarantee that there are no Cats in our array? Maybe language designers could have implemented some workaround, e.g. failing at runtime when an incompatible read operation is performed, but that kind of array would be essentially useless. So, our conclusion is:
Only write-only arrays can be contravariant.
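Arrays offer no write-only view, but the idea can be sketched in Java with a list that is used purely as a sink. The class WriteOnlyContravarianceDemo and the method addDogs are hypothetical names for this illustration; the lower-bounded wildcard is standard Java.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class showing a "write-only" use of a collection.
final class WriteOnlyContravarianceDemo {
    static class Animal {}
    static class Cat extends Animal {}
    static class Dog extends Animal {}

    // Only writes into the list, so any list whose element type is Dog or a
    // supertype of Dog is acceptable.
    static void addDogs(List<? super Dog> sink, int count) {
        for (int i = 0; i < count; i++) {
            sink.add(new Dog());
        }
        // Dog d = sink.get(0); // would not compile: reads are only known to be Object
    }

    public static void main(String[] args) {
        List<Animal> animals = new ArrayList<>();
        animals.add(new Cat());
        addDogs(animals, 2);    // a List<Animal> is used where Dog values are written
        System.out.println(animals.size() + " animals");
    }
}
```

The commented-out read is the interesting part: as soon as the code tries to read a Dog back, the compiler refuses, which mirrors the reasoning about the dogs variable above.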
Generics variance
C#
Generics covariance in C#
Let's consider the following code:
interface IAnimalFarm<out T> where T : Animal
{
    T ProduceAnimal();
}
class CatFarm : IAnimalFarm<Cat>
{
    public Cat ProduceAnimal()
    {
        return new Cat();
    }
}
Now we are ready to try out the example with generics covariance at work:
IAnimalFarm<Cat> catFarm = new CatFarm();
IAnimalFarm<Animal> animalFarm = catFarm; // *
Animal animal = animalFarm.ProduceAnimal();
This code (pay attention to the line marked with *) compiles without problems. It means that the compiler guarantees that it's safe to work with CatFarm via the animalFarm variable. And indeed, what could go wrong if we call ProduceAnimal, which has the Animal return type, when the actual type of the returned object is Cat? The answer is nothing, because, thanks to assignment compatibility, it's OK to assign a value of a more specific type (Cat) to a variable of a less specific type (Animal).
Basically, in order to be covariant on a generic type parameter, a type should contain the generic parameter only in output positions. In our example, it means that in order to be covariant, IAnimalFarm should contain the generic type parameter T only as outputs of its methods.
Why do we have such a restriction?
Consider the following hierarchy, where the generic type parameter T is present in both output and input positions of the methods of the IAnimalFarm interface:
interface IAnimalFarm<T> where T : Animal
{
    T ProduceAnimal();
    void FeedAnimal(T animal);
}
class AnimalFarm : IAnimalFarm<Animal>
{
    public Animal ProduceAnimal()
    {
        return new Animal();
    }
    public void FeedAnimal(Animal animal)
    {
    }
}
class CatFarm : IAnimalFarm<Cat>
{
    public Cat ProduceAnimal()
    {
        return new Cat();
    }
    public void FeedAnimal(Cat animal)
    {
    }
}
Imagine that covariance were supported for this IAnimalFarm. It would mean that the following code is legal:
IAnimalFarm<Cat> catFarm = new CatFarm();
IAnimalFarm<Animal> animalFarm = catFarm; // *
animalFarm.FeedAnimal(new Dog());
We are working with CatFarm via the animalFarm variable (*) here. It seems OK. But then we try to feed a Dog object via the animalFarm variable (where the underlying object is a CatFarm). So, basically, we are trying to feed a dog on a cat farm, and the dog would not be happy. Each line in this sample looks reasonable, but together they produce unsafe behavior.
As you can see, the reason for the compile-time restriction on the generic type parameter's position (outputs only) is clear: to provide run-time safety. As you remember, in the case of arrays it was decided to support covariance even though an array element can appear in both input and output positions of array operations, paying for it with runtime safety. In the case of generics, you get compile-time safety, but a certain inflexibility in exchange.
Generics contravariance in C#
Let's consider the following code:
interface IAnimalFarm<in T> where T : Animal
{
    void FeedAnimal(T animal);
}
class AnimalFarm : IAnimalFarm<Animal>
{
    public void FeedAnimal(Animal animal)
    {
    }
}
And the following code with contravariance at work:
IAnimalFarm<Animal> animalFarm = new AnimalFarm();
IAnimalFarm<Cat> catFarm = animalFarm;
catFarm.FeedAnimal(new Cat());
This code compiles without problems. It means that the compiler guarantees that it's safe to work with AnimalFarm via the catFarm variable. And indeed, nothing wrong can happen if we call FeedAnimal passing a Cat object as a parameter. AnimalFarm's FeedAnimal expects an Animal object, but it's OK to pass a more specific object (Cat) to it, thanks, again, to assignment compatibility.
In order to be contravariant on a generic type parameter, a type should contain the generic parameter only in input positions. In our example, it means that in order to be contravariant, IAnimalFarm should contain the generic type parameter T only as inputs of its methods.
Why such a restriction?
Consider again the following hierarchy, where the generic type parameter T is present in both output and input positions of the methods of the IAnimalFarm interface:
interface IAnimalFarm<T> where T : Animal
{
    T ProduceAnimal();
    void FeedAnimal(T animal);
}
class AnimalFarm : IAnimalFarm<Animal>
{
    public Animal ProduceAnimal()
    {
        return new Animal();
    }
    public void FeedAnimal(Animal animal)
    {
    }
}
class CatFarm : IAnimalFarm<Cat>
{
    public Cat ProduceAnimal()
    {
        return new Cat();
    }
    public void FeedAnimal(Cat animal)
    {
    }
}
Imagine that contravariance were supported for this IAnimalFarm. It would mean that the following code is legal:
IAnimalFarm<Animal> animalFarm = new AnimalFarm();
IAnimalFarm<Cat> catFarm = animalFarm;
Cat animal = catFarm.ProduceAnimal();
We are working with AnimalFarm via the catFarm variable and then trying to produce a Cat. But the underlying object we are working with is of type AnimalFarm, so it can produce only some abstract Animal, and by no means a concrete Cat. Again, each line is reasonable, but together they produce unsafe behavior.
Important notes on a few C# variance limitations
Let us think a bit about why these limitations exist.
So, why are generic classes invariant in C#? As you understand, a type needs to use T only in output positions (to be covariant) or only in input positions (to be contravariant). The point is that it's hard to guarantee that for classes: for example, a class that is covariant in its type parameter T cannot have fields of type T, because you can write to those fields. It would work great for truly immutable classes, but there is no comprehensive support for immutability in C# at the moment. But, honestly, I have a feeling that we may expect better support for it in the future.
Why are value types not supported in generics variance? The short answer is that variance only works when the CLR does not need to make changes (conversions) to values of generic type parameters. Conversions are divided into representation-preserving and representation-changing. An example of a representation-preserving conversion is a cast on a reference: you do not change the original object (the one the reference points to) when you cast; you just verify that the object is compatible with the target type and get a new reference. Examples of representation-changing conversions are user-defined conversions, conversion from int to double, boxing and unboxing. For the CLR all references look the same: each is just the address of a real object in memory (32 or 64 bits depending on the machine). That's why it can use IAnimalFarm<Cat> instead of IAnimalFarm<Animal> without changes in data representation. You can't say the same about some value-type conversions (boxing/unboxing, for instance), which is why variance would not work, for example, between IEnumerable<int> and IEnumerable<object>. In other words, the easiest way to guarantee that variant conversions are representation-preserving is to allow them only for reference types.
Quick note about Java/Scala (getting ahead of myself): generics in Java and Scala are a purely compile-time construct. No information about generic type arguments is preserved at run-time because of type erasure. All generic parameters are treated as Object (reference types), and primitives have to be boxed before they can be used as type arguments. That's why there are no problems with value-type data representation in Java/Scala: every reference looks the same. It's one of the few advantages of type erasure (JVM) compared with reified generics (CLR).
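A small Java sketch of this point: because an int has to be boxed to an Integer before it can be a type argument, a list of integers is just a list of references and takes part in (use-site) variance like any other. The class BoxedVarianceDemo and the method sum are hypothetical names; List.of assumes Java 9 or later.

```java
import java.util.List;

// Hypothetical demo class: boxed primitives behave like any other reference type.
final class BoxedVarianceDemo {
    // Accepts a list of any Number subtype; the elements are read as Number references.
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = List.of(1, 2, 3);      // int literals are boxed to Integer
        List<? extends Object> objects = ints;      // fine: Integer is a reference type
        // List<int> raw = ...;                     // does not compile: no primitive type arguments
        System.out.println(objects.size() + " elements, sum = " + sum(ints));
    }
}
```

The price is the boxing itself: every element is a separate Integer object, which is the trade-off the CLR avoids by keeping IEnumerable<int> unboxed and therefore invariant.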
Java
Java has another solution for the variance problem in generics. As you have just seen, in C# the creator of a generic type is responsible for making it invariant, covariant or contravariant. This approach is known as definition-site variance annotations. In Java, on the other hand, the client of a generic type decides whether to treat it as invariant, covariant or contravariant. This is known as use-site variance annotations.
Consider the following code:
interface AnimalFarm<T>
{
    T produceAnimal();
}
class CatFarm implements AnimalFarm<Cat>
{
    public Cat produceAnimal()
    {
        return new Cat();
    }
}
Now let's try to use it covariantly:
AnimalFarm<Cat> catFarm = new CatFarm();
AnimalFarm<Animal> animalFarm = catFarm; // *
Animal animal = animalFarm.produceAnimal();
The compiler does not allow it (line marked with *), because generic types are invariant by default in Java. However, we can "force" covariance for a generic type via wildcards. The next example works:
AnimalFarm<Cat> catFarm = new CatFarm();
AnimalFarm<? extends Animal> animalFarm = catFarm;
Animal animal = animalFarm.produceAnimal();
The nice thing about specifying variance on the "client side" is that even if you have some problematic generic type (with the generic type parameter present in both input and output positions), you can still work with it in a covariant/contravariant way. The bad thing about this approach is that variance is not incorporated into the design by the generic type's creator. Instead, the client of the generic type has to strain their brain thinking about how to use it properly.
The wildcard <? extends Animal> means that animalFarm can hold an object of any AnimalFarm<T> type whose generic type parameter (T) is a subtype of Animal. Obviously, the Cat type parameter satisfies this condition.
Consider the following example:
interface AnimalFarm<T>
{
    T produceAnimal();
    void feedAnimal(T animal);
}
class AnimalFarmDefault implements AnimalFarm<Animal>
{
    public Animal produceAnimal()
    {
        return new Animal();
    }
    public void feedAnimal(Animal animal)
    {
    }
}
class CatFarm implements AnimalFarm<Cat>
{
    public Cat produceAnimal()
    {
        return new Cat();
    }
    public void feedAnimal(Cat animal)
    {
    }
}
As you remember, in C# a similar generic type would be invariant, because the generic type parameter is present in both input and output positions of the methods.
In Java it's invariant by default too, but using wildcards a client of the generic type can specify how to treat it.
You can treat it as covariant:
AnimalFarm<Cat> catFarm = new CatFarm();
AnimalFarm<? extends Animal> animalFarm = catFarm;
Animal animal = animalFarm.produceAnimal();
Or as contravariant:
AnimalFarm<Animal> animalFarm = new AnimalFarmDefault();
AnimalFarm<? super Cat> catFarm = animalFarm;
catFarm.feedAnimal(new Cat());
The wildcard <? super Cat> means that catFarm can hold an object of any AnimalFarm<T> type whose generic type parameter (T) is a supertype of Cat. Surely, the Animal type parameter satisfies this condition.
The idea here is that when you treat a generic type as covariant, you can only usefully call methods where the generic type parameter appears in output positions. And when you treat a generic type as contravariant, you can only usefully call methods where the generic type parameter appears in input positions.
By the way, wildcard variance is not restricted to interfaces: you can use generic classes in a variant manner as well.
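To make the restriction concrete, here is a self-contained sketch of both wildcard views over the same AnimalFarm<T> interface. The wrapper class WildcardDemo, the helper methods takeFrom and feedCat, and the fed counter are additions for this illustration; the commented-out lines are the calls the compiler rejects.

```java
// Hypothetical demo class; AnimalFarm, CatFarm, Animal and Cat mirror the article's types.
final class WildcardDemo {
    static class Animal {}
    static class Cat extends Animal {}

    interface AnimalFarm<T> {
        T produceAnimal();
        void feedAnimal(T animal);
    }

    static class CatFarm implements AnimalFarm<Cat> {
        int fed = 0;    // counts feedAnimal calls, for demonstration only
        public Cat produceAnimal() { return new Cat(); }
        public void feedAnimal(Cat animal) { fed++; }
    }

    // Covariant view: the farm is usable as a producer of Animal.
    static Animal takeFrom(AnimalFarm<? extends Animal> farm) {
        // farm.feedAnimal(new Cat()); // does not compile: T is some unknown subtype of Animal
        return farm.produceAnimal();
    }

    // Contravariant view: the farm is usable as a consumer of Cat.
    static void feedCat(AnimalFarm<? super Cat> farm) {
        farm.feedAnimal(new Cat());
        // Cat c = farm.produceAnimal(); // does not compile: the result is only known to be Object
    }

    public static void main(String[] args) {
        CatFarm catFarm = new CatFarm();
        Animal animal = takeFrom(catFarm);
        feedCat(catFarm);
        System.out.println(animal.getClass().getSimpleName() + ", fed " + catFarm.fed);
    }
}
```

Note that the "wrong" methods do not disappear from the wildcard type; they simply become impossible to call with a useful argument or to read with a useful result type.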
Scala
Consider the following example:
trait AnimalFarm[T]
{
    def produceAnimal(): T
}
class CatFarm extends AnimalFarm[Cat]{
    def produceAnimal(): Cat = new Cat()
}
As you would expect, the below example does not compile, because by default generic types are invariant in Scala:
val catFarm:AnimalFarm[Cat] = new CatFarm()
val animalFarm: AnimalFarm[Animal] = catFarm
val animal: Animal = animalFarm.produceAnimal()
But you can make it covariant as follows:
trait AnimalFarm[+T]
{
    def produceAnimal(): T
}
class CatFarm extends AnimalFarm[Cat]{
    def produceAnimal(): Cat = new Cat()
}
As you see, the only difference is the "+" sign, which indicates that the trait is covariant with respect to the T type parameter. Now you can use covariance:
val catFarm:AnimalFarm[Cat] = new CatFarm()
val animalFarm: AnimalFarm[Animal] = catFarm
val animal: Animal = animalFarm.produceAnimal()
You see that it's similar to the approach used in C#. You specify that a type is covariant at the definition site, not at the use site as in Java.
Similarly, you can make a trait (or class) contravariant using the "-" sign:
trait AnimalFarm[-T]
{
    def feedAnimal(animal: T): Unit
}
class AnimalFarmDefault extends AnimalFarm[Animal]
{
    def feedAnimal(animal: Animal): Unit = {
    }
}
And use it as contravariant:
val animalFarm:AnimalFarm[Animal] = new AnimalFarmDefault()
val catFarm: AnimalFarm[Cat] = animalFarm
catFarm.feedAnimal(new Cat())
All of this looks like variance in C#, except for two important points. First of all, we are not restricted to traits: we can apply the same rules to Scala's classes. Second, Scala provides more flexibility for managing generic constraints via lower and upper bounds.
Remember the "problematic" example we discussed for C#, where the generic type parameter was present in both input and output positions? Let's reproduce a similar situation in Scala:
trait AnimalFarm[T]
{
    def produceAnimal(): T
    def feedAnimal(animal: T): Unit
}
class AnimalFarmDefault extends AnimalFarm[Animal]{
    def produceAnimal(): Animal = new Animal()
    def feedAnimal(animal: Animal): Unit = {
    }
}
class CatFarm extends AnimalFarm[Cat]{
    def produceAnimal(): Cat = new Cat()
    def feedAnimal(animal: Cat): Unit = {
    }
}
The AnimalFarm trait is invariant, just as a similar interface would be invariant in C#. And it can't be made, for example, covariant by simply adding a "+" sign before the type parameter. If we want to make the trait covariant, we still need to deal with the fact that the type parameter also appears in the input position of the feedAnimal method. In C# we would have been forced to give up on our desire to make the interface covariant.
But in Scala we can do the following:
trait AnimalFarm[+T]
{
    def produceAnimal(): T
    def feedAnimal[S >: T](animal: S): Unit
}
class AnimalFarmDefault extends AnimalFarm[Animal]{
    def produceAnimal(): Animal = new Animal()
    def feedAnimal[S >: Animal](animal: S): Unit = {
    }
}
class CatFarm extends AnimalFarm[Cat]{
    def produceAnimal(): Cat = new Cat()
    def feedAnimal[S >: Cat](animal: S): Unit = {
    }
}
And use it as covariant:
val catFarm:AnimalFarm[Cat] = new CatFarm()
val animalFarm: AnimalFarm[Animal] = catFarm
val animal: Animal = animalFarm.produceAnimal()
animalFarm.feedAnimal(new Dog)
The cool thing here is that you can even call feedAnimal on a covariant type without compromising type safety. Let's explore how it works using the following method as an example:
def feedAnimal[S >: Cat](animal: S): Unit = {
}
The lower bound (e.g. [S >: Cat]) is a reflexive relationship, which means that you can pass to the method an object of any type S that is Cat itself or a supertype of Cat. If you pass a Cat object, then the common supertype of Cat and Cat is Cat itself, so S becomes Cat. If you pass an Animal object, then the common supertype of Animal and Cat is Animal, so S becomes Animal. If you pass a Dog object, then the common supertype of Dog and Cat is Animal again, so S becomes Animal. Thanks to this inference mechanism, the method body can never assume that it has received a Cat, so the compiler can guarantee that type safety will not be compromised.
A brief summary of generics variance:
As you have seen, C# and Scala take a similar approach to variance, which is specified at the definition site, though there are essential differences in constraint rules and other details. Java takes another approach, where variance is specified at the use site via wildcards.
Strictly speaking, Scala also has use-site variance via existential types, to cover Java's wildcard functionality and to solve a few more interoperability problems. Even so, it's not Scala's idiomatic way of solving the variance challenge. For Scala's architects the method of choice is definition-site variance. Still, you can find existential types variance examples in the attached archive/on GitHub.
Overriding: return type covariance
A language supports return type covariance if you can override a method from the base class (that returns a less-specific type) with a method in the derived class (that returns a more specific type).
C#
Consider the following example (does not compile in C#):
class AnimalFarm
{
    public virtual Animal ProduceAnimal()
    {
        return new Animal();
    }
}
class CatFarm : AnimalFarm
{
    public override Cat ProduceAnimal()
    {
        return new Cat();
    }
}
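As you see, return type covariance is not supported in the C# versions discussed here. Moreover, at that time it was not supported by the CLR itself, which made the feature look very unlikely. (Covariant return types for overriding methods were eventually added in C# 9.0.)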
Java
Surprisingly (or not), a similar example works in Java:
class AnimalFarm
{
    public Animal produceAnimal()
    {
        return new Animal();
    }
}
class CatFarm extends AnimalFarm
{
    @Override
    public Cat produceAnimal()
    {
        return new Cat();
    }
}
Java has supported return type covariance since Java 5.0.
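One way to peek at how this works in Java is reflection. The JVM itself matches overrides by exact descriptor, so javac is expected to emit a synthetic bridge method in CatFarm that returns Animal and delegates to the Cat-returning one. The sketch below counts such bridge methods; CovariantReturnDemo and bridgeCount are hypothetical names, and the bridge behavior is stated here as the usual javac behavior rather than something taken from the article.

```java
import java.lang.reflect.Method;

// Hypothetical demo class; AnimalFarm and CatFarm mirror the article's example.
final class CovariantReturnDemo {
    static class Animal {}
    static class Cat extends Animal {}

    static class AnimalFarm {
        public Animal produceAnimal() { return new Animal(); }
    }

    static class CatFarm extends AnimalFarm {
        @Override
        public Cat produceAnimal() { return new Cat(); }   // narrower return type
    }

    // Counts the compiler-generated bridge methods declared directly on a class.
    static int bridgeCount(Class<?> type) {
        int count = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.isBridge()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Cat cat = new CatFarm().produceAnimal();            // no cast needed
        AnimalFarm farm = new CatFarm();
        Animal animal = farm.produceAnimal();               // dispatches to CatFarm at run time
        System.out.println(cat.getClass() == animal.getClass());
        System.out.println("bridge methods in CatFarm: " + bridgeCount(CatFarm.class));
    }
}
```

So in Java the feature lives mostly in the compiler: the language allows the narrower return type, and the bridge method keeps the JVM's stricter override rule satisfied.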
Scala
In Scala it works as well:
class AnimalFarm
{
    def produceAnimal(): Animal = new Animal()
}
class CatFarm extends AnimalFarm
{
    override def produceAnimal(): Cat = new Cat()
}
Not surprisingly, Scala supports return type covariance: it is a JVM-based language compatible with Java.
Overriding: parameter type contravariance
A language supports parameter type contravariance if you can override a method from the base class (that has a parameter of more-specific type), with a method in the derived class (that has a parameter of less-specific type).
None of the three languages supports parameter type contravariance. The following examples do not compile:
C#
class AnimalFarm
{
    public virtual void FeedAnimal(Cat animal)
    {
    }
}
class CatFarm : AnimalFarm
{
    public override void FeedAnimal(Animal animal)
    {
    }
}
Java
class AnimalFarm
{
    public void feedAnimal(Cat animal)
    {
    }
}
class CatFarm extends AnimalFarm
{
    @Override
    public void feedAnimal(Animal animal)
    {
    }
}
Scala
class AnimalFarm
{
    def feedAnimal(animal: Cat) = {
    }
}
class CatFarm extends AnimalFarm
{
    override def feedAnimal(animal: Animal) = {
    }
}
But why is there no support? What's wrong with parameter type contravariance? It seems that no harm can be done by supporting it, and at first glance that is true. The problem is that adding it to a language would create a number of ambiguous situations: e.g. how do we distinguish overloading from overriding? A return type can be made covariant because the return type is not considered during overload resolution, so there is no ambiguity. But method parameters are part of the method signature and are considered during overload resolution, so there could be ambiguity between overloading and overriding. There are more potential problems, but this is the most apparent one.
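The overloading side of this ambiguity can be seen in today's Java. If we drop @Override from the Java example above, the code compiles, but CatFarm now simply has two methods. The sketch below returns strings so the chosen method is visible; OverloadVsOverrideDemo and the returned labels are inventions for this illustration.

```java
// Hypothetical demo class: a wider parameter type creates an overload, not an override.
final class OverloadVsOverrideDemo {
    static class Animal {}
    static class Cat extends Animal {}
    static class Dog extends Animal {}

    static class AnimalFarm {
        public String feedAnimal(Cat animal) { return "AnimalFarm.feedAnimal(Cat)"; }
    }

    static class CatFarm extends AnimalFarm {
        // No @Override: adding it here would be a compile-time error.
        public String feedAnimal(Animal animal) { return "CatFarm.feedAnimal(Animal)"; }
    }

    public static void main(String[] args) {
        CatFarm farm = new CatFarm();
        Animal disguisedCat = new Cat();
        // The overload is picked at compile time from the static argument type.
        System.out.println(farm.feedAnimal(new Cat()));     // inherited feedAnimal(Cat) is more specific
        System.out.println(farm.feedAnimal(new Dog()));     // only feedAnimal(Animal) applies
        System.out.println(farm.feedAnimal(disguisedCat));  // static type is Animal
    }
}
```

If the language also treated feedAnimal(Animal) as an override of feedAnimal(Cat), the first call would have two competing interpretations, which is the kind of controversy the designers chose to avoid.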
Conclusion
As you have seen, variance is a pretty interesting and (sometimes) complicated thing. Language designers need to make a lot of compromises along the way: enriching a language to meet modern requirements while dealing with big legacy codebases at the same time. And it's very interesting to compare the approaches that different languages use to solve similar problems. You can find all the examples in the attached archive and on GitHub. Thanks for your attention. Till next time!