Java 5 was released in 2004 and introduced generics to the language using a mechanism called “erasure”. In the following couple of years, a lot of discussions took place comparing this approach to its counterpart, usually referred to as “reified generics”. The discussions then tapered off for a few years, until recently, when two languages brought this topic back on the scene by supporting reified generics. These two languages are Gosu (created and used in production by GuideWire) and Kotlin (under development, created by JetBrains).
There is a lot of literature available on the subject of erasure and reified generics, but I thought I would take a few minutes to summarize the current state of the world and share a few thoughts on the pros and cons of each approach.
Let’s start with a few concrete examples. Here are snippets of code which do not compile under an erasure system but which would work fine with reified generics. If you are not familiar with the issues involved, here is a short rule of thumb that should allow you to understand what is going on: whenever you see a generic type, replace it with Object (since that’s exactly what’s happening behind the scenes):
Overloading
public class Test<K, V> {
    public void f(K k) {
    }
    public void f(V v) {
    }
}
T.java:2: name clash: f(K) and f(V) have the same erasure
public void f(K k) {
^
T.java:5: name clash: f(V) and f(K) have the same erasure
public void f(V v) {
The workaround here is simple: rename your methods.
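As a minimal sketch of that workaround (the class and method names below are hypothetical, picked only for illustration), giving the two methods distinct names means their erased signatures no longer collide:

```java
// Hypothetical renaming of the two clashing overloads; names are illustrative only.
class RenamedTest<K, V> {
    // After erasure this is describeKey(Object).
    String describeKey(K k) {
        return "key: " + k;
    }

    // After erasure this is describeValue(Object): a different name, so no clash.
    String describeValue(V v) {
        return "value: " + v;
    }
}

class RenamedTestDemo {
    public static void main(String[] args) {
        RenamedTest<String, Integer> t = new RenamedTest<String, Integer>();
        System.out.println(t.describeKey("a"));
        System.out.println(t.describeValue(1));
    }
}
```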
Introspection
public class Test {
    public <T> void f() {
        Object t;
        if (t instanceof List<T>) { ... }
    }
}
Test.java:6: illegal generic type for instanceof
if (t instanceof List<T>) {}
There is no easy workaround for this limitation. You will probably want to be more specific about the generic type (e.g. by adding an upper bound), or ask yourself whether you really need to know the generic type T or whether the knowledge that t is an object of type List is sufficient.
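If knowing that the object is a List is not quite enough, one possible approach (an assumption on my part, not something the compiler gives you) is to test against the unbounded `List<?>`, which is legal under erasure, and have the caller pass a `Class<T>` token so the elements can be checked one by one. The helper name `isListOf` is hypothetical:

```java
import java.util.List;

class ListChecks {
    // Returns true if o is a List whose non-null elements are all instances
    // of elementType. An empty list matches any element type, because erasure
    // leaves no record of the type argument the list was declared with.
    static <T> boolean isListOf(Object o, Class<T> elementType) {
        if (!(o instanceof List<?>)) {
            return false;
        }
        for (Object element : (List<?>) o) {
            // null is a valid value for any reference type, so it is accepted.
            if (element != null && !elementType.isInstance(element)) {
                return false;
            }
        }
        return true;
    }
}
```

Note that this inspects the current contents of the list rather than its declared type, so it costs a full traversal and says nothing about what may be added later.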
Instantiation
public class Test {
    public <T> void f() {
        T t = new T();
    }
}
Test.java:3: unexpected type
found : type parameter T
required: class
T t = new T();
This case is also a bit tricky but since it is, in my experience, more common than the others, I’ll spend a little more time discussing it.
As mentioned above, the virtual machine has no knowledge about the type T, which it only sees as an Object, so it won’t allow you to create an instance of it.
This is a good thing.
From a type standpoint, you know nothing about T, so you shouldn’t be able to instantiate it with the amount of knowledge you have. For all you know, T could be an abstract class, an interface or a class with no default constructor. In this case, type erasure keeps you honest by limiting what you can do on T to the operations that it knows about.
If you need to manipulate T, you will have to enable this through the type system. Do you need a new instance? Pass an additional Factory<T>. Do you need to call query() on it? Create a type that contains this method and constrain T with it.
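Here is a rough sketch of both suggestions. The `Factory<T>` and `Queryable` interfaces, the `Widget` class and the method names are all hypothetical, made up to mirror the `Factory<T>` and `query()` mentioned above:

```java
// Hypothetical factory interface: the caller supplies the knowledge of how to build a T.
interface Factory<T> {
    T create();
}

// Hypothetical type that declares query(), used as an upper bound for T.
interface Queryable {
    String query();
}

class Widget implements Queryable {
    public String query() {
        return "widget";
    }
}

class Instantiation {
    // No "new T()" needed: the factory knows which concrete class to instantiate.
    static <T> T newInstance(Factory<T> factory) {
        return factory.create();
    }

    // The bound tells the compiler that every T has a query() method.
    static <T extends Queryable> String ask(T t) {
        return t.query();
    }
}

class InstantiationDemo {
    public static void main(String[] args) {
        Widget w = Instantiation.newInstance(new Factory<Widget>() {
            public Widget create() {
                return new Widget();
            }
        });
        System.out.println(Instantiation.ask(w));
    }
}
```

In both cases the generic method only does what the type system has been told T can do, which is the point made above.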
In 2006, Neal Gafter came up with a clever idea to make it possible to instantiate such erased types. He dubbed the technique “super type tokens”, and like most good ideas, it’s extremely simple: you force the creation of an anonymous class that contains the generic type, which you can then retrieve by a clever use of the introspection API.
Here is an example:
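import java.lang.reflect.ParameterizedType;

abstract class TypeReference<T> {}

public class TT {
    public static <T> void f(TypeReference<T> t) {
        // The runtime class of t is an anonymous subclass whose generic
        // superclass, TypeReference<String>, is recorded in the class file.
        ParameterizedType pt = (ParameterizedType)
            t.getClass().getGenericSuperclass();
        System.out.println(pt.getActualTypeArguments()[0]);
    }

    public static void main(String[] args) {
        // Note the empty braces after the constructor call.
        TT.f(new TypeReference<String>() {});
    }
}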
This will print "class java.lang.String" on the console, even though the method printing it is completely generic. The trick is in the call made from main: notice the empty braces, which create an anonymous instance of TypeReference, setting the stage for the introspection code in the method to retrieve the generic type.
Shortly thereafter, Bob Lee picked up this feature, fleshed it out and included it in Guice under the name TypeLiteral. Scala supports a similar feature called Manifest.
Ever since Java 5 came out, and despite my initial fears, I can’t say that I have been bothered much by the absence of type information in Java generics, and I am tempted to generalize this observation to the general Java population. Erasure has quite a few advantages and, as it turns out, reified generics come with their own set of issues. Here are some of them.
The main problem is that reified generics would be incompatible with the current collections. In binary form, for sure, and probably in source form as well (we would want to distinguish between collections making use of reified types and their older counterparts). Rewriting would probably be mostly a matter of copy/pasting, except for the parts that make use of introspection, which would need to be adjusted. The generated byte code would also contain more information. For example, the following test:
o instanceof List<String>
would now test that the object is an instance of List but also that its elements are of type String. That’s quite a bit more work.
The extra type information also impacts interoperability between languages, not only within the JVM but also outside of it. For example, Scala recently announced some progress on its .Net compiler, which contains the following caveat:
The key limitation for the moment is that Scala programs cannot use libraries in .Net that are compiled using CLR generics, such as the .Net collections.
This is just a consequence of the fact that C# has reified generics, and bytecode containing this supplemental type information requires more work to be parsed and converted into a form suitable for the client, as opposed to a simple List type.
So, where does this leave us?
Erasure has proven to work quite well for Java, and actually for quite a few other languages as well. Besides the two languages that I named above, there are only two other popular languages that support reified generics: C# and C++. All the others use erasure of some sort, and overall, it’s hard to argue that either approach brings a significant improvement in ease of use.
All in all, I am pretty happy with erasure and I’m hoping that the future versions of Java will choose to prioritize different features that are more urgently needed, such as closures or an improved module system.
Oh, and happy Java 7 day everyone!
#1 by Daniel Serodio on July 29, 2011 - 10:47 am
I don’t think Java needs closures. Code written in Java using closures will be a Frankenstein of OOP and FP styles.
Groovy, on the other hand, has had closures from the beginning, so they fit into the language/API much better.
If you want closures in Java, why not use Groovy (or Groovy++) ?
#2 by Samuel Tardieu on July 29, 2011 - 11:20 am
Scala has introduced manifests to work around type erasure and the need to get type information. Here is an example quite similar to yours: https://gist.github.com/1114634
And, from experience, I find that type erasure has been a real pain in the neck when you have a language which allows pattern matching. You cannot distinguish between Tuple2[Int, String] and Tuple2[String, String], and this is really annoying.
#3 by Samuel Tardieu on July 29, 2011 - 11:22 am
Oh, and don’t forget not to upgrade to Java 7, since the compiler seems to generate bad code in a non-negligible number of cases (due to a change in default optimization options).
#4 by Craig Tataryn on July 29, 2011 - 2:15 pm
Just curious what you meant by:
It’s my understanding the only way you could ever test to see if the variable t is of type List would be to do:
public class Test {
    public void f() {
        Object t;
        if (List.class.isAssignableFrom(t.getClass())) {
            List l = (List) t;
            if (! l.isEmpty() && l.iterator().next() instanceof SomeType) { … }
        }
    }
}
I don’t think there is a way (at all) to check if the contained type was of the generic type T. But perhaps I’m wrong.
#5 by m.nikic on July 31, 2011 - 12:40 am
Hello Cedric, interesting topic you got here.
So the main problem of reification is the backward compatibility with other languages that use erasure.
So if one wanted to create a hypothetical new language that doesn’t need to be compatible with java (and other languages with erased generics), then reification would be a clear winner, right?
Cheers.
#6 by Steve on July 31, 2011 - 5:30 am
Selective citing FTW …
> The key limitation for the moment is that Scala programs cannot use libraries in .Net that are compiled using CLR generics, such as the .Net collections.
Three sentences later:
“The .Net generics will be supported in the fall.”
#7 by Bubak on August 1, 2011 - 3:10 am
For me erasure is just another piece of crazy legacy stuff which should die. It is bad enough that Scala is already crippled by it. And I believe JetBrains can provide a good bridge to the old Java collections in Kotlin.
#8 by Thai Dang Vu on August 1, 2011 - 10:02 am
Cedric, why is there no Ceylon from your friend Gavin in the title? 🙂
#9 by Tom Davies on August 1, 2011 - 6:23 pm
I think it would be clearer and more correct to say: “notice the empty braces, which create an anonymous *subclass* of TypeReference”.
I completely fail to understand why people get so excited about erasure. The only case in which it has bothered me is when I wanted to create an array of type T[]. Proponents never seem to present compelling use cases for reification.
#10 by Tom Davies on August 1, 2011 - 6:27 pm
P.S. if you could say ‘new T(…)’ how would the assumption that types supplied as parameters had a constructor with that signature affect the type system?
#11 by Gavin King on August 2, 2011 - 7:22 am
Tom, you need to introduce a new kind of type constraint. Here’s how we express that in Ceylon:
http://in.relation.to/Bloggers/IntroductionToCeylonPart6#H-GenericTypeConstraints
C# has something a bit similar, but it only works for classes with a default constructor.
#12 by Serge on August 2, 2011 - 12:35 pm
http://www.engineyard.com/blog/2011/how-invokedynamic-just-might-save-dynamic-languages-on-the-jvm/
some interesting comments on how erasure has made the vm a great host for alternate languages.
#13 by nonoitall on August 7, 2011 - 2:11 pm
Actually, it would simply test that ‘o’ is an instance of List. It needn’t test the elements, because with reified generics the actual runtime type guarantees that only Strings can be placed into the collection. This is in contrast to erasure where you would actually have to test each element in the collection in order to establish what the common element type is.
#14 by Andrew Lee Rubinger on April 27, 2012 - 5:31 pm
Alternate workaround to the “Overloading” example: Change the return type.