Recently, both Steve Vinoski (CORBA veteran) and Joe Armstrong (creator of Erlang) have come out strongly against Remote Procedure Calls (here and here).
First of all, it seems to me that when they say “RPC”, they really mean “remote procedure calls that try to pass themselves off as local procedure calls”. Second, this message is not exactly new: ever since the seminal paper “A Note on Distributed Computing” came out (in 1994!), we have known that trying to disguise remote calls as local calls is wrong. Some of the principles described in this paper were consolidated over the years into “The Fallacies of Distributed Computing”, which is also a must-read for anyone interested in this space.
This paper gave birth to RMI and to a whole generation of distributed frameworks based on this very principle: remote calls throw a checked exception in order to differentiate them from local calls and to force the caller to deal with the possible failures that can result from sending a call over a network.
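To make the checked-exception principle concrete, here is a minimal sketch in Java. The `Greeter` interface and its two implementations are hypothetical names of mine, and both run in-process to simulate a remote endpoint; real RMI would also export the object and look it up in a registry, which I am leaving out.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

public class RemoteCallSketch {

    // Hypothetical remote interface: it extends java.rmi.Remote and every
    // method declares RemoteException, so callers cannot ignore failures.
    interface Greeter extends Remote {
        String greet(String name) throws RemoteException;
    }

    // Local stand-in for a healthy remote endpoint (no network involved).
    static class HealthyGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Local stand-in for an endpoint that cannot be reached.
    static class UnreachableGreeter implements Greeter {
        @Override
        public String greet(String name) throws RemoteException {
            throw new RemoteException("simulated network failure");
        }
    }

    // The compiler rejects this method unless RemoteException is caught
    // or declared: the remote call is visibly different from a local one.
    static String greetOrFallback(Greeter greeter, String name) {
        try {
            return greeter.greet(name);
        } catch (RemoteException e) {
            return "greeter unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println(greetOrFallback(new HealthyGreeter(), "Joe"));
        System.out.println(greetOrFallback(new UnreachableGreeter(), "Joe"));
    }
}
```

The point is in `greetOrFallback`: the type system forces the caller to decide what happens when the network fails, which a plain local call never asks for.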
The outrage justifying this string of blog posts is fourteen years overdue, but fine: it’s an important lesson and it doesn’t hurt to repeat it.
Where I’m a bit stumped is that it seems to me that Erlang is built on exactly this false premise and is therefore repeating the errors we made before that paper came out.
The main point behind Erlang’s philosophy about distribution is that you never really know if a process you are calling is remote or local. In Erlang, you should assume that anything can potentially be remote. I’ve always been puzzled by this but I hadn’t put my finger on it until I read the blog posts mentioned above.
Joe seems aware of this problem:
“If programmers cannot tell the difference between local and remote calls then it will be impossible to write efficient code.”
So why can’t I differentiate a remote process call from a local one in Erlang?
Distributed computing is hard, but is the answer really that we should write our code assuming that *any* process call can potentially be remote? Isn’t this taking the idea to the extreme? One thing that I like about RMI and other similar distributed frameworks is that I know precisely what is remote and what is local, and I can optimize accordingly. On top of that, exceptions let me know when remote processes have died and I can act accordingly (like Erlang’s supervisors).
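As an illustration of the kind of optimization I have in mind, here is a rough Java sketch. Everything in it is hypothetical (the `Lookup` interface, the `CountingLookup` stand-in, the two helper methods), and the "round trips" are just a counter rather than real network traffic. Because the interface tells me these calls are remote, I know that batching them is worth the trouble.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;

public class BatchingSketch {

    // Hypothetical remote interface offering a fine-grained and a batched call.
    interface Lookup extends Remote {
        int length(String word) throws RemoteException;
        List<Integer> lengths(List<String> words) throws RemoteException;
    }

    // In-process stand-in that counts how many simulated round trips occur.
    static class CountingLookup implements Lookup {
        int roundTrips = 0;

        @Override
        public int length(String word) {
            roundTrips++;
            return word.length();
        }

        @Override
        public List<Integer> lengths(List<String> words) {
            roundTrips++;
            List<Integer> result = new ArrayList<>();
            for (String word : words) {
                result.add(word.length());
            }
            return result;
        }
    }

    // Naive version: one remote call per word.
    static List<Integer> oneByOne(Lookup lookup, List<String> words) throws RemoteException {
        List<Integer> result = new ArrayList<>();
        for (String word : words) {
            result.add(lookup.length(word));
        }
        return result;
    }

    // Batched version: a single remote call for the whole list.
    static List<Integer> batched(Lookup lookup, List<String> words) throws RemoteException {
        return lookup.lengths(words);
    }

    public static void main(String[] args) throws RemoteException {
        List<String> words = List.of("erlang", "rmi", "corba");
        CountingLookup naive = new CountingLookup();
        CountingLookup smart = new CountingLookup();
        oneByOne(naive, words);
        batched(smart, words);
        System.out.println(naive.roundTrips + " round trips vs " + smart.roundTrips);
    }
}
```

Both helpers return the same answer, but the naive one pays one round trip per word. If I could not tell that `Lookup` was remote, I would have no reason to prefer one over the other.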
What am I missing?