What Kotlin could learn from Rust

This is a follow-up to my previous article, in which I explored a few aspects of Kotlin that Rust could learn from. This time, I am going to look at some features that I really enjoy in Rust and which I wish Kotlin would adopt.

Before we begin, I’d like to reiterate that my point is not to start a language war, nor am I trying to turn one language into the other. I was careful in selecting which features to discuss and automatically excluded features that make perfect sense for one language and would be absurd in the other. For example, it would be silly to ask for garbage collection in Rust (since its main proposition is very tight control over memory allocation) and, conversely, it would make no sense for Kotlin to adopt a borrow checker, since the fact that Kotlin is garbage collected is one of its main appeals.

The features I covered in my first article and in this one are functionalities which I think could be adopted by either language without jeopardizing its main design philosophy, although since I don’t know the internals of either language, I might be off on some of these, and I welcome feedback and corrections.

Let’s dig in.

Macros

I have always had a love-hate relationship with macros, especially non-hygienic ones. At the very least, macros should be fully integrated into the language, which requires two conditions:

  • The compiler needs to be aware of macros (unlike for example, the preprocessor in C and C++).
  • Macros need to have full access to a statically typed AST and be able to safely modify this AST.

Rust macros meet these two requirements and as a result, unlock a set of very interesting capabilities, which I’m pretty sure we’ve only started exploring.

For example, the dbg!() macro:

let a = 2;
let b = 3;
dbg!(a + b);

will print:

[src\main.rs:158] a + b = 5

Note: not just the source file and line number but the full expression that’s being displayed (“a + b”).

Another great example of the power of macros can be seen in the debug_plotter crate, which allows you to plot variables:

fn main() {
    for a in 0..10 {
        let b = (a as f32 / 2.0).sin() * 10.0;
        let c = 5 - (a as i32);

        debug_plotter::plot!(a, b, c; caption = "My Plot");
    }
}
Rust also supports attributes, which let you annotate your code with information the compiler understands. For example:

#![crate_type = "lib"]

#[test]
fn test_foo() {}

Nothing groundbreaking here, but what I want to discuss is the conditional compilation aspect.

Conditional compilation is achieved in Rust by combining attributes and macros with cfg, which is available as both an attribute and a macro.

The attribute version allows you to conditionally compile an item, such as a function:

#[cfg(target_os = "macos")]
fn macos_only() {}

In the code above, the function macos_only() will only be compiled if the operating system is macOS.

The macro version, cfg!(), allows you to add more logic to the condition:

let machine_kind = if cfg!(unix) {
    "unix"
} else { … }

At the risk of repeating myself: the above code uses a macro, which means it’s evaluated at compile time. Any branch whose condition is not met will be completely ignored by the compiler.
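To tie the two forms together, here is a small self-contained sketch of my own (not from the article) combining the cfg attribute with the cfg! macro:

```rust
// The attribute form removes the item entirely on non-macOS targets.
#[allow(dead_code)]
#[cfg(target_os = "macos")]
fn macos_only() {
    println!("This function only exists on macOS");
}

fn main() {
    // The macro form expands to a compile-time boolean constant.
    let machine_kind = if cfg!(unix) { "unix" } else { "other" };
    println!("Machine kind: {}", machine_kind);
}
```

The dead branch of the if is still type-checked, but the optimizer removes it, since cfg!(unix) is a constant.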

You might rightfully wonder if such a feature is necessary in Kotlin, and I asked myself the same question.

Rust compiles to native executables, on multiple operating systems, which makes this kind of conditional compilation pretty much a requirement if you want to publish artifacts on multiple targets. Kotlin doesn’t have this problem since it produces OS neutral executables that are run on the JVM.

Java and Kotlin developers have learned to do without a preprocessor, since the C preprocessor left such a bad impression on pretty much everyone who used it. Still, there have been situations in my career where conditional compilation that includes or excludes source files, or even just statements, expressions, or functions, would have come in handy.

Regardless of where you stand in this debate, I have to say I really enjoy how two very different features in the Rust ecosystem, macros and attributes, are able to work together to produce such a useful and versatile feature.

Extension traits

Extension traits allow you to make a structure conform to a trait “after the fact”, even if you don’t own either of these. This last point bears repeating: it doesn’t matter if the structure or the trait belongs to a library that you didn’t write. You will still be able to make that structure conform to that trait.

For example, if we want to implement a last_digit() function on the type u8:

trait LastDigit {
    fn last_digit(&self) -> u8;
}

impl LastDigit for u8 {
    fn last_digit(&self) -> u8 {
        self % 10
    }
}

fn main() {
    println!("Last digit for 123: {}", 123u8.last_digit());
}

I might be biased about this feature because unless I am mistaken, I was the first person to suggest a similar functionality for Kotlin back in 2016 (link to the discussion).

First of all, I find the Rust syntax elegant and minimalistic (even better than Haskell’s and arguably, better than the one I proposed for Kotlin). Second, being able to extend traits this way unlocks a lot of extensibility and power in how you can model problems, but I’m not going to dive too deep into this topic since it would take too long (look up “type classes” to get a sense of what you can achieve).
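As a small taste of that type-class style, here is my own sketch (not from the article): once the LastDigit trait exists, you can add an impl for a second type and write functions that are generic over anything conforming to the trait. The u32 impl and the ends_in_zero function are illustrative additions.

```rust
trait LastDigit {
    fn last_digit(&self) -> u8;
}

impl LastDigit for u8 {
    fn last_digit(&self) -> u8 {
        self % 10
    }
}

// Extending a second type we don't own, after the fact.
impl LastDigit for u32 {
    fn last_digit(&self) -> u8 {
        (self % 10) as u8
    }
}

// Generic over any type that conforms to LastDigit: the type-class style.
fn ends_in_zero<T: LastDigit>(n: T) -> bool {
    n.last_digit() == 0
}

fn main() {
    println!("{}", ends_in_zero(120u32)); // prints "true"
    println!("{}", ends_in_zero(7u8));    // prints "false"
}
```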

This approach also allows Rust to mimic Kotlin’s extension functions while providing a more general mechanism to extend not just functions but types as well, at the expense of a slightly more verbose syntax.

In a nutshell, you have the following matrix:

                     Kotlin                       Rust
Extension function   fun Type.function() {...}    Extension trait
Extension trait      N/A                          Extension trait

cargo

This probably comes as a surprise since, with Gradle, Kotlin already has a very strong build and package manager. The two tools certainly cover the same functional surface area, allowing you to build complex projects while also managing library downloading and dependency resolution.

The reason why I think cargo is superior to Gradle is its clean separation between the declarative and the imperative side. In a nutshell, standard, common build directives are specified in the declarative Cargo.toml file, while ad hoc, more programmatic build steps are written directly in Rust in a file called build.rs, using Rust code calling into a fairly lightweight build API.
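To make the split concrete, here is a minimal sketch of the declarative side (the crate name and dependency are made up for illustration, not taken from any real project); anything that needs actual logic would go into build.rs instead:

```toml
# Cargo.toml: the declarative side of the build
[package]
name = "my-project"   # hypothetical crate name
version = "0.1.0"
edition = "2018"

[dependencies]
serde = "1.0"         # example dependency
```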

In contrast, Gradle is a mess. First because it started out being specified in Groovy and now also supports Kotlin as the build language (a transition that is still ongoing, years after it started), but also because the documentation for both is still incredibly bad.

By “bad”, I don’t mean “lacking”: there is a lot of documentation, it’s just… bad: overwhelming, mostly outdated or deprecated, requiring hundreds of lines of copy/paste from StackOverflow as soon as you need something off the beaten path. The plug-in system is very loosely defined and basically lets all plug-ins access whatever they feel like inside Gradle’s internal structures.

Obviously, I am pretty opinionated on this topic since I created a build tool inspired by Gradle but using more modern approaches to syntax and plug-in resolution (it’s called Kobalt). But independently of this, I think cargo strikes a very fine balance: a flexible build and dependency manager that covers all the default configuration adequately without becoming overwhelmingly complex as your project grows.

u8, u16, …

In Rust, number types are pretty straightforward: u8 is an 8-bit unsigned integer, i16 is a 16-bit signed integer, f32 is a 32-bit float, etc.

This is such a breath of fresh air to me. Until I started using these types, I had never quite realized how uncomfortable I had always been with the way C, C++, Java, etc. define their number types. Whenever I needed a number, I would use int or Long by default. In C, I sometimes went as far as long long without really understanding the implications.

Rust forces me to pay close attention to all these types, and the compiler relentlessly keeps me honest whenever I try to perform casts that could lead to bugs. I really think that all modern languages should follow this convention.
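As a small illustration (my own sketch, not from the article), narrowing conversions must be spelled out explicitly with as, and they truncate if you insist:

```rust
fn main() {
    let big: u16 = 300;
    // let small: u8 = big;  // rejected by the compiler: no implicit narrowing
    let small = big as u8;   // explicit cast: truncates 300 to 300 % 256 = 44
    println!("{}", small);   // prints "44"
}
```

The cast compiles, but the compiler forced you to write it, so the truncation is a conscious decision rather than a silent bug.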

Compiler error messages

Not to say that Kotlin’s error messages are bad, but Rust has certainly set a new standard here, in multiple dimensions.

TestNG is a project I started around 2004 with the sole intent of mixing things up. I wanted to show the Java world that we could do better than JUnit. I had no expectation that anyone would like or adopt TestNG: it was a lab project. An experiment. All I wanted to do was to show that we could do better. I was genuinely hoping that the JUnit team (or whatever was left of it) would take a look at TestNG and think “Wow, never thought of that! We can incorporate these ideas in JUnit and make it even better!”.

This is my goal with this couple of posts. I would be ecstatic if these two very, very different worlds (the Rust and Kotlin communities) would pause for a second from their breakneck development pace, take a quick look at each other, even though they really have no interest in doing so, and realize “well… that’s interesting… I wonder if we could do this?”.

What Rust could learn from Kotlin

Update: Part Two is up.

When I started studying Rust, I didn’t expect to like it.

Ever since I heard about it, many years ago, I’ve always had a tremendous amount of respect for the design philosophy and the goal that Rust was trying to achieve, all the while thinking this is not the language for me. When I switched from ten years of gruesome C++ to Java back in 1996, I realized that I was no longer interested in low level languages, especially languages that force me to care about memory. My brain is wired a certain way, and that way makes low level considerations, especially memory management, the kind of problem that I derive little pleasure working on.

Ironically, despite my heavy focus on high level languages, I still maintain a healthy fascination for very low level problems, such as emulators (I have written three so far: CHIP-8, 8080 Space Invaders, and Apple ][, all of which required me to become completely fluent in various assembly languages), but for some reason, languages that force me to care about memory management have always left me in a state of absolute dismissal. I just don’t want to deal with memory, ok?

But… I felt bad about it. Not just because the little Rust I knew piqued my curiosity, but also because I thought it would be a good learning exercise to embrace its design goals and face memory management head on. So I eventually decided to learn Rust, more out of curiosity and to expand my horizons than to actually use it. And what I found surprised me.

I ended up liking writing code in it quite a bit. For a so-called “systems language”, its design was a breath of fresh air and proof that its creators have not only a vision but also a healthy knowledge of programming language theory, which was refreshing after some… other languages that have appeared in the past fifteen years. I will resist naming names, but I’m sure you know what I’m talking about.

This article is not intended to start a language war. I love both Kotlin and Rust. They are both great languages, both with some flaws, and I am extremely happy to be able to claim a decent understanding of both of them, and to feel equally comfortable starting new projects in either, whichever is the best language for the job.

But these languages have followed different paths and ended up making different compromises, which makes their simultaneous study and comparison extremely interesting to me.

It took me a while to select which features I wanted to include in this list but eventually, I narrowed my selection criterion to a very simple one: it has to be a feature that will not get in the way of Rust’s main value propositions (close to optimal memory management, zero cost abstractions). That’s it.

I think all the features that I describe in this article are of the cosmetic, but crucial, variety. They will enhance the readability and writability of the language without compromising Rust’s relentless pursuit of zero cost memory management. However, since I’m obviously not familiar with the internals of the Rust compiler, some of these might indeed compromise Rust’s laser focus on optimal memory management, in which case I’d love to be corrected.

Enough preamble, let’s dig in. To give you an idea of what lies ahead, here are the Kotlin features that I’ll be discussing below:

  • Short constructor syntax
  • Overloading
  • Default parameters
  • Named parameters
  • Enums
  • Properties

Constructors and default parameters

Let’s say we want to create a Window with coordinates and a boolean visibility attribute, which defaults to false. Here is what it looks like in Rust:

struct Window {
  x: u16,
  y: u16,
  visible: bool,
}

impl Window {
  fn new_with_visibility(x: u16, y: u16, visible: bool) -> Self {
    Window {
      x, y, visible
    }
  }

  fn new(x: u16, y: u16) -> Self {
    Window::new_with_visibility(x, y, false)
  }
}

And now in Kotlin:

class Window(x: Int, y: Int, visible: Boolean = false)

That’s a huge difference. Not just in line count, but in cognitive overload. There is a lot to parse in Rust before you conceptually understand what this class is and does, whereas reading one line in Kotlin immediately gives you this information.

Admittedly, this is a pathological case for Rust since this simple example contains all the convenient syntactic sugar that it’s lacking, namely:

  • A compact constructor syntax
  • Overloading
  • Default parameters

Even Java scores better than Rust here, since it supports at least overloading (but fails on the other two features).

Even to this day, I whine whenever I have to write all this boilerplate in Rust, because you write this kind of code all the time. After a while, it becomes second nature to parse it, a bit like when you see getters and setters in Java, but it’s still unnecessary cognitive overload, which Kotlin has solved elegantly.

The lack of overloading is the most baffling to me. First, it forces me to come up with unique names, but mostly, overloading is a compiler feature that’s pretty trivial to implement, which is why most (all?) mainstream languages created these past twenty years support it. Why force the developer to come up with new names when the compiler can do it automatically and, by doing so, reduce the cognitive load on developers and make the code easier to read?

The common counterargument to overloading is interoperability: once the compiler generates mangled function names, it can become tricky to call these functions from other processes or other languages. But this objection is trivially resolved by allowing the developer to disable name mangling for specific cases (which is exactly what Kotlin does, and its interoperability with Java is outstanding). Since Rust already relies heavily on attributes, a #[no_mangle] attribute would fit right in (and guess what, it’s already been discussed).

Named Parameters

This is a feature that I consider more “nice to have” than essential, but optional named parameters can reduce a lot of boilerplate as well. They are especially effective at reducing the need for the builder pattern, since you can now limit the use of this design pattern to parameter validation, instead of reaching for it as soon as you need to build complex structures.
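For contrast, here is what the builder pattern, the usual Rust substitute for named and default parameters, might look like for the Window example. This is my own sketch; the names are illustrative:

```rust
struct Window {
    x: u16,
    y: u16,
    visible: bool,
}

struct WindowBuilder {
    x: u16,
    y: u16,
    visible: bool,
}

impl WindowBuilder {
    fn new(x: u16, y: u16) -> Self {
        // visible defaults to false, like the Kotlin default parameter.
        WindowBuilder { x, y, visible: false }
    }

    fn visible(mut self, value: bool) -> Self {
        self.visible = value;
        self
    }

    fn build(self) -> Window {
        Window { x: self.x, y: self.y, visible: self.visible }
    }
}

fn main() {
    // Reads almost like named parameters, but needed all that scaffolding.
    let w = WindowBuilder::new(0, 0).visible(true).build();
    println!("{}", w.visible); // prints "true"
}
```

The call site is pleasant enough, but notice how much code it took to emulate what Kotlin expresses in a single constructor declaration.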

Here again, Kotlin hits a sweet spot by allowing you to name parameters but not requiring you to use these names all the time (a mistake that both Smalltalk and Objective-C made). Therefore, you get the best of both worlds: most of the time, invoking a function is intuitive enough without naming the parameters, but now and then, names come in very handy to disambiguate complex signatures.

For example, imagine we add a boolean to the Window structure above to denote whether our window is black and white:

class Window(x: Int, y: Int, visible: Boolean = false, blackAndWhite: Boolean = false)

Without named parameters, calls to the constructor can be ambiguous to a reader:

val w = Window(0, 0, false, true) 



Kotlin lets you mix unnamed and named parameters to disambiguate the call:

val w = Window(0, 0, visible = false, blackAndWhite = true)

Note that in this code, x and y are not explicitly named (because their meaning is assumed to be obvious), but the boolean parameters are.

As an added bonus, named parameters can be used in any order, which reduces the cognitive load on the developer since you no longer need to remember in which order these parameters are defined. Note also that this feature combines harmoniously with default parameters:


val w = Window(0, 0, blackAndWhite = true)

In the absence of this feature, you have to define an additional constructor in your Rust structure for each combination of parameters that you want to support. If you are keeping count, you now need four constructors:

  • x, y
  • x, y, visible
  • x, y, black_and_white
  • x, y, visible, black_and_white

You can see how this quickly leads to a combinatorial explosion of functions for something which should realistically only take one line of code, as Kotlin demonstrates.
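One common Rust workaround, sketched below (my own code, not from the article), is to lean on the Default trait and struct update syntax, which recovers some of the convenience at the cost of making all the fields public:

```rust
// Simulating default parameters with #[derive(Default)] and
// struct update syntax (..), at the cost of public fields.
#[derive(Default)]
struct Window {
    x: u16,
    y: u16,
    visible: bool,
    black_and_white: bool,
}

fn main() {
    // Only mention the fields that differ from their defaults.
    let w = Window { x: 0, y: 0, black_and_white: true, ..Default::default() };
    println!("{} {}", w.visible, w.black_and_white); // prints "false true"
}
```

This avoids the combinatorial explosion of constructors, but it exposes the fields directly, which runs into the encapsulation problem discussed in the Properties section.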

Enums

For all the (mostly justified) criticism that Java receives because of its design, there are a few features that it supports that are arguably best in class, and in my opinion, Java enums (and by extension, Kotlin’s as well) are the best designed enums that I have ever used.

And the reason is simple: Java/Kotlin enums are very close to being regular classes, with all the advantages that these classes bring, with Kotlin’s enums being a superset of Java’s, so even more powerful and flexible.

Rust’s enums are almost as good, but they omit one critical component that makes them not as practical as Java’s: they don’t support values in their constructor.

I’ll give a quick example. One of my recent projects was to write an Apple ][ emulator, which includes a 6502 processor emulator. Processor emulation is a pretty easy problem to solve: you define opcodes with their hexadecimal value, string representation, and size, and you implement a giant switch to match the bytes that you read from the file against these opcodes.

In Kotlin, you could define these opcodes as enums as follows:

enum class Opcode(val opcode: Int, val opName: String, val size: Int) {
  BRK(0x00, "BRK", 1),
  JSR(0x20, "JSR", 3),
  // ...
}

While Rust’s enums are pretty powerful overall (especially when coupled with Rust’s destructuring match syntax), they only allow you to define signatures for each of your enum variants (which Kotlin supports too), but you can’t define parameters at the enum level, so the code above is just impossible to replicate with a Rust enum.

My solution to this specific problem was to first define all the opcodes as constants and put them in a vector as tuples:

pub const BRK: u8 = 0x00;
pub const JSR: u8 = 0x20;
// ...

let ops: Vec<(u8, &str, usize)> = vec![
  (BRK, "BRK", 1),
  (JSR, "JSR", 3),
  // ...
];

And then enumerate this vector to create instances of an Opcode structure, and put these in a HashMap, indexed by their opcode value:

struct Opcode {
  opcode: u8,
  name: &'static str,
  size: usize,
}

let mut result: HashMap<u8, Opcode> = HashMap::new();
for op in ops {
  result.insert(op.0, Opcode { opcode: op.0, name: op.1, size: op.2 });
}

I’m sure there are various ways to reach the same result, but they all end up with a significant amount of boilerplate, and since it’s not really possible to use Rust’s enums here, we lose the benefits they have to offer. Allowing Rust enums to be instantiated with constant values would significantly decrease the amount of boilerplate and make the code a lot more readable.
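Another common workaround, again a sketch of my own rather than the article's code, keeps a plain enum and attaches the per-variant data through a method. The data Kotlin would put in the enum constructor ends up spread across a match:

```rust
enum Opcode {
    Brk,
    Jsr,
}

impl Opcode {
    // One match arm per variant, repeating what Kotlin expresses
    // directly in the enum constructor.
    fn info(&self) -> (u8, &'static str, usize) {
        match self {
            Opcode::Brk => (0x00, "BRK", 1),
            Opcode::Jsr => (0x20, "JSR", 3),
        }
    }
}

fn main() {
    let (code, name, size) = Opcode::Jsr.info();
    println!("{:#04x} {} {}", code, name, size); // prints "0x20 JSR 3"
}
```

This avoids the HashMap, but each piece of information about a variant now lives far away from the variant's declaration.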

Properties

This was another unpleasant step back: I still find it routinely painful in Rust to have to write getters and setters for all the fields I want to expose from a structure. We learned this lesson the hard way with Java, where even today in 2021, getters and setters are still alive and well. Thankfully, Kotlin does it right (and it’s not the only one: C#, Scala, Groovy, … get it right too). We know that having properties and universal access is of great value; it’s disappointing not to have this feature in Rust.

As a consequence, whenever you release code that is going to be used by third parties, you need to be very careful about making a field public, because once clients start referencing that field directly (for reading or writing), you no longer have the luxury of ever putting it behind a getter or a setter without breaking your callers. As a result, you will probably err on the side of caution and manually write getters and setters.
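For reference, here is a sketch (mine, not from the article) of the kind of boilerplate this implies in Rust today, with one field exposed through a getter and a setter:

```rust
struct Window {
    // Kept private so it can later be validated or computed
    // without breaking callers.
    visible: bool,
}

impl Window {
    fn visible(&self) -> bool {
        self.visible
    }

    fn set_visible(&mut self, value: bool) {
        self.visible = value;
    }
}

fn main() {
    let mut w = Window { visible: false };
    w.set_visible(true);
    println!("{}", w.visible()); // prints "true"
}
```

With properties and universal access, the field could start out public and move behind accessors later without changing a single call site.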

We know better today, and I hope Rust will adopt properties at some point in the future.

Conclusion

So this is my list. None of these missing features have been an obstacle to Rust’s meteoric rise, so they are obviously not critical, but I think they would contribute to making Rust a lot more comfortable and pleasant to use than it is today.

It should come as no surprise that Kotlin could also learn a few things from Rust, so I’m planning to follow up with a reverse post analyzing some features from Rust that I wish Kotlin had.

Special thanks to Adam Gordon Bell for reviewing this post.


Refactoring a dynamically typed language: do it safely or automatically, but not both

I was recently having a discussion about refactoring dynamically typed languages and I was struck by the amount of misconceptions that a lot of developers still have on this topic.

I expressed my point in this article from fifteen years ago(!). Not much has changed, but the idea that it is impossible to safely and automatically refactor a language that doesn’t have type annotations is still not widely accepted, so I thought I would revisit my point and modernize the code a bit.

First of all, my claim:

In languages that do not have type annotations (e.g. Python, Ruby, Javascript, Smalltalk), it is impossible to perform automatic refactorings that are safe, i.e., that are guaranteed to not break the code. Such refactorings require the supervision of the developer to make sure that the new code still runs.

I decided to adapt the snippet of code from my previous article and write it in Python. Here is the small example I came up with:

from random import random

class A:
    def f(self):
        print("A.f")

class B:
    def f(self):
        print("B.f")

if __name__ == '__main__':
    if random() > 0.5:
        x = A()
    else:
        x = B()
    x.f()

Pretty straightforward: this code will call the function f() on either an instance of class A or B.

What happens when you ask your IDE to rename f() to f2()? Well, this is undecidable. You might think it’s obvious that you need to rename both A.f() and B.f(), but that’s just because this snippet is trivial. In a code base containing hundreds of thousands of lines, it’s plain impossible for any IDE to decide which functions to rename with a guarantee of not breaking the code.
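To make the undecidability concrete, note that the method name may not even appear literally at the call site. This sketch (mine, not from the article) calls the method through getattr:

```python
class A:
    def f(self):
        return "A.f"

# The method name is data computed at runtime: a rename tool that changes
# A.f to A.f2 cannot know it must also rewrite this string.
method_name = "f"
result = getattr(A(), method_name)()
print(result)  # prints "A.f"
```

Renaming A.f to A.f2 silently breaks this call, and no static analysis can catch it in general, since method_name could come from user input, a config file, or arbitrary computation.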

This time, I decided to go one step further and to actually prove this point, since so many people are still refusing to accept it. So I launched PyCharm, typed this code, put the cursor on the line x.f() and asked the IDE to rename f() to f2(). And here is what happened:

from random import random

class A:
    def f2(self):
        print("A.f")

class B:
    def f(self):
        print("B.f")

if __name__ == '__main__':
    if random() > 0.5:
        x = A()
    else:
        x = B()
    x.f2()

PyCharm renamed the first f() but not the second one! I’m not quite sure what the logic is here, but the point is that this code is now broken, and you will only find out at runtime.

This observation has dire consequences for the suitability of dynamically typed languages for large code bases. Because you can no longer safely refactor such code bases, developers become a lot more hesitant to perform these refactorings, since they can never be sure how exhaustive the tests are, and in doubt, they will decide not to refactor and let the code rot.

Update: Discussion on reddit.

Malware on my Android phone!

2021-01-09 11:01:14.651 3655-4415/? I/ActivityTaskManager: START u0 {act=android.intent.action.VIEW dat=https://vbg.dorputolano.com/... flg=0x10000000 cmp=org.adblockplus.browser.beta/com.google.android.apps.chrome.IntentDispatcher} from uid 10237

Ha HA! I got you now. I have a uid, which I grepped for in the output, and I finally identified my target:

2021-01-09 11:01:13.810 3655-3655/? I/EdgeLightingManager: showForNotification :
isInteractive=false, isHeadUp=true, color=0,
sbn = StatusBarNotification(pkg=com.qrcodescanner.barcodescanner user=UserHandle{0} id=1836 tag=null 
key=0|com.qrcodescanner.barcodescanner|1836|null|10237: Notification(channel=sfsdfsdfsd pri=2 contentView=null
vibrate=null sound=null defaults=0x0 flags=0x90 color=0x00000000 vis=PRIVATE semFlags=0x0 semPriority=0 semMissedCount=0))


So the package name is “com.qrcodescanner.barcodescanner”. It looks like the rogue app disguises itself as a QR barcode scanner. I took another look at the list of my apps and sure enough, I quickly located the offending application. I uninstalled it, and a few hours later, I was happy to observe that the URLs stopped popping up.

Back to DefCon 5

I went back to the Play Store and tried to find the application I had uninstalled but couldn’t, which indicates that Google probably removed it from its store some time ago. The malware I activated most likely side-loaded it after I unwittingly approved its installation.

Some forensic research revealed an article from 2018 discussing such a malware QR Code Reader application, but the package and the mode of operation don’t quite match what I found. I am probably dealing with a copycat.

Looking back, I feel pretty disappointed that I had to go through all these steps to get rid of a simple scam application. What would a regular user do?

Conclusions and suggestions

  • Listing the apps installed on my phone should give me the option to sort them by “Latest installed”. I am pretty sure that if I had had this option and I had seen a QR Code Scanner installed just a few days ago, it would have immediately grabbed my attention. As it is, the way Android lists the installed apps is pretty useless for this purpose.
  • MalwareBytes was completely useless and I immediately uninstalled it when I realized this fact. The problem is that it was probably just looking for malware code signatures inside the packages instead of just looking at which apps I had installed.
  • Google Play Protect was also completely unhelpful, which was a big disappointment. First because Google certainly knows which applications they removed from their store for malware reasons, but even so, I would expect Google Play Protect to at least flag any app it finds on my phone that is not on their store. Such an app is not necessarily malware, but it should certainly be flagged.
  • Google Play Protect could also do some behavior profiling to analyze what apps are doing in the background. A service launching recurring VIEW intents on web sites in the background should have raised a flag to the system.


A Chip-8 emulator written in Kotlin

cracking old school Apple ][ games, but mostly because an emulator has always seemed to me to be a great mix of technical challenge with a very rewarding feeling as you make progress. So I made it my week end project to write a Chip-8 emulator.

Chip-8 is a very popular CPU to emulate and usually the first project that people new to this exercise undertake, so it was an easy choice. The spec is short and, except for a few tiny details, very clear on how to implement this CPU. It’s graphical too, which is important from a reward standpoint. I was absolutely thrilled when I saw the beginning of the Space Invaders welcome screen appear after I had implemented just a few opcodes.

You can find the emulator on GitHub with all the technical details.

I am very tempted to work on a harder emulator now, from a real console. I am fluent in 6502 so maybe SNES, or even an Apple ][…