Archive for January, 2004

More on super and inheritance

There were quite a few interesting comments on my previous entry, both on my
weblog and on TheServerSide thread.  I will address them in turn and I’ll take
this opportunity to clarify my position on inheritance.

Howard writes:

I tend to make as many of my methods private as possible. I occasionally even
use final on non-private methods (but rarely).

This is a very good practice.  Interestingly, it’s a lesson that we
learned from C++, where the community actually took it one step further.  Back
then, I remember reading an article by a C++ guru who actually recommended
maximizing the number of static methods in your code.  This article
caused quite a stir, as you can guess, since a lot of developers equate static
methods with global variables.  Nevertheless, the author pointed out that
static methods are the most decoupled methods you can have in your code.
Something to think about.

Vincent says:

I personnally try not to extend specialized TestCase. I prefer to work with
Suites. I think this is the "official" way to extend JUnit

I don’t know how official it is, but point taken, Vincent.  I think I’ll
head this way as well, maybe it will decrease the amount of frustration I have
with JUnit 🙂

Hristo is more radical:

Yes inheritance IS EVIL and should almost always be replaced with
compositions (or AOP introductions).

I disagree with this, which is too radical for my taste.  Neither
inheritance nor delegation/composition/introduction is a silver bullet.  I
am not going to run through an exhaustive list of their pros and cons, but I
think one of the salient points that should make you choose one over the other
is typing.  When you extend, your subclass can be substituted for its parent
class (also referred to as the Liskov Substitution Principle).  There are quite
a few cases where such a property is not just convenient:  it is the only sound
design choice.  On the other hand, delegation is more flexible and more
dynamic, which can also be a requirement.
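
To make the typing argument concrete, here is a minimal sketch.  All the names
in it (Shape, Circle, Shapes, AreaFormatter, Report) are hypothetical and only
exist for the example.  The first half relies on substitutability: any subclass
of Shape can be passed where a Shape is expected.  The second half uses
delegation: the collaborator can be swapped while the program runs.

```java
import java.util.List;

// Hypothetical types, for illustration only.
abstract class Shape {
    abstract double area();
}

class Circle extends Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    double area() {
        return Math.PI * radius * radius;
    }
}

class Shapes {
    // Inheritance: this method accepts any subclass of Shape, unchanged.
    static double totalArea(List<Shape> shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }
}

interface AreaFormatter {
    String format(double value);
}

class Report {
    private AreaFormatter formatter;

    Report(AreaFormatter formatter) {
        this.formatter = formatter;
    }

    // Delegation: the behavior can be replaced on a live object.
    void setFormatter(AreaFormatter formatter) {
        this.formatter = formatter;
    }

    String describe(Shape s) {
        return formatter.format(s.area());
    }
}
```

Neither half is better in the abstract: the first gives you a type that the
compiler can check, the second gives you behavior that can change at run time.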

Hristo also uses this interview of James Gosling to bash inheritance, whereas
Gosling actually seems to be quite fond of inheritance:

I personally tend to use inheritance more often than anything else.

Then Hani sets the debate back on track with his usual lucid observation:

Inheritance isn’t evil, people who don’t understand it or design for it are.

Which is pretty much what I said in the paragraph above, although in
different terms since I couldn’t dream of ever reaching Hani’s mastery of
concision and punch-packing.

Bo notices:

Um, as far as I can tell, you haven’t made a case against not calling super
you’ve made a case about why you should put initialization logic in
constructors. (Hint: base class constructors always get called).

Right, constructors are always called and the invocation of super in their
code is enforced by the compiler, which is why I made them an exception in my
original article ("whenever you feel the need to call super inside a method
that is not a constructor, it’s a code smell").  And I agree that
initialization logic should be in constructors, but it’s not always
achievable.  Sometimes, extra initialization has to happen after the object
is created.

It’s too bad that java doesn’t have an overrides keyword yet (1.5 will
introduce @Overrides)

This keyword won’t change anything about the problem at hand, except that the
compiler might be able to notice a typo.  But there will certainly be no
implicit call to super.
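
A minimal sketch of that point, using the annotation as it shipped in Java 5
(where it is spelled @Override).  BaseFixture and MyFixture are hypothetical
classes invented for the example.  The annotation makes the compiler check that
a parent method is really being overridden, but the parent's body still only
runs if the subclass calls it explicitly.

```java
// Hypothetical classes, for illustration only.
class BaseFixture {
    boolean connected = false;

    void setUp() {
        connected = true; // stands in for some vital initialization
    }
}

class MyFixture extends BaseFixture {
    boolean ready = false;

    @Override // compile error if the name or signature does not match a parent method
    void setUp() {
        ready = true; // no super.setUp() here, so the parent's code never runs
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        MyFixture fixture = new MyFixture();
        fixture.setUp();
        // prints "ready=true, connected=false"
        System.out.println("ready=" + fixture.ready + ", connected=" + fixture.connected);
    }
}
```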

IMO calling super should be the first thing you do when you override a method

In my experience, close to none of the code I work with or read ever does
that.  Most of the methods that override a parent method simply replace the
logic of the overridden method.  You might call that misuse of inheritance,
and I won’t necessarily disagree with you, but this is a different topic.

 

Don’t call super

Okay, I found why my DBUnit tests were not working:  I was overriding setUp() but not calling its parent version.

The fix was simply to invoke super.setUp() in my own setUp():

public void setUp() {
  try {
    Class.forName(JDBC_DRIVER).newInstance();
    super.setUp();

  …

Note that the order of these two instructions is important, or
DatabaseTestCase will be unable to locate your driver. 
Seems obvious, but I didn’t get it right the first time.

Now, all this makes me angry for a lot of reasons.

First of all, this is the kind of design flaw that has been around since the
C++ days.  I weblogged about it a while ago:

This kind of pattern
is similar to seeing methods invoke their super counterpart, a definite code
smell in my book (if you override the said method and forget to call super,
everything breaks).

I’ll restate my point:  whenever you feel the need to call super inside
a method that is not a constructor, it’s a code smell.  If on top of that, this method can be
overridden by subclasses, you absolutely need to get rid of that constraint
because I guarantee you that someone (a user or even yourself) will break that
contract.

How do you solve this problem?  With a technique called "hooks".

In this particular example, DatabaseTestCase.setUp() performs some very important
initialization logic.  If this code is not run, then DBUnit breaks. 
As simple as that.  The problem is that subclasses are very likely to
override setUp(), since it’s the recommended and documented way to set up your
tests in JUnit.

When you face such a situation, you should consider moving the vital code into
a method that subclasses cannot override, and then have this method invoke one
or several "hooks".  The hook would be, in this case, "onSetUp()".  This way,
subclasses can be notified when setUp() is happening but they can’t override
the important initialization that’s happening in it.
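
Here is a minimal sketch of what such a hook could look like.  This is not
DBUnit's actual code:  DatabaseFixture, openConnection() and PersonTest are
hypothetical stand-ins, and only the onSetUp() name comes from the paragraph
above.  The vital method is declared final so that it cannot be overridden, and
subclasses get an empty method to fill in instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a framework class such as DatabaseTestCase.
abstract class DatabaseFixture {
    private boolean connected = false;

    // final: subclasses cannot override setUp(), so the vital code always runs.
    public final void setUp() {
        openConnection(); // vital initialization, never skipped
        onSetUp();        // hook: subclasses add their own setup here
    }

    private void openConnection() {
        connected = true; // placeholder for the real work
    }

    // The default hook does nothing; overriding it cannot break setUp().
    protected void onSetUp() {
    }

    public boolean isConnected() {
        return connected;
    }
}

class PersonTest extends DatabaseFixture {
    final List<String> log = new ArrayList<>();

    @Override
    protected void onSetUp() {
        // No super call needed: the parent has already done its part.
        log.add("connected=" + isConnected());
    }
}

class HookDemo {
    public static void main(String[] args) {
        PersonTest test = new PersonTest();
        test.setUp();
        System.out.println(test.log); // prints [connected=true]
    }
}
```

A real design would probably need a second hook that runs before the vital
code, since something like loading the JDBC driver has to happen before the
parent's initialization.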

Admittedly, this technique has limits when the hierarchy of subclasses
deepens, and there is no easy way to achieve that, so DBUnit is not
completely to blame.

The real culprit is JUnit which was designed without realizing that
subclasses of TestCase can be either more specialized parent test classes or
real tests, and
that subclassing rules should be different depending on which class you are
implementing.

The more I work with JUnit, the more angry <blam> I <blam> get.

 

Another game

For all of you bored with hitting penguins, here is another little game
(HTML + JavaScript).  Quite simple and addictive.  My record is about 26
seconds.  I recommend running it in IE, since Mozilla seems to experience a few
glitches with this code.

 

Major spam attack

I have just been the target of a massive spam comment attack. On the night of
January 23rd, my weblog received about two hundred and fifty (250!) spam
comments.  The sheer size of it is not the only thing that worries me: 
it’s the way it was done.

Usually, MT-Blacklist makes it trivial to get rid of such spam and it also
allows you to despam your weblog retroactively (i.e. not just the comment that
was just posted and for which you just received an email notification). The
problem in this particular attack is that these 250 comments

  • All came with a different email address.
     
  • Were posted all across my weblog, not just on one entry (they commented
    on about thirty posts).
     
  • But worst of all, they advertised a wide range of web sites, not just
    one.

This last point is the reason why MT-Blacklist was a little less effective at
getting rid of that spam than it usually is, since MT-Blacklist despams based on
the URL of the poster or its IP address (most of the time useless). Ideally, I
would have liked MT-Blacklist to have an option "Add the websites contained in
the last 250 comments to my blacklist and despam my entire weblog", but since
this is not supported, I had to do some manual work.

Basically, I went through my Inbox and blacklisted the domains one by one.
Once I thought I had found most of them (going through 30-40 emails), I asked
MT-Blacklist to despam my entire weblog.  Then I repeated this procedure
until the last comment posted on my welcome page was a legitimate comment again. 
Total time, about a half hour.  Not too bad.

Now, all this made me think a little bit about the spam comment phenomenon.
Obviously, the blacklist method will not scale for much longer, so how could I
stop the problem at its source: preventing spammers from posting in the first
place?

This is obviously impossible, so maybe I could push the reasoning one step
further and make sure they don’t find my weblog in the first place… The
question now is: how did they find my weblog?

If I were a spammer looking for weblogs to comment on, I would start
by determining what seems to be the de facto weblogging software. Movable Type
is an easy choice. Then I would take a look at the source and find how comments
are posted. I would quickly find out that the main entry point is called "mt-comments.cgi"
and I would google it.

So I did this, and… holy smurf on a snowboard! My weblog appears in sixth
position!!!  Now
things are slowly falling into place. I think the first measure I will take is
to rename mt-comments.cgi to something different (how about vxtyzb.cgi?) and
I will patch my installation of Movable Type to use this new page. Hopefully, this
shouldn’t be too hard.

I have a few other ideas to make these bastards’ lives harder, but they will
have to wait for a later entry.

Update:  I made the change.  It’s a simple matter of modifying
mt.cfg, renaming the script and rebuilding the whole site.  I am very happy
to report that if you click on the link shown by the google request above, it
will now 404.  Yeah.

 

Trace wizardry

Indeed, Cameron’s trick is pretty cool.  I have been using a similar trick for
a while now, except that when I wrote it, we didn’t have StackFrame support, so
it was all about dirty manual parsing of the stack trace.

However, my technique is different from Cameron’s in the following ways:

  • I don’t print the name of the variable.  Most of the time, I’m not
    tracing a variable (could be an array or the result of a method) and anyway,
    the name of the variable is not that important.
     
  • However, I use the trick to print the name of the class.  This is
    the most important part, in my opinion.  I can’t count the number of
    times I painfully looked for a particular trace in my source code in
    order to remove it.  IDEs make it a little easier to do that now, but
    they also have limits (like when the text happens to be i18n’ed and
    is therefore nowhere to be found in your *.java files).
     
  • And finally, I give my trace functions a very identifiable name, for a
    reason related to the previous point.  Using a name like "trace()" or
    "p()" makes it challenging to find all the places where you invoke the
    trace, so I typically use "ppp()".  You can’t type this by accident 🙂
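
For illustration, here is a minimal sketch of a trace helper along those lines.
It is an assumption about how such a helper could be written today, not the
code described above:  it relies on Throwable.getStackTrace() (available since
Java 1.4) rather than on manual parsing, and the Trace, OrderService and
TraceDemo classes are hypothetical.  Only the ppp() name comes from the list.

```java
final class Trace {
    private Trace() {
    }

    // Walks the stack and returns the first class name that is not Trace itself.
    static String callerClassName() {
        String self = Trace.class.getName();
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            if (!frame.getClassName().equals(self)) {
                return frame.getClassName();
            }
        }
        return "<unknown>";
    }

    // Builds the trace line, prefixed with the name of the calling class.
    static String line(Object value) {
        return "[" + callerClassName() + "] " + value;
    }

    // Deliberately odd name: easy to search for, hard to type by accident.
    static void ppp(Object value) {
        System.out.println(line(value));
    }
}

class OrderService {
    String describe() {
        return Trace.line("total=" + 42);
    }
}

class TraceDemo {
    public static void main(String[] args) {
        Trace.ppp("starting");
        System.out.println(new OrderService().describe());
    }
}
```

Skipping every frame that belongs to Trace, instead of hard-coding a stack
depth, keeps the helper correct even if more private methods are added to it
later.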

All that being said, IDEs make this kind of hack almost useless these days. 
For example, I have a template called "ppp" and all I need to do is type "ppp<space>"
at any moment to have the trace method automatically implemented with the class
name and everything else in it.

But it’s nowhere near as elegant as the trace walking.

 

More DBUnit woes

By an interesting coincidence, DBUnit released version 2.0 yesterday, so I
immediately installed it.  The good news is that it didn’t require any
change in my code (probably because I am still in the early experimentation
stage at this point, but I understand that some major configuration changes have
been made).

That being said, my first contact with version 2.0 is not good at all. 
For example, I made a typo in my XML dataset and misspelled a column name:

<PERSON last_nam = "Molinier" first_name = "David" middle_name = "L" />

With DBUnit 1.5.6, the punishment is immediate:

java.sql.SQLException: General error, message from server: "Column ‘last_name’
cannot be null"

(Note that the error is not that the column name is incorrect, which
is already not looking good).

But with DBUnit 2.0, the error is silently discarded and I end up with an
inconsistent database:

+-----------+------------+-------------+
| last_name | first_name | middle_name |
+-----------+------------+-------------+
|           | David      | NULL        |
+-----------+------------+-------------+

Second, I can’t seem to initialize the middle_name column, with either 1.5.6
or 2.0.  No error message, no indication whatsoever of
what went wrong.  Of course, I am pretty confident the spelling is right.

Strike three for DBUnit.  Very disappointed.

 

DBUnit doubts

I was looking forward to converting my database tests to
DBUnit, created by Manuel Laflamme. 
The idea of being able to specify my test data in an external file was
appealing, as was the fact that DBUnit is a thin layer on top of JUnit, so I was
confident I would feel comfortable with the product.

Unfortunately, things turned out differently.

First of all, I still haven’t been able to get it to work.  For some
strange reason, my getConnection() never gets invoked.  I am not extremely
worried about that, I know I will eventually figure it out, but why is it that
every single open-source product that I try never works as advertised out of the
box?  Why do I always have to become much more intimate with their source
base than I would like to?

Another sadly typical thing in open-source projects is that if you go to
DBUnit’s home page, there is no
obvious link to the documentation.  I give them points for putting the
Download link on top, but if I am trying to evaluate your product, why would I
care so much for Changes, FAQ, Getting Support, Source or JavaDocs?  Just
point me to a simple white paper of a few pages explaining why I should care
about your product.  To make matters worse, the only page that provides some
assistance tells you that the documentation can be found in the release.  Come
on, now, just make the darn thing available online and make sure it sits right
up next to the Download link.

Anyway.

The real problem is the idea behind DBUnit.  I started realizing that
specifying the test data in an external file didn’t make that much sense after
all.  If you are going to modify the said data, you will be modifying your
Java code as well, so the maintenance cost is pretty much the same in both
cases.  Except that if you initialize your test data in the code, you get
an additional way of testing your database code, and you are also probably
closer to the way your users will initialize their own database.

Another hint on the questionable premise of DBUnit can be read in the
author’s own comments.  He initially started with a generic XML format to
describe the data that your database should be initialized with.  Then, in
the next version, he makes the following observation:

The FlatXmlDataSet class has been introduced in version
1.2. I found that generic XML datasets were hard to write and to maintain. This
was particularly painful for tables having a lot of columns.

This new XML format is more generic but also dependent on your database
schema.  It’s progress over the first iteration, but it’s a pity that
Manuel didn’t push this realization to its conclusion:  you are trying to
model relational data, and XML is not a good way to do that because it is
hierarchical.  From a practical standpoint, I see very little difference in
verbosity between FlatXmlDataSet and plain Java code.  I would argue that a
dumb properties file would probably be the easiest choice:

person.row0 = "Beust", "Cedric", ""
person.row1 = "Purdy", "Harold", "C"
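
As a rough sketch of how little code such a format would need, the snippet
below reads it with java.util.Properties.  The RowFile class and its methods
are hypothetical, and the parsing is deliberately naive:  it assumes keys of
the form table.rowN numbered from zero without gaps, and values that never
contain a comma.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, for illustration only.
class RowFile {
    // Parses the text of a properties file such as the one shown above.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Returns the rows of one table, in row0, row1, ... order.
    static List<String[]> rows(Properties props, String table) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; ; i++) {
            String line = props.getProperty(table + ".row" + i);
            if (line == null) {
                break; // first missing index ends the table
            }
            String[] cells = line.split(",", -1);
            for (int c = 0; c < cells.length; c++) {
                cells[c] = unquote(cells[c].trim());
            }
            rows.add(cells);
        }
        return rows;
    }

    // Strips one pair of surrounding double quotes, if present.
    private static String unquote(String cell) {
        if (cell.length() >= 2 && cell.startsWith("\"") && cell.endsWith("\"")) {
            return cell.substring(1, cell.length() - 1);
        }
        return cell;
    }
}
```

Each String[] could then be bound to a prepared INSERT statement for the
table in question.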

I really want to like DBUnit, but at this point, I see very little added
value compared to writing my own framework on top of JUnit.

Can someone convince me otherwise?

 

Hit the penguin!

Note:  the title has nothing to do with my previous entry 😉

Stressed?  Try this little game.  Quite fun (my record is 321).  Hint:  you can
control both the speed and the angle of the swing.

2004 the year of Linux? I don’t think so

It’s quite amusing to see various pundits predicting that 2004 will be "the
year of Linux" (including Linus himself, but this shouldn’t come as a
surprise).  Never mind the fact that each of the past eight years has been
predicted to be the "year of Linux":  there are quite a few signs that make me
think that if anything, this year will be the year of Windows.

If Linux ever had a chance, I would place it two or three years ago. 
But now, in 2004, what do we see?  An increasing loss of market share for
Sun, the herald of UNIX if there is any still alive these days, casting a grim
shadow on the entire UNIX industry.  Red Hat’s recent withdrawals and
reversal of fortunes are not helping, nor is the inability of the Linux
community to agree on one user interface (I remember making this exact remark to
coworkers in 1995 about Linux, and in 1990 about UNIX in general).

It’s quite ironic that the only major user-oriented advance in the UNIX world
has been made by Apple, who single-handedly made UNIX credible for Joe-type
users.  But with their faltering 2% market share of personal desktops, I
can’t see how it really helps the cause that much.

On the flip side, Windows has never been so present.  First of all, the
existing offerings (Windows 2000 and XP) are going stronger than ever, to the
point where even die-hard Linux fans find it quite acceptable to work on these
two operating systems.

The "next generation" operating systems are still far away (Windows 2003 is
kind of here already and who can tell when Longhorn will actually be ready), but
they promise innovative features that get the developer community drooling (WinFS
and Indigo come to mind).

But if you look past the desktop, Microsoft appears as a more credible player
every day, especially in the mobile space.  Windows cell phones are making
shy but firm entries in our lives and Windows-based PDAs are increasingly
becoming a force to be reckoned with.

I have read some articles saying that 2004 will make or break Linux.  I am
predicting that nothing of the sort will happen.  Linux will keep the niche
place it has had for years now and 99% of computer users will simply and
quietly keep not caring about it.

 

IoC types

I am a bit puzzled by
Martin Fowler’s description of the difference between IoC type 1 (interface injection):

class MovieLister implements InjectFinder…
  private MovieFinder finder;
  public void injectFinder(MovieFinder finder) {
    this.finder = finder;
  }

and IoC type 2 (setter injection):

class MovieLister…
  private MovieFinder finder;
  public void setFinder(MovieFinder finder) {
    this.finder = finder;
  }

These two methods look exactly the same to me, except for the name of the
method that initializes the component and the interface implemented by the first
component (the second component could implement an interface as well).
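
Spelled out as a compilable sketch, the two styles look like this.  The empty
MovieFinder interface, the two lister class names and the Wiring helper are
assumptions added so that the snippet stands on its own;  they are not taken
from the article.  The one structural difference that shows up is that a
container can recognize the first component through its static type, whereas
it would have to find setFinder() by naming convention or by configuration.

```java
// Hypothetical minimal type so that the sketch compiles on its own.
interface MovieFinder {
}

// Type 1: the injection method is declared by an interface.
interface InjectFinder {
    void injectFinder(MovieFinder finder);
}

class InterfaceInjectedLister implements InjectFinder {
    private MovieFinder finder;

    @Override
    public void injectFinder(MovieFinder finder) {
        this.finder = finder;
    }

    MovieFinder finder() {
        return finder;
    }
}

// Type 2: a plain setter, with no interface involved.
class SetterInjectedLister {
    private MovieFinder finder;

    public void setFinder(MovieFinder finder) {
        this.finder = finder;
    }

    MovieFinder finder() {
        return finder;
    }
}

// Hypothetical container-side code.
class Wiring {
    // Injects the finder only into components that declare InjectFinder.
    static void wire(Object component, MovieFinder finder) {
        if (component instanceof InjectFinder) {
            ((InjectFinder) component).injectFinder(finder);
        }
    }
}
```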

Am I missing something?