karltk's neck of the woods


When I woke up this morning... No e-mail?

It had been disturbingly quiet in my inbox for a couple of days. I had seen fewer than a fistful of e-mails in almost two days, plus a few commit messages for stratego-dev. This was a bad omen, I knew, but I preferred to spend my time working on my dissertation rather than taking much heed. Woe unto me.

As I woke up one morning (today, in fact), I noticed that there were absolutely no new e-mails in my inbox, not even a friendly spam message. Something fishy was definitely afoot.

I put on my detective hat, and started deducing. My prime suspect: the .procmailrc file I edited a few days ago. It's not unheard of that I hose this file now and again when I think I'm in a hurry and decide that it's better to save early and save often, even on live systems. However, closer inspection and a few recollection exercises told me that I had received e-mail just fine after the edit.

On to the next theory. I'm subscribed to the mailing lists of a few open-source projects, among them Gentoo. Most, if not all, of these e-mails are not sent directly to my university address, but rather through an ISP. I store a copy of these e-mails there, in case of a rainy day like this (quite uncharacteristically, it's sunny today, but you get my point).

I sshed inconspicuously over to the relay machine and checked my inbox there. As I suspected: full of goodies! At this point, I got the sinking feeling I usually experience when having to deal with the university mail system. While it's probably great at Fighting Spam and Furthering the Cause, it often becomes rather overzealous in its Quest. This case, it would later turn out, was no exception.

At this point in time, I thought it prudent to ask: What would Postel do? I never met the guy, so I didn't have the faintest inkling. That didn't stop me from at least trying something. I opened a new terminal window, then telnetted over to rolf.uib.no and sent an e-mail to myself in the old-fashioned way: by writing SMTP commands by hand.
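For anyone who has not typed SMTP by hand, the sketch below lists the client-side lines of a minimal session in the order they are sent. It is an illustration only: the class name, host name and addresses are placeholders and not the ones used here, and the body is assumed to be a single line that does not start with a dot.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: builds the lines a client types in a minimal SMTP session.
// All names and addresses passed in are placeholders (assumptions).
final class SmtpByHand {
    /** Returns the client-side lines, in order, for one single-line message. */
    static List<String> commands(String clientHost, String from, String to,
                                 String subject, String body) {
        List<String> lines = new ArrayList<>();
        lines.add("HELO " + clientHost);        // introduce ourselves to the server
        lines.add("MAIL FROM:<" + from + ">");  // envelope sender
        lines.add("RCPT TO:<" + to + ">");      // envelope recipient
        lines.add("DATA");                      // ask to start sending the message
        lines.add("Subject: " + subject);       // a single header
        lines.add("");                          // blank line separates headers from body
        lines.add(body);
        lines.add(".");                         // a lone dot ends the message
        lines.add("QUIT");
        return lines;
    }

    public static void main(String[] args) {
        for (String line : commands("desktop.example.org", "me@example.org",
                                    "me@example.org", "test", "Is anybody home?")) {
            System.out.println(line);
        }
    }
}
```

The server answers each line with a numeric status; the acceptance reply to the lone dot is the one that matters for delivery.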

250 OK id=1HfAwM-00051d-SO

So no problem there. Then I tried to send myself an email using ye goode olde pine on the mail relay host. Nothing. Eaten alive. The mailmonster was definitely on the prowl.

I checked the maillog, and found a disturbing number of stat=Service unavailable entries. Not good. At this point, I sat down and poured myself a whiskey. Or, I would have if (1) I was a real private investigator, and (2) I liked whiskey, and (3) I had a free chair. Lacking all of these, I scratched my head substitutively, and thought: "What if I try to telnet from the relay host instead, and not from my desktop?". I tried that.

550 X-UiB-SpamReport: *.*.*.* is (rbl.uib.no) blacklisted Mail refused, see http://www.uib.no/it/rbl/

Bingo!

A quick scan through UiB's black list (which can only be reached if you're on the university network) showed that my relay host had been blacklisted when the problems started: 2007-04-19, at 07:43:16. (It must be noted that my growing suspicion was only sparked at around 11:31:21, when I woke up and checked my e-mail that day).

I rushed off an e-mail to the UiB postmaster, rerouted my e-mail through another relay, and now hope for the best.

Mail Hosting

If anybody knows of a good and reliable mail hosting service, possibly in the style of GMail, that allows me to write mail filters comparable to procmail (it is imperative that I can filter on arbitrary mail headers), please let me know.

Upcoming Seminar: Spoofax: A Development Environment for Software Transformations

As part of the PhD formalities, I'm required to hold a seminar on a topic of my own choosing. It's supposed to demonstrate to willing (or not-so-willing) listeners that I know how to talk scientifically and intelligibly about at least one subject.

I decided to present the Eclipse plugin for Stratego I've been hacking on for the last two years. It's becoming a rather interesting development environment, and I suspect that people with an interest in computer languages and/or development environments might actually find some of the material interesting.

Everybody, even (perhaps especially!) non-university folks, is of course welcome to drop in, if they have the time.


Software development is expensive primarily because of the associated
maintenance cost; recent estimates suggest that about 70-90% of the
total cost of a software product is due to maintenance. It is therefore
desirable to automate as many tasks as possible by supplying convenient and
powerful maintenance tools to the developers. This automation requires
the construction of software that analyses and transforms
other software.

Software transformation languages are programming languages designed
specifically for analysing and transforming software. They provide
language features and libraries that make it much simpler to automate
maintenance tasks. However, they are usually built on unfamiliar
programming paradigms, such as term or graph rewriting, and learning
them is often complicated by lack of good development environments.

In this talk, I will demonstrate Spoofax, an interactive program
transformation environment based on Eclipse. This environment supports
the development of stand-alone software transformation programs using
the Stratego software transformation language and the XT toolkit of
transformation components. I will demonstrate the applicability of
Spoofax and Stratego/XT through a series of transformation programs for
Java code.

Time: 14:15, Tuesday 8th of May (Updated Again)
Location: Lunchroom, 3rd Floor, Informatikkblokken, Høyteknologisenteret i Bergen
Cookies: Yes

If you're vaguely interested in dropping in, but don't know exactly where to go, contact me or drop by room 4152A (my office) on the fourth floor in the Informatics wing of HIB around 14:05, and I'll guide you.

LDTA 2007

This Sunday, I had the pleasure of attending the Seventh Workshop on Language Descriptions, Tools and Applications (LDTA 2007), held in Braga, Portugal. Leading up to the workshop, I wasn't very motivated since I felt that I couldn't afford any distraction from the thesis writing at this point. Also, it was impossible to get good flights to Oporto Airport, the closest international airport servicing Braga. I had to fly back from Lisbon Airport at 07:05, and the most suitable train from Braga arrived in Lisbon (Oriente -- the most amazing train station I've seen so far) at 00:06. Hurrah for spending another full night at an airport. Not!

In retrospect, however, I'm very happy that I went. I had never been to Portugal before, and it exceeded my expectations substantially. People were very friendly, and unlike the Spaniards, the Portuguese actually know how to communicate in languages other than their own -- I found that a mixture of French, English and faked Spanish generally worked very well for the simple navigational and nutritional questions. Vigorous hand waving and smiling usually cleared up any remaining doubts.

I was also served a very good hamburger by a newly converted Gentoo user who happened to run the burger place just across the street from my hotel. I had accidentally put on one of my Gentoo t-shirts that morning, and he confronted me about it when I was about to pay. Converts everywhere...;)

Workshop

I was pleasantly surprised at the quality of the papers and talks. I normally tend to nod off once in a while when the less related talks are held, but it wasn't difficult to stay focused with this year's program.

I found Eric van Wyk's talk on the Silver attribute grammar system interesting (paper coauthored with Derek Bodin, Jimin Gao and Lijesh Krishnan). Silver is used for implementing composable language extensions, examples including the now-compulsory SQL embedding into Java. Eric's group has been actively developing Silver for a while, and it's really motivating to see it evolving.

Another talk I really enjoyed was Shirley Goldrei's presentation of her experience paper with Leonard Hamey, titled Implementing a Domain-Specific Language using Stratego/XT. She's obviously a very experienced and skilled presenter. In fact, many, if not most, of the presenters I listened to at this workshop were very skilled. I wish all gatherings had presenters of this caliber.

Shirley's paper is important. I find that in our part of software engineering research, there's far too little contemplation and experience reviewing going on. Shirley's talk and paper are sorely needed, and she did a very good job of conveying a valuable story of what applying Stratego/XT is like in practice for non-compiler hackers.

Torbjörn held the talk on Development of a Modelica Compiler using JastAdd, and that was exactly when my brain decided to crash after the lack of sleep and hectic days leading up to LDTA. I don't remember too many of the details of the talk itself, but I have a note to self to look at this more closely, because it's definitely something I want to plug Stratego/J into using the POM adapter.

During the first research talk, Language Parametric Module Management by Paul Klint, Taeke Kooiker and Jurgen Vinju, I was busy fixing some last-minute bugs in my slides, so my attention was less than optimal. However, once I managed to replay the bits I had cached, I got very excited, because the talk (and paper) addresses a central concern in Spoofax that I've been struggling with -- how to maintain the build weave of all modules and definitions in a way that is consistent, up to date and can handle change events properly. I'll read the paper closely later in the week.

As I said, the other talks were also good. Martin always does very well with these things, but as I claimed outright during my own talk that "parsing stinks", I'll leave the detailed commenting on the remaining parsing papers to the experts:)

Porto

On the Monday after the workshop, I spent the day at the beach outside Porto. It was exactly what I needed to recharge my batteries. We wandered around from around 12:00 until about 19:00. At about 20:30, I jumped on the train to Lisbon, and arrived at the airport just in time to wait five hours for the check-in desk to open.

Pitfalls with String-based Debugging of Java Code

(This post was actually written last year, but sloth and the eighth cardinal sin, Drupal, conspired to delay its posting quite a bit.)

I have written a lot of Java code in the last year, practically all of it one form of program manipulation or another. I'd be the first to admit that object-oriented languages in general, and Java in particular, are rather poorly suited for this. Given the choice, I would much rather use a functional language with some decent pattern matching capabilities, but this material is for another (heated) debate.

A very useful, and sometimes problematic -- hence this posting -- technique that I've had good use for is tracing. I trace my programs by liberally inserting debug statements, and inspect the output to see that my program behaves as it should (some people think this should be done using aspects -- I've tried both, but have no clear preference yet).

In some situations, I needed to be perfectly "trace compatible" with an existing implementation (written in another language), and the most effective way I found to ensure this was to instrument both implementations with detailed tracing, i.e. logging, of the relevant methods along with selected arguments.
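Once both implementations emit traces, checking "trace compatibility" comes down to finding the first line where the two logs disagree. The following is a minimal sketch of that comparison; the class name, method name and sample trace lines are made up for illustration and are not taken from the actual tooling.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: compares two traces line by line.
// The trace contents used in main() are hypothetical.
final class TraceDiff {
    /** Index of the first line where the traces differ, or -1 if they are identical. */
    static int firstDivergence(List<String> expected, List<String> actual) {
        int common = Math.min(expected.size(), actual.size());
        for (int i = 0; i < common; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i;
            }
        }
        // If one trace is a strict prefix of the other, they diverge where the shorter one ends.
        return expected.size() == actual.size() ? -1 : common;
    }

    public static void main(String[] args) {
        List<String> reference = Arrays.asList("enter parse", "shift 3", "reduce 7");
        List<String> port = Arrays.asList("enter parse", "shift 3", "reduce 8");
        System.out.println("first divergence at line " + firstDivergence(reference, port));
    }
}
```

Reporting only the first divergence is usually enough, since everything after it tends to be noise caused by the same underlying difference.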

Initially, I started with debug statements looking like this:

debug("" + nrUniqueSymbols + " unique symbols");

The debug method looked like this:

private void debug(String s) {
    System.err.println("debug: " + s);
}

As it turns out (not surprisingly), this is dreadfully slow, but in trace mode, that's not necessarily much of a concern. The way I went from trace mode to production mode was to comment out the actual printing, like this:

private void debug(String s) {
    // System.err.println("debug: " + s);
}

However, this didn't change the speed very much, and I was a bit puzzled. Initially, I mistakenly assumed that the time was spent writing the string to stderr. After all, I/O is usually a lot slower than most other computations.

Don't compute waste

My good friend Nick pointed out that the string s must still be constructed. That is, everywhere there's a call to debug, the output string is computed, but never used. This should come as no surprise whatsoever, but if you have experience from languages with macros, you may commit my thinko and naively think that the input to debug() will never be evaluated if debug is empty. Of course this is complete nonsense: Java evaluates every argument before the call is made, and since toString() is allowed to have side-effects, the concatenation cannot simply be optimized away. It was clearly an embarrassing mistake to make.

This excessive String creation results in heaploads of short-lived String objects, which are totally unnecessary. Nick's suggested fix is simple: encapsulate each call to debug in an if-statement that checks whether we are debugging.

if(isDebugging()) {
    debug("" + nrUniqueSymbols + " unique symbols");
}

As well as removing a lot of needless computation, this also drastically reduces the pressure on the garbage collector, and I regularly see 10-20 times speedups when the extended tracing is dropped.
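The effect of the guard can be made visible without a profiler by counting how often toString() runs. The sketch below is only an illustration: the Counted class and the counter are hypothetical stand-ins for whatever objects end up in the trace message, and debug() has its printing disabled as in "production mode".

```java
// Sketch only: counts toString() calls with and without the isDebugging() guard.
// Counted and toStringCalls are illustrative helpers, not part of the original code.
final class GuardedDebug {
    static boolean debugging = false;
    static int toStringCalls = 0;

    /** Stand-in for any object whose toString() is not free. */
    static final class Counted {
        @Override
        public String toString() {
            toStringCalls++;
            return "42";
        }
    }

    static boolean isDebugging() {
        return debugging;
    }

    static void debug(String s) {
        // Printing is disabled; the argument has already been built by the caller.
    }

    /** Calls debug() without a guard and returns how many strings were built. */
    static int unguarded(int calls) {
        toStringCalls = 0;
        Counted nrUniqueSymbols = new Counted();
        for (int i = 0; i < calls; i++) {
            debug("" + nrUniqueSymbols + " unique symbols");
        }
        return toStringCalls;
    }

    /** Same loop, but the concatenation is skipped when not debugging. */
    static int guarded(int calls) {
        toStringCalls = 0;
        Counted nrUniqueSymbols = new Counted();
        for (int i = 0; i < calls; i++) {
            if (isDebugging()) {
                debug("" + nrUniqueSymbols + " unique symbols");
            }
        }
        return toStringCalls;
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + unguarded(1000)); // prints "unguarded: 1000"
        System.out.println("guarded:   " + guarded(1000));   // prints "guarded:   0"
    }
}
```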

Function Inlining -- ineffective

At first glance, reason would suggest that the call to the isDebugging() predicate could be optimized by directly checking a boolean variable instead, i.e.:

if(isDebugging) {
    debug("" + nrUniqueSymbols + " unique symbols");
}

However, on the Sun 1.5 JDK there is no statistically significant difference in speed between the two, as the call can be trivially inlined by the JIT.

String Concatenation -- ineffective

Another potential trick would be to not build the strings at the call site, but rather inside debug(), by changing debug() a bit:

public static void debug(Object... strings) {
    // Build the message here rather than at the call site.
    StringBuilder sb = new StringBuilder();
    for (Object s : strings) {
        sb.append(s);
    }
    System.err.println(sb.toString());
}

This would allow the following call to debug, with one less string concatenation:

debug(nrUniqueSymbols, " unique symbols");

This does not seem to have any statistically significant effect, either.
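One plausible reason the varargs version does not help is that every argument is still evaluated, and an array allocated, at each call. A variant that really defers the work is to pass a Supplier, so the message is only built when debug() asks for it. This is a sketch under assumptions: it needs Java 8 or later, so it was not an option on the 1.5 JDK discussed here, it was not part of the measurements above, and the class, flag and counter names are invented for illustration.

```java
import java.util.function.Supplier;

// Sketch only (Java 8+): the message is built lazily, inside debug().
// LazyDebug, debugging and messagesBuilt are hypothetical names.
final class LazyDebug {
    static boolean debugging = false;
    static int messagesBuilt = 0;

    static void debug(Supplier<String> message) {
        if (debugging) {
            // The lambda body runs only here, i.e. only when tracing is on.
            System.err.println("debug: " + message.get());
        }
    }

    /** Issues the given number of debug calls and returns how many messages were built. */
    static int buildCount(boolean enabled, int calls) {
        debugging = enabled;
        messagesBuilt = 0;
        final int nrUniqueSymbols = 42;
        for (int i = 0; i < calls; i++) {
            debug(() -> {
                messagesBuilt++;
                return nrUniqueSymbols + " unique symbols";
            });
        }
        return messagesBuilt;
    }

    public static void main(String[] args) {
        System.out.println(buildCount(false, 1000)); // prints 0: no message is ever built
    }
}
```

The call site stays a one-liner and needs no explicit if-statement, at the cost of handing a lambda to debug() on every call.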

Eclipse Console and Long Strings

The Eclipse Console, where the stdout and stderr of the running program are displayed, is horribly slow when confronted with wide lines or a lot of debug information. In fact, with non-trivial traces that contain tens of thousands of lines, more time is spent updating the console than is consumed by the program being executed from Eclipse.

Also, very long lines (over 20,000 characters) result in a lot of garble. The best way I found to avoid this, apart from executing outside of Eclipse, is to modify the debug() function to either snip long lines, or to not print them at all. (This is a known issue.)
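Snipping long lines inside debug() might look like the sketch below. The 200-character limit, the class name and the format of the truncation marker are arbitrary choices made for illustration.

```java
// Sketch only: a debug() that clips overly long lines before printing.
// MAX_LINE and the "[N more chars]" marker are assumptions.
final class ClippedDebug {
    static final int MAX_LINE = 200;

    /** Returns s unchanged if it fits, otherwise its first max characters plus a marker. */
    static String clip(String s, int max) {
        if (s.length() <= max) {
            return s;
        }
        int dropped = s.length() - max;
        return s.substring(0, max) + "... [" + dropped + " more chars]";
    }

    static void debug(String s) {
        System.err.println("debug: " + clip(s, MAX_LINE));
    }

    public static void main(String[] args) {
        StringBuilder wide = new StringBuilder();
        for (int i = 0; i < 20000; i++) {
            wide.append('x');
        }
        debug(wide.toString()); // prints 200 x's followed by the marker
    }
}
```

Note that clipping only spares the console: the full string has still been built by the caller, so the isDebugging() guard remains worthwhile.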

Spoofax: An Extensible, Interactive Development Environment for Program Transformation with Stratego/XT

Eelco and I have a second paper at the LDTA workshop this year -- a tool description paper about Spoofax. The paper is very space-constrained, so we dropped the abstract, but if we had one, it would read like this:


Many programmable software transformation systems are based around novel domain-specific languages, with a long history of development and successful deployment. Despite their reasonable maturity and applicability, these systems are often discarded as esoteric research prototypes partly because the languages are frequently based on less familiar programming paradigms such as term and graph rewriting or logic programming, and partly because modern development environments are rarely found for these systems. The basic and expected interactive development aids such as source code navigation, searching, content completion, real-time syntax highlighting and error checking, are rarely available to developers of transformation code.

In this system description paper, we introduce Spoofax, an interactive development environment based on Eclipse for developing program analyses and transformations with Stratego/XT -- a language and toolset for developing stand-alone software transformation systems based on formal language descriptions. Spoofax provides the aids mentioned above, in addition to a code outliner and incremental building of projects, and thus significantly eases the development of language processing tools using Stratego/XT. Furthermore, Spoofax is extensible with scripts written in Stratego that are executed within Eclipse, and allow live analyses and transformations of the code under development.

(pdf, bib)

There is already a website for Spoofax, www.spoofax.org, but it's hardly inviting or informative. I have a new one in SVN, but as always, I tend to spend my time hacking code instead of making releases and web pages. I'll try to remedy that very soon.

Fusing a Transformation Language with an Open Compiler

Together with Eelco Visser, I got a paper (two actually, see the other post) accepted to this year's Workshop on Language Descriptions, Tools and Applications, which is held in Braga, Portugal. My visit to IBM Research last summer started me thinking about a good way to integrate existing compiler frontends with Stratego/XT. This is the result, and I think it turned out quite well.


Transformation systems such as Stratego/XT provide powerful analysis and transformation frameworks and concise languages for language processing, but instantiating them for every subject language is an arduous task, most often resulting in half-completed frontends. Open compilers, like the Eclipse Compiler for Java, provide mature frontends with robust parsers and type checkers, but solving language processing problems in general purpose languages without transformation libraries is tedious. Reusing these frontends with existing transformation systems is therefore attractive. However, for this reuse to be optimal, the functional logic found in the frontend should be exposed to the transformation system -- simple data serialization of the abstract syntax tree is not enough, as this fails to expose important compiler functionality, such as import graphs, symbol tables and the type checker.

In this paper, we introduce a scriptable analysis and transformation framework for Java built on top of the Eclipse Java compiler. The framework consists of an adapter extracted from the abstract syntax tree of the compiler, and an interpreter for the Stratego language. The adapter allows the Stratego interpreter to rewrite directly on the compiler AST. We illustrate the applicability of our system with scripts written in Stratego that perform framework and library-specific analyses and transformations.

(pdf, bib)

The prototype code is already available in the Spoofax SVN repo, but I will clean it up and make a separate release once I get a bit of breathing space from my thesis writing.

Yet again, some journalistic ambiguities

I have always wondered a bit what Oracle looked like. Not that this has occupied large parts of my existence, it must be said, but still... Now, fortunately, hardware.no has been kind enough to clear this up for all of us wonderers.

Reportedly, this is in fact Oracle Norway:

Interview with Oracle

(If one digs a little further, it unfortunately turns out that this is probably Arne Løvold, who works at Oracle, and not Oracle itself, as hardware.no so erroneously trumpeted on its front page today.)

(Screenshot taken from hardware.no today, at 17:30.)

Newton is on the loose in the bathroom!

When I came home from work on Thursday evening, Newton and van der Waals had conspired to make my life miserable.

As I opened the door to the bathroom, I came close to stepping not on a hotplate, but on a shattered mirror. I quickly added a couple of natural numbers together and concluded that this had to be the door of the bathroom cabinet that hangs above the toilet. This door is (was) in its entirety a mirror that has (had) two hinges glued on. In my absence, certain (some might even say critical) chemical bonds in this glue had evidently ceased to function, and the door had then begun its fatal journey towards the floor, ably assisted by gravity.

I also noted that the toilet lid had taken a knock, a hole and a long scratch. As I lifted the lid to inspect further, it became clear that the porcelain rim of the toilet had lost a few pieces, which had now sunk like little Vasa ships to the bottom of the water trap.

My correspondence course in crime scene investigation tells me that the glue on the lower hinge came loose first, after which the door slid out and hung askew long enough for the upper hinge to become somewhat twisted, before the glue on that one gave up too. After this, the mirror hit the toilet lid corner first, before it must have travelled diagonally across the bathroom to the corner where I first discovered my shattered portrait.

I take the liberty of questioning the use of glue as the sole fastening mechanism on glass doors, and call for a worldwide popular campaign to right this engineering wrong.

Stayin' Alert: Moulding Failure and Exceptions to Your Needs

Anya, Valentin, Magne and I recently wrote a paper that was presented at GPCE this year. Valentin and I implemented the extension using the Transformers transformation system (Valentin did most of the hacking for C -- I was hacking for TIL. More on that later). We designed the extension to capture most of Magne's original idea. Anya helped out with the writing and with finding illustrative examples.

Abstract

Dealing with failure and exceptional situations is an important but tricky part of programming, especially when reusing existing components. Traditionally, it has been up to the designer of a library to decide whether to use a language's exception mechanism, return values, or other ways to indicate exceptional circumstances. The library user has been bound by this choice, even though it may be inconvenient for a particular use. Furthermore, normal program code is often cluttered with code dealing with exceptional circumstances.

This paper introduces an alert concept which gives a uniform interface to all failure mechanisms. It separates the handling of an exceptional situation from reporting it, and allows for retro-fitting this for existing libraries. For instance, we may easily declare the error codes of the POSIX C library for file handling, and then use the library functions as if C had been extended with an exception mechanism for these functions -- a moulding of failure handling to the user's needs, independently of the library designer's choices.

(pdf, bib)

The code for this experiment will be released shortly, once Anya finishes the final set of examples that we will bundle with the release. She's been a bit under the weather lately, but hopefully she'll shake it off pretty soon. I will post about the release when it happens.

The Sun Java VM and compiler are now GPL

Today is a great day to be a Java packager. Sun has finally announced that they will be licensing the JDK under the GNU General Public License v2 (GPLv2). This is going to have a huge impact on how we package Java in the longer run, but will not change anything in the immediate future (in the short term, enjoy the new Java binary distribution license -- the DLJ).

Currently, only the JDK source code for the Java compiler (javac) and the Hotspot VM is available (Sun also open-sourced J2ME and J2EE, aka "Glassfish", but these are not part of the standard JDK). In particular, the Java Standard Library, including Swing and most of the other library parts are not out yet. As I write this, the GNU Classpath gang is feverishly trying to get the Sun VM bootstrapped with Classpath, and given their track record, I expect to see the result sooner rather than later.

In a few months, most of the library should also be opened up, and license-encumbered parts will quickly be replaced by an armada of willing and able open-source hackers (probably the same guys who've offered their eyeballs to make all bugs seem shallow). No, seriously, I think that the problematic parts will be replaced quickly enough.

At that time, we can probably get around to providing compilable sun-jdk packages for serious users and ricers alike. I wouldn't be all that surprised if we could also get PPC and SPARC support working, too. Time will tell, but the code exists, and is now free.

It will of course also be interesting to see where the VM code ends up as time goes by. It contains a lot of really nice engineering that could potentially benefit other VM-based languages that resemble Java (without naming any). However, the code base is rather immense, so wrapping one's head around it will take a while. We live in interesting times.

I find it reassuring that the good work of the Classpath and Kaffe communities did not go unmentioned in the webcast where the GPLing was announced, and that the final license follows the Classpath exception.