Thursday, October 4, 2007

Writing a toolchain plugin for CDT (or: things not to use XML for)

I'm currently implementing support for a compiler toolchain in Eclipse/CDT. This means writing a large chunk of XML describing how the toolchain works (what tools there are, what kinds of input they take, what options they accept, what build configurations are available, and so on). This is the standard thing to do when you want to use CDT with a compiler other than GCC.

Eclipse/CDT is very flexible when it comes to supporting other kinds of toolchains: anything that can be broken down into a series of external programs which interact to produce output from a given set of inputs. You could implement a toolchain for creating PDF files from LaTeX input, although this has little to do with C/C++ development. This great flexibility is actually the first warning flag. Yes, flexibility is a good thing, but the users have to be able to wield it as well.

Most of the toolchain is defined in plugin.xml. This is where all tools, configurations, project types, options, etc. are defined. This is actually the second warning flag. My plugin.xml is at 2000 lines and counting, and I haven't really done anything complicated: a bunch of options, two different project types, two build configurations, and four tools (C compiler, C++ compiler, linker, assembler). Yes, I could probably break much of this out into separate plugins, but that seems like overkill.

CDT has a system for inheritance among elements in the toolchain definition file. I can define a common base class for my C and C++ compilers, and define C++-specific options in a subclass. What CDT does is basically to paste a primitive inheritance system on top of XML. It works like this:


<option
id="com.foo.bar.option.parentOption"
isAbstract="true"/>
<option
id="com.foo.bar.option.subOption"
superClass="com.foo.bar.option.parentOption"
... />

This may sound great at first glance, but it quickly turns out to be hopelessly inadequate. It's OK for small things, but when you have three different tools with similar sets of options, you need a way to share them in a more flexible manner.

One of my favorite programming principles is DRY: Don't Repeat Yourself. While writing CDT toolchain definitions you have to break this rule so many times that it's almost physically painful. For example, each XML element has to have a globally unique identifier, and it has to be specified in full in the element itself. In the plugin.xml file which contains my toolchain definition, 25% of the lines include the reverse-DNS name ("com.foo.bar.plugin", for example) used to make sure that the identifiers are globally unique.

One of my compiler's options takes one of 20-30 enumerated values (the target CPU type). To specify this, I have to write an XML element for each of these values:

<enumeratedOptionValue
command="--cpu FOOBAR1"
id="com.foo.bar.plugin.toolchain.compiler.generic.option.cpu.FOOBAR1"
name="FOOBAR1"/>

Instead of allowing me to have a simple list of all cpu types and generate each option, I have to manually (cut-n-paste) expand the enumeration value definitions.
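
As a rough illustration of what "a simple list plus generation" could look like, here is a minimal Java sketch that expands a list of CPU names into the elements shown above. It is not part of CDT; the class and method names are made up for this example, FOOBAR2 and FOOBAR3 are invented CPU names, and it assumes the names contain no characters that need XML escaping.

```java
import java.util.List;

public class CpuOptionGenerator {
    // ID prefix copied from the enumeratedOptionValue example above.
    public static final String ID_PREFIX =
        "com.foo.bar.plugin.toolchain.compiler.generic.option.cpu.";

    /** Renders one enumeratedOptionValue element for a single CPU name. */
    public static String renderValue(String cpu) {
        // Assumption: cpu contains no XML special characters (<, >, &, ").
        return "<enumeratedOptionValue\n"
             + "command=\"--cpu " + cpu + "\"\n"
             + "id=\"" + ID_PREFIX + cpu + "\"\n"
             + "name=\"" + cpu + "\"/>\n";
    }

    /** Renders one element per CPU name, in list order. */
    public static String renderAll(List<String> cpus) {
        StringBuilder out = new StringBuilder();
        for (String cpu : cpus) {
            out.append(renderValue(cpu));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // FOOBAR2 and FOOBAR3 are hypothetical names used only for illustration.
        System.out.print(renderAll(List.of("FOOBAR1", "FOOBAR2", "FOOBAR3")));
    }
}
```

The output would still have to be pasted into plugin.xml (or produced by a build step), so this only removes the manual expansion, not the verbosity of the file itself.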

I think it would be better to write the toolchain definition in Java instead. The idea behind defining things in a plugin's plugin.xml file is to let Eclipse know things about the plugin without having to load it, but does this need to include details about which options my compiler takes? If plugin.xml defined only the entry-point classes for my toolchain, I could define the toolchain itself by creating the appropriate Java objects. This would let me use the full power of Java's inheritance system and store option metadata (such as the available CPU types) in whatever way best suits my needs.
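
To make the idea concrete, here is a small sketch of what I have in mind. None of these classes exist in CDT; ToolchainSketch, GenericCompiler, CppCompiler and the "--hypothetical-cpp-flag" option are all invented for illustration, as is the CPU name FOOBAR2. The point is only that shared options live in one base class and the CPU list is plain data.

```java
import java.util.ArrayList;
import java.util.List;

public class ToolchainSketch {
    // Option metadata kept as plain data; FOOBAR2 is a made-up name.
    public static final List<String> CPU_TYPES = List.of("FOOBAR1", "FOOBAR2");

    /** Hypothetical base class holding the options every compiler shares. */
    public static class GenericCompiler {
        public List<String> options() {
            List<String> opts = new ArrayList<>();
            for (String cpu : CPU_TYPES) {
                opts.add("--cpu " + cpu); // generated, not copy-pasted
            }
            return opts;
        }
    }

    /** Hypothetical subclass that adds C++-specific options. */
    public static class CppCompiler extends GenericCompiler {
        @Override
        public List<String> options() {
            List<String> opts = super.options(); // inherit the shared options
            opts.add("--hypothetical-cpp-flag"); // invented flag, illustration only
            return opts;
        }
    }

    public static void main(String[] args) {
        for (String opt : new CppCompiler().options()) {
            System.out.println(opt);
        }
    }
}
```

Adding a CPU type here means adding one string to CPU_TYPES, and every compiler that extends GenericCompiler picks it up.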

2 comments:

Doug Schaefer said...

Interestingly enough, we did it this way so that toolchain implementers didn't need to know Java.

There is a facility to do your toolchain in Java, but it isn't well documented or explained.

But I get your point. Sometimes it's just easier to do things in Java instead of reams of XML.

Unknown said...

Yes, I guessed that was the case. The problem is that you very soon end up having to write Java anyway. Macro-suppliers, option-applicability, and so on.

Also, the builtin plugin.xml-editor does not scale for this kind of stuff. I would like to have some kind of assistance in entering id-strings, and making sure that superClass and optionCategory attributes are ok (instead of getting a discreet message on stderr at runtime).

I actually ended up writing large parts of the plugin.xml file directly in Emacs with the MBS extensibility document on the side.

Since the plugin.xml-way is the best documented way, I'll probably take a look at trying to autogenerate the plugin.xml file from some more dedicated format.