Udanax-Gold2Java

The Udanax-Gold2Java project contains a number of sub-components related to converting the Udanax-Gold Smalltalk and C++ source code released by XOC into Java source code. The aim is to provide a first-pass automatic translation of as much of the code as possible. The end result is not compilable but may prove useful for non-Smalltalk users who are interested in reading the Udanax-Gold source code. See the Abora-White project for a working implementation based on the result of this project.

This project is split up into three sections:

Translation Process

Udanax-Gold

The Udanax-Gold sub-component contains the source code which is the input to the translator.

A number of files are present here because it has been a struggle to find a complete set of Udanax-Gold source files. All of the following files originated from XOC, but I acquired them from a number of locations. I believe all are supplied under the XOC open-source licence, which is based on the very liberal MIT licence.

It seems that a number of classes from the original have not yet been released, as the source references classes that aren't present in the following files. This is probably due to outdated references, "foreign entanglements" with copyrighted system code, or the classes having somehow gone missing in the release process. At this moment it's not apparent whether the missing classes are significant enough to interfere with later translation efforts. If anybody has access to any additional source then please get in contact.

udanax-top.st
This is the principal Udanax-Gold source released by XOC. It is the core of the system and includes the Ent and Coordinate-Space systems as well as Work/Edition and Front-End wrappers.
http://www.udanax.com/gold/

Xanadu-wparray.st
Xanadu-Xpp-Basic.st
Xanadu-Xpp-Become.st
Xanadu-Xpp-Converters.st
Xanadu-Xpp-fluid.st
Xanadu-Xpp-Packages.st

These are additional Smalltalk source files found on Les Tyrrell's website. The bulk of these files are very Smalltalk specific, but they do prove very interesting in learning the secrets behind the Fluid system, plus the critical Heaper class which is the root of the vast majority of the core Udanax-Gold classes.
http://griffin.canis.uiuc.edu:8080/Xanadu

c.zip
A collection of X++ (the XOC dialect of C++) source which has proven very useful for making up for the missing PrimArray and IntegerVar classes of the core Smalltalk code. It is a mixture of hand-created X++ and the results of XOC's own auto-translator from Smalltalk to X++. These classes were found in the CVS section of Jeff Rush's website.
http://www.sunless-sea.net

Translator

The Translator is a small Java app to automatically translate from the Udanax-Gold Smalltalk(ish) source code into Java.

Translating from Smalltalk source code to Java is normally a very difficult proposition, as Smalltalk source does not include declared types. Thankfully, as a requirement of XOC's own translator to X++ (their variant of C++), the typing information is present in the source code. Unfortunately there are a number of complications. There is still a lot of work to carry out without much information on what the X++ translator did, and the features and restrictions of Java aren't a perfect match for X++. One example is Java's stricter handling of statics: a class cannot have an instance method and a static method with the same signature.
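
To illustrate the statics restriction: Smalltalk keeps class-side and instance-side methods in separate namespaces, whereas Java rejects a class that declares both an instance method and a static method with the same signature. The following is a minimal sketch of one possible workaround, renaming the class-side method. The names Counter, describe and describeClass are hypothetical and are not taken from the translator or from the Udanax-Gold source.

```java
// Hypothetical example, not taken from the translator output.
final class Counter {
    private final int value;

    Counter(int value) {
        this.value = value;
    }

    // Instance-side method.
    String describe() {
        return "Counter(" + value + ")";
    }

    // In Smalltalk the class side could also define "describe".
    // Java does not allow a static describe() alongside the instance
    // describe() above, so this sketch gives the class-side version
    // a different name.
    static String describeClass() {
        return "Counter class";
    }

    public static void main(String[] args) {
        System.out.println(new Counter(3).describe());
        System.out.println(Counter.describeClass());
    }
}
```

Whether the translator renames, merges or drops such clashing methods is not shown here. Renaming is simply the most direct way to keep both methods.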

The Abora-White project was the result of an earlier run of the translator, which only produced semi-compilable Java code that was then "finished" by hand. As the translator can now produce compilable code for the majority of the Smalltalk source, the Abora-White project is now deprecated.

Using the Translator

The easiest route to using the Translator is the Ant build.xml file present in this directory.

Once you have Java JDK 1.4+, Ant 1.5.1+ and JUnit 3.8.1 set up, just run the following from the command line:

> ant

You should see some logging info summarising the walk through of a number of source Smalltalk files, and a larger number of generated Java files. The process should take less than a minute on a reasonable machine. You should find approximately 500 classes generated.

The generated Java files are placed in the ../abora-gold/src-gen directory under the org.abora.gold Java package.

Translator Examples

The best way to sample the result of the translator is simply to look at the resulting code in the src-gen directory. As a quick introduction I have included a few examples from the org.abora.ug2java.translator.tests package to give a flavour for the basic transformations that are used.

Note the formatting has been manually modified from the original text of each for easier reading. In practice the results of auto-translation should be passed separately through an automatic tidying process.

Smalltalk Source:
test
	one and: [two]!

Java Translation:
public void test() {
	one && (two);
}

Smalltalk Source:
test
	(one = two)
		ifTrue: [^one]
		ifFalse: [^two]!

Java Translation:
public void test() {
	if (one == two) {
		return one;
	} else {
		return two;
	}
}

Smalltalk Source:
test
	one two three: four and: 55!

Java Translation:
public void test() {
	one.two().threeAnd(four, 55);
}
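
The last example turns the keyword message three:and: into the single Java method name threeAnd. A minimal sketch of that kind of name mapping follows. The class SelectorNames and its method toJavaName are hypothetical, and the rule is inferred only from the example above; the translator's own rules may differ.

```java
// Hypothetical helper, inferred from the threeAnd example only.
final class SelectorNames {

    // Joins the parts of a Smalltalk keyword selector such as "three:and:"
    // into one Java-style method name, capitalising every part after the first.
    static String toJavaName(String selector) {
        String[] parts = selector.split(":");
        StringBuilder name = new StringBuilder(parts[0]);
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i];
            if (part.isEmpty()) {
                continue; // ignore empty parts from malformed selectors
            }
            name.append(Character.toUpperCase(part.charAt(0)));
            name.append(part.substring(1));
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaName("three:and:")); // prints threeAnd
        System.out.println(toJavaName("two"));        // prints two
    }
}
```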

Extending the Translator

Good luck! This code was meant to be a throwaway weekend project, but I ended up having to extend it quite significantly. Still, there isn't too much code here, and there is some test coverage of the in-method transformations.

The JUnit tests are present in the org.abora.ug2java.tests.TestWriteMethod class, and can be run courtesy of ant:

> ant test

TranslateSmalltalk.main(...) is the entry point of the application taking an output directory, and one or more source Smalltalk files to process.

> java org.abora.ug2java.TranslateSmalltalk outputDir udanax-top.st

The Smalltalk files are stored in a version of the classic Smalltalk Chunk format, which is effectively a sequence of chunks of text, each terminated by an exclamation mark (!). A chunk can define a new class, define a method in a specific method category, or hold arbitrary Smalltalk expressions. More than one class can be defined in each source file.
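
As a rough illustration of the chunk format, the sketch below splits source text into chunks at each terminating exclamation mark. It assumes the classic convention that a doubled exclamation mark (!!) inside a chunk stands for a literal one. The class ChunkSplitter is hypothetical and is not the translator's own reader, and the code is written for a current JDK rather than JDK 1.4.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the translator's own chunk reader.
final class ChunkSplitter {

    // Splits chunk-format text into chunks. A single '!' ends a chunk;
    // a doubled "!!" is treated as a literal '!' within the chunk.
    static List<String> split(String source) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            boolean doubled = c == '!'
                    && i + 1 < source.length()
                    && source.charAt(i + 1) == '!';
            if (doubled) {
                current.append('!');
                i += 2;
            } else if (c == '!') {
                // End of chunk; empty chunks are kept as they can be markers.
                chunks.add(current.toString().trim());
                current.setLength(0);
                i++;
            } else {
                current.append(c);
                i++;
            }
        }
        // Keep any trailing text that has no terminator.
        String rest = current.toString().trim();
        if (!rest.isEmpty()) {
            chunks.add(rest);
        }
        return chunks;
    }

    public static void main(String[] args) {
        String source = "Object subclass: #Foo!\nbar\n\t^'hi!!'!";
        for (String chunk : split(source)) {
            System.out.println("[" + chunk + "]");
        }
    }
}
```

Deciding whether a chunk is a class definition, a method or a plain expression would be a separate step on top of this.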

There are two major passes over the source by TranslateSmalltalk.

The first pass reads in each of the Smalltalk source files in turn. For each included class a ClassWriter is created and initialised with its name, superclass, etc. Each related chunk is also read and added to the ClassWriter as either a simple comment, an instance method or a static method. The Java package of a class is based on its Smalltalk class category, and is recorded for each class so that appropriate Java import statements can be generated later.

The second pass walks through each ClassWriter requesting it to write out a suitable .java file for itself. At this point each of its methods is translated into Java and written out as part of the class definition.
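
One reason for two passes is that import statements can only be generated once the package of every class is known. The sketch below shows that idea in isolation. SketchClass is a hypothetical stand-in for ClassWriter, and the Example class and its org.abora.gold.example package are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two-pass idea, not the translator's code.
final class TwoPassSketch {

    // Hypothetical stand-in for the translator's ClassWriter.
    static final class SketchClass {
        final String name;
        final String javaPackage;
        final List<String> referencedClasses;

        SketchClass(String name, String javaPackage, List<String> referencedClasses) {
            this.name = name;
            this.javaPackage = javaPackage;
            this.referencedClasses = referencedClasses;
        }
    }

    // Pass 1: record the package of every class before anything is written.
    static Map<String, String> collectPackages(List<SketchClass> classes) {
        Map<String, String> packages = new LinkedHashMap<>();
        for (SketchClass c : classes) {
            packages.put(c.name, c.javaPackage);
        }
        return packages;
    }

    // Pass 2: with all packages known, work out the imports for one class.
    static List<String> importsFor(SketchClass c, Map<String, String> packages) {
        List<String> imports = new ArrayList<>();
        for (String ref : c.referencedClasses) {
            String pkg = packages.get(ref);
            // Skip classes whose package is unknown and classes in the same package.
            if (pkg != null && !pkg.equals(c.javaPackage)) {
                imports.add("import " + pkg + "." + ref + ";");
            }
        }
        return imports;
    }

    public static void main(String[] args) {
        SketchClass heaper = new SketchClass(
                "Heaper", "org.abora.gold.xpp.basic", new ArrayList<String>());
        SketchClass example = new SketchClass(
                "Example", "org.abora.gold.example", Arrays.asList("Heaper"));
        Map<String, String> packages = collectPackages(Arrays.asList(heaper, example));
        System.out.println(importsFor(example, packages));
    }
}
```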

The Smalltalk source for a method is parsed using a SmalltalkScanner, which returns a series of ScannerTokens. These are a simple token interpretation of the Smalltalk, each with a type and a value. This series of ScannerTokens is in turn converted into a sequence of instances of JavaToken subclasses. A series of transformX methods then walk the sequence of Java tokens, applying suitable transformations to move things closer to the desired form. One transformation removes references to self where necessary; another converts create(...) method calls into a constructor call for the appropriate class.
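
The token-walking style of transformation can be sketched as follows. This simplified version works on plain strings rather than JavaToken instances and drops a self receiver only when it is directly followed by a dot. Both the class SelfRemovalSketch and that rule are assumptions made for illustration; the real transformX methods are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: plain strings stand in for JavaToken instances.
final class SelfRemovalSketch {

    // Returns a copy of the tokens with every "self" that is immediately
    // followed by "." removed, together with that ".".
    static List<String> removeSelfReceiver(List<String> tokens) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            boolean selfReceiver = tokens.get(i).equals("self")
                    && i + 1 < tokens.size()
                    && tokens.get(i + 1).equals(".");
            if (selfReceiver) {
                i += 2; // skip both "self" and "."
            } else {
                result.add(tokens.get(i));
                i++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("self", ".", "two", "(", ")", ";");
        System.out.println(removeSelfReceiver(tokens));
    }
}
```

A self that is not used as a receiver, for example one being returned, is left untouched by this sketch and would need its own handling.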

Writing out a text version of the Java tokens is accomplished by simply visiting each token in turn and appending it to the current method output. No effort is expended to indent statements correctly, and some extra spaces are inserted, contrary to standard Java formatting. Additionally, the original Smalltalk source is appended at the end of the method, together with the file and line number it came from, to help with any manual corrections needed later.

Future Translator Improvements

I'm not sure how much further effort I am going to expend on the translator, but considering the level of Java compilation problems present in the translated code there is scope for lots of improvements :-)

Abora-Gold

Abora-Gold is just my name for the combination of auto and manual translation of the Udanax-Gold open-source code released by XOC. It is not compilable but may prove useful for non-Smalltalk users who are interested in reading the Udanax-Gold source code.

There are two source directories present.

src is the location of the code which I manually translated from X++ for PrimArray and its subclasses. This code is broken. Additionally, there are a number of placeholder classes for classes which don't seem to have been released yet by XOC.

src-gen is the location of the classes auto-translated from Smalltalk by my translator. It will only be present if you have run the translator.

The base class of the UG Smalltalk source is typically Heaper, although a few of the Smalltalk-specific lower-level implementation classes directly subclass Object. The Heaper class is translated to org.abora.gold.xpp.basic.Heaper, but this subclasses a made-up org.abora.gold.java.AboraHeaper class rather than Object. The AboraHeaper class is the superclass of all Abora classes. It holds a few methods that XOC added to the Smalltalk Object class, which were then inherited, and it also introduces the names of fluid variables in a non-functional way.

There are a number of classes referenced from the supplied XOC source that don't appear to be defined. Placeholder classes with matching names were created under the org.abora.gold.java.missing package and its sub-packages. These have been split up into application, handle and Smalltalk packages.

Earlier work to import the XOC Smalltalk source into Dolphin Smalltalk, followed by a focus on running the XOC tests, indicated that the basic collection classes were missing. These are vital for running the tests, and are widely referenced. An alternative version of the basic collection classes was acquired from some X++ sources and is present in the org.abora.gold.collection.basic package. These are again placeholder implementations, except that the public method definitions have also been added. See Abora-White for a working implementation of these classes.