Thursday, April 13, 2017

Appendages

I've recently pushed the code to implement Appendages, which will likely be the primary feature of the 1.1 release.

Formally, appendages are a way to extend the functionality of a class orthogonal to its normal inheritance hierarchy, and specifically to its instance state. In other words, they are classes consisting only of methods that can be applied to any object derived from a specified base class, called the "anchor" class.

Taking the example from the manual, let's say we have a pair of classes for representing two-dimensional coordinates:


    class Coord {
        int x, y;
        oper init(int x, int y) : x = x, y = y {}
    }

    class NamedCoord : Coord {
        String name;
        oper init(int x, int y, String name) :
            Coord(x, y),
            name = name {
        }
    }

Now let's say we want to add some new functionality:

  • Get the distance of the coordinate from the origin (the "magnitude" of the coordinate's vector).

  • Get the area of the rectangle defined by the coordinate and the origin.

In the absence of other considerations, we might just add two methods to Coord and be done with it. However, this doesn't always work.

If Coord and NamedCoord are in a module we don't own (an "external" module), adding methods is more complicated. Our new methods might not be appropriate for general use, and for non-final classes, adding new methods breaks compatibility. So our new methods might simply not be welcome upstream, and in any case, we might not want to be blocked waiting for our change to make its way into a released version of the external module.

We could derive a new class, "SpatialCoord", from Coord and give it the new methods. But then NamedCoord won't have them. Furthermore, if Coord comes from an external module, we might not even control the allocation of the new object: it might be produced internally by some other subsystem and merely shared with our calling code.

Prior to appendages, the only way to solve this was to define our new methods as functions accepting Coord as an argument:

    int getMagnitude(Coord c) {
        return sqrt(c.x * c.x + c.y * c.y);
    }

    int getArea(Coord c) {
        return c.x * c.y;
    }

This works, but it has a few problems as compared to having them bundled as methods in a class:

  • As separate functions, a module using them would have to import each of them separately and also retain them as separate elements in a shared namespace.

  • We lose some syntactic niceties, such as the object.method() syntax and the implicit this.

  • As standalone functions, they are unable to access protected members of the class that would be accessible to methods of a derived class. (This is not an issue in the example, but it is an issue with the approach in general.)

Appendages provide a nicer solution. We can define an appendage on Coord by creating a derived class definition that uses an equal sign ("=") instead of a colon before the base class list:

    class SpatialCoord = Coord {
        int getMagnitude() {
            return sqrt(x * x + y * y);
        }

        int getArea() {
            return x * y;
        }
    }

Defining an appendage is very much like defining a class except that an appendage is limited to a set of methods. These can be applied to any instance of a class derived from the anchor class (Coord, in this case). To make this work, an appendage can have no instance data of its own. This means:

  • No instance variables.

  • No virtual methods (all methods are final or explicitly static).

  • No constructors or destructors.
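
As a rough illustration of these restrictions, here is a minimal sketch assuming the Coord class defined earlier. The name SketchCoord and the commented-out members are hypothetical; the commented-out lines show the kind of declarations that the rules above rule out.

    class SketchCoord = Coord {
        # int cachedArea;     # ruled out: an appendage has no instance variables
        # oper init() {}      # ruled out: no constructors or destructors

        # Fine: an ordinary method that only uses the anchor class's state.
        int getArea() {
            return x * y;
        }
    }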

To use an appendage, we must explicitly convert an instance of the anchor class using an overloaded "oper new" (which looks like ordinary instance construction):

    foo := SpatialCoord(Coord(10, 20));
    bar := SpatialCoord(NamedCoord(20, 30, 'fido'));
    cout I`bar has magnitude $(bar.getMagnitude()) \
           and area $(bar.getArea())\n`;

Note that the "foo" and "bar" assignments don't create new instances: they just convert existing instances of Coord and NamedCoord to SpatialCoord in order to allow the use of its methods. This is a zero-cost abstraction.
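
One way to convince yourself that the conversion shares the underlying object rather than copying it is to change the original after converting. This is only a sketch, assuming the Coord and SpatialCoord definitions above and that cout is available as in the previous example. The variable names c and s are made up for illustration.

    c := Coord(3, 4);
    s := SpatialCoord(c);    # conversion only: s and c name the same object

    c.x = 10;                # modify the coordinate through the original name
    cout `area via appendage: $(s.getArea())\n`;

If no new instance is created, as described above, the area reported through s should reflect the updated x (10 * 4) rather than the original value.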

You can also compose an appendage from several other appendages (as long as they all have the same anchor class). For example, we could have done this:


    class MagnitudeCoord = Coord {
        int getMagnitude() {
            return sqrt(x * x + y * y);
        }
    }

    class AreaCoord = Coord {
        int getArea() {
            return x * y;
        }
    }

    class SpatialCoord = MagnitudeCoord, AreaCoord {}

There are a number of places in the Crack library that will benefit from the use of appendages. Notably, appendages will make it possible to define encoding-specific String classes. The String class hierarchy (or, more properly, the Buffer class hierarchy) currently specializes around the concept of ownership: Buffer has no ownership assumptions, ManagedBuffer has an associated buffer and is growable, String owns an associated buffer that is to be treated as immutable, and so on. All of these are just byte buffers. There is no character concept: the user is responsible for assuming an encoding and ensuring that the string is treated correctly with respect to that encoding.

With appendages, it will be possible to define ASCIIString and UTF8String, so instead of having to import individual functions from the crack.ascii and crack.strutil modules, we'll just be able to import (e.g.) ASCIIString and then call functions like strip() and toLower() as normal methods.

There are a few potential areas for improvement in appendages:

  • Implicit conversion (while generally an antipattern) would be useful here. If we have a function accepting an appendage as an argument, there's not a lot of value in having either the caller or the function itself do the conversion.

  • Having some way to explicitly require validation during conversion to an appendage would be nice (especially in the case of string appendages). You can currently define static members to do validation, but there's no way to exclude generation of the "oper new" methods that allow a user to more naturally bypass them.

  • It would be useful for appendages created from classes derived from the anchor class to preserve the methods of the derived class. For example, when we do

        bar := SpatialCoord(NamedCoord(20, 30, 'fido'));

    above, we're losing the ability to access the name variable from bar. It should be possible to work around this right now by defining the appendage as a generic, though at the cost of generating multiple instances of the appendage code.
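
The static-member approach to validation mentioned in the second item can be sketched roughly as follows. Everything here is an assumption for illustration: the appendage name QuadrantOneCoord, the method name checked, the rule being validated, and the use of InvalidArgumentError from crack.lang as the error type.

    import crack.lang InvalidArgumentError;

    class QuadrantOneCoord = Coord {
        # Hypothetical validating converter: rejects negative coordinates
        # before handing back the appendage view of the same object.
        @static QuadrantOneCoord checked(Coord c) {
            if (c.x < 0 || c.y < 0)
                throw InvalidArgumentError('coordinate is outside quadrant one');
            return QuadrantOneCoord(c);
        }
    }

A caller would write QuadrantOneCoord.checked(c) to get the validated conversion, but nothing prevents calling QuadrantOneCoord(c) directly, which is exactly the bypass described above.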

Nonetheless, appendages are a feature that I have long wished for, and they are very much in line with Crack's original goal of expanding upon existing concepts in the object-oriented paradigm in a very natural way.
