Log in

Flat Cat in a C3 Vat

I'm getting close to a 0.2 release of Categories. Changes include:

  • A Clojure version
  • A version that targets Gambit-C
  • The -c3- domain is implemented, and -c3- examples are included
  • Several bugs are fixed
It's worth noting that the initial implementation of -c3- is spectacularly inefficient. It doesn't have to be that way, but it was easiest to concentrate first on getting the domain to work right, and worry about caching and other obvious optimizations afterward.

Another interesting point is that Eduardo Cavazos got the 0.1 release working in Ikarus and Ypsilon, which are both R6RS Schemes. I already planned to include an R6Rs version in a forthcoming release; Eduardo's kind efforts will likely make that happen faster.

The rest of this post walks through a little of the test code I've been using, as examples of working with the -flat- and -c3- domains.

First, let's define a few types.
  (define <user> (structure () username password))
  (define <admin> (structure (<user>) roles))
  (define <trial> (structure () start-date))
  (define <trial-user> (structure (<user> <trial>)))

If you're used to Java or C# or C++, you might be tempted to think
that <trial-user> inherits from <user> and
<trial>. It doesn't. In Categories, types don't have
inheritance. The definition of <trial-user> includes
<user> and <trial>, which means that it has all the fields
that are defined in those types. We haven't defined any inheritance
yet, though. Inheritance is possible, as we'll see shortly, but it
belongs to domains, not types.

Now let's define a function:

(define print-user (function -flat-))

This function belongs to the -flat- domain. -flat- is a very simple
domain that supports polymorphic functions, but not inheritance. That
means that if you define a method on <number>, then it will run
only if you pass an argument of type <number>. Passing an
argument of type <integer> won't select that method, because
<integer> is not <number>. -flat- does not support
inheritance. The disadvantage is that you can't automatically inherit
behavior in -flat-. The advantage is that it's simple and fast--and
you don't automatically inherit behavior. Hey, sometimes you don't
want to inherit.

So far, the function print-user doesn't have any methods. Let's add
(add-method! print-user
            (flat:method ((u <user>))
                         (display "User: ")
                         (display (get-key u 'username))

(add-method! print-user
            (flat:method ((u <admin>))
                         (display "Administrator: ")
                         (display (get-key u 'username))
                         (display "  Roles: ")
                         (display (get-key u 'roles))

(add-method! print-user
            (flat:method ((u <trial-user>))
                         (display "Trial user: ")
                         (display (get-key u 'username))
                         (display "  Start date: ")
                         (display (get-key u 'start-date))

The macro flat:method is a convenience provided by the -flat- domain
that creates a method suitable for use with -flat- functions. Now the
function print-user has three methods that work on three different
types. Let's try them out.

First we need some values to use as inputs:
(define $user (make <user> 'username "fred" 'password "dino"))
(define $admin (make <admin> 'username "joe" 'password "f23Zb!" 'roles '(gm dev)))
(define $trial (make <trial-user> 'username "noob" 'password "noob" 'start-date "2009-11-11"))

Now we can call print-user on them:
(print-user $user)

User: fred

(print-user $admin)

Administrator: joe
 Roles: (gm dev)

(print-user $trial)

Trial user: noob
 Start date: 2009-11-11

Categories dispatches on more than one argument. Let's define a
two-argument function and see how that works:

(define times (function -flat-))

(add-method! times
            (flat:method ((n <integer>)(x <integer>))
                         (* n x)))

(add-method! times
            (flat:method ((x <integer>)(y <integer>)(z <integer>))
                         (* x y z)))

(add-method! times
            (flat:method ((n <integer>)(s <string>))
                         (apply string-append (vector->list (make-vector n s)))))

(add-method! times
            (flat:method ((n <integer>)(s <symbol>))
                         (make-vector n s)))

(times 2 3) ; => 6
(times 2 3 4) ; => 24
(times 2 "Foo") ; => "FooFoo"
(times 2 'bob) ; => #(bob bob)

Notice that (times 2 3) calls a different method from (times 2 "Foo")
or (times 2 'bob). The difference is the second argument: in the first
example it's an integer; in the second, it's a string; in the third,
it's a symbol.

Okay, what if you want inheritance? Well, you use a different
domain. The domain I just added, -c3-, supports multiple
inheritance. We can write the same examples for -c3- with just a few
changes. First of all, let's tell -c3- about the new types we created:

(c3-dom:derive-type! -c3- <user> (list <anything>))
(c3-dom:derive-type! -c3- <admin> (list <user>))
(c3-dom:derive-type! -c3- <trial> (list <anything>))
(c3-dom:derive-type! -c3- <trial-user> (list <user> <trial>))

The type <anything> is the root of the -c3- type graph;
everything inherits from it one way or another. c3:derive-type! is a
convenience provided by -c3- for telling the domain about new
types. All you have to do is tell -c3- what supertypes you want a new
type to have; it figures out the rest.

Notice that in -c3-, <trial-user> inherits from <trial>
and <user>, just as you would expect. There's an important
subtlety here, though. The definition of the <trial-user> type
gave it the same fields as <trial> and <user>. The
c3-dom:derive-type! function made it inherit from those two types. The
important point is that it doesn't have to inherit from the
same types that it includes. Sometimes you want to say that a new type
is a subtype of some other type, but you don't want to inherit all
fields of the parent types. Representation and taxonomy are two
different things, and Categories doesn't force you to combine them,
though you can if you want.

Okay, now how do we write a polymorphic function in -c3-? The same way
we did it in -flat-, except that we make a -c3- function instead of a
-flat- function, like so:

(define print-user (function -c3-))

This definition replaces the old one, so print-user is no longer a
-flat- function. You can have both -c3- functions and -flat-
functions; they just can't have the same names.

Now let's add some methods again:
(add-method! print-user
            (c3:method ((u <user>))
                       (display "User: ")
                       (display (get-key u 'username))

(add-method! print-user
            (c3:method ((u <admin>))
                         (next-method u)
                         (display "Admin roles: ")
                         (display (get-key u 'roles))

(add-method! print-user
            (c3:method ((u <trial-user>))
                         (next-method u)
                         (display "[Trial]  Start date: ")
                         (display (get-key u 'start-date))

This time we use c3:method instead of flat:method. c3:method is a
convenience that, you guessed it, creates a method suitable for use
with -c3-. (Actually, at this point there is no significant difference
between the two kinds of method, but the important point is that there
could be. A user-defined domain is free to represent type
relationships and method signatures any way it wants).

Notice that the -c3- methods are a little bit shorter. That's because
they take advantage of inheritance to reuse the code defined for
supertypes. The methods for <admin> and <trial-user> call
next-method to reuse the code defined for <user>.

Here's the output:
(print-user $user) ; =>

User: fred

(print-user $admin) ; =>

User: barney
Admin roles: (gm editor)

(print-user $trial)

User: noob
[Trial]  Start date: 2009-11-11

Notice that I didn't redefine any of the types, and I didn't recreate
any of the instances. Changing domains affects only the taxonomy. The
types and their instances don't have to change at all. you can reuse
the same types and values in several different domains at the same

The two-argument functions work similarly, but again we can leverage
the fact that -c3- supports inheritance:

(define times (function -c3-))

(add-method! times
            (c3:method ((n <number>)(x <number>))
                         (* n x)))

(add-method! times
            (c3:method ((x <number>)(y <number>)(z <number>))
                         (* x y z)))

(add-method! times
            (c3:method ((n <number>)(s <string>))
                         (apply string-append (vector->list (make-vector n s)))))

(add-method! times
            (c3:method ((n <number>)(s <symbol>))
                         (make-vector n s)))

(times 2 3) ; => 6
(times 2 3.0 4.0) ; => 24.0
(times 2 "Foo") ; => "FooFoo"
(times 2 'bob) ; => #(bob bob)

We don't have to define methods specifically for <integer> and
<float> in -c3-; the <number> method works on values of those
types just fine, because -c3- defines <float> and
<integer> as subtypes of <number>. Otherwise, the behavior
of this function is the same as in the -flat- domain.



Real examples will be bulky, because the inefficiencies I'm addressing tend to happen in systems of lots of related types. Let me give a contrived example, so that it can be simple and compact.

Let's start with a 2D graphics library that operates on Cartesian coordinates, defined as pairs of integers. Suppose you want to extend it to operate on Polar coordinates. Polar coordinates can also be represented as pairs of numbers, but the elements have a substantially different meaning, and integers are not a convenient representation for them. They need a different representation.

Because we want the library code to recognize the new Polar points as Point objects, we're going to have to inherit from the provided Point object. But doing that in conventional object-oriented languages means we are also going to inherit two fields that we never use. We could solve that problem by refactoring the existing implementation of Point, but that refactoring work is just overhead, and it may not be possible--we may not be free to rewrite the library's source code.

So we're stuck; either we inherit two useless fields in order to give our new type the right taxonomy, or we define it as a disjoint type, and lose the ability to recognize it as a Point.

As I said, this example is contrived. I've seen plenty of real examples, but they tend to be more complicated. One common example involves extending GUI widgets, and being forced to inherit a bunch of fields that will never be used by the extended widgets. Another is extending a networking library to handle a new protocol, and being forced to inherit fields that will never be used in the new protocol. A third is an extensible library of collection types: different representations of maps from keys to values have very different space- and time-performance profiles, even though their APIs are basically the same. We would like to be able to add a new kind of key/value mapping without inheriting the data layout of the existing ones.

A reasonable objection is that any of these problems could be solved by refactoring the superclasses to provide abstract types that we can inherit from, and making the existing types subclasses of those abstract types. But, as a practical matter, we're not always free to refactor a given library to suit our design requirements, and even when we are, such a refactoring is pure overhead from the point of view of the person who just needs a new type.

February 2010

Powered by LiveJournal.com