It's a bit different from most recent posts, but I felt like talking about good code, bad code, and some of World's code design that has paid off, in theory.
Divide-and-conquer is a pretty well-known strategy in any field. Any task which is sufficiently complicated has to be broken into pieces to be completed. Needless to say, almost all programming tasks are beyond the mental capacity of an individual to solve in one piece. Breaking the problem into pieces is a fundamental task in programming, and most advances in programming, historically, have been to make this division easier.
Now, there are many ways to divide things up. Or rather, there are many ways to combine the pieces together. Some of them are bad. For example, introductory object-oriented programming teaches us to split things up by classes, and further by their respective nouns (data) and verbs (functions). This is bad! Very bad. In the long run, you end up with a class with a dizzying variety of tasks, data, and code, none of which may be terribly related to each other. Furthermore, the noun/verb class abstraction often breaks down in terrible ways. They say a square is a rectangle, but try to implement a square class and a rectangle class with that relationship embedded and you'll have trouble, I promise you.
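To make the square/rectangle trouble concrete, here is a minimal sketch in Python. The class and function names are my own illustration, not code from any of the projects mentioned here, and it assumes the rectangle is mutable (which is where the relationship usually falls apart).

```python
class Rectangle:
    """A mutable rectangle whose width and height can be set independently."""

    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def set_width(self, width: float) -> None:
        self.width = width

    def set_height(self, height: float) -> None:
        self.height = height

    def area(self) -> float:
        return self.width * self.height


class Square(Rectangle):
    """'A square is a rectangle', so it inherits, but it must keep its sides equal."""

    def __init__(self, side: float) -> None:
        super().__init__(side, side)

    def set_width(self, width: float) -> None:
        # Changing one side silently changes the other.
        self.width = self.height = width

    def set_height(self, height: float) -> None:
        self.width = self.height = height


def stretch_to(rect: Rectangle, width: float, height: float) -> float:
    """Written against Rectangle: the caller expects an area of width * height."""
    rect.set_width(width)
    rect.set_height(height)
    return rect.area()
```

Passing a Rectangle to stretch_to with 2 and 3 gives an area of 6, as the caller expects. Passing a Square gives 9, because the second setter overwrote the first. Neither class is wrong on its own; the relationship between them is.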
Anyway. We all agree that dividing the work up is a necessity. However, once we have done the work, we presumably integrate it back together to form the whole. But, and here's the point, if we have to go back to it, how do we redivide it? We broke up the work, but only once, and then we threw the division away! That seems silly. Why not retain the division somehow, so that if we have to go back to the code a year later, it is already divided into understandable components?
This has been codified into a programming principle called SRP, the Single Responsibility Principle. When following this principle, the rule is that any given piece of code must only 'care' about a single aspect of the program. Defining 'care' is tricky of course, but I define it to mean anything I can understand without having to refer to anything else. If the code brings two complicated systems together, and each is too complicated to keep in my brain, that's too much responsibility, and the offending code must be further divided (and conquered).
An anecdotal example. In Twilight 1, each potential action a unit could perform was a discrete class. Each individual class had a large set of knowledge required to perform its job. A 'rob that building' task action needed to know how to move, how to respond to danger, how to rob, and how to return with the results. Not only was each action very difficult to get working, but they inevitably had bugs. The code was too complicated, because it knew too much.
To fix this, in Twilight 2 I implemented an action sequencing system. Robbery was defined as 'Move to Building', 'Rob', and 'Return', and these were reusable pieces. How could this single element, the sequence, possibly be incorrect? It can't. There are no possible bugs. The sub-components can be wrong, but that's not the worry of this action, so it MUST work. I never had to touch the sequences again. (Actually, since I implemented my own scripting language to build these sequences, I overdid it greatly and suffered from second-system syndrome, but that's a talk for another post.)
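The sequencing idea can be sketched in a few lines of Python. This is not the Twilight 2 code (which used its own scripting language); the Unit fields, the step functions, and the "bank" and "home" locations are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Unit:
    """Hypothetical unit: nothing but data."""
    position: str = "home"
    loot: int = 0
    log: List[str] = field(default_factory=list)


# A step is anything that can be applied to a unit.
Step = Callable[[Unit], None]


def move_to(place: str) -> Step:
    """Build a reusable 'move' step for a given destination."""
    def step(unit: Unit) -> None:
        unit.position = place
        unit.log.append(f"move to {place}")
    return step


def rob(unit: Unit) -> None:
    """Rob whatever building the unit is standing in (made-up payout)."""
    unit.loot += 10
    unit.log.append(f"rob {unit.position}")


def sequence(*steps: Step) -> Step:
    """The sequence knows exactly one thing: run each step in order."""
    def run(unit: Unit) -> None:
        for s in steps:
            s(unit)
    return run


# 'Move to Building', 'Rob', 'Return', assembled from reusable pieces.
robbery = sequence(move_to("bank"), rob, move_to("home"))
```

Calling robbery(Unit()) runs the three steps in order. The sequence itself contains no knowledge of moving or robbing, so the only thing it could get wrong is the ordering, and that is one loop.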
So yes, Single-Responsibility is good. It makes for a HUGE number of classes, but each one is individually understandable, debuggable, and easy to iterate on. Ironically, the only good way to get SRP out of traditional Object-Oriented Programming is to invert the traditional approach. Classes are no longer nouns with function verbs. Classes are now verbs that operate on classes that are nouns. This separates data storage and activity. Each potentially complicated verb is split into its own responsibility, and life is good.
World's design, entirely by accident, absolutely enforces this paradigm. I didn't even realize it until I had been working with it for a year.
When working with the game entities in World, there are only three concepts available for use: Attributes (just data), States (usually implemented as a State Machine), and Actions (which are just functions, but with some attached metadata). When constructing a new action, you can use one of the action classes already implemented, or implement a new one. An action implementation has a known definition and limited data availability. It cannot possibly do more than an Action is supposed to do. Each action definition, therefore, is extremely easy to understand, debug, or copy.
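As a rough illustration of "just functions, but with some attached metadata", here is a Python sketch. It is not World's actual API: the Action fields, the attribute names, and the heal example are assumptions made up for this post.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Attributes are just data: a bag of named values (hypothetical representation).
Attributes = Dict[str, float]


@dataclass(frozen=True)
class Action:
    """A function plus metadata. The function only ever sees attributes."""
    name: str                             # metadata (hypothetical)
    description: str                      # metadata (hypothetical)
    apply: Callable[[Attributes], None]   # the 'verb' itself


def _heal(attrs: Attributes) -> None:
    # Restore up to 5 health, never exceeding the maximum.
    attrs["health"] = min(attrs["health"] + 5, attrs["max_health"])


heal = Action("heal", "Restore up to 5 health", _heal)

entity: Attributes = {"health": 8.0, "max_health": 10.0}
heal.apply(entity)  # entity["health"] is now capped at max_health
```

Because the function receives nothing but the attributes, there is simply no way for it to reach into pathfinding, rendering, or anything else. The limited signature is what keeps the responsibility small.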
States are slightly more complicated, since state machines are not a simple construct. A state machine is a set of states (each just a number, really), each of which has a set of transitions (conditions for moving to a new state) and behaviors (just some operation again, like an Action). If new code is needed, the only places to hook in are at these two types. Either you make a new transition, or a new behavior. And, like actions, these consist of a very small set of functions and responsibilities. They are simple. There are a lot of them, but each one is simple.
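The shape described above might look something like this in Python. Again, this is a sketch under my own assumptions rather than World's implementation: one behavior per state, transitions checked in order after the behavior runs, and a made-up "energy" attribute driving the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Attributes = Dict[str, float]


@dataclass
class Transition:
    """A condition for leaving the current state, and where to go."""
    condition: Callable[[Attributes], bool]
    target: int


@dataclass
class State:
    """A behavior to run each tick, plus the transitions out of this state."""
    behavior: Callable[[Attributes], None]
    transitions: List[Transition] = field(default_factory=list)


@dataclass
class StateMachine:
    states: Dict[int, State]
    current: int

    def tick(self, attrs: Attributes) -> None:
        state = self.states[self.current]
        state.behavior(attrs)                 # only behaviors act
        for t in state.transitions:           # only transitions transit
            if t.condition(attrs):
                self.current = t.target
                break


WORKING, RESTING = 0, 1  # states are just numbers


def tire(attrs: Attributes) -> None:
    attrs["energy"] -= 1


def rest(attrs: Attributes) -> None:
    attrs["energy"] += 2


machine = StateMachine(
    states={
        WORKING: State(tire, [Transition(lambda a: a["energy"] <= 0, RESTING)]),
        RESTING: State(rest, [Transition(lambda a: a["energy"] >= 4, WORKING)]),
    },
    current=WORKING,
)
```

Adding new behavior to this machine means writing one more small function like tire, or one more condition. The tick loop never changes.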
So as it turns out, actions can only act, and transitions can only transit. But perhaps more importantly, only actions can act, and only transitions can transit. This means that nothing else can possibly care about acting or transiting, and that, in effect, simplifies everything else.
Perhaps I should redefine SRP. Single Responsibility Principle: Any given piece of code must only 'care' about a single aspect of the program, and ONLY that piece of code is allowed to 'care' about it. Divided and Conquered.