Showing posts with label Programming Languages. Show all posts

Friday, January 9, 2009

Bad ACS, no Twinkie

So I've gotten back into Doom modding again, and in the process I have had to pick up ACS again. After spending over an hour trying to print a single sentence to the screen (I'll get to that in a second) and having Doom Builder crash Yet Again, losing all of my progress, I figured I needed a break to blow off steam.

So why did it take me so long to print a message to the screen? Modularity. The mod I am making is going to have a lot of text printed to the screen, and in realizing this up front I decided to go ahead and write a function that would center it and compute the duration (how long the text should be displayed) based on string length to save me some typing. There is a rant in here alright, but there are also some interesting programming issues that I want to touch on lightly. For example, how do we balance planning ahead and building foundations with actually getting shit done? There, that was lightly, wasn't it?

Basically, I wanted to go from typing this:

hudMessage(s:"mah message"; HUDMSG_PLAIN | HUDMSG_LOG, 0, CR_GRAY, 0.5, 0.5, duration);

to typing this:

pcThink("mah message", duration)

and have the 'duration' variable be intelligently calculated (with the option to explicitly set it in cases where the default algorithm generates suboptimal results). Thus I set out to write the pcThink function. After several iterations, I ended up with the following:

function int pcThink(str thought, int duration)
{
// a value of 0 for the duration means the caller wants
// us to figure it out for ourselves
    if (duration == 0.0)
    {
        duration = strLen(thought) * 65536;
        duration = duration / 10.0 + 1.0;
    }
    hudMessage(s:thought; HUDMSG_PLAIN | HUDMSG_LOG, 0,
       CR_GRAY, 0.5, 0.5, duration);
// return our duration in tics so it can be used in
// delay() calls
    return duration * 35 / 65536;
}


If you can spot the mistake, then maybe you deserve a twinkie. ACS does not. I'll give you a hint: the symptom of the bug was that my messages were not lasting long enough. And no, there were no compile errors.

Okay, maybe you noticed my liberal mix of floats (they're actually fixed point values in ACS) and ints. duration is, after all, declared as an int, and I am multiplying it by 65536 in one place then dividing it by 10.0 in another. Am I crazy? Have my rugged good looks finally gotten the best of my mental faculties? Perhaps - nay, likely. But that is not the cause of my problems in this instance, though you would be looking in the right direction.

hudMessage takes a fixed point value for its last parameter, which represents how long the message is supposed to be displayed on the screen. I am passing in duration here, which was declared as an int, but like any good language ACS coerces ints to floats (fixeds?) and vice versa, which all sounds keen and gives one a warm fuzzy feeling, right? Except that this coercion is done in a manner described most accurately as despicable. ints in ACS are 4-byte storage containers, pretty typical. Fixed point values do not technically exist, in that there is no "fixed" keyword the way there is an "int" keyword, though there are fixed point literals, which can be illustrated by the following declaration:

int wolfInSheepsClothing = 50.0;

The way this is stored is that the upper 2 bytes represent the integral value (things to the left of the period) while the lower 2 bytes represent the fractional value (things to the right of the period). This is atrocious! 1.0 does not equal 1 in this system, and not just because it actually equals 1.0000001. No, because 1.0 actually equals 65536! In Soviet ACS, 1.0 + 1 equals 65537! I'm not even kidding! Who the hell thought this was a good idea?! The error in my above code is where I am dividing by 10.0, intending to increase the duration by one second for every ten characters in the message. Because 10.0 is really 655360, the plain integer division cancels out the 65536 scale factor, so I instead increase the duration by one second for every 655,360 characters!
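To make the arithmetic concrete, here is a rough model of the bug in C++ rather than ACS, so it can be run outside the game. It assumes a fixed point value is nothing more than a 32-bit int holding the real number times 65536. Every name in it (Fixed, kOne, toFixed, naiveDiv, fixedDiv, buggyDuration, intendedDuration) is made up for illustration and is not part of ACS.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 16.16 fixed point value: a plain 32-bit int whose
// raw value is the real number multiplied by 65536.
using Fixed = std::int32_t;
constexpr Fixed kOne = 65536;  // raw value of the literal 1.0

// Convert a whole number to its fixed point representation.
constexpr Fixed toFixed(std::int32_t whole) { return whole * kOne; }

// What a plain '/' does to two fixed point operands: ordinary integer
// division of the raw values, which cancels the 65536 scale factor.
constexpr Fixed naiveDiv(Fixed a, Fixed b) { return a / b; }

// A scale-preserving division: widen to 64 bits, rescale the numerator,
// then divide.
constexpr Fixed fixedDiv(Fixed a, Fixed b) {
    return static_cast<Fixed>((static_cast<std::int64_t>(a) * kOne) / b);
}

// The duration calculation from pcThink as written (buggy).
constexpr Fixed buggyDuration(std::int32_t length) {
    return naiveDiv(toFixed(length), toFixed(10)) + kOne;
}

// The intended calculation: one extra second per ten characters, plus one.
constexpr Fixed intendedDuration(std::int32_t length) {
    return fixedDiv(toFixed(length), toFixed(10)) + kOne;
}
```

For a 50-character message, buggyDuration(50) comes out to a raw 65541, a hair over one second, while intendedDuration(50) comes out to toFixed(6), a full six seconds. In ACS itself the likely fix is either dividing by the plain int 10 or, if memory serves, using ZDoom's FixedDiv built-in.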

Other gripes with ACS include dividing time into units of 35 tics per second (I'm sure there's a good reason...) and converting strings to ints when using the plus operator instead of concatenating or generating a compile error. That said, ACS is actually a pretty good language for what it's meant to do, and it provides script parallelization that is easy to reason about.

I am getting sick of goofy language design decisions pandering to low-level needs slowing down my high-level development. We need better abstractions that fit with our taught domain knowledge of mathematics and text so we can write "x = 1 + 2.0" without worrying about coercion tricks while "x = "foo" + 7" or "if (someIntegerValue)" will not compile. And don't get me started about pointer arithmetic. Also, why on earth have units not become more prevalent? I might not remember if delay(int) expects tics or seconds, but if I am allowed to type "delay(70 tics)" or "delay(2 seconds)" at my whim, I don't have to remember. And having static unit analysis at compile time could avoid certain embarrassing blunders.
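For what it's worth, here is a rough sketch of what unit-checked delays could look like, written in C++ and leaning on the standard std::chrono durations. Tics and delayTics are hypothetical names invented for this example; delayTics only stands in for ACS's delay() and simply reports how many tics it would wait.

```cpp
#include <cassert>
#include <chrono>
#include <ratio>

// Hypothetical unit: one game tic is 1/35 of a second.
using Tics = std::chrono::duration<long long, std::ratio<1, 35>>;

// Hypothetical stand-in for delay(): accepts only a duration and returns
// the number of tics it would wait.
constexpr long long delayTics(Tics t) { return t.count(); }

// delayTics(Tics(70));                     // "delay(70 tics)"
// delayTics(std::chrono::seconds(2));      // "delay(2 seconds)", also 70 tics
// delayTics(70);                           // does not compile: no unit given
// delayTics(std::chrono::milliseconds(5)); // does not compile: not a whole
//                                          // number of tics
```

The caller can write either unit and the compiler does the conversion, while a bare number or a duration that cannot be expressed exactly in tics is rejected before the program ever runs.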

Oh, and indexing starting at zero! Dijkstra be damned, I still find this to be a counterintuitive (and thus incorrect) language design decision after over a decade of programming with it, though for the sake of brevity I will save the details for another post.

The problem as I see it is that programming language decisions were originally made based on memory limitations and thinking too close to the machine. We have not evolved enough past these tarpits. We don't have enough, or any, "High Enough Level" languages. You've probably heard the cobbler's children analogy applied to programming, particularly to language design. If we were shoe makers, we would be hammering horseshoes on our kids' feet if nails were cheaper than stitches. Swear to God.

Tuesday, February 12, 2008

Open Question: Chomsky versus Turing, Round 1

What it is


I find writing blog entries from time to time to be a very pleasant source of output. However, one aspect of blogging that I hope not to overlook is the incredible ability to receive feedback in the form of comments. Therefore, I have decided to start making posts that explicitly grasp at feedback. To formalize these posts, I will put the important questions in bold with some framing statements interspersed between them; see below.

Basically I am highlighting issues that I am not quite certain about in hopes that other people's insight can help fill in the holes in what would otherwise be a declarative piece of Swiss cheese.

The Question(s)


Why do we consider words to be part of a natural language, but we often consider methods and classes as “external” to a programming language? Why do programming languages seem to have such an immunity, save for keywords and arbitrary functions that find their way into the “standard library”? I find the spelling of the word “weird” to be a flaw with the English language, yet if I come across a method written for a C++ project that I feel uses Hungarian notation incorrectly (using Hungarian notation at all is a flaw in my book, but I digress) then I will blame that programmer and not the language itself.

Part of the issue is that we tend to view words as atoms in natural languages whereas with programming languages we label such things as primitives and operators (including keywords and perhaps some wrapper methods in standard libraries) as atoms. In this way, natural languages have a much broader palette of atoms than programming languages, though programming languages maintain the strength of being simpler to learn and understand. So my original question, comparing user-created objects and user-created methods to words, is a bit flawed and might be more accurate if I were comparing them to sentences; we can certainly blame an individual for sentences that don't fit the grammatical rules of the language. But wait: isn't that what syntax errors are the equivalent of?

So if keywords and operators are analogous to words, statements are analogous to sentences, syntax rules are analogous to grammatical rules, and type errors as well as certain run time errors (null dereferences, incorrect array indexing, invalid casting) are analogous to illogical utterances (“Losing is the reddest flatulence.”), then what is the natural language equivalent of the user-created class or method? Is it simply more words (neologisms, portmanteaus, etc.)? Maybe the atomic nature of class names and method names is illusory; are they more like phrases or sentences?

If this seems like typical internet philosophical tallywhacking so far—then I'm doing my job as a game design blogger ;). But seriously, I do have a concern that legitimizes this post, if you will but take a moment to walk down a side road (it will loop back to the main street if I don't fall asleep first).

Wreaking Havok


I dabble in the Havok physics engine at work, and one thing that I have had to deal with recently illustrates a problem that I see with saddening frequency in the programming world. I wouldn't call it “reinventing the wheel”; perhaps just “reinventing the tread”. Anyway, the problem occurs when working with two libraries that each have their own internal representation of the same concept. The concept in this particular example happens to be vectors; Havok has its own internal struct for vectors, and we have ours. This means that when I am doing vector math that crosses library boundaries I have to bloat my code with conversion function calls. For the uninitiated, vectors are used a lot in 3D games, in my experience more so than any other non-primitive data structure (save for perhaps containers). The concept that Havok's vectors try to encapsulate is the same as the concept that we try to encapsulate with ours, so it is unfortunate that compilers can't make this connection unaided. Worse, we can't expect Havok to get rid of their vector structs, nor can we get rid of ours and solely use Havok's (what happens if we decide to let our Havok license end and pursue another physics solution?).
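A stripped-down C++ sketch of the situation, for anyone who hasn't lived it. The physics::Vec3 and game::Vec3 structs are hypothetical stand-ins (neither is Havok's real vector type), and toGame, toPhysics, dot, and dotAny are names invented for this example.

```cpp
#include <cassert>

// Hypothetical stand-ins for two libraries' vector types. They hold the
// same data, but the compiler treats them as unrelated.
namespace physics { struct Vec3 { float x, y, z; }; }
namespace game    { struct Vec3 { float x, y, z; }; }

// The boilerplate in question: a conversion at every library crossing.
inline game::Vec3 toGame(const physics::Vec3& v) { return {v.x, v.y, v.z}; }
inline physics::Vec3 toPhysics(const game::Vec3& v) { return {v.x, v.y, v.z}; }

// Engine-side math only understands the engine's own type.
inline float dot(const game::Vec3& a, const game::Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// One partial workaround (an assumption of this sketch, not a fix for the
// underlying problem): a template that accepts anything with x, y, and z.
template <typename A, typename B>
float dotAny(const A& a, const B& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
```

With these definitions, mixing the two types means writing dot(velocity, toGame(normal)) everywhere, or rewriting every math routine as a template like dotAny. Either way the code has to spell out a sameness that the language has no way to express on its own.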

So something as common as vectors seems like a nice concept to promote to some level above us and Havok. I'm not sure where: the language or the standard libraries? Ignoring this particular blemish, I would like to summarize this problem and what I believe to be its cause. Basically, with such a limited, domain-agnostic approach to programming languages, reinventing the wheel at the class and method level emerges from the design. I have been thinking a lot about improving programming languages lately, and one of the key issues that I tend to steer away from is redundancy. So is there a real problem here, and more importantly is there a solution?

I say “Yes” and “Yes”


I have two solutions, one of which needs more thought before I put it online. I will go into detail on the more reasonable, less risky one.

I believe that this problem arises from languages constraining their atom palette to be as generic as possible. I believe that the game industry would do well to get together and generate a language of its own, with a palette that fits the domain. I also believe that this will never happen, so we as hobbyists and enthusiasts will have to continue to experiment in isolation until something takes off. I'm working on it, but trust me, we can benefit from someone far more educated in the field of programming language design than I solving this problem. If no one else steps up to the plate, I'll eventually finish my amateurish, patchwork solution and release it to the world. Then you'll all be sorry!

Even a domain-specific language for video game design runs the risk of becoming stale with how rapidly game development changes (though I see this slowing down as Nintendo consoles, cellphones, and more focus on casual/indie games draw us away from inventing new shader technology). We could make vectors, matrices, and quaternions primitive, but what do we do when some MIT prodigy discovers a better way to simulate bouncing breasts using a crazy new data structure called a xerbaton*? For such a domain-specific language to truly evolve with an evolving domain, we would have to work to standardize such things into the language itself in real time instead of us all creating our own xerbaton structs and management functions. I believe this implies being less strict about the size of our atom palette (the easy part) and more open to viewing the game development community as a team (think a WoW guild) as opposed to a bunch of competing teams (think football). The latter is the hard part.

So what do you think? Can programming languages learn from natural languages and improve by exposing their atom palettes (keywords, operators, standard classes and methods) to the public and encouraging open extension? Is this dangerous? Could this work for game development? Will it provide benefits by minimizing wheel/tread reinvention? I can think of phenomena similar to what I am proposing, but none that match exactly. For example, Java's libraries are modified and added to with nearly each major release, but this is more of a committee decision with a somewhat democratic input system and is by no means geared toward game design. Lisp takes a different route seen in the form of different dialects, but this is community extension at a different layer. The natural language equivalent would be if we decided the English language should start using periods instead of commas to separate digits in large numbers, or if we decided that adjectives should come after the nouns that they modify. Both of these examples seem to happen at a slow pace, in large steps by individuals or small groups of people. Could we as a community moderate such extension by an anonymous mass and come out ahead? Could we convince compiler and IDE vendors to play along to add syntax highlighting to new keywords (a dangerous concept for backwards compatibility in the first place, but I'll let that can of worms be opened in the comments to keep this from getting longer) and include new standard library files by default? To do this right seems to require something somewhat revolutionary à la Wikipedia. Or do you know of any project that already does what I have been describing in this post?

And the all-important meta-question: Do you like this open question format?

* picture a damped spring with a “sensuality” modulator, and please keep your hands out of your pants while doing so.