ActionScript 3.0 (AS3) ToolTip for Flash 9 (CS3) programmers!

I’m learning Flash right now. My next few games are going to be built in Flash in order to help me learn, and then hopefully build a nice PBBG using Flash and a dedicated C++ based network server. Should be highly exciting!

Anyway, after scouring the net for an amazing amount of time, I could not find a ToolTip component for ActionScript 3.0 (AS3). Tons for 2.0, but if I am going to be working with Flash, I have to go with 3.0. It’s much nicer for a serious programmer, as well as people who just want to be on the cutting edge.

I pretty much gave up after a few hours of digging, and decided it was time to rewrite somebody else’s code. So the one I found is apparently originally written by “Versus”, then modified by Joseph Poidevin. No idea who they are, no idea how to give them proper credit, because I honestly can’t remember where I found the original, and I changed about 80% of the code so hell, it really doesn’t much matter. The only thing that’s important is their work helped me to write a ToolTip that’s slightly customizable.

The class is fairly basic right now, but it does its job and does it well. You can customize opacity, text alignment, and delay (time before ToolTip shows up). I planned to add more customizations to the thing, but so far I really haven’t had a need, and I’m big on adding features only when you see a real-world use for said features. (Yes, I am aware font customizations will be very useful, but until I need them, I don’t write them.)

If you find anything you’d like to add to this, please don’t hesitate to tell me. I may not incorporate all changes, because I want this thing to stay simple, but I’ll certainly consider anything I see.

Without further ado, here it is: ToolTip.as, or a tarball with the directory structure if you’re just too lazy to build it yourself: tooltip.tgz

Example usage:

This is actual code that yours truly is using in an exciting new game that may or may not ever be finished:


ToolTip.init(stage, {textalign: 'center', opacity: 85, defaultdelay: 500});
ToolTip.attach(btn_instructions, 'Learn how to play\nScraboggle!');
ToolTip.attach(btn_start, "Begin a game of scraboggle.");
ToolTip.attach(btn_scores, "View the best scores and words, and who got 'em!");

Nerdmaster does something useful…

No, children, this isn’t the unbelievable (and fictitious) story of a bad little nerd who finally finds redemption via some amazingly-selfless act. This is, instead, the true story of me, the very selfish nerd who offers something useful to the world in order to make himself (myself) feel superior.

I’ve been into statistics lately, and built a really slick hypergeometric distribution probability calculator. I offer the C++ and Ruby source for free, so if you’re interested in statistics and programming, check it out.

Why do I do this? Simple – for starters, I can always use the publicity. I linked to my page from Wikipedia’s article. I never would have bothered to put my code up if I hadn’t noticed that the only other link (here comes the superiority complex) was to a calculator that’s dog slow and gets odd values in some cases – if your desirables (white marbles) are set to 10 and your sample size (marbles drawn) is set to 20, it will actually show drawing 11 desirables as being possible! Plus it doesn’t show data in a very friendly way, doesn’t really explain the algorithm, and doesn’t offer source code. So I figured I could get some math nerds onto my site by throwing that link there, and more visitors is always good, even if they’re icky math nerds.
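For the curious, the "11 desirables out of 10" problem is easy to guard against. Here is a minimal Python sketch of the hypergeometric probability (this is not the C++ or Ruby source mentioned above; the function name and the 50-marble bag are made up for illustration):

```python
from math import comb

def hypergeom_pmf(population, desirables, sample_size, drawn):
    """Chance of getting exactly `drawn` desirables when sampling without replacement."""
    if sample_size > population or desirables > population:
        raise ValueError("sample and desirables cannot exceed the population")
    undesirables = population - desirables
    # Impossible outcomes: more desirables drawn than exist or than were sampled,
    # or more undesirables needed than the bag contains.
    if drawn < 0 or drawn > desirables or drawn > sample_size:
        return 0.0
    if sample_size - drawn > undesirables:
        return 0.0
    ways = comb(desirables, drawn) * comb(undesirables, sample_size - drawn)
    return ways / comb(population, sample_size)

# Hypothetical bag: 50 marbles, 10 of them white, 20 drawn.
print(hypergeom_pmf(50, 10, 20, 11))  # 0.0
total = sum(hypergeom_pmf(50, 10, 20, k) for k in range(21))
```

Summing over every possible count should come out to 1, which is a cheap sanity check for any calculator of this kind.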

The other reason is to impress some C++ loving weirdos at Arena Net. I guess they think a client/server game with strict performance requirements should be written in C++ or something stupid like that. I tried to explain that VB is the only language for programming a serious game, but they just wouldn’t listen.

Anyway, enjoy the code. I licensed it as a “use however the frack you want” kind of license, so if it’s useful to you, just let me know you like it.

Sloccount’s sloppy slant – or – how to manipulate programming projects the Wheeler Way

Sloccount is my newest Awesome Software Discovery. It’s a great idea, but is far too simple to do what it claims: estimate effort and expense of a product based on lines of code. And really, I wouldn’t expect it to be that great. The model used to estimate effort is certainly not the author’s fault, as it isn’t his model. But that idiot (David Wheeler) doesn’t just say it’s a neat idea – he actually uses this horrible parody of good software to “prove” that Linux is worth a billion dollars. For the record, I prefer Linux for doing any kind of development. I hate Windows for development that isn’t highly visual in nature (Flash, for instance, kind of requires Win or Mac), and Macs are out of my price range for a computer that doesn’t do many games. So Linux and I are fairly good friends. I just happen to be sane about my liking of the OS. (Oh, and BSD is pretty fracking sweet, too, but Wheeler didn’t evaluate it, so neither will I.)

The variables

To show the absurdity of sloccount, here’s a customized command line that assumes pretty much the cheapest possible outcome for a realistic project. The project will be extremely easy for all factors that make sense in a small business environment. We assume an Organic model as it is low-effort and the most likely situation for developing low-cost software.

Basically I’m assuming a very simple project with very capable developers. I’m not assuming the highest capabilities when it comes to the dev team because some of that stuff is just nuts – a whole team on a small project just isn’t likely to have 12+ years of experience and sit in the top 10% of all developers. But the assumptions here are still extremely high – the team is around the 75th percentile in all areas, with 6-12 years of experience, but pay is very low all the same. This should show a pretty much best-case scenario.

Also, I’m setting overhead to 1 to indicate that in our environment we have no additional costs – developers work from home on their own equipment, we market via a super-cheap internet site or something (or don’t market at all and let clients do our marketing for us), etc.

Other factors (from sloccount’s documentation):

  • RELY: Very Low, 0.75
    • We are a small shop, we can correct bugs quickly, our customers are very forgiving. Reliability is just not a priority.
  • DATA: Low, 0.94
    • Little or no database to deal with. Not sure why 0.94 is the lowest value here, but it is so I’m using it.
  • CPLX: Very Low, 0.70
    • Very simple code to write for the project in question. We’re a small shop, man, and we just write whatever works, not whatever is most efficient or “cool”.
  • TIME: Nominal, 1.00
    • We don’t worry about execution time, so this isn’t a factor for us. Assume we’re writing a GUI app where most of the time, the app is idle.
  • STOR: Nominal, 1.00
    • Same as time – we don’t worry about storage space or RAM. We let our users deal with it. Small shop, niche market software, if users can’t handle our pretty minimal requirements that’s their problem.
  • VIRT: Low, 0.87
    • We don’t do much changing of our hardware or OS.
  • TURN: Low, 0.87
    • I don’t know what this means, so I’m assuming the best value on the grid.
  • ACAP: High, 0.86
    • Our analysts are good, so we save time here.
  • AEXP: High, 0.91
    • Our app experience is 6-12 years. Our team just kicks a lot of ass for being so underpaid.
  • PCAP: High, 0.86
    • Again, our team kicks ass. Programmers are very capable.
  • VEXP: High, 0.90
    • Everybody kicks ass, so virtual machine experience is again at max, saving us lots of time and money.
  • LEXP: High, 0.95
    • Again, great marks here – programmers have been using the language for 3+ years.
  • MODP: Very High, 0.82
    • What can I say? Our team is very well-versed in programming practices, and make routine use of the best practices for maintainable code.
  • TOOL: Very High, 0.83
    • I think this is kind of a BS category, as the “best” system includes requirements gathering and documentation tools. In a truly agile, organic environment, a lot of this can be skipped simply because the small team (like 2-3 people) is so close to the codebase that they don’t have any need for complexities like “proper” requirements gathering. Those things on a small team can really slow things down a lot. So I’m still giving a Very High rating here to reflect speedy development, not to reflect the grid’s specific toolset. For stupid people (who shouldn’t even be reading this article), this biases the results against my claim, not for it.
  • SCED: Nominal, 1.00
    • Not sure why nominal is best here, but it’s the lowest-effort value so it’s what I’m choosing. Dev schedules in small shops are often very flexible, so it makes sense to choose the cheapest option here.

So our total effort will be:

0.75 * 0.94 * 0.70 * 1.00 * 1.00 *                # RELY - STOR
0.87 * 0.87 * 0.86 * 0.91 * 0.86 *                # VIRT - PCAP
0.90 * 0.95 * 0.82 * 0.83 * 1.00 *                # VEXP - SCED
2.3                                               # Base organic effort

= 0.33647 effort
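If you want to double-check the multiplication, here is a small Python sketch that multiplies the ratings listed above by the 2.3 base. The dictionary is just a transcription of the list; the variable names are made up for illustration:

```python
from math import prod

# Cost-driver ratings transcribed from the list above
multipliers = {
    "RELY": 0.75, "DATA": 0.94, "CPLX": 0.70, "TIME": 1.00, "STOR": 1.00,
    "VIRT": 0.87, "TURN": 0.87, "ACAP": 0.86, "AEXP": 0.91, "PCAP": 0.86,
    "VEXP": 0.90, "LEXP": 0.95, "MODP": 0.82, "TOOL": 0.83, "SCED": 1.00,
}
BASE_ORGANIC_EFFORT = 2.3

# The value passed to --effort is the base times every multiplier
effort_factor = BASE_ORGANIC_EFFORT * prod(multipliers.values())
print(round(effort_factor, 5))  # 0.33647
```

Swapping individual ratings in the dictionary is a quick way to see how much each assumption moves the final number.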

We’re also going to assume a cheap shop that pays only $40k a year to programmers, because it’s a small company starting out. Or the idiot boss only pays his kids fair salaries. Or something.

Command line:

sloccount --overhead 1 --personcost 40000 --effort 0.33647 1.05

Bloodsport Colosseum

For something simple like Bloodsport Colosseum, this is an overly-high, but acceptable estimate. With HTML counted, the estimate is 5.72 man-months. Without, it’s 4.18 man-months. We’ll go with the average since my HTML counter doesn’t worry about comments, and even with rhtml having embedded ruby, the HTML was usually easier than the other parts of the game. So this comes to 4.95 months. That’s just about 21 weeks (4.95 months @ 30 days a month, divided by 7 days a week = just over 21). At 40 hours a week that would work out to 840 hours. I spent around 750 hours from start (design) to finish. I was very unskilled with Ruby and Rails, so this estimate being above my actual time is certainly off (remember I estimated for people who were highly skilled), and a lot of the time I spent on the project was replacing code, not just writing new code. But overall it’s definitely an okay ballpark figure.
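The month-to-hour conversion above, spelled out as a Python sketch. The 30-day month, 7-day week, and 40-hour week come from the paragraph; rounding down to whole weeks before multiplying is an assumption made to match the 840 figure:

```python
DAYS_PER_MONTH = 30
DAYS_PER_WEEK = 7
HOURS_PER_WEEK = 40

def months_to_weeks(person_months):
    """Convert person-months to weeks using 30-day months."""
    return person_months * DAYS_PER_MONTH / DAYS_PER_WEEK

with_html, without_html = 5.72, 4.18      # sloccount estimates in man-months
average = (with_html + without_html) / 2  # split the difference
weeks = months_to_weeks(average)
hours = int(weeks) * HOURS_PER_WEEK       # whole weeks only

print(f"{average:.2f} months")  # 4.95 months
print(hours)                    # 840
```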

When you start adding more realistic data, though, things get worse.

If you simply assume the team’s capabilities are average instead of high (which is about right for BC), things get significantly worse, even though the rest of the factors stay the same:

0.75 * 0.94 * 0.70 * 1.00 * 1.00 *                # RELY - STOR
0.87 * 0.87 * 1.00 * 1.00 * 1.00 *                # VIRT - PCAP
0.90 * 0.95 * 0.82 * 0.83 * 1.00 *                # VEXP - SCED
2.3                                               # Base organic effort

= 0.4999 effort

This changes our average from 4.95 man-months to 7.3 months, or about 31 weeks. That’s 1240 hours of work, well more than I actually spent. From design to final release, including the 1000-2000 lines of code that were removed and replaced (i.e., big effort for no increase in LoC), I spent about 40% less time than the estimate here.

…And for the skeptics, no, I’m not counting the rails-generated code, such as scripts/*. I only included app/, db/ (migration code), and test/.

However, this still is “close enough” for me to be willing to accept that it’s an okay estimate. No program can truly guess the effort involved in any given project just based on lines of code, so being even remotely close is probably good enough. The problem is when you look at less maintainable code.

Just for fun, you can look at the dev cost, which is $21k to $28k, depending on whether you count the HTML. I wish I could have been paid that kind of money for this code….

Murder Manor

This app took me far less time than BC (no more than 150-200 hours). I was more adept at writing PHP when I started this than I was at writing Ruby or using Rails when I started BC. But the overall code is still far worse because of my lack of proper OO and such. So I tweak the numbers again, to reflect a slightly skilled user of the language, but worse practices, software tools, and slightly more complex product (code was more complex even though BC as a project had more complex rules. Ever wonder why I switched from PHP for anything over a few hundred lines of code?):

0.75 * 0.94 * 0.85 * 1.00 * 1.00 *                # RELY - STOR
0.87 * 0.87 * 1.00 * 1.00 * 1.00 *                # VIRT - PCAP
0.90 * 0.95 * 1.00 * 1.00 * 1.00 *                # VEXP - SCED
2.3                                               # Base organic effort

WHOA. Effort jumps to 0.8919! New command line:

sloccount --overhead 1 --personcost 40000 --effort 0.8919 1.05

This puppy ends up being 3.4 months of work. That’s 14.5 weeks, or 580 hours of work — around triple my actual time spent!

Looking at salary info is something I tend to avoid because as projects get big, the numbers just get absurd. In this case, even with a mere 3500-line project, the estimate says that in the environment of cheap labor and no overhead multiplier, you’d need to pay somebody over $10k to rewrite that game. Good luck to whatever business actually takes these numbers at face value!

But these really aren’t the bad cases. Really large codebases are where sloccount gets absurd.

Big bad code

Slash ’em is a great test case. It isn’t OO, is highly complex, and has enough areas of poor code that I feel comfortable using values for average-competency programmers. So here are my parameters, in depth:

  • RELY: Very Low, 0.75
    • Free game, so not really any need to be highly-reliable.
  • DATA: Nominal, 1.00
    • The amount of data, in the form of text-based maps, data files, oracle files, etc. is pretty big, so this is definitely 1.00 or higher.
  • CPLX: Very High, 1.30
    • Complex as hell – the codebase supports dozens of operating systems, and has to keep track of a hell of a lot of data in a non-OO way. It’s very painful to read through and track things down.
  • TIME: High, 1.11
    • Originally Nethack was built to be very speedy to run on extremely slow systems. There are tons of hacks in the code to allow for speeding up of execution even today, possibly to accommodate pocket PCs or something.
  • STOR: Nominal, 1.00
    • I really can’t say for sure if Slash ‘Em is worried about storage space. It certainly isn’t worried about disk, as a lot of data files are stored in a text format. But I don’t know how optimized it is for RAM use – so I choose the lowest value here.
  • VIRT: Nominal, 1.00
    • Since the app supports so many platforms, this is higher than before. I only chose Nominal because once a platform is supported it doesn’t appear its drivers change regularly if at all.
  • TURN: Low, 0.87
    • Again, I don’t know what this means, so I’m assuming the best value on the grid.
  • ACAP: Nominal, 1.00
    • Mediocre analysts
  • AEXP: Nominal, 1.00
    • Mediocre experience
  • PCAP: Nominal, 1.00
    • Mediocre programmers
  • VEXP: Nominal, 1.00
    • Okay experience with the virtual machine support
  • LEXP: Nominal, 1.00
    • Mediocre language experience
  • MODP: Nominal, 1.00
    • The code isn’t OO, which for a game like this is unfortunate, but overall the code is using functions and structures well enough that I can’t really complain about a lot other than lack of OO.
  • TOOL: Nominal, 1.00
    • Again, nominal here – the devs may have used tools for developing things, I really can’t be sure. I know there isn’t any testing going on, so I can be certain that 1.00 is the best they get.
  • SCED: Nominal, 1.00
    • The nethack and slash ’em projects are unfunded, and have never (as far as I can tell) worried about a release schedule. Gotta choose the cheapest value here.

Total:

0.75 * 1.00 * 1.30 * 1.11 * 1.00 *                # RELY - STOR
0.87 *                                            # TURN (the rest are 1.00)
2.3                                               # Base organic effort

Total is now 2.166 effort. New command line, still assuming cheap labor and no overhead:

sloccount --overhead 1 --personcost 40000 --effort 2.166 1.05

Slash ‘Em is a big project, no doubt about it. But the results here are laughable at best. The project has 250k lines of code, mostly ANSI C. The estimate is that this code would take nearly 61 man-years of effort. The cost at $40k a year would be almost $2.5 million! With an average of just under 24 developers, the project could be done in two and a half years.
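For anyone who wants to see where numbers like these come from, here is a rough Python sketch of the COCOMO-style arithmetic. The effort formula (factor times KSLOC to the 1.05) matches the command line above; the schedule constants 2.5 and 0.38 are the usual basic-COCOMO organic values and are an assumption about what sloccount does internally. The 250 KSLOC input is the rounded figure, so the output lands a little under the numbers quoted above:

```python
def cocomo_organic(ksloc, effort_factor, effort_exponent=1.05,
                   schedule_factor=2.5, schedule_exponent=0.38):
    """Return (effort in person-months, schedule in calendar months)."""
    effort_pm = effort_factor * ksloc ** effort_exponent
    # Schedule constants are assumed basic-COCOMO organic defaults
    schedule_months = schedule_factor * effort_pm ** schedule_exponent
    return effort_pm, schedule_months

effort_pm, schedule_months = cocomo_organic(ksloc=250, effort_factor=2.166)
person_years = effort_pm / 12
cost = person_years * 40000 * 1            # $40k salary, overhead of 1
avg_devs = effort_pm / schedule_months     # average team size over the schedule

print(f"{person_years:.1f} person-years, {schedule_months / 12:.1f} years, "
      f"{avg_devs:.1f} devs, ${cost:,.0f}")
```

The point is not the exact digits but how quickly a fixed exponent on line count turns a big codebase into decades of supposed effort.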

I worked for a company a while ago that built niche-market software for the daycare industry. They had an application that took 2-3 people around 5 years to build. It was Visual C code, very complex, needed a lot more reliability than Slash ‘Em, was similar in size (probably closer to 200k lines of code), and had a horrible design process in which the boss would change his mind about which features he wanted fairly regularly, sometimes scrapping large sections of code. That project took at most 15 man-years to produce. To me, the claim that Slash ‘Em was that much bigger is a great reason to make the argument that linux isn’t worth a tenth what Wheeler claims it is. Good OS? Sure. But worth a billion dollars??

Linux and the gigabuck

I’m just not sure how anybody could buy Wheeler’s absurd claim that Linux would cost over a billion dollars to produce. Sloccount is interesting for sure, particularly for getting an idea of one project’s complexity compared to another project. But using the time and dollar estimates is a joke.

Wheeler’s own BS writeup proves how absurd his claims are: Red Hat Linux 6.2 would have taken 4500 man-years to build, while 7.1, released a year later, would have taken 8000 man-years. I’m aware that there was a lot of new open source in the project, and clearly a small team wasn’t building all the code. But to claim that the extra 13 million lines of code are worth 3500 years of effort, or 400 million dollars…. I dunno, to me that’s just a joke.

And here’s the other thing that one has to keep in mind: most projects are not written 100% in-house. So this perceived value of Linux due to the use of open source isn’t exclusive to Linux or open source. At every job I’ve had, we have used third-party code, both commercial and open source, to help us get a project done faster. At my previous job, about 75% of our code was third-party. And in one specific instance, we paid about a thousand dollars to get nearly 100,000 lines of C and Delphi code. The thing with licensing code like this is that the company doing the licensing isn’t charging every user the value of their code – they’re spreading out the cost to hundreds or even thousands of users so that even if their 100k lines are worth $50k, they can license the code to a hundred users at $1000 a pop. Each client pays 2% of the total costs – and the developers make more money than the code is supposedly worth. And clearly this saves a ton of time for the developer paying for the code in question.

If you ignore the fact that big companies can use open source (or commercially-licensed code), you can conjure up some amazing numbers indeed.

I can claim that Bloodsport Colosseum is an additional 45 months of effort simply by counting just the ruby gems I used (action mailer, action pack, active record, active support, rails, rake, RedCloth, and sqlite3-ruby). Suddenly BC is worth over $175k (remember, labor is still $40k a year and I am still assuming a low-effort project) due to all the open source I used to build it.

Where exactly do we draw the line, then? Maybe I include all of Ruby’s source code since I used it and its modules to help me build BC. Can I now claim that BC is worth more than a million dollars?

Vista is twice as good as Linux!

As a final proof of absurdity, MS has a pretty bad track record for projects taking time, and the whole corporate design/development flow slowing things down. Vista is supposed to be in the realm of 50 million lines of code. Using the same methods Wheeler used to compute linux’s cost and effort, we get Vista being worth a whole hell of a lot more:

Total physical source lines of code:                    50,000,000
Estimated Development Effort in Man-Years:              17,177
Estimated cost (same salaries as linux estimate,        $2.3 billion
  $56,286/year, overhead=2.4)

To me these numbers look just as crazy as the ones in the Linux estimate, but MS being the behemoth it is, I’m not going to try and make a case either way. Just keep in mind that MS would have had to dedicate almost 3,000 employees to working on Vista full-time in order to get 17,177 years of development done in 6.
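The Vista figures above can be reproduced with a few lines of Python. This sketch assumes the basic organic model with a factor of 2.4 and an exponent of 1.05 (which appears to be what the Linux estimate used), plus the salary and overhead shown in the table; the function name is made up:

```python
def basic_cocomo_person_years(sloc, factor=2.4, exponent=1.05):
    """Basic COCOMO organic effort, converted from person-months to person-years."""
    ksloc = sloc / 1000
    return factor * ksloc ** exponent / 12

years = basic_cocomo_person_years(50_000_000)
cost = years * 56286 * 2.4          # salary times overhead multiplier
devs_for_six_years = years / 6      # full-time staff needed to finish in 6 years

print(round(years))                 # 17177
print(f"${cost / 1e9:.1f} billion, {devs_for_six_years:.0f} developers")
```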

The important thing here is that by Wheeler’s logic, Vista is actually worth more than linux. By a lot.

Linux fanatics are raving idiots

So all you Linux zealots, I salute you for being so fiercely loyal to your favorite OS, but coming up with data like this (or simply believing in and quoting it) just makes Linux users look like a ravenous pack of fools. Make your arguments, push your OS, show the masses how awesome Linux can be. But make sound arguments next time.


The move to typo 4.0

Typo is my blogging software. Written in Ruby on Rails, it seemed like an ideal choice for me since I’m a big fan of the RoR movement. But like so many other open source applications, Typo has got some major problems. I’m not going to say another open source blog would have been better (though I suspect this is true from other pages I’ve found on the net), but Typo has been a major pain in the ass to upgrade.

For anybody who has to deal with this issue, I figure I’ll give a nice account here.

First, the upgrade tool is broken. If you have an old version of typo that has migrations numbered 1 through 9 instead of 001 through 009, you get conflicts during the attempt at migrating to the newest DB format. You must first delete the old migrations, then run the installer:

rm /home/user/blog_directory/db/migrate/*
typo install /home/user/blog_directory

Now you will (hopefully) get to the tests. These will probably fail if, like me, your config/database.yml file is old and doesn’t use sqlite. Or hell, if it does use sqlite but your host doesn’t support that. Anyway, so far as I’m concerned the tests should be unnecessary by the time the Typo team releases a version of Typo to the public.

Next, if you have a version of typo that uses the components directory (back before plugins were available in Rails, I’m guessing), the upgrade tool does not remove it. This is a big deal, because some of the components that are auto-loaded conflict with the plugins, causing all sorts of stupid errors. That directory has to be nuked:

rm -rf /home/username/blog_directory/components

This solves a lot of issues. I mean, a lot. If you’re getting errors about the “controller” method not being found for the CategorySidebar object, this is likely due to the components directory.

Another little quirk is that when Typo installs, it replaces the old vendor/rails directory with the newest Rails code. But it does not remove the old code! This is potentially problematic, as I ended up with a few dozen files in my vendor/rails tree that weren’t necessary, and may have caused some of my conflicts (I never was able to fully test this and now that I have things working, I’m just not interested). Very lame indeed. To fix this, kill your rails dir and re-checkout version 1.2.3:

rm -rf /home/username/blog_directory/vendor/rails
rake rails:freeze:edge TAG=rel_1-2-3

My final gripe was the lack of even a mention that older themes may not work. I had built a custom typo theme which used some custom views. But of course I didn’t know it was the theme until I spent a little time digging through the logs to figure out why things were still broken. Turned out my theme, based on the old Azure theme and some of the old view logic for displaying articles, was trying to hit code that no longer existed. Yes, my theme was using an old view and the old layout, both of which were hitting no-longer-existing code. But better API coding for backward compatibility would have made sense, since they did give you the option to use a theme to override views and layouts. Or at the very least, a warning would have been real nice. “Danger, danger, you aren’t using a built-in theme! Take the following precautions, blah blah blah, Jesus loves you.”

How do you fix the theme issue, though, if you can’t even log in to the blog to change it? Well, like all good programmers who are obsessively in love with databases, the typo team decided to store the config data in the database. And like all bad open-source programmers, they stored that data in an amazingly stupid way. I like yaml, don’t get me wrong – it’s amazingly superior to that XML crap everybody wants to believe is useful. But in a database, storing data in yaml format seems just silly.

<rant>

PEOPLE, LISTEN UP, if you’re going to store config that’s totally and utterly NOT relational, do not use a relational database. It’s simple. Store the config file as a yaml file. If you are worried about the blog being able to write to this file, fine, store your data in the DB, but at least store it in a relational sort of way. Use a field for each config directive if they’re not likely to change a lot, or use a table that acts like a hash (one field for blog_id, one for setting_name, one for setting_value). But do something that is easy to deal with via SQL. Show me the SQL to set my theme from ‘nerdbucket’ to ‘azure’ please. When you can’t use your database in a simple, straightforward way, you’ve fracking messed up. Yes, there are exceptions to every rule, but this blog software is not one of them. It would not have been hard to store the data in a neutral format that would make editing specific settings easy.
</rant>
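To make the hash-style table idea concrete, here is a minimal Python sketch using an in-memory SQLite database. The table name, column names, and sample rows are hypothetical and are not Typo's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical key/value settings table: one row per setting per blog
conn.execute("""
    CREATE TABLE settings (
        blog_id       INTEGER NOT NULL,
        setting_name  TEXT    NOT NULL,
        setting_value TEXT,
        PRIMARY KEY (blog_id, setting_name)
    )
""")
conn.executemany(
    "INSERT INTO settings VALUES (?, ?, ?)",
    [(1, "theme", "nerdbucket"), (1, "blog_name", "Example blog")],
)

# Changing the theme becomes a single, obvious UPDATE
conn.execute(
    "UPDATE settings SET setting_value = ? WHERE blog_id = ? AND setting_name = ?",
    ("azure", 1, "theme"),
)
theme = conn.execute(
    "SELECT setting_value FROM settings WHERE blog_id = 1 AND setting_name = 'theme'"
).fetchone()[0]
print(theme)  # azure
```

With a layout like this, no other setting is touched by the change, and there is no serialized blob to corrupt.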

Sorry. Anyway, how to fix this – the database has a table called “blogs” that has a single record for my blog. This record stores the base url and a bunch of yaml for the site config. You edit the field “settings” in the blogs table, and change just the part that says “theme: blah” to “theme: azure”. If you don’t have access to a simple tool like phpmyadmin, then you’ll likely have to retrieve the value from your mysql prompt, edit it in the text editor of your choice, and then reset the whole thing, making sure to use newlines properly so as not to screw up the yaml format…. Then you are up and running and can worry about fixing the theme at your leisure.
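If hand-editing the blob sounds too risky, the edit can be scripted. This is a minimal Python sketch that swaps only the theme line in a settings string and leaves every other line and newline alone. The sample settings text is invented, and fetching and saving the field from MySQL is left out:

```python
import re

def set_theme(settings_yaml, new_theme):
    """Replace the value on the first top-level 'theme:' line, leaving the rest untouched."""
    updated, count = re.subn(
        r"^theme:.*$",
        lambda match: f"theme: {new_theme}",
        settings_yaml,
        count=1,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no 'theme:' line found in settings")
    return updated

# Hypothetical contents of the settings field
settings = "--- \nblog_name: Example blog\ntheme: nerdbucket\nlimit_article_display: 10\n"
fixed = set_theme(settings, "azure")
print(fixed)
```

Because the regular expression is anchored to the start of a line, a nested key that merely ends in "theme" would not be affected.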

Now, to be fair, I think I could have logged in to the admin area without fixing my theme, and then fixed it there. But with all the problems I was having, I thought it best to set the theme in the DB to see if that helped get the whole app up and running. Obviously it wasn’t the theme that was killing my admin abilities (and I can’t even remember anymore what it was). But once I hit that horrible config storage, my stupidity felt ever so much smarter compared to the person who designed typo’s DB.

Typo is pretty sweet when you don’t have to delve under the hood. But “little” things like that can make or break software, and I hope to <deity of your choice> that the next upgrade is a whole lot smoother.


UPDATE UPDATE HOORAY

One more awesome annoyance. It seems all my old blog articles are tagged as being formatted with “Markdown”. When I created them, I formatted them with “Textile”. If you’re not up on these two awesome formatting tools, take a look at a simple example (the first is how Textile appears when run through the Markdown formatter):

  • This “website”:http://www.nerdbucket.com is really sweet, dude!
  • This website is really sweet, dude!

I’ve been using Markdown lately as I kind of prefer it now. But my old articles are in Textile format. I don’t know why upgrading my fracking blog loses the chosen format, but boy is it fun going through old articles and fixing them!!
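For the link syntax at least, the fix could be scripted instead of done by hand. A rough Python sketch that rewrites Textile-style links as Markdown links; it handles only this one construct, assumes straight double quotes in the stored text, and would swallow punctuation that directly follows a URL:

```python
import re

# Textile link: "text":url   ->   Markdown link: [text](url)
TEXTILE_LINK = re.compile(r'"([^"]+)":(\S+)')

def textile_links_to_markdown(text):
    """Convert Textile-style links to Markdown links; everything else is left as-is."""
    return TEXTILE_LINK.sub(r"[\1](\2)", text)

old = 'This "website":http://www.nerdbucket.com is really sweet, dude!'
print(textile_links_to_markdown(old))
# This [website](http://www.nerdbucket.com) is really sweet, dude!
```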


How not to benchmark software

I’ve just stumbled upon an amazingly misinformed benchmark about Flex, from an actual Adobe employee, Matt Potter.

This guy benchmarks JSON, AMFPHP, and XML as ways to transmit data between PHP and a Flex app. His findings show that XML is generally faster than either JSON or AMFPHP. This “discovery” could revolutionize the way we send and receive data on the net! Who’d have thought that XML is so efficient? Truly amazing results!

But if we choose to drop back down to Earth from the blissful land of Ignoramia, we may find that even Adobe devs can make horrible mistakes.

So why is this year-old article worth dissecting? Simple – it comes up FIRST when you search Google for “json flex”, which makes it a great tool of misinformation for people looking for ways to incorporate the awesomeness of JSON into Flex! Note that if it were a random article that was at least moderately hard to find, I probably wouldn’t care too much.

So Matt Potter compares XML, AMFPHP, and JSON. His first and most amazing mistake is that he’s using raw XML, but converting data structures in PHP into JSON and AMFPHP. XML is expensive to create as well as to read, so skipping that step completely invalidates his article in my opinion. But worse still, he tests against a local server. The network overhead of XML is going to be significantly worse than JSON in most cases (no idea about AMFPHP as I’ve never used it), so ignoring the 2-3x bigger data really doesn’t do much for providing a valid test.
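A fairer comparison starts from the same data structure for every format and looks at the bytes that would actually cross the wire. Here is a minimal Python sketch of that idea (not the PHP from the benchmark; the sample records and element names are invented, and the size gap will vary with the data):

```python
import json
import xml.etree.ElementTree as ET

# Invented sample data: the same structure feeds both serializers
records = [{"id": i, "name": f"player{i}", "score": i * 10} for i in range(100)]

def to_json(rows):
    """Serialize the rows to compact JSON bytes."""
    return json.dumps(rows, separators=(",", ":")).encode("utf-8")

def to_xml(rows):
    """Serialize the rows to XML bytes, one element per field."""
    root = ET.Element("rows")
    for row in rows:
        item = ET.SubElement(root, "row")
        for key, value in row.items():
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")

json_bytes = to_json(records)
xml_bytes = to_xml(records)
print(len(json_bytes), len(xml_bytes))
```

Timing both `to_json` and `to_xml`, and then fetching the results over a real network, would cover the two steps the benchmark skipped.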

One of the comments also mentions that there’s a PHP extension for JSON that’s better than what Matt used, and Matt’s response: “I used the Zend Framework instead of the json php extension because I really think that the Zend Framework is the easiest to setup and use, and I have other examples of using the ZF that I’m going to publish”. So instead of looking for the best tool for the job, he went with the easiest tool. But for XML testing he went with the hardest but most efficient “tool”: manual creation of XML with no conversion from objects, no use of XML creation tools, nothing.

I dislike ignorance when one tries to present facts, but this article actually makes me suspicious that his intention was to “prove” XML was the best technology of the three, and that he was willing to manipulate data in any way necessary to provide evidence. It’s pretty despicable to have a position of influence (Adobe employee) and abuse it to prove a totally BS point.

It should be noted that some of the comments, particularly Blaine McDonnell’s, ripped Matt apart better than I can. But when ignorance and/or deception rear their ugly heads, it never hurts to point them out one last time.

Bloodsport Colosseum Finished…

No, I don’t mean the programming is complete. I mean that I’m pretty much done working on it and it’ll live in its current state indefinitely. It’s fully playable of course, but it’s missing two or three features I would consider core, and has a bunch of stuff I wish weren’t there…. Unfortunately, however, I am finding that I just don’t have the drive to keep working on the game, as it is no longer anything like what I had planned to build.

I have other things I want to build, and this game has been in development for over two years now. If you count the PHP and Perl versions I had to scrap early on, a total of probably 750 hours have gone into it, and I just don’t have the free time to dump that much into a single game when I have so many other ideas I want to play with. Had it been a better game, I certainly would have kept working on it, but I just didn’t keep it focused on the main aspect of the game properly. For more info, read below as I dissect what I think are the biggest problems with BC. This is actually more for myself than anybody else, as I am hoping not to make the same mistakes in my next games.


What went wrong?

Poor Focus

Are you playing to build up an army of gladiators, or to build up the wealth and power of yourself, the manager? The answer is unfortunately both, and very strongly both.

My goal was a fantasy-sports-style game with gladiators. The gladiators themselves were expendable units. You could train them, equip them, whatever. But at the end of the day, you were merely supposed to be building a team to further yourself as a manager.

I don’t know how all fantasy sports games work, but the one I did play had very basic players that you controlled in a football (American) league. There were a few stats on each player, as far as passing, punting, blocking, etc. But for the most part it was about building a team that worked well together, and reading the game commentary each week to see how well you did.

Well, I got so wrapped up in the coolness of my RPG stats for gladiators that I made it cumbersome to manage them. Each gladiator needed several tabs to view all their details. The overview, which showed all the gladiators “at a glance”, unfortunately lacked enough information to really make meaningful choices. Add in all the ways weapons could modify a gladiator, and how different stats worked with equipment, and of course the ranks thrown in as a typical RPG leveling system, and you’ve got a lot of information to deal with.

In a game where you are the gladiator, this works out just fine, and in fact you’d probably want even more focus on stats and things. But when you’re just the manager, you find that managing your small team of 5 gladiators is a big pain.

Additionally, the focus being so split between gladiators and managers meant that building up your manager was never really emphasized. You had one stat, fame. It meant very little after about a month or two of play. You had money and equipment, but again, those meant nothing after a short amount of play.

No Team Spirit

This is sort of an extension of my lack of focus. In a fantasy sports football game, your team is a single unit. An injury to your quarterback means your whole team suffers, so you need to have a backup. Too much focus on speed and not enough on blocking, and your team is crap even though they have the best runners in the league.

In Bloodsport Colosseum, however, your “team” was a group that had no interaction except that they sometimes fought each other. That is totally not how a team in a fantasy sports game should act. It was essentially like managing a bunch of boxers, except that you couldn’t even properly choose matches because I didn’t want the players to be able to abuse the game (the computer chose matches each night).

If I’d done this as I meant to, your team would consist of at least 10 gladiators, and you’d need different specializations for your team to win. Or maybe not 10 gladiators, maybe fewer would be fine, but they would need to work together somehow.

Priorities

Tournaments should have been the first kind of match to exist. They were always on the list of things to do, but never made it to the top. Yet I think they make more sense than the green/blue arenas. Apocalypse matches were okay, but just too chaotic since I had no teaming mechanic.

Achievements should have been very high priority. You are managing a team of gladiators, and yet I never built in medals, trophies, or anything like that. While the effects on the game may have been minimal or none at all, the desire to be the best manager would have meant something! Comparing trophy cases, bragging about the elite “1000th kill” award, etc. Yes, eventually long-time players would grow bored of having every trophy available, but this would have at least made the manager a more tangible piece to the game.

Spending most of my time on gladiator stats and weapons was just a mistake. Complex gladiators aren’t necessarily a bad thing, but they shouldn’t be a burden. In a fantasy sports game, you might do basic training on your players, and you will likely choose some high-level organizational things (positions on the field and such), but you shouldn’t have to make several choices every single day on every gladiator in order to keep them working well. Having simpler stats, training, and equipment would have left a lot more time to do other things that should have been higher priority. Such as tournaments and achievements….

Manager stats should have been a priority, too. I had a plan to have managers gain specializations that helped out gladiators, with their own stats for attack, defense, money management, healing, whatever. After a certain amount of XP, you spend it to gain a new skill or something. This may not have worked well – it may have ended up being really confusing, in fact. But at least it was the right direction – focusing on the manager.

Fight logs sucked. They were just a bunch of “A hits B for X damage” kinds of messages. Making a robust combat engine should have been done before even going into closed beta testing. The fight engine was too simple, and it was one thing that could be complex as hell without confusing the player, since it was totally hands-off. Players should have been excited to see the fight log, to read a mini-story that was detailed and interesting, describing feints, parries, dismemberment (another dead idea), etc. Instead, it was always something I felt could wait until later.

Ideas For Next Time

I don’t know that there will be a new BC ever. But if there is, I have some ideas for it.

  1. Gladiators will be much simpler. No levels (ranks), or if there are levels, they’ll represent experience without giving additional bonuses to stats and hit points. Gladiators will have different stats that represent more tangible abilities, like the derived stats do today. And your choice will be simpler – each stat will be set and never change. The choices will likely be word-based, such as “very weak”, “weak”, “average”, “strong”, and “very strong”. So you may have a gladiator who is “very strong” at Offense, “weak” at Defense, “average” at Speed, and “weak” at Toughness (hit points).
  2. Recruiting stations will still exist in some form, but they won’t have such an effect on the gladiators. In all likelihood they’d have options that gave, at most, -1 or +1 boosts to specific stats, so that instead of all your stats averaging out to 3 (“average”), you may have them averaging 2.5 – 3.5.
  3. Your gladiators will work as a team by default. Maybe I’ll still allow one-on-one challenges, but I’m not sure I like how challenges ended up working in BC. I definitely didn’t like the arena choices, as I couldn’t explain them well to people, they required you to choose an arena for every gladiator, and they made the system have to be out of sync in order to have pre-scheduled battles. It was just confusing – a simpler approach of setting your team in a league or something would be more interesting as well as more user-friendly, I think.
  4. Training will be simple or nonexistent. Skills will be simple or nonexistent. If they’re included, skills will surely be complex behind the scenes, such as the combat training value is today, but the player won’t see it or have to deal with it. He’ll just see that a gladiator has a certain amount of experience with bladed weapons. When training happens, it’ll be a long-term thing. You set John Doe to train blades, and he’ll train every day until you change this preference. He’ll move up through 3 or 4 ranks of training, Basic, Advanced, and Expert or something. These will modify him in some way that will make fight logs more interesting, such as being able to parry or riposte where he previously couldn’t. Battle experience will affect attack chance and perhaps damage, while training will affect “moves” available. And it’ll all be something the player notices only passively, so it’s interesting but not cumbersome.
  5. Gladiators will not have scarring. They’ll have critical injuries which require time out in order to heal (which of course should affect the team as a whole). They may lose limbs. They may die. The player won’t have much say in this at all, other than perhaps choosing an overall team strategy that’s risky. Gladiators missing limbs will almost certainly need to be retired, but mid-season it may be better to leave a one-armed gladiator in, if his experience and training are really good.
  6. Equipment will be simple again. Simpler even than what I had when I first showed BC to the public. There will be a few choices for each group of weapons, and they’ll be very simple choices. Maybe 3 weapons per group, with maybe 3 levels of quality. And weapon damage would be the main stat, with a small modifier to speed. Some weapons would be better for parrying, perhaps, and blunt weapons would deal different damage than bladed (crushing limbs versus open wounds), but the player wouldn’t care about much other than damage and “oh, the sword will allow a parry more often, great.”
  7. As I just alluded, there would be a little more in-depth combat damage. This would be added complexity to the backend, but not something that the players would ever have to pay attention to if they didn’t want to. The most important thing about adding complexity to this kind of game is to make sure the user doesn’t have extra micromanagement just for the sake of adding more to the game. Anyhow, the combat engine would be more sophisticated as I’ve stated, with things like parries, ripostes, sudden berserker-like rages, etc. But also, damage will be totally different. Hit points are great for a traditional RPG, especially one where the character is expected to survive many brutal battles. But in a game where the warriors are expendable, a much more interesting combat system should be looked into. I wouldn’t build a truly realistic system, otherwise most gladiators would die most of the time, but I would make it semi-realistic. Damage would be to specific body parts, death would usually be from blood loss or brain damage, etc. Hit points would exist perhaps on individual body parts to measure how useful they are, but they wouldn’t determine life left. For instance, if you destroy my arm, I can’t use it in battle anymore, but further hits to that arm don’t really do much to me – it’s probably already bleeding as much as it can.
  8. As I stated above, the manager would be the focus. Players would be in the role of a manager, and would have more choices about their management than about the gladiators. Perhaps they’ll be able to gain experience, but no matter what mechanism I choose to deal with manager persistence, there would be ways to carry over something no matter how many seasons a manager played. A long-term player should be able to show trophies, medals, and other awards. There should be all kinds of random “top player” lists, showing obvious achievements like kills, wins, KOs, etc. Special awards would have to exist that players wouldn’t even know about until they won one, such as an award for crippling a large number of opponents. Maybe allow managers a long-term inventory of manager-specific “artifacts”, which would give minor boosts to his gladiators in addition to his own experience. Overall a long-time player shouldn’t be able to easily crush every n00b he encounters (although separate leagues for the newest players could alleviate this some), but he should have a small advantage for sure.
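
To make item 7 a bit more concrete, here is a rough Python sketch of per-body-part damage. Nothing like this exists in BC; the class names, the part list, and every number are made up for illustration. The only ideas taken from the list above are that hit points measure how useful a part is rather than life left, that death comes from blood loss or brain damage, and that hitting an already-destroyed part does nothing more.

```python
from dataclasses import dataclass, field


@dataclass
class BodyPart:
    hit_points: int  # how useful the part still is, not "life left"

    @property
    def usable(self) -> bool:
        return self.hit_points > 0


def default_parts() -> dict:
    # Hypothetical part list and values, picked only for illustration.
    return {
        "head": BodyPart(3),
        "torso": BodyPart(8),
        "left arm": BodyPart(4),
        "right arm": BodyPart(4),
    }


@dataclass
class Gladiator:
    name: str
    blood: int = 20  # hypothetical blood pool; running out means death
    parts: dict = field(default_factory=default_parts)

    def take_hit(self, part_name: str, damage: int) -> int:
        """Apply a hit and return the damage that actually landed."""
        part = self.parts[part_name]
        # A destroyed part absorbs nothing more: it is "already bleeding
        # as much as it can", so further hits to it deal 0.
        dealt = min(damage, part.hit_points)
        part.hit_points -= dealt
        self.blood -= dealt
        return dealt

    @property
    def alive(self) -> bool:
        # Death by blood loss or by a destroyed head (brain damage).
        return self.blood > 0 and self.parts["head"].usable


john = Gladiator("John Doe")
print(john.take_hit("left arm", 10))  # 4: the arm is destroyed
print(john.take_hit("left arm", 10))  # 0: nothing left to damage
print(john.alive)                     # True: one-armed, but still standing
```

All of this stays in the backend. The player would only ever see the result in the fight log.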

Care Bears and Dr. Sbaitso… the conspiracy is revealed!

My son is three, so I grant him a lot of leeway when it comes to his choices of video. However, I was shocked, appalled, and disturbed when my wife mentioned he was totally in love with the Care Bears: Big Wish Movie. Naturally I beat the crap out of him. Many times. But he still likes that god-forsaken video.

So one day he asked me to watch it with him. When a three-year-old asks you to do something, man, let me tell you, you’d better think real hard before refusing. Unless you really need to teach the kid that what he’s asking for is not allowed or bad for him, you obey. So anyway, I sat down and watched a fair amount of the movie.

I was once again totally shocked about midway through the movie when they made a reference to Dr. Sbaitso:

Funshine Bear: [dons Groucho Marx glasses and imitates Sigmund Freud] Hmm, interesting. Tell me about “caring.”
Wish Bear: I can’t. I feel all empty inside.
Funshine Bear: Interesting. Tell me about “empty inside.”

< Quotes modified from IMDB >

Okay, to be fair it could have been any ELIZA-like algorithm, but the point is clear – the designers of the Care Bears CGI movie (or script writer I suppose) have a geeky background, and actually put in a reference that probably one in a thousand people watching the movie would ever get.

I don’t claim to like the Care Bears all of a sudden, but I can’t help but laugh every time I see or hear that scene. Whoever decided to put that into the movie KICKS ASS.

Yet another Awesome Software Discovery!

This time it’s a piece of javascript to compute ideal body weight in a variety of ways, the most interesting of which claims to tell you what other people like you consider their ideal body weight. Very “slick” little system, if you care about such things.

I’m always on the lookout for crazy new technology, so when I found this “ideal weight calculator”:http://www.halls.md/ideal-weight/body.htm, I was overjoyed by how many different algorithms seemed to be present. When I looked at the source and found that the author was using javascript, I was again very excited. This meant I could look at (and possibly learn) his algorithms!

And so here they are: “javascript source for calculating ideal weight”:http://www.halls.md/ideal-weight/body2.js.

But read the copyright message with me and bask in the author’s sheer genius! Clearly he (or maybe she? No idea, don’t care) considers the algorithms to be proprietary and will MESS YOU UP if you steal them! So I guess I won’t bother to learn them. Hell, merely looking at them is probably illegal.

So aside from the author’s painful arrogance and stupidity, what can we learn about him from this script? Simple – he thinks he’s some sort of omniscient deity (don’t mess with me lest I strike ye down, mortal! And I will know if you try: “you won’t get away with copying this code”), and yet he doesn’t have even the tiniest iota of smarts when it comes to securing what he claims is “truly my unique creation and algorithms”.

O, Great and Wonderful Physician (yes, that’s right kids, he points out to all us lesser mortals that he’s a god damn physician!), I beseech thee! A bit of simple and kind advice for you: if you want an algorithm to be protected, don’t publish it on the web. In un-obfuscated javascript no less. Obfuscation isn’t bulletproof, not by a long shot, but it’s better than nothing.

And really, go for a server-side approach if you’re as paranoid as you seem to be. Once you use javascript, everybody who visits your site has copied it. This is not because they’re all thieves, but because of a little thing called the browser cache. Not only that, but anybody can view your proprietary algorithm and rewrite it. Copyright it all you want, a rewritten version of the algorithm is going to be COMPLETELY LEGAL! Copyrights only protect exact (or very nearly exact) duplication. You need a patent to protect an algorithm. For a basic description of the algorithm, read below. I was gonna rewrite it in javascript, but it’s really quite worthless, so explaining it should piss off our good doctor well enough.

<By the way, the message should be “It’s copyrighted”, or since you’re talking about scripts (plural), maybe “They’re copyrighted”. Note the apostrophes. Apostrophes can be your friend.>

The good news is that his script is mundane and, dare I say it, not unique – most of the script is other people’s work on pretty standard formulas. Why, you ask, is this good news? Because he doesn’t actually need to worry about people stealing it!

His “secret formula” is well worth discussion, however:

You go to the site. You put in your height and weight. His script uses a very standard formula to calculate BMI. His “people’s choice” code then cuts BMI down by 40% or 50% (gender determines this) and then adds a gender-specific value (11.5 for men, 11 + age x 0.03 for women). Then reversing the very standard BMI calculation gives you what other people supposedly consider to be an ideal weight!

That’s right, a simple algorithm that tells you what other people just like you consider ideal! But because of the simplicity of the script, it gets worse – say you’re a 440 pound, 5’6″ adult male. According to this brilliant physician, the average person of that height, weight, age, and gender thinks that 291 pounds is his ideal weight! That’s right, little ones. If you’re extremely obese, your beliefs of what is and isn’t an ideal weight become so skewed that you think being slightly less obese (but still very obese) is “ideal”. Funnier still, of course, is that as your weight changes, so will your ideal. So once our example 440lb guy gets down to 400, he thinks his ideal is 271lbs. Doesn’t matter if it takes him a month or fifteen years to drop 40lbs, his new ideal is still going to be 271.
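
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the adult calculation as described above. It is a rewrite from the description, not the doctor’s code. The function names are hypothetical, and 703 is the usual conversion factor for BMI in pounds and inches. The male numbers (halve the BMI, add 11.5) reproduce the 291 and 271 figures. The female multiplier is an assumption: the sketch reads the “40%” as a factor of 0.4, the same way the 50% works out for men.

```python
BMI_IMPERIAL_FACTOR = 703.0  # standard factor for BMI in pounds and inches


def bmi(weight_lb: float, height_in: float) -> float:
    return BMI_IMPERIAL_FACTOR * weight_lb / height_in ** 2


def peoples_choice_weight(weight_lb: float, height_in: float,
                          gender: str, age: float = 30) -> float:
    """Adult "people's choice" ideal weight, per the description above."""
    current = bmi(weight_lb, height_in)
    if gender == "male":
        ideal_bmi = 0.5 * current + 11.5
    else:
        # Assumption: "40%" is read as a 0.4 multiplier for women.
        ideal_bmi = 0.4 * current + 11 + age * 0.03
    # Reverse the BMI formula to get back to pounds.
    return ideal_bmi * height_in ** 2 / BMI_IMPERIAL_FACTOR


print(round(peoples_choice_weight(440, 66, "male")))  # 291
print(round(peoples_choice_weight(400, 66, "male")))  # 271
```

Notice that for men the whole thing collapses to “half your current weight plus a constant that depends only on height”, which is exactly why the “ideal” slides down as you lose weight.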

BUT WAIT, THERE’S MORE! When you’re not an adult, the script tells you that your peers consider your desired weight to be something that is based entirely on height and weight! So the average 440 pound, 5’6″, 18-year-old male longs to weigh 131 pounds. The moment he’s older than 18.5 (no idea where the doc pulled these numbers from), he longs to weigh 291. Yup. One day he goes to sleep hoping to be in a healthy weight range, then he wakes up thinking he was wrong, and should in fact weigh more than twice his original goal.

Arrogance, stupidity, bad programming, and then weird assumptions followed by even more stupidity. This is possibly my best Awesome Software Discovery yet!

Be careful of Rubyforge gem!

I just discovered a weird issue. The comments under my name will explain it better than I can here, but suffice it to say, if you use Net::HTTP in Ruby, do not install the RubyForge gem!

Read all about “the issue”:http://rubyforge.org/tracker/index.php?func=detail&aid=8907&group_id=1024&atid=4025 on the RubyForge bug tracking page.

The wonderful world of Cross-site scripting (XSS) – OR – why input filtering is bad

I have been dealing with XSS at my so-called “real job” recently, and it has come to my attention that a lot of people in this world are under the mistaken impression that it’s better to do “input filtering” than “output filtering”. As I pretty much came up with these terms myself (they may or may not exist elsewhere; I’m just too lazy to find out), I’ll define them for you:

Input Filtering: Scrubbing XSS-dangerous data out of your input before it gets saved anywhere.

Output Filtering: Scrubbing XSS-dangerous data only upon display.

Now, the most important concept here is that XSS is most dangerous when a user can see immediate results without alerting you, the web designer. So if you have a page that repeats their parameters back at them (say a search page where you put “Your search for $parameter could not be found”), that’s A) independent of input vs. output scrubbing, and B) by far the most dangerous kind of XSS vulnerability. Why? Because it allows a user to post a link to your site that can execute malicious javascript. Bad, bad, bad.
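
As a minimal sketch of the fix for that search page, here it is in Python using the standard library’s `html.escape`. The function name and the message are just the example from the paragraph above, not code from any real site. The point is only that the parameter gets escaped at the moment it is echoed.

```python
import html


def search_not_found_message(parameter: str) -> str:
    # Escape at the point of display; the parameter is never trusted as HTML.
    return f"Your search for {html.escape(parameter)} could not be found"


print(search_not_found_message("<script>alert(1)</script>"))
# Your search for &lt;script&gt;alert(1)&lt;/script&gt; could not be found
```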

After echoing user parameters is fixed, you have to look at how you display stored data. This is where the type of scrubbing comes into play – do you scrub the data before storing to your database / file system? Or do you only scrub when you’re about to display the data?

I will soon prove that input scrubbing is for pansies who are paranoid and tend to make up pathetic lies about their imaginary 20-year-old girlfriends.

Why input filtering is inefficient

  • It’s bad to store data in a display-specific way (have to unencode when displaying PDF, email/text reports, etc).
  • You have to modify other areas of code than just DB storage, such as searching (search for “<blah>” won’t yield “&lt;blah&gt;”), which may not be immediately obvious.
    • You could just auto-filter all incoming data, but there may be cases where you really can’t or don’t want to. I personally dislike blind filtering like this unless there is no better option.
  • If you have existing data, you have to check it for pre-existing problems. With large data, this can be very slow.
  • If you’re truly paranoid (as I am), you still won’t trust the DB data and will need to find a way to have input filtering work nicely with output filtering. This is a whole lot more work than just doing one or the other.
  • If you use a good MVC system like Rails, you can actually escape all text fields as they’re read from the database if you want. With a carefully written ActiveRecord plugin to Rails, I’d bet you could have all accessors automatically escape their data if it’s textual. And even provide a method for getting at the unsafe data.
    • I still don’t like such blind scrubbing logic, but better to blindly display scrubbed data than to blindly alter data before it hits your database.
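
Two of the points above can be sketched in a few lines of Python: the search mismatch you get from input filtering, and an accessor that escapes on read while still offering the raw value. The `Comment` class and `search` function are hypothetical stand-ins, not Rails or ActiveRecord code.

```python
import html


class Comment:
    """Hypothetical record that stores text exactly as the user entered it."""

    def __init__(self, body: str):
        self._body = body

    @property
    def body(self) -> str:
        # Default accessor: safe for HTML display.
        return html.escape(self._body)

    @property
    def raw_body(self) -> str:
        # Explicit escape hatch for PDFs, text emails, searching, etc.
        return self._body


comments = [Comment("I like <blah> tags")]


def search(term: str) -> list:
    # Searching works on the original text, so "<blah>" is found.
    return [c for c in comments if term in c.raw_body]


print(len(search("<blah>")))   # 1
print(comments[0].body)        # I like &lt;blah&gt; tags

# With input filtering, the stored text would already be escaped,
# and the same search would come up empty:
stored_escaped = html.escape("I like <blah> tags")
print("<blah>" in stored_escaped)  # False
```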

Why input filtering can be dangerous

  • If you can’t trust your programmers to do proper output filtering, why would you trust them to do proper input filtering?
    • Yes, input filtering is liable to be in fewer locations, particularly if you filter all incoming parameters, but it’s still not a silver bullet, and has a lot of long-term risks when mistakes do happen (read on for details).
  • Compare to output filtering in terms of the bug factor:
    • Bugs will happen. If you truly believe you don’t ever write code with bugs, then by all means ignore this section. I’ll get a good laugh when you tell me about your first big project that went from a two-week estimate to a six-month half-finished-and-then-rewritten-from-scratch project from hell.
    • If you mess up an output filter:
      • You probably have an issue that’s confined to a single area on your site (the area you messed up).
      • You do a quick hotfix, and the site is once again safe.
    • If you mess up an input filter:
      • Every area of the site that contains the data you missed is at risk.
      • You do a quick hotfix to stop anything new from coming in, but existing data is still currently at risk.
      • You find and quickly fix the very obvious offending data in the database.
      • You wait until the site is slow (or you can take it down) and run through all data entered since you suspect the exploit came into existence, fixing it record by record.
  • If future XSS issues arise, you have to retroactively fix your old data again instead of merely fixing your filter.
    • New XSS vulnerabilities won’t arise, you say? Maybe so, but how many times have we computer folk shot ourselves in the foot with presumptions about the future? (We’ll never need more than 640k memory, nobody will still be using this old software when Y2K finally hits, etc)
    • Note that XSS attackers have discovered that in some cases, the backtick character (`) will work to do specific JS-oriented attacks. This is not a character that is scrubbed by at least two different html_escape types of functions that I know of. Enjoy retroactive data-fixing? Me too!
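
The backtick case shows what the difference looks like in practice. Python’s standard `html.escape` is one such function: it handles `&`, `<`, `>` and both quote characters, and leaves the backtick alone. With output filtering, the whole fix is one change to the display filter. The `escape_for_display` wrapper below is a hypothetical name for that filter.

```python
import html


def escape_for_display(text: str) -> str:
    # html.escape covers &, <, >, " and ' but not the backtick,
    # so the backtick is replaced with its numeric entity afterwards.
    return html.escape(text).replace("`", "&#96;")


print(html.escape("`"))            # ` (unchanged)
print(escape_for_display("`x`"))   # &#96;x&#96;
```

Data already in the database needs no touching, because it was never altered in the first place.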

Why input filtering can be better (and my incessant arguments to prove that it really can’t)

The most logical argument I was given is that in a large enterprise, control of data output gets pretty tricky. So as far as I’m concerned, large companies are the only place the below issues even have a tiny bit of merit. And even then….

  • In a large enterprise, you know that nobody will inadvertently display unsafe data, because all data is safe.
    • Unless of course somebody writes a program that makes changes to the DB. Less likely than a rogue program that merely displays data, I agree, but still a possibility. In an organization that’s big enough to be at risk of apps that weren’t built by the “proper” people reading its data, I’d say there is a definite risk that such apps will be writing to said data as well.
    • At my job, there have been several cases where somebody who wasn’t even a part of IT (a manager and a content designer) modified data directly in SQL, bypassing any hope of safeguards.
    • In a large enterprise, I think it’s even more important than ever that all access to the DB goes through knowledgeable IT staff. Yes, I know this is a pipe dream, but I still think proper procedures can allow output filtering to be the clearly correct option.
  • You can detect problems with input filters more easily, because you have the data that could be dangerous right at your fingertips. If need be, write a program that periodically audits your data to check for unsafe characters. If you messed up an input filter, this program can save you.
    • Good testing does this same thing for output filtering. It’s far harder to write perfect tests for your app’s HTML output than to write a program to audit the DB for unsafe data, but it’s still the right way to do it.
    • Resource usage is wasteful in my opinion, when the resources are being used to prevent data from simply being stored in its original state.
    • If you have a large amount of data that is changing all the time, this solution may simply not be doable. In what situation would you have that much data changing that regularly? Oh… I don’t know… maybe in a big corporate enterprise?
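
For reference, the kind of audit program being argued about might look something like this Python sketch. The row format (a list of dicts with an `id` key) and the character set are assumptions made for illustration; a real audit would read from the actual database. The ampersand is deliberately left out of the unsafe set, because properly escaped data is full of entities like `&lt;`.

```python
import re

# Characters that should never appear raw in input-filtered data.
UNSAFE = re.compile(r"[<>\"'`]")


def audit(rows: list) -> list:
    """Return (row id, column, offending characters) for each bad value."""
    findings = []
    for row in rows:
        for column, value in row.items():
            if isinstance(value, str):
                bad = sorted(set(UNSAFE.findall(value)))
                if bad:
                    findings.append((row["id"], column, bad))
    return findings


rows = [
    {"id": 1, "comment": "hello &lt;b&gt;world&lt;/b&gt;"},
    {"id": 2, "comment": "<script>x</script>"},
]
print(audit(rows))  # [(2, 'comment', ['<', '>'])]
```

Every run walks every text value in every row, which is exactly the resource cost complained about above.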