h2. Disclaimers
Ruby on Rails is as good as the hype, it really is. Building a maintainable web application just doesn’t get much better (yet). These are just some caveats I’ve found while using ROR(Ruby on Rails), and some will probably make me sound a lot more pissy than I intend, so keep in mind I would still rather use ROR(Ruby on Rails) than PHP, Perl, java, C, C#, etc. And also keep in mind that some of the problems I’m mentioning here are going to be found in other frameworks. ORM systems try to keep us from having to write SQL, and that is guaranteed to have some issues when performance is our #1 concern.
I’m also the kind of person who seems to get pissed off about almost anything. And the thing is, I’m not actually pissed off – I’m annoyed. I just happen to be vocal about my annoyances…
One final note… I’m not claiming to be a Rails expert! These issues are ones that I find problematic because of people like me who learn as we go. A true Rails guru probably will have better solutions to a lot of these issues. But I’m trying to help the Rails people who haven’t had the time to learn every little nuance in Rails, because I think there are a lot more of those than there are gurus…
h2. Background
Here’s the deal – at my job, I was assigned to a big project (monstrously big when you consider I was the only developer for 90% of the project), and I chose to use ROR(Ruby on Rails) for it. I don’t know if I’m at liberty to discuss the exact details, so let’s just say I’m working on a community-oriented site, revamping their product reviews. Currently, the site has information about something like 70,000 products, and each product has anywhere from 1 to 1000 user-submitted reviews. My revamp is a temporary fix until we get a java portlet solution working (That one was definitely not my decision).
h2. Minor annoyances
My main gripes relate to performance, so I’ll cover them in detail below. First, a list of my minor gripes… keep in mind, it’s often the “little” things that add up.
Rails doesn’t currently support portlets in any way (I can’t find anything about ’em on google, at least). If it did, maybe we could convince the genius management types that Rails is a suitable solution. Hell, I don’t even think portlets are half as interesting to site visitors as they are to site builders. But since they’re a requirement in the long term, they’re what we’ve got.
Rails documentation is pretty sparse for some of the internals. Look at the api site (http://api.rubyonrails.com) and try to find information about RJS or specific information about using routing / named routes. It may be there, but I sure can’t find it! And much of the documentation simply doesn’t go deep enough. Look at ActiveRecord::Migration and ActiveRecord::Schema – they tend to give the simplest cases and still have undocumented functions that you have to look into the source to figure out… no big surprise, Rails is still new, but still an annoyance.
The testing is weird. I haven’t tried integration testing yet, and I think it might fix some of my issues with testing, but unit tests and functional tests are a bit odd at times. I have one test that passes when run by itself, 100 out of 100 times. But it fails when run with rake. I still haven’t figured that out… and functional tests… ugh. First off, the name seems to be wrong. They’re just more view/content-oriented unit tests. Second, they act strangely if you call multiple actions in one test. Even with the recent inclusion of integration testing, it’s just more convenient to write a “quick” functional test that hits the same action in various ways. But if you do that, there are cases where the test won’t work. I’ve had to split up some tests in order to get them to work in a situation where, for some reason, the testing framework is breaking them. Like the data refuses to change after calling one action… and this only happens sometimes, not always. And with such crappy documentation about how testing works (most docs for testing are third-party docs, and generally very shallow), it’s a pain in the ass!
Some things are just documented incorrectly, and seem to be done on purpose for “simplicity”. Associations, for instance. If I ask for product.reviews, the docs say I get an array back. Well, technically it is an Array. But it’s been mucked with (the specific instance itself, not the Array class) in order to allow for the association’s special functions. Let’s say I want placeholders – for products with less than 10 reviews, I want the array padded to 10 items, where nil items tell my view to show the text, “Your review could be here!” (Stupid example, yes. There are other ways to do this, etc, etc. But the point isn’t this specific example so shut up.) The moment I call @reviews.push(nil)
on that “Array” I got back, I’ll hit an error that nil isn’t a review, so I can’t add it to my array. The Array object that’s sent back is modified to allow the cool association features, but this is not clearly documented! The solution? @reviews = (complexassociationlogic).dup
. By adding dup, you get a new Array that hasn’t been mucked with. Horrible for performance on large Arrays, I’d bet, but if you know the Array is going to be small, you can deal with it and get a true Array back.
Inconsistent API gets really old. Parameters are a great example. When parsing parameters, hashes exist for parameters that are sent in a certain way (‘object[attribute]’ as an input field’s name is the most common reason to see this). And in the testing framework, you can send parameters to an action as hash elements, such as:
get :action, :param1 => {:nested1 => 1, :nested2 => 2}
You cannot do this with url_for style parameters! It will not internally convert from a hash to the expected “url-friendly” values. I don’t see the usefulness in allowing the hashed params in two places, and then having them break in another! I’m hoping it’s just an oversight, but all the same, it’s difficult.
h2. Major annoyances
Quick sidenote about how badly this site was written: when a user submits a review for a product, the review is stored as a flat text file instead of in the database. When the admin looks at reviews for approval (or rejection), he’s looking at the first line on the flat text file. When he clicks “Next”, the ENTIRE FILE is rewritten to move the next line to the top of the file. And this is one of the more logical operations on the site.
Other examples include the fact that there’s currently a page that shows all reviews for a given category, and the biggest category, housing over 15,000 products, ends up with a 2.3Mb page that’s got so much stuff on it, it’s totally worthless to anybody. Don’t forget the minor detail that all pages are static html, generated every night by a review generation script that takes up to two hours to run on some extrememly fast systems (yes, there really is a lot of data). Dynamic content is so much sweeter after seeing a system like this.
Moving right along… so I’ve got this crappy legacy system to work with. I’m only trying to make the system a little easier to use, and more maintainable in the short term. This means I’m not getting any content folks to rewrite pages, I’m not totally redoing layout, etc. It’s all going to look very similar to the end user, but behave much differently on the back end. The current system is all perl scripts, and is easily the worst code I’ve ever seen. And given that I’ve seen my own code from years ago, that’s saying quite a lot.
So while I’m rewriting a lot of the backend here, I have to try my hardest to keep the look and feel moderately similar to what’s there right now. Tricky, tricky. And when moving from gobs of static files to dynamically-generated output, things get really painful….
Now to unveil my #1 gripe: Rails is very stupid when it builds SQL. Watch the dev log fly by when you do things that are oh-so-simple in code. (Lame gripe? Sure… this is actually a gripe that would exist in most ORM frameworks, even. But with this project it caused me some major headaches, so instead of having a more legitimate major gripe, I have this one. Deal with it.)
First example is a Rails newbie (me around the beginning of 2006). I have a list of manufacturers, and want to show each one, its products, and the product reviews. Simple:
<% @manufacturers.each do |mfg| %>
<%= @manufacturer.name %>
<% mfg.products.each do |prod| %>
<%= product.name %>
<% prod.reviews.each do |review| %>
<% review.ratings.each do |rating| %>
- <%= rating.name %>: <%= rating.text %>
<% end %>
<% end %>
<% end %>
<% end %>
When there are 1000 manufacturers for a given category, each with an average of 15 products, each with an average of 10 reviews, each with ~5 rating items, you can see very quickly that the number of hits to the database is insane. And this is a relatively simple example of what I’ve had to do.
I learned quickly about eager loading. When I load my manufactureres, I pull in all products at the same time. This means a longer hit to the DB, but a lot less time parsing. But this still means a lot of hits for reviews and ratings! What do we do?
Well, some of the smarter folks out there might suggest using a :through association to eagerly load the reviews. Then we’re only grabing each review’s ratings. Great idea, right? Surprisingly, no. While it may be a better situation than the prior one, it’s still far too slow for a page hit. The SQL for eagerly loading products will run on our dev machines in about 1/10th of a second. Throw in reviews and it’s up to TWELVE SECONDS! (Yes, I’ve already checked indexes, thanks for the tip, smartass). What’s more, the query gives me over 200,000 rows, while the manufacturer + product query only gives me about 30,000. That’s a whole lot more data to deal with. In fact, it’s exactly the number of rows that are in the review table to begin with. (This is obvious to somebody familiar with SQL….) Either way, the page load just for that data being processed (read from database and then turned into ActiveRecord objects) can end up being in excess of a minute.
Now the naysayers will pipe up with “use caching!”
So on to my next major gripe: caching is not very useful for a site with huge amounts of data to present. Oh sure, only one person has to be hit with that huge penalty, then everybody else is good. No big deal. Except when you need your data refreshed every hour, and 10 places on the site are dangerously slow (I consider >30 seconds to be in this realm). All of a sudden you have 10 pissed off customers an hour, 240 a day, and eventually everybody knows something’s not right.
Background caching is something I would love to see! I wish I were a better programmer, I’d submit a patch… but boy, wouldn’t it be great to be able to say, “start rebuilding this cache, but until it’s done, show the old cached page”? Then nobody is ever hit with that initial page load! That would make insanely large sites able to present a perfect interface to the user, while keeping the data up to date as often as needed!
Okay, back to my situation. Not only do I need to display all the data previously mentioned, but I also need to present some advertising-related data. Each product can have many advertiser_links. Each of those links points to an advertiser. To get at that data, I’ve just added two new joins. Pulling all the data at once is simply not doable. Just adding the ratings to the above query (remember we were just doing manufacturer, product, and review) makes it too slow (47 seconds!! over 1,370,000 rows!! YAY!) for me to ever consider for a real website. Adding the advertising information isn’t even worth timing. I tried, just for kicks, and couldn’t sit around long enough to see how long it took.
So how do I get past this? Why, I turn to my friend the paginator! That’ll fix all my woes! Right?
Once again… no. It’ll help, but certainly isn’t the final solution. The problem is that this site just has too much data. I paginate the hell out of it, and still most pages take too long to wait for loading. Especially since I can’t eagerly load as much as I need to in order to stop hitting the DB 10-50 times per product. Pagination is part of the answer, but by itself it isn’t bringing page load down low enough to matter. Early tests revealed 5-15 seconds per page because even though I’m pulling less data, I’m still hitting the DB so many times. And when I do lots of eager loading, the DB spends so much time that I don’t gain much on getting a small result set.
SO HOW THE HELL DO I GET PAST THIS? At this point I had hit a wall; I just didn’t know what else to do.
NOTE: there may be more options available that I don’t know. I know enough about Rails to know that there’s still a lot more that I have to learn.
I ended up doing what I should have done long ago: building custom SQL. Okay, I actually didn’t build SQL outside of the Rails system except once, where I needed hundreds of thousands of rows of data without the overhead of building an object for each one. But I did have to make heavy use of the find method’s :joins and :select options (once you hit those two with a :conditions option, you’re writing about 70% of the sql for the query, minus the weird table/row aliasing rails does), and I did a lot of preloading data. For instance, I’d have my list of manufacturers (with products eagerly loaded) and the load all advertiser information, stored in a hash by product id. This allowed me to hit the database only a few times, and with relatively quick transactions.
Another little cheat was to do a lot of @product.mfgid = xxx@ or @if (product.mfgid == xxx)@, instead of @product.manufacturer = xxx@ and @if (product.manufacturer == xxx)@. I may know an id for something, but not have it loaded on that specific product instance. Punching in the extra few characters (and therefore not loading more data) can yield tremendous performance bonuses.
Another nice solution I came up with is one I will put up as a plugin at some point – cachedfind. If findevery (all finds eventually hit this one) is called with the same parameters twice, within a given amount of time, the plugin will not hit the database again. This saved me a ton of DB hits on tables that have a lot of data that never changes. Some will say, “A good DB server will cache that for you!”, and they are right. But that still requires a hit to the DB server (rarely on the same server your code is running, at least in a large app), and Rails has to parse that data. Returning pre-parsed data that was previously being loaded 50+ times per page makes a big difference. This can end up involving a ton of data being stored in memory, so this trick is really only good for small sets of data that are rarely changed, and loaded in a lot of different places for different reasons. In my case, things like product categories (loaded for products, manufacturerers, and reviews) and advertiser info (only a dozen or so advertisers exist in the data, but the extra join on two or three queries was still painful).
The moral of the story is simple. Rails is proven for small sites, and even medium-to-large sites written with Rails from the beginning. Redoing a huge site to work in Rails requires a lot more than what the tutorials will ever tell you. It requires a lot of banging your head into walls, trying to rattle out one more life-saving plan. It requires a tenacity that not a lot of developers will have, especially those buying into the “Rails projects go 10 times faster!” hype. It requires a lot of digging into the internals of Rails. It requires at least a small amount of brain damage.
Is it worth it to do this when I could be working on the Java solution? I certainly think so. I find that Java is painful enough to use (compared to Ruby) that it is still easier to play with new technology that may have little quirks than to deal with “reliable” and “proven” frameworks based in Java.