I’m diggin’ Go, but I’ve noticed there’s a tendency to “freeze” third-party code directly into one’s repositories on GitHub. This is absolutely awful in my opinion – the code isn’t a part of your project, so it has no place in your repository! I do understand the problems that lead people there, trust me, but this approach has some issues a lot of folks won’t realize until they’re hit HARD by a show-stopper.
In Ruby, and Rails 2 apps in particular, it was a strongly-encouraged practice to just commit vendor code directly into your repository. That all changed with Rails 3, after people had lived with the “store all dependencies locally” approach for a while and realized it wasn’t a great practice.
In Node.js, npm originally pushed developers into checking in all the code, too, with very opinionated explanations of how absurd it was that anybody would ever not want their dependencies in their code repository… then they added a freeze command which only stores package names and version numbers instead.
Though this isn’t as big a problem in Go, and may never be, given that you can deploy binaries instead of giant blobs of code, it’s worth mentioning because some Go devs will inevitably be compiling on production, even though their development was done elsewhere. If Go ends up adding support for compiling libraries as part of the build process, development dependencies get a lot more hairy.
In Ruby and Node.js, this was a huge problem. Pulling down dependencies sometimes meant code had to be compiled to provide a local library – in Ruby, I saw this with mysql and libxml. In Node.js, I believe this happened with a mongo package. The problem here was that while I might be on Linux, another dev might be on a Mac, and our production system might be some other architecture, or 64-bit vs. 32-bit, or who knows what else. Committing what I compile may mean production has to have some kind of awful hack in order to work. Explicitly ignoring the compiled binaries means recompiling on production, which isn’t awesome.
Sure, you can require production to be exactly the same as dev machines, but in a serious shop, this isn’t always a viable option. Production might be Windows for some insane reason your devs can’t control, and forcing devs to use Windows is no better than forcing them to use a Mac. And don’t get me started on Vagrant. It can help in this area, but it still essentially forces your devs into a particular workflow that might be awkward for them, and in some cases causes far more problems than it’s worth.
Let your devs do things the way that’s most efficient for them – even if that means (gasp) you don’t check dependencies into your repo!
Say I build 20 small apps using the gorilla toolkit and in a year there’s a major security problem discovered. I have to go into each of those repositories, pull down the latest version of the gorilla code, commit a giant checkin, and get it deployed. If you have any kind of code review process, this is even slower.
Sure, you still have a fair amount of work in Go-land for these kinds of problems. It’s not like Ruby where you just update a central Gem and restart your apps. (This is one area where Go could stand to improve: allowing compiled shared libraries, not just code inclusion for every third-party library under the sun.) But it’s still easier to update the compromised code (gorilla in my example) just once, then check out each dependent repo, recompile it, and deploy the new binaries.
Insane code review
Having huge chunks of code, sometimes hundreds of thousands of lines, in your repository can often make code review literally impossible. I could claim to update some dependencies and inject a backdoor in somebody else’s library. Nobody in the world is good at code reviews once a change reaches a certain size, something like 400 lines. Now imagine a code review of 100k lines in a shop where you can’t necessarily put absolute trust in your developers.
Even accidental problems can happen this way by virtue of a developer updating dependencies to the wrong version. Or I’m a lazy developer and I put the dependency update in the same commit as a pile of other code. Will anybody even notice my code?
By itself, this isn’t a reason to avoid checking in dependencies, but it adds an extra level of pain to the process.
What happens when I include the code for ten repositories that all use different licensing, and that licensing isn’t necessarily compatible with the licensing used by my project?
I have no idea. You may have to ask a lawyer and possibly dump code you’ve relied on for years if somebody complains.
When you bundle a list of dependencies, you don’t really have this same kind of problem. You’re saying “my project needs these things to work”.
When you bundle code into your repository, you’re saying “these things are part of my project”.
Legally, there’s a HUGE difference between the two statements.
So what’s the solution?
The good news is the Go creators aren’t the opinionated ones. They say (paraphrased) “here’s how we do it, but we deliberately aren’t pushing a specific workflow on others because we know there’s no one-size-fits-all option”.
The bad news is the community is very opinionated, and sometimes that’s all it takes. Bundler wasn’t built by the Ruby team or the Rails team, but it’s become “the” way to handle dependencies in a Rails application. Though I like its general approach, it’s not awesome for all Rails shops in existence to be forced into a single approach.
So for my part, if I develop anything that doesn’t need me to be 100% certain it never changes, I’ll probably just use simple import statements and let “go get” do its thing.
If I’m a bit more picky, I will probably build a Makefile that runs a “go get”, then checks out a specific hash, and tell consumers they just gotta deal with one extra layer of annoyance if they want my project to work.
If I absolutely must have a specific blob of code, and I absolutely need others to be able to use it easily, then yes, I might actually copy the third-party codebase into my own repository. I’ll have to do some license checking and possibly alter my code’s license to accommodate this (for instance, I can’t release a project as public domain [CC0 most likely] if I’m including GPL code), but if I really need to, I may indeed include vendored code.
I guess my point is there are a lot of ways to do this, and I really hope the Go community sees that “CHECK IN ALL THE THINGS” isn’t necessary or even a good idea in most situations.
I forgot my usual sophomoric name-calling for this post, and that left me feeling a bit empty, so here goes.
The folks who build GoDep are idiots for forcing all users of their (previously promising) system into storing all code locally without any real in-depth discussion of any of these problems. And choosing the directory name “_workspace”, even more awesome. Big thumbs-up to bad developers who are so stupid they don’t even know why they suck.