This is the full thread of the email I sent on June 4th, 2010 to warn people about the problems I saw with Heiler. This has had names removed and replaced with all-caps representations of the names, but is otherwise exactly what I wrote to my manager and various other people at MF shortly before walking out the door.
My quitting was for very difficult and unrelated family reasons — for the record they knew I was quitting over five weeks prior to the date I sent this email.
I went back and forth on whether to send this, but in the end I really felt a response was necessary. I hope you don’t take this as a personal attack; I just felt that this needed to be said, and your message really took me by surprise.
BOSS LADY, I don’t feel that your response is fair to the IT team, especially in light of the problems we’ve had with Heiler, Jive, Unica, etc. Looking back, we’ve shown continually that the IT team here is very good at building custom solutions, and no matter how many growing pains we’ve had, with the right resources we have built very solid, scalable applications.
- SRS by itself is proof of our strength as an IT team, as we have an incredibly solid and fast site, and we keep getting awards for it. When we have the time to really work on a project for SRS, it turns out better than expected more often than not. Failures — truly critical failures — are almost unheard of. Given time and money, SRS could become a better product than anything off the shelf, just as it was years ago.
- With Pro Karma and Jive, our attempts to replace the legacy HC code have fallen short, where the in-house work I have done has been not only successful, but solid. The Ruby blades for HC have been the most stable HC systems in the past few years. A system I built to be a temporary replacement for the user reviews ended up carrying us for years because Pro Karma, even with 30 man-months of effort, never completed their project. Jive completed it, but was extremely late, had major performance problems, and even today doesn’t have the same capabilities as the legacy code.
- DAX failed early on, launching extremely late and with many problems. Thanks primarily to the efforts of our in-house staff, it’s gotten to a point where it’s actually showing some real promise.
What makes Heiler such a big risk? It’s an off-the-shelf solution, and Heiler code will never be owned by us (we can’t even view most of it). This makes it hard to make use of our talents, and I think it’s been showing for months. I’m confident that the in-house MF staff could have written a new system from scratch in under 12 months, given 4-5 full-time developers, and it would have been comparable in features to Heiler, with full ownership of the code, native Mac clients, input from the various teams who will use it, etc.
Here’s the bottom line: stating that Heiler must go live no matter what is to say the company’s welfare doesn’t matter as much as deploying a single piece of software. When we first found problems with Heiler, our criticisms were met with resistance, and people were told not to be negative. People are now too afraid to express their opinions, so management assumes the project is going well. I think we’d all feel more confident about launching this part of NextGen if we had re-evaluated it months ago, and considered a plan B. It should never be too late to have a plan B, as any large project carries a large level of risk.
Incidentally, my elaboration on the Heiler flaws was requested, because my initial “Heiler concerns” email was too vague. I didn’t just send this out of the blue. I wanted to provide constructive feedback, and I wanted that feedback to reach the right people. It’s very sad when people care enough about MF to voice legitimate concerns in a professional and tactful way, and they get repeatedly swatted down.
On 05/31/2010 11:32 AM, BOSS LADY wrote:
Frankly NERDMASTER, People are working the weekend because they care about this project and want to see it go live. Basically we are trying to replace the crap that the in house people have developed over the last 10 years.
From: NERDMASTER
Sent: Friday, May 28, 2010 3:31 PM
To: PERSON A
Cc: PERSON B; BOSS LADY
Subject: Re: Heiler concerns
Okay, let’s see….
- The architecture is bad right from the ground up.
- They built their system on top of an application called Eclipse. Eclipse is an application built to help programmers write code, and it’s got a very flexible plug-in system. You can make Eclipse do just about anything with enough plug-in code, but that comes at a price. For starters, Eclipse is a slow platform to begin with. It’s also got some limitations that take a good deal of work to get around, because it was really built for programming, not for enterprise apps like a PIM system.
- The architecture of the database is very bad as well. DATA ARCHITECT can probably give you better specifics, but the core problem is that the database is very inefficient. You have to pull a lot more data than you need most of the time in order to get product information. There’s a ton of wasted space because of the very poor “custom field” system. In order to interpret what a given field means, the code has to jump through a lot of hoops.
- The code architecture, at least for the majority of code I’ve seen, is poor in the sense that it’s clearly written by people who didn’t worry much about design, or else didn’t understand a lot of best practices for writing a large, scalable system. More on that in #2.
- The code is clearly made in a hasty way, at least the code we’ve been allowed to see. It’s clear that some of the programmers don’t know much about object-oriented design, code reuse, polymorphism, design patterns, and other computer science concepts that are typically learned within a few years of work experience, or a couple years of college. It’s hard to show an example that would be meaningful to anybody who isn’t a developer, but one situation we see a lot is code that is copied and pasted with minor changes. This is typically indicative of a situation where an object hierarchy can make much cleaner and more efficient code, and copying the code everywhere introduces a lot of risk – if you have the same general system copied ten times, a bug requires ten fixes instead of one.
- The rights to the core code are probably part of a contract, but we’re allowed to see it whenever they have a hot-patch, because hot-patches have to get into the Subversion repository in order for us to continue developing against them. So clearly they are willing to let us see their code sometimes — just not all of it at once. I think we need to rework the contract given the problems we’ve had, because I feel our in-house team could really help Heiler by being able to see what parts of the code are causing problems. I don’t mean to be rude to the Heiler team, but they clearly do not have the experience with large-scale systems that we have.
- False deadlines are created by somebody who has no idea what this project’s problems are. I don’t know why anybody has ever tried to impose a set deadline, but all it does is push us (and Heiler) to rush things when they should have been very carefully designed. Implementing a system as huge as a PIM replacement is such a major change that rushing it and not spending time carefully designing every step is just begging for a disaster. I think today’s situation is a good example. A few people are working over the weekend to try and get this system ready for Wednesday’s launch, which still probably won’t happen without significant risk to the business. People already burned out are being asked to put in even more time on a system they aren’t going to be able to fix in a weekend.
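To make the copy-and-paste point in the code-quality item above a bit more concrete, here is a minimal sketch. It is purely hypothetical and is not Heiler code: the exporter names, the product fields, and the cleanup rule are all invented for illustration. The first half shows the same logic pasted into two functions; the second half shows roughly how a small object hierarchy could hold the shared logic in one place, so a bug in the cleanup step would need one fix instead of one per copy.

```python
from abc import ABC, abstractmethod


# Copy-and-paste style (hypothetical): each exporter repeats the cleanup logic.
def export_csv_copied(products):
    rows = []
    for p in products:
        name = p["name"].strip().title()  # duplicated cleanup
        rows.append(f'{p["sku"]},{name}')
    return "\n".join(rows)


def export_tsv_copied(products):
    rows = []
    for p in products:
        name = p["name"].strip().title()  # same cleanup, pasted again
        rows.append(f'{p["sku"]}\t{name}')
    return "\n".join(rows)


# Hierarchy style (hypothetical): shared logic lives once in the base class.
class Exporter(ABC):
    def export(self, products):
        return "\n".join(
            self.format_row(p["sku"], self.clean_name(p["name"]))
            for p in products
        )

    @staticmethod
    def clean_name(raw):
        # A bug here is fixed once and every subclass picks up the fix.
        return raw.strip().title()

    @abstractmethod
    def format_row(self, sku, name):
        """Return one output line; only this part differs per format."""


class CsvExporter(Exporter):
    def format_row(self, sku, name):
        return f"{sku},{name}"


class TsvExporter(Exporter):
    def format_row(self, sku, name):
        return f"{sku}\t{name}"
```

With two formats the duplication looks harmless; the risk described above shows up once the same block has been pasted ten times and the cleanup rule needs to change.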
Thanks for listening to my rants :)
On 05/28/2010 09:08 AM, PERSON A wrote:
NERDMASTER, Thanks for your thoughts! I commend the work you and your team are doing and would ask for more detail and formal documentation of the issues you’ve outlined. As we talked about this week, I know your time is limited so only if more documentation on these aspects would serve your team, please help us out.
- Clarify how the architecture is bad. Is it bad relationally or in other ways? Observations about the strengths or weaknesses of Heiler, related to the challenges and opportunities they provide, are constructive so the company is better equipped to provide resources to make improvements.
- Will you show how the code is bad with an example or two of bad vs good code you’re typically seeing?
- I gather that lack of rights to the source code is part of the contract, right? Any ideas for a solution?
- Items do slip through the cracks when work is rushed. Please share any ideas of how your team could be better supported.
Thank you again, PERSON A
On 5/18/10 2:57 PM, NERDMASTER wrote:
I’ve done a great job spreading around other people’s concerns, but I guess I’ve never really properly brought up my own to you, PERSON B.
- The architecture and code are really bad – as you said, this isn’t necessarily easy to address, but one problem I always had before was that it takes me (and MY COWORKER) a long time to get things done sometimes, just because we’ll have to dig through really confusing code to find anything. It’s not something training can help, and sometimes we still have to talk to BAD HEILER DEV about the confusion – something I expect will continue for a while, and cost the company money. I’m not saying this is a show-stopper at all, but it’s going to slow down productivity. Only by sitting down and looking at what MY COWORKER does when building an export template or changing workflow logic can you really see why it takes so long. My concern has been that we’ll be blamed for taking too long when it’s just a difficult system to work with.
Along those same lines, we’ve had a very difficult time convincing the Heiler devs to properly make use of Subversion for version control. I don’t know how to explain what this means to you, but basically it inhibits productivity again. The Heiler team has wasted our time on multiple occasions by forgetting to check in code, checking in the wrong code, or simply assuming we’d know how to get our environments set up again after they made major changes. Not to mention the environment for doing dev work is a huge headache because of our lack of even read-only rights to the core product’s source code.
The project has been pushed so hard that things constantly slip through the cracks. Somebody is going to be a scapegoat for all this, and it’s unfair. The deadlines are unreasonable for this product, as has been seen in the past. When somebody forgets something, they’re in trouble for having done so, but mistakes will definitely be made with the pressure to get this out the door on time. I think this was the case with the missed export template situation. Maybe GUY WHO QUIT DUE TO HEILER should have done his job better, but maybe it was a simple mistake due to the pressure he was under.
Really, my biggest concern has always been that this project is going to fail in one way or another, and that I would be held responsible for issues I feel are beyond my control. Obviously that doesn’t matter as much to me now, but I think the same issues and concerns will be present for anybody else working on the system.