Tuesday, November 20, 2007

Ruby - Part 2

I haven't dropped Ruby and have been playing with it since my last post. I'm not enthused, but neither am I ready to give up. My problems might simply be the learning pains of a new language, environment and paradigm, or Ruby may simply not be the best tool for this particular project.



I've run into two problems so far:


  1. Background processing: besides visiting the page that shows our production server's status (the page is generated by RoR), I would also like to put a few monitors in place. These monitors should run in the background and fire notifications when server stats get out of whack (e.g. when tomcat's number of current connections approaches the peak). There doesn't seem to be a unified way of doing this with RoR. I've seen solutions using half-assed hacks that look like Ruby ports of cron. But there is nothing that looks finished, polished, and usable.

  2. Preferences: I want to be able to persist some runtime parameters between the application's runs. My options are a database or the file system. Neither looks appealing. Dragging the RDBMS baggage around just so that I can save a handful of parameters seems excessive. On the other hand, I also hate it when web apps muck with the file system, because setting up and maintaining such an app is a hassle (setting up directory permissions, etc.). What I'd like is a simple Preferences API similar to what Java has.
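
To make the first problem concrete, here is a minimal sketch of the kind of monitor I have in mind, in plain Ruby rather than anything RoR provides. All the names (approaching_peak?, start_monitor, read_stat, notify) and the 90% threshold are made up for illustration; how the stat is actually read from tomcat and how the notification is sent are left to the caller.

```ruby
# Hypothetical sketch of a background monitor; none of these names come from Rails.

# Returns true when the current value has reached `ratio` of the peak.
# The 0.9 default is an arbitrary assumption.
def approaching_peak?(current, peak, ratio = 0.9)
  current >= peak * ratio
end

# Starts a background thread that polls a stat every `interval` seconds
# and calls `notify` when the stat approaches the peak.
# `read_stat` and `notify` are callables supplied by the caller.
def start_monitor(peak:, read_stat:, notify:, interval: 60)
  Thread.new do
    loop do
      current = read_stat.call
      notify.call(current) if approaching_peak?(current, peak)
      sleep interval
    end
  end
end
```

A thread like this only lives as long as the process that started it, which is part of why none of this feels like a finished solution.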

Thursday, November 15, 2007

Trying Ruby

I want to create a web page for monitoring the health of our production server. The page would gather data from different places and display them in a dashboard kinda format. I didn't really want to use Java for this and instead decided to give Ruby on Rails a try.

Getting Ruby On Rails running on Windows in a Microsoft shop isn't trivial. Ruby (RubyGems) is really not designed for a Windows based corporate environment. I immediately ran into the firewall problem: we are running a MS firewall that uses NTLM authentication and Ruby doesn't support that out of the box.

I spent a few hours browsing around, reading blogs and articles, trying various things. Finally ran across a post pointing to a gem that interfaces with the native NTLM library: rubysspi-1.0.4-i386-mswin32.gem. After mucking about with it for a little while and finally RTFMing, I managed to get it to work and got RoR downloaded and installed.

Wednesday, October 31, 2007

Adapting Code Documentation Practices

Today I received a decree from above to start spot checking code for proper documentation, and if I don't deem the documentation effort to be adequate, I need to tell QA not to accept the project.


Implementing this decree will be tricky socially, technically, and logistically. On the social front, it will be difficult to gain the necessary buy-in from other developers, who may perceive spot checks as being too Big Brotherly and invasive. Technically, documenting every function is wasteful, so we'll need to define exactly what types of functions and classes are most in need of javadocs. And finally, logistically, we need to find answers to questions like "How do we enforce these rules?" "How do we plug this new requirement into our existing processes?" "How do we measure our progress and the level of compliance?"


Documentation for documentation's sake is pointless and wasteful. I believe the best approach to satisfying all three areas is to look at the Agile Manifesto and start implementing the process with a single goal in mind: to maximize each and every principle of agile development. Doing so will help us focus our reasons (we may end up finding that there are none), define criteria for gathering performance metrics, and most importantly will provide the grounds for a successful buy-in from the management and developers.

Tuesday, October 30, 2007

System Reliability

As they say, if you haven't seen it before, it's new to you. So today I learned something amazing, shocking (it was shocking to me), and completely mundane for those who deal with providing services day in and day out. System Reliability, which is related to system uptime, is defined in terms of 9's: 99%, 99.9%, 99.99%, etc. Seems like splitting hairs, at first. Big deal, whether it's 99% or 99.99%. Well, it is. For a system running 24/7, 99% uptime implies about 88 hours of downtime per year, while 99.99% implies about 53 minutes a year. Big efen difference!
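
The arithmetic behind those numbers is simple enough to sketch in a few lines of Ruby. The method name and the 365-day year are my own assumptions for illustration.

```ruby
HOURS_PER_YEAR = 365 * 24 # 8,760 hours, ignoring leap years

# Yearly downtime, in hours, allowed by a given availability percentage.
def downtime_hours_per_year(availability_percent)
  (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR
end

[99.0, 99.9, 99.99].each do |nines|
  hours = downtime_hours_per_year(nines)
  puts format("%.2f%% uptime: %.1f hours (%.0f minutes) of downtime per year",
              nines, hours, hours * 60)
end
```

For 99% this works out to 87.6 hours a year, and for 99.99% to 0.876 hours, or roughly 52.6 minutes.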

Monday, October 29, 2007

First Sprint

Today we finished the first sprint of our first official Scrum project. This project is unusual because I am using Scrum to manage a transition from one internal service provider to another. As a result, we are dealing with a list of internal system and website changes that is fixed in scope. In other words, all user stories are non-negotiable and all must be done by a contractually set date.

Even though this project sounds like it goes against every Scrum principle, we are still finding tremendous value in following it. At the User Story Writing Workshop, just by sitting the Product Owner down with our team, we identified new items that hadn't been considered before, while at the same time, after a thorough discussion, some tasks ended up being dropped. The workshop was a very intense four hour run, but everyone left the room with a much better understanding of the overall scope. I think for the first time in weeks, people could see the entire project all the way through and had enough confidence that nothing major had been left out of planning.

We dropped the ball with the first sprint, though. A lot of it can be attributed to the chaos of the San Diego fires. Out of 28 story points assigned to the sprint we finished... none. Still, we decided to push through with the demo and sort of fumbled through it by taking the audience through the project, explaining what Scrum was, etc. I realize that this can only work once. Fine.

The demo had an interesting side effect. Scrum is used to manage the IT part of the project, but there is a whole lot of coordination happening on the business side among various departments: sales, marketing, compliance, etc. Usually, on projects of this nature all communications among departments are ad hoc and person to person. But suddenly, the demo became a focal point for all department reps who were at the meeting. As soon as the demo was over they started talking to each other, sharing their statuses, turning the meeting into a de facto Scrum of Scrums. We noted this in the retrospective and decided to keep the practice going.

After the demo we held a productive retrospective meeting and tried to explain our failure to deliver. I also took that time to talk to everyone about minor Scrum related transgressions that were made here and there (like talking at the demo about personal commitments vs. team commitments, or a developer taking a change request from a customer and starting to work on it in the middle of a sprint without notifying anyone on the team... primarily me). After that we dove right into planning the second sprint.

For the first time, thanks to the recent Scrum training, I had a very good idea where I was going with planning. Since there is no reliable velocity history, we started with the remaining highest priority story and proceeded to break it down into tasks. I insisted, though, that if possible every task be under 8 hours. Agreeing on tasks for the first story took a long time. However, the entire process was very helpful to our QA specialist, allowing her to dig into low level details on what and how to test. At the end I asked everyone to commit to the hours written down on task stickies and once I had everyone's commitment, we moved on to the second user story.

Again, we broke it down into tasks, but when time came to commit, the developer who had to do the majority of work on the first two stories said that seeing the total time now, she wasn't comfortable committing. I offered to split the story into two: one for this iteration, one for the next. We found a really nice seam along which to split the work and after moving tasks around and recalculating the hours, everyone happily committed to the second user story.

Upon finishing the third user story, we hit a bottleneck: one developer and the tester were tightly booked, while the other developer could still take on more hours. The decision was not to commit to any other user stories since there were no resources left to test them, but to have more stories handy in case the second developer runs out of work.

All in all, this was a very productive meeting. I got the impression that everyone finally understood what was required of them and how Scrum could help them to be efficient. In two weeks I will have direct evidence whether it actually worked.

Sunday, October 28, 2007

sdcountyemergency.com

I think one of the bigger embarrassments to San Diego's local government during the fires was that http://sdcountyemergency.com/ went down as soon as people started hitting it for information and it stayed down for most of the day. I'd like to see a detailed case study of how the site was engineered, what sorts of decisions went into compromises while building it, why it crashed under a heavy load that it should have been designed to handle, and finally what devices were put in place to prevent this from happening again.

Saturday, October 27, 2007

Teams of One

Judging from personal experience as well as talking to others in the industry, it seems that many small to medium non-IT companies like assigning a single developer to a project. For a while I attributed this to management's ignorance. When you have 5 developers working on 5 projects, it may seem that you can crank projects out faster than if you had 5 developers working on a single project. In fact I was once told as much by a boss (who got to be a Director of Software Development merely by being there first, long enough, and knowing how to put together Paradox forms).

While I'm not going to say that ignorance isn't to blame, it isn't at the root of the problem. It dawned on me -- as I was digesting agile wisdoms acquired over the last three days in Scrum talks -- that the actual problem is much more sinister. Managing a team right is tricky: one has to coordinate people, has to make sure everyone's personalities match, has to plan ahead, has to step in and resolve disagreements and conflicts, etc. Since running a team properly is time consuming, the natural tendency for a manager is to fall back to the command-and-control style. But controlling people is easier when they are divided, so in the end, oftentimes subconsciously, the manager will split the department into one-person teams.

This is a problem that oftentimes cannot be solved from below with education. It can only be solved from above by a competent superior either with a reprimand or with a straight out replacement.

Running a team is a tough job, but the reward for doing it right is tremendous: a happy, engaged and productive team cranking out projects like a heartbeat. When searching for a job, any environment where people work on their own projects should immediately set off the "bad management" alarm.

Friday, October 26, 2007

Scrum & Agile

Having sat through Mike Cohn's Certified ScrumMaster course earlier this week, the connection between Scrum and agile has finally become very clear. But what I observed in the class surprised me, even though it really shouldn't have.

I saw a lot of people still looking for the silver bullet (and if Scrum ain't it, they are more than ready - eager - to write it off as a piece of hyped up crap). What they expect coming out of training is a set of rigid rules to take back to their companies, rules to follow unquestioningly to magically transform their dysfunctional environments into well oiled and efficient software producing machines. No, I'm sorry, can't do, doesn't work that way. Scrum cannot be boiled down to rules. If it could, it wouldn't work for the very simple reason that every environment is different. Every company has a unique culture, a unique business model with unique requirements, a unique bunch of people working together, communicating in unique and unpredictable ways.

You can't round all of that into a square hole and still expect things to work. Instead Scrum is a set of guidelines evolved within a small circle of developers when they tried to follow the principles outlined in the Agile Manifesto. They looked at those principles and experimented. What worked for them, got distilled into a set of guidelines they called Scrum.

These guidelines are nothing more than starting points for a team wishing to become agile. A pseudo-authoritative reference to take to one's boss or the team and say, here are the things that worked well for others, let's try adapting them here (compare this with saying, let's go agile, here are the general principles of agile development; the natural question is "sounds great, so how exactly do we go from principles to implementing them?"). Once you have the buy-in from the management and the team (not an easy feat in itself), next comes the hard part: making your process agile in a way that works for your team in your company.

Scrum is not going to dictate to you how to do this. Remember, it's just a bunch of starting point guidelines. You need to run with them, experimenting, adjusting and tweaking in a way that fits your organization. How do you know how to change them? You need to be constantly looking back at the agile principles and asking yourself a question: what do I need to do in order to maximize every principle without breaking others?

In the end, it's a thinking man's game... as always.

Monday, June 25, 2007

Upper management wants us to come up with metrics for measuring the efficiency of our department. I have no idea how to do this or even how to approach this problem. Every type of metric I can come up with is either deficient in some way or requires so much maintenance that developers either won't do their numbers at all or will procrastinate generating reports for weeks, only to use bogus numbers (since they won't remember the actuals) later.

Tuesday, April 17, 2007

Technical Debt

When you take shortcuts and accumulate technical debt, most of the time you just don't know when, where and how this will come back and bite you in the ass. Case in point, in our company every quarter the DBA has to go through user accounts, regenerate passwords, and distribute them to all the database users. This is a major pain in the neck that has been going on for a long time.

As soon as I heard about this, I asked why we manage user passwords manually instead of tying the database (MS SQL Server) to the Active Directory. The answer was that this is done so that our main in-house application can authenticate users against database accounts, and for the app to continue to work, it needs access to passwords. "Why can't we change the way authentication is handled?" I asked next. "Well, because there is no central place where authentication is done and every module does it on its own. So to change this now would require a major overhaul."

When the application was first written, there was no Active Directory and MS SQL Server handled its own authentication. But this is not a good excuse for not taking time to think the architecture through. We are paying for a bad decision made almost 10 years ago by not being able to simplify maintenance by taking advantage of a new technology.

Monday, March 26, 2007

AccuRev

While at the SD West conference, I ran into an interesting new SCM called AccuRev (http://www.accurev.com/). Ok, being at version 4.5 it's not exactly a newcomer, but I hadn't heard of it before.

From what I've seen from the demo, the program works in a new paradigm. It adopts Perforce's metaphors of depots and workspaces and takes them to a new level by introducing the concept of a code stream. As far as I understand, the entire codebase (a depot) is represented as a tree with each node being a focal point for collecting changes. These nodes can represent projects, version branches, features, teams, etc. At the end of a tree branch, there can be a leaf node representing a particular user's workspace. Workspaces, just like in Perforce, seem to be tied to a particular user on a particular machine.

Changes can be applied at every node and then they will stream down to child nodes and leaves. So, let's say I have a parent node for Project A and below it there will be nodes for trunk, some feature F, QA, and RC1. If QA finds a bug and I fix it, I can apply the changes at the Project A node level and have these changes automatically stream into QA, RC1, and Feature F. There are of course controls that allow a finer level of control over what gets changed, where, when and by whom.
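
To check my own understanding of the streaming idea, here is a toy model in Ruby. It is only a sketch of the concept as I understood it from the demo, not AccuRev's API or how the product is implemented; the StreamNode class and its methods are invented for illustration, and the finer-grained controls are left out entirely.

```ruby
# Toy model of a stream hierarchy; this is not AccuRev's API.
class StreamNode
  attr_reader :name, :children, :changes

  def initialize(name)
    @name = name
    @children = []
    @changes = []
  end

  # Attaches a child stream (or workspace) and returns it.
  def add_child(name)
    child = StreamNode.new(name)
    @children << child
    child
  end

  # Records a change at this node and lets it flow down to every descendant.
  def apply(change)
    @changes << change
    @children.each { |child| child.apply(change) }
  end
end

project_a = StreamNode.new("Project A")
trunk     = project_a.add_child("trunk")
feature_f = project_a.add_child("Feature F")
qa        = project_a.add_child("QA")
rc1       = project_a.add_child("RC1")

project_a.apply("bug fix")  # reaches trunk, Feature F, QA, and RC1
feature_f.apply("new work") # stays within Feature F
```

In this model a change applied at Project A shows up in every child, while a change applied at Feature F never travels up or sideways.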

The client UI is visual and drag and drop enabled. You can see the entire SCM tree on the screen, you can move nodes around, reconnect them, reassign workspaces to different nodes, arrange people into teams, etc. All in all it looked like a very powerful piece of software and, at $800/seat for a pro version and $1400/seat for an enterprise version, a very expensive one at that.

SD West

Just got back from the Software Development conference (yeh, THE software development:) with a lot of new ideas to think about and consider: restructuring our code reviews, repositioning the testing step in the dev. process, new SCM software to look at. And there is more, it's just that I've processed so much information that I need to go back to my notes and look at them all again. It was definitely money well spent.