Recently MARTA, Atlanta’s mass transit system, installed a fancy new fare system that takes “smart” cards instead of tokens. Things were a little bumpy at first, but overall the new system seemed to be working well until the unexpected happened. Yesterday an article in the Atlanta Journal-Constitution described how MARTA took a heavy economic hit when the new fare system suddenly crashed during a maintenance upgrade and was down for about a day. About 115,000 people ride MARTA daily. During the downtime none of the cards would swipe, the MARTA smart card vending machines quit working, more clerks than usual were required at each of the 38 affected stations, confused riders could not get through the blocked turnstiles, and others were inconvenienced by having to find cash to ride the bus. One outcome is that MARTA is considering suing the outsourced software provider for damages.
Even though I do not know the details, clearly thorough testing was not performed during the upgrade process, and based on the “free riders” snafu an acceptable disaster recovery plan was not in place. Like JetBlue’s recent software fiasco that left travelers stranded, the MARTA system crash is another example of the financial impact an organization can suffer when technology suddenly goes awry. The lessons learned for these organizations may be to pay more attention to planning for higher-quality systems and to have plans in place for when the unexpected happens. One of my recommendations would be to hire and use skilled, thorough Business Analysts who will define excellent requirements and who will be proactive in finding problems early, before the defects can be coded into the software.
The MARTA system upgrade should have been a normal maintenance activity. MARTA could, and should, have involved a BA to review the upgrade changes and validate that sufficient testing would be performed. Based on the money that was lost, the temporary workarounds were not effective. No matter how stable a system seems to be, planning for disaster recovery is vital for any critical software system. A BA has the skills to elicit and document an acceptable disaster recovery plan that reduces the business exposure if something drastic happens. Disaster recovery planning can start early in the project, when BAs identify potential risks and assess their impact on the business. The plan can be finalized once a solution has been defined. Lastly, MARTA could, and should, have a skilled BA facilitate and document a root-cause analysis of what happened and communicate the results to the impacted stakeholders. This analysis would establish why the crash occurred and would hopefully lead to process improvements for future software changes. I think MARTA (and JetBlue) should either hire or make better use of Business Analysts.