Most information systems are intended to accept, process, and provide information, either to support a business process or to give people in the organization what they need to make decisions or answer questions. Business intelligence projects (or data warehouse projects) are often started when stakeholders want to meld information from multiple systems, or to arrange data in a way the source systems were not intended to support. The trick to making the most effective use of business intelligence systems is to understand those questions and decisions and to focus on the specific pieces of information you need to answer those exact questions or make those particular decisions. That way you don’t have to deal with data that you don’t need.
I recently worked with a team at a financial services company on a project to incorporate a new data source into an existing data warehouse. The stakeholders had had access to this other source of information for several years, but had not been able to easily combine it with the data they used from other sources. When they did attempt to combine data, it was usually on an individual basis using Excel. The stakeholders of this project identified several predesigned reports and also wanted to be able to run ad hoc queries. There were several unknowns about how the new data was structured, and the team was also concerned about how they would figure out which ad hoc queries the stakeholders wanted to run, and how long it would take to figure that out.
I suggested that one way to speed things up was to reduce the amount of analysis they did. I did not mean cutting corners on analysis, but rather reducing the number of items they were trying to analyze. They could do this by starting with the outcomes they were trying to produce, determining what outputs were needed to drive those outcomes, and then figuring out what processes and inputs were needed specifically to deliver those outputs. They would be able to apply a lot more focus, and deliver value sooner, by selecting a specific outcome they wanted to support first and focusing on delivering the associated reports. They could then figure out what inputs and processes were needed for those specific reports and deliver those, getting feedback from the stakeholders who requested the reports. Over the course of the project, this approach allows stakeholders to start getting some information much earlier than they would if all the analysis were done before moving on to development.
As we talked about the best way to proceed with the project, we decided to focus on the predefined reports first. We determined that if we addressed the known needs, we could fairly quickly deliver information that would support those needs, and in many cases structure data in a way that would meet the majority of the stakeholders’ ad hoc query needs. We established user stories for each of the reports; these acted as placeholders for further analysis when we got closer to delivering that particular report. We used some sample reports the stakeholders had put together as a reference point for identifying the user stories, and also figured we could use those reports as further information when doing a deep-dive analysis on each specific story.
The team decided to start with a report that consisted of a list of the mutual funds belonging to a specific client, along with information about each fund that was derived from multiple sources. While this was not necessarily the report that the stakeholders wanted more than any other, it was a key report that had to be done in order to produce most of the others. The team looked at the sample report and tried to determine whether it was small enough that they could deliver it in a couple of weeks. The team thought it would be best to break it up into three smaller stories: two stories covering data sourced from different systems, and one story covering data elements that were calculated from other data elements. To further describe these stories, the team started building a data dictionary with relevant information about the data elements and their sources, as well as any transformations that were necessary to arrive at the desired data. The team also identified some sample clients that they could use to test the reports once they were developed. These sample clients were selected because they represented a wide selection of the cases that the team would run into in the data they were working with.
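To make the idea of a data dictionary a little more concrete, here is a minimal sketch in Python of what a few entries might look like. This is not the team's actual dictionary: the element names, the source system names (`fund_master`, `client_holdings`), and the transformation descriptions are all hypothetical, chosen only to illustrate how entries could line up with the three stories (two source systems plus calculated elements).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataElement:
    """One data dictionary entry (all example values below are hypothetical)."""
    name: str
    source_system: Optional[str]       # None means the element is calculated
    transformation: str                # plain-language rule for arriving at the value
    derived_from: Tuple[str, ...] = ()  # inputs for calculated elements


# Hypothetical entries for the client fund list report.
data_dictionary = [
    DataElement("fund_name", "fund_master", "copied as-is"),
    DataElement("price_per_share", "fund_master", "most recent available price"),
    DataElement("shares_held", "client_holdings", "copied as-is"),
    DataElement(
        "market_value",
        None,
        "shares_held * price_per_share",
        ("shares_held", "price_per_share"),
    ),
]


def elements_by_story(entries):
    """Group element names by source system; calculated elements get their own group."""
    grouped = {}
    for entry in entries:
        key = entry.source_system or "calculated"
        grouped.setdefault(key, []).append(entry.name)
    return grouped


stories = elements_by_story(data_dictionary)
for story, names in stories.items():
    print(story, names)
```

Grouping the entries this way gives one bucket per source system and one for calculated elements, which mirrors the three-story split described above. A spreadsheet would serve the same purpose; the point is only that each element records its source and its transformation.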
The stories, the report prototypes, the data dictionary, and the example clients were all the requirements information that the team pulled together. They agreed to try using that information and adjust if they found that they did not have sufficient information, or if they found that some of the information was not really necessary. As of the last time I checked with the team, they had not needed to adjust the types of requirements information they were preparing.
Following this approach allowed the team to ignore some of the data in the existing systems because it was not needed. Starting with outputs and working backwards means that the team does not have to dig into deep detail about elements that are not needed, which they would have had to do if they had started by analyzing the source systems. This saved the team time, effort, and frustration. Plus, by delivering things iteratively, the team was able to get feedback on what they had delivered to date and revise their approach going forward to make it more efficient and increase the chance of delivering what their stakeholders needed.
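The working-backwards idea can be sketched in a few lines of Python. The sketch below starts from the fields on a report, follows any calculated fields back to their inputs, and ends up with the set of source elements that actually need analysis; everything else in the source systems can be left alone. The report name, field names, and the list of available source elements are all hypothetical, and this is an illustration of the reasoning rather than anything the team built.

```python
# Hypothetical report definition: the fields that appear on the report.
report_fields = {
    "client_fund_list": ["fund_name", "market_value"],
}

# Hypothetical calculated fields and the elements they are derived from.
derivations = {
    "market_value": ["shares_held", "price_per_share"],
}

# Hypothetical list of everything available in the source systems.
source_elements = {
    "fund_name",
    "shares_held",
    "price_per_share",
    "fund_inception_date",
    "manager_name",
    "expense_ratio",
}


def needed_source_elements(report, report_fields, derivations, source_elements):
    """Walk backwards from a report's fields to the source elements it depends on."""
    needed = set()
    seen = set()
    pending = list(report_fields[report])
    while pending:
        element = pending.pop()
        if element in seen:
            continue
        seen.add(element)
        if element in derivations:
            # Calculated field: keep tracing through its inputs.
            pending.extend(derivations[element])
        elif element in source_elements:
            needed.add(element)
    return needed


needed = needed_source_elements(
    "client_fund_list", report_fields, derivations, source_elements
)
skipped = source_elements - needed

print("analyze:", sorted(needed))
print("skip for now:", sorted(skipped))
```

In this made-up example, half of the available source elements never show up in the trace, so nobody has to spend time understanding them until a later report calls for them.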