Structured Data Vs. Unstructured Data for FP&A and Treasury

By Bryan Lapidus, FP&A
Published: 3/6/2017

Big data, digital transformation, agile development. There are lots of words to describe how our personal and professional lives are awash in data and how we manage it. We need to know both what these terms mean and what they mean for us in our roles in finance. This article kicks off a two-part series to help you familiarize yourself with key terms of the information age and their implications for FP&A and treasury. This installment is about structured data, to be followed next week by unstructured data; elements of “big data” are covered in both.

THE JARGON: As the name suggests, structured data reflects a top-down approach and is part of an overall enterprise architecture: “a well-defined practice for conducting enterprise analysis, design, planning, and implementation, using a holistic approach at all times, for the successful development and execution of strategy.” CIOs and CISOs rely on the architecture to create standardization and efficiency in IT hardware systems, data management, and security management.

The data model is the organized structure of the data: what the data is, how it enters a database, how users access it, and how it may change.

The data sits in a database and has a set of rules (a schema) governing how the data is organized and accessed. At the root, there are tables of data, each much like a single Excel spreadsheet. Just as you can have multiple spreadsheet tabs in a workbook, there can be multiple tables in a database. Tables can be searched through a query, an instruction to search the data tables, and the output of a query can itself become a data object. Standard queries can become reports or views on the data.
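To make the vocabulary concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name gl_entries, its columns, and the sample figures are hypothetical, chosen only to show a schema, a table, a query, and a view side by side; a real general ledger would look quite different.

```python
import sqlite3

# An in-memory database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")

# The schema: the rules for what each row in the table must contain.
conn.execute(
    "CREATE TABLE gl_entries ("
    " entry_id INTEGER PRIMARY KEY,"
    " account TEXT NOT NULL,"
    " period TEXT NOT NULL,"
    " amount REAL NOT NULL)"
)

# The table's rows, much like rows on a single spreadsheet tab.
rows = [
    (1, "Revenue", "2017-01", 1200.0),
    (2, "Revenue", "2017-02", 1350.0),
    (3, "Travel", "2017-01", -200.0),
    (4, "Travel", "2017-02", -150.0),
]
conn.executemany("INSERT INTO gl_entries VALUES (?, ?, ?, ?)", rows)

# A query: an instruction to search the table.
revenue = conn.execute(
    "SELECT period, amount FROM gl_entries"
    " WHERE account = 'Revenue' ORDER BY period"
).fetchall()

# A standard query saved as a view, which can then be queried like a table.
conn.execute(
    "CREATE VIEW account_totals AS"
    " SELECT account, SUM(amount) AS total FROM gl_entries GROUP BY account"
)
totals = dict(conn.execute("SELECT account, total FROM account_totals").fetchall())

print(revenue)
print(totals)
```

The view stores the query rather than a copy of its results, so it reflects whatever is in the underlying table each time it is read.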

This describes the typical relational database, where the data can be queried based on the relationships between the objects described. Data can be thought of as having “dimensions,” or key characteristics. Think of a typical graph with variables plotted on the X and Y axes; each of these is a dimension of the data. A third dimension would be a Z axis, and you can think of a data cube. Many relational databases today have seven to 10 dimensions of data that allow you to “spin the cube.”
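One way to picture “spinning the cube” is to total the same facts along different dimensions. The sketch below is plain Python, not a database feature; the spin function, the three dimensions (region, product, quarter), and the sales figures are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical sales facts: three dimensions (region, product, quarter)
# and one measure (sales).
facts = [
    {"region": "East", "product": "A", "quarter": "Q1", "sales": 100},
    {"region": "East", "product": "B", "quarter": "Q1", "sales": 50},
    {"region": "West", "product": "A", "quarter": "Q1", "sales": 70},
    {"region": "West", "product": "A", "quarter": "Q2", "sales": 90},
]


def spin(facts, *dims):
    """Total the sales measure along the chosen dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dims)
        totals[key] += row["sales"]
    return dict(totals)


# The same data viewed from two different angles.
by_region = spin(facts, "region")
by_product_quarter = spin(facts, "product", "quarter")

print(by_region)
print(by_product_quarter)
```

Each call looks at the same underlying rows; only the choice of dimensions changes, which is what turning the cube amounts to.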

The data warehouse holds data gathered from multiple systems (think of transactional data here), which is then accessed through a data mart (think of a market where you request and query what you need).
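A toy version of the warehouse-and-mart idea, again using Python's sqlite3 module, might look like the following. The two source systems (pos_sales and web_sales), the sales table, and the finance_mart view are hypothetical names; real warehouses involve far more cleansing and scheduling than this sketch suggests.

```python
import sqlite3

# Hypothetical extracts from two transactional systems.
pos_sales = [("2017-01", "Widget", 500.0), ("2017-02", "Widget", 650.0)]
web_sales = [("2017-01", "Widget", 120.0), ("2017-02", "Gadget", 80.0)]

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (source TEXT, period TEXT, product TEXT, amount REAL)"
)

# Gather: load each system's rows into one table, tagged with their source.
for source, extract in (("pos", pos_sales), ("web", web_sales)):
    warehouse.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [(source, *row) for row in extract],
    )

# Mart: a narrower view of the warehouse shaped for one audience,
# here monthly revenue for finance.
warehouse.execute(
    "CREATE VIEW finance_mart AS"
    " SELECT period, SUM(amount) AS revenue FROM sales GROUP BY period"
)
monthly = warehouse.execute(
    "SELECT period, revenue FROM finance_mart ORDER BY period"
).fetchall()

print(monthly)
```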

WHY IT MATTERS: The amount of data available to analyze is growing exponentially. More data has been created in the past two years than in the entire prior history of the human race. While structured data is estimated to be about 20 percent of current data, it is the main source of information that we in finance use and create. Our enterprise resource planning tools are based on structured data: GLs, point of sale, inventory. Additionally, call center logs, IoT data from equipment sensors, and website data points are all structured data. However, of all the data (structured and unstructured) we create, only 0.5 percent will be analyzed.

The data is in front of us; we need the skills to analyze it to stay relevant, and the company that makes smart investments in business analytics will have a key resource. This implies good hiring and training allocations for people, and constant upgrades to the systems that can harness this increasing bounty of information.

We in finance need to become part-time data scientists who can dig through and find the right data. At the same time, separating the signal from the noise will become the key human skill that sets us apart from machine-learning algorithms.

However, there is a more important question: How do we weigh human judgment against infinite data? Do we build models to predict the future in order to minimize human biases, or do we take the automated outputs as guidance and then layer on judgment? And where is the line of demarcation between model and judgment? I don’t have a firm recommendation, except for this: It will become an issue when there is a large forecast variance and someone says, “That is what the model told us!” or when the entire team sweats out the forecast only to have it overruled by senior management, saying, “This can’t possibly be true… change it by X percent!”

I recommend we have this discussion, and revisit it, before compiling our forecasts and budgets.

Next Week: Structured Data Vs. Unstructured Data for FP&A and Treasury, Part 2

Bryan Lapidus, FP&A, is a contributing consultant and author to the Association for Financial Professionals. Reach him at [email protected].