How many data roles does it take to screw in a dashboard?
Data teams are bigger than ever and somehow the data is still wrong
Let’s set the stage: suppose you work at a big company. The company has infrastructure, pipelines, databases, the whole package. There are dashboards and customer-facing applications running on data. Top people in the company talk about data, maybe even preach the value of data. And yet… the data is inconsistent. The dashboards are slow and clunky. Ten different teams are reporting ten different numbers for the same metric.
This isn’t a heart-wrenching sob story. It’s real life for a lot of people working closely with data. And it raises the question: what the hell is going on with data?
Let’s explore this question by tumbling down a rabbit hole of other questions:
what are the roles and responsibilities in and around data?
who works with data and who works on data?
why do so many dashboards suck?
Disclaimer: at the time of writing this I work as (drum roll) a data engineer. So nothing personal, my fellow data brothers and sisters! We’re all in this mess together.
Number 1: roles and responsibilities in and around data.
Let’s kick this off with some generalizations. Engineers love automation, and what is automation if not a process for generalizing?
Now for some wild generalizations:
The more technical the engineer, the less time they spend hands-on with data.
Engineers tend to be more interested in engineering, less so product and business.
Product and business people lack context on engineering and infrastructure.
Data platform engineers make pipelines and automation possible.
Data engineers work a lot with frameworks, processes, and code.
Analytics engineering roles are often squandered as human ETL factories.
Data analysts wish they had more granular, faster, and more accurate data.
If these generalizations are even slightly on point, the discipline of creating data products is fragmented and messy. Responsibility and ownership are diluted across different roles, teams, and maybe even different departments in the company.
Just imagine: there is a data discrepancy. The business user tells the analyst the dashboard is wrong. The analyst checks the dashboard and it matches the underlying table. The data engineer checks the table’s outputs against raw data inputs and nothing is wrong with the process. The data platform engineer does whatever voodoo magic they do to confirm their battlestations are operational.
Clear lines are drawn in the sand, each role passes their message (and responsibility) down the line, and this journey through our bureaucratic machine lands us at: the dashboard is still wrong! The business user is left holding Schrödinger’s dashboard: it is both wrong and not wrong at the same time. I see this a lot building data architecture for customer-facing data applications. Sometimes when a dashboard is “wrong”, it isn’t technically wrong; there’s often a disconnect between product-layer expectations and engineering implementations. But it’s incredibly rare to find skills that bridge the gap between engineering and product, so the solution is often left floating somewhere in no man’s land between the disciplines.
Before we move on, let’s loop back to my snarky bullet point above on analytics engineers. This ties into bridging the gap between engineering and product. I think the role analytics engineering is supposed to fill is one of the most important emerging disciplines in data: engineers who deliver product architecture tailored to product needs. Unfortunately, what I have experienced is that a lot of analytics engineer positions are essentially ETL assembly lines. That interpretation of the analytics engineering role does not bridge the gap between product and engineering. That’s a missed opportunity, as many product roadmaps are absolutely begging for these gaps to be bridged, yet analytics engineering isn’t even on their radar.
Almost every company is doing development around data, piping data from one place to another. But which roles are still actively diving into the data itself?
Number 2: who works with data and who works on data?
A really good data analyst is like a platinum mine filled with business context. They tend to be the least technical players in the data engineering game, but they have a lot of business context and they are more in harmony with the “so what” in the business.
Engineers who pipe data from point A to point B are usually thinking about data in terms of inputs and outputs. If 100 bananas enter the pipeline, a diligent engineer wants to see 100 bananas exit the pipeline. The engineer has no idea why there are 100 bananas, whether or not those bananas are peeled, or why we’re even in the banana business. But by golly, they will move those bananas with precision and grace. If only 98 bananas make it through the pipeline, we’d better hope we hired a good engineer who’s validating their pipelines or else we’ll have a data discrepancy on our hands.
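The diligence described above amounts to a reconciliation check between pipeline inputs and outputs. A minimal sketch in Python (the function name and data here are invented for illustration, not taken from any particular framework):

```python
def validate_row_counts(rows_in, rows_out, tolerance=0):
    """Fail loudly when a pipeline drops (or duplicates) records.

    `tolerance` allows a known, accepted discrepancy; default is zero.
    All names in this sketch are illustrative.
    """
    diff = abs(len(rows_in) - len(rows_out))
    if diff > tolerance:
        raise ValueError(
            f"Row count mismatch: {len(rows_in)} in, {len(rows_out)} out"
        )
    return len(rows_out)

# 100 bananas enter the pipeline...
bananas_in = [{"id": i} for i in range(100)]
# ...but a transformation step silently drops two.
bananas_out = [b for b in bananas_in if b["id"] < 98]

try:
    validate_row_counts(bananas_in, bananas_out)
except ValueError as err:
    discrepancy = str(err)
```

The point isn’t the two lines of arithmetic; it’s that the check exists at all, runs on every load, and fails loudly instead of letting the missing bananas surface weeks later in someone’s dashboard.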
When I started working in data over a decade ago, data engineering hadn’t yet forked in a thousand directions like it has now. Before my time, everyone who worked with data was simply “IT”, and the guy setting up the Wi-Fi was the same person building your data pipelines and dashboards. It’s only relatively recently that the business world started acknowledging data analysis, data engineering, and platform engineering as distinct disciplines requiring different skill sets.
As the data world becomes increasingly specialized, something ironic is happening: many positions with “data” in the title don’t actually work with data. They work on data, but they don’t work with data. Data platform engineers don’t build dashboards. They don’t think about how product KPIs are defined. They write code (Java, Python, etc), but might only know basic SQL. Their primary concern is working on the infrastructure that supports the data infrastructure, which enables all the other data disciplines to do their jobs.
This extends to data engineering as well. While data platform engineers are deploying Airflow to containers, managing cluster resources, and deploying the latest version of Spark, data engineers are building on top of those tools. The data platform engineer’s frameworks address problems like how to deploy a database with high availability. The data engineer’s frameworks focus on building efficient Spark pipelines, Airflow DAGs, and various data extraction and transformation methodologies.
These data disciplines can be very technical, and sometimes the frameworks and processes reach a level of abstraction such that engineers are barely exposed to actual living, breathing data. Pipelines? Yes. Data and its meaning? No.
It’s a bit like engineers who build televisions. They create the infrastructure that enables watching TV, but they don’t necessarily watch your favorite shows. They might not watch TV at all. The same is true in data land: not everyone working on data has much experience working with data. Or in other words, the roles focusing on building infrastructure for data pipelines are not necessarily the same roles focusing on extracting value from the data.
This means a lot of the nuance relating to the data itself is becoming concentrated within a minority of data-oriented roles. My observation is that this is one of the leading causes of the inefficiencies companies experience when working with data. Very few “data whisperers” are being cultivated, and these rare creatures are likely not in roles with direct control over data infrastructure or product roadmaps.
Analysts work with the data. Data is the essential ingredient for their dashboards. In some situations their assignment is churning out basic summaries and aggregations. But if anyone’s converting data into insightful information, that person has data analyst DNA. However, while they understand the data, they aren’t necessarily equipped to engineer it optimally for product delivery.
Analytics engineers (not the human ETL robots) work with the data and also on the data. In my view this is the unicorn role of the data landscape, and I think the role often has an identity crisis because it is hard to find people with experience being a unicorn. These are engineers who are good at engineering, but also product savvy. They need to be able to interpret what a product vision is striving to be, and overlay that interpretation onto the available tech stack. When gaps exist between where the tech is today and where it needs to be to enable the product vision, analytics engineers must then own the process of building bridges that fill the gaps.
This collection of skills is extremely rare, hence why so many analytics engineering positions “fail” and instead become ETL production lines. If an individual in this role is all engineering and no product, or is mostly product and not technical enough, or if they aren’t given the resources to be successful… they end up building a lot of bridges to nowhere.
Number 3: why so many dashboards suck
Building on what was discussed in the previous sections, a lot of dashboards and data products fail to bridge the gap between what the product yearns to be and what the technology is capable of delivering.
The most common variety of failed dashboard I’ve seen in my career is the one that was doomed from the start:
business users want a product but struggle to nail down requirements
a data engineer is tasked with supplying data for this vague product
the data engineer creates an equally vague table to support the vague product
this vague, unoptimized pile of data is handed to a data analyst
the data analyst builds an unoptimized dashboard lacking the “so what”
the business needs remain underserved, and the cycle repeats
Want to know what’s really terrifying? The bullets above mirror how TONS of dashboards are built. In fact, that is the go-to playbook I’ve seen at multiple well-established companies.
For whatever reason, everyone wants dashboards but it’s rare to find a process that properly invests in planning and deploying dashboards. Properly investing in a dashboard means treating it like a product. A basic product has a vision, clear definitions, and a design. Many dashboards have none of that.
Good products also evolve over time. You don’t have the same phone you did ten years ago, you have a newer version with enhanced product features and loads of updates. Dashboards (and other data products) that don’t evolve or have product roadmaps are doomed to be irrelevant. They might not fail tomorrow or next week. But they will go stale and fade away, becoming tech debt the interns take turns keeping on life support.
Zooming in on performance: how is it possible that dashboards are still slow when they’re powered by supercharged databases like Snowflake?
One contributing factor is a lack of product architecture. Piping raw data directly into the product layer rarely works out well. It might be fine for monitoring real-time metrics, but it is not a recipe for extracting maximum value from the data or for building high-performing, responsive user-facing products.
Building good products with data requires building product architecture capable of serving jet fuel efficiently into the product engine.
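One common form of that product architecture is a pre-aggregated serving layer: instead of pointing the dashboard at raw events, the pipeline materializes a summary table shaped for the queries the product actually runs. A toy illustration using SQLite (the table names and schema are invented for this example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The raw feed: one row per event. Dashboards querying this directly
# re-scan and re-aggregate the whole stream on every page load.
conn.execute(
    "CREATE TABLE raw_events (event_date TEXT, product TEXT, amount REAL)"
)
events = [
    ("2024-01-01", "widget", 10.0),
    ("2024-01-01", "widget", 5.0),
    ("2024-01-01", "gadget", 7.5),
    ("2024-01-02", "widget", 3.0),
]
conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", events)

# The serving layer: a pre-aggregated table the dashboard queries instead,
# so each load scans a handful of summary rows rather than the raw stream.
conn.execute("""
    CREATE TABLE daily_sales AS
    SELECT event_date,
           product,
           SUM(amount) AS total_amount,
           COUNT(*)    AS n_events
    FROM raw_events
    GROUP BY event_date, product
""")

daily = conn.execute(
    "SELECT * FROM daily_sales ORDER BY event_date, product"
).fetchall()
```

In a real warehouse this would be a scheduled transformation (a dbt model, a materialized view, or similar) rather than a one-off script, but the principle is the same: the product layer queries data shaped for the product, not the raw feed.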
A lot of dashboards suck because the gap between technical ingredients and product needs is left unfilled.
Bridging the gap is a skill. Having the right mix of engineering skills and product skills is rare. Many of the people who do fit this profile will figure out that their work is more visible if they choose one discipline and stick with it. I have seen this first hand, and it’s expected that talented workers want to move forward in their careers. If bridging the gap isn’t encouraged as a visible and rewarding career trajectory within an organization, in time the bridge builders will leave or abandon building bridges.
Unfortunately, right now the better career move is often to double down on either engineering or product. This is part of why the analytics engineering gap is so hard to fill: most senior talent learns to be one or the other in order to advance.
I hope we will see this change soon, as companies begin to realize productivity is falling off a cliff and the solution is investing in bridging the gaps between engineering potential and product execution.