"Data analytics" covers a lot of ground — from a single SQL query to a forecasting model to an AI agent answering a question in plain language. This is a plain-language explainer for data leaders and practitioners: what data analytics actually is, the four types it's usually split into, the workflow and tools behind it, and how the shift to AI agents is changing the work.

TL;DR

Data analytics is the practice of examining, cleaning, modeling, and interpreting data to answer questions and support decisions — turning raw rows into insight people can act on. It's commonly split into four types by the question they answer: descriptive (what happened), diagnostic (why), predictive (what's likely next), and prescriptive (what to do). The workflow is a loop — define the question, collect and clean the data, model and analyze it, interpret, act — running on a layered stack: a warehouse for storage, a semantic layer for governed metric definitions, and BI tools, notebooks, embedded apps, or AI agents on top. The newest shift is agents doing the analytical work, and a semantic layer is what keeps their answers trustworthy. Cube is the agentic analytics platform built on a semantic layer.

A working definition

Data analytics is the process of examining, cleaning, transforming, modeling, and interpreting data to answer questions and support decisions. The point is conversion. Raw rows (orders, events, clicks, sensor readings) go in at one end, and something you can act on comes out the other: a trend worth chasing, a cause worth fixing, a forecast worth planning against.

It's a broad discipline, which is why the term feels slippery. The same phrase covers an analyst running a one-off SQL query, a finance team living in a governed dashboard, a data scientist training a churn model, and a business user asking an AI agent "why did margin drop in EMEA last quarter?" in plain language. What unites them is the goal — decisions grounded in evidence rather than intuition — and a rough shared workflow underneath.

A quick note on a neighbor term. Business intelligence is often used as a synonym, but it's really one large slice of analytics: the tools and practice of reporting on and exploring business data, leaning toward what-happened and why. Data analytics is the wider field that also includes forecasting, optimization, statistics, and data science. BI is inside analytics, not beside it.

The four types of data analytics

The most useful way to organize the field is by the question each kind of analysis answers. The four build on each other: most teams start with descriptive analytics and add the others as the questions get harder and the data gets richer.

| Type | Question it answers | Example | What it takes |
| --- | --- | --- | --- |
| Descriptive | What happened? | Revenue last quarter; weekly active users; the trend over the year | Aggregations, summaries, dashboards |
| Diagnostic | Why did it happen? | Why churn rose in one region; which segment drove the dip | Drill-down, segmentation, correlation |
| Predictive | What's likely to happen next? | Projected demand; churn risk; expected revenue | Historical data, statistical and ML models |
| Prescriptive | What should we do about it? | Optimal pricing; the best inventory plan | Optimization, decision models, simulation |

Descriptive analytics is the foundation and the most common — it summarizes what already happened, the totals and trends and counts that fill most dashboards. Diagnostic analytics goes a step deeper, drilling into causes and correlations to explain a number rather than just report it. Predictive analytics turns to look forward, using historical data and statistical or machine-learning models to estimate what's likely next. Prescriptive analytics goes furthest: not just forecasting an outcome but recommending an action, using optimization and decision models to suggest the best move.
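To make the four types concrete, here is a minimal Python sketch that runs one toy example of each over the same small dataset. Everything in it is an assumption made for illustration: the revenue figures are invented, the forecast is a plain least-squares line, and the demand curve behind the pricing choice is made up. Real predictive and prescriptive work uses far richer models.

```python
from statistics import mean

# Hypothetical monthly revenue by region (invented numbers, not real data).
revenue = {
    "2025-01": {"NA": 100, "EMEA": 80},
    "2025-02": {"NA": 105, "EMEA": 70},
    "2025-03": {"NA": 110, "EMEA": 60},
}

# Descriptive: what happened? Total revenue per month.
totals = {month: sum(by_region.values()) for month, by_region in revenue.items()}

# Diagnostic: why? Break the first-to-last change down by region.
first, last = "2025-01", "2025-03"
change_by_region = {
    region: revenue[last][region] - revenue[first][region]
    for region in revenue[first]
}

# Predictive: what's likely next? Fit a least-squares line and extend it one step.
def linear_forecast(series):
    n = len(series)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(series)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, series))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return y_bar + slope * (n - x_bar)  # value of the fitted line at x = n

emea_series = [revenue[month]["EMEA"] for month in sorted(revenue)]
emea_next = linear_forecast(emea_series)

# Prescriptive: what should we do? Pick the candidate price with the highest
# expected revenue under an assumed (made-up) linear demand curve.
def expected_revenue(price):
    demand = max(0, 100 - 2 * price)
    return price * demand

candidate_prices = [10, 20, 25, 30, 40]
best_price = max(candidate_prices, key=expected_revenue)

print(totals, change_by_region, emea_next, best_price)
```

In this toy data the totals slip from 180 to 170, the regional breakdown shows EMEA falling while NA grows, the trend line projects EMEA lower again, and the pricing step returns a recommendation rather than a number to watch. That progression from summary to action is the point of the four-type framing.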

You'll see other techniques named alongside these — exploratory data analysis (poking at a dataset to find patterns before formal modeling), inferential analytics (generalizing from a sample to a population), and real-time analytics (analyzing data as it's generated, for cases where a decision can't wait). They're methods that show up across the four types rather than a separate tier.

The data analytics workflow

Under the variety, most analytics follows a recognizable loop. It's worth seeing it as a loop, not a line: what you learn at the end reshapes the next question.

  • Define the question. The step that's easiest to skip and most expensive to get wrong. "Are we growing?" and "did paid acquisition in EMEA pay back within 90 days?" lead to completely different analyses.
  • Collect the data. Pull it from its sources — application databases, event streams, third-party APIs, files. In a mature stack this lands in a central data warehouse or lakehouse.
  • Clean and transform. Real data is messy: duplicates, nulls, inconsistent units, three columns that all sort of mean "customer." Getting it into a usable, consistent shape is usually the most time-consuming part of the whole process.
  • Model and analyze. Define the metrics and the relationships between entities, then apply the right technique — an aggregation, a segmentation, a regression, a forecast. This is where the question actually gets answered.
  • Interpret and act. Translate the result into something a human can decide on, communicate it clearly, and feed what you learned back into the next question.
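The clean-and-transform step in the loop above can be sketched in a few lines of Python. This is a hedged illustration, not a recommended pipeline: the row shape, the field names (`order_id`, `amount`, `unit`), and the policy of dropping rows with a missing amount are all assumptions. In practice a missing value might be imputed or flagged instead.

```python
# Hypothetical raw order rows showing three common problems:
# an exact duplicate, a missing amount, and mixed units (cents vs dollars).
raw_orders = [
    {"order_id": 1, "amount": 1250, "unit": "cents"},
    {"order_id": 1, "amount": 1250, "unit": "cents"},    # duplicate of the row above
    {"order_id": 2, "amount": None, "unit": "dollars"},  # missing value
    {"order_id": 3, "amount": 20.0, "unit": "dollars"},
]

def clean(rows):
    """Deduplicate by order_id, drop rows with no amount, normalize to dollars."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["order_id"] in seen:
            continue  # keep only the first occurrence of each order
        seen.add(row["order_id"])
        if row["amount"] is None:
            continue  # assumed policy: drop rather than impute
        if row["unit"] == "cents":
            dollars = row["amount"] / 100
        else:
            dollars = float(row["amount"])
        cleaned.append({"order_id": row["order_id"], "amount_usd": dollars})
    return cleaned

orders = clean(raw_orders)
total_usd = sum(order["amount_usd"] for order in orders)
print(orders, total_usd)
```

Each branch encodes a decision someone has to own: which duplicate wins, what a null means, which unit is canonical. Writing those decisions down once, in code, is what makes the later steps repeatable.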

The cleaning and modeling steps are where consistency lives or dies. If every analysis re-defines what "revenue" or "active user" means in its own query, you get three different numbers for the same metric in the same meeting — which is the problem a semantic layer exists to solve, by defining each metric once for everyone to reuse.
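As a rough illustration of "define each metric once," the sketch below keeps a single definition of revenue in a small registry and has two consumers read from it. The `Metric` class, the `METRICS` registry, and the two consumer functions are hypothetical names invented for this example. This is not how any particular semantic layer is implemented, only the shape of the idea.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[list], float]

# Hypothetical order rows (invented numbers).
orders = [
    {"gross": 100.0, "refund": 0.0},
    {"gross": 50.0, "refund": 50.0},
    {"gross": 30.0, "refund": 10.0},
]

# The single, shared definition: revenue is gross order value minus refunds.
METRICS = {
    "revenue": Metric(
        name="revenue",
        description="Gross order value minus refunds",
        compute=lambda rows: sum(r["gross"] - r["refund"] for r in rows),
    ),
}

# Two different consumers reuse the same definition instead of rewriting it.
def dashboard_tile(rows):
    return METRICS["revenue"].compute(rows)

def finance_report(rows):
    return round(METRICS["revenue"].compute(rows), 2)

print(dashboard_tile(orders), finance_report(orders))
```

If the definition changes, say to exclude tax, it changes in one place and both consumers move together. That is the property that prevents three numbers in one meeting.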

The data analytics tool stack

A modern analytics stack is layered, and it helps to keep the layers straight because they do genuinely different jobs.

  • Storage and compute. A data warehouse or lakehouse — Snowflake, BigQuery, Redshift, Databricks — holds the data and runs the queries. This is the center of gravity for any team past the spreadsheet stage.
  • Transformation. A tool like dbt models and cleans the raw data into analysis-ready tables, versioned as code.
  • Semantic layer. A governed layer that defines metrics, dimensions, join paths, and access rules once, on top of the warehouse, so every downstream tool returns the same numbers.
  • Consumption. The tools people actually use: BI tools for dashboards and exploration, notebooks (Python, SQL) for deeper analysis, spreadsheets for ad-hoc work, embedded analytics inside products for customers, and, increasingly, AI agents.

SQL is the common language running underneath most of these layers. And a recurring source of confusion is worth flagging: the semantic layer sits on top of the warehouse and dbt is a partner to it, not a competitor — dbt models the data, the semantic layer governs the metrics and serves them. Neither replaces your warehouse.
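The division of labor between the layers can be sketched with Python's built-in `sqlite3` module standing in for a warehouse. This is a toy under stated assumptions: the table, view, and metric names are invented, the view plays the role of a transformation model, and the `METRICS` dictionary is a stand-in for a semantic layer rather than any real product's API.

```python
import sqlite3

# Storage and compute: an in-memory SQLite database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_orders (id INTEGER, region TEXT, amount_cents INTEGER, status TEXT);
INSERT INTO raw_orders VALUES
  (1, 'EMEA', 1000, 'paid'),
  (2, 'EMEA', 2500, 'refunded'),
  (3, 'NA',   4000, 'paid');

-- Transformation: turn raw rows into an analysis-ready shape.
CREATE VIEW orders AS
  SELECT id, region, amount_cents / 100.0 AS amount_usd, status
  FROM raw_orders;
""")

# Semantic layer (hypothetical stand-in): metrics and dimensions defined once.
METRICS = {"revenue": "SUM(CASE WHEN status = 'paid' THEN amount_usd ELSE 0 END)"}
DIMENSIONS = {"region"}

def query_metric(metric, dimension):
    """Consumption: callers ask for a metric by name and never write the SQL."""
    if metric not in METRICS:
        raise KeyError(f"unknown metric: {metric}")
    if dimension not in DIMENSIONS:
        raise KeyError(f"unknown dimension: {dimension}")
    # Names come from the allowlists above, so interpolating them is safe here.
    sql = (
        f"SELECT {dimension}, {METRICS[metric]} AS {metric} "
        f"FROM orders GROUP BY {dimension} ORDER BY {dimension}"
    )
    return conn.execute(sql).fetchall()

result = query_metric("revenue", "region")
print(result)
```

The storage engine holds and computes, the view reshapes, the metric definition decides what "revenue" means, and the caller only names what it wants. Swapping any one layer leaves the others in place.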

How AI is changing data analytics

The biggest shift in 2026 is in who — or what — does the analytical work. For decades the model was a person driving a tool: dragging fields onto a dashboard, writing the SQL behind it, building the forecast in a notebook. That's now moving toward agentic analytics — AI agents that answer questions, build calculations, and take action, with a person framing the question and judging the answer instead of hand-writing every query.

This is genuinely useful: someone in support or ops who can't write SQL can ask a hard question in plain language, and analysts who can write SQL stop hand-writing the same joins for the hundredth time. But it has a structural catch. Point an LLM at raw tables and it has to re-derive your business on every prompt. A table named orders doesn't encode whether revenue is gross or net, includes tax, or excludes refunds; the join graph has fan-outs and three tables that all look like "the customer"; and nothing in a SELECT distinguishes a correct query from one that leaks another tenant's data. So "what was revenue last quarter?" can return three different numbers across three sessions. That's not a prompt-engineering problem — it's a missing layer.
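The fan-out problem is easy to reproduce. In the hedged sketch below, which uses Python's built-in `sqlite3` module and an invented two-table schema, a plausible-looking join double-counts an order that shipped in two parcels. Both queries run without error, which is exactly why nothing in the SQL itself flags the wrong one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE shipments (id INTEGER PRIMARY KEY, order_id INTEGER);
INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
-- Order 1 shipped in two parcels, so it matches two shipment rows.
INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
""")

# Plausible but wrong: the join repeats order 1 once per shipment.
naive_revenue = conn.execute(
    "SELECT SUM(o.amount) FROM orders o JOIN shipments s ON s.order_id = o.id"
).fetchone()[0]

# Correct: aggregate at the grain of the orders table.
correct_revenue = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

print(naive_revenue, correct_revenue)
```

An agent generating SQL from table names alone has no way to know which of these two queries matches the business definition. A modeled join path and a named metric remove that guess.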

The fix is to ground the agent in a semantic layer. Instead of inventing SQL, the agent selects from certified metrics by name, so answers are consistent, governed, and explainable — you can see which named metrics produced a number rather than auditing a wall of generated SQL. This isn't theoretical: Brex evaluated approaches for grounding AI on their data, chose Cube, and built Brex Spaces, an embedded AI financial analyst, on top of it. Their one-line summary is the cleanest case for the whole shift: the semantic layer is what makes the AI useful.
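A minimal sketch of what "select from certified metrics by name" could look like is below. It is an illustration only, not Cube's API or any vendor's: `CERTIFIED_METRICS`, `answer`, and the row shape are hypothetical. The two properties it demonstrates are that an unknown metric name is rejected rather than improvised, and that the tenant filter is applied by the layer, not left to the agent.

```python
# Hypothetical certified metric definitions, keyed by name.
CERTIFIED_METRICS = {
    "net_revenue": lambda row: row["gross"] - row["refund"],
}

# Hypothetical rows belonging to two tenants (invented numbers).
ROWS = [
    {"tenant": "acme", "gross": 100.0, "refund": 10.0},
    {"tenant": "acme", "gross": 40.0, "refund": 0.0},
    {"tenant": "globex", "gross": 999.0, "refund": 0.0},
]

def answer(metric_name, tenant):
    """Resolve an agent's request: a metric name plus the caller's tenant."""
    if metric_name not in CERTIFIED_METRICS:
        # The agent cannot invent a metric; it can only pick from the catalog.
        raise ValueError(f"unknown metric: {metric_name!r}")
    per_row = CERTIFIED_METRICS[metric_name]
    # Access rule enforced here, so no request can read another tenant's rows.
    visible = [row for row in ROWS if row["tenant"] == tenant]
    return {
        "metric": metric_name,  # the named definition behind the number
        "tenant": tenant,
        "value": sum(per_row(row) for row in visible),
    }

print(answer("net_revenue", "acme"))
```

Because the response carries the metric name, the explanation for a number is a short reference to a definition someone has reviewed, rather than a block of generated SQL to audit.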

A note on history, since it gets invoked here: pre-aggregated, multidimensional analysis goes back to the OLAP era of the 1990s, and the query languages and protocols that grew out of it (MDX, XMLA, and the later DAX) still connect tools like Excel to modern data. That lineage is real, but it's plumbing, not the story. The story now is AI-native analytics over a governed model.

Where Cube fits

Cube is the agentic analytics platform built on a semantic layer. Its open-source foundation, Cube Core (Apache 2.0), is the semantic layer: you model metrics, dimensions, joins, and access rules once, and serve them over SQL, REST, GraphQL, an MCP server for AI agents, and DAX/MDX for spreadsheet tools. It sits on top of your warehouse — Snowflake, BigQuery, Redshift, Databricks — which stays your storage and compute; row-level, multi-tenant security is applied at compile time; and pre-aggregation caching keeps queries fast. On top of that foundation, the platform adds the AI agent interfaces, workbooks, dashboards, and embedded surfaces — so the same governed model powers both internal business intelligence for your teams and embedded analytics for your customers. That's why 400+ companies build on Cube across both use cases.

For data analytics specifically, the value is consistency and trust across every kind of analysis. Descriptive dashboards, diagnostic drill-downs, and an AI agent answering in plain language all draw from the same certified definitions, so "revenue" means one thing everywhere. And dbt is a partner, not something Cube replaces: model the data in dbt, serve it through Cube, which reads dbt models.

Our verdict

Data analytics is the practice of turning raw data into decisions, usually organized into four types (descriptive, diagnostic, predictive, and prescriptive) and run as a loop: define the question, collect, clean, model and analyze, interpret, act. That loop runs on a layered stack of warehouse, semantic layer, and consumption tools. The defining shift now is AI agents doing the analytical work, which only stays trustworthy when the agent is grounded in a governed semantic layer rather than pointed at raw tables. The strongest fit is a platform that's SQL-first, governed at compile time, and serves internal BI and embedded analytics from one model: that's Cube, built on the open-source Cube Core.

Methodology

This explainer describes data analytics as the term is used in 2026, organized around the four-type framing (descriptive, diagnostic, predictive, prescriptive) and the analyze-and-act workflow that practitioners share across them, then weighted toward what's changing — AI agents doing analytical work and the governed semantic layer that keeps their answers consistent, access-controlled, and explainable. As the publisher, Cube builds a semantic layer and an agentic analytics platform on top of it, so we have an obvious interest here; we've tried to define the field neutrally and be explicit about where Cube fits versus the broader discipline. Tools and product capabilities move quickly — treat them as version-dependent and confirm against current documentation.