What is a semantic layer?

What is a semantic layer?

At their core, semantic layers are tools that help end data consumer, both humans and AI, better understand an organization’s data. They manifest as a layer of metadata, often expressed in text formats such as YAML, and contain key definitions and metrics about data sources. This metadata provides critical context, guiding data consumers on when, why, and how to use specific data for analysis.

Below is a simple but common example of a semantic layer: dimensions describe the details of each column in a database table, while measures define the aggregated metrics available to data consumers.

Sample:

- sql_table: public.widgets
  description: "Table showing the amount of widgets purchased by each customer."

  dimensions:
    - name: customer_email
      sql: id
      type: string
      primary_key: true
      description: "Unique email identifier for the customer"

    - name: widget_sales
      sql: sales
      type: number
      description: "Total number of sales for that customer all-time"

    - name: location
      sql: locale
      type: string
      description: "Customer's location, structured as city, country"

  measures:
    - name: total_sales
      sql: sales
      type: sum
      description: "Sum of widget sales across all customers"

    - name: avg_sales_per_customer
      sql: sales
      type: avg
      description: "Average widget sales per customer"

    - name: customer_count
      sql: id
      type: countDistinct
      description: "Number of unique customers who purchased widgets"

Why are Semantic Layers important for modern business intelligence?

Imagine you have a database structure with the following tables:

  • widget-data-original
  • widget-data-09082025
  • widget-data

How could a business user, without context on where those tables came from, know which one is relevant for their query? Data consumers are often working inside relatively opaque dashboarding tools like Tableau, Power BI, or Looker. With only cryptic table names and no supporting metadata, how could they ever know which table to trust? In practice, they either spend hours poking around the data trying to figure it out, or they’re forced to track down a subject matter expert to explain it.

This is where semantic business data layers become relevant.

With a simple description on each table, you’d be able to clearly understand which you ought to spend you’re time looking at, e.g. the following metadata:

  • widget-data-original (main table for widget data we update daily at 8AM)
  • widget-data-09082025 (one-time sync for backup purposes from September 2025)
  • widget-data (legacy table we no longer use)

These sorts of descriptions should be accessible and editable throughout the data stack. As users uncover new details about their data, being able to document them at that moment becomes incredibly useful. Similarly, data engineers and other upstream roles should be able to make notes as they build and maintain pipelines.

And going one layer deeper, imagine I have the following columns in that table:

  • id
  • sales
  • locale

Those columns become much more usable if I had a little metadata explaining what each of those columns are for, e.g.:

  • id (email of each customer)
  • sales (total dollar value of account, calculated as widgets * number of sales for that buyer)
  • locale (location of customer - formatted as abbreviations of city, country)

And take it one step further, what if sometimes each of these columns have data in varying formats with varying levels of data cleaning, e.g.:

Without more data, you can imagine all sorts of confusion. But with proper semantic data layer descriptions, the above can become:

  • id (description: email of each customer), (details: all emails are gmail unless otherwise specified)
  • sales (description: total dollar value of account, calculated as widgets * number of sales for that buyer), (details: positive numbers are new sales, negative numbers are refunds, nan’s are free promotions)

Universal Semantic Layers prove their usefulness at all levels of a data source, from connection level, to table level, to column level, to custom views and more complex definitions.

AI: Accelerating Semantic Layer adoption

Semantic Layer tools across the data stack are popping up, but we think Semantic Layers are a lot more interesting embedded directly in an organization’s BI tool(s).

With AI adoption accelerating, we’re seeing more and more business users adopting AI and trusting its outputs. But AI is only as capable as the context provided. AI would be just as confused by our widget table example above as your typical human data consumer.

As more and more users depend on AI, and more and more teams are giving their colleagues access to these tools, detailed Semantic Layers become more and more important.

With the right information in place for AI, otherwise complex analyses can be made simple:

Where we’re headed

Efforts to establish open standards for semantic layers are an important step forward. At their core, semantic layers are just editable text — often YAML files containing the definitions and metrics for a data source. In practice, though, these descriptions are moving toward being fully manageable through user interfaces, so end users won’t need to think about YAML formatting or any unneeded semantic layer architecture complexity. Instead, anyone with the right permissions will be able to edit semantic layers through intuitive UI components, applied across all levels of the data stack, from connection, to table, to column, right at the point of data consumption.

The future: click on any table, column, or custom view in your data tool and add/edit metadata that informs other members of your organization (and AI) how to best use and consume your data sources.

The concern: how are large enterprise Semantic Layers going to be properly maintained as data sources change and evolve? How do we ensure everyone is bought into Semantic Layer quality? How do we ensure these Semantic Layers don’t turn into massive pools of context slop that confuse both end consumers and AI? Open standards and tooling for Semantic Layers are great, but what’s more important is teams with the right processes and controls in place to ensure the right individuals across your data stack can build and maintain Semantic Layers for the whole organization to consume.

These are important questions every organization will need to answer before relying on any vendor’s pitch for a semantic layer.

Quadratic Semantic Layer

Our first iteration of a Semantic Layer for AI at Quadratic has been quite simple:

  • After connecting a data source, Quadratic AI has full access to your databases schema by default with no additional work, including table names, fields, and types.
  • You then also manually add Connection-level semantic descriptions. Providing any additional context that you need for Quadratic AI to understand your data source.

Our initial pass at connection level details is the most naive iteration possible: a blank text box. We’ve taken note that the more people we talk to the less consensus we’ve found on where Semantic Layers are headed. So we started with the most basic and naive iteration: a blank text box. What we’ve observed is that simple open iteration is sometimes all AI needs for small databases to get started with clear understanding of table and column-level quirks in the data.

Example of a semantic layer in Quadratic.

AI in Quadratic has full understanding of a database, from tables to full schemas.

We believe our tool is set up excellently for being a familiar place to build and evolve your AI Semantic Layer. As users write and refine queries, they can simultaneously edit and adapt their semantic layer in a familiar spreadsheet-like interface. When data quirks are discovered, those insights can be immediately captured by updating the semantic layer directly within the workflow.

We imagine Semantic Layers from external tools being importable to Quadratic and editable directly in your connections. We believe Semantic Layers shouldn’t be limited to the data engineers. Let end consumers who work day in and day out in dynamic data environments continually work with and evolve their Semantic Layer to their use-case.

But while Semantic Layers maybe shouldn’t be limited to data engineers, they should be limited to individuals who are committed to ensuring continuous quality and maintenance of what is becoming a core business need for data analytics.

Quadratic logo

The spreadsheet with AI.

Use Quadratic for free