How Derived Views in Connect AI Reduce LLM Token Usage

AI-driven analytics rely on data from multiple systems, such as CRMs, ERPs, Databases, HR platforms, and more. While AI can interpret and query each system, thanks to the Model Context Protocol (MCP), it must learn schemas, identify relationships, and create multiple queries for every prompt run. Because the LLM is rebuilding the model/schema every time, responses take longer to generate, especially when multiple APIs and sources are involved.

Not only are the responses taking longer to generate, but the repeated reasoning consumes tokens for every schema discovery, relationship mapping, and query call. CData Connect AI, the first managed MCP platform, introduces Derived Views to break the repetitive schema calls, saving you time and savings on tokens. Derived Views give teams the ability to create multi-source deterministic datasets that are reusable to make AI-driven analytics faster, cheaper, and more reliable.

In this article, we explore how Derived Views reduce token costs, how Connect AI supports multi-source modeling, and how together they dramatically accelerate AI-driven analytics.

Overview

Here's a quick overview of what is covered in this article:

How derived views solve the repetition (and token) problem
How Connect AI makes multi-source modeling possible
Demo: Set up a Derived View in Connect AI
Test results: Asking the same prompt with and without using a derived view

How derived views solve the repetition (and token) problem

Derived Views are a virtual, reusable table or view created by combining, filtering, or aggregating data from one or more existing base tables or views. Because of this, Derived Views solve a huge challenge for AI-driven analytics: the need to rebuild logic every time a prompt is received.

Once the Derived View is defined and created, it becomes a shared asset for AI agents (and humans) to query without recalculating or reconstructing anything. Every following prompt becomes faster, cheaper, and lighter because the LLM no longer has to rediscover schemas, relationships, and data models.

This results in:

Large token savings: Because the LLM no longer has to rediscover schemas and relationships, token consumption is significantly reduced because of smaller and fewer query calls. This is especially true for complex queries involving multiple data sources.
Faster responses: With the logic already defined in the Derived View, AI agents can generate responses quicker with no reconstruction needed.
Consistent answers: Derived Views ensure that every AI agent is working from the same deterministic dataset, leading to more consistent and reliable answers across different queries, agents, and users.
Reusable analytics: Every AI agent and team can leverage the same Derived View for different analytics tasks, promoting collaboration and efficiency.

Derived Views can create a foundation for AI analytics to deliver quicker, cheaper, and more consistent answers.

How Connect AI makes multi-source modeling possible

Derived Views are possible and efficient in Connect AI because the platform provides the semantic foundation needed to unify and interpret data across systems that were never designed to work together or communicate. At the core, Connect AI standardizes every connected data source through a unified semantic interface. This layer sits between the systems’ APIs and the LLM, presenting each data source in a consistent, tabular dataset model.

This means that Salesforce objects, Zendesk tickets, and Snowflake tables all appear as row-and-column datasets with source-level semantic intelligence that is enriched with metadata about each system’s tables, columns, data types, and relationships. This added context ensures the LLM understands not just the structure of the data, but also its meaning and the expected usage.

Under the hood, Connect AI translates requests from agents into each system’s native API language. AI agents (and humans) reason with simple tables, while Connect AI handles the complex translation required to access the source data. Because every source is presented in a common format and enriched with source-specific intelligence, AI agents can query, join, and reason across multiple systems with different schemas as if they were part of the same model. This translation layer is what makes cross-source derived views both possible and efficient through Connect AI.

Connect AI’s semantic interface and source-level metadata create the environment that allows derived views to exist, enabling them to behave like unified datasets even though the underlying source data is spread across multiple live systems and APIs

Demo: Set up a Derived View in Connect AI

Prerequisites

You need the following to follow along with this guide:

A Google account (if you do not have one, create one here)
A CData Connect AI account (sign up for a free trial)

Step 1: Copy the Sample Google Sheet

We'll start by copying a sample Google Sheet that contains customer health data including accounts, opportunities, support tickets, and product usage information.

Navigate to the sample customer health spreadsheet in your browser.
Click File > Make a copy to save a copy to your Google Drive.
NOTE: Remember the name you give to the copy (e.g., "demo_organization"). You'll need this name when configuring the connection in Connect AI.

The spreadsheet contains four sheets with related customer data:

account: Company information including name, industry, revenue, and employee count
opportunity: Sales opportunities with stage, amount, probability, and close dates
tickets: Customer support tickets with priority, status, and descriptions
usage: Product usage metrics including job runs, records processed, and revenue data

Step 2: Configure Google Sheets Connectivity in Connect AI

Next, we'll set up a connection to the copied Google Sheet in Connect AI.

Sign up or Log in to Connect AI

Navigate to https://www.cdata.com/ai/signup/ to create a new account, or https://cloud.cdata.com/ to log in to your existing account.
Complete the sign-up process or log in with your credentials.

Add a Google Sheets Connection

Once logged in to Connect AI, click Sources in the left navigation menu and click Add Connection.
Select Google Sheets from the Add Connection panel.
In the connection configuration:
- Set the Spreadsheet property to the name of your copied Google Sheet (e.g., "demo_organization")
- Click Sign in to authenticate with Google using OAuth
After successful authentication, navigate to the Permissions tab and configure user-based permissions as needed.

Step 3: Create a Derived View in Connect AI

Follow these steps to create a Derived View:

Log-in to Connect AI and select Explorer from the left sidebar.

Next, create the query in the Query Builder or SQL Editor. Below is an example SQL query to use as a derived view that uses the sample Google Sheets spreadsheet to generate a customer health score.

-- Simple Customer Health Score that combines account, usage, support tickets, and opportunities data
-- Health score ranges from 0-100 based on key customer success metrics
SELECT a.[id1]                         AS [Account_ID],
       a.[name]                        AS [Account_Name],
       a.[industry],
       a.[type]                        AS [Account_Type],
       -- Usage Metrics
       COALESCE(u.[total_jobs], 0)     AS [Usage_Job_Runs],
       COALESCE(u.[revenue_12m], 0)    AS [Annual_Revenue],
       u.[last_activity]               AS [Last_Usage_Date],
       -- Support Metrics
       COALESCE(s.[total_tickets], 0)  AS [Support_Tickets],
       COALESCE(s.[open_tickets], 0)   AS [Open_Support_Tickets],
       -- Sales Metrics
       COALESCE(o.[pipeline_count], 0) AS [Open_Opportunities],
       COALESCE(o.[pipeline_value], 0) AS [Pipeline_Value],
       -- Health Score Components (0-25 points each)
       -- Usage Health: Based on job runs and recent activity
       CASE
         WHEN COALESCE(u.[total_jobs], 0) >= 5000
              AND u.[days_since_activity] <= 30 THEN 25
         WHEN COALESCE(u.[total_jobs], 0) >= 2000
              AND u.[days_since_activity] <= 60 THEN 20
         WHEN COALESCE(u.[total_jobs], 0) >= 500
              AND u.[days_since_activity] <= 90 THEN 15
         WHEN COALESCE(u.[total_jobs], 0) > 0 THEN 10
         ELSE 0
       END                             AS [Usage_Score],
       -- Revenue Health: Based on customer value
       CASE
         WHEN COALESCE(u.[revenue_12m], 0) >= 100000 THEN 25
         WHEN COALESCE(u.[revenue_12m], 0) >= 50000 THEN 20
         WHEN COALESCE(u.[revenue_12m], 0) >= 25000 THEN 15
         WHEN COALESCE(u.[revenue_12m], 0) > 0 THEN 10
         ELSE 5
       END                             AS [Revenue_Score],
       -- Support Health: Fewer tickets = better health
       CASE
         WHEN COALESCE(s.[total_tickets], 0) = 0 THEN 25
         WHEN COALESCE(s.[open_tickets], 0) = 0
              AND COALESCE(s.[total_tickets], 0) <= 3 THEN 20
         WHEN COALESCE(s.[open_tickets], 0) <= 1
              AND COALESCE(s.[total_tickets], 0) <= 5 THEN 15
         WHEN COALESCE(s.[open_tickets], 0) <= 3 THEN 10
         ELSE 5
       END                             AS [Support_Score],
       -- Sales Health: Active pipeline indicates growth
       CASE
         WHEN COALESCE(o.[pipeline_count], 0) >= 3 THEN 25
         WHEN COALESCE(o.[pipeline_count], 0) >= 2 THEN 20
         WHEN COALESCE(o.[pipeline_count], 0) >= 1 THEN 15
         WHEN COALESCE(o.[won_count], 0) > 0 THEN 10
         ELSE 5
       END                             AS [Sales_Score],
       -- Overall Health Score (sum of all components)
       ( CASE WHEN COALESCE(u.[total_jobs], 0) >= 5000 AND
         u.[days_since_activity] <=
         30
         THEN 25 WHEN COALESCE(u.[total_jobs], 0) >= 2000 AND
         u.[days_since_activity]
         <= 60 THEN 20 WHEN COALESCE(u.[total_jobs], 0) >= 500 AND
         u.[days_since_activity] <= 90 THEN 15 WHEN COALESCE(u.[total_jobs], 0)
         > 0
         THEN 10 ELSE 0 END + CASE
         WHEN COALESCE(u.[revenue_12m], 0) >= 100000 THEN 25 WHEN
         COALESCE(u.[revenue_12m], 0) >= 50000 THEN 20 WHEN
         COALESCE(u.[revenue_12m], 0
         ) >= 25000 THEN
         15 WHEN COALESCE(u.[revenue_12m], 0) > 0 THEN 10 ELSE 5 END + CASE WHEN
         COALESCE(s.[total_tickets], 0) = 0 THEN 25 WHEN
         COALESCE(s.[open_tickets], 0)
         = 0 AND COALESCE(s.[total_tickets], 0) <= 3 THEN 20 WHEN
         COALESCE(s.[open_tickets], 0) <= 1 AND COALESCE(s.[total_tickets], 0)
         <= 5
         THEN 15 WHEN
         COALESCE(s.[open_tickets], 0) <= 3 THEN 10 ELSE 5 END + CASE WHEN
         COALESCE(o.[pipeline_count], 0) >= 3 THEN 25 WHEN
         COALESCE(o.[pipeline_count],
         0) >= 2 THEN 20
         WHEN COALESCE(o.[pipeline_count], 0) >= 1 THEN 15 WHEN
         COALESCE(o.[won_count],
         0) > 0 THEN 10 ELSE 5 END )   AS [Health_Score],
       -- Health Category
       CASE
         WHEN ( CASE WHEN COALESCE(u.[total_jobs], 0) >= 5000 AND
                u.[days_since_activity] <=
                       30
                       THEN 25 WHEN COALESCE(u.[total_jobs], 0) >= 2000 AND
                       u.[days_since_activity] <= 60 THEN 20 WHEN
                COALESCE(u.[total_jobs], 0)
                       >= 500 AND
                       u.[days_since_activity] <= 90 THEN 15 WHEN
                COALESCE(u.[total_jobs], 0)
                       > 0 THEN 10 ELSE 0 END
                       + CASE WHEN COALESCE(u.[revenue_12m], 0) >= 100000 THEN
                25 WHEN
                       COALESCE(u.[revenue_12m], 0) >= 50000 THEN 20 WHEN
                       COALESCE(u.[revenue_12m], 0) >= 25000 THEN 15 WHEN
                       COALESCE(u.[revenue_12m], 0) > 0
                       THEN 10 ELSE 5 END + CASE WHEN
                COALESCE(s.[total_tickets], 0) =
                0 THEN
                       25 WHEN COALESCE(s.[open_tickets], 0) = 0 AND
                       COALESCE(s.[total_tickets], 0) <= 3 THEN 20 WHEN
                       COALESCE(s.[open_tickets], 0) <= 1 AND
                       COALESCE(s.[total_tickets], 0) <= 5 THEN 15 WHEN
                       COALESCE(s.[open_tickets], 0) <= 3 THEN 10 ELSE 5 END +
                CASE
                WHEN
                       COALESCE(o.[pipeline_count], 0) >= 3 THEN 25 WHEN
                       COALESCE(o.[pipeline_count], 0) >= 2
                       THEN 20 WHEN COALESCE(o.[pipeline_count], 0) >= 1 THEN 15
                WHEN
                COALESCE
                       (o.[won_count], 0) > 0 THEN 10 ELSE 5 END ) >= 80 THEN
         'Healthy'
         WHEN ( CASE WHEN COALESCE(u.[total_jobs], 0) >= 5000 AND
                u.[days_since_activity] <=
                       30
                       THEN 25 WHEN COALESCE(u.[total_jobs], 0) >= 2000 AND
                       u.[days_since_activity] <= 60 THEN 20 WHEN
                COALESCE(u.[total_jobs], 0)
                       >= 500 AND
                       u.[days_since_activity] <= 90 THEN 15 WHEN
                COALESCE(u.[total_jobs], 0)
                       > 0 THEN 10 ELSE 0 END
                       + CASE WHEN COALESCE(u.[revenue_12m], 0) >= 100000 THEN
                25 WHEN
                       COALESCE(u.[revenue_12m], 0) >= 50000 THEN 20 WHEN
                       COALESCE(u.[revenue_12m], 0) >= 25000 THEN 15 WHEN
                       COALESCE(u.[revenue_12m], 0) > 0
                       THEN 10 ELSE 5 END + CASE WHEN
                COALESCE(s.[total_tickets], 0) =
                0 THEN
                       25 WHEN COALESCE(s.[open_tickets], 0) = 0 AND
                       COALESCE(s.[total_tickets], 0) <= 3 THEN 20 WHEN
                       COALESCE(s.[open_tickets], 0) <= 1 AND
                       COALESCE(s.[total_tickets], 0) <= 5 THEN 15 WHEN
                       COALESCE(s.[open_tickets], 0) <= 3 THEN 10 ELSE 5 END +
                CASE
                WHEN
                       COALESCE(o.[pipeline_count], 0) >= 3 THEN 25 WHEN
                       COALESCE(o.[pipeline_count], 0) >= 2
                       THEN 20 WHEN COALESCE(o.[pipeline_count], 0) >= 1 THEN 15
                WHEN
                COALESCE
                       (o.[won_count], 0) > 0 THEN 10 ELSE 5 END ) >= 60 THEN
         'At Risk'
         ELSE 'Critical'
       END                             AS [Health_Category]
FROM   [GoogleSheets1].[GoogleSheets].[demo_organization_account] a
       -- Usage data aggregated per account
       LEFT JOIN (SELECT [account_id],
                         Sum([s_standardjobruns])  AS [Total_Jobs],
                         Sum([last_12m_revenue])   AS [Revenue_12M],
                         Max([lasttelemetryevent]) AS [Last_Activity], Datediff(
                 'day',
                 Max(
                 [LASTTELEMETRYEVENT]), Getdate()) AS [Days_Since_Activity]
                  FROM
       [GoogleSheets1].[GoogleSheets].[demo_organization_usage] GROUP
                 BY [ACCOUNT_ID]) u
              ON a.[id1] = u.[account_id]
       -- Support tickets aggregated per account
       LEFT JOIN (SELECT [accountid],
                         Count(*) AS [Total_Tickets],
                         Sum(CASE
                               WHEN [status] IN ( 'pending', 'open' ) THEN 1
                               ELSE 0
                             END) AS [Open_Tickets]
                  FROM
       [GoogleSheets1].[GoogleSheets].[demo_organization_tickets]
                  GROUP  BY [accountid]) s
              ON a.[id1] = s.[accountid]
       -- Opportunities aggregated per account
       LEFT JOIN (SELECT [accountid],
                         Count(*) AS [Total_Opps],
                         Sum(CASE
                               WHEN [isclosed] = 0 THEN 1
                               ELSE 0
                             END) AS [Pipeline_Count],
                         Sum(CASE
                               WHEN [isclosed] = 0 THEN [amount]
                               ELSE 0
                             END) AS [Pipeline_Value],
                         Sum(CASE
                               WHEN [iswon] = 1 THEN 1
                               ELSE 0
                             END) AS [Won_Count]
                  FROM
       [GoogleSheets1].[GoogleSheets].[demo_organization_opportunity]
                  GROUP  BY [accountid]) o
              ON a.[id1] = o.[accountid]
ORDER  BY [health_score] DESC

Note: The example Derived View above assumes that the Google Sheets connection name is "GoogleSheets1".

After you execute the query, the Save button appears. Click Save > Save as Derived View, and fill out the name of the Derived View then click Create.
The new Derived View should now appear under the Derived Views Folder. Click the three vertical dots to the right of the new Derived View and then click Query to populate the Explorer with the Derived View.

Step 4: Use the new Derived View

Now that the Derived View is created and saved in Connect AI, LLMs and AI tools can query it directly through Connect AI. When prompting LLMs or an AI agent, simply reference the Derived View as you would any other table or dataset. For best results, be sure to include the Derived View's full table name (including the catalog and schema) in your prompt:

[CData].[DerivedViews].[Your_Derived_View_Name]

Note: By default, Derived Views saved in Connect AI use the CData catalog and DerivedViews schema.

See the difference: Asking the same prompt with and without using a derived view

To illustrate the value of using Derived Views with Connect AI, we asked the same prompt twice, once with a derived view configured and once without a derived view. We instructed the agent to blend customer data from Salesforce, support history from Zendesk, and product usage trends from Snowflake.

Prompt: “Create an executive summary of our top 5 highest-value customers and their current health status. Include any immediate actions needed.”

Results:

	Without Derived View	With Derived View
Execution Time	1 min 16.34s	35.88s
Tokens Used (Total)	96,000 Tokens	14,202 Tokens
Tool Calls	10	3

Using a derived view reduced token usage by over 80%, tool calls by 70%, and cut the execution time by more than half. Without a Derived View, the AI agent must build a multi-source customer health model from scratch, requiring multiple queries for metadata and context. With a Derived View, the AI agent can simply use a consistent, deterministic dataset that already has your own business logic baked into it.

This is the benefit of structure beneath your AI-driven analytic workflow. With Connect AI’s semantic interface and derived views, you get faster insights, lower token costs, and predictable results for every prompt.

Get CData Connect AI

To try derived views and get live data access to 150+ SaaS, Big Data, and NoSQL sources directly from your AI tools, try CData Connect AI today!