Optimizing Semantic Views with Verified Queries in Snowflake Cortex Analyst AI Studio

A practitioner’s guide to building self-improving AI analytics using an e-commerce use case

TLDR: Verified queries are curated question-SQL pairs that teach Cortex Analyst your business definitions. The AI Studio optimization engine analyzes them, extracts reusable logic patterns, and auto-generates improvements to your semantic view — descriptions, synonyms, custom instructions, and new metrics. This guide walks through the complete setup end-to-end: data model, semantic view, all 10 verified queries, the real optimization output, and the naming conflict you’ll likely hit along the way.

The Problem No Documentation Solves

Every company defines “revenue” differently.

At one company it’s gross sales. At another it’s net-after-discounts-for-delivered-orders-only. The word is the same. The SQL is not.

Out-of-the-box Cortex Analyst doesn’t know your definition. It makes an educated guess based on column names and comments. Sometimes it guesses right. Often it doesn’t — and you end up with SQL that counts cancelled orders in your revenue figure, or divides by zero when a channel has no sessions.

Verified queries fix this. They’re not documentation. They’re demonstrations. You show Cortex Analyst what “revenue” actually means in your context, and it learns to apply that definition to questions it’s never seen before.

That’s the feature we’re building end-to-end in this article.

What We’ll Build

  • An e-commerce data model (5 tables, ~80 rows) you can run today in any Snowflake account
  • A semantic view with facts, dimensions, metrics, and AI directives
  • 10 verified queries, each teaching a distinct business concept
  • Validation queries to confirm everything resolves correctly
  • A full walkthrough of the AI Studio optimization flow
  • The real output I got — including the duplicate name error you’ll likely hit

Part 1: The Data Foundation

Step 1 — Database and Schema Setup

CREATE DATABASE IF NOT EXISTS ECOMMERCE_DEMO;
CREATE SCHEMA IF NOT EXISTS ECOMMERCE_DEMO.SHOP;
USE DATABASE ECOMMERCE_DEMO;
USE SCHEMA SHOP;
USE WAREHOUSE COMPUTE_WH;

The data model is simple by design. Five tables. Clean foreign keys. Enough variation in order status, discount rates, and membership tiers to make the analytics interesting.

CUSTOMERS ──→ SESSIONS ──→ ORDERS ──→ ORDER_ITEMS ──→ PRODUCTS
(10 rows) (20 rows) (15 rows) (27 rows) (8 rows)

Step 2 — Create and Populate All Tables

CUSTOMERS — 10 shoppers across 5 countries, with Gold/Silver/Bronze tiers and a subscription flag.

CREATE OR REPLACE TABLE ECOMMERCE_DEMO.SHOP.CUSTOMERS (
CUSTOMER_ID NUMBER PRIMARY KEY,
EMAIL VARCHAR(200),
FIRST_NAME VARCHAR(50),
LAST_NAME VARCHAR(50),
COUNTRY VARCHAR(50),
MEMBERSHIP_TIER VARCHAR(20),
SIGNUP_DATE DATE,
IS_SUBSCRIBED BOOLEAN
);

INSERT INTO ECOMMERCE_DEMO.SHOP.CUSTOMERS VALUES
(1, 'alice@example.com', 'Alice', 'Martin', 'USA', 'Gold', '2022-03-10', TRUE),
(2, 'bob@example.com', 'Bob', 'Lee', 'Canada', 'Silver', '2022-07-22', TRUE),
(3, 'carol@example.com', 'Carol', 'Singh', 'UK', 'Gold', '2023-01-05', FALSE),
(4, 'dave@example.com', 'Dave', 'Patel', 'USA', 'Bronze', '2023-04-18', TRUE),
(5, 'eve@example.com', 'Eve', 'Kim', 'Germany', 'Gold', '2021-11-30', TRUE),
(6, 'frank@example.com', 'Frank', 'Nguyen', 'Canada', 'Silver', '2023-08-12', FALSE),
(7, 'grace@example.com', 'Grace', 'Brown', 'UK', 'Bronze', '2024-01-20', TRUE),
(8, 'hank@example.com', 'Hank', 'Davis', 'USA', 'Silver', '2022-12-01', TRUE),
(9, 'iris@example.com', 'Iris', 'Tanaka', 'Japan', 'Gold', '2023-06-15', TRUE),
(10, 'jack@example.com', 'Jack', 'Mueller', 'Germany', 'Bronze', '2024-03-05', FALSE);

PRODUCTS — 8 items across 5 categories, with both unit cost and unit price so profitability analysis is meaningful.

CREATE OR REPLACE TABLE ECOMMERCE_DEMO.SHOP.PRODUCTS (
PRODUCT_ID NUMBER PRIMARY KEY,
PRODUCT_NAME VARCHAR(150),
CATEGORY VARCHAR(50),
BRAND VARCHAR(50),
UNIT_COST NUMBER(10,2),
UNIT_PRICE NUMBER(10,2)
);

INSERT INTO ECOMMERCE_DEMO.SHOP.PRODUCTS VALUES
(201, 'Wireless Headphones', 'Electronics', 'SoundMax', 25.00, 79.99),
(202, 'Running Shoes', 'Footwear', 'StrideOn', 40.00, 129.99),
(203, 'Organic Coffee Beans', 'Grocery', 'BeanCo', 8.00, 24.99),
(204, 'Yoga Mat', 'Fitness', 'FlexFit', 10.00, 39.99),
(205, 'Laptop Stand', 'Electronics', 'DeskPro', 15.00, 59.99),
(206, 'Cotton T-Shirt', 'Apparel', 'BasicWear', 5.00, 19.99),
(207, 'Smart Watch', 'Electronics', 'TechLife', 80.00, 249.99),
(208, 'Protein Powder', 'Grocery', 'NutriPeak', 12.00, 44.99);

SESSIONS — 20 browsing sessions with device type and traffic source, spanning October–December 2024.

CREATE OR REPLACE TABLE ECOMMERCE_DEMO.SHOP.SESSIONS (
SESSION_ID VARCHAR(20) PRIMARY KEY,
CUSTOMER_ID NUMBER,
SESSION_START TIMESTAMP_NTZ,
SESSION_END TIMESTAMP_NTZ,
DEVICE_TYPE VARCHAR(20),
TRAFFIC_SOURCE VARCHAR(30),
PAGES_VIEWED NUMBER
);

INSERT INTO ECOMMERCE_DEMO.SHOP.SESSIONS VALUES
('S001', 1, '2024-10-01 09:00:00', '2024-10-01 09:25:00', 'Desktop', 'Google', 8),
('S002', 2, '2024-10-01 10:15:00', '2024-10-01 10:30:00', 'Mobile', 'Instagram', 4),
('S003', 3, '2024-10-02 14:00:00', '2024-10-02 14:45:00', 'Desktop', 'Direct', 12),
('S004', 1, '2024-10-03 08:30:00', '2024-10-03 09:00:00', 'Tablet', 'Email', 6),
('S005', 4, '2024-10-05 11:00:00', '2024-10-05 11:10:00', 'Mobile', 'Google', 3),
('S006', 5, '2024-10-06 16:00:00', '2024-10-06 16:50:00', 'Desktop', 'Facebook', 15),
('S007', 6, '2024-10-07 13:00:00', '2024-10-07 13:20:00', 'Mobile', 'Google', 5),
('S008', 7, '2024-10-08 19:00:00', '2024-10-08 19:35:00', 'Desktop', 'Direct', 9),
('S009', 8, '2024-10-10 10:00:00', '2024-10-10 10:40:00', 'Desktop', 'Email', 11),
('S010', 9, '2024-10-12 07:30:00', '2024-10-12 08:00:00', 'Mobile', 'Google', 7),
('S011', 10, '2024-10-15 12:00:00', '2024-10-15 12:05:00', 'Mobile', 'Instagram', 2),
('S012', 1, '2024-11-01 09:00:00', '2024-11-01 09:30:00', 'Desktop', 'Google', 10),
('S013', 2, '2024-11-03 11:00:00', '2024-11-03 11:20:00', 'Mobile', 'Direct', 5),
('S014', 5, '2024-11-05 15:00:00', '2024-11-05 15:45:00', 'Desktop', 'Email', 14),
('S015', 3, '2024-11-10 17:00:00', '2024-11-10 17:30:00', 'Tablet', 'Facebook', 8),
('S016', 8, '2024-11-12 10:00:00', '2024-11-12 10:25:00', 'Desktop', 'Google', 6),
('S017', 9, '2024-11-15 08:00:00', '2024-11-15 08:20:00', 'Mobile', 'Direct', 4),
('S018', 4, '2024-12-01 14:00:00', '2024-12-01 14:30:00', 'Desktop', 'Google', 9),
('S019', 7, '2024-12-05 18:00:00', '2024-12-05 18:40:00', 'Desktop', 'Email', 11),
('S020', 5, '2024-12-10 16:00:00', '2024-12-10 16:55:00', 'Desktop', 'Direct', 16);

ORDERS — 15 orders with mixed statuses (Delivered, Returned, Cancelled, Processing) and optional coupon codes. The status mix is deliberate — it’s what makes the WHERE ORDER_STATUS = 'Delivered' filter in verified queries matter.

CREATE OR REPLACE TABLE ECOMMERCE_DEMO.SHOP.ORDERS (
ORDER_ID NUMBER PRIMARY KEY,
SESSION_ID VARCHAR(20),
CUSTOMER_ID NUMBER,
ORDER_DATE DATE,
ORDER_STATUS VARCHAR(20),
SHIPPING_COST NUMBER(8,2),
COUPON_CODE VARCHAR(30)
);

INSERT INTO ECOMMERCE_DEMO.SHOP.ORDERS VALUES
(3001, 'S001', 1, '2024-10-01', 'Delivered', 5.99, NULL),
(3002, 'S003', 3, '2024-10-02', 'Delivered', 0.00, 'FALL20'),
(3003, 'S004', 1, '2024-10-03', 'Delivered', 5.99, NULL),
(3004, 'S006', 5, '2024-10-06', 'Delivered', 0.00, NULL),
(3005, 'S008', 7, '2024-10-08', 'Delivered', 7.99, 'WELCOME10'),
(3006, 'S009', 8, '2024-10-10', 'Returned', 5.99, NULL),
(3007, 'S010', 9, '2024-10-12', 'Delivered', 0.00, NULL),
(3008, 'S012', 1, '2024-11-01', 'Delivered', 5.99, 'LOYALTY15'),
(3009, 'S013', 2, '2024-11-03', 'Cancelled', 0.00, NULL),
(3010, 'S014', 5, '2024-11-05', 'Delivered', 0.00, NULL),
(3011, 'S015', 3, '2024-11-10', 'Delivered', 5.99, 'FALL20'),
(3012, 'S016', 8, '2024-11-12', 'Delivered', 5.99, NULL),
(3013, 'S018', 4, '2024-12-01', 'Delivered', 7.99, 'HOLIDAY25'),
(3014, 'S019', 7, '2024-12-05', 'Delivered', 5.99, NULL),
(3015, 'S020', 5, '2024-12-10', 'Processing', 0.00, NULL);

ORDER_ITEMS — 27 line items linking orders to products, with varying quantities and discount percentages.

CREATE OR REPLACE TABLE ECOMMERCE_DEMO.SHOP.ORDER_ITEMS (
ORDER_ITEM_ID NUMBER PRIMARY KEY,
ORDER_ID NUMBER,
PRODUCT_ID NUMBER,
QUANTITY NUMBER,
DISCOUNT_PCT NUMBER(5,2)
);

INSERT INTO ECOMMERCE_DEMO.SHOP.ORDER_ITEMS VALUES
(1, 3001, 201, 1, 0.00), (2, 3001, 205, 1, 0.00),
(3, 3002, 207, 1, 0.20), (4, 3002, 204, 2, 0.20),
(5, 3003, 203, 3, 0.00), (6, 3004, 207, 1, 0.00),
(7, 3004, 201, 2, 0.00), (8, 3005, 206, 3, 0.10),
(9, 3005, 204, 1, 0.10), (10, 3006, 208, 2, 0.00),
(11, 3007, 202, 1, 0.00), (12, 3007, 203, 2, 0.00),
(13, 3008, 201, 1, 0.15), (14, 3008, 208, 1, 0.15),
(15, 3009, 206, 2, 0.00), (16, 3010, 207, 1, 0.00),
(17, 3010, 205, 2, 0.00), (18, 3011, 203, 4, 0.20),
(19, 3011, 204, 1, 0.20), (20, 3012, 202, 1, 0.00),
(21, 3012, 208, 1, 0.00), (22, 3013, 207, 1, 0.25),
(23, 3013, 201, 1, 0.25), (24, 3014, 206, 5, 0.00),
(25, 3014, 203, 2, 0.00), (26, 3015, 207, 2, 0.00),
(27, 3015, 205, 1, 0.00);

Part 2: The Semantic View

A semantic view is Cortex Analyst’s contract with your data — it defines what things mean, how tables connect, and what calculations are valid. Everything downstream depends on getting this right.

The Full Semantic View (with All 10 Verified Queries)

The complete CREATE OR REPLACE SEMANTIC VIEW statement below includes tables, relationships, facts, dimensions, metrics, AI directives, and all 10 verified queries — ready to execute as-is.

CREATE OR REPLACE SEMANTIC VIEW ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV
-- ─── TABLES ───────────────────────────────────────────────────────────────
TABLES (
customers AS ECOMMERCE_DEMO.SHOP.CUSTOMERS
PRIMARY KEY (CUSTOMER_ID)
WITH SYNONYMS ('shoppers', 'buyers', 'users')
COMMENT = 'Registered customers with membership and subscription info',
products AS ECOMMERCE_DEMO.SHOP.PRODUCTS
PRIMARY KEY (PRODUCT_ID)
WITH SYNONYMS ('items', 'SKUs', 'merchandise')
COMMENT = 'Product catalog with cost and price',
sessions AS ECOMMERCE_DEMO.SHOP.SESSIONS
PRIMARY KEY (SESSION_ID)
WITH SYNONYMS ('visits', 'browsing sessions')
COMMENT = 'Website browsing sessions with device and traffic source',
orders AS ECOMMERCE_DEMO.SHOP.ORDERS
PRIMARY KEY (ORDER_ID)
WITH SYNONYMS ('purchases', 'transactions')
COMMENT = 'Placed orders with status and shipping',
order_items AS ECOMMERCE_DEMO.SHOP.ORDER_ITEMS
PRIMARY KEY (ORDER_ITEM_ID)
COMMENT = 'Individual line items within an order'
)
-- ─── RELATIONSHIPS ────────────────────────────────────────────────────────
RELATIONSHIPS (
sessions_to_customers AS sessions (CUSTOMER_ID) REFERENCES customers,
orders_to_sessions AS orders (SESSION_ID) REFERENCES sessions,
orders_to_customers AS orders (CUSTOMER_ID) REFERENCES customers,
order_items_to_orders AS order_items (ORDER_ID) REFERENCES orders,
order_items_to_products AS order_items (PRODUCT_ID) REFERENCES products
)
-- ─── FACTS (row-level calculations) ──────────────────────────────────────
FACTS (
products.unit_price AS products.UNIT_PRICE
COMMENT = 'Selling price per unit',
products.unit_cost AS products.UNIT_COST
COMMENT = 'Cost per unit',
order_items.line_revenue AS products.unit_price * order_items.QUANTITY
COMMENT = 'Gross line revenue before discount',
order_items.line_discount AS products.unit_price * order_items.QUANTITY * order_items.DISCOUNT_PCT
COMMENT = 'Discount amount per line item',
order_items.net_line_revenue AS products.unit_price * order_items.QUANTITY * (1 - order_items.DISCOUNT_PCT)
COMMENT = 'Net revenue after discount per line item',
order_items.line_cost AS products.unit_cost * order_items.QUANTITY
COMMENT = 'Total cost for the line item',
order_items.line_profit AS
(products.unit_price * order_items.QUANTITY * (1 - order_items.DISCOUNT_PCT))
- (products.unit_cost * order_items.QUANTITY)
COMMENT = 'Profit per line item (net revenue minus cost)',
sessions.session_duration_min AS TIMESTAMPDIFF('MINUTE', sessions.SESSION_START, sessions.SESSION_END)
COMMENT = 'Session duration in minutes',
-- PRIVATE: helper flags consumed only by metrics; not exposed for direct querying
PRIVATE orders.is_delivered AS CASE WHEN orders.ORDER_STATUS = 'Delivered' THEN 1 ELSE 0 END,
PRIVATE orders.is_returned AS CASE WHEN orders.ORDER_STATUS = 'Returned' THEN 1 ELSE 0 END
)
-- ─── DIMENSIONS (groupable / filterable attributes) ───────────────────────
DIMENSIONS (
customers.customer_name AS customers.FIRST_NAME || ' ' || customers.LAST_NAME
WITH SYNONYMS = ('shopper name', 'buyer name')
COMMENT = 'Full name of the customer',
customers.country AS customers.COUNTRY
WITH SYNONYMS = ('region', 'location')
COMMENT = 'Customer country',
customers.membership_tier AS customers.MEMBERSHIP_TIER
WITH SYNONYMS = ('tier', 'loyalty tier', 'membership level')
COMMENT = 'Customer loyalty tier (Gold, Silver, Bronze)',
customers.is_subscribed AS customers.IS_SUBSCRIBED
COMMENT = 'Whether the customer is subscribed to marketing emails',
products.product_name AS products.PRODUCT_NAME
COMMENT = 'Product display name',
products.category AS products.CATEGORY
WITH SYNONYMS = ('product category', 'department')
COMMENT = 'Product category (Electronics, Footwear, Grocery, etc.)',
products.brand AS products.BRAND
COMMENT = 'Product brand name',
sessions.device_type AS sessions.DEVICE_TYPE
WITH SYNONYMS = ('device', 'platform')
COMMENT = 'Device used during session (Desktop, Mobile, Tablet)',
sessions.traffic_source AS sessions.TRAFFIC_SOURCE
WITH SYNONYMS = ('channel', 'acquisition source', 'referrer')
COMMENT = 'How the customer arrived (Google, Direct, Email, etc.)',
sessions.session_date AS DATE(sessions.SESSION_START)
COMMENT = 'Date of the session',
orders.order_date AS orders.ORDER_DATE
COMMENT = 'Date the order was placed',
orders.order_month AS DATE_TRUNC('MONTH', orders.ORDER_DATE)
COMMENT = 'Month the order was placed',
orders.order_status AS orders.ORDER_STATUS
WITH SYNONYMS = ('status', 'fulfillment status')
COMMENT = 'Order status (Delivered, Returned, Cancelled, Processing)',
orders.has_coupon AS orders.COUPON_CODE IS NOT NULL
COMMENT = 'Whether a coupon code was used on the order'
)
-- ─── METRICS (aggregations) ───────────────────────────────────────────────
METRICS (
sessions.total_sessions AS COUNT(sessions.SESSION_ID)
COMMENT = 'Total number of browsing sessions',
sessions.unique_visitors AS COUNT(DISTINCT sessions.CUSTOMER_ID)
COMMENT = 'Count of distinct customers who visited',
sessions.avg_session_duration AS AVG(sessions.session_duration_min)
COMMENT = 'Average session duration in minutes',
sessions.avg_pages_viewed AS AVG(sessions.PAGES_VIEWED)
COMMENT = 'Average number of pages viewed per session',
orders.total_orders AS COUNT(orders.ORDER_ID)
COMMENT = 'Total number of orders placed',
orders.delivered_orders AS SUM(orders.is_delivered)
COMMENT = 'Number of successfully delivered orders',
orders.returned_orders AS SUM(orders.is_returned)
COMMENT = 'Number of returned orders',
orders.total_shipping AS SUM(orders.SHIPPING_COST)
COMMENT = 'Total shipping charges collected',
customers.customer_count AS COUNT(DISTINCT customers.CUSTOMER_ID)
COMMENT = 'Count of distinct customers',
order_items.gross_revenue AS SUM(order_items.line_revenue)
COMMENT = 'Total gross revenue before discounts',
order_items.total_discount AS SUM(order_items.line_discount)
COMMENT = 'Total discount amount given',
order_items.net_revenue AS SUM(order_items.net_line_revenue)
COMMENT = 'Total net revenue after discounts',
order_items.total_cost AS SUM(order_items.line_cost)
COMMENT = 'Total cost of goods sold',
order_items.total_profit AS SUM(order_items.line_profit)
COMMENT = 'Total profit (net revenue minus COGS)',
order_items.total_units_sold AS SUM(order_items.QUANTITY)
COMMENT = 'Total units sold across all orders',
order_items.avg_order_value AS AVG(order_items.net_line_revenue)
COMMENT = 'Average net revenue per line item'
)
COMMENT = 'E-commerce analytics: sessions, conversions, revenue, profitability, and customer insights'
-- ─── AI DIRECTIVES ────────────────────────────────────────────────────────
AI_SQL_GENERATION 'Always use net_line_revenue for revenue calculations (accounts for discounts). Round monetary values to 2 decimal places. When asked about active or engaged customers, use customers who placed at least one delivered order. When asked about profitability, use line_profit which is net revenue minus cost. Conversion rate = orders / sessions.'
AI_QUESTION_CATEGORIZATION 'If the question is about inventory or stock levels, explain this model covers sales analytics only, not inventory management. If the question asks about a specific product without naming it, ask the user to specify the product name or category.'
-- ─── VERIFIED QUERIES (10 queries for optimization) ──────────────────────
-- Each query teaches Cortex Analyst a distinct business concept.
-- The optimizer analyzes these patterns and uses them to answer novel questions.
AI_VERIFIED_QUERIES (
-- #1 - Monthly Revenue Trend
-- TEACHES: revenue = net after discount; filter to Delivered orders only
monthly_revenue AS (
QUESTION 'What is our monthly revenue trend?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION TRUE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
DATE_TRUNC(''MONTH'', o.ORDER_DATE) AS order_month,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue,
COUNT(DISTINCT o.ORDER_ID) AS order_count
FROM ECOMMERCE_DEMO.SHOP.ORDERS o
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.ORDER_ID = o.ORDER_ID
JOIN ECOMMERCE_DEMO.SHOP.PRODUCTS p ON p.PRODUCT_ID = oi.PRODUCT_ID
WHERE o.ORDER_STATUS = ''Delivered''
GROUP BY DATE_TRUNC(''MONTH'', o.ORDER_DATE)
ORDER BY order_month'
),
-- #2 - Revenue by Product Category
-- TEACHES: profit = net revenue - COGS
revenue_by_category AS (
QUESTION 'What is the revenue breakdown by product category?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION TRUE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
p.CATEGORY,
SUM(oi.QUANTITY) AS units_sold,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue,
ROUND(SUM((p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)) - (p.UNIT_COST * oi.QUANTITY)), 2) AS profit
FROM ECOMMERCE_DEMO.SHOP.PRODUCTS p
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.PRODUCT_ID = p.PRODUCT_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.ORDER_ID = oi.ORDER_ID
WHERE o.ORDER_STATUS = ''Delivered''
GROUP BY p.CATEGORY
ORDER BY net_revenue DESC'
),
-- #3 - Top-Selling Products
-- TEACHES: ranking pattern with multi-column GROUP BY
top_products AS (
QUESTION 'What are our top selling products by revenue?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION TRUE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
p.PRODUCT_NAME, p.CATEGORY, p.BRAND,
SUM(oi.QUANTITY) AS units_sold,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue
FROM ECOMMERCE_DEMO.SHOP.PRODUCTS p
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.PRODUCT_ID = p.PRODUCT_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.ORDER_ID = oi.ORDER_ID
WHERE o.ORDER_STATUS = ''Delivered''
GROUP BY p.PRODUCT_NAME, p.CATEGORY, p.BRAND
ORDER BY net_revenue DESC'
),
-- #4 - Conversion Rate by Traffic Source
-- TEACHES: conversion = orders / sessions; LEFT JOIN ensures zero-converting
-- channels still appear in results; NULLIF prevents division by zero
conversion_by_channel AS (
QUESTION 'What is the conversion rate by traffic source?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION TRUE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
s.TRAFFIC_SOURCE,
COUNT(DISTINCT s.SESSION_ID) AS total_sessions,
COUNT(DISTINCT o.ORDER_ID) AS orders_placed,
ROUND(COUNT(DISTINCT o.ORDER_ID) / NULLIF(COUNT(DISTINCT s.SESSION_ID), 0), 4) AS conversion_rate
FROM ECOMMERCE_DEMO.SHOP.SESSIONS s
LEFT JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.SESSION_ID = s.SESSION_ID
GROUP BY s.TRAFFIC_SOURCE
ORDER BY conversion_rate DESC'
),
-- #5 - Revenue by Membership Tier
-- TEACHES: per-customer metrics; NULLIF safe division
revenue_by_membership AS (
QUESTION 'How does revenue compare across membership tiers?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION FALSE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
c.MEMBERSHIP_TIER,
COUNT(DISTINCT c.CUSTOMER_ID) AS customer_count,
COUNT(DISTINCT o.ORDER_ID) AS order_count,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)) /
NULLIF(COUNT(DISTINCT c.CUSTOMER_ID), 0), 2) AS revenue_per_customer
FROM ECOMMERCE_DEMO.SHOP.CUSTOMERS c
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.CUSTOMER_ID = c.CUSTOMER_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.ORDER_ID = o.ORDER_ID
JOIN ECOMMERCE_DEMO.SHOP.PRODUCTS p ON p.PRODUCT_ID = oi.PRODUCT_ID
WHERE o.ORDER_STATUS = ''Delivered''
GROUP BY c.MEMBERSHIP_TIER
ORDER BY net_revenue DESC'
),
-- #6 - Device Performance
-- TEACHES: session engagement metrics combined with conversion by device type
device_performance AS (
QUESTION 'How do sessions and orders break down by device type?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION FALSE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
s.DEVICE_TYPE,
COUNT(DISTINCT s.SESSION_ID) AS sessions,
ROUND(AVG(TIMESTAMPDIFF(''MINUTE'', s.SESSION_START, s.SESSION_END)), 1) AS avg_duration_min,
ROUND(AVG(s.PAGES_VIEWED), 1) AS avg_pages,
COUNT(DISTINCT o.ORDER_ID) AS orders_placed,
ROUND(COUNT(DISTINCT o.ORDER_ID) / NULLIF(COUNT(DISTINCT s.SESSION_ID), 0), 4) AS conversion_rate
FROM ECOMMERCE_DEMO.SHOP.SESSIONS s
LEFT JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.SESSION_ID = s.SESSION_ID
GROUP BY s.DEVICE_TYPE
ORDER BY sessions DESC'
),
-- #7 - Coupon Impact
-- TEACHES: coupon = COUPON_CODE IS NOT NULL; gross vs net revenue comparison
coupon_impact AS (
QUESTION 'What is the impact of coupon usage on revenue and discounts?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION FALSE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
CASE WHEN o.COUPON_CODE IS NOT NULL THEN ''With Coupon'' ELSE ''No Coupon'' END AS coupon_used,
COUNT(DISTINCT o.ORDER_ID) AS order_count,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY), 2) AS gross_revenue,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * oi.DISCOUNT_PCT), 2) AS total_discount,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue
FROM ECOMMERCE_DEMO.SHOP.ORDERS o
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.ORDER_ID = o.ORDER_ID
JOIN ECOMMERCE_DEMO.SHOP.PRODUCTS p ON p.PRODUCT_ID = oi.PRODUCT_ID
WHERE o.ORDER_STATUS = ''Delivered''
GROUP BY CASE WHEN o.COUPON_CODE IS NOT NULL THEN ''With Coupon'' ELSE ''No Coupon'' END
ORDER BY net_revenue DESC'
),
-- #8 - Profit Margin by Brand
-- TEACHES: margin % = profit / net_revenue * 100
profit_margin_by_brand AS (
QUESTION 'What is the profit margin by brand?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION FALSE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
p.BRAND,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue,
ROUND(SUM(p.UNIT_COST * oi.QUANTITY), 2) AS total_cost,
ROUND(SUM((p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)) - (p.UNIT_COST * oi.QUANTITY)), 2) AS profit,
ROUND(SUM((p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)) - (p.UNIT_COST * oi.QUANTITY)) /
NULLIF(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 0) * 100, 1) AS margin_pct
FROM ECOMMERCE_DEMO.SHOP.PRODUCTS p
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.PRODUCT_ID = p.PRODUCT_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.ORDER_ID = oi.ORDER_ID
WHERE o.ORDER_STATUS = ''Delivered''
GROUP BY p.BRAND
ORDER BY profit DESC'
),
-- #9 - Revenue by Country
-- TEACHES: geographic segmentation routed through the customer join
country_revenue AS (
QUESTION 'What is the revenue by customer country?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION TRUE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
c.COUNTRY,
COUNT(DISTINCT c.CUSTOMER_ID) AS customers,
COUNT(DISTINCT o.ORDER_ID) AS orders,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue
FROM ECOMMERCE_DEMO.SHOP.CUSTOMERS c
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.CUSTOMER_ID = c.CUSTOMER_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.ORDER_ID = o.ORDER_ID
JOIN ECOMMERCE_DEMO.SHOP.PRODUCTS p ON p.PRODUCT_ID = oi.PRODUCT_ID
WHERE o.ORDER_STATUS = ''Delivered''
GROUP BY c.COUNTRY
ORDER BY net_revenue DESC'
),
-- #10 - Return Rate
-- TEACHES: rate = conditional COUNT / total * 100
return_rate AS (
QUESTION 'What is our return rate?'
VERIFIED_AT 1714700000
ONBOARDING_QUESTION FALSE
VERIFIED_BY '(STEWARD = ecommerce_analytics)'
SQL 'SELECT
COUNT(DISTINCT CASE WHEN ORDER_STATUS = ''Returned'' THEN ORDER_ID END) AS returned_orders,
COUNT(DISTINCT ORDER_ID) AS total_orders,
ROUND(COUNT(DISTINCT CASE WHEN ORDER_STATUS = ''Returned'' THEN ORDER_ID END) /
NULLIF(COUNT(DISTINCT ORDER_ID), 0) * 100, 2) AS return_rate_pct
FROM ECOMMERCE_DEMO.SHOP.ORDERS'
)
);

Two things worth calling out:

PRIVATE facts (is_delivered, is_returned) are helper flags consumed only by the delivered_orders and returned_orders metrics. Marking them PRIVATE keeps them off the AI's answerable surface so users can't accidentally query internal calculation flags directly.

AI directives travel with every Cortex Analyst request. AI_SQL_GENERATION locks in your rules — net revenue for revenue calculations, ROUND to 2 decimal places, delivered-only scope for "active customer" questions. AI_QUESTION_CATEGORIZATION gives Cortex Analyst a graceful exit for out-of-scope questions rather than letting it hallucinate an inventory answer from a sales model.

Part 3: Validate the Semantic View

Before triggering optimization, confirm the view resolves correctly. A broken metric or mis-configured join path will silently degrade optimization results.

Inspect the View Structure

DESCRIBE SEMANTIC VIEW ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV;

SHOW SEMANTIC DIMENSIONS IN ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV;
SHOW SEMANTIC FACTS IN ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV;
SHOW SEMANTIC METRICS IN ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV;

-- Dump the full YAML representation - useful for debugging join path issues
SELECT SYSTEM$READ_YAML_FROM_SEMANTIC_VIEW('ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV');
name: ECOMMERCE_INSIGHTS_SV
description: "E-commerce analytics: sessions, conversions, revenue, profitability, and customer insights"
tables:
- name: CUSTOMERS
synonyms:
- buyers
- shoppers
- users
description: Registered customers with membership and subscription info
base_table:
database: ECOMMERCE_DEMO
schema: SHOP
table: CUSTOMERS
primary_key:
columns:
- CUSTOMER_ID
dimensions:
- name: COUNTRY
synonyms:
- location
- region
description: Customer country
expr: customers.COUNTRY
data_type: VARCHAR(50)
- name: CUSTOMER_NAME
synonyms:
- buyer name
- shopper name
description: Full name of the customer
expr: customers.FIRST_NAME || ' ' || customers.LAST_NAME
data_type: VARCHAR(101)
- name: IS_SUBSCRIBED
description: Whether the customer is subscribed to marketing emails
expr: customers.IS_SUBSCRIBED
data_type: BOOLEAN
- name: MEMBERSHIP_TIER
synonyms:
- loyalty tier
- membership level
- tier
description: "Customer loyalty tier (Gold, Silver, Bronze)"
expr: customers.MEMBERSHIP_TIER
data_type: VARCHAR(20)
metrics:
- name: CUSTOMER_COUNT
description: Count of distinct customers
expr: COUNT(DISTINCT customers.CUSTOMER_ID)
access_modifier: public_access
- name: ORDERS
synonyms:
- purchases
- transactions
description: Placed orders with status and shipping
base_table:
database: ECOMMERCE_DEMO
schema: SHOP
table: ORDERS
primary_key:
columns:
- ORDER_ID
dimensions:
- name: HAS_COUPON
description: Whether a coupon code was used on the order
expr: orders.COUPON_CODE IS NOT NULL
data_type: BOOLEAN
- name: ORDER_DATE
description: Date the order was placed
expr: orders.ORDER_DATE
data_type: DATE
- name: ORDER_MONTH
description: Month the order was placed
expr: "DATE_TRUNC('MONTH', orders.ORDER_DATE)"
data_type: DATE
- name: ORDER_STATUS
synonyms:
- fulfillment status
- status
description: "Order status (Delivered, Returned, Cancelled, Processing)"
expr: orders.ORDER_STATUS
data_type: VARCHAR(20)
facts:
- name: IS_DELIVERED
expr: CASE WHEN orders.ORDER_STATUS = 'Delivered' THEN 1 ELSE 0 END
data_type: "NUMBER(1,0)"
access_modifier: private_access
- name: IS_RETURNED
expr: CASE WHEN orders.ORDER_STATUS = 'Returned' THEN 1 ELSE 0 END
data_type: "NUMBER(1,0)"
access_modifier: private_access
metrics:
- name: DELIVERED_ORDERS
description: Number of successfully delivered orders
expr: SUM(orders.is_delivered)
access_modifier: public_access
- name: RETURNED_ORDERS
description: Number of returned orders
expr: SUM(orders.is_returned)
access_modifier: public_access
- name: TOTAL_ORDERS
description: Total number of orders placed
expr: COUNT(orders.ORDER_ID)
access_modifier: public_access
- name: TOTAL_SHIPPING
description: Total shipping charges collected
expr: SUM(orders.SHIPPING_COST)
access_modifier: public_access
- name: ORDER_ITEMS
description: Individual line items within an order
base_table:
database: ECOMMERCE_DEMO
schema: SHOP
table: ORDER_ITEMS
primary_key:
columns:
- ORDER_ITEM_ID
facts:
- name: LINE_COST
description: Total cost for the line item
expr: products.unit_cost * order_items.QUANTITY
data_type: "NUMBER(38,2)"
access_modifier: public_access
- name: LINE_DISCOUNT
description: Discount amount per line item
expr: products.unit_price * order_items.QUANTITY * order_items.DISCOUNT_PCT
data_type: "NUMBER(38,4)"
access_modifier: public_access
- name: LINE_PROFIT
description: Profit per line item (net revenue minus cost)
expr: (products.unit_price * order_items.QUANTITY * (1 - order_items.DISCOUNT_PCT)) - (products.unit_cost * order_items.QUANTITY)
data_type: "NUMBER(38,4)"
access_modifier: public_access
- name: LINE_REVENUE
description: Gross line revenue before discount
expr: products.unit_price * order_items.QUANTITY
data_type: "NUMBER(38,2)"
access_modifier: public_access
- name: NET_LINE_REVENUE
description: Net revenue after discount per line item
expr: products.unit_price * order_items.QUANTITY * (1 - order_items.DISCOUNT_PCT)
data_type: "NUMBER(38,4)"
access_modifier: public_access
metrics:
- name: AVG_ORDER_VALUE
description: Average net revenue per line item
expr: AVG(order_items.net_line_revenue)
access_modifier: public_access
- name: GROSS_REVENUE
description: Total gross revenue before discounts
expr: SUM(order_items.line_revenue)
access_modifier: public_access
- name: NET_REVENUE
description: Total net revenue after discounts
expr: SUM(order_items.net_line_revenue)
access_modifier: public_access
- name: TOTAL_COST
description: Total cost of goods sold
expr: SUM(order_items.line_cost)
access_modifier: public_access
- name: TOTAL_DISCOUNT
description: Total discount amount given
expr: SUM(order_items.line_discount)
access_modifier: public_access
- name: TOTAL_PROFIT
description: Total profit (net revenue minus COGS)
expr: SUM(order_items.line_profit)
access_modifier: public_access
- name: TOTAL_UNITS_SOLD
description: Total units sold across all orders
expr: SUM(order_items.QUANTITY)
access_modifier: public_access
- name: PRODUCTS
synonyms:
- items
- merchandise
- SKUs
description: Product catalog with cost and price
base_table:
database: ECOMMERCE_DEMO
schema: SHOP
table: PRODUCTS
primary_key:
columns:
- PRODUCT_ID
dimensions:
- name: BRAND
description: Product brand name
expr: products.BRAND
data_type: VARCHAR(50)
- name: CATEGORY
synonyms:
- department
- product category
description: "Product category (Electronics, Footwear, Grocery, etc.)"
expr: products.CATEGORY
data_type: VARCHAR(50)
- name: PRODUCT_NAME
description: Product display name
expr: products.PRODUCT_NAME
data_type: VARCHAR(150)
facts:
- name: UNIT_COST
description: Cost per unit
expr: products.UNIT_COST
data_type: "NUMBER(10,2)"
access_modifier: public_access
- name: UNIT_PRICE
description: Selling price per unit
expr: products.UNIT_PRICE
data_type: "NUMBER(10,2)"
access_modifier: public_access
- name: SESSIONS
synonyms:
- browsing sessions
- visits
description: Website browsing sessions with device and traffic source
base_table:
database: ECOMMERCE_DEMO
schema: SHOP
table: SESSIONS
primary_key:
columns:
- SESSION_ID
dimensions:
- name: DEVICE_TYPE
synonyms:
- device
- platform
description: "Device used during session (Desktop, Mobile, Tablet)"
expr: sessions.DEVICE_TYPE
data_type: VARCHAR(20)
- name: SESSION_DATE
description: Date of the session
expr: DATE(sessions.SESSION_START)
data_type: DATE
- name: TRAFFIC_SOURCE
synonyms:
- acquisition source
- channel
- referrer
description: "How the customer arrived (Google, Direct, Email, etc.)"
expr: sessions.TRAFFIC_SOURCE
data_type: VARCHAR(30)
facts:
- name: SESSION_DURATION_MIN
description: Session duration in minutes
expr: "TIMESTAMPDIFF('MINUTE', sessions.SESSION_START, sessions.SESSION_END)"
data_type: "NUMBER(18,0)"
access_modifier: public_access
metrics:
- name: AVG_PAGES_VIEWED
description: Average number of pages viewed per session
expr: AVG(sessions.PAGES_VIEWED)
access_modifier: public_access
- name: AVG_SESSION_DURATION
description: Average session duration in minutes
expr: AVG(sessions.session_duration_min)
access_modifier: public_access
- name: TOTAL_SESSIONS
description: Total number of browsing sessions
expr: COUNT(sessions.SESSION_ID)
access_modifier: public_access
- name: UNIQUE_VISITORS
description: Count of distinct customers who visited
expr: COUNT(DISTINCT sessions.CUSTOMER_ID)
access_modifier: public_access
relationships:
- name: ORDERS_TO_CUSTOMERS
left_table: ORDERS
right_table: CUSTOMERS
relationship_columns:
- left_column: CUSTOMER_ID
right_column: CUSTOMER_ID
- name: ORDERS_TO_SESSIONS
left_table: ORDERS
right_table: SESSIONS
relationship_columns:
- left_column: SESSION_ID
right_column: SESSION_ID
- name: ORDER_ITEMS_TO_ORDERS
left_table: ORDER_ITEMS
right_table: ORDERS
relationship_columns:
- left_column: ORDER_ID
right_column: ORDER_ID
- name: ORDER_ITEMS_TO_PRODUCTS
left_table: ORDER_ITEMS
right_table: PRODUCTS
relationship_columns:
- left_column: PRODUCT_ID
right_column: PRODUCT_ID
- name: SESSIONS_TO_CUSTOMERS
left_table: SESSIONS
right_table: CUSTOMERS
relationship_columns:
- left_column: CUSTOMER_ID
right_column: CUSTOMER_ID
module_custom_instructions:
sql_generation: "Always use net_line_revenue for revenue calculations (accounts for discounts). Round monetary values to 2 decimal places. When asked about active or engaged customers, use customers who placed at least one delivered order. When asked about profitability, use line_profit which is net revenue minus cost. Conversion rate = orders / sessions."
question_categorization: "If the question is about inventory or stock levels, explain this model covers sales analytics only, not inventory management. If the question asks about a specific product without naming it, ask the user to specify the product name or category."
verified_queries:
- name: MONTHLY_REVENUE
sql: |-
SELECT
DATE_TRUNC('MONTH', o.ORDER_DATE) AS order_month,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue,
COUNT(DISTINCT o.ORDER_ID) AS order_count
FROM ECOMMERCE_DEMO.SHOP.ORDERS o
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.ORDER_ID = o.ORDER_ID
JOIN ECOMMERCE_DEMO.SHOP.PRODUCTS p ON p.PRODUCT_ID = oi.PRODUCT_ID
WHERE o.ORDER_STATUS = 'Delivered'
GROUP BY DATE_TRUNC('MONTH', o.ORDER_DATE)
ORDER BY order_month
question: What is our monthly revenue trend?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: true
- name: REVENUE_BY_CATEGORY
sql: |-
SELECT
p.CATEGORY,
SUM(oi.QUANTITY) AS units_sold,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue,
ROUND(SUM((p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)) - (p.UNIT_COST * oi.QUANTITY)), 2) AS profit
FROM ECOMMERCE_DEMO.SHOP.PRODUCTS p
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.PRODUCT_ID = p.PRODUCT_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.ORDER_ID = oi.ORDER_ID
WHERE o.ORDER_STATUS = 'Delivered'
GROUP BY p.CATEGORY
ORDER BY net_revenue DESC
question: What is the revenue breakdown by product category?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: true
- name: TOP_PRODUCTS
sql: |-
SELECT
p.PRODUCT_NAME,
p.CATEGORY,
p.BRAND,
SUM(oi.QUANTITY) AS units_sold,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue
FROM ECOMMERCE_DEMO.SHOP.PRODUCTS p
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.PRODUCT_ID = p.PRODUCT_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.ORDER_ID = oi.ORDER_ID
WHERE o.ORDER_STATUS = 'Delivered'
GROUP BY p.PRODUCT_NAME, p.CATEGORY, p.BRAND
ORDER BY net_revenue DESC
question: What are our top selling products by revenue?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: true
- name: CONVERSION_BY_CHANNEL
sql: |-
SELECT
s.TRAFFIC_SOURCE,
COUNT(DISTINCT s.SESSION_ID) AS total_sessions,
COUNT(DISTINCT o.ORDER_ID) AS orders_placed,
ROUND(COUNT(DISTINCT o.ORDER_ID) / NULLIF(COUNT(DISTINCT s.SESSION_ID), 0), 4) AS conversion_rate
FROM ECOMMERCE_DEMO.SHOP.SESSIONS s
LEFT JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.SESSION_ID = s.SESSION_ID
GROUP BY s.TRAFFIC_SOURCE
ORDER BY conversion_rate DESC
question: What is the conversion rate by traffic source?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: true
- name: REVENUE_BY_MEMBERSHIP
sql: |-
SELECT
c.MEMBERSHIP_TIER,
COUNT(DISTINCT c.CUSTOMER_ID) AS customer_count,
COUNT(DISTINCT o.ORDER_ID) AS order_count,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)) /
NULLIF(COUNT(DISTINCT c.CUSTOMER_ID), 0), 2) AS revenue_per_customer
FROM ECOMMERCE_DEMO.SHOP.CUSTOMERS c
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.CUSTOMER_ID = c.CUSTOMER_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.ORDER_ID = o.ORDER_ID
JOIN ECOMMERCE_DEMO.SHOP.PRODUCTS p ON p.PRODUCT_ID = oi.PRODUCT_ID
WHERE o.ORDER_STATUS = 'Delivered'
GROUP BY c.MEMBERSHIP_TIER
ORDER BY net_revenue DESC
question: How does revenue compare across membership tiers?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: false
- name: DEVICE_PERFORMANCE
sql: |-
SELECT
s.DEVICE_TYPE,
COUNT(DISTINCT s.SESSION_ID) AS sessions,
ROUND(AVG(TIMESTAMPDIFF('MINUTE', s.SESSION_START, s.SESSION_END)), 1) AS avg_duration_min,
ROUND(AVG(s.PAGES_VIEWED), 1) AS avg_pages,
COUNT(DISTINCT o.ORDER_ID) AS orders_placed,
ROUND(COUNT(DISTINCT o.ORDER_ID) / NULLIF(COUNT(DISTINCT s.SESSION_ID), 0), 4) AS conversion_rate
FROM ECOMMERCE_DEMO.SHOP.SESSIONS s
LEFT JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.SESSION_ID = s.SESSION_ID
GROUP BY s.DEVICE_TYPE
ORDER BY sessions DESC
question: How do sessions and orders break down by device type?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: false
- name: COUPON_IMPACT
sql: |-
SELECT
CASE WHEN o.COUPON_CODE IS NOT NULL THEN 'With Coupon' ELSE 'No Coupon' END AS coupon_used,
COUNT(DISTINCT o.ORDER_ID) AS order_count,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY), 2) AS gross_revenue,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * oi.DISCOUNT_PCT), 2) AS total_discount,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue
FROM ECOMMERCE_DEMO.SHOP.ORDERS o
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.ORDER_ID = o.ORDER_ID
JOIN ECOMMERCE_DEMO.SHOP.PRODUCTS p ON p.PRODUCT_ID = oi.PRODUCT_ID
WHERE o.ORDER_STATUS = 'Delivered'
GROUP BY CASE WHEN o.COUPON_CODE IS NOT NULL THEN 'With Coupon' ELSE 'No Coupon' END
ORDER BY net_revenue DESC
question: What is the impact of coupon usage on revenue and discounts?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: false
- name: PROFIT_MARGIN_BY_BRAND
sql: |-
SELECT
p.BRAND,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue,
ROUND(SUM(p.UNIT_COST * oi.QUANTITY), 2) AS total_cost,
ROUND(SUM((p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)) - (p.UNIT_COST * oi.QUANTITY)), 2) AS profit,
ROUND(SUM((p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)) - (p.UNIT_COST * oi.QUANTITY)) /
NULLIF(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 0) * 100, 1) AS margin_pct
FROM ECOMMERCE_DEMO.SHOP.PRODUCTS p
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.PRODUCT_ID = p.PRODUCT_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.ORDER_ID = oi.ORDER_ID
WHERE o.ORDER_STATUS = 'Delivered'
GROUP BY p.BRAND
ORDER BY profit DESC
question: What is the profit margin by brand?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: false
- name: COUNTRY_REVENUE
sql: |-
SELECT
c.COUNTRY,
COUNT(DISTINCT c.CUSTOMER_ID) AS customers,
COUNT(DISTINCT o.ORDER_ID) AS orders,
ROUND(SUM(p.UNIT_PRICE * oi.QUANTITY * (1 - oi.DISCOUNT_PCT)), 2) AS net_revenue
FROM ECOMMERCE_DEMO.SHOP.CUSTOMERS c
JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.CUSTOMER_ID = c.CUSTOMER_ID
JOIN ECOMMERCE_DEMO.SHOP.ORDER_ITEMS oi ON oi.ORDER_ID = o.ORDER_ID
JOIN ECOMMERCE_DEMO.SHOP.PRODUCTS p ON p.PRODUCT_ID = oi.PRODUCT_ID
WHERE o.ORDER_STATUS = 'Delivered'
GROUP BY c.COUNTRY
ORDER BY net_revenue DESC
question: What is the revenue by customer country?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: true
- name: RETURN_RATE
sql: |-
SELECT
COUNT(DISTINCT CASE WHEN ORDER_STATUS = 'Returned' THEN ORDER_ID END) AS returned_orders,
COUNT(DISTINCT ORDER_ID) AS total_orders,
ROUND(COUNT(DISTINCT CASE WHEN ORDER_STATUS = 'Returned' THEN ORDER_ID END) /
NULLIF(COUNT(DISTINCT ORDER_ID), 0) * 100, 2) AS return_rate_pct
FROM ECOMMERCE_DEMO.SHOP.ORDERS
question: What is our return rate?
verified_at: 1714700000
verified_by: (STEWARD = ecommerce_analytics)
use_as_onboarding_question: false

Test with SEMANTIC_VIEW() Queries

Revenue trend by month:

SELECT * FROM SEMANTIC_VIEW(
ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV
METRICS order_items.net_revenue, orders.total_orders
DIMENSIONS orders.order_month
) ORDER BY order_month;

Expected:

| NET_REVENUE | TOTAL_ORDERS | ORDER_MONTH |
| ----------- | ------------ | ----------- |
| 1248.81 | 7 | 2024-10-01 |
| 803.12 | 5 | 2024-11-01 |
| 957.39 | 3 | 2024-12-01 |

Product performance:

SELECT * FROM SEMANTIC_VIEW(
ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV
METRICS order_items.net_revenue, order_items.total_units_sold
DIMENSIONS products.product_name, products.category
) ORDER BY net_revenue DESC;

Smart Watch should lead at $1,387.44 across 6 units. If it doesn’t, the discount calculation or ORDER_ITEMS data has drifted somewhere.

Session engagement by device:

SELECT * FROM SEMANTIC_VIEW(
ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV
METRICS sessions.total_sessions, sessions.avg_session_duration, sessions.avg_pages_viewed
DIMENSIONS sessions.device_type
) ORDER BY total_sessions DESC;

Expected: Desktop leads with 11 sessions, avg ~38 min, avg ~11 pages.

Sessions and orders by traffic source:

SELECT * FROM SEMANTIC_VIEW(
ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV
METRICS sessions.total_sessions, orders.total_orders
DIMENSIONS sessions.traffic_source
) ORDER BY total_sessions DESC;

Expected: Google leads with 7 sessions and 5 orders.

Spot-Check a Verified Query Directly

Run query #4 (conversion by channel) directly against the base tables to confirm the LEFT JOIN and NULLIF logic produces the right results:

SELECT
s.TRAFFIC_SOURCE,
COUNT(DISTINCT s.SESSION_ID) AS total_sessions,
COUNT(DISTINCT o.ORDER_ID) AS orders_placed,
ROUND(COUNT(DISTINCT o.ORDER_ID) / NULLIF(COUNT(DISTINCT s.SESSION_ID), 0), 4) AS conversion_rate
FROM ECOMMERCE_DEMO.SHOP.SESSIONS s
LEFT JOIN ECOMMERCE_DEMO.SHOP.ORDERS o ON o.SESSION_ID = s.SESSION_ID
GROUP BY s.TRAFFIC_SOURCE
ORDER BY conversion_rate DESC;

Expected:

Instagram appearing at the bottom with 0 orders is exactly why the LEFT JOIN matters — an INNER JOIN would silently drop it from results entirely.

Part 4: AI Studio Optimization Walkthrough

Navigate to AI & ML → Cortex Analyst in the Snowsight left sidebar. Find ECOMMERCE_INSIGHTS_SV in the list and open it.

You’ll see four tabs: Tables, Model, Verified Queries, and Chat. Go to Verified Queries — all 10 should be listed, with the 5 ONBOARDING_QUESTION TRUE queries marked.

https://youtu.be/oNISRjG1GfU

Triggering Optimization

Look at the right panel. There’s a Suggestions section with a “Get more suggestions” button. Click it.

A dialog appears asking for a role and warehouse. Select ACCOUNTADMIN (or any role with SELECT on the underlying tables) and COMPUTE_WH. Click Start.

What happens behind the scenes: Cortex Analyst executes each verified query up to four times on the warehouse you selected, analyzes the SQL patterns, extracts reusable concepts, and generates suggestions for improving the semantic view. With 10 queries, expect 5–15 minutes.

What I Actually Got

After optimization completed, the results dashboard showed:

40% overall SQL accuracy — measured by temporarily removing the exact SQL from each verified query and testing whether Cortex Analyst could reconstruct them from the semantic model structure alone.

That measurement matters. The optimizer isn’t testing rote memorization. It’s testing whether the model is rich enough — through descriptions, synonyms, relationships, and facts — that an LLM can independently arrive at the correct SQL for your business definitions.

The suggestions panel populated with 9 suggestions across 5 categories:

  • 3 description improvements — richer natural-language docs for metrics like net_line_revenue
  • 2 new metrics — orders.distinct_order_count, sessions.distinct_session_count
  • 1 filter — a reusable delivered-orders filter extracted from the WHERE ORDER_STATUS = 'Delivered' pattern that appeared in 8 of the 10 queries
  • 7 new synonyms — alternate names for frequently referenced dimensions
  • 6 custom instruction lines — refined AI_SQL_GENERATION guidance

The enhanced description for net_line_revenue was the most practically useful output. The optimizer generated something like: "Calculates total revenue after applying discounts on delivered orders. Use for accurate revenue reporting and financial analysis as it accounts for promotional discounts and excludes undelivered orders. Essential for monthly revenue trends and profitability analysis." That's documentation that's easy to skip when you're building the model and genuinely valuable when someone else inherits it later.

The Naming Conflict You’ll Hit

One of the 9 suggested metrics was order_items.NET_LINE_REVENUE. If you click Accept, you'll get:

Invalid semantic model yaml
SQL compilation error: Duplicate expression name 'ORDER_ITEMS.NET_LINE_REVENUE'.

Why: net_line_revenue already exists as a fact in the model. Facts and metrics share the same expression namespace — duplicates fail at compile time. The optimizer doesn't check for conflicts with existing facts when generating metric suggestions.

Fix: dismiss this specific suggestion. We already have order_items.net_revenue as a metric (SUM(net_line_revenue)) which covers the same intent. The enhanced description the optimizer generated for it is the actually useful piece — accept that independently if AI Studio surfaces it separately.

The lesson: never click “Accept All.” Review each suggested metric against your existing facts and metrics for naming overlap before accepting.

What to Accept vs. Skip

| Action    | Suggestion Type                              | Reason                                          |
| --------- | -------------------------------------------- | ----------------------------------------------- |
| ✅ Accept | Enhanced descriptions | Better documentation = better AI understanding |
| ✅ Accept | New synonyms | More natural phrasings for the same concepts |
| ✅ Accept | Custom instruction lines | Refined AI behavior |
| ✅ Accept | Delivered-orders filter | Reusable pattern extracted from 8 of 10 queries |
| ✅ Accept | distinct_order_count, distinct_session_count | Genuinely useful new metrics |
| ⚠️ Review | Other suggested metrics | Check for naming conflicts before accepting |
| ❌ Dismiss | NET_LINE_REVENUE metric | Duplicate of existing fact – will error on save |

Verify the Updated View

After saving, rerun the inspection queries to confirm the new metrics and descriptions are in place:

DESCRIBE SEMANTIC VIEW ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV;

SHOW SEMANTIC METRICS IN ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV;

SELECT SYSTEM$READ_YAML_FROM_SEMANTIC_VIEW('ECOMMERCE_DEMO.SHOP.ECOMMERCE_INSIGHTS_SV');

Part 5: Testing the Generalization

Switch to the Chat tab and try questions that don’t exactly match any verified query. This is the real test — can Cortex Analyst combine learned concepts to answer novel questions?

| Novel Question                          | Pattern Combination                                  |
| --------------------------------------- | ---------------------------------------------------- |
| Avg revenue per Gold customer? | Revenue calculation (#1) + Customer tier filter (#5) |
| Which brand has the highest margin? | Margin % formula (#8) |
| Desktop conversion rate? | Conversion definition (#4) + Device filter (#6) |
| Revenue from coupon orders in December? | Coupon logic (#7) + Monthly time filter (#1) |
| Return rate by country? | Return rate pattern (#10) + Geographic join (#9) |
| Top category by profit? | Profit calculation (#2) |

10 verified queries teaching distinct concepts → a much larger space of answerable combinations.

The Feedback Loop

This is the part that compounds over time.

Users ask questions in Chat / Playground

Cortex Analyst suggests follow-up questions (Suggestions panel)

Your team verifies the SQL and promotes them to verified queries

Run "Get more suggestions" → optimizer learns new patterns

Broader coverage, better accuracy, better question suggestions
↓ (repeat)

Each cycle improves the system without touching your data pipeline. The verified query registry becomes a living document of how your business defines its metrics — owned by the data team, readable by everyone.

Quick Reference: Best Practices

Semantic View Design

  • Add COMMENT on every element — the optimizer uses these to generate improved descriptions
  • Use WITH SYNONYMS for alternate names users might naturally say ("region" for country, "tier" for membership level)
  • Mark internal helper calculations as PRIVATE so they don't surface as directly queryable elements
  • Use AI_SQL_GENERATION to lock in calculation rules that should always apply
  • Use AI_QUESTION_CATEGORIZATION to handle out-of-scope questions gracefully rather than letting the model guess

Verified Queries

  • Cover diverse patterns and join paths — variety delivers far more value than volume
  • Encode your business logic explicitly in SQL; don’t leave filter conditions implicit
  • Use NULLIF() for safe division wherever you compute rates or per-unit averages
  • Use fully qualified table names (ECOMMERCE_DEMO.SHOP.ORDERS, not just ORDERS)
  • Keep the total under 20; the optimizer runs best with 5–15 queries
  • Mark 4–5 as ONBOARDING_QUESTION TRUE — enough to orient a new user, not so many it overwhelms

Reviewing Optimization Suggestions

  • Always review before accepting — check proposed metrics against existing facts for naming conflicts
  • Accept descriptions, synonyms, and custom instructions freely (low risk, high value)
  • Test in Chat/Playground after accepting to verify the improvements hold
  • Never click “Accept All” — the duplicate name error is real and will break the model on save

Summary

| Component           | Detail                                                                                                |
| ------------------- | ----------------------------------------------------------------------------------------------------- |
| Database | ECOMMERCE_DEMO.SHOP |
| Tables | 5 tables (~80 rows) |
| Facts | 8 (including 2 PRIVATE helper flags) |
| Dimensions | 14 |
| Metrics | 16 |
| Verified Queries | 10 (5 onboarding, 5 optimization) |
| AI Directives | 2 (AI_SQL_GENERATION, AI_QUESTION_CATEGORIZATION) |
| Optimization Output | 40% SQL accuracy; 9 suggestions: 3 descriptions, 2 metrics, 1 filter, 7 synonyms, 6 instruction lines |

What each verified query encodes:

| #  | Question               | Concept Taught                                                |
| -- | ---------------------- | ------------------------------------------------------------- |
| 1 | Monthly revenue trend | Revenue = net after discount; Delivered orders only |
| 2 | Revenue by category | Profit = net revenue − COGS |
| 3 | Top-selling products | Ranking with multi-column GROUP BY |
| 4 | Conversion by channel | Conversion = orders / sessions; LEFT JOIN for zero-converters |
| 5 | Revenue by tier | Per-customer metrics with NULLIF safe division |
| 6 | Device performance | Session engagement + conversion combined |
| 7 | Coupon impact | Coupon = IS NOT NULL; gross vs net comparison |
| 8 | Profit margin by brand | Margin % = profit / revenue × 100 |
| 9 | Revenue by country | Geographic segmentation via customer join |
| 10 | Return rate | Rate = conditional COUNT / total × 100 |

The gap between “it works in the sandbox” and “it consistently answers business questions correctly” is exactly where verified query optimization operates. If your Cortex Analyst deployment is giving inconsistent answers on revenue or conversion questions, the verified query registry is where to start.

Disclaimer: Cortex Analyst AI Studio UI, verified query optimization flow, and suggestion panels reflect the Snowflake setup at the time of writing. Snowflake updates frequently — labels, tabs, and features may vary. Refer to official documentation for the latest behavior. Views expressed are my own and do not reflect those of my current or past employer.

👏 Give it a clap if it added value
🔗 Share it with your team
➕ Follow for more
📘 Medium: @SnowflakeChronicles
🔗 LinkedIn: satishkumar-snowflake

See you in the next one! 👋

#Snowflake #CortexAnalyst #DataEngineering #SemanticLayer #AIAnalytics #SnowflakeChronicles #NaturalLanguageSQL #DataGovernance #CloudDataWarehouse #SnowflakeAI


Optimizing Semantic Views with Verified Queries in Snowflake Cortex Analyst AI Studio was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top