
dbt Cloud

Official

Remote MCP server for dbt Cloud AI features including Semantic Layer, SQL, and Discovery tools

Data & analytics · 24 tools · Auth: API key

Tools (24)

get_mart_models

Get the name and description of all mart models in the environment. A mart model is part of the presentation layer of the dbt project. It's where cleaned, transformed data is organized for consumption by end-users, like analysts, dashboards, or business tools.

get_all_models

Get the name and description of all dbt models in the environment.

get_model_details

<instructions> Retrieves information about a specific dbt model, including compiled SQL, description, database, schema, alias, and column details. IMPORTANT: Use uniqueId when available. - Using uniqueId guarantees the correct model is retrieved - Using only name may return incorrect results or fail entirely - If you obtained models via get_all_models(), you should always use the uniqueId from those results </instructions> <parameters> uniqueId: The unique identifier of the model (format: "model.project_name.model_name"). STRONGLY RECOMMENDED when available. name: The name of the dbt model. Only use this when uniqueId is unavailable. </parameters> <examples> 1. PREFERRED METHOD - Using uniqueId (always use this when available): get_model_details(uniqueId="model.my_project.customer_orders") 2. FALLBACK METHOD - Using only name (only when uniqueId is unknown): get_model_details(name="customer_orders") </examples>
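
The uniqueId-first pattern above can be sketched as a small client-side helper: resolve a model name to its uniqueId from get_all_models() output before calling get_model_details. This is an illustrative sketch; the `name`/`uniqueId` response keys are assumed from the descriptions in this catalog.

```python
# Hypothetical helper: prefer uniqueId over name, as the description advises.
# Assumes get_all_models() returns dicts with "name" and "uniqueId" keys.

def resolve_unique_id(all_models, name):
    """Return the uniqueId for `name`, or None if the name is absent or ambiguous."""
    matches = [m["uniqueId"] for m in all_models if m.get("name") == name]
    return matches[0] if len(matches) == 1 else None
```

Returning None on an ambiguous name forces the caller to disambiguate rather than silently picking the wrong model.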

get_model_parents

<instructions> Retrieves the parent models of a specific dbt model. These are the models that the specified model depends on. You can provide either a name or a uniqueId, if known, to identify the model. Using uniqueId is more precise and guarantees a unique match, which is especially useful when models might have the same name in different projects. Returned parents include `resourceType`, `name`, and `description`. For upstream sources, also provide `sourceName` and `uniqueId` so lineage can be linked back via `get_all_sources`. This is specifically for retrieving model parents from the production manifest. If you want development lineage, use `get_model_lineage_dev` instead. </instructions> <parameters> name: The name of the dbt model to retrieve parents for. uniqueId: The unique identifier of the model. If provided, this will be used instead of name for a more precise lookup. You can get the uniqueId values for all models from the get_all_models() tool. </parameters> <examples> 1. Getting parents for a model by name: get_model_parents(name="customer_orders") 2. Getting parents for a model by uniqueId (more precise): get_model_parents(name="customer_orders", uniqueId="model.my_project.customer_orders") 3. Getting parents using only uniqueId: get_model_parents(uniqueId="model.my_project.customer_orders") </examples>

get_model_children

<instructions> Retrieves the child models (downstream dependencies) of a specific dbt model. These are the models that depend on the specified model. You can provide either a name or a uniqueId, if known, to identify the model. Using uniqueId is more precise and guarantees a unique match, which is especially useful when models might have the same name in different projects. This is specifically for retrieving model children from the production manifest. If you want development lineage, use `get_model_lineage_dev` instead. </instructions> <parameters> name: The name of the dbt model to retrieve children for. uniqueId: The unique identifier of the model. If provided, this will be used instead of name for a more precise lookup. You can get the uniqueId values for all models from the get_all_models() tool. </parameters> <examples> 1. Getting children for a model by name: get_model_children(name="customer_orders") 2. Getting children for a model by uniqueId (more precise): get_model_children(name="customer_orders", uniqueId="model.my_project.customer_orders") 3. Getting children using only uniqueId: get_model_children(uniqueId="model.my_project.customer_orders") </examples>
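
Taken together, get_model_parents and get_model_children describe a dependency graph, so a client can walk the full upstream lineage by calling the parent lookup recursively. A minimal sketch, where `get_parents` stands in for a get_model_parents call and the returned nodes are assumed to carry a `uniqueId`:

```python
# Illustrative recursive walk over parent lookups; the exact return shape
# of get_model_parents is an assumption based on this catalog.

def upstream_lineage(get_parents, unique_id, seen=None):
    """Collect the uniqueIds of every transitive parent of `unique_id`."""
    seen = set() if seen is None else seen
    for parent in get_parents(unique_id):
        pid = parent["uniqueId"]
        if pid not in seen:
            seen.add(pid)
            upstream_lineage(get_parents, pid, seen)
    return seen
```

The `seen` set both deduplicates shared ancestors and guards against revisiting nodes in a diamond-shaped DAG.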

get_model_health

<instructions> Retrieves information about the health of a dbt model, including the last time it ran, the last test execution status, and whether the upstream data for the model is fresh. IMPORTANT: Use uniqueId when available. - Using uniqueId guarantees the correct model is retrieved - Using only name may return incorrect results or fail entirely - If you obtained models via get_all_models(), you should always use the uniqueId from those results ASSESSING MODEL HEALTH: For all of the below, summarize whether the model is healthy, questionable, or unhealthy. Only provide more details when asked. - for the model executionInfo, if the lastRunStatus is "success" consider the model healthy - for the test executionInfo, if the lastRunStatus is "success" consider the model healthy - for the model's parents: -- check the modelExecutionInfo, snapshotExecutionInfo, and seedExecutionInfo. If the lastRunStatus is "success" consider the model healthy. If the lastRunStatus is "error" consider the model unhealthy. -- if the parent node is a SourceAppliedStateNestedNode: --- If the freshnessStatus is "pass", consider the model healthy --- If the freshnessStatus is "fail", consider the model unhealthy --- If the freshnessStatus is null, consider the model health questionable --- If the freshnessStatus is "warn", consider the model health questionable </instructions> <parameters> uniqueId: The unique identifier of the model (format: "model.project_name.model_name"). STRONGLY RECOMMENDED when available. name: The name of the dbt model. Only use this when uniqueId is unavailable. </parameters> <examples> 1. PREFERRED METHOD - Using uniqueId (always use this when available): get_model_health(uniqueId="model.my_project.customer_orders") 2. FALLBACK METHOD - Using only name (only when uniqueId is unknown): get_model_health(name="customer_orders") </examples>
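
The ASSESSING MODEL HEALTH rules amount to a small decision procedure. A sketch of how a client might classify one parent node from a get_model_health response; the field names (`freshnessStatus`, `executionInfo.lastRunStatus`) are assumptions drawn from the rules in the description above:

```python
# Maps one parent node from a get_model_health response to a verdict.
# Node shapes are assumptions based on the rules in the tool description.

def classify_node(node):
    """Return "healthy", "questionable", or "unhealthy" for a parent node."""
    status = node.get("freshnessStatus", "__absent__")
    if status != "__absent__":
        # Source-style node (SourceAppliedStateNestedNode): judge by freshness.
        if status == "pass":
            return "healthy"
        if status == "fail":
            return "unhealthy"
        return "questionable"  # null or "warn"
    # Model / snapshot / seed node: judge by last run status.
    run = (node.get("executionInfo") or {}).get("lastRunStatus")
    if run == "success":
        return "healthy"
    if run == "error":
        return "unhealthy"
    return "questionable"
```

A whole-model summary could then take the worst verdict across the model itself, its tests, and all parents.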

get_exposures

Get the id, name, description, and url of all exposures in the dbt environment. Exposures represent downstream applications or analyses that depend on dbt models. Returns information including: - uniqueId: The unique identifier for this exposure that can then be used to get more details about the exposure - name: The name of the exposure - description: Description of the exposure - url: URL associated with the exposure

get_exposure_details

Get detailed information about one or more exposures by name or unique ID. Parameters: - unique_id (optional): unique ID(s) of the exposure(s) to fetch (e.g., ["exposure.project.exposure1", "exposure.project.exposure2"]); more efficient, since it uses a GraphQL filter. Returns a list of detailed information dictionaries, each including: - name: The name of the exposure - description: Detailed description of the exposure - exposureType: Type of exposure (application, dashboard, analysis, etc.) - maturity: Maturity level of the exposure (high, medium, low) - ownerName: Name of the exposure owner - ownerEmail: Email of the exposure owner - url: URL associated with the exposure - label: Optional label for the exposure - parents: List of parent models/sources that this exposure depends on - meta: Additional metadata associated with the exposure - freshnessStatus: Current freshness status of the exposure - uniqueId: The unique identifier for this exposure Example usage: - Get an exposure by unique ID: get_exposure_details(unique_id="exposure.analytics.customer_dashboard")

get_all_sources

Get the name, description, and metadata of all dbt sources in the environment. Sources represent external data tables that your dbt models build upon. Parameters (all optional): - source_names: List of specific source names to filter by (e.g., ['raw_data', 'external_api']) - unique_ids: List of specific source table IDs to filter by Note: - source_names correspond to the top-level source grouping in the source YML config - unique_ids have the form `source.{YOUR-DBT-PROJECT}.{SOURCE-NAME}.{SOURCE-TABLE}` Returns information including: - name: The table name within the source - uniqueId: The unique identifier for this source table - identifier: The underlying table identifier in the warehouse - description: Description of the source table - sourceName: The source name (e.g., 'raw_data', 'external_api') - database: Database containing the source table - schema: Schema containing the source table - resourceType: Will be 'source' - freshness: Real-time freshness status from production including: - maxLoadedAt: When the source was last loaded - maxLoadedAtTimeAgoInS: How long ago the source was loaded (in seconds) - freshnessStatus: Current freshness status (e.g., 'pass', 'warn', 'error')
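
Because get_all_sources reports `maxLoadedAtTimeAgoInS` per source table, a client can flag stale sources with a simple filter. An illustrative sketch, assuming the nested `freshness` shape listed in the description above:

```python
# Hypothetical staleness filter over get_all_sources output; the nested
# "freshness" structure follows the fields listed in the tool description.

def stale_sources(sources, max_age_s):
    """Return uniqueIds of sources last loaded more than max_age_s seconds ago."""
    return [
        s["uniqueId"]
        for s in sources
        if (s.get("freshness") or {}).get("maxLoadedAtTimeAgoInS", 0) > max_age_s
    ]
```

For example, `max_age_s=86400` surfaces every source that has not loaded in the last day, regardless of its configured freshness thresholds.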

get_source_details

<instructions> Retrieves detailed information about a specific dbt source, including database, schema, identifier, freshness status, and column-level details (name, type, description). IMPORTANT: Use uniqueId when available. - Using uniqueId guarantees the correct source table is retrieved - Using only name may return incorrect results if multiple sources have tables with the same name - If you obtained sources via get_all_sources(), you should always use the uniqueId from those results </instructions> <parameters> uniqueId: The unique identifier of the source (format: "source.project_name.source_name.table_name"). STRONGLY RECOMMENDED when available. name: The table name within the source. Only use this when uniqueId is unavailable. </parameters> <examples> 1. PREFERRED METHOD - Using uniqueId (always use this when available): get_source_details(uniqueId="source.my_project.raw_data.customers") 2. FALLBACK METHOD - Using only name (only when uniqueId is unknown): get_source_details(name="customers") </examples>

get_macro_details

<instructions> Fetch detailed metadata for a dbt macro. - Supply at least one of `unique_id` or `name`. Call will fail if both are missing. - Always prefer providing `unique_id` for exact matches. </instructions> <examples> 1. Fetch a macro by unique_id: get_macro_details(unique_id="macro.dbt_utils.date_spine") 2. Fetch a macro when only the node name is known: get_macro_details(name="date_spine") </examples>

get_seed_details

<instructions> Fetch detailed metadata for a dbt seed. - Supply at least one of `unique_id` or `name`. Call will fail if both are missing. - Always prefer providing `unique_id` for exact matches. </instructions> <examples> 1. Fetch a seed by unique_id: get_seed_details(unique_id="seed.analytics.customers") 2. Fetch a seed when only the node name is known: get_seed_details(name="customers") </examples>

get_semantic_model_details

<instructions> Fetch detailed metadata for a dbt MetricFlow semantic model. - Supply at least one of `unique_id` or `name`. Call will fail if both are missing. - Always prefer providing `unique_id` for exact matches. </instructions> <examples> 1. Fetch a semantic model by unique_id: get_semantic_model_details(unique_id="semantic_model.analytics.customers") 2. Fetch a semantic model when only the node name is known: get_semantic_model_details(name="customers") </examples>

get_snapshot_details

<instructions> Fetch detailed metadata for a dbt snapshot. - Supply at least one of `unique_id` or `name`. Call will fail if both are missing. - Always prefer providing `unique_id` for exact matches. </instructions> <examples> 1. Fetch a snapshot by unique_id: get_snapshot_details(unique_id="snapshot.analytics.customers") 2. Fetch a snapshot when only the node name is known: get_snapshot_details(name="customers") </examples>

get_test_details

<instructions> Fetch detailed metadata for a dbt test. - Supply at least one of `unique_id` or `name`. Call will fail if both are missing. - Always prefer providing `unique_id` for exact matches. </instructions> <examples> 1. Fetch a test by unique_id: get_test_details(unique_id="test.analytics.is_empty_string") 2. Fetch a test when only the node name is known: get_test_details(name="is_empty_string") </examples>

get_related_models

Search the current environment for dbt models that are semantically similar to the given query. The search will consider model name, description, columns, and raw SQL when matching for semantic similarity.

list_metrics

List all metrics from the dbt Semantic Layer. If the user is asking a data-related or business-related question, this tool should be used as a first step to get a list of metrics that can be used with other tools to answer the question. Examples: - "What are the top 5 products by revenue?" - "How many users did we have last month?" <parameters> search: Optional string used to filter metrics by name using partial matches </parameters>

list_saved_queries

List all saved queries from the dbt Semantic Layer. Saved queries are pre-defined queries that have been saved in the Semantic Layer. They contain specific metrics, group-by dimensions, and filters that are commonly used. This tool helps discover what predefined queries are available for quick execution. Use this tool when: - The user wants to see what pre-built queries are available - The user mentions "saved queries" or "predefined queries" - You need to find commonly used analytical queries Examples: - "What saved queries are available?" - "Show me the predefined reports" - "List all saved queries for revenue analysis" <parameters> search: Optional string used to filter saved queries by name, label, or description using partial matches </parameters>

get_dimensions

<instructions> Get the dimensions for specified metrics Dimensions are the attributes, features, or characteristics that describe or categorize data. </instructions> <examples> <example> Question: "I want to analyze revenue trends - what dimensions are available?" Thinking step-by-step: - Using list_metrics(), I find "revenue" is available - Now I can get the dimensions for this metric - The search parameter is not needed here since the user is interested in all available dimensions. Parameters: metrics=["revenue"] search=null </example> <example> Question: "Are there any time-related dimensions for my sales metrics?" Thinking step-by-step: - Using list_metrics(), I find "total_sales" and "average_order_value" are available - The user is interested in time dimensions specifically - I should use the search parameter to filter for dimensions with "time" in the name - This will narrow down the results to just time-related dimensions Parameters: metrics=["total_sales", "average_order_value"] search="time" </example> </examples> <parameters> metrics: List of metric names search: Optional string used to filter dimensions by name using partial matches (only use when absolutely necessary as some dimensions might be missed due to specific naming styles) </parameters>

get_entities

<instructions> Get the entities for specified metrics Entities are real-world concepts in a business such as customers, transactions, and ad campaigns. Analysis is often focused around specific entities, such as customer churn or annual recurring revenue modeling. </instructions> <examples> <example> Question: "I want to analyze revenue - what entities are available?" Thinking step-by-step: - Using list_metrics(), I find "revenue" is available - Now I can get the entities for this metric - The search parameter is not needed here since the user is interested in all available entities. Parameters: metrics=["revenue"] search=null </example> <example> Question: "Are there any customer-related entities for my sales metrics?" Thinking step-by-step: - Using list_metrics(), I find "total_sales" and "average_order_value" are available - The user is interested in customer entities specifically - I should use the search parameter to filter for entities with "customer" in the name - This will narrow down the results to just customer-related entities Parameters: metrics=["total_sales", "average_order_value"] search="customer" </example> </examples> <parameters> metrics: List of metric names search: Optional string used to filter entities by name using partial matches (only use when absolutely necessary as some entities might be missed due to specific naming styles) </parameters>

query_metrics

<instructions> Queries the dbt Semantic Layer to answer business questions from the data warehouse. This tool allows ordering and grouping by dimensions and entities. To use this tool, you must first know about specific metrics, dimensions and entities to provide. You can call the list_metrics, get_dimensions, and get_entities tools to get information about which metrics, dimensions, and entities to use. When using the `order_by` parameter, you must ensure that the dimension or entity also appears in the `group_by` parameter. When fulfilling a lookback query, prefer using order_by and limit instead of using the where parameter. A lookback query requires that the `order_by` parameter includes a descending order for a time dimension. The `where` parameter should be database-agnostic SQL syntax; however, dimensions and entities are referenced differently. For categorical dimensions, use `{{ Dimension('<name>') }}` and for time dimensions add the grain like `{{ TimeDimension('<name>', '<grain>') }}`. For entities, use `{{ Entity('<name>') }}`. When referencing dates in the `where` parameter, only use the format `yyyy-mm-dd`. Don't call this tool if the user's question cannot be answered with the provided metrics, dimensions, and entities. Instead, clarify what metrics, dimensions, and entities are available and suggest a new question that can be answered and is approximately the same as the user's question. For queries that may return large amounts of data, it's recommended to use a two-step approach: 1. First make a query with a small limit to verify the results are what you expect 2. Then make a follow-up query without a limit (or with a larger limit) to get the full dataset IMPORTANT: Do the below if the GET_MODEL_HEALTH tool is enabled. When responding to user requests to pull metrics data, check the health of the dbt models that are the parents of the dbt semantic models.
Use the instructions from the "ASSESSING MODEL HEALTH" section of the get_model_health() prompts to do this. </instructions> <examples> <example> Question: "What were our total sales last month?" Thinking step-by-step: - I know "total_sales" is the metric I need - I know "metric_time" is a valid dimension for this metric and supports MONTH grain - I need to group by metric_time to get monthly data - Since this is time-based data, I should order by metric_time. I am also grouping by metric_time, so this is valid. - The user is asking for a lookback query, so I should set descending to true so the most recent month is at the top of the results. - The user is asking for just the last month, so I should limit to 1 month of data Parameters: metrics=["total_sales"] group_by=[{"name": "metric_time", "grain": "MONTH", "type": "time_dimension"}] order_by=[{"name": "metric_time", "descending": true}] limit=1 </example> <example> Question: "Show me our top customers by revenue in the last quarter" Thinking step-by-step: - First, I need to find the revenue metric - Using list_metrics(), I find "revenue" is available - I need to check what dimensions are available for revenue - Using get_dimensions(["revenue"]), I see "customer_name" and "metric_time" supports QUARTER grain - I need to check what entities are available - Using get_entities(["revenue"]), I confirm "customer" is an entity - I need quarterly time grain - Since this is time-based data, I should order by metric_time. I am grouping by metric_time, so this is valid. This is a lookback query, so I should set descending to true. - I should also order by revenue to see top customers. I am grouping by revenue, so this is valid. The user is asking for the highest revenue customers, so I should set descending to true. 
- I should limit to top 5 results to verify the query works Parameters: metrics=["revenue"] group_by=[{"name": "customer_name", "type": "dimension"}, {"name": "metric_time", "grain": "QUARTER", "type": "time_dimension"}] order_by=[{"name": "metric_time", "descending": true}, {"name": "revenue", "descending": true}] limit=5 Follow-up Query (after verifying results): metrics=["revenue"] group_by=[{"name": "customer_name", "type": "dimension"}, {"name": "metric_time", "grain": "QUARTER", "type": "time_dimension"}] order_by=[{"name": "metric_time", "descending": true}, {"name": "revenue", "descending": true}] limit=null </example> <example> Question: "What's our average order value by product category for orders over $100?" Thinking step-by-step: - I know "average_order_value" is the metric I need - I know "product_category" is a valid dimension - I need to filter for orders over $100 - No time dimension needed - I should first limit results to verify the query works Parameters (initial query): metrics=["average_order_value"] group_by=[{"name": "product_category", "type": "dimension", "grain": null}] where="{{ Dimension('order_value') }} > 100" limit=10 Follow-up Query (after verifying results): metrics=["average_order_value"] group_by=[{"name": "product_category", "type": "dimension", "grain": null}] where="{{ Dimension('order_value') }} > 100" limit=null </example> <example> Question: "How many new users did we get each week last year?" 
Thinking step-by-step: - First, I need to find the new users metric - Using list_metrics(), I find "new_users" is available - I need to check what dimensions are available - Using get_dimensions(["new_users"]), I see "metric_time" supports WEEK grain - I need to check what entities are available - Using get_entities(["new_users"]), I confirm "user" is an entity - I need weekly time grain - I need to group by metric_time - Since this is time-based data, I should order by metric_time to show progression - I need to filter for the previous year's data using proper time dimension syntax - Should first get a few weeks to verify the query works Parameters (initial query): metrics=["new_users"] group_by=[{"name": "metric_time", "grain": "WEEK", "type": "time_dimension"}] order_by=[{"name": "metric_time", "descending": false}] where="{{ TimeDimension('metric_time', 'WEEK') }} >= '2023-01-01' AND {{ TimeDimension('metric_time', 'WEEK') }} < '2024-01-01'" limit=4 Follow-up Query (after verifying results): metrics=["new_users"] group_by=[{"name": "metric_time", "grain": "WEEK", "type": "time_dimension"}] order_by=[{"name": "metric_time", "descending": false}] where="{{ TimeDimension('metric_time', 'WEEK') }} >= '2023-01-01' AND {{ TimeDimension('metric_time', 'WEEK') }} < '2024-01-01'" limit=null </example> <example> Question: "What's our customer satisfaction score by region?" 
Thinking step-by-step: - First, I need to check if we have a customer satisfaction metric - Using list_metrics(), I find we don't have a direct "customer_satisfaction" metric - I should check what metrics we do have that might be related - I see we have "net_promoter_score" and "customer_retention_rate" - I should inform the user that we don't have a direct customer satisfaction metric - I can suggest using NPS as a proxy for customer satisfaction Response to user: "I don't have a direct customer satisfaction metric, but I can show you Net Promoter Score (NPS) by region, which is often used as a proxy for customer satisfaction. Would you like to see that instead?" If user agrees, then: Parameters: metrics=["net_promoter_score"] group_by=[{"name": "region", "type": "dimension", "grain": null}] order_by=[{"name": "net_promoter_score", "descending": true}] limit=10 </example> </examples> <parameters> metrics: List of metric names (strings) to query for. group_by: Optional list of objects with name (string), type ("dimension" or "time_dimension"), and grain (string or null for time dimensions only). order_by: Optional list of objects with name (string) and descending (boolean, default false). where: Optional SQL WHERE clause (string) to filter results. limit: Optional limit (integer) for number of results. </parameters>
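
The lookback pattern running through the examples above (order the time dimension descending, keep it in group_by, cap with limit) can be captured as a small parameter builder. A sketch; the parameter shapes mirror the query_metrics examples in this description:

```python
# Builds query_metrics parameters for an N-period lookback. Note that the
# order_by dimension also appears in group_by, as the description requires.

def lookback_params(metric, grain, periods):
    """Parameters for the most recent `periods` values of `metric` at `grain`."""
    time_dim = {"name": "metric_time", "grain": grain, "type": "time_dimension"}
    return {
        "metrics": [metric],
        "group_by": [time_dim],
        "order_by": [{"name": "metric_time", "descending": True}],
        "limit": periods,  # start small, then re-query with a larger limit
    }
```

For instance, `lookback_params("total_sales", "MONTH", 1)` reproduces the "total sales last month" example: descending metric_time with `limit=1` puts the most recent month at the top.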

get_metrics_compiled_sql

<instructions> Gets compiled SQL for given metrics and dimensions/entities from the dbt Semantic Layer. This tool generates the underlying SQL that would be executed for a given metric query by the `query_metrics` tool, without actually running the query. This is useful for understanding what SQL is being generated, debugging query issues, or getting SQL to run elsewhere. To use this tool, you must first know about specific metrics, dimensions and entities to provide. You can call the list_metrics, get_dimensions, and get_entities tools to get information about which metrics, dimensions, and entities to use. When using the `group_by` parameter, ensure that the dimensions and entities you specify are valid for the given metrics. Time dimensions can include grain specifications (e.g., MONTH, DAY, YEAR). The tool will return the compiled SQL that the dbt Semantic Layer would generate to calculate the specified metrics with the given groupings. Don't call this tool if the user's question cannot be answered with the provided metrics, dimensions, and entities. Instead, clarify what metrics, dimensions, and entities are available and suggest a new question that can be answered and is approximately the same as the user's question. This tool is particularly useful when: - Users want to see the underlying SQL for a metric calculation - Debugging complex metric definitions - Understanding how grouping affects the generated SQL - Getting SQL to run in other tools or systems </instructions> Returns the compiled SQL as a string, or an error message if the compilation fails. <parameters> metrics: List of metric names (strings) to query for. group_by: Optional list of objects with name (string), type ("dimension" or "time_dimension"), and grain (string or null for time dimensions only). order_by: Optional list of objects with name (string) and descending (boolean, default false). where: Optional SQL WHERE clause (string) to filter results. 
limit: Optional limit (integer) for number of results. </parameters>

text_to_sql

Generate a SQL query from a natural language description.

execute_sql

This is a tool used to run arbitrary dbt SQL against the user's data warehouse. <sql_generation> Guidelines: - Dialect Adherence: Use correct syntax and functions for the specified SQL dialect only. - Strict Schema Awareness: Only use tables and columns explicitly provided or referenced in prior turns. - If a model or column is not in your provided context, you must not guess. - Optimization: Write efficient, readable SQL—avoid unnecessary joins, subqueries, or row-by-row logic. - Contextual Reasoning: Use prior conversation turns to refine understanding and query generation. - Formatting: Follow the style guide provided. - dbt Naming: Always use "dbt" in lowercase—even at the start of a sentence. Replace "DBT" with "dbt". - dbt Referencing: - The assistant operates in a dbt environment where context refers strictly to dbt models, not external sources. - **CRITICAL: Always use ref() when referencing dbt models - this is non-negotiable.** - Never use source(), do not assume or fabricate source references. - Prefer referencing a dbt model by its name instead of using its compiled sql - Minimal changes: When provided with pre-existing SQL, make minimal changes to fulfill the user's request. Avoid changes that the user hasn't referenced. Always generate SQL that is correct, performant, secure, and aligned with user intent. Important: If a schema or column is not explicitly provided or confirmed by the user or previous context, do not include it in any SQL output. Instead, guide the user to provide or clarify the necessary structure. 
</sql_generation> ### Examples of CORRECT SQL #### Used ref() for dbt models ```sql SELECT * FROM {{ ref('customers') }} SELECT * FROM {{ ref('orders') }} ``` #### Using ref() in more complex SQL ```sql SELECT c.customer_id, o.order_date FROM {{ ref('customers') }} c JOIN {{ ref('orders') }} o ON c.customer_id = o.customer_id ``` ### Examples of WRONG SQL #### Never reference tables directly without ref() ```sql SELECT * FROM customers SELECT * FROM schema.orders ``` #### Never use source() unless explicitly working with sources ```sql SELECT * FROM {{ source('raw', 'customers') }} ```
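
The ref()-only rule above lends itself to a quick lint before submitting SQL to execute_sql. A rough sketch (a regex heuristic, not a SQL parser — it accepts any Jinja expression after FROM/JOIN, so it would not catch a forbidden `source()` call on its own):

```python
import re

# Flags FROM/JOIN targets that are not Jinja expressions such as
# {{ ref('model') }}. Heuristic only; it does not parse SQL.
BARE_TABLE = re.compile(r"\b(?:FROM|JOIN)\s+(?!\{\{)", re.IGNORECASE)

def uses_jinja_refs_only(sql):
    """True if every FROM/JOIN target starts with a Jinja expression."""
    return BARE_TABLE.search(sql) is None
```

Against the examples above, the CORRECT queries pass and both WRONG bare-table queries fail this check.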
