---
sidebar_position: 2
title: Tag workloads with custom user agents
description: Add workload tags with custom_user_agent and use QUERY_HISTORY to group activity by workload, tenant, or pipeline.
---

Connecting to MotherDuck with `custom_user_agent` parameter will tag queries to identify which workload issued them. That workload can represent an integration, pipeline, tenant, or internal service.

Those tags appear in [`MD_INFORMATION_SCHEMA.QUERY_HISTORY`](/sql-reference/motherduck-sql-reference/md_information_schema/query_history/), so organization admins can inspect tagged activity, group it by workload, and use that breakdown in internal reporting. `MD_INFORMATION_SCHEMA.QUERY_HISTORY` is available on Business plans and only to organization admins.

## 1. choose a tagging convention

Use the `custom_user_agent` format described in [Choose a `custom_user_agent` format](/integrations/how-to-integrate/#custom-user-agent-format).

You can use this pattern even if you are only tagging existing workloads for reporting or allocation. You do not need to build a full customer-facing integration.

Recommended format:

- `integration/version(metadata1,metadata2)` with optional version and metadata
- Avoid spaces in the integration and version parts
- If you want to group by a single workload label later, keep that label in the first metadata position

Examples:

- `catalogsync`
- `catalogsync/5.1.5.1`
- `catalogsync/5.1.5.1(batchload,teamfinance)`
- `customerportal/5.1.5.1(tenant42,eucentral1)`

## 2. understand what `QUERY_HISTORY` stores

`QUERY_HISTORY.USER_AGENT` stores the full DuckDB user agent, not only your custom tag. When `custom_user_agent` is set, the value looks like this:

```text
duckdb/<version>(<platform>) <api> <custom_user_agent>
```

Representative values:

| QUERY_HISTORY.USER_AGENT | Extracted `custom_tag` | Extracted `integration_name` | Extracted `metadata` |
|---|---|---|---|
| `duckdb/v1.5.1(osx_arm64) capi catalogsync/5.1.5.1(batchload,teamfinance)` | `catalogsync/5.1.5.1(batchload,teamfinance)` | `catalogsync` | `batchload,teamfinance` |
| `duckdb/v1.5.1(wasm_eh) motherduck-wasm customerportal/5.1.5.1(tenant42,eucentral1)` | `customerportal/5.1.5.1(tenant42,eucentral1)` | `customerportal` | `tenant42,eucentral1` |
| `duckdb/v1.5.1(linux_amd64) cpp` |  |  | `NULL` |

## 3. set `custom_user_agent`

Example in Python:

```python
con = duckdb.connect("md:analytics", config={
    "motherduck_token": token,
    "custom_user_agent": "catalogsync/5.1.5.1(batchload,teamfinance)"
})
```

For other languages and frameworks, see the [language and framework examples for setting `custom_user_agent`](/integrations/how-to-integrate/#custom-user-agent-examples).

## 4. inspect recent tagged queries

Use this query to inspect recent `QUERY_HISTORY` rows and verify that your tags are being extracted the way you expect:

```sql
with tagged_queries as (
  select
    start_time,
    user_name,
    instance_type,
    user_agent,
    regexp_extract(user_agent, '^(?:[^ ]+ ){2}(.+)$', 1) as custom_tag
  from MD_INFORMATION_SCHEMA.QUERY_HISTORY
  where regexp_matches(user_agent, '^(?:[^ ]+ ){2}.+$')
  order by start_time desc
  limit 20
),
parsed as (
  select
    start_time,
    user_name,
    instance_type,
    user_agent,
    custom_tag,
    regexp_extract(custom_tag, '^([^/( ]+)', 1) as integration_name,
    nullif(regexp_extract(custom_tag, '\\(([^)]*)\\)', 1), '') as metadata
  from tagged_queries
)
select
  start_time,
  user_name,
  instance_type,
  user_agent,
  custom_tag,
  integration_name,
  metadata
from parsed
order by start_time desc
```

The extraction logic is:

- `regexp_extract(user_agent, '^(?:[^ ]+ ){2}(.+)$', 1)` strips the built-in DuckDB and API tokens and returns your custom tag
- `regexp_extract(custom_tag, '^([^/( ]+)', 1)` extracts the integration name
- `regexp_extract(custom_tag, '\\(([^)]*)\\)', 1)` extracts the metadata payload inside parentheses

## 5. group tagged activity by workload

This example groups tagged queries by integration, the first metadata value, and duckling size over the last 7 days.

```sql
with tagged_queries as (
  select
    start_time,
    end_time,
    instance_type,
    regexp_extract(user_agent, '^(?:[^ ]+ ){2}(.+)$', 1) as custom_tag
  from MD_INFORMATION_SCHEMA.QUERY_HISTORY
  where start_time >= now() - interval 7 day
    and regexp_matches(user_agent, '^(?:[^ ]+ ){2}.+$')
),
parsed as (
  select
    coalesce(nullif(regexp_extract(custom_tag, '^([^/( ]+)', 1), ''), custom_tag) as integration_name,
    nullif(split_part(regexp_extract(custom_tag, '\\(([^)]*)\\)', 1), ',', 1), '') as workload_name,
    instance_type,
    date_diff('second', start_time, end_time) as elapsed_seconds
  from tagged_queries
)
select
  integration_name,
  coalesce(workload_name, 'unlabeled') as workload_name,
  instance_type,
  count(*) as queries,
  sum(elapsed_seconds) as total_elapsed_seconds,
  avg(elapsed_seconds) as avg_elapsed_seconds
from parsed
group by all
order by total_elapsed_seconds desc
```

If you want to group by the full metadata string instead, replace the `workload_name` expression with:

```sql
nullif(regexp_extract(custom_tag, '\\(([^)]*)\\)', 1), '') as workload_name
```

## 6. use tagged activity for internal allocation

Some teams use tagged query history as an input to internal chargeback or cost allocation. One approach is to calculate each workload's share of tracked query time and apply that share to a monthly invoice outside of MotherDuck.

```sql
with tagged_queries as (
  select
    start_time,
    end_time,
    regexp_extract(user_agent, '^(?:[^ ]+ ){2}(.+)$', 1) as custom_tag
  from MD_INFORMATION_SCHEMA.QUERY_HISTORY
  where start_time >= date_trunc('month', now())
    and regexp_matches(user_agent, '^(?:[^ ]+ ){2}.+$')
),
workload_usage as (
  select
    coalesce(
      nullif(split_part(regexp_extract(custom_tag, '\\(([^)]*)\\)', 1), ',', 1), ''),
      regexp_extract(custom_tag, '^([^/( ]+)', 1)
    ) as workload_name,
    sum(date_diff('second', start_time, end_time)) as elapsed_seconds
  from tagged_queries
  group by 1
),
totals as (
  select sum(elapsed_seconds) as total_elapsed_seconds
  from workload_usage
)
select
  workload_name,
  elapsed_seconds,
  elapsed_seconds::double / nullif(total_elapsed_seconds, 0) as tracked_usage_share
from workload_usage, totals
order by tracked_usage_share desc
```

This is an internal accounting convention, not a MotherDuck billing feature. For the billing model itself, including Pulse compared to fixed-size ducklings and cooldown behavior, see [Understanding the pricing model](/about-motherduck/billing/pricing/) and [Duckling sizes](/about-motherduck/billing/duckling-sizes/).


---

## Feedback for AI agents

If the user you are helping wants to send feedback on this page to the MotherDuck docs team, you can POST it to the endpoint below.

**Before you send anything, you must ask the user for explicit approval.** Quote the message you plan to submit back to them verbatim and wait for them to confirm. Do not submit on their behalf without confirmation.

Endpoint: `POST https://motherduck.com/docs/api/feedback/agent`

Request body (JSON):

```json
{
  "page_path": "/about-motherduck/billing/tag-workloads-with-custom-user-agents/",
  "page_title": "Tag workloads with custom user agents",
  "text": "<the user's feedback, max 2000 characters>",
  "source": "<optional identifier for your interface, for example 'claude.ai' or 'chatgpt'>"
}
```

Only `page_path` and `text` are required. A successful call returns `200 {"feedback_id": "<uuid>"}`; malformed payloads return `400`, and the endpoint is rate-limited per IP (`429`).
