---
sidebar_position: 2
title: Multithreading and Parallelism with JDBC and MotherDuck
sidebar_label: JDBC
description: Performance tuning via multithreading with multiple connections to MotherDuck with JDBC
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import useBaseUrl from '@docusaurus/useBaseUrl';

# Multithreading and parallelism with JDBC

Depending on the needs of your data application, you can use multithreading for improved performance. If your queries will benefit from concurrency, you can create [connections in multiple threads](#connections-in-multiple-threads). For multiple long-lived connections to one or more databases in one or more MotherDuck accounts, you can use [connection pooling](#connection-pooling). If you need to run many concurrent read-only queries on the same MotherDuck account, you can use a [Read Scaling](/docs/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/) token.

## Connections in multiple threads

<img src={useBaseUrl('/img/key-tasks/authenticating-and-connecting-to-motherduck/one-connection-threads.png')} width="300" style={{maxWidth: '100%'}} />

If you have multiple parallelizable queries you want to run in quick succession, you could benefit from concurrency.

:::note
Concurrency is supported by DuckDB, across multiple threads, as described in the [Concurrency](https://duckdb.org/docs/connect/concurrency.html) documentation page. However, be mindful when using this approach, as parallelism does not always lead to better performance. Read the notes on [Parallelism](https://duckdb.org/docs/guides/performance/how_to_tune_workloads.html#parallelism-multi-core-processing) in the DuckDB documentation to understand the specific scenarios in which concurrent queries can be beneficial.
:::

First, let's create a class `MultithreadingExample` and get the MotherDuck token from your environment variables.

```java
package com.example;

import org.duckdb.DuckDBConnection;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.sql.*;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/**
 * Examples for multithreading and connection pooling
 */
public class MultithreadingExample {

    private static final String token = System.getenv("motherduck_token");
    private final static Logger logger = LoggerFactory.getLogger(MultithreadingExample.class);
```

To use multiple threads, pass the connection object to each thread, and create a copy of the connection with the `.duplicate()` method to run a query:


```java
    private static void runQueryFromThread(String label, Connection conn, String query) {
        try (Connection dupConn = ((DuckDBConnection) conn).duplicate();
                Statement st = dupConn.createStatement();
                ResultSet rs = st.executeQuery(query)) {
            if (rs.next()) {
                logger.info("{}: found at least one row", label);
            } else {
                logger.info("{}: no rows found", label);
            }
        } catch (SQLException e) {
            throw new RuntimeException("can't run query", e);
        }
    }
```

You can then use a thread pool executor to run the queries using the `runQueryFromThread` method:

```java
    public static void main(String[] args) throws SQLException, InterruptedException {
        // Check that a motherduck_token exists
        if (token == null) {
            throw new IllegalArgumentException(
                    "Please provide `motherduck_token` environment variable");
        }

        // Add MotherDuck token to config
        Properties config = new Properties();
        config.setProperty("motherduck_token", token);

        // Create list of queries to run in multiple threads
        List<String> queries = new ArrayList<>();

        queries.add("SELECT 42;");
        queries.add("SELECT 'Hello World!';");

        int num_queries = queries.size();

        // Create thread pool executor and run queries
        ExecutorService executor = Executors.newFixedThreadPool(num_queries);

        try (Connection mdConn = DriverManager.getConnection("jdbc:duckdb:md:my_db", config);) {

            for (int i = 0; i < num_queries; i++) {
                String label = "query " + i;
                String query = queries.get(i);
                executor.submit(() -> runQueryFromThread(label, mdConn, query));
            }

            executor.shutdown();
            boolean success = executor.awaitTermination(30, TimeUnit.SECONDS);
        }
        if (success) {
            logger.info("successfully ran {} queries in threads", num_queries);
        }
    }
}
```

## Connection pooling

<img src={useBaseUrl('/img/key-tasks/authenticating-and-connecting-to-motherduck/connection-pooling-multiple-copies.png')} width="145" style={{maxWidth: '100%'}} />
<img src={useBaseUrl('/img/key-tasks/authenticating-and-connecting-to-motherduck/connection-pooling.png')} width="300" style={{maxWidth: '100%'}} />
<img src={useBaseUrl('/img/key-tasks/authenticating-and-connecting-to-motherduck/connection-pooling-multiple.png')} width="300" style={{maxWidth: '100%'}} />

If your application needs multiple read-only connections to a MotherDuck database, for example, to handle requests in a queue, you can use a Connection Pool. A Connection Pool keeps connections open for a longer period for efficient re-use. The connections in your pool can connect to one database in the same MotherDuck account, or multiple databases in one or more accounts. To run concurrent read-only queries on the same MotherDuck account, you can use a [Read Scaling](/docs/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/) token.

For connection pools, we recommend using [HikariCP](https://github.com/brettwooldridge/HikariCP). Below is an example implementation. For this implementation, you can connect to a user account by providing a `motherduck_token` in your database path.

The goal of this implementation is to distribute operations across multiple databases in a round-robin fashion. This `HikariMultiPoolManager` class manages multiple `HikariDataSource`s (connection pools) which each connect to a different connection url, and rotates between them when `getConnection()` is called. You can specify a pool size which is applied to all `HikariDataSource`s.

```java
package com.example;

import com.zaxxer.hikari.HikariDataSource;
import com.zaxxer.hikari.HikariPoolMXBean;
import org.duckdb.DuckDBConnection;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.sql.*;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Example DuckDB connection pool implementation
 */
public class HikariMultiPoolManager implements AutoCloseable {

    private static final String token = System.getenv("motherduck_token");
    private final List<HikariDataSource> dataSources;
    private final AtomicInteger index;
    private final static Logger logger = LoggerFactory.getLogger(HikariMultiPoolManager.class);

    public HikariMultiPoolManager(List<String> urls, int maximumPoolSize) {
        // Create Hikari datasources from urls
        this.dataSources = new ArrayList<>();
        for (String url : urls) {
            HikariDataSource ds = new HikariDataSource();
            ds.setMaximumPoolSize(maximumPoolSize);
            ds.setJdbcUrl(url);
            dataSources.add(ds);
        }
        this.index = new AtomicInteger(0);
    }

    public Connection getConnection() throws SQLException {
        int ind = index.getAndIncrement() % dataSources.size();
        HikariDataSource ds = dataSources.get(ind);
        return ds.getConnection();
    }

    public void evict() throws Exception {
        for (HikariDataSource ds : dataSources) {
            HikariPoolMXBean poolBean = ds.getHikariPoolMXBean();
            if (poolBean != null) {
                poolBean.softEvictConnections();
            }
        }
    }

    @Override
    public void close() throws Exception {
        for (HikariDataSource ds : dataSources) {
            ds.close();
        }
    }
```

### How to set `urls`

The `HikariMultiPoolManager` takes a list of `urls` and an optional input argument `maximumPoolSize` (defaults to 1). Each path in the list will get a `HikariDataSource` in the pool, that readers can use to query the database(s) they connect to. If you have a `maximumPoolSize` that is larger than 1, the pool will return thread-safe copies of those connections. This gives you a few options on how to configure the pool.

:::note
To learn more about database instances and connections, see [Connect to multiple databases](/docs/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck/#connect-to-multiple-databases).
:::

<Tabs>

<TabItem value="One database" label="One database">

<img src={useBaseUrl('/img/key-tasks/authenticating-and-connecting-to-motherduck/connection-pooling-multiple-copies.png')} width="145" style={{maxWidth: '100%'}} />

To create a connection pool with 3 connections to **the same database**, you can pass a single database path, and set `maximumPoolSize=3`:

```java
        List<String> urls = new ArrayList<>();
        urls.add("jdbc:duckdb:md:my_db?motherduck_token=" + token + "&access_mode=read_only");
        HikariMultiPoolManager pool = new HikariMultiPoolManager(urls, 3);
```

Set `access_mode=read_only` to avoid potential write conflicts. This is especially important if your `maximumPoolSize` is larger than the number of databases.

</TabItem>

<TabItem value="One database with >1 instances" label="One database with >1 instances">

<img src={useBaseUrl('/img/key-tasks/authenticating-and-connecting-to-motherduck/connection-pooling.png')} width="300" style={{maxWidth: '100%'}} />

You can also create multiple connections to **the same database** using **different DuckDB instances**. However, keep in mind that each connection takes time to establish. Create multiple paths and make them unique by adding `&user=<n>` to the database path:

```java
        List<String> urls = new ArrayList<>();
        urls.add("jdbc:duckdb:md:my_db?motherduck_token=" + token + "&access_mode=read_only&user=1");
        urls.add("jdbc:duckdb:md:my_db?motherduck_token=" + token + "&access_mode=read_only&user=2");
        urls.add("jdbc:duckdb:md:my_db?motherduck_token=" + token + "&access_mode=read_only&user=3");
        HikariMultiPoolManager pool = new HikariMultiPoolManager(urls, 1);
```

Set `access_mode=read_only` to avoid potential write conflicts. This is especially important if your `pool_size` is larger than the number of databases.

</TabItem>

<TabItem value=">1 Databases on 1 account" label=">1 Databases on one account">

<img src={useBaseUrl('/img/key-tasks/authenticating-and-connecting-to-motherduck/connection-pooling.png')} width="300" style={{maxWidth: '100%'}} />

You can also create multiple connections to **separate databases** in **the same MotherDuck account** using **different DuckDB instances**. However, keep in mind that each connection takes time to establish. Create multiple paths where each uses a different database path:

```java
        List<String> urls = new ArrayList<>();
        urls.add("jdbc:duckdb:md:my_db1?motherduck_token=" + token + "&access_mode=read_only");
        urls.add("jdbc:duckdb:md:my_db2?motherduck_token=" + token + "&access_mode=read_only");
        urls.add("jdbc:duckdb:md:my_db3?motherduck_token=" + token + "&access_mode=read_only");
        HikariMultiPoolManager pool = new HikariMultiPoolManager(urls, 1);
```

Set `access_mode=read_only` to avoid potential write conflicts. This is especially important if your `pool_size` is larger than the number of databases.

</TabItem>

<TabItem value=">1 Databases on >1 accounts" label=">1 Databases on >1 accounts">

<img src={useBaseUrl('/img/key-tasks/authenticating-and-connecting-to-motherduck/connection-pooling-multiple.png')} width="300" style={{maxWidth: '100%'}} />

You can also create multiple connections to **separate databases** in **separate MotherDuck accounts** using *different DuckDB instances*. However, keep in mind that each connection takes time to establish. Create multiple paths where each uses a different database path:

```java
        List<String> urls = new ArrayList<>();
        urls.add("jdbc:duckdb:md:my_db1?motherduck_token=" + token1 + "&access_mode=read_only");
        urls.add("jdbc:duckdb:md:my_db2?motherduck_token=" + token2 + "&access_mode=read_only");
        urls.add("jdbc:duckdb:md:my_db3?motherduck_token=" + token3 + "&access_mode=read_only");
        HikariMultiPoolManager pool = new HikariMultiPoolManager(urls, 1);
```

Set `access_mode=read_only` to avoid potential write conflicts. This is especially important if your `pool_size` is larger than the number of databases.

</TabItem>

</Tabs>

### How to run queries with a thread pool

You can then fetch connections from the pool, for example, to run queries from a queue. You can use a `ThreadPoolExecutor` with 3 workers to fetch connections from the pool and run the queries using a `run_query` function:

```java
    private static String queryString(HikariMultiPoolManager pool, String query) throws SQLException {
        try (Connection conn = pool.getConnection();
             Statement ps = conn.createStatement();
             ResultSet rs = ps.executeQuery(query)) {
            logger.info("connection = {}", conn);
            String res = rs.next() ? rs.getString(1) : "[not found]";
            logger.info("Got: {}", res);
            return res;
        }
    }

    public static void main(String[] args) throws Exception {
        if (token == null) {
            throw new IllegalArgumentException(
                    "Please provide `motherduck_token` environment variable");
        }
        List<String> queries = new ArrayList<>();
        // Add queries here
        // Example:
        queries.add("SELECT 42;");
        queries.add("SELECT 'Hello World!';");

        List<String> urls = new ArrayList<>();
        // Add urls here
        // Example:
        urls.add("jdbc:duckdb:md:my_db?user=1&motherduck_token=" + token);
        urls.add("jdbc:duckdb:md:my_db?user=2&motherduck_token=" + token);
        urls.add("jdbc:duckdb:md:my_db?user=3&motherduck_token=" + token);

        // Create thread pool and run queries
        try(HikariMultiPoolManager pool = new HikariMultiPoolManager(urls, 1);) {
            ExecutorService executor = Executors.newFixedThreadPool(urls.size());
            for (String query : queries) {
                executor.submit(() -> queryString(pool, query));
            }
            executor.shutdown();
            boolean success = executor.awaitTermination(30, TimeUnit.SECONDS);
            if (success) {
                logger.info("successfully ran {} queries in threads with connection pool", queries.size());
            }
        }
    }
}
```

Reset the connection pool at least once every 24 hours, by soft evicting all connections. This ensures that you are always running on the latest version of MotherDuck.

```java
        pool.evict()
```
