ponder.configure#

ponder.configure(default_connection=None, query_timeout=600, row_transfer_limit=10000, bigquery_dataset=None)

Configure Ponder with a database connection for data ingest and set up database-specific parameters.

This method lets the user configure Ponder with a default database connection to use when ingesting data (e.g., reading a CSV or Parquet file) and specify parameters specific to whichever database backend they plan to use.

Parameters:
default_connection : SQL Connection Object, optional, default: None

The default connection to use for I/O operations. This can be one of the following objects:

  • snowflake.connector.Connection

  • duckdb.DuckDBPyConnection

  • google.cloud.bigquery.dbapi.Connection

query_timeout : int, optional, default: 600

The query timeout, in seconds, to pass to Snowflake. Queries that run longer than this timeout will be terminated by Snowflake.

row_transfer_limit : int, optional, default: 10000

The maximum number of rows to pull out of the database when in-memory computation is necessary.

bigquery_dataset : str, optional, default: None

The BigQuery Dataset to use when ingesting data into BigQuery.

Examples#

import ponder
ponder.init()

import snowflake.connector as connector
snowflake_con = connector.connect(**snowflake_params)
ponder.configure(default_connection=snowflake_con, query_timeout=30) # Time out queries that run longer than 30 seconds.
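
With a default connection configured, data ingest goes through Ponder's pandas API (Modin). A minimal sketch, assuming the standard modin.pandas entry point; customers.csv is a hypothetical local file used for illustration.

import modin.pandas as pd

df = pd.read_csv("customers.csv") # Ingested through the configured default connection.
print(df.head())
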
import ponder
ponder.init()

import duckdb
duckdb_con = duckdb.connect()
ponder.configure(default_connection=duckdb_con)

import ponder
ponder.init()

from google.cloud import bigquery
from google.cloud.bigquery import dbapi
from google.oauth2 import service_account

bigquery_con = dbapi.Connection(
    bigquery.Client(
        credentials=service_account.Credentials.from_service_account_info(
            credentials,
            scopes=["https://www.googleapis.com/auth/bigquery"],
        )
    )
)

ponder.configure(default_connection=bigquery_con, bigquery_dataset='CSV') # Use the `CSV` BigQuery dataset for data ingest.
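
The row_transfer_limit parameter can be combined with any of the connections above. A minimal sketch reusing the DuckDB pattern from earlier; the value 5000 is purely illustrative.

import ponder
ponder.init()

import duckdb
duckdb_con = duckdb.connect()
ponder.configure(default_connection=duckdb_con, row_transfer_limit=5000) # Pull at most 5,000 rows into memory when in-memory computation is necessary.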