ponder.configure
ponder.configure(default_connection=None, query_timeout=600, row_transfer_limit=10000, bigquery_dataset=None)
Configure Ponder with a database connection for data ingest and set database-specific parameters.
This method sets the default database connection Ponder uses when ingesting data (e.g. reading a CSV or Parquet file) and lets you specify parameters specific to whichever database backend you plan to use.
- Parameters:
  - default_connection : SQL Connection Object, optional, default: None
    The default connection to use for I/O operations. This can be one of the following objects:
    - snowflake.connector.Connection
    - duckdb.Connection
    - google.cloud.bigquery.dbapi.Connection
  - query_timeout : int, optional, default: 600
    The query timeout, in seconds, to pass to Snowflake. Queries that run longer than this timeout are terminated by Snowflake.
  - row_transfer_limit : int, optional, default: 10000
    The maximum number of rows to pull out of the database when in-memory computation is required (see the sketch in Examples below).
  - bigquery_dataset : str, optional, default: None
    The BigQuery dataset to use when ingesting data into BigQuery.
Examples
import ponder
ponder.init()
import snowflake.connector as connector
# `snowflake_params` is your dict of Snowflake connection parameters (account, user, password, ...).
snowflake_con = connector.connect(**snowflake_params)
ponder.configure(default_connection=snowflake_con, query_timeout=30)  # Terminate queries that run longer than 30 seconds.
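Once a default connection is configured, reads go through it. A minimal usage sketch, assuming Ponder exposes its DataFrame API through modin.pandas and that "test.csv" is a hypothetical file path:
import modin.pandas as pd
df = pd.read_csv("test.csv")  # Assumes the pandas-style API; ingest runs through the configured Snowflake connection.
print(df.head())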
import ponder
ponder.init()
import duckdb
duckdb_con = duckdb.connect()
ponder.configure(default_connection=duckdb_con)
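The row_transfer_limit parameter is not otherwise demonstrated here; as a minimal sketch, reusing the DuckDB connection above (the limit of 5000 is an arbitrary illustration):
ponder.configure(default_connection=duckdb_con, row_transfer_limit=5000)  # Pull at most 5,000 rows into memory when a computation cannot run in the database.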
import ponder
ponder.init()
from google.cloud import bigquery
from google.cloud.bigquery import dbapi
from google.oauth2 import service_account
# `credentials` is your service account info mapping (e.g. a parsed service account JSON key).
bigquery_con = dbapi.Connection(
    bigquery.Client(
        credentials=service_account.Credentials.from_service_account_info(
            credentials,
            scopes=["https://www.googleapis.com/auth/bigquery"],
        )
    )
)
ponder.configure(default_connection=bigquery_con, bigquery_dataset='CSV')  # Use the `CSV` BigQuery dataset for data ingest.
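As with the other backends, subsequent reads are ingested through the configured connection. A minimal follow-up sketch, again assuming the DataFrame API is available through modin.pandas and that "data.parquet" is a hypothetical path:
import modin.pandas as pd
df = pd.read_parquet("data.parquet")  # Assumes ingest lands in the `CSV` dataset configured above.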