10-min Quickstart Guide
On this page, we walk through how to get started with Ponder in just a few easy steps.
Note
Before we get started, you first need a Ponder account. If you don’t already have a Ponder account, you can create a free account by signing up here.
Step 1: Setting up Ponder
You can use Ponder by installing it as a library on your own machine. This flexible and lightweight approach lets you keep working within your own environment with your existing notebook/IDE setup.
To install the library, run the following command:
pip install ponder
Step 2: Log in to Authenticate
Next, you will need to log in to register your product key.
ponder login
Go to your Account Settings and copy your product key.

When you are prompted to enter your product key, paste the key you copied and press Enter to proceed.
Step 3: Initialize Ponder
Now we are ready to start using Ponder! To get started, you first need to initialize Ponder.
import ponder
ponder.init()
Step 4: Configure your database connection
Next, configure your connection to whichever database engine you’d like to work on. If you have a cloud data warehouse already, you can use your warehouse provider’s standard Python connection library. If you don’t currently use a cloud data warehouse, we encourage you to use DuckDB as the engine. Below we show you how to configure each:
To establish a connection to Snowflake, we leverage Snowflake’s Python connector.
import snowflake.connector
# Replace each "****" placeholder with your own Snowflake credentials and settings.
db_con = snowflake.connector.connect(user="****", password="****", account="****", role="****", database="****", schema="****", warehouse="****")
To establish a connection to BigQuery, we leverage Google Cloud’s Python client for Google BigQuery.
from google.cloud import bigquery
from google.cloud.bigquery import dbapi
from google.oauth2 import service_account
import json
db_con = dbapi.Connection(
    bigquery.Client(
        credentials=service_account.Credentials.from_service_account_info(
            json.loads(open("my_service_account_key.json").read()),
            scopes=["https://www.googleapis.com/auth/bigquery"]
        )
    )
)
To establish a connection to DuckDB, all you need to do is call duckdb.connect(), which creates an in-memory database.
import duckdb
db_con = duckdb.connect()
For more information about how to set up the connection, please check out this guide.
Step 5: Selecting Your Data Source
With Ponder, you can work with an existing table in your database using read_sql and operate on CSV or Parquet files using read_csv and read_parquet; see this guide for more information.
If you already have your data in your warehouse, you can connect to the table by passing the database connection you configured to read_sql as follows:
import modin.pandas as pd
df = pd.read_sql("DBNAME.TABLENAME", con=db_con)
Or if you want to work with a CSV file, since pandas’s read_csv doesn’t take in a database connection, we first need to configure Ponder to leverage the database connection that we established earlier.
ponder.configure(default_connection=db_con)
Then, we can use pandas’ read_csv command to load the CSV into your database for further processing.
import modin.pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/ponder-org/ponder-datasets/main/tpch/orders.csv")
Step 6: Starting Pondering 🎉
Once the data is loaded, we can start hacking away with pandas! Note that any operations you perform here with pandas are run directly on your database, rather than on the local CSV file.
df.describe()
df.groupby("O_ORDERSTATUS").mean()
pd.concat([df, df])
# .. and much more! 🧹📊🔍🧪
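The same calls can be tried on a small in-memory frame with plain pandas, which is a convenient way to check what each operation returns before running it against a warehouse. This is a sketch rather than anything Ponder-specific: the column names follow the TPC-H orders table, the values are invented, and `numeric_only=True` is an addition, included because recent pandas versions may raise an error when `mean()` encounters non-numeric columns.

```python
import pandas as pd

# Toy stand-in for the orders table; the values are invented for illustration.
df = pd.DataFrame(
    {
        "O_ORDERSTATUS": ["O", "F", "O", "F"],
        "O_TOTALPRICE": [100.0, 250.0, 300.0, 50.0],
        "O_CLERK": ["Clerk#1", "Clerk#2", "Clerk#1", "Clerk#3"],
    }
)

summary = df.describe()  # summary statistics for the numeric columns
# Average of the numeric columns per order status; text columns are skipped.
mean_by_status = df.groupby("O_ORDERSTATUS").mean(numeric_only=True)
doubled = pd.concat([df, df], ignore_index=True)  # stack the frame on itself

print(mean_by_status)
print(len(doubled))
```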
In this tutorial, we took a look at a quick example of how we can use pandas to work with the data directly in your database. Next, we will take a look at the different ways you can work with a data source in Ponder. If you are looking to learn more about how you can use Ponder, check out this tutorial series.