Benchmarking datasets
Notice: Since EDB Hosted services have been removed from the Cloud Service, Lakehouse capabilities are now only available as part of the EDB Postgres AI Hybrid Control Plane, which is currently in tech preview.
When you provision a Lakehouse node, it comes preconfigured to point to a public S3 bucket in the same region as the node. The bucket contains sample benchmarking datasets.
You can query tables in these datasets by referencing them with their schema name.
Schema Name | Dataset |
---|---|
tpcds_sf_1 | TPC-DS, Scale Factor 1 |
tpcds_sf_10 | TPC-DS, Scale Factor 10 |
tpcds_sf_100 | TPC-DS, Scale Factor 100 |
tpcds_sf_1000 | TPC-DS, Scale Factor 1000 |
tpch_sf_1 | TPC-H, Scale Factor 1 |
tpch_sf_10 | TPC-H, Scale Factor 10 |
tpch_sf_100 | TPC-H, Scale Factor 100 |
tpch_sf_1000 | TPC-H, Scale Factor 1000 |
clickbench | ClickBench, 100 million rows |
brc_1b | Billion row challenge |
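As a minimal sketch of schema-qualified access, the queries below read from two scale factors of the same dataset. The table name `lineitem` is an assumption taken from the standard TPC-H schema, not something this page confirms. Check the tables available in each schema on your node before relying on it.

```sql
-- Assumption: the tpch_* schemas expose the standard TPC-H table names,
-- such as lineitem. Verify against your own Lakehouse node.

-- Row count at scale factor 1
SELECT count(*) FROM tpch_sf_1.lineitem;

-- The same query at scale factor 10: only the schema name changes
SELECT count(*) FROM tpch_sf_10.lineitem;
```

Because each scale factor is its own schema, you can rerun the same benchmark query at a different size by changing only the schema prefix.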
Notes about ClickBench data:
Date columns (EventDate) are integers, not dates.
You must quote ClickBench column names, because they contain uppercase letters and Postgres folds unquoted identifiers to lowercase. For example:
✅ select "Title" from clickbench.hits;
🚫 select Title from clickbench.hits;
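The two notes above can be combined in one hedged example. It assumes the date column is named "EventDate", as in the upstream ClickBench schema, and that its integer value counts days since 1970-01-01. Neither detail is confirmed by this page, so inspect a few raw values before trusting the conversion.

```sql
-- Assumptions (not confirmed by this page):
--   * the column is named "EventDate", as in the upstream ClickBench schema
--   * its integer value is the number of days since 1970-01-01
SELECT
    DATE '1970-01-01' + CAST("EventDate" AS integer) AS event_date,  -- date + integer yields a date
    count(*) AS hits
FROM clickbench.hits
GROUP BY "EventDate"   -- quoted, because the name contains uppercase letters
ORDER BY "EventDate"
LIMIT 10;
```

Without the double quotes, Postgres would look for a column named eventdate and the query would fail.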