Overview of Pendo Data Sync

Last updated: July 16, 2026 14:20

Pendo Data Sync allows you to push data out of Pendo and centralize it in a data lake or warehouse for additional analysis. With Pendo Data Sync fully implemented and connected to other data sources, you can:

Use Pendo data in your business intelligence (BI) visualizations and dashboards.
Measure the impact of product enhancements and guide campaigns on sales and renewals.
Calculate a comprehensive customer health score to shape your renewal strategy.
Create a custom churn-risk model from a foundation of product usage and sentiment signals.
Identify upsell and cross-sell opportunities to drive data-informed account growth.

Prerequisites

You must be a subscription admin or have the Configure Data Sync permission to set up Data Sync. Contact your Pendo representative for subscription admin access.

You can also try Data Sync on a subset of data first. For information, see the Test exports section in this article.

How it works

Data Sync can deliver Pendo data in one of two ways:

Data tables synced directly to a warehouse destination (Snowflake).
Avro files to a cloud storage destination (Amazon S3, Google Cloud Storage, or Azure Storage).

After a destination is configured, you can begin to sync event data and account and visitor metadata.

Event data is configured at the application level (you can choose which applications to include).
Account and visitor metadata is optional and configured at the subscription level.

Regardless of your data destination type, Data Sync includes the same underlying dataset. See Data Sync schema definitions for more information about the data available in Data Sync.

Export timing: UTC versus local time zone

Data Sync always exports data in whole UTC calendar days. Every record in an export belongs to a UTC day (00:00–23:59 UTC), regardless of your subscription's configured time zone. This keeps historical partitions stable and consistent across every export.

When syncs run

Your recurring event export runs nightly, some time after midnight in the time zone set in your subscription's settings. Visitor and account metadata syncs run on a separate 24-hour cycle. Although the nightly job is scheduled in local time, Data Sync aligns the exported data to the most recently completed UTC calendar day.

Subscriptions ahead of UTC

Data Sync exports data by UTC day, while the scheduled sync runs after local midnight. In time zones ahead of UTC, local midnight falls before the current UTC day has closed, so the latest exported day is about two calendar days before the local sync date, not one.

For example, a subscription in Paris (UTC+2 in summer) runs its nightly sync some time after midnight on July 9. This is around 22:00 UTC on July 8, before the July 8 UTC day has closed. The most recently completed UTC day is July 7, so the data is two days old rather than one.

Subscriptions at or behind UTC

Subscriptions at or behind UTC run late enough in the UTC day that the prior UTC day is already complete, so the most recent day is approximately one day before the sync date.

For example, a subscription in New York (UTC-4 in summer) runs its nightly sync some time after midnight on July 9. This is around 04:00 UTC on July 9, after the July 8 UTC day has already closed. The most recently completed UTC day is July 8, so the data is one day old.

Some time zones sit exactly at UTC for part of the year and move ahead of UTC when daylight saving time begins. For example, the UK, Ireland, mainland Portugal, the Canary Islands, and the Faroe Islands are at UTC during standard time (winter) but shift to UTC+1 during DST (summer). Subscriptions in these regions may see the most recent export day change with daylight saving time.
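
The date arithmetic in the Paris and New York examples can be checked with a short sketch. It is illustrative only: the function name latest_complete_utc_day, the fixed summer offsets, the year 2026, and the 00:30 local run time are assumptions chosen to mirror the examples, not values that Data Sync exposes.

```python
from datetime import date, datetime, timedelta, timezone


def latest_complete_utc_day(local_sync: datetime) -> date:
    """Return the most recent UTC calendar day that has fully ended at local_sync."""
    if local_sync.tzinfo is None:
        raise ValueError("local_sync must be timezone-aware")
    # Convert the local run time to UTC; the UTC day in progress is not yet
    # complete, so the latest complete day is the one before it.
    utc_now = local_sync.astimezone(timezone.utc)
    return utc_now.date() - timedelta(days=1)


# Assumed fixed offsets matching the examples (summer time).
paris_summer = timezone(timedelta(hours=2))      # UTC+2
new_york_summer = timezone(timedelta(hours=-4))  # UTC-4

# Assumed run time: 00:30 local on July 9 ("some time after midnight").
paris_sync = datetime(2026, 7, 9, 0, 30, tzinfo=paris_summer)
ny_sync = datetime(2026, 7, 9, 0, 30, tzinfo=new_york_summer)

print(latest_complete_utc_day(paris_sync))  # 2026-07-07 (two days before July 9)
print(latest_complete_utc_day(ny_sync))     # 2026-07-08 (one day before July 9)
```

The same function illustrates the daylight saving case: a region at UTC in winter behaves like the New York example, and the same region at UTC+1 in summer behaves like the Paris example.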

Note: We don't recommend changing your subscription time zone (for example, to UTC) to change how Data Sync days are labeled. Changing the time zone shifts the day boundaries across all of your Pendo reporting and dashboards, which affects more than Data Sync. It also triggers a full reprocessing of your subscription's data, which is disruptive and can take significant time.

For help with exports and time settings, contact your Pendo account representative.

Warehouse destinations

If you choose a warehouse destination such as Snowflake, Data Sync creates and syncs tables into your warehouse. For more information, see the articles Set up Data Sync to Snowflake and Data Sync to Snowflake architecture.

Once your warehouse destination is configured, you can choose which applications you want to sync from Pendo to your warehouse.

You can also choose to sync accounts and visitors to your warehouse. If activated, Pendo automatically creates tables of accounts, account metadata, visitors, and visitor metadata and syncs updates on a nightly basis.

See the Data Sync to Snowflake ERD for more information on the standard warehouse table schema. To access the ERD, you'll need to provide the password: datasync. If you have issues with access, contact support.

Cloud storage destinations

You can use cloud storage services such as Amazon S3, Google Cloud Storage, or Microsoft Azure Storage as destinations for Data Sync exports. Data Sync sends Avro files to your cloud storage along with an export manifest JSON file, which you can use in the custom ETL that brings data from cloud storage into your data lake or warehouse. For setup instructions, see the setup article for your cloud storage destination.

Alternatively, you can pull data from a Pendo-hosted Google Cloud Storage (GCS) bucket. See Set up Data Sync with a Pendo-hosted Google Cloud Storage (GCS) destination for more information.

After setting up a cloud storage destination, you can:

Set up recurring daily exports.
Backfill historical data from up to three calendar years before the current year. For example, if you create an export on December 31, 2025, it can extend back to January 1, 2022.
Automatically receive updates when Page or Feature rules are added or updated in Pendo.
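
The backfill example above can be expressed as simple date arithmetic. This is a sketch that assumes the limit always reaches back to January 1 of the year three years before the export's creation year. That rule is inferred from the single example given here, and the function name earliest_backfill_date is hypothetical.

```python
from datetime import date


def earliest_backfill_date(export_created: date, calendar_years_back: int = 3) -> date:
    """Earliest day a backfill can reach, under the assumed calendar-year rule."""
    # Assumption: the window starts on January 1, calendar_years_back years
    # before the year in which the export is created.
    return date(export_created.year - calendar_years_back, 1, 1)


print(earliest_backfill_date(date(2025, 12, 31)))  # 2022-01-01
```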

Create and manage exports

After you've set up a destination, you can create:

Event exports. These are configured and run at the application level. They can be one-time exports or recurring exports. For more information and instructions, see Data Sync event export handling.
Visitor and account metadata exports. These are configured and run at the subscription level. They can be one-time exports or ongoing metadata updates. For more information and instructions, see Data Sync account and visitor export handling.

If you're not yet a Data Sync customer, you can instead create a single test export for event data only.

ETL for Data Sync to Cloud Storage

A data engineer must create a pipeline to move files from cloud storage into your data platform. This can be either an ETL (extract, transform, and load) or ELT (extract, load, and transform) process. ELT involves first loading data into a data warehouse before it's transformed into the final data structure. Either ETL or ELT is a viable option as long as the pipeline supports the data additions and updates described in this article. The choice of using ETL or ELT depends on your organization's data architecture. For more information, see Google's article What is ETL?

For a typical Data Sync implementation, a data engineer at your organization creates an ETL pipeline that listens for new Pendo exports in your storage bucket. Using those files, the automated process creates and updates tables in your data lake or warehouse.
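
One way to picture the "listen for new exports" step is as a comparison between what is currently in the bucket and what the pipeline has already handled. The sketch below assumes that your pipeline can list object keys in the bucket and that it records the keys it has processed. The function name find_new_exports and the example keys are hypothetical and don't reflect Data Sync's actual file layout.

```python
def find_new_exports(listed_keys, processed_keys):
    """Return bucket keys that have not been processed yet, in sorted order."""
    # Set difference keeps the check independent of listing order and
    # ignores duplicates in either input.
    return sorted(set(listed_keys) - set(processed_keys))


# Hypothetical keys, for illustration only.
listed = ["exports/a.json", "exports/b.json", "exports/c.json"]
processed = ["exports/a.json"]

print(find_new_exports(listed, processed))  # ['exports/b.json', 'exports/c.json']
```

A scheduler can call a function like this on each polling run and pass the result to the loading step.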

The pipeline must:

Process new data as it is exported.
Update existing records when data is finalized.
Accommodate retroactive changes when a Page or Feature rule is added or updated in Pendo.
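
One common way to satisfy all three requirements is to make each load idempotent by replacing a whole day's partition whenever data for that day arrives again. The sketch below uses an in-memory stand-in for a warehouse table. It assumes that a later export for a given UTC day supersedes the earlier data for that day, so confirm the actual update semantics in Data Sync event export handling. The class name DayPartitionedTable and the row fields are hypothetical.

```python
class DayPartitionedTable:
    """Toy in-memory table that stores rows per UTC day (illustration only)."""

    def __init__(self) -> None:
        self._partitions: dict[str, list[dict]] = {}

    def replace_partition(self, utc_day: str, rows: list[dict]) -> None:
        """Overwrite everything stored for utc_day with the given rows."""
        # Copy the rows so later changes by the caller don't alter the table.
        self._partitions[utc_day] = [dict(row) for row in rows]

    def rows(self, utc_day: str) -> list[dict]:
        """Return a copy of the rows stored for utc_day (empty if none)."""
        return [dict(row) for row in self._partitions.get(utc_day, [])]


table = DayPartitionedTable()

# First export for the day.
table.replace_partition("2026-07-08", [{"row_id": "v1", "value": 3}])

# A later export for the same day (finalized data, or a retroactive
# Page or Feature rule change) replaces the earlier rows.
table.replace_partition("2026-07-08", [{"row_id": "v1", "value": 5}])

print(table.rows("2026-07-08"))  # [{'row_id': 'v1', 'value': 5}]
```

Because the whole partition is replaced, applying the same export twice leaves the table unchanged, which makes retries safe.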

If a data engineer isn't familiar with Pendo data, we recommend that they work with someone in your organization who understands data in Pendo. For information about Pendo event data and metadata, see Events overview and Configure visitor and account metadata.

To design a helpful, long-term foundational table structure, ensure that the data engineer works with someone who understands the end goals of the implementation.

As part of this process, we recommend the following when building your pipeline:

Listen for updates (new Pendo exports) hourly.
Load files in parallel, not sequentially.
Partition data by UTC day to match Data Sync's export pattern.
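
The last two recommendations can be sketched as follows. This is a minimal illustration, not a reference implementation: load_file is a placeholder for your own download-and-load logic, the file names are hypothetical, and the assumption that event timestamps are epoch milliseconds should be checked against Data Sync schema definitions.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone


def utc_day_partition(epoch_ms: int) -> str:
    """Map a timestamp (assumed to be epoch milliseconds) to a UTC day key."""
    moment = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d")


def load_file(path: str) -> str:
    """Placeholder: a real pipeline would download and load the Avro file here."""
    return f"loaded {path}"


def load_all(paths, max_workers: int = 8):
    """Load files concurrently; results come back in the same order as paths."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_file, paths))


print(utc_day_partition(86_399_999))  # 1970-01-01 (last millisecond of the day)
print(utc_day_partition(86_400_000))  # 1970-01-02 (first millisecond of the next day)
print(load_all(["part-0.avro", "part-1.avro"]))
# ['loaded part-0.avro', 'loaded part-1.avro']
```

Threads suit this step because loading files is typically I/O-bound. Deriving the partition key in UTC, rather than in local time, keeps your partitions aligned with the UTC days that Data Sync exports.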

To set up your Data Sync ETL pipeline, see Data Sync event export handling.

Data estimation

If you're not yet a Data Sync customer and want to understand how much data can be synced from Pendo, you can run a data estimate for your accounts, visitors, and application-level event data before purchasing Data Sync.

To run a data estimation:

In Pendo, go to Settings > Data Sync.
Select Run data estimate. This opens the Data to estimate window.
Select which sources you want a data estimate for.
Select Run estimate.

This starts the estimation process in the background and redirects you to the Data Sync page. The estimates may take a few minutes to complete. Pendo notifies you by email when the estimates are ready.

For each application you select to estimate, you receive:

An estimate of the data volume you can expect to receive in cloud storage for one day of event data.
A data volume estimate for a 12-month backfill of historical data.

For accounts and visitors, you receive a data volume estimate of the full sync of account and visitor metadata.

If you're using Snowflake as your data destination, Pendo also estimates the data volume for Snowflake storage. Pendo estimates an approximate 50% rate of compression between cloud storage and Snowflake, but actual compression rates may vary.

Test exports

If you're not yet a Data Sync customer, you can create a single test export containing one day of Pendo event data. This lets your organization's data engineering team see how your Pendo data appears in Data Sync Avro files and plan the ETL pipeline required to pull Pendo data from your cloud storage. Test exports are not available for visitor and account metadata.

To create a test export, go to Settings > Data Sync and start the setup process for one of the supported cloud storage services. After you set up a destination, create an export and choose the Test export option.