Pendo Data Sync allows you to send Pendo data into your data lake or warehouse so that you can combine it with other data sources and incorporate it into AI workflows, machine learning models, and business intelligence dashboards. For more information, see Overview of Pendo Data Sync.
This article describes the schema used in Data Sync exports, which includes event data as well as account and visitor metadata. The two supported export methods are:
- Export to cloud storage. Data is delivered in three separate Avro files: one for events, visitors, and accounts.
- Sync to Snowflake. Data is delivered in structured tables. See the Data Sync to Snowflake ERD for more information on the standard warehouse table schema. To access the ERD, you'll need to provide the password: datasync.
For event data, the underlying data is the same in both formats. For account and visitor metadata, the structure differs between export methods. Formatting differences are noted in the schema tables in this article.
All dates and timestamps are in Coordinated Universal Time (UTC). For more information, see the section Timestamps and time zones in Data Sync export handling.
Events
Data Sync includes all events captured by Pendo, including tagged events like Pages, Features, and Track Events. These are delivered in a combination of unified and event-specific files or tables depending on your export method.
All events are included in a comprehensive export file or table.
Tagged events (Pages, Features, and Track Events) also appear in separate, event-specific exports.
Guide events are included in the all events export.
The table below shows how these event types map to Avro files and Snowflake tables:
| Event type | Avro files | Warehouse table |
|---|---|---|
| All events | allevents.avro |
ALLEVENTS |
| Page events | matchedEvents/Page/{pageId}.avro |
MATCHEDPAGEEVENTS |
| Feature events | matchedEvents/Feature/{featureId}.avro |
MATCHEDFEATUREEVENTS |
| Track Events | matchedEvents/TrackType/{trackId}.avro |
MATCHEDTRACKTYPEEVENTS |
The schema definitions that follow apply to all four event tables: ALLEVENTS, MATCHEDPAGEEVENTS, MATCHEDFEATUREEVENTS, and MATCHEDTRACKTYPEEVENTS. Two fields are exceptions: matchableId and lastUpdatedAt appear only in the matched events tables, not in ALLEVENTS. These exceptions are noted in the field descriptions where relevant.
| Field name | Avro type | Snowflake type | Description |
|---|---|---|---|
matchableId |
STRING | STRING NOT NULL |
ID of the Page, Feature, or Track Event that has a rule matching this event. Not present in the All Events data.
ID contains uppercase letters (A-Z), lowercase letters (a-z), numbers (0-9), hyphens (-), and underscores (_). Maximum character limit is 500. |
periodId |
DATE | NUMBER(38,0) NOT NULL | Convenience field to assist in data warehouse loading, equal to the date portion of the browserTimestamp for a given event. All dates are in UTC. Format is YYYY-MM-DD. |
visitorId |
STRING | STRING NOT NULL |
The visitor who performed the event. Maximum character limit is 500. Can be joined to Visitors and Visitor metadata. Visitor ID set by customer during Pendo installation. |
accountId |
STRING | STRING NOT NULL |
The account associated with the visitor. Maximum character limit is 500. Can be joined to Accounts and Account metadata. An empty string is used when no account information is available. Account ID set by customer during Pendo installation. |
analyticsSessionId |
STRING | STRING | Unique identifier for the web analytics session active when this event was logged. A session begins when no previous session is active in the browser and ends after 30 minutes of inactivity. Sessions in web analytics are tracked using cookies or local storage and can persist across browser tabs within the same domain. 16-digit alphanumeric string. |
browserTimestamp |
LONG | NUMBER(38,0) | Timestamp of the event. These can be loaded into a data warehouse as dates with the use_avro_logical_types flag. |
country |
STRING | STRING | Country associated with the remoteIp. Uses the ISO 3166-1 alpha-2 country code. This field can be blank. |
deletedAt |
LONG | DATE | Date when the object was deleted. Empty if the object hasn't been deleted. |
destinationStepId |
STRING | STRING | Relates to guideAdvanced events that specify the ID of the destination step in the guide showing flow. This can be the previous or next step ID. This shows up in the singleEvents and guideEvents sources. Maximum character limit is 500. |
elementPath |
STRING | STRING | For web events. This is either empty or a CSS-style string specifying the DOM element related to the event. For mobile, this is a JSON-formatted description of the widget related to this event. Maximum character limit is 60,000. |
eventClass |
STRING | STRING | An event can be classified as either ui or track. |
eventId |
STRING | STRING |
A unique key for Pendo Data Sync data can be created by combining
|
eventSource |
STRING | STRING | One of: web, events originating in a web page; mobile, events originating in a mobile app; email, events originating from an email. This value only contains lowercase letters (a–z) and has a maximum length of 9 characters. |
eventType |
STRING | STRING | Type of event. One of: change, visitor has changed an element in the app; click, visitor has clicked on an element in the app; focus, visitor has focused on an element in the app; group, event in response to the group call with a Twilio Segment integration; guideActivity, visitor has interacted with a Pendo guide; guideDismissed, visitor has dismissed a Pendo guide; guideSeen, visitor has seen a Pendo guide; identify, the app identified the visitor or account identity; load, a page was loaded; meta, the app sent visitor and account metadata; pollResponse, visitor submitted a Pendo poll; track, a Track Event was sent. Mobile events include AppButtonClicked, AppScreenViewed, AppSessionStart, AppSessionEnd, and others. |
guideId |
STRING | STRING | Unique identifier of the guide generating guide- and poll-related events. This field can be blank. Maximum character limit is 500. |
guideSeenReason |
STRING | STRING | For the guideSeen event type, why the guide was displayed to the user. |
guideSeenTimeoutMS |
LONG | NUMBER(38,0) | For guideTimeout events, the amount of time that the agent waited to show a specific guide step before sending a guideTimeout event. |
guideSessionId |
STRING | STRING | Identifiers of the list of guides and other deliverables that were loaded together. This ID changes every time guides are requested by the client and events that happen between each load carry the same ID. Maximum character limit is 500. |
guideSnoozeDurationMS |
LONG | NUMBER(38,0) | For the guideSnoozed event type, the amount of time the guide was snoozed for in milliseconds. |
guideStepId |
STRING | STRING | Unique identifier of the guide step generating the guide- or poll-related event. This field can be blank. Maximum character limit is 500. |
language |
STRING | STRING | For the guideSeen event type, the language of the guide being displayed. This field can be blank. Uses https://datatracker.ietf.org/doc/html/bcp47 language tags (for example, en-US). |
latitude |
FLOAT | FLOAT | The latitude of the remoteIp for the event. This field is blank if remoteIp not being collected by Pendo. |
lastUpdatedAt |
n/a | NUMBER(38,0) | Snowflake only. Epoch timestamp for when the matched event was last included in a transfer. This field updates every time a matched event is included in a transfer, including during retroactive processing. Use this field to identify fresh compared to stale data. |
longitude |
FLOAT | FLOAT | The longitude of the remoteIp for the event. This field is blank if remoteIp not being collected by Pendo. |
loadDurationMS |
LONG | NUMBER(38,0) | For load events, the amount of time it took for the webpage to render in milliseconds. This doesn't include the time it takes for dynamic parts of the page to load. |
oldVisitorId |
STRING | STRING | The anonymous Visitor ID that was previously assigned to the visitor before they were identified by Pendo through authentication. |
pollId |
STRING | STRING | The identifier of the poll that generated any pollResponse events. Maximum character limit is 500. The poll object associated with this identifier can be found in the allguides.avro file, specifically for the guide with the identifier guideId (from this event) in the polls list. |
pollResponse |
STRING | STRING | The JSON-formatted response to the poll that generated a pollResponse event. It is empty unless the eventType is pollResponse. This can be an index into data that's only available in the poll itself. For responses that map to a set of custom responses defined in the poll, the value of pollResponse is an index into a map of responses that can be found in the allguides.avro file, specifically for the guide with the identifier guideId (from this event) and the poll with the identifier pollId (from this event) in the polls list. Maximum character limit is 5,000. |
pollType |
STRING | STRING | The type of poll that generated a pollResponse event. One of: NumberScale, PositiveNegative, FreeForm, PickList. |
propertiesJson |
STRING | STRING | A JSON-formatted map of all the user-defined event properties for this event. |
region |
STRING | STRING | The United States region of the remoteIp for the event. This field can be blank. Presented as a two-letter code for the state. |
remoteIp |
STRING | STRING | The remoteIp that generated the event. If not collecting this data, 0.0.0.0 is stored instead. This can also be an ipv6 address. Some proxies and mobile networks prevent useful IP addresses being collected. |
server |
STRING | STRING | The server name portion of the URL for the event. This field can be blank. Value is a URL-safe string (per https://datatracker.ietf.org/doc/html/rfc3986) and can be up to 60,000 characters. |
uiElementActions |
STRING | STRING | The element actions, such as openLink or guideSnoozed, associated with a guide interaction where a guideActivity event is sent by the agent. This appears in the following sources: singleEvents and guideEvents. The value is URL-safe (Base64-style) encoded (per https://datatracker.ietf.org/doc/html/rfc4648) and can be up to 60,000 characters. |
uiElementId |
STRING | STRING | The guide element's unique identifier when a UI element inside a guide is clicked on, which sends a guideActivity event to the agent. This appears in the following sources: singleEvents, guideEvents, guideElementClick, and guideElementClickEver. Maximum character limit is 60,000. |
uiElementText |
STRING | STRING | The guide element's text for when the agent sends the field as part of a guideActivity event. When a guideActivity event is sent with the guide element's text, ui_element_text should appear in the following sources: singleEvents and guideEvents. If a guideActivity event is sent without the ui_element_text field, we still process it. If a subscription has opted to exclude all text, ui_element_text isn't stored, even if the event is sent with it. Maximum character limit is 60,000. |
uiElementType |
STRING | STRING | The type of element that was clicked when a guideActivity event is being sent. This appears in the following sources: singleEvents and guideEvents. This can be any valid HTML element tag (for example, button) and can be up to 60,000 characters. |
url |
STRING | STRING | The normalized URL for the page that generated a web event. For mobile events, the URL is a JSON representation of the screen structure. |
userAgent |
STRING | STRING | The user agent string from the HTTPS request when a web event is received. For mobile events, this is the textual representation of the device type that generated the event. userAgent values are included on all events and used in aggregation and segmentation. Pendo uses its own copy of a third-party user agent parsing library to extract details like browser, OS, and device type. If the incoming string does not match any known patterns in the library, it is recorded as Unknown. |
Page, Feature, Track Event, and Guide definitions
The definitions for Pages, Features, and Track Events are included in their own files or tables, which can be joined to the event data using the matchableId.
The following table shows where each definition appears, depending on the export method:
| Business object | Avro file | Warehouse table |
|---|---|---|
| Pages | allpages.avro |
PAGES |
| Features | allfeatures.avro |
FEATURES |
| Track Events | alltracktypes.avro |
TRACKTYPES |
| Guides | allguides.avro |
GUIDES |
Pages
| Field name | Avro type | Snowflake type | Description |
|---|---|---|---|
pageId |
STRING | STRING NOT NULL | Page identifier. Maximum character limit is 500. |
kind |
STRING | STRING | Description of the type of object. This will always be Page. |
name |
STRING | STRING | The name given to the Page. Maximum character limit is 1500. |
lastUpdatedAt |
LONG | NUMBER(38,0) | Epoch timestamp for when the Page was last updated in milliseconds. |
isCoreEvent |
BOOLEAN | BOOLEAN | Whether the event is a Pendo Core Event. |
deletedAt |
LONG | DATE | Date when the object was deleted. Empty if the object hasn't been deleted. |
rulesUpdatedAt |
n/a | NUMBER(38,0) | Snowflake only. Epoch timestamp for when the Page had its rules updated. |
softDeleteMethod |
n/a | NUMBER(38,0) | Snowflake only. Indicates which soft delete approach applies (1 = original, 2 = new timestamp comparison). |
rulesJson |
STRING | STRING | This field is only used by the Pendo Classic (legacy) Designer and is expected to be empty for most pages. Is a UTF-8 JSON string. |
Features
| Field name | Avro type | Snowflake type | Description |
|---|---|---|---|
featureId |
STRING | STRING NOT NULL | Feature identifier (unique per subscription). Maximum character limit is 500. |
kind |
STRING | STRING | Description of the type of object. This will always be Feature. |
name |
STRING | STRING | The name given to the Feature. |
lastUpdatedAt |
LONG | NUMBER(38,0) | Epoch timestamp for when the Feature was last updated in milliseconds. |
isCoreEvent |
BOOLEAN | BOOLEAN | Whether the event is a Pendo Core Event. |
deletedAt |
LONG | DATE | Date when the object was deleted. Empty if the object hasn't been deleted. |
rulesUpdatedAt |
n/a | NUMBER(38,0) | Snowflake only. Epoch timestamp for when the Feature had its rules updated. |
softDeleteMethod |
n/a | NUMBER(38,0) | Snowflake only. Indicates which soft delete approach applies (1 = original, 2 = new timestamp comparison). |
Track Events
| Field name | Avro type | Snowflake type | Description |
|---|---|---|---|
trackTypeId |
STRING | STRING NOT NULL | Track Event identifier (unique per subscription). Maximum character limit is 500. |
kind |
STRING | STRING | Description of the type of object. This will always be TrackType. |
name |
STRING | STRING | The name given to the Track Event. |
trackTypeName |
STRING | STRING | The display name of the Track Event. |
lastUpdatedAt |
LONG | NUMBER(38,0) | Epoch timestamp for when the Track Event was last updated in milliseconds. |
isCoreEvent |
BOOLEAN | BOOLEAN | Whether the event is a Pendo Core Event. |
trackTypeRules |
STRING | ARRAY | List of rules used to define the Track Event. |
eventPropertyNames |
STRING | ARRAY | The names of the Track Event properties included. |
deletedAt |
LONG | DATE | Date when the object was deleted. Empty if the object hasn't been deleted. |
rulesUpdatedAt |
n/a | NUMBER(38,0) | Snowflake only. Epoch timestamp for when the Track Event had its rules updated. This value is set when the Track Event is created and doesn't update after that. |
softDeleteMethod |
n/a | NUMBER(38,0) | Snowflake only. Indicates which soft delete approach applies (1 = original, 2 = new timestamp comparison). |
Guides
| Field name | Avro type | Snowflake type | Description |
|---|---|---|---|
guideId |
STRING | STRING | Guide identifier (unique per subscription). Maximum character limit is 500. |
kind |
STRING | STRING | Description of the type of object. This will always be Guide. |
lastUpdatedAt |
LONG | NUMBER(38,0) | Epoch timestamp for when the guide was last updated in milliseconds. |
createdAt |
LONG | NUMBER(38,0) | Epoch timestamp for when the guide was created in milliseconds. |
state |
STRING | STRING | The visibility state of the guide: draft, staged, public, or disabled. |
name |
STRING | STRING | The name given to the Guide. |
emailState |
STRING | STRING | The state of email backup for NPS: draft when disabled, and public when enabled. |
launchMethod |
STRING | STRING | The set of launch methods a guide might use, delineated by a hyphen. |
isMultiStep |
BOOLEAN | BOOLEAN | Whether a guide has more than one step. |
isTraining |
BOOLEAN | BOOLEAN | Whether the guide belongs to an "Adopt for Partners" end-user application. |
recurrence |
LONG | NUMBER(38,0) | The recurrence period for an NPS guide in milliseconds. |
recurrenceEligibilityWindow |
LONG | NUMBER(38,0) | The length of time in milliseconds for which an individual visitor is eligible for an NPS guide when even distribution is enabled. |
attributeJson |
STRING | STRING | JSON representation of guide attributes, including the type of guide, the badge description, the types of devices the guide is enabled for, and the last version of the Visual Design Studio that the guide was edited on. |
audience |
STRING | STRING | The logic defining the visitors targeted by the guide. |
audienceUiHint |
STRING | STRING | A more human-readable representation of the segment that was applied to the guide. |
resetAt |
LONG | NUMBER(38,0) | The timestamp for when the guide was last reset. |
publishedAt |
LONG | NUMBER(38,0) | The timestamp for when the guide was most recently published. |
steps |
RECORD | ARRAY |
Guide steps containing the following values:
|
polls |
RECORD | ARRAY |
Poll questions containing the following values:
|
deletedAt |
LONG | DATE | Date when the object was deleted. Empty if the object hasn't been deleted. |
Accounts and Visitors
Definitions for accounts and visitors appear in both the main object tables and supplemental metadata files or tables, depending on your export method. The table below shows where each type of data appears:
| Data type | Avro file | Warehouse table |
|---|---|---|
| Accounts | accounts.avro |
ACCOUNTS |
| Account metadata | accountmetadata.avro |
ACCOUNTMETADATA |
| Visitors | visitors.avro |
VISITORS |
| Visitor metadata | visitormetadata.avro |
VISITORMETADATA |
Accounts
| Field name | Avro type | Snowflake type | Description |
|---|---|---|---|
id |
STRING | STRING NOT NULL | Unique identifier for the account. Maximum character limit is 500. |
auto.id |
STRING | STRING NOT NULL | The same unique identifier as id. Included for schema normalization in Snowflake exports. Maximum character limit is 500. |
deletedAt |
LONG | DATE | Date when the object was deleted. Empty if the object hasn't been deleted. |
firstvisitMS |
LONG | NUMBER(38,0) | The timestamp in milliseconds when an event was first captured for the account. |
idhash |
INT | NUMBER(38,0) | The hash of the account ID. |
lastvisitMS |
LONG | NUMBER(38,0) | The timestamp in milliseconds when an event was last recorded for the account. |
lastupdatedMS |
LONG | NUMBER(38,0) | The timestamp in milliseconds when the account was last updated. |
value |
VARIANT | VARIANT | The value associated with this metadata field. Format and type vary depending on the field's definition in the metadata schema. |
<fieldname>_<APP_ID> |
Varies | Varies | If in a multi-application subscription, Pendo sends <fieldname>_<APP_ID> values for any of the above fields for which values may vary among applications. Metadata fields for which this is possible will have isPerApp set to true in the metadata schema file (see below). |
agent_<fieldname> |
Varies | Varies | For all agent metadata fields, a field will be created named agent_<fieldname> in the account avro file. See below for a mapping of Pendo metadata field types to avro types. |
<metadata_group> |
STRING | STRING | For any metadata group other than agent, we will send a JSON representation of the metadata group and all its fields. The metadata schema file will contain the name, type, and other information for each field in the metadata group. |
incompleteFields |
LIST of STRINGs | ARRAY | List of metadata fields with values that were incompatible with the type specified in your metadata schema. |
Visitors
| Field name | Avro type | Snowflake type | Description |
|---|---|---|---|
id |
STRING | STRING NOT NULL | Unique identifier for the visitor. Maximum character limit is 500. |
accountids |
ARRAY | ARRAY | List of STRING values that are unique account IDs to which the visitor belongs. For each account, maximum character limit is 500. |
accountid |
STRING | STRING | The account ID last associated with the visitor. Maximum character limit is 500. |
deletedAt |
LONG | DATE | Date when the object was deleted. Empty if the object hasn't been deleted. |
firstvisitMS |
LONG | NUMBER(38,0) | The timestamp in milliseconds when an event was first captured for the visitor. |
idhash |
INT | NUMBER(38,0) | The hash of the visitor ID. |
lastbrowsername |
STRING | STRING | The most recent browser name. This value only contains uppercase (A-Z) or lowercase letters (a-z) and has a maximum length of 13 characters. |
lastbrowserversion |
STRING | STRING | The most recent browser version. Maximum character limit is 5,000. |
lastoperatingsystem |
STRING | STRING | The most recent operating system. Maximum character limit is 5,000. |
lastservername |
STRING | STRING | The most recent server name. Maximum character limit is 60,000. |
lastvisitMS |
LONG | NUMBER(38,0) | The timestamp in milliseconds when an event was last recorded for the visitor. |
lastupdatedMS |
LONG | NUMBER(38,0) | The timestamp in milliseconds when the visitor was last updated. |
lastuseragent |
STRING | STRING | The most recent user agent (unparsed). Maximum character limit is 512. |
lastvisitorid |
STRING | STRING | The most recent visitor ID. Maximum character limit is 500. |
identifiedvisitoratMS |
LONG | NUMBER(38,0) | If an anonymous visitor ID has been merged with an identified visitor ID with Pendo identity mapping, this field will show the timestamp in milliseconds when the visitor was identified. |
identifiedvisitorid |
STRING | STRING | If an anonymous visitor ID has been merged with an identified visitor with Pendo identity mapping, this field will be updated with the identified visitor ID. Utilizing the identified visitor IDs allows for a unified view of user journeys. Maximum character limit is 500. |
value |
VARIANT | VARIANT | The value associated with this metadata field. Format and type vary depending on the field's definition in the metadata schema. |
<fieldname>_<APP_ID> |
Varies | Varies | If in a multi-application subscription, Pendo sends <fieldname>_<APP_ID> values for any of the above fields for which values may vary among applications. Metadata fields for which this is possible will have isPerApp set to true in the metadata schema file (see below). |
agent_<fieldname> |
Varies | Varies | For all agent metadata fields, a field will be created named agent_<fieldname> in the visitor avro file. See below for a mapping of Pendo metadata field types to avro types. |
<metadata_group> |
STRING | STRING | For any metadata group other than agent, we will send a JSON representation of the metadata group and all its fields. The metadata schema file will contain the name, type, and other information for each field in the metadata group. |
incompleteFields |
LIST of STRINGs | ARRAY | List of metadata fields with values that were incompatible with the type specified in your metadata schema. |
Metadata
Account and visitor metadata is represented differently depending on your export method.
Cloud storage
Each export delivers two types of files:
Entity files (
accounts.avroandvisitors.avro). One row for each entity. Metadata values are embedded as dynamic columns: agent metadata group fields appear asagent_<fieldname>, while other groups (for example, custom and Salesforce) appear as a single JSON-encoded string column named after the group.Metadata schema files (
accountmetadata.avroandvisitormetadata.avro). One row for each metadata field definition. These files contain no entity IDs and no per-entity values. Use theavroFieldNamefield to map column names in the entity file back to their metadata definition.
Snowflake
Data Sync generates a Snowflake-specific internal file format that differs from the generic cloud storage metadata schema files. This format is long (normalized): one row for each entity and metadata field, including both the entity ID and the per-entity value.
The ACCOUNTMETADATA and VISITORMETADATA tables are built from this Snowflake format. They're not a direct copy of the cloud-storage accountmetadata.avro and visitormetadata.avro files.
Note: The Snowflake stored procedures ACCOUNTMETADATA_LOAD_AND_MERGE and VISITORMETADATA_LOAD_AND_MERGE reference AccountID and VisitorID because they load from this Snowflake-specific long-format file, not from accountmetadata.avro or visitormetadata.avro.
Sample records: Comparing cloud storage and Snowflake
The following sample records show the structural differences between the two export methods.
Sample cloud storage record (accountmetadata.avro or visitormetadata.avro). This is a schema definition only, with no entity ID and no value.
group |
name |
avroFieldName |
displayName |
type |
isDeleted |
isPerApp |
|---|---|---|---|---|---|---|
custom |
role |
agent_role |
Role |
string |
false |
false |
Sample Snowflake record (VISITORMETADATA). Includes the entity ID and the per-entity value, with one row per visitor and metadata field combination.
visitorID |
metadataGroup |
name |
displayName |
type |
isDeleted |
isPerApp |
value |
|---|---|---|---|---|---|---|---|
visitor_abc |
custom |
role |
Role |
string |
false |
false |
"admin" |
Metdata schema fields
The following table describes the fields in the metadata schema. Fields that are specific to one export method are marked as cloud storage only or Snowflake only.
The cloud storage metadata schema files (accountmetadata.avro and visitormetadata.avro) contain field definitions only, with no entity IDs and no per-entity values. The avroFieldName column links each field definition to its corresponding column in accounts.avro or visitors.avro. The Snowflake ACCOUNTMETADATA and VISITORMETADATA tables instead include entity IDs and values directly, one row for each entity and field combination.
| Field name | Avro type | Snowflake type | Description |
|---|---|---|---|
accountId or visitorId
|
n/a | STRING NOT NULL | Entity identifier. Snowflake only. Not present in accountmetadata.avro or visitormetadata.avro. |
avroFieldName |
STRING | n/a | Cloud storage only. The name of the field in the Avro entity file (accounts.avro or visitors.avro). Use this field to map a metadata definition back to its corresponding column in the entity file. Maximum character limit is 5,002. This field doesn't exist in the Snowflake ACCOUNTMETADATA or VISITORMETADATA tables. |
name |
STRING | STRING NOT NULL | The name of the metadata field in Pendo. Maximum character limit is 5000. |
group |
STRING | STRING | The name of the group of the metadata field in Pendo. Maximum character limit is 5,000. This field is named metadataGroup in the Snowflake table. |
displayName |
STRING | STRING | The display name of the metadata field in Pendo. Maximum character limit is 5000. |
type |
STRING | STRING | The data type of the field. Options include string, int, float, boolean, time, or list. |
elementType |
STRING | STRING | The data type of each element, if type is list. Empty if not applicable. This value only contains lowercase letters (a-z) and has a maximum length of 7 characters. |
elementFormat |
STRING | STRING | The data format of the metadata field. For example, if type is time, then elementFormat may be milliseconds. Empty if not applicable. Maximum character limit is 40. |
isDeleted |
BOOLEAN | BOOLEAN NOT NULL DEFAULT FALSE | True if the field has been deleted. |
isPerApp |
BOOLEAN | BOOLEAN NOT NULL DEFAULT FALSE | True if the field can exist for each application (only possible for multi-application subscriptions). |
value |
n/a | VARIANT | Per-entity metadata value. Snowflake only. For cloud storage, per-entity values are stored as dynamic columns in accounts.avro or visitors.avro (agent group fields as agent_<fieldname>; other groups as a JSON-encoded string column named after the group). |
Reproduce Snowflake metadata tables from Avro exports
If you use cloud storage exports and want to build the equivalent of Snowflake's VISITORMETADATA or ACCOUNTMETADATA tables, you can reconstruct them from the Avro files using the following steps.
- Load
visitors.avrooraccounts.avro. Each file contains one row per entity, with metadata values as dynamic columns. - Load
visitormetadata.avrooraccountmetadata.avro. Each file contains one row per metadata field definition. - Unpivot the dynamic columns from step 1 into rows with the following structure:
(visitorId/accountId, group, name, value).-
agent_<fieldname>columns map to theagentmetadata group, where the field name is<fieldname>. -
<group>JSON string columns require parsing the JSON to extract individual field names and values.
-
- Join the result of step 3 to step 2 on
avroFieldNameto adddisplayName,type,elementType,elementFormat,isDeleted, andisPerAppto each row. - The result matches the structure of the Snowflake
VISITORMETADATAorACCOUNTMETADATAtable.
Type mapping
Pendo field types can be mapped to Avro and Snowflake data types in the following way:
| Pendo field type | Field type in Pendo Data Mappings | Avro type | Snowflake type |
|---|---|---|---|
| string | Text (string) | STRING | STRING |
| int | Number (int) | INT | NUMBER(38,0) |
| float | Number (float) | FLOAT | FLOAT |
| boolean | Boolean (boolean) | BOOLEAN | BOOLEAN |
| time | Date (time) | LONG (with logical type of milliseconds) | NUMBER(38,0) |
| list | List (list) | ARRAY | ARRAY |