Pendo Data Sync allows you to export avro files to a Google Cloud Storage (GCS) bucket or an Amazon S3 bucket. We use the avro file type because it's well-suited for use in "extract, transform, load" (ETL) pipelines. This article describes the avro files that are exported as part of the Data Sync process and the values contained in each type of avro file.
Data sync exports
Exports include event files and event type definitions, as well as contextual information so that you can get full value out of the events without having to do additional lookups or API calls. To this end, exports contain:
- An event file for all events, even those your team has yet to define.
- An event file for each defined event type ("matchable"): Page, Feature, and Track Event.
- A definition file for each event type, plus Guide Event, including metadata and details about each defined event.
- An export manifest that references all the files outlined above. This is to help you load data from your cloud storage into your data warehouse.
Dates and timestamps are in Coordinated Universal Time (UTC).
Events file schema
Our events file schema defines the structure and format of the events data we export in avro files for Data Sync. The schema applies to all types of event file:
- All Events
- Page
- Feature
- Track Event
Guide events are contained within the above event-type files, and so don't have their own event file, but do have their own definition file.
The schema definition that follows includes information about the different types of events that can be generated, the fields that are associated with each event type, and the data types that are used to define each field.
Metric | Type | Description |
---|---|---|
matchableId | STRING | Reference to the matchable (Page, Feature, or Track Event) that has a rule matching this event. Not present in the All Events file. |
periodId | INTEGER | Convenience field to assist in data warehouse loading, equal to the date portion of the browserTimestamp for a given event. All dates are in UTC. |
visitorId | STRING | The Visitor ID for the event. |
accountId | STRING | The Account ID for the event. An empty string is used when no account information is available. |
browserTimestamp | INTEGER | Timestamp of the event. Timestamps can be loaded into a data warehouse as dates with the use_avro_logical_types flag. |
country | STRING | Country associated with the remoteIp. This field can be blank. |
destinationStepId | STRING | For guideAdvanced events, the ID of the destination step in the guide flow. This can be the previous or next step ID. This appears in the singleEvents and guideEvents sources. |
elementPath | STRING | For web events, this is either empty or a CSS-style string specifying the DOM element related to the event. For mobile events, this is a JSON-formatted description of the widget related to the event. |
eventClass | STRING | Either ui or track. |
eventId | STRING | Unique event identifier. |
eventSource | STRING | Event sources include: email, events that originated from an email; mobile, events originating in a mobile app; pendo, events generated on the Pendo platform; web, events originating in a web page. |
eventType | STRING | Type of event. Examples include: change, click, focus, group, guideActivity, guideDismissed, guideSeen, identify, load, meta, pollResponse, recording, and synthetic. |
guideId | STRING | Unique identifier of the guide generating guide- and poll-related events. This field can be blank. |
guideSeenReason | STRING | For the guideSeen event type, the reason the guide was displayed to the user. |
guideSeenTimeoutMS | INTEGER | For guideTimeout events, the amount of time that the agent waited to show a specific guide step before sending a guideTimeout event. |
guideSessionId | STRING | Identifier of the list of guides and other deliverables that were loaded together. This ID changes every time the client requests guides, and events that happen between loads carry the same ID. |
guideSnoozeDurationMS | INTEGER | For the guideSnoozed event type, the amount of time, in milliseconds, that the guide was snoozed. |
guideStepId | STRING | Unique identifier of the guide step generating the guide- or poll-related event. This field can be blank. |
language | STRING | For the guideSeen event type, the language of the guide being displayed. This field can be blank. |
latitude | FLOAT | The latitude of the remoteIp for the event. This field can be blank. |
longitude | FLOAT | The longitude of the remoteIp for the event. This field can be blank. |
loadDurationMS | INTEGER | For load events, the amount of time, in milliseconds, that the webpage took to render. This doesn't include the time it takes for dynamic parts of the page to load. |
pollId | STRING | The identifier of the poll that generated any pollResponse events. |
pollResponse | STRING | The JSON-formatted response to the poll that generated a pollResponse event. This can be an index into data that's only available in the poll itself. |
pollType | STRING | The type of poll that generated a pollResponse event. This could be NumberScale, PositiveNegative, FreeForm, or PickList. |
propertiesJson | STRING | A JSON-formatted map of all the user-defined event properties for this event. |
region | STRING | The United States region, presented as a two-letter code for the state, of the remoteIp for the event. This field can be blank. |
remoteIp | STRING | The IP address that generated the event. If this data isn't collected, 0.0.0.0 is stored instead. This can also be an IPv6 address. Some proxies and mobile networks prevent useful IP addresses from being collected. |
server | STRING | The server name portion of the URL for the event. This field can be blank. |
uiElementActions | STRING | The element actions, such as openLink or guideSnoozed, associated with a guide interaction where a guideActivity event is sent by the agent. This appears in the following sources: singleEvents and guideEvents. |
uiElementId | STRING | The guide element's unique identifier when a UI element inside a guide is clicked on, which sends a guideActivity event to the agent. This appears in the following sources: singleEvents, guideEvents, guideElementClick, and guideElementClickEver. |
uiElementText | STRING | The guide element's text for when the agent sends the field as part of a guideActivity event. When a guideActivity event is sent with the guide element's text, ui_element_text should appear in the following sources: singleEvents and guideEvents. If a guideActivity event is sent without the ui_element_text field, we still process it. If a subscription has opted to exclude all text, ui_element_text isn't stored, even if the event is sent with it. |
uiElementType | STRING | The type of element that was clicked when a guideActivity event is being sent. This appears in the following sources: singleEvents and guideEvents. |
url | STRING | The normalized URL for the page that generated a web event. For mobile events, the URL is a JSON representation of the screen structure. |
userAgent | STRING | The user agent from the HTTPS request when a web event is received. For mobile events, the userAgent is the textual representation of the device type that generated the event. Both types of value are properly parsed by the user agent parsing functions with aggregations. |
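
The blank-value conventions in the table above can be normalized once the avro records are decoded. The following is a minimal sketch, assuming each event has already been read into a Python dict keyed by the field names above. The `normalize_event` helper, the added `properties` key, and the sample values are hypothetical and aren't part of Data Sync.

```python
import json

# Per the schema table, this value is stored when IP data isn't collected.
NOT_COLLECTED_IP = "0.0.0.0"


def normalize_event(event: dict) -> dict:
    """Return a copy of an event with blank markers replaced by None.

    Hypothetical helper: assumes `event` is a dict decoded from an
    events file and keyed by the field names in the schema table.
    """
    out = dict(event)
    # accountId is an empty string when no account information is available.
    if out.get("accountId") == "":
        out["accountId"] = None
    # remoteIp is 0.0.0.0 when IP addresses aren't collected.
    if out.get("remoteIp") == NOT_COLLECTED_IP:
        out["remoteIp"] = None
    # propertiesJson is a JSON-formatted map of user-defined properties.
    raw = out.get("propertiesJson")
    out["properties"] = json.loads(raw) if raw else {}
    return out


# Made-up sample record for illustration only.
sample = {
    "eventId": "evt-1",
    "visitorId": "visitor-42",
    "accountId": "",
    "remoteIp": "0.0.0.0",
    "propertiesJson": '{"plan": "pro", "seats": 5}',
}
normalized = normalize_event(sample)
print(normalized["accountId"], normalized["remoteIp"], normalized["properties"])
```

Keeping the original `propertiesJson` string alongside the parsed map makes it possible to reload the raw value later if the parsing rules change.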
Event type definitions
The definitions for Pages, Features, Track Events, and Guides are each included in their own avro file, which can be joined to the event data contained in the relevant event file.
Pages
Metric | Type | Description |
---|---|---|
pageId | STRING | Page identifier. |
kind | STRING | Description of the type of object. This will always be Page. |
lastUpdatedAt | INTEGER | Epoch timestamp for when the Page was last updated in milliseconds. |
createdAt | INTEGER | Epoch timestamp for when the Page was created in milliseconds. |
rulesJson | STRING | The regex rules that define the Page. |
name | STRING | The name given to the Page. |
isCoreEvent | BOOLEAN | Whether the event is a Pendo Core Event. |
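
One way to use a definition file is to look up the Page name for each event in a Page event file. The sketch below assumes that matchableId in a Page event holds the same value as pageId in the Page definitions, and that both files are already decoded into Python dicts. The sample IDs, the `page_names` lookup, and the `pageName` key are hypothetical.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(ms: int) -> datetime:
    """Convert an epoch timestamp in milliseconds to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Made-up records standing in for decoded avro rows.
page_definitions = [
    {"pageId": "page-abc", "kind": "Page", "name": "Dashboard",
     "createdAt": 1700000000000, "lastUpdatedAt": 1700003600000},
]
page_events = [
    {"eventId": "evt-1", "matchableId": "page-abc", "visitorId": "visitor-42"},
    {"eventId": "evt-2", "matchableId": "page-zzz", "visitorId": "visitor-7"},
]

# Build a pageId -> name lookup, then attach the name to each event.
# Events whose matchableId has no definition get pageName = None.
page_names = {d["pageId"]: d["name"] for d in page_definitions}
enriched = [
    {**e, "pageName": page_names.get(e["matchableId"])} for e in page_events
]

created = epoch_ms_to_utc(page_definitions[0]["createdAt"])
print(enriched[0]["pageName"], created.isoformat())
```

The same pattern would apply to Features and Track Events by swapping pageId for featureId or trackTypeId, under the same assumption about matchableId.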
Features
Metric | Type | Description |
---|---|---|
featureId | STRING | Feature identifier (unique per subscription). |
kind | STRING | Description of the type of object. This will always be Feature. |
lastUpdatedAt | INTEGER | Epoch timestamp for when the Feature was last updated in milliseconds. |
createdAt | INTEGER | Epoch timestamp for when the Feature was created in milliseconds. |
pageId | STRING | The Page identifier containing the Feature. |
name | STRING | The name given to the Feature. |
isCoreEvent | BOOLEAN | Whether the event is a Pendo Core Event. |
Track Events
Metric | Type | Description |
---|---|---|
trackTypeId | STRING | Track Event identifier (unique per subscription). |
kind | STRING | Description of the type of object. This will always be TrackType. |
lastUpdatedAt | INTEGER | Epoch timestamp for when the Track Event was last updated in milliseconds. |
createdAt | INTEGER | Epoch timestamp for when the Track Event was created in milliseconds. |
eventPropertyNames | STRING | The names of the Track Event properties included. |
name | STRING | The name given to the Track Event. |
isCoreEvent | BOOLEAN | Whether the event is a Pendo Core Event. |
Guides
Metric | Type | Description |
---|---|---|
guideId | STRING | Guide identifier (unique per subscription). |
kind | STRING | Description of the type of object. This will always be Guide. |
lastUpdatedAt | INTEGER | Epoch timestamp for when the guide was last updated in milliseconds. |
createdAt | INTEGER | Epoch timestamp for when the guide was created in milliseconds. |
state | STRING | The visibility state of the guide: draft, staged, public, or disabled. |
name | STRING | The name given to the Guide. |
emailState | STRING | The state of email backup for NPS: draft when disabled, and public when enabled. |
launchMethod | STRING | The set of launch methods a guide might use, delimited by a hyphen. |
isMultiStep | BOOLEAN | Whether a guide has more than one step. |
isTraining | BOOLEAN | Whether the guide belongs to an "Adopt for Partners" end-user application. |
recurrence | INTEGER | The recurrence period for an NPS guide in milliseconds. |
recurrenceEligibilityWindow | INTEGER | The length of time in milliseconds for which an individual visitor is eligible for an NPS guide when even distribution is enabled. |
attributesJson | STRING | JSON representation of guide attributes, including the type of guide, the badge description, the types of devices the guide is enabled for, and the last version of the Visual Design Studio that the guide was edited on. |
audienceUiHint | STRING | A more human-readable representation of the segment that was applied to the guide. |
resetAt | INTEGER | The timestamp for when the guide was last reset. |
publishedAt | INTEGER | The timestamp for when the guide was most recently published. |
steps | RECORD | Guide steps containing STRING values for guideStepId, name, pageId, and appRelayUrl. |
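
Two of the Guide fields need a little unpacking before use: launchMethod is a hyphen-delimited string, and steps holds nested records. The sketch below assumes a guide definition has been decoded into a Python dict and that steps arrives as a list of dicts. The helper names and the sample values, including the launch method names, are made up for illustration.

```python
def launch_methods(guide: dict) -> list:
    """Split the hyphen-delimited launchMethod string into a list."""
    raw = guide.get("launchMethod") or ""
    return [method for method in raw.split("-") if method]


def step_names(guide: dict) -> dict:
    """Map each guideStepId to its step name.

    Assumes `steps` decodes to a list of dicts (an assumption about
    how the RECORD type is represented after decoding).
    """
    return {s["guideStepId"]: s["name"] for s in guide.get("steps") or []}


# Made-up guide definition for illustration only.
guide = {
    "guideId": "guide-Y",
    "kind": "Guide",
    "launchMethod": "auto-badge",
    "steps": [
        {"guideStepId": "step-1", "name": "Welcome",
         "pageId": "page-abc", "appRelayUrl": ""},
        {"guideStepId": "step-2", "name": "Finish",
         "pageId": "page-abc", "appRelayUrl": ""},
    ],
}
print(launch_methods(guide), step_names(guide))
```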
Example
The events file schema applies to all files for each of the event types: All Events, Pages, Features, and Track Events. There's no Guide-specific file. Instead, guide events are included in the All Events file, as well as the relevant Pages, Features, and Track Events files.
For example, if "Guide Y" is launched on "Page X", a guideActivity event would be present in the All Events file and in the event stream for Page X with guideId set to Y. If an event is not a guideActivity event, the fields associated with guide events are blank.
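Because guide events share files with other events, picking them out is a filter on guideId. The following is a minimal sketch, assuming events are decoded into Python dicts and that a blank field arrives as either an empty string or a null value. The helper names and sample records are hypothetical.

```python
def is_blank(value) -> bool:
    """Treat None and the empty string as blank (an assumption)."""
    return value is None or value == ""


def events_for_guide(events: list, guide_id: str) -> list:
    """Return the events whose guideId matches the given guide."""
    return [e for e in events if e.get("guideId") == guide_id]


def guide_related(events: list) -> list:
    """Return every event that carries a non-blank guideId."""
    return [e for e in events if not is_blank(e.get("guideId"))]


# Made-up rows standing in for the event stream of "Page X".
page_x_events = [
    {"eventId": "evt-1", "eventType": "load", "guideId": ""},
    {"eventId": "evt-2", "eventType": "guideActivity", "guideId": "Y"},
    {"eventId": "evt-3", "eventType": "click", "guideId": None},
]
print([e["eventId"] for e in events_for_guide(page_x_events, "Y")])
```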
Additionally, there can be more than one event file for each type of event. For example, if your application has three tagged Pages, two tagged Features, and two defined Track Events, you would receive 12 avro files per export:
- Guide definitions (allguides.avro)
- Page definitions (allpages.avro)
- Feature definitions (allfeatures.avro)
- Track Event definitions (alltracktypes.avro)
- All Events file (allevents.avro)
- Three Page event files (page1.avro, page2.avro, page3.avro)
- Two Feature event files (feature1.avro, feature2.avro)
- Two Track Event files (tracktype1.avro, tracktype2.avro)
The file names in this list are for illustrative purposes. The Page, Feature, and Track Event file names would reflect the appropriate ID that can be found in the billofmaterials.json for each export.
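The count in the example above can be checked with a little arithmetic. This sketch assumes, as in the example, four definition files, one All Events file, and exactly one event file per tagged Page, tagged Feature, and defined Track Event. The function name is hypothetical.

```python
DEFINITION_FILES = 4  # guide, page, feature, and track type definitions
ALL_EVENTS_FILES = 1  # the single All Events file


def expected_avro_files(pages: int, features: int, track_events: int) -> int:
    """Count the avro files in one export under the stated assumptions."""
    return DEFINITION_FILES + ALL_EVENTS_FILES + pages + features + track_events


# Three tagged Pages, two tagged Features, two defined Track Events.
total = expected_avro_files(pages=3, features=2, track_events=2)
print(total)  # 12
```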