Internet-accessible applications are vulnerable to web crawlers. Large amounts of bot traffic can skew analytics for anonymous visitors and inflate your Monthly Active Users (MAU) count. This article describes ways to prevent bot traffic and to exclude bots from your usage data.
Prevent bot traffic
If you’re encountering a significant amount of bot traffic, you can filter out new bot activity by turning on Block data from known bots in your subscription settings. For a list of known web crawlers that Pendo drops when this setting is turned on, see Known bots dropped by Pendo.
To turn on this setting, you must also have at least one of the following subscription settings turned on:
- Anonymous visitor tracking
- Show anonymous visitor data
- Identity mapping
This setting doesn’t retroactively delete past bot traffic data. If you want to remove historical bot data, follow the instructions outlined in Bulk-delete accounts and visitors through the API.
After turning on this setting, you can also block up to 20 custom bots by selecting Manage additional crawlers. This opens the Suspected bots and crawlers table, which displays all suspected bots found in your subscription’s event data over the last 90 days. After you block a bot, Pendo no longer collects data from it. If you later reactivate a blocked bot, the data missed while it was blocked isn't restored, and the bot might not reappear in the table if it hasn’t sent traffic recently.
You can resume data collection by either turning off the Block data from known bots setting in your subscription settings, or by reactivating custom bots you previously blocked.
Use exclude lists
When you add an item to your subscription's exclude list, Pendo continues to collect events and show guides, but the excluded events appear only in the Excluded Accounts & Visitors segment (or Excluded Visitors if account-level analytics isn't enabled).
- Exclude by IP. If you notice bot activity coming from a specific IP address, add it to the Exclude List.
- Exclude by Visitor ID or Account ID. If you notice particular visitors or accounts with abnormally high event counts and suspect that they might be bots, you can add specific Visitor IDs or Account IDs to the Exclude List. Whenever possible, we recommend using wildcards (*) to exclude a certain Visitor ID or Account ID pattern.
For more information, see Exclude and include lists.
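Before adding a wildcard pattern to the exclude list, it can help to preview which of your known Visitor IDs or Account IDs the pattern would cover. The following TypeScript sketch is only an illustration: it assumes that `*` matches any run of characters (including none), and the helper names `wildcardToRegExp` and `previewExcluded` and the sample IDs are hypothetical. Pendo's actual matching rules are described in Exclude and include lists.

```typescript
// Hypothetical helper: converts a wildcard pattern such as "loadtest-*"
// into a RegExp. ASSUMPTION: "*" matches any run of characters, including none.
function wildcardToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters so that characters like "." are matched literally.
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Turn each escaped "*" back into ".*" and anchor the whole pattern.
  return new RegExp("^" + escaped.replace(/\\\*/g, ".*") + "$");
}

// Returns the IDs from `ids` that the wildcard pattern would cover.
function previewExcluded(pattern: string, ids: string[]): string[] {
  const re = wildcardToRegExp(pattern);
  return ids.filter((id) => re.test(id));
}

// Example with made-up IDs:
const sampleIds = ["loadtest-001", "loadtest-002", "alice@example.com"];
console.log(previewExcluded("loadtest-*", sampleIds));
```

A preview like this can show whether a pattern is broad enough to catch the suspected bots without also matching legitimate visitors.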
Mark as Do Not Process (DNP)
If you suspect that specific visitors or accounts could be bots, you can mark those visitors or accounts as Do Not Process. This setting stops Pendo from collecting events for that visitor or account and from displaying guides to that visitor or account.
For information on how to set DNP, and how to validate that it's working by retrieving DNP records, see Opt-out of tracking with DNP.
Set conditional initialization
If you use the install script to implement Pendo, your developers can add conditional logic to the Pendo initialization code to identify potential bots and to determine whether or not Pendo should initialize (turn on and start collecting data and displaying guides). For more information, see Conditionally initialize Pendo.
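As a rough illustration of this kind of conditional logic, the following TypeScript sketch checks the browser's user agent string and the standard `navigator.webdriver` flag before calling the initializer. The names `BOT_PATTERN`, `looksLikeBot`, and `initializeUnlessBot` are hypothetical, and the pattern is a simple example rather than Pendo's list of known bots. Treat it as a starting point that your developers would adapt, not as the recommended implementation.

```typescript
// ASSUMPTION: an illustrative pattern only, not Pendo's known-bots list.
// Substring matching can produce false positives, so tune it to your traffic.
const BOT_PATTERN = /bot|crawler|spider|headless/i;

// Returns true when the user agent or the automation flag suggests a bot.
function looksLikeBot(userAgent: string, isWebDriver: boolean = false): boolean {
  return isWebDriver || BOT_PATTERN.test(userAgent);
}

// Calls `initialize` only when the visitor doesn't look like a bot.
// Returns true if initialization ran and false if it was skipped.
function initializeUnlessBot(
  userAgent: string,
  isWebDriver: boolean,
  initialize: () => void
): boolean {
  if (looksLikeBot(userAgent, isWebDriver)) {
    return false;
  }
  initialize();
  return true;
}

// Browser usage, shown as a comment because it requires the Pendo install
// script to be loaded on the page (the IDs are placeholders):
//
// initializeUnlessBot(navigator.userAgent, navigator.webdriver === true, () => {
//   pendo.initialize({
//     visitor: { id: "VISITOR-UNIQUE-ID" },
//     account: { id: "ACCOUNT-UNIQUE-ID" },
//   });
// });
```

Passing the initializer as a callback keeps the bot check separate from the Pendo call, so the detection logic can be tested without a browser.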