How to Remove Personally Identifiable Information (PII) from Google Analytics

November 14, 2019

With privacy regulations such as the GDPR on the rise, protecting the identity of your website’s users has never been more important.

If you have Google Analytics set up on your site and there is any chance that PII in a URL could be sent to Google, that information needs to be redacted. Beyond following best practices and the law, Google policy requires marketers to keep all PII from being passed to Google. The most common types include email addresses, phone numbers, full names, usernames, passwords and zip codes.

This how-to post breaks down how marketers and data professionals can find and remove personally identifiable information from Google Analytics data.

How do you find PII data in your Google Analytics reports?

The simplest way to check for an email address is to navigate to “Behavior” > “Site Content” > “All Pages.” Then add a filter using the at sign (@). This will show any pageviews whose URLs contain an email address. It is worth repeating the search with %40, the URL-encoded form of @.
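The same check can be run outside the reporting interface, for example against page paths exported from the “All Pages” report. The sketch below is only an illustration: the function name `findEmailLikePaths` and the sample paths are hypothetical, and the pattern is deliberately loose, so expect some false positives.

```javascript
// Hypothetical helper: flag page paths that appear to contain an email address.
// Matches both a literal "@" and its URL-encoded form "%40".
var EMAIL_LIKE = /[a-zA-Z0-9._+-]+(@|%40)[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/i;

function findEmailLikePaths(paths) {
  return paths.filter(function (path) {
    return EMAIL_LIKE.test(path);
  });
}

// Sample data, made up for illustration.
var samplePaths = [
  '/thank-you?email=jane.doe@example.com',
  '/newsletter/confirm?user=jane.doe%40example.com',
  '/products/shoes?color=red'
];

// Logs the first two paths; the third contains no email-like string.
console.log(findEmailLikePaths(samplePaths));
```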

How to redact PII using Google Tag Manager

Follow the instructions below to learn how to use Google Tag Manager to redact PII from the URL before the data is sent to Google.

1. Create a custom JavaScript variable in GTM with the following code:

function() {
  return function(model) {
    try {
      // Add the PII patterns into this array as objects.
      // Replace yoursite\.com with your own domain.
      var piiRegex = [{
        name: 'EMAIL',
        regex: /[^\/][a-zA-Z0-9._-]+(@|%40)(?!yoursite\.com)[^\/]+[a-zA-Z0-9._-]/gi,
        group: ''
      }, {
        name: 'SELF-EMAIL',
        regex: /[^\/][a-zA-Z0-9._-]+(@|%40)(?=yoursite\.com)[^\/]+[a-zA-Z0-9._-]/gi,
        group: ''
      }, {
        name: 'TEL',
        regex: /((tel=)|(telephone=)|(phone=)|(mobile=)|(mob=))[\d\+\s][^&\/\?]+/gi,
        group: '$1'
      }, {
        name: 'NAME',
        regex: /((firstname=)|(lastname=)|(surname=))[^&\/\?]+/gi,
        group: '$1'
      }, {
        name: 'PASSWORD',
        regex: /((password=)|(passwd=)|(pass=))[^&\/\?]+/gi,
        group: '$1'
      }, {
        name: 'ZIP',
        regex: /((postcode=)|(zipcode=)|(zip=))[^&\/\?]+/gi,
        group: '$1'
      }];

      // Fetch reference to the original sendHitTask
      var originalSendTask = model.get('sendHitTask');
      var hitPayload, data;

      model.set('sendHitTask', function(sendModel) {
        hitPayload = sendModel.get('hitPayload');

        // Convert the payload query string into a key/value object
        data = hitPayload.replace(/(^\?)/, '').split('&').map(function(n) {
          return n = n.split('='), this[n[0]] = n[1], this;
        }.bind({}))[0];

        // Loop through all keys and values
        for (var key in data) {
          piiRegex.forEach(function(pii) {
            // Decode the value before matching it against the regex
            var val = decodeURIComponent(data[key]);
            if (val.match(pii.regex)) {
              // Replace the matched text, then re-encode the value
              data[key] = encodeURIComponent(val.replace(pii.regex, pii.group + '[REDACTED ' + pii.name + ']'));
            }
          });
        }

        // Convert the data object back into a query string
        sendModel.set('hitPayload', Object.keys(data).map(function(key) {
          return key + '=' + data[key];
        }).join('&'), true);

        // Send the hit with the cleaned payload
        originalSendTask(sendModel);
      });
    } catch (e) {}
  };
}


You can also get a copy of this code from Brian Clifton’s blog article “Remove PII from Google Analytics.” I made a few changes to the email and self-email regular expressions so that the entire email address is redacted instead of only the four characters before and after the @ symbol. You can also update the other regular expressions to match the parameters your site passes in its URLs. For example, if your site uses a name parameter, add it to the NAME regex.
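As an illustration of that last point, the sketch below extends the NAME pattern with a `name=` alternative and shows how a `group` value of `'$1'` keeps the parameter name while replacing only its value. The `redact` helper and the sample path are made up for this example; they mirror the intended replacement step in the variable rather than reproduce it.

```javascript
// NAME pattern extended with a hypothetical "name=" parameter.
// Note: "name=" also matches the tail of parameters such as "username=".
var namePii = {
  name: 'NAME',
  regex: /((firstname=)|(lastname=)|(surname=)|(name=))[^&\/\?]+/gi,
  group: '$1'
};

// Illustrative helper: '$1' in the replacement re-inserts the matched parameter name.
function redact(value, pii) {
  return value.replace(pii.regex, pii.group + '[REDACTED ' + pii.name + ']');
}

var sample = '/signup?name=Jane&lastname=Doe&plan=pro';
console.log(redact(sample, namePii));
// → /signup?name=[REDACTED NAME]&lastname=[REDACTED NAME]&plan=pro
```

Because `name=` also matches inside `username=`, a username would be redacted under the NAME label. That is usually acceptable, since usernames count as PII too, but tighten the pattern if your site uses harmless parameters that end in "name".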


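2. Update your Google Analytics Settings variable to add a custom task that calls the variable you created in the previous step. Under “More Settings” > “Fields to Set,” add a field named customTask and set its value to the new variable.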
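Before publishing, it can help to see what a custom task does to a hit. The sketch below is a simplified stand-in, not the analytics.js implementation: `createMockModel` and `redactEmailTask` are hypothetical names, the mock supports only `get` and `set`, and the task redacts a single email pattern rather than the full list from step 1. It imitates the order of events: the custom task runs first and wraps `sendHitTask`, which runs later with the hit payload.

```javascript
// Hypothetical stand-in for the analytics.js model: supports get and set only.
function createMockModel(fields) {
  return {
    get: function (key) { return fields[key]; },
    set: function (key, value) { fields[key] = value; }
  };
}

// Simplified custom task: redacts email-like strings in each payload value.
function redactEmailTask(model) {
  var originalSendTask = model.get('sendHitTask');
  model.set('sendHitTask', function (sendModel) {
    var cleaned = sendModel.get('hitPayload').split('&').map(function (pair) {
      var idx = pair.indexOf('=');
      if (idx === -1) { return pair; } // no value to inspect
      var key = pair.slice(0, idx);
      var value = decodeURIComponent(pair.slice(idx + 1));
      value = value.replace(/[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+/g, '[REDACTED EMAIL]');
      return key + '=' + encodeURIComponent(value);
    }).join('&');
    sendModel.set('hitPayload', cleaned, true);
    originalSendTask(sendModel);
  });
}

// Simulate one hit: record what would have been sent instead of sending it.
var sent = [];
var model = createMockModel({
  hitPayload: 'v=1&t=pageview&dp=' + encodeURIComponent('/thanks?email=jane@example.com'),
  sendHitTask: function (m) { sent.push(m.get('hitPayload')); }
});

redactEmailTask(model);          // the custom task runs first
model.get('sendHitTask')(model); // later, the wrapped sendHitTask is called

console.log(decodeURIComponent(sent[0]));
// → v=1&t=pageview&dp=/thanks?email=[REDACTED EMAIL]
```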

Once the GTM setting and variable are in place, set up a “Site Content” report in Google Analytics with a page filter that matches the word “REDACTED.” The report shows which URLs were passing PII, so you can go back to your site and remove that data at the source.
