Next-generation tracker: how a URL communicates your data to marketers

CarderPlanet

About 73% of sites use a simple trick to make big money off their users.

Appending parameters to URLs, a technique that attaches extra data to web links, has become a major user privacy issue. It is used to pass data, including email addresses, to advertising companies so that they can track user activity.

Despite plans to restrict third-party cookies in Chrome next year, trackers continue to find new ways to follow users. According to a study by Shaoor Munir, a graduate student at the University of California, Davis, about 73% of the 20,000 sites analyzed add data to URLs to track user activity.

Munir introduced a machine-learning-based tool called PURL, which identifies and neutralizes the tracking data added to links (known as link decoration). According to Munir, PURL is more effective than other anti-tracking tools.

The added data can sit in the resource path, the query parameters, or the fragment of a URL. All three can be used to store and transmit data, which puts users' privacy at risk.
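To make the three locations concrete, here is a minimal sketch that splits a URL into its path, query parameters, and fragment using Python's standard urllib.parse module. The function name decoration_parts and the sample URL are made up for illustration and do not come from the study.

Code:
```python
from urllib.parse import urlsplit, parse_qsl


def decoration_parts(url):
    """Return the three URL components where added data can be carried."""
    parts = urlsplit(url)
    return {
        "path": parts.path,
        # parse_qsl turns "a=1&b=2" into [("a", "1"), ("b", "2")]
        "query": parse_qsl(parts.query, keep_blank_values=True),
        "fragment": parts.fragment,
    }


# Hypothetical URL, invented for this example.
sample = decoration_parts("https://example.com/offers/u123?ref=abc&lang=en#sid=42")
print(sample["path"])
print(sample["query"])
print(sample["fragment"])
```

Any of the three pieces may hold an identifier, so a tool that only inspects the query string would miss data hidden in the path or the fragment.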

Munir gave an example of a URL that carries tracking parameters:

Code:
http://go[.]artinstitutes[.]edu/search/brand/local/PSGLC?source=BGNAG&ven=search&amp.... =Exact&gclid=KjwKEAjwq6m3BRsdfdfsdfCP7IfMq6Oo9gsdfACRc0bN3J-fcQ1t1DdfO5AyuTfKIyFbg TFPfCmPXyGdrKRBoCmv3w_wcB

In this URL, the part that starts with the "gclid" key contains the tracking ID.

Another example:

Code:
https://example[.]com/page?utm_source=newsletter&utm_medium=email

Here, utm_source and utm_medium are parameters added to identify the traffic source.
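As a rough illustration, the tracking parameters in such a link can be picked out by name. The sketch below assumes a tiny hand-written list (the "gclid" key and the "utm_" prefix from the examples above). The list and the function name find_tracking_params are illustrative assumptions, not the method used in the study.

Code:
```python
from urllib.parse import urlsplit, parse_qsl

# Illustrative assumption: a tiny list built from the keys mentioned above.
KNOWN_TRACKING_KEYS = {"gclid"}
KNOWN_TRACKING_PREFIXES = ("utm_",)


def find_tracking_params(url):
    """Return the query parameters whose names look like known trackers."""
    query = urlsplit(url).query
    return {
        key: value
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key in KNOWN_TRACKING_KEYS or key.startswith(KNOWN_TRACKING_PREFIXES)
    }


found = find_tracking_params(
    "https://example.com/page?utm_source=newsletter&utm_medium=email"
)
print(found)
```

A name-based check like this only catches parameters somebody has already catalogued, and it says nothing about data placed in the path or fragment.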

According to Munir, 69.4% of the sites reviewed transmit information stored in cookies by adding parameters to the URL. The difficulty is that a parameter can serve a functional purpose, a tracking purpose, or both. Although adding tracking data to URLs is not a new problem, it has become particularly relevant in recent years.
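One naive way to spot a cookie value travelling in a URL is to compare the two directly. The article does not describe how the study detected this, so the sketch below is only an assumption-laden illustration: the function name cookie_values_in_url, the min_length threshold, and the sample cookie and tracker URL are all hypothetical.

Code:
```python
from urllib.parse import urlsplit, parse_qsl


def cookie_values_in_url(cookies, url, min_length=8):
    """Return (cookie_name, param_name) pairs where a cookie value
    reappears inside a query parameter value of the URL."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    matches = []
    for cookie_name, cookie_value in cookies.items():
        if len(cookie_value) < min_length:
            continue  # short values such as "1" or "en" match by coincidence
        for param_name, param_value in params:
            if cookie_value in param_value:
                matches.append((cookie_name, param_name))
    return matches


# Hypothetical cookie jar and outgoing request URL.
cookies = {"_uid": "a1b2c3d4e5f6", "lang": "en"}
url = "https://tracker.example/collect?cid=a1b2c3d4e5f6&lang=en"
print(cookie_values_in_url(cookies, url))
```

A plain substring comparison misses values that are hashed or encoded before being added to the link, which is one reason simple rules fall short here.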

Because of restrictions on third-party cookies, advertisers have turned to other techniques: first-party cookies and browser fingerprinting. Munir noted that platforms such as Google Analytics actively use first-party cookies, and that tracking by email address and phone number is becoming more common.

In the study's sample of 20,000 sites drawn from the top million, about 45 million such URL additions were detected. About 45% of them were classified as advertising and tracking.

However, simply deleting these parameters can break a site's functionality, because some platforms rely on values such as email addresses to identify users. PURL also found that email addresses entered on web pages are often transmitted to third parties, sometimes even in unencrypted form.
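For contrast with a learned approach, here is a minimal sketch of the simple deletion strategy: drop parameters whose names are on a fixed list and keep everything else. It is not how PURL works. The key list and the function name strip_tracking_params are illustrative assumptions.

Code:
```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Illustrative assumption: a tiny denylist, not a complete catalogue.
TRACKING_KEYS = {"gclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking_params(url):
    """Remove denylisted query parameters and keep all other parts of the URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_KEYS and not key.startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the surviving parameters.
    return urlunsplit(parts._replace(query=urlencode(kept)))


cleaned = strip_tracking_params(
    "https://example.com/page?id=7&utm_source=newsletter&utm_medium=email"
)
print(cleaned)
```

A fixed list cannot tell whether a given parameter is needed for the page to work, so it either leaves trackers in place or risks removing something functional.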

Munir said that an automated, machine-learning approach such as PURL is necessary to deal with large-scale tracking on the web. PURL achieved an accuracy of 98.74%, which makes it one of the most effective tools in this area. Munir also stressed the need for automated solutions to keep up with the complexity and scale of modern tracking methods.