BadB
When visiting websites on the Internet, it is increasingly common to see a notification about the use of cookies. If you consent, the site's owners get access to information about how long you spent on the site, what exactly caught your attention, which hyperlinks brought you there, and whether you bought anything. Developers then use this information to measure the site's performance and improve it where necessary.
Websites may collect user data using cookies
People don't read instructions. You almost certainly haven't read the Windows license agreement, the iTunes license agreement, the terms of the GPL that Linux is licensed under, or the terms of any other software. This is normal. It is human nature.
The same thing happens on the Internet. Recently, thanks to the GDPR and other laws, you often see pop-up messages asking for permission to use cookies.
Most people click "Agree" and carry on as if nothing had happened. No one reads the privacy policy, right?
Developer Conrad Akunga decided to find out exactly what terms such an agreement contains. As an example, he took the Reuters news site. The choice is completely arbitrary, and most other sites have rules of their own as well.
These are the rules:

Note the scroll bar: the text continues further down.
Six more screens with text
In short, the document informs the user about several things:
- That the website collects and processes data about you.
- That it works with various partners to do this.
- That the site stores some data on your device using cookies.
- That some cookies are strictly necessary (as determined by the site), and you can't disable them.
- That some personal data may be sold to partners to provide relevant content.
- That you can personalize ads, but not remove them.
You also cannot disable ads completely. Your only choice is between ads that are selected at random and ads that the provider thinks might be relevant to you.
One more point concerns the partners to whom your personal data is sold. The partner list is shared by all sites that work with the IAB.
Who are these "partners"?
If you click on the corresponding button, the following window will appear:

Notice how small the slider on the scroll bar is. There must be hundreds of partners. Under each company's name is a link to its privacy policy.
These are not the same link, but different ones! Each leads to that partner's own privacy policy. How many people will actually follow these links by hand to read the terms? It is simply unrealistic.
Conrad Akunga used the Chrome developer tools to extract the actual list of partners, with a link to each one's privacy terms.

He pasted the copied list into VS Code and got a huge file of 3,835 lines, which, after formatting (Alt + Shift + F), expanded into a monster of 54,399 lines.
Conrad wrote a program that uses regular expressions to extract the necessary data (company names and URLs) and generates the result in Markdown format from a template.
Code:
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using Serilog;

// Configure Serilog to write log messages to the console
Log.Logger = new LoggerConfiguration()
    .WriteTo.Console()
    .CreateLogger();
// Define the regex to extract vendor and url
var reg = new Regex("\"vendor-title\">(?<company>.*?)<.*?vendor-privacy-notice\".*?href=\"(?<url>.*?)\"",
    RegexOptions.Compiled);
// Load the vendors into a string, and replace all newlines with spaces to mitigate
// formatting issues from irregular use of the newline
var vendors = File.ReadAllText("vendors.html").Replace(Environment.NewLine, " ");
// Match against the vendors html file
var matches = reg.Matches(vendors);
Log.Information("There were {num} matches", matches.Count);
// Extract the vendor number, name and their url, ordering by the name first.
var vendorInfo = matches.OrderBy(match => match.Groups["company"].Value)
    .Select((match, index) =>
        new
        {
            Index = index + 1,
            Name = match.Groups["company"].Value,
            URL = match.Groups["url"].Value
        });
// Create a string builder to progressively build the markdown
var sb = new StringBuilder();
// Append headers
sb.AppendLine("Listing As At 30 December 2020 08:10 GMT");
sb.AppendLine();
sb.AppendLine("|-|Vendor| URL |");
sb.AppendLine("|---|---|---|");
// Append the vendor details, one table row per vendor
foreach (var vendor in vendorInfo)
    sb.AppendLine($"|{vendor.Index}|{vendor.Name}|[{vendor.URL}]({vendor.URL})|");
// Delete existing markdown file, if present
if (File.Exists("vendors.md"))
    File.Delete("vendors.md");
// Write markdown to file
File.WriteAllText("vendors.md", sb.ToString());
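To see what the regular expression above captures, here is a minimal sketch that runs the same pattern against a tiny in-memory fragment. The fragment is an assumption: the real vendor-list markup is not reproduced in this post, so `sampleHtml` only imitates the `vendor-title` and `vendor-privacy-notice` class names the pattern looks for, and the company names and URLs are invented.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical markup: it only imitates the class names the pattern expects.
var sampleHtml =
    "<div class=\"vendor-title\">Example Ads Ltd</div>" +
    "<a class=\"vendor-privacy-notice\" href=\"https://example.com/privacy\">Privacy</a>" +
    "<div class=\"vendor-title\">Another Vendor</div>" +
    "<a class=\"vendor-privacy-notice\" href=\"https://example.org/policy\">Privacy</a>";

// Same pattern as in the program above.
var reg = new Regex(
    "\"vendor-title\">(?<company>.*?)<.*?vendor-privacy-notice\".*?href=\"(?<url>.*?)\"");

// Read the two named groups from every match and sort by company name.
var rows = reg.Matches(sampleHtml)
    .Cast<Match>()
    .Select(m => (Name: m.Groups["company"].Value, Url: m.Groups["url"].Value))
    .OrderBy(v => v.Name)
    .ToList();

foreach (var row in rows)
    Console.WriteLine($"{row.Name} -> {row.Url}");
```

The lazy `.*?` quantifiers make each match stop at the first closing quote or tag it meets, so one match pairs a company name with the nearest privacy link that follows it. That pairing relies on the markup keeping the name and its link in this order.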
The result is a list of all partners, each with its own separate privacy policy document. Here is the list: vendors.md.
It has 647 companies.
Obviously, no one will be able to read all these terms and conditions before clicking the "Agree" button, the author concludes.
Keep in mind that these advertising providers serve many different sites. They uniquely identify your browser and device, so they can track and analyze your actions across sites to build the most accurate profile possible. Large amounts of data are collected on each supposedly anonymous user.
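The cross-site profiling described here can be pictured with a toy model. This is only a sketch under stated assumptions, not any real provider's code: it assumes a provider's script reports every page view together with an identifier it has stored in the browser. The `RecordVisit` function, the identifier `id-7f3a`, and the site names are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Toy model: every name and value here is hypothetical.
// Key: identifier the provider assigned to a browser. Value: pages seen with it.
var profiles = new Dictionary<string, List<string>>();

// Called whenever a page that embeds the provider's script is loaded.
void RecordVisit(string browserId, string site, string page)
{
    if (!profiles.TryGetValue(browserId, out var history))
    {
        history = new List<string>();
        profiles[browserId] = history;
    }
    history.Add($"{site}{page}");
}

// The same browser visits three unrelated sites that use the same provider.
RecordVisit("id-7f3a", "news.example", "/politics");
RecordVisit("id-7f3a", "shop.example", "/running-shoes");
RecordVisit("id-7f3a", "travel.example", "/flights-to-lisbon");

// The provider now holds one combined history for a single identifier.
Console.WriteLine(string.Join(", ", profiles["id-7f3a"]));
```

No name or email address appears anywhere in this model, yet the combined history already says a good deal about one person, which is why "anonymous" deserves the qualifier.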
The parsing code from this article is published on GitHub.