If you’ve ever used the internet (which I have to assume includes everyone reading this article on a news website), you’ve probably noticed that the things you do on one website tend to follow you around on others, or that certain social media platforms know a whole lot more about you than you thought you revealed. Meanwhile, you likely have no idea who knows what about you, or how they got that information. Data collection is the backbone of the internet ecosystem, but it’s largely invisible to you, the average user, until you see its end result: an ad so uniquely targeted to you and your interests that you swear Facebook must be listening to your conversations through your phone (it probably isn’t).
Several companies and organizations are trying to make that world a little less opaque to users like you. One of them is The Markup, a nonprofit investigative news site. It just released a tool called Blacklight, and it’s designed to present all of this information in a way that’s easy to understand. If you want to know how the ad technology that knows everything about you works, it’s a great place to start. If you just want to know who might find out that you visited a potentially embarrassing or deeply personal website before you go there, it’s good for that, too.
There are a few similar tools — Apple’s newly released Safari 14 browser update, for example, will tell you which trackers are on a website you visit. But with Safari, you have to actually visit the site first, and its list of trackers doesn’t include context about which companies are associated with which trackers and what those companies do. For instance, Safari will tell you that Vox has a tracker called “agkn.com,” but Blacklight will tell you agkn.com is owned by Neustar, which specializes in “accurate targeting” based on a “wide range of attributes” gleaned from your behavior both on- and offline. And now that you know Neustar exists, you can make an informed decision to opt out of being tracked by it.
Blacklight is more of an information tool than something you’d use in real time as you browse the internet, because you have to go to Blacklight’s site and enter the address of the website you want to check. Blacklight then scans the site and tells you how many trackers are on it, what they do, and who they’re potentially sending your data to. Some of those names you might recognize, like Oracle and Verizon. Others you likely won’t, like LiveRamp or Criteo. But it’s safe to say that all of them know a lot about you.
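To make the idea of a scan a little more concrete, here is a rough sketch of one piece of it: sorting the requests a page makes into first-party and third-party, then attaching an owner name where one is known. This is not Blacklight’s actual code. The function names, the sample request URLs, and the lookup table are invented for illustration (only the agkn.com and Neustar pairing comes from this article), and the "last two labels" domain rule is a shortcut that breaks on addresses like example.co.uk.

```python
from urllib.parse import urlparse

# Hypothetical lookup table. Only agkn.com -> Neustar comes from the article;
# a real tool would rely on a much larger, curated list.
TRACKER_OWNERS = {"agkn.com": "Neustar"}


def base_domain(url_or_host: str) -> str:
    """Naive guess at the registrable domain: the last two hostname labels.

    Assumption: good enough for .com-style names, wrong for suffixes such as
    .co.uk. A real scanner would consult the Public Suffix List instead.
    """
    host = urlparse(url_or_host).hostname or url_or_host
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def third_party_domains(page_url: str, request_urls: list[str]) -> dict[str, str]:
    """Map each third-party domain contacted by the page to a known owner."""
    site = base_domain(page_url)
    found: dict[str, str] = {}
    for url in request_urls:
        domain = base_domain(url)
        if domain != site:  # a request to another company's domain
            found[domain] = TRACKER_OWNERS.get(domain, "unknown owner")
    return found


# Made-up request log for a single page load.
requests_seen = [
    "https://cdn.vox.com/app.js",
    "https://d.agkn.com/pixel",
    "https://www.facebook.com/tr",
]
report = third_party_domains("https://www.vox.com/recode", requests_seen)
for domain, owner in report.items():
    print(f"{domain}: {owner}")
```

A third-party request is not automatically a tracker (fonts and video players are third parties, too), which is why a tool like this needs a vetted list of known tracking domains on top of the simple comparison shown here.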
I tried Blacklight out for myself to see what websites might be telling those companies about me. Vox, the site you’re reading right now, is largely ad-supported. Perhaps unsurprisingly, Blacklight found a lot of ad trackers (31) and third-party cookies (54) on it. Vox also uses Facebook’s Pixel and Google’s analytics trackers, which tell those platforms that your device visited Vox. Facebook and Google trackers in particular are very common on websites, and allow Facebook and Google to connect your behavior across all of those sites to your user profile on their platforms, giving them lots of data about you and your interests for ad targeting purposes.
Vox is not unique in this regard. Its tracker load is comparable to what Blacklight found on other ad-supported national news sites, including Slate (38 trackers, 6 cookies, Facebook), Mashable (24 trackers, 33 cookies, Facebook and Google), and Politico (33 trackers, 60 cookies, Facebook).
Some sites have more advanced tracking technology. On Breitbart, for example, Blacklight found 26 trackers, 15 cookies, and Facebook and Google trackers, as well as a script that enables what’s called “canvas fingerprinting,” which can be used to track you even if you block cookies. Time magazine’s site has 14 trackers, 25 cookies, and Facebook and Google trackers, and Blacklight found that it also uses a session recorder that can detect things like mouse cursor movements, clicks, keystrokes, and page scrolls while you browse the site. That might sound creepier than it actually is: Websites can use session recorders to get granular data about their visitors’ behavior to improve how the site itself looks and works. But they can also watch a specific user’s interactions on their site and attach them to identifying information, if they have it, to make inferences about that user. (The Markup, which is a nonprofit and relies on donations rather than ads for support, doesn’t have any trackers.)
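Why does canvas fingerprinting survive cookie blocking? Because nothing has to be stored on your device. In broad strokes, a script asks your browser to draw a hidden image, and small differences in hardware and software mean the resulting pixels vary from device to device; hashing those pixels gives an identifier that can be recomputed on every visit. The sketch below only simulates that last step. Real fingerprinting scripts run as JavaScript in the browser, and the pixel values and function name here are made up for illustration.

```python
import hashlib


def canvas_fingerprint(rendered_pixels: bytes) -> str:
    """Derive a short identifier from the bytes a device produced when drawing a test image."""
    return hashlib.sha256(rendered_pixels).hexdigest()[:16]


# Stand-in pixel data (invented): two devices that render the same hidden
# image with a tiny difference in one color value.
device_a = bytes([12, 200, 37, 255] * 4)
device_b = bytes([12, 201, 37, 255] * 4)

first_visit = canvas_fingerprint(device_a)
later_visit = canvas_fingerprint(device_a)  # cookies cleared in between; nothing was stored
other_device = canvas_fingerprint(device_b)

print(first_visit == later_visit)   # same device, same identifier
print(first_visit == other_device)  # different rendering, different identifier
```

The identifier is derived each time rather than saved, so clearing or blocking cookies does nothing to change it.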
Maybe you don’t care if a national news website knows what you’re looking at and when, but you might feel differently when it’s a site that deals with more sensitive information. On WebMD, Blacklight found 26 trackers, 31 cookies, and a Facebook tracker. A website for a medication for autoimmune diseases sent data to a variety of companies, including Facebook. A site that sells STD testing kits had 13 ad trackers, 25 cookies, Facebook and Google trackers, and a session recorder. Even if you trust those sites to respect and maintain your privacy, you’re also trusting the third parties they allow to collect your data on their website, and you’re trusting whatever companies those third parties might sell your data to. You also probably have no idea who those companies even are.
The Markup pointed Recode to Airbnb and M&Ms’ websites as examples of major websites with potentially concerning tracking behavior. Blacklight found that Airbnb uses canvas fingerprinting and logs the keystrokes you type in certain text fields. It also uses Facebook’s “advanced matching” feature, which can share data with Facebook even if you’ve blocked Facebook’s cookies. On M&Ms’ site, Blacklight found 31 trackers, 67 cookies, Facebook and Google trackers, and a session recorder, and it found that the site was logging keystrokes in the email and password fields.
There may be legitimate reasons for these scripts; canvas fingerprinting is sometimes used to detect fraud, so it makes sense that it would be on a site like Airbnb. And the keystroke logger could be used to auto-complete the email and password fields, making logging into your M&Ms account easier. But it also means the site may be recording what you type in submission fields before you click the “submit” button. Either way, now you know it’s there.
Blacklight says not to take its scan as the final word on the trackers a website does or doesn’t have — there may well be some that evade detection. It’s really more of a guide to help you make more informed decisions about your internet experience. So, now that you know how your favorite websites might be tracking you and which companies they might be sending your data to, what can you do to stop it?
There are relatively simple ways to minimize the information websites can get about you, and they don’t require much technical know-how:
- Turn off ad personalization wherever possible. You can do this on Facebook, Google, and Twitter, for instance.
- Use a more privacy-conscious browser. You should specifically look for a browser that rejects third-party cookies, which are often used to track you online. Safari and Firefox browsers block third-party cookies by default, and both feature “privacy report” functions that list what they’ve blocked for you; you can find those by clicking on the little shield icon to the left of the browser bar. Google’s Chrome has a setting that will allow you to block third-party cookies, and the company says it will be blocking third-party cookies entirely by 2022.
- Add tracker blocking extensions to your browser. Privacy Badger, Ghostery, and DuckDuckGo’s Privacy Essentials are three good examples. They’ll tell you how many trackers they blocked and what they are. Ad blockers like uBlock Origin, AdBlock, and AdBlock Plus will also block trackers. These extensions may compromise the functionality of some websites, and keep in mind that you are blocking the ads that many of them rely on for income.
These are just a start, and there is no foolproof way to prevent all tracking on the internet. Again, some of these trackers will help you use the site you’re on; others will help pay for its existence. The best thing you can do is be as aware as possible of what websites can know about you and who else might be watching.
Open Sourced is made possible by Omidyar Network. All Open Sourced content is editorially independent and produced by our journalists.