This article was originally published on The Drum, March 17, 2023.
“Senator, we run ads.”
This 2018 response by Mark Zuckerberg to a question from Senator Hatch is where the US government needs to start in understanding TikTok.
TikTok, like its contemporaries – including Google, Facebook, Snap and others – makes money from advertising. There is no question that this advertising-led business model has driven entrepreneurialism, economic growth and consumer benefit through new ad-funded products, services and content. But to say platforms make money from ‘advertising’ is an oversimplification – these companies make money from data-driven advertising, built on the data they gather about users as those users use their platforms.
This means it’s critical for the US government and other regulators to understand the modern data-driven economy and advertising industry. Advertising serves people, competition and the economy.
Further, it’s time to reckon with the fact that big platforms all collect and use data in (almost) the same way. In very plain terms: we are in an information economy, generating near-infinite volumes of data as we accelerate into the digital age. And the need for federal legislation that delivers personal data rights for all, protects competition, governs the uses of data and requires companies to detect and prevent harms – including modern harms – is urgent. Not just for TikTok, but for everyone.
All the big platforms’ advertising businesses start with their own first-party personal data, used to build an identity graph from names, emails, phone numbers, gender and dates of birth. This core information is then expanded to include the platform’s social graph, composed of data signals that include a user’s contacts and the friends, family, colleagues, companies and brands that a given user follows and interacts with. Each like, comment, reaction or interaction is a data point that is captured by the platform as part of its first-party data asset.
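To make the shape of that data asset concrete, here is a minimal sketch in Python of an identity graph with a social graph layered on top. The field names are hypothetical; real platform schemas are proprietary and far richer.

```python
from dataclasses import dataclass, field

@dataclass
class IdentityNode:
    """Core first-party identity record (hypothetical fields)."""
    user_id: str
    name: str
    email: str
    phone: str
    gender: str
    date_of_birth: str  # e.g. "1990-04-21"

@dataclass
class SocialEdge:
    """One social-graph edge: a follow, like, comment or contact link."""
    source_id: str  # the user taking the action
    target_id: str  # the friend, colleague, company or brand acted upon
    edge_type: str  # "follows" | "likes" | "comments" | "contact"

@dataclass
class IdentityGraph:
    """First-party identity graph plus the social graph layered on top."""
    nodes: dict = field(default_factory=dict)  # user_id -> IdentityNode
    edges: list = field(default_factory=list)  # list of SocialEdge

    def record_interaction(self, source_id: str, target_id: str, edge_type: str) -> None:
        # Every like, comment, reaction or follow becomes a stored data point.
        self.edges.append(SocialEdge(source_id, target_id, edge_type))
```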
Platform data collection doesn’t stop there. In the mobile-first and app-driven world, a user’s mobile device collects and shares information with these platforms – specifically, location information. The location can be as general as cell tower triangulation or as specific and accurate as GPS coordinates. This means the platforms are able to know where a user is, where they go, and, based on their social graph, who they were with at the time.
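A rough sketch of what one such location signal might look like, with an accuracy radius distinguishing cell-tower triangulation from GPS. Again, the schema is illustrative, not any platform’s actual format.

```python
from dataclasses import dataclass

@dataclass
class LocationEvent:
    """A single location signal attached to a user profile (illustrative)."""
    user_id: str
    timestamp: str          # ISO 8601
    latitude: float
    longitude: float
    accuracy_meters: float  # ~1000+ for cell-tower triangulation, ~5 for GPS
    source: str             # "cell_tower" | "gps" | "wifi"

# Two events minutes apart: the first only places the user in a
# neighborhood; the second pins them to a building. Cross-referenced
# with the social graph, co-located devices reveal who they were with.
coarse = LocationEvent("u123", "2023-03-17T09:00:00Z", 40.7128, -74.0060, 1200.0, "cell_tower")
precise = LocationEvent("u123", "2023-03-17T09:05:00Z", 40.7130, -74.0055, 5.0, "gps")
```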
As each of these platforms also allows for posting, sharing and hosting of content, this becomes another rich source of data for the platforms to mine. The posts a user likes, the videos they upload or watch, the length of time they view or watch content, the subjects, objects and themes of the content, and whether the user shared, liked, disliked, commented on or remixed the content are all captured as data points to be associated with a personal profile. The granularity of platforms’ first-party data collection should not be underestimated.
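Conceptually, each interaction is an event record appended to the profile, and those events roll up into inferred interests. The sketch below uses invented field names to illustrate how granular a single data point can be.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class EngagementEvent:
    """One content interaction captured as a data point (illustrative)."""
    user_id: str
    content_id: str
    action: str                # "view" | "like" | "dislike" | "comment" | "share" | "remix"
    watch_time_seconds: float  # dwell time on the content
    content_topics: list       # e.g. ["cooking", "air fryers"], from content analysis

def infer_interests(events):
    """Roll raw engagement events up into profile-level interest tags."""
    counts = Counter()
    for e in events:
        # Explicit signals (likes, shares, remixes) weigh more than passive views.
        weight = 2 if e.action in ("like", "share", "remix") else 1
        for topic in e.content_topics:
            counts[topic] += weight
    return [topic for topic, _ in counts.most_common(10)]
```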
Every single one of these data points comes together to form insights that each platform feeds into its machine-learning algorithms. The algorithms use these insights to decide what content to show the user, optimizing for the best outcome. There are frequent debates about the definition of ‘best outcome’, but the short answer is: to engage users and build an advertising business.
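Stripped to its essentials, that feedback loop looks something like the sketch below: score candidate content with a model trained on those data points, then show the highest scorers. Real ranking systems are vastly more complex; this only illustrates the optimization target.

```python
def rank_feed(user_profile, candidates, predict_engagement):
    """Order candidate content by predicted engagement for one user.

    `predict_engagement` stands in for a trained machine-learning model
    that maps (user features, content features) to the probability the
    user will watch, like or share. Everything here is a placeholder.
    """
    scored = [(predict_engagement(user_profile, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # The 'best outcome' being optimized is engagement, which in turn
    # grows the attention inventory sold to advertisers.
    return [content for _, content in scored]
```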
And this is where the conflation happens. While showing a user an ad for a vacuum cleaner may benefit that user and cause no harm, not all ads are for vacuum cleaners – the content of the ad, the person funding it and the algorithmically driven content that keeps the user’s eyes glued to the screen and scrolling may or may not be harmless. This is the real point of concern – not advertising itself.
To be clear, while each of these platforms may be in the advertising business, at their core they are in the data collection business in service of their advertising business. No entity in the history of humankind has collected as much data on each of us as these big platforms have. The use of this vast trove of data, plus ever-more-sophisticated algorithms, to determine what content users see – whether from another user, a friend, an activist group or an advertiser – needs governance.
This is true of all platforms. Not just TikTok.
Where in the world all the data collected by platforms about each user is physically stored is likely moot, notwithstanding various regulations around the world that may require data sovereignty. It’s likely moot because whether a server is located in Texas, Toronto or Toowoomba, companies all over the world trying to advertise to a US audience can use platforms’ ad-buying tools to obtain insights, build an audience from the platforms’ data attributes and get content – and ads – in front of that audience.
This is a truth that the US government and regulators need to face – or admit that they don’t fully understand.
It doesn’t matter if TikTok has a US board of directors, has its code audited, stores its data on servers owned and operated by Oracle in US data centers, and has security protocols that prevent a random employee from querying the details of a former romantic partner or a federal employee. It doesn’t matter that the company’s security team focuses on preventing data breaches and attacks. Not when the reality is that a qualified customer can go into TikTok – or any similar platform – and use its media-buying tools to build data-targeting parameters that identify and reach those very people and audiences.
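To illustrate that structural point – not any real platform API – here is a conceptual sketch of what such a self-serve audience definition might look like. Every parameter name below is invented.

```python
# A hypothetical self-serve audience definition. Parameter names are
# illustrative only and do not match any platform's real API.
audience_spec = {
    "geo": {
        "country": "US",
        "center": {"lat": 38.8895, "lon": -77.0353},  # a point in Washington, DC
        "radius_km": 5,
    },
    "demographics": {"age_min": 25, "age_max": 45},
    "interests": ["national security", "defense contracting"],
    "behaviors": ["frequent_traveler"],
    "device": {"os": ["iOS", "Android"]},
}

# Submitted through a platform's ad-buying tools, a spec like this
# returns audience-size estimates and delivers content to exactly the
# people it describes - no matter where the servers physically sit.
```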
Regulators also need to understand that potentially sensitive personal data about individuals, generated through use of the platforms, is part of the platforms’ data stack. Data is an abstraction of a person and requires protection and stewardship. That demands a high bar – and one that’s getting higher. We need to establish accountability obligations, oversight and enforcement, and include these constructs in a US federal law. Some data, and some uses of data, can be sensitive – including what could be considered sensitive from a national security perspective. There are, and should be, legitimate concerns that such data could be used to manipulate individuals or even as a point of leverage against them.
Every platform will say it has systems and controls in place to limit or prevent such behavior, but those systems require constant improvement and evolution to keep pace with the bad actors that exploit these platforms.
It is indeed an arms race between the good actors and the bad. We need effective controls and governance of all platforms’ data-driven audience tools and ad-buying systems, and we all have to be aware that there is a steady history of bad actors attempting to mine those platforms’ advertising APIs. Cambridge Analytica did this. And Twitter’s ‘data breach’ serves as a prime example of data intended for advertisers being misused and abused.
What the US government and every regulator needs to understand is that TikTok’s ‘Project Texas’ – its plan to collaborate with Oracle to manage and store US user data in the US – won’t solve this data governance problem. The data that regulators worry about falling into China’s hands may already be available to China, or any theoretical adversary, if they can overcome credentialing and current security controls and game the platforms’ governance systems.
In the absence of a US federal privacy law that designates some data as too sensitive to ever be collected, used or sold, China, Russia, North Korea or any foreign adversary can – and probably does – have cybersecurity forces fraudulently gaming the platforms’ systems, buying data on the open market where controls can be overcome and leveraging it for their own foreign-intelligence purposes. It’s also worth asking: if China has laws governing data and algorithmic use that protect its national security interests, why doesn’t the US?
What this means is that, despite the hyperbole, it’s not just TikTok that is a national security risk. In the information economy, access to data by bad state actors is a risk to all connected companies – and a national security risk that regulators must address.
There is an urgent need for a federal US law that requires accountability and strong protections for data – especially for types of data that may be sensitive from a national security standpoint – and imposes substantive obligations on its collection and use. There also needs to be proper regulatory oversight into who is buying and selling such data, what content is being delivered and how the algorithms that decide the content are formulated and governed. This includes examining the need for foreign export controls on what data can be collected and to whom it can be sold. It also requires a balancing act, one that preserves the economic benefit of a data-driven, ad-supported digital economy.
All of this is not to downplay the potential national security risks of a platform like TikTok – one that sits on a wealth of first-party user data, a social graph and intimate content insights on each user; a platform owned by, and originating from, a country with a different understanding of freedom, transparency, governance and world order than our own. But unless we anchor the conversation about TikTok in a real discussion about data collection for advertising and the potential for data use, misuse, abuse and weaponization, we are doing it wrong.