Federated Analytics: What Is It, and How Does It Work?

What you need to know about federated analytics, in a few short minutes.

Clark Boyd


Source: Google

What is federated analytics?

Federated analytics is an approach to analysing user data that never collects the raw data from individual devices.

The idea has circulated for a few years, but Google has now introduced federated analytics to a wider audience.

Google defines it as “Collaborative data science without data collection”.

Where ‘traditional’ data science brings lots of information into one central data lake, federated analytics combines information from distributed datasets without gathering it in one central location.

Source: Google
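To make the "combine without gathering" idea concrete, here is a minimal sketch in Python. It is an illustration under my own assumptions, not Google's implementation: the function names (local_summary, aggregate) and the song data are made up, and a real deployment would add protections such as secure aggregation that this toy leaves out. Each device boils its raw events down to counts, and the server only ever sees and sums those counts.

```python
from collections import Counter

def local_summary(device_events):
    """Runs on the device: reduce raw events to per-item counts."""
    return Counter(device_events)

def aggregate(summaries):
    """Runs on the server: add up the per-device counts.

    The server receives summaries only, never the raw event lists.
    """
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return total

# Hypothetical raw data that stays on three separate devices.
devices = [
    ["song_a", "song_b", "song_a"],
    ["song_b"],
    ["song_a", "song_c"],
]

# Each device computes its own summary locally...
summaries = [local_summary(events) for events in devices]

# ...and only those summaries are combined centrally.
totals = aggregate(summaries)
print(totals.most_common())
```

The analyst still gets the answer to "which songs are most played?", but the question is answered from aggregates rather than from a central copy of everyone's listening history.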

Federated analytics relates to federated learning (clue’s in the name there), but it doesn’t do the learning part (again, see the name).

Federated learning (introduced in 2017) is a way to train centralised machine learning models on decentralised data.

That is to say, Google (or similar) can make its algorithms smarter by aggregating device data (we’ll explain that process later), but the user data stays on phones.
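As a rough picture of what "aggregating" can mean here, the sketch below shows one common scheme, federated averaging, on a deliberately tiny model (a single weight w in y = w * x). Everything in it is an assumption for illustration: the function names, the data, and the learning rate are mine, and production systems are far more involved. Each device improves the shared weight using its own data, and the server averages the results, weighted by how many examples each device holds.

```python
def local_update(weight, data, lr=0.05):
    """Runs on the device: one gradient step on squared error for y = w * x.

    Returns the updated weight and the number of local examples.
    The (x, y) pairs themselves are never sent anywhere.
    """
    grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    return weight - lr * grad, len(data)

def federated_average(updates):
    """Runs on the server: average device weights, weighted by example count."""
    total_examples = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total_examples

# Hypothetical on-device datasets, all following y = 2 * x.
device_data = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
]

global_weight = 0.0
for _ in range(20):
    # Every device starts from the current global weight.
    updates = [local_update(global_weight, data) for data in device_data]
    # The server sees only the updated weights and the example counts.
    global_weight = federated_average(updates)

print(round(global_weight, 3))
```

After a few rounds the shared weight settles close to 2, the value that fits every device's data, even though no device ever handed over its examples.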

In essence, federated analytics offers a way to measure and improve the performance of federated learning models.

We can imagine how useful it might be for health data, for example, where the need for privacy and accuracy is heightened significantly.

Google made a comic about federated learning, which seemed silly to me until I started reading the academic papers on the topic.

The comic, at this point, became a sturdy reference.

Federated learning. Source: Google

There we go. Everything is so much simpler in a comic.

The utopian ideal is to learn from everyone, without learning about anyone.


