Why Is Instagram Search More Harmful Than Google Search?
On Instagram’s Decisions to Disable Search for Sensitive Topics and the Instagram We Can’t Have
By Jeff Allen, Integrity Institute Chief Research Officer and Co-Founder
[Note: This article discusses sensitive topics, particularly eating disorders. I do my best to talk about it as abstractly as possible, but want to give notice to people sensitive to the topic]
On January 8th, 2024, Instagram began disabling search for sensitive queries around eating disorders and self injury. This is probably for the best. And I am very glad that there are people inside the company studying these issues, asking if Instagram is having a negative impact on people, and are brave enough to say “our product as it currently exists is causing people harm, and we need to disable it”. But it is also a reminder of how broken Instagram is as both a platform and a company.
This is a bit personal for me. When I worked for Instagram, I was on a team trying to clean up all the violating content in Instagram’s various content recommendation surfaces, like the Explore tab and content on hashtag pages. And while there were a number of wins for that team, I did consider my time at Instagram a personal failure. It was clear to me that the road to Instagram being a healthy content recommendation system was going to be a long and uncertain one, and I just wasn’t up to keeping at it. (And of course, in hindsight, this is fine. Integrity and Trust & Safety work can be hard! Sometimes, you need a break.)
Content absolutely exists on Instagram that would be appropriate, or even helpful, to show for searches around self injury and eating disorders. But Instagram search is broken at its core, and it seems Instagram is unable to build it in a sensible and responsible fashion. By disabling search for sensitive topics, Instagram is essentially throwing in the towel and admitting that they can’t build safe search engines or content ranking systems (And, given their decision to not recommend “political” content on Threads, “just turn it off” seems like the solution du jour at Meta). And so, while disabling it is more responsible than maintaining the status quo, it is also a reflection of a lack of vision among Instagram leadership about the positive role Instagram could play in the world.
Instagram search frequently returns inappropriate and harmful content in response to very basic search queries around eating disorders. Instagram blocks all content for searches for “bulimic”, as highlighted in their blog post. But you can still search for “bulimia”, and you can click through the help screen to “Continue to search results”. In those search results, there is an alarming number of posts that will be triggering to people who suffer from an eating disorder. Three of the top 15 results for “bulimia” contain images that violate the National Eating Disorder Association’s guidelines for sharing content online, including images of people while they suffered from eating disorders, or quantifying actions they took as a result of the eating disorder. There are also posts advocating for strange, heavily restricted diets. Based on this one sample alone, where 3 of the top 15 results (20%) were problematic, we can say with 95% confidence that more than 5% of the posts returned for a basic “bulimia” search will be problematic, a stat which holds up across my tests. You can also see posts that have content screens on them, meaning Instagram knows they are likely inappropriate. This is all totally unacceptable. It is so disappointing that a search engine used by billions of people around the world would just completely fail to be safe for such a basic search query.
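That confidence claim can be checked with a quick one-sided binomial test: if the true problematic rate were only 5%, seeing 3 or more problematic posts in a sample of 15 would be unlikely. A minimal sketch (the 3-of-15 observation comes from this article; the rest is standard statistics, not anything from Instagram):

```python
from math import comb

def binom_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Null hypothesis: only 5% of results are problematic.
# Observed: 3 problematic posts out of the top 15 results.
tail = binom_tail(15, 3, 0.05)
print(f"P(>=3 problematic out of 15 | rate = 5%) = {tail:.3f}")
# tail is about 0.036, below 0.05, so at 95% confidence
# the true problematic rate exceeds 5%.
```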
It doesn’t have to be this bad. For example, you can search for “bulimia” on Google. And what do you get? The top results are from the Mayo Clinic, the National Eating Disorder Association, the NHS in the UK, Johns Hopkins Medicine, and the Cleveland Clinic. Which makes a lot of sense! The top results on Google are almost entirely medical organizations. Now, you might say, “what if those orgs just don’t have content on Instagram?”. Well, you can also use Google to search Instagram. If you search for “site:instagram.com bulimia” on Google, you will get Google’s take on the best bulimia content on Instagram. This gets you results from the Child Mind Institute, the Limi Hospital, Promises Healthcare, the National Eating Disorder Association, and the UCSD Eating Disorders Center. Again, a lot of really great results. So, the problem isn’t that there is no good content on Instagram to return for sensitive search terms. The problem is that Instagram's search algorithms are completely unable to surface good, helpful content, because Instagram search has been built in a knowingly irresponsible way.
Which is sad for the obvious reason: Instagram is unnecessarily showing people lots of harmful content. But it’s also sad for the less obvious reason: Instagram search could be good for people. It could connect people to trustworthy organizations and people that understand the difficult situation they might be in and know how to help them recover. We could live in a world where Instagram was seen as a helpful tool for people with eating disorders. Instead, Instagram seems to have given up and accepted that the best it can do is not be as terrible for people as it has in the past.
You can also search for “bulimia” on different platforms. Searching for it on YouTube will, like Google, return many posts from medical organizations or major news outlets. TikTok seems to be much more aggressive about blocking sensitive searches, and neither “bulimic” nor “bulimia” shows any results on TikTok. And, to be fair to Instagram, other platforms do worse. Searching for “bulimia” on X/Twitter or Tumblr returns content actually advocating for bulimia, which is just insane. Things can always be worse.
So, why is this? Why is it that Instagram returns inappropriate and harmful content instead of content from the many health organizations on Instagram combating eating disorders? As a data scientist, I will focus on how the content ranking systems differ between Google Search and Instagram. There are two fundamental reasons: Instagram doesn’t know how to build search engine algorithms or content ranking systems, and Instagram has no vision for how Instagram could play a positive role in people’s lives.
How Instagram Works and How Instagram Could Build Its Search (And All Content Ranking)
So how does Instagram search work? I haven’t worked there in a few years now, so I don’t know in detail. But Instagram has been fairly transparent about how their content ranking and recommendation systems work. Instagram search roughly looks like:
Find all the posts that contain the search term
Estimate the probability that you will engage with the post (dwell on the photo, like, or comment)
Estimate the probability that the content violates any of Instagram's policies
Sort the content by some combination of the probability that you will engage with the content (like or comment) minus the probability that the content is violating
The final ranking score might be "probability(Engagement) - 10 * probability(Violating)" or some comparable formula. This is your standard "engagement based ranking" framework, and it is known to amplify harmful content (a quick summary of how it amplifies harms is in an appendix). That engagement has a correlation with low quality, harmful, and violating content is a growing consensus in the industry. There's a lot to be said about the harms that engagement based ranking causes (And I have a ~20 page paper covering it all I'm trying to wrap up and will link here when done...), but the eating disorder case on Instagram search is an illustrative example where it clearly fails. Engagement based ranking is bad, but engagement based ranking on sensitive searches is insane.
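To make the framework concrete, here is a minimal sketch of an engagement-based ranking score. The function name and the violation weight of 10 are illustrative, taken from the hypothetical formula above, not from Instagram’s actual code:

```python
def ranking_score(p_engagement: float, p_violating: float,
                  violation_weight: float = 10.0) -> float:
    """Hypothetical engagement-based ranking score: boost predicted
    engagement, penalize predicted policy violation."""
    return p_engagement - violation_weight * p_violating

# A highly engaging post that narrowly evades the violation classifier
# still outranks a safe, moderately engaging post.
evasive = ranking_score(0.30, 0.01)   # roughly 0.20
helpful = ranking_score(0.10, 0.00)   # 0.10
```

Note what the formula cannot do: there is no term for quality or authoritativeness, so nothing distinguishes a medical organization’s post from a random account’s post with the same predicted engagement.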
So, what is the alternative to "engagement based ranking"? It's not like it's some great secret. Google Search is so old that the founding patents have expired. Google Search has explainer after explainer of how it works. And the results are that Google Search, when restricted strictly to results from Instagram, returns appropriate and helpful content, even for sensitive queries. I don’t want to say Google Search is perfect in every way, because there are definitely shortcomings (discussed in an appendix). But Google does provide a very clear and obvious alternative to engagement based ranking.
Google Search, very roughly speaking, evaluates results along two dimensions: quality and relevance. The results at the top of a Google Search are the ones Google's systems estimate are high quality and highly relevant.
Quality
Google's definition of quality is public; they publish it in their Search Quality Rater Guidelines, which they have done since 2015. Platforms don’t need to be “scared” of making quality assessments of content and doing so is actually important and valuable for their business. Google predicts quality using a wide variety of signals, including very long established information retrieval signals. The most famous of these is PageRank, the founding algorithm of Google.
I have used the Common Crawl to compute PageRank for 20 million Instagram accounts (which, again, I will make public when I have time), and so we can look at what Instagram search might look like if quality and PageRank were worked into the system. For the 20M Instagram accounts we have, we calculate a score based on what percentage of Instagram accounts have a PageRank score lower than the given account. So, a score of 100 means the account has a higher PageRank than every other account in our dataset. A score of 10 means the account has a PageRank higher than 10% of the accounts in our dataset. And a score of 0 would mean the account didn’t show up in our dataset at all, and thus has a PageRank that is too low to easily estimate.
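The percentile scoring described above can be sketched in a few lines. The account names and PageRank values below are made up for illustration; only the scoring scheme follows the text:

```python
from bisect import bisect_left

def percentile_scores(pagerank: dict[str, float]) -> dict[str, float]:
    """Map each account's raw PageRank to the percentage of accounts in
    the dataset with a strictly lower PageRank (0-100 scale). Accounts
    absent from the dataset would get 0 by convention."""
    ranks = sorted(pagerank.values())
    n = len(ranks)
    return {acct: 100.0 * bisect_left(ranks, pr) / n
            for acct, pr in pagerank.items()}

# Toy dataset with hypothetical PageRank values.
toy = {"mayoclinic": 0.9, "neda": 0.5, "random_acct": 0.1, "spam": 0.01}
scores = percentile_scores(toy)
# "mayoclinic" scores 75.0 (higher PageRank than 3 of the 4 accounts);
# "spam" scores 0.0 (no account below it).
```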
We can compare what the PageRank scores for results look like when using Google to search Instagram for “bulimia”, when using Google Search and mapping results to Instagram profiles, and when using Instagram search directly.
You can see that when using Google to search Instagram, the majority of accounts that Google returns have a pretty decent PageRank score. Some are among the most authoritative accounts on the platform. So, there are plenty of Instagram accounts that have good PageRank scores and also have content related to bulimia.
We can also look at the organizations that Google returns for the basic "bulimia" query.
The organizations that Google returns for a general “bulimia” query almost all have an Instagram account, and their Instagram accounts have some of the highest PageRank scores on the platform. There is a lot of highly authoritative health information on Instagram.
What are the PageRank scores for the accounts that Instagram returns for "bulimia"?
Basically none of the accounts that Instagram returns have any PageRank in our calculation. There’s only one result from an authoritative source (Which actually turns out to be an influencer account, so, if you think ranking by quality and PageRank will eliminate all the smaller, independent content creators, that is not the case!). And, of course, none of the three accounts that posted inappropriate content, according to NEDA standards, had any PageRank. So, if you want to know a basic difference between Instagram search and Google Search, the first to note is that Instagram doesn’t use PageRank or any similar signals.
PageRank turns 26 this year. It is so old that the patent has expired and it is now free for anyone to use. When you use PageRank to rank content on Instagram related to bulimia, you get results from trusted medical organizations. But Instagram has chosen to ignore decades of history about how to build search engines. As a result, Instagram’s “highly engaging” results are much less likely to be safe, let alone helpful, to people suffering from or recovering from eating disorders.
Relevance
Google's systems for relevance are also very helpful in keeping their search results safe. When you search for "bulimia" on Google, Google doesn’t simply return all content with the word "bulimia" in it. Google Search understands bulimia as a topic, and it knows what subject areas bulimia lies in. Google has a comprehensive taxonomy that covers many different subject areas. Google Search knows, for example, that "bulimia" is a type of eating disorder. And it knows that eating disorders are a type of mental disorder. And it knows that mental disorders are one aspect of mental health. And that mental health is an aspect of overall health.
Actually understanding content and being able to place it in a hierarchical taxonomy like this helps Google ensure that the results they return tend towards high quality, no matter how sensitive the subject area. For example, if Google runs out of high quality results for "bulimia", they can choose between showing medium quality results for bulimia and showing high quality results for eating disorders overall. For sensitive topics, such as health, Google can decide that it is more important to show high quality results than to stay perfectly on topic. For less sensitive topics, like entertainment or hobbies, they can decide that medium quality content is fine, and it's more important to stay on the topic. Having a strong understanding of content allows them a lot of flexibility in how they tune the search engine.
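Here is a minimal sketch of this kind of taxonomy-aware backoff. The taxonomy entries follow the bulimia example above; the quality thresholds, function names, and result data are hypothetical illustrations, not Google's actual system:

```python
# Hypothetical topic taxonomy mapping each topic to its parent.
TAXONOMY = {
    "bulimia": "eating disorders",
    "eating disorders": "mental disorders",
    "mental disorders": "mental health",
    "mental health": "health",
}

SENSITIVE_ROOTS = {"health"}       # subject areas held to a higher bar
QUALITY_FLOOR_SENSITIVE = 0.8      # only high quality results allowed
QUALITY_FLOOR_DEFAULT = 0.4        # medium quality is acceptable

def is_sensitive(topic: str) -> bool:
    """Walk up the taxonomy to check if a topic falls under a sensitive root."""
    while topic in TAXONOMY:
        topic = TAXONOMY[topic]
    return topic in SENSITIVE_ROOTS

def broaden(topic, results_by_topic, min_results=3):
    """For sensitive topics, back off to the parent topic rather than
    lowering the quality bar when high quality results run out."""
    floor = QUALITY_FLOOR_SENSITIVE if is_sensitive(topic) else QUALITY_FLOOR_DEFAULT
    while True:
        hits = [r for r in results_by_topic.get(topic, [])
                if r["quality"] >= floor]
        if len(hits) >= min_results or topic not in TAXONOMY:
            return topic, hits
        topic = TAXONOMY[topic]

# Only one high quality "bulimia" post exists, so the query broadens
# one level to "eating disorders" instead of showing low quality posts.
results = {
    "bulimia": [{"quality": 0.9}, {"quality": 0.5}],
    "eating disorders": [{"quality": 0.95}, {"quality": 0.9}, {"quality": 0.85}],
}
topic, hits = broaden("bulimia", results)
```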
Instagram search performs as if Instagram has very little understanding of content topics and how they connect. And this makes it much more likely that low quality, inappropriate, and harmful content will show up in their results. Once Instagram runs out of “safe” posts with the word or hashtag “bulimia” in them, Instagram search has nothing left to offer except the “unsafe” posts. And the system has little flexibility to be tuned to favor quality over relevance, or relevance over quality.
Demoting Violating and Borderline Content Doesn't Work
Now, Instagram does attempt to make their search results safer by using machine learning classifiers that try to identify violating content. Instagram calls these “Integrity Demotions”, and Instagram uses these to demote potentially violating content in search. But, as the results you see when you search "bulimia" show, they just don't solve the problem.
They don't work because the online community that is pro eating disorder is highly adversarial. The words they use and their imagery will be constantly shifting to avoid the classifiers that Instagram makes. And even the pro recovery communities will still have to use the slang and “algospeak” that will prevent them from being caught in Instagram's algorithms. And these shifts will largely be successful. Violating content classifiers are always backwards looking, only trained on the harms that have already existed long enough for the platform to detect them, label them as violating, and retrain the classifier.
This framework for removing harmful content doesn’t incentivize people to produce content that is helpful and not harmful. It doesn’t incentivize creators to have a more positive impact on their audiences' health. It just incentivizes people to come up with new terms, imagery, and tricks to evade the violating content classifiers.
So, the framework of "boost all engaging content, which will give high scores to harmful content, but then demote the subset of harmful content that are classified as violating" just doesn't work. Mark Zuckerberg first laid out this framework in 2018. I was an employee at the time, and I said then that it was a broken framework that wouldn’t work. And 6 years later, with Instagram turning off search for sensitive topics, it seems that Instagram is agreeing that it doesn’t work.
Instagram Doesn't Have a Vision for the Positive Role It Could Play in People’s Lives
Why did Google and Instagram end up with such different frameworks for ranking content? It’s not exactly like the people working on Instagram search have never heard of PageRank; they almost certainly have. The reason they don’t incorporate PageRank into Instagram search is that Instagram doesn’t have any vision for what positive impact the platform could have in people’s lives, and without that positive vision, PageRank isn’t particularly useful in ranking content. To be clear, there are certainly people working at Instagram who do have that positive vision, but leadership doesn’t seem to be included there. There doesn’t seem to be a positive vision consistently expressed throughout the platform or in public company statements. Instagram's content philosophy is basically a bag of synonyms for "engaging" (interesting, meaningful, relevant, etc).
Google does have a positive vision for how they want the platform to help people. Google’s mission is “to organize the world's information and make it universally accessible and useful”. And the key word there is actually “useful”. “Useful” tells them how to define content quality. This lets them create positive, neutral, and negative tiers of content. High quality content is useful; low quality content is not useful.
The exact description and definition of these tiers can be found in their Search Quality Rater Guidelines (And if you’re reading this article but haven’t yet read Google’s quality guidelines, please stop! And read Google’s guidelines! They are amazing and should be copied throughout the industry!). PageRank is very helpful and effective at separating useless content from highly useful content. And so, because Google has a clear positive vision for the content that they want to connect people to, PageRank and other similar information retrieval signals end up being very important in their ranking systems.
I don’t know the exact tiering system that Instagram has. But based on how they describe their ranking systems, it probably looks something like this:
Basically, Instagram only divides content into “okay” and “bad” buckets. There is little sense of what higher or lower quality content might look like within the “okay” bucket.
And it gets worse, because the only real goal Instagram has is to increase usage and engagement with the platform. Increasing engagement means that any violating content that has slipped through the large cracks in Instagram's detection systems will be ranked very highly and likely to be shown. So the framework for ranking content at Instagram really looks like this.
There’s nothing in this framework that would help you distinguish between content from NEDA and random accounts pushing strange diets and practices. There’s nothing in this framework that would help you distinguish between content from the Mayo Clinic and content from random accounts that just barely avoid the borderline and sensitive content definitions and classifiers. It does nothing to separate out and elevate great, helpful, and safe content.
Instagram’s mission statement when I was there was “to connect you with the people and things that you love”. Which is a pretty good mission. There are a lot of lovable people and things that Instagram can deepen people’s connection with. Now, people do “love” things that are bad for them. People in eating disorder communities can, unfortunately, be deeply invested in them. But eating disorders do not love people back; they only sap people’s mental and physical wellbeing.
When I was at Instagram, I would use the example of nicotine. When I was a smoker, I definitely “loved” nicotine. However, nicotine never loved me back. All it wanted was to take years off my life and money from my wallet. I also love math. And math absolutely does love me back. My love of math took me to a good college and has gotten me great and fulfilling jobs. I would probably use Instagram way more than I currently do if it, as a platform, realized that “Jeff really loves math, and we should connect him to the content creators active on Instagram that are creating math related content and care about enriching the lives of their audience”.
It’s pretty easy to develop a content quality framework that would better align with Instagram’s values and mission. The highest quality content would come from people who know and care deeply about the subject area of their Instagram content and also care about having a positive impact on their audience. Low quality content could be defined as coming from accounts that don’t particularly care about their topic, don’t care about their audience, and are primarily running the Instagram account for their own self interest (audience farming to make a quick buck). It is pretty easy to make objective evaluations of content and accounts to assess this, and the Google guidelines are a great starting point. If Instagram had a framework like this in place, then PageRank would pop out as being a very useful signal.

Most of this is just bringing common sense into ranking. If you asked any reasonable person "what is safer and more helpful to show when someone searches for 'bulimia': a post about bulimia from the Mayo Clinic or a post about bulimia from a random account that has no medical expertise or training?", I would wager everyone would pick the Mayo Clinic. And if the follow up was "but wait, what if the post from the random account had a lot of engagement on it?", I would still wager that everyone would pick the Mayo Clinic. But it isn't always about big organizations vs. independent creators. If the question was "what is safer and more helpful: a post from an independent creator account that has medical expertise and training, or a post from a random account that has no medical expertise and training?", I think the creator with medical expertise is going to win.

So it’s not impossible for Instagram to develop a comprehensive framework for ranking content that would push them away from engagement based ranking and towards a more quality focused ranking framework.
They just consistently choose not to, at the expense of the wellbeing of their users.
What Should Instagram Do?
Let’s imagine that Instagram actually wanted to change and to build a platform that wasn’t just safe but lived up to Instagram’s mission and values. What are the first steps they should take?
First, Instagram should actually take the harms seriously and staff up the teams that are testing and improving the safety of the platform. The current experience is pretty inconsistent, as discussed in an appendix. And given the current state of Instagram’s safety, there needs to be many more people on the safety teams, finding those inconsistencies and all of the gaps in the current safety systems. Instagram should also ensure enough subject matter experts are on the staff of these teams and are properly empowered. This would help with problems like “bulimic” being blocked as a search term but not “bulimia”, and would also capture all the algospeak terms that are currently not treated, which are very easy to find. This should also include making sure there are enough topic area experts in partnerships.
Second, Instagram needs to transform how it does content and account ranking and recommendation. This begins with building out a system that can actually evaluate both the quality and safety of Instagram’s results. I personally would start by building a system that could measure the safety and appropriateness of search results and benchmark them against Google’s results for the same query. This would enable “Are Instagram’s results as safe as Google’s?” to be an actual metric that gets tracked and even used in tests of new ranking changes.
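As a sketch of what such a benchmark metric could look like: have expert raters label the top k results from each engine as safe or unsafe for a query, then compare the safe fractions. Everything below, including the function names and toy labels, is a hypothetical illustration:

```python
def safety_rate(results, is_safe, k=15):
    """Fraction of the top k results that expert raters judged safe."""
    top = results[:k]
    return sum(is_safe(r) for r in top) / len(top)

def meets_benchmark(ig_results, google_results, is_safe, k=15):
    """The tracked metric: are Instagram's top results at least as safe
    as Google's for the same query?"""
    return safety_rate(ig_results, is_safe, k) >= safety_rate(google_results, is_safe, k)

# Toy rater labels for one query (True = judged safe); entirely made up.
instagram_results = [False, True, True, False, True]
google_results = [True, True, True, True, False]
ig_rate = safety_rate(instagram_results, bool, k=5)      # 3/5 safe
google_rate = safety_rate(google_results, bool, k=5)     # 4/5 safe
```

A metric like this could gate ranking experiments: a change that drops Instagram below the Google benchmark for sensitive queries doesn't ship.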
Instagram should then build out an actual framework of content and account quality, as discussed in the previous section. The easiest place to start is with the Google Search Quality Rater Guidelines. After a decent collection of content has been evaluated for quality, Instagram can build systems to predict quality, and use that in ranking, instead of the current engagement focused framework.
And finally, Instagram should actually build out a real system for understanding the topics of content posted to the platform. This would enable Instagram to be able to make the same informed decisions of how to balance relevance and quality in ways that make the fundamental ranking systems much safer and less likely to return harmful content.
If all this seems like a lot, that’s because it is! Building safe content ranking and recommendation systems is hard! The Google Search Quality engineering team is full of people with advanced degrees and specialties in niche fields like information retrieval and information theory. It’s hard, and there’s no right answer. Which is ultimately what makes it such a challenging, but fulfilling, problem space.
Conclusion
But Instagram has chosen a different path. Instagram has effectively killed search for sensitive topics. Given the current state of Instagram, and their hopeless devotion to engagement based ranking systems, which tend to show people all the harmful content which slips through their detection systems, I think this is for the best. And I’m grateful for the people inside Instagram who have been brave enough to say that the current state of Instagram search is doing more harm than good.
But it is also sad and disappointing. Instagram has a wide variety of content across many sensitive topic areas that could be helpful to people. You can just look at the National Eating Disorders Association Instagram account. There is plenty of engaging content there that is well designed and crafted to help people recovering from eating disorders. NEDA includes people sharing their personal stories in ways that support others in their recovery journey. It’s entirely possible for Instagram to build a search and content ranking system that aligns with that, so that we could feel good about people recovering from eating disorders using Instagram. Hopefully someday Instagram, as a company, wakes up and builds that.
But until then, we will just have to keep trying to spread the message:
Appendix A: There is a risk that Instagram's eating disorder content classifiers might not be working, or might be performing poorly, in Spanish
When I search for “bulimia” on Instagram, 11 of the top 15 results are in Spanish. Which is strange, because my language settings for the app are all English. Why would Instagram return so many Spanish language results for bulimia? There could very well be a benign explanation. Perhaps there is much more content in Spanish on the subject. Or Spanish language content on the subject gets much more engagement. Both totally possible and reasonable.
But, there could also be a very alarming reason. As discussed earlier, Instagram’s framework for content ranking is “push up all the engaging stuff, but then demote things that are predicted to be sensitive or violating”. This means that, to remove sensitive content about eating disorders, you can’t have just one classifier, you need a classifier for every language on earth. If the English language “sensitive or violating content about eating disorders” classifier was working and set to an aggressive threshold, but the Spanish language “sensitive or violating content about eating disorders” classifier wasn’t working or was set to a lower threshold, then you could end up in the situation where all the English language content about eating disorders gets demoted heavily, but Spanish language posts aren’t.
This would mean that Instagram search was much more dangerous outside of the English language, because Instagram’s content classifiers for those languages aren’t performing as well as the English ones. This again highlights the weakness of Zuckerberg’s framework for demoting harmful content. It requires you to have harmful content classifiers for every language in the world. And it is inevitable that the harmful content classifiers for many of those languages will perform poorly.
Appendix B: The overall safety experience on Instagram is inconsistent
I used eating disorders as the example of sensitive content in this article specifically because “bulimic” was a term that Instagram highlighted in their blog as a term that was now fully blocked from showing results. Instagram was saying that it was now safer to search around sensitive topics, so I decided to try the topics they explicitly mentioned as safer in their blog post. I confirmed “bulimic” was blocked. Then I checked if other terms were also blocked. “Bulimia” was the first term I tried, and I was able to see results for it.
There could be a fairly benign reason for this. I would wager that the way they determine which searches to block is based on an estimate of the “sensitive content rate” for results for the search term. Maybe “bulimia” just has a lower sensitive content rate and so doesn’t get the blocking treatment, and that was an intentional choice of Instagram.
But if it wasn’t an intentional choice to allow “bulimia”, and if Instagram intended to be more comprehensive in blocking all terms related to eating disorders, then it does suggest a pretty inconsistent experience. As mentioned, TikTok is much more aggressive and comprehensive in blocking sensitive searches. We can check, for a handful of searches, which are blocked on Instagram and TikTok:
And that’s just me playing around with “hashtag generator” type tools, and using Instagram’s search autocomplete, for 30 minutes. The motivated people in the pro eating disorder community are almost assuredly using different terms and are much better at avoiding the “integrity demotions”. Overall, it doesn’t seem like the philosophy behind which terms to block or allow on Instagram is coherent. Why is “bulimic” blocked in English but the feminine Spanish equivalent, “bulimica”, allowed? Why is the masculine Spanish form, “bulimico”, blocked? If it’s all just left to algorithms using sensitive content rates, it could make sense. But maybe this isn’t a topic area that should just be left to algorithms without subject matter experts overseeing it?
Appendix C: Google Search Also Has Issues
I honestly am a fan of Google Search, because I do think that it is a search engine that, at its core, is responsibly designed. However, that doesn’t mean it is without flaws. And in the process of writing this article, two problems jump out.
First, image search on Google has lots of problems for eating disorder searches. Even for a basic query like “bulimia”, there’s a lot of inappropriate images. For some reason, a common “generic” image for “bulimia” is people eating pizza off of a toilet lid?! Which feels off and just inappropriate. Now, why do these images show up? It isn’t because Google is pulling the images from untrustworthy sources. In fact, the sources of the images are ones you would expect to be very responsible here! For example, the website of the UK Addiction Treatment Centres contains one such image. A lot of these images trace back to stock image websites.
You can also see lots of inappropriate images when you search for eating disorder related terms, like “thinspo”. Again, these images are coming from articles on reputable publishers, like the Atlantic, Business Insider, Cosmopolitan, and even academic journals. The articles are all raising awareness of the problem and documenting the practices of pro eating disorder communities. But, on the Google images search results page, that context is lost.
A lot of this feels related to a common problem in journalism, where the reporters that write articles do not always have control over the headline text for their pieces. It feels like a different team is picking the headline images. It also suggests that there should be standard practices when including images with articles discussing these problems, such as blurring or otherwise marking the image, rather than uploading raw unedited images. Or perhaps an html tag, such as “noindex”, that enabled publishers to flag to Google that the image is an inappropriate result for image searches. There’s a reason why I didn’t include any screen shots of the images I found while looking into Instagram! And that’s because I honestly don’t know how to responsibly include them in an article like this.
In addition to issues in image search, Google Search has a bad habit of returning the hashtag pages of social media websites for problematic searches. Again, for “thinspo”, Tumblr is the third-highest result. This is somewhat understandable, because Tumblr is a large and important domain on the web. But the content that Tumblr shows on its hashtag pages can be very problematic, and Tumblr, like X/Twitter, clearly doesn’t have a good handle on these problem spaces. So Google Search would likely be improved by removing social media discovery pages from its search results for sensitive “Your Money or Your Life” queries, or at least doing so when the social media website doesn’t have problematic content under control.
Appendix D: Quick Demonstration that Engagement-Based Ranking Is Harmful
Engagement-based ranking will favor and amplify harmful content. This is best understood by starting with what Mark Zuckerberg called “The Natural Engagement Pattern” back in 2018. In a note, Mark Zuckerberg described how content that was closer to being policy violating (harmful) received more engagement.
On its own, this chart is mostly a statement of human nature. People engage more as content becomes more and more harmful. This is nothing new; it manifests itself as “rubber necking” and leads to “if it bleeds, it leads” evening news. In his note, Mark Zuckerberg says "Our research suggests that no matter where we draw the lines for what is allowed, as a piece of content gets close to that line, people will engage with it more on average -- even when they tell us afterwards they don't like the content," and "Interestingly, our research has found that this natural pattern of borderline content getting more engagement applies not only to news but to almost every category of content", which he says includes nudity and hate speech, and finally "This is a basic incentive problem." Mark Zuckerberg is very clear. Engagement creates an incentive to post content that is as harmful as possible while still avoiding the policy lines and demotions. So, how can we go from a statement of human nature to a statement about platform design? The first step is to realize that if the machine learning systems that predict what engagement actions a user might take are doing their job, then the actual engagement a piece of content receives should closely track its predicted engagement.
As content becomes more harmful and gets closer to policy violating, we should fully expect the predicted engagement scores to increase. This on its own shows the basic problem, but we can take it further and create a chart that the platforms can actually measure. The next step, to create a measurable chart, is to flip the x and y axes. This will still produce an “up and to the right” shape, coming from the positive correlation.
Finally, the “nearness to a policy violation” is not something that is easy to quantify at scale. Instead, we could ask, “for each bucket of predicted engagement scores, what percentage of content in that bucket has been found to violate policies?” Basically, if the predicted engagement score is between 0 and 0.1, what percentage of content is harmful? And compare that to what you see when the predicted engagement score is between 0.9 and 1. That should behave the same way as “nearness to a policy violation”.
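This bucketing measurement is simple to compute if, for each post, you have its predicted engagement score and a label for whether it was later found violating. A minimal sketch in Python; the dataset here is synthetic, invented purely to illustrate the mechanics, with the Zuckerberg-style correlation between nearness-to-the-line and engagement deliberately built in:

```python
# Bucket posts by predicted engagement score, then compute the
# share of policy-violating content in each bucket.
import random

random.seed(0)

# Synthetic dataset: (predicted_engagement_score, is_violating).
# We build in the pattern described above: the closer content sits
# to the policy line, the higher its predicted engagement.
posts = []
for _ in range(100_000):
    nearness_to_line = random.random()          # 0 = benign, 1 = at the line
    score = min(1.0, max(0.0, random.gauss(nearness_to_line, 0.15)))
    is_violating = nearness_to_line > 0.97      # a small slice crosses the line
    posts.append((score, is_violating))

def violation_rate_by_bucket(posts, n_buckets=10):
    """Percent of policy-violating content in each predicted-engagement bucket."""
    totals = [0] * n_buckets
    violating = [0] * n_buckets
    for score, bad in posts:
        b = min(int(score * n_buckets), n_buckets - 1)
        totals[b] += 1
        violating[b] += bad
    return [100 * violating[b] / totals[b] if totals[b] else 0.0
            for b in range(n_buckets)]

for b, rate in enumerate(violation_rate_by_bucket(posts)):
    print(f"score {b / 10:.1f}-{(b + 1) / 10:.1f}: {rate:5.2f}% violating")
```

Running this, the violation rate climbs steadily across buckets: the highest-predicted-engagement bucket is overwhelmingly where the violating content concentrates, which is exactly the chart described above.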
You can see that we should naturally expect that as the predicted engagement scores get higher, the percentage of harmful content gets higher too. This is a chart that is fairly straightforward for the platforms to measure. Engagement-based ranking very clearly will amplify harmful content and, as Mark Zuckerberg said, create an incentive for accounts to post it.
And what does Instagram actually predict in order to rank content? Instagram is quite transparent about this, which I appreciate. They list the signals here. And for ranking content on Feed, Explore, and Search, it’s engagement all the way down.