Enraged Engagement (Part Deux)

My last post described what I referred to as the Enraged Engagement of America, which focused on the role of Cable TV News in the polarization of our country.  

Today, we will see how social media companies are adding gasoline to this fire - the fire that Fox News and MSNBC have already started.  

As you may have seen in the recent Netflix production "The Social Dilemma," the technologies and business models used by Facebook and Twitter are fueling the most extreme polarization that our country has seen since the Civil War, and we have yet to experience the full impact. 

It All Starts with a Suggestion to Watch a Movie

Many of my readers are undoubtedly familiar with Netflix's uncanny ability to recommend movies to you that you'll actually enjoy.  

The technology powering the Netflix recommender system is called collaborative filtering, and it is one of the reasons the company now has a market cap of $212B.

Netflix's Movie Recommendation System

Netflix's collaborative filtering software automatically groups people - and movies - into similar categories, so it can correlate the types of people and types of movies that go together.  It can use this information to suggest movies that you haven't yet seen, based on the kind of person you are and the movies that people who have similar tastes have rated highly.
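The grouping-and-correlating idea fits in a few lines of code.  Below is a toy sketch with invented users, movies, and ratings; production systems use techniques like matrix factorization over millions of users, but the core idea is the same:

```python
# Toy user-based collaborative filtering. All users, movies, and ratings
# are invented for illustration.
import math

ratings = {
    "Alice": {"A": 5, "B": 4, "C": 5},
    "Bob":   {"A": 1, "B": 2, "C": 1},
    "Carol": {"A": 4, "B": 5, "C": 4},
    "Dana":  {"A": 5, "B": 4},          # Dana hasn't seen movie C yet
}

def similarity(u, v):
    """Cosine similarity over the movies two users have both rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[m] * v[m] for m in shared)
    norm = lambda w: math.sqrt(sum(w[m] ** 2 for m in shared))
    return dot / (norm(u) * norm(v))

def predict(user, movie, k=2):
    """Weight the k most similar users' ratings for the unseen movie."""
    neighbors = sorted(
        ((similarity(ratings[user], r), r[movie])
         for name, r in ratings.items() if name != user and movie in r),
        reverse=True)[:k]
    return sum(s * r for s, r in neighbors) / sum(s for s, _ in neighbors)

print(round(predict("Dana", "C"), 1))   # → 4.5
```

Dana's ratings track Alice's and Carol's, so the system predicts she will like movie C too, even though Bob (whose tastes differ) hated it.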

The Netflix Prize

Back in 2006, Netflix announced a contest to improve their movie recommendation algorithm (called Cinematch). Netflix provided a training data set of 100,480,507 ratings that 480,189 users gave to 17,770 movies. Various organizations competed with highly customized collaborative filtering algorithms.

On September 21, 2009, the grand prize of US$1,000,000 was awarded to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06%.
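The 10.06% figure measures improvement in prediction error (root-mean-square error, or RMSE), not revenue.  Using the publicly reported test-set RMSEs, the arithmetic works out like this:

```python
# Percentage improvement behind the Netflix Prize, computed from the
# publicly reported test-set RMSEs (error in predicted star ratings).
rmse_cinematch = 0.9525   # Netflix's own Cinematch algorithm
rmse_bellkor = 0.8567     # BellKor's Pragmatic Chaos, the winning team

improvement = (rmse_cinematch - rmse_bellkor) / rmse_cinematch * 100
print(f"{improvement:.2f}%")   # → 10.06%
```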

As we will see, collaborative filtering is arguably one of the most financially impactful pieces of software yet invented.  The algorithm is broadly considered a form of machine learning, which is proving to be the 21st century equivalent of the steam engine.  

Amazon (market cap $1.5T) relies on this technology for displaying items they think you will buy.  Microsoft ($1.5T) uses it in their Azure platform, offering it as part of their machine learning toolkit.  Google (market cap almost $1T) uses collaborative filtering for suggesting YouTube videos.  

Collaborative Filtering and Social Media

Facebook ($700B) and Twitter ($31B) both use collaborative filtering to suggest new groups to join and various people and businesses to "like" - and as we will see, the technology applied in this way becomes extremely viral and has pernicious consequences.    

(n.b. Not all machine learning programs have these types of consequences - other forms of the software enable important applications like self-driving cars, speech recognition and email inbox spam filtering.)

Facebook's Collaborative Filtering System

Collaborative Filtering is a recommender systems technique that helps Facebook users discover items that are most relevant to them. This might include pages, groups, events, games, and more. The system is based on the idea that the best recommendations come from people who have similar tastes. In other words, it uses historical item ratings of like-minded people to predict how someone would rate an item.

Facebook’s average data set for collaborative filtering has 100 billion ratings, more than a billion users, and millions of items. In comparison, the famous Netflix Prize recommender competition (described above) featured a much smaller industrial data set with 100 million ratings.

A technical description of how Facebook's collaborative filtering system works can be found here.

Just as the railroads made the robber barons of the late 19th century fabulously wealthy, machine learning (and more specifically collaborative filtering) is making a few people very rich.

And these algorithms, along with supporting business models, are turning millions of average American citizens into hate-spewing, hyper-engaged, political monsters.  

I'll cover how I believe this happens in the next section.  First, we need to see exactly how much social media companies know about you.  We'll use Facebook as an example.

What Facebook Knows About You

Facebook knows quite a bit about you based on things you enter into the system (or things they can easily figure out like your IP address which gives your location):
  • Your birthday (your age)
  • Your gender
  • Where you live 
  • What your job is (and previous jobs)
  • Where you spend most of your time
  • What kind of cell phone you have
  • Who your friends are
  • Your political affiliation

In addition, when you get on Facebook, just about everything you do is measured and also stored in their servers.  Some examples include:

  • When you stop or slow down scrolling
  • How much time you spend reading a particular post
  • Which ads you click on
  • Which suggestions you click on
  • What organizations and famous people you "like"
  • The attributes of your friends and how often you communicate with them
  • The time of day you are most active
  • The physical locations and devices from which you access Facebook
Facebook uses this information (plus other sources) as input into their machine learning algorithms.  The algorithms are able to infer quite a bit about you as a result, and these attributes are also stored on their server and are continuously updated as you scroll and click.

One way to see how much Facebook knows about you (or can infer) is to go create an actual ad campaign (see box below).  Anybody can do this, although you'll need something other than a personal account to do it - such as a band or a business.

(It is possible to create a "pretend" business on Facebook if you want to try this out.  It won't cost anything, unless you click the "go" button.)

Creating an Advertisement on Facebook

The best way to see some of what Facebook knows or has inferred about its users is to create an actual advertisement on Facebook's ad platform.  I have done this for my bluegrass band quite a few times.  As part of the ad creation process, Facebook gives you access to literally thousands of ways of targeting your audience.

The first time I created such an ad, I was absolutely astonished at the information which is available about Facebook users.  For example, shown below are some of the detailed ad targeting categories that Facebook stores for each of their users.  

Note that there are literally thousands of subcategories underneath the items listed here (accessible as drop-downs in the online dialog boxes) which can be used to very precisely and specifically target ad campaigns.
  • Demographics:
    • Education level, schools attended, field of study
    • Financial (income level)
    • Life Events: Anniversary (where a spouse is away from home at the time, or not), Birthday
    • Friends (lots of categories here including friends with birthdays, anniversaries of newlywed friends, newly engaged, long distance relationships, new relationships, etc. etc.)
    • Parents (new parents, parents with adult children, parents with preschoolers, etc. etc.)
    • Relationship status (detailed categories incl. civil union, married, divorced, open relationship, etc.)
    • Work:  Industries (very detailed subcategories), employers, job titles
  • Interests
    • Business and Industry (whole host of categories and industry types)
    • Entertainment (live events, types of music, games, many more)
    • Family and Relationships (dating, fatherhood, parenting, etc. etc.)
    • Fitness and Wellness (bodybuilding, physical fitness, yoga, etc.)
    • Food and Drink (restaurants, cooking, beverages and so forth)
    • Hobbies and Activities (Arts, Music, Home and Garden, etc. etc.)
    • Shopping and Fashion (Beauty, clothing, etc.)
    • Sports and Outdoors (outdoor recreation - many categories, sports - football, baseball, etc.)
    • Technology (Computers, etc.)
  • Behaviours
    • Anniversary (within 2-3 months)
    • Consumer Classification (various countries throughout the world)
    • Digital Activities (lots and lots of categories such as browser version, device type, etc.)
    • Expats (lived in specific countries of the world - numerous examples)
    • Mobile Device User (multiple categories allowing selection of particular mobile devices, connection types, and a bunch of other parameters)
    • Mobile Device User / Device Use Time (how long / how often person uses their device)
    • More categories (misc, e.g. interested in upcoming events)
    • Politics (US) (Likely engagement - Conservative / Liberal)
Each one of the above categories allows narrowing down (or broadening out) to a very specific type of audience. You can also limit to a geographical area (either where people live or are currently located).  The tool will then tell you exactly how many people are in your potential audience and what the cost is for running the ad.
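Conceptually, each targeting parameter is just another AND-filter applied over Facebook's stored attributes.  Here is a minimal sketch of that narrowing process; the user records and attribute names are invented for illustration:

```python
# Hypothetical audience narrowing: every targeting parameter is an AND
# filter, and each added filter shrinks the potential audience.
users = [
    {"id": 1, "age": 34, "city": "Austin", "interests": {"bluegrass", "yoga"}},
    {"id": 2, "age": 58, "city": "Austin", "interests": {"bluegrass", "golf"}},
    {"id": 3, "age": 29, "city": "Boston", "interests": {"yoga"}},
    {"id": 4, "age": 41, "city": "Austin", "interests": {"cooking"}},
]

def audience(users, city=None, min_age=None, interest=None):
    """Keep only the users matching every supplied targeting parameter."""
    out = []
    for u in users:
        if city and u["city"] != city:
            continue
        if min_age and u["age"] < min_age:
            continue
        if interest and interest not in u["interests"]:
            continue
        out.append(u)
    return out

print(len(audience(users, city="Austin")))                        # → 3
print(len(audience(users, city="Austin", interest="bluegrass")))  # → 2
```

The real platform does the same thing at a billion-user scale, which is why it can quote you an exact audience size and price as you click through the drop-downs.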

Once you've launched your ad campaign, there are many tools available on the platform to measure the effectiveness of your advertisement.  These tools give the advertiser a very good idea of what works and what doesn't as the ad campaign is unfolding.  In addition, you can be sure Facebook's algorithms are also watching each and every ad campaign and learning about the audience response, factoring that into their platform.
As you can see from the above box, there are literally thousands of parameters that Facebook either knows about you or has inferred - and they have stored their best guess for every single one of these attributes (that apply) for each and every Facebook user - including you!

The Facebook Tracking Pixel

Did you ever see an ad on Facebook from an online shopping website (such as Amazon) where you had just abandoned a purchase? With the exact item that was in your shopping cart?   Eerie, huh?

Or perhaps you have been bombarded with Facebook ads from recently visited websites that Facebook should know nothing about. 

How did they do that? Have they hacked your cell phone?  Are they reading your mind?

Well, no.  

This is all because Facebook uses something called a tracking pixel.  It is a small piece of code that an advertiser puts on their website that notifies the Facebook servers whenever a user visits.  When the user returns to Facebook, it causes the display of an advertisement that is relevant to the advertiser's website visit (this is called ad retargeting).  
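The mechanism is simple enough to sketch in full.  Below is a hypothetical stand-in using only Python's standard library; Facebook's real pixel is a JavaScript snippet plus cookies and much richer event data, but the core trick (a tiny image request that leaks the visit) is the same:

```python
# Minimal sketch of a tracking-pixel endpoint. This is an illustrative
# stand-in, not Facebook's actual implementation.
from http.server import BaseHTTPRequestHandler, HTTPServer

# A valid 1x1 transparent GIF - the classic "pixel" payload.
PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00"
         b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01"
         b"\x00\x00\x02\x02D\x01\x00;")

class PixelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The interesting part: the request itself reveals who visited what.
        visit = {
            "ip": self.client_address[0],
            "page": self.headers.get("Referer"),     # page embedding the pixel
            "browser": self.headers.get("User-Agent"),
        }
        print("visit logged:", visit)                # a real server stores this
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.send_header("Content-Length", str(len(PIXEL)))
        self.end_headers()
        self.wfile.write(PIXEL)

# To run it: HTTPServer(("localhost", 8080), PixelHandler).serve_forever()
```

Every page that embeds the pixel image quietly reports each visit back to the server, and the cookie sent along with the request ties that visit to a logged-in Facebook account.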

The Facebook pixel also allows advertisers to measure the effectiveness of their Facebook ads by allowing the calculation of click-through rates (and other metrics).  In addition, tracking pixels enable a handy tool for advertisers called lookalike audiences that they can use to expand their ad campaigns.
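Both of those advertiser tools reduce to simple arithmetic over the pixel's event log.  A sketch with invented numbers and interest sets (real lookalike modeling uses far more attributes, but the idea is the same):

```python
# Click-through rate: what fraction of people who saw the ad clicked it.
impressions, clicks = 12_000, 180
ctr = clicks / impressions
print(f"CTR: {ctr:.2%}")   # → CTR: 1.50%

# Crude "lookalike audience": rank prospects by how many interests they
# share with people who already converted. All data here is invented.
converters = [{"yoga", "bluegrass"}, {"yoga", "cooking"}]
prospects = {"u1": {"yoga", "golf"}, "u2": {"chess"},
             "u3": {"bluegrass", "yoga"}}

def overlap(interests):
    """Most interests this prospect shares with any single converter."""
    return max(len(interests & c) for c in converters)

lookalikes = sorted(prospects, key=lambda u: overlap(prospects[u]),
                    reverse=True)
print(lookalikes[0])   # → u3, the prospect most like the converters
```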

The Business Model: Increasing Engagement

Now we have seen that Facebook knows (or has inferred) lots of information about each user.  They also have some great machine learning code such as collaborative filtering to use in order to display the optimal content for each and every user.  

Let's see how they can make bundles of money with these things.

Facebook has the ability to do something that TV ads could never do:  
  • Measure engagement and update content and algorithms to further increase engagement.
This can happen automatically.  With machine learning (see box).

Machine Learning in Practice

Let's illustrate (supervised) machine learning by using an example:  Say we want our computer to tell us whether there is a cat in a picture or not.

The first step is to present the algorithm with lots and lots of pictures: ones with cats in them and ones without cats.   For each one, the computer is told ahead of time what the right answer is (either there is a cat or no cat in the picture).  This is the so-called training set.

Once the algorithm has been properly trained, a validation set is presented to the computer to make sure it gives the right answer when presented with new pictures (not previously seen by the computer). This is kind of like the final exam given to the algorithm before it is deployed.

Once it passes the exam, the algorithm will be able to identify a new picture (that it hasn't seen before) and tell us whether or not it contains a cat.
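The train-then-validate workflow above can be sketched with a toy "classifier."  The two numeric features (ear pointiness, whisker count) are invented stand-ins for real image pixels, and nearest-neighbor matching stands in for a real neural network:

```python
# Toy version of the cat-detector workflow: "train" on labeled examples,
# then check accuracy on a validation set the model has never seen.
# Features and labels are invented for illustration.

# Training set: (features, label) pairs where the right answer is given.
train = [
    ((0.9, 24), "cat"), ((0.8, 20), "cat"), ((0.7, 22), "cat"),
    ((0.2, 0), "no cat"), ((0.1, 2), "no cat"), ((0.3, 1), "no cat"),
]

def classify(features):
    """Predict the label of the closest training example."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda ex: dist(ex[0], features))[1]

# Validation set: the "final exam" on previously unseen examples.
validation = [((0.85, 21), "cat"), ((0.15, 1), "no cat")]
correct = sum(classify(f) == label for f, label in validation)
print(f"validation accuracy: {correct}/{len(validation)}")   # → 2/2
```

Note that even in this tiny example, the model never states a rule like "cats have pointy ears"; it just matches patterns, which is exactly the opacity problem discussed next.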

A Key Issue with Machine Learning

The problem is that the machine learning algorithm didn't tell us how it figured out there was a cat in the picture.  Was it the pointy ears?  The whiskers?  The shape of the eyes?  There is usually no way to know.

Machine learning algorithms do not know or understand why they work.   This is one of the key problems - whatever the computer did, it worked - but you don't usually know why or how. Also, there is a lot of debate in the technical community about the ethics of these algorithms (or how to introduce ethics into the determinations, and how to remove biases).

So if you work at Facebook and you set the objective for your algorithm to increase engagement by your users, the machine learning code will go out and do that.  But you don't have any control over how that happens or what the computer did to increase engagement.

However, the thing that is important to Facebook is engaged users.  And their algorithms certainly accomplish that.

There are two reasons why Facebook wants increased user engagement:
  • Engaged users spend more of their time on Facebook instead of on other platforms, which benefits Facebook.
  • An engaged user can be sold to an advertiser at a higher price.
Now we know that Facebook is using machine learning to increase user engagement, but as described above, they don't know precisely what the algorithm is doing to actually increase the engagement.  There are often unintended consequences to such an approach (see box below).

Microsoft's Tay

On March 23, 2016, Microsoft gave birth to Tay (stands for "thinking about you").  Tay was a bot - 100% artificial intelligence - with no human intervention.  Microsoft created a Twitter account for Tay so people could interact with it.

Tay started to reply to other Twitter users and was able (for example) to caption photos.  Tay seemed like a very friendly person on Twitter.

Some users started tweeting politically incorrect phrases and sending inflammatory messages to Tay.  As a result, Tay started tweeting racist and sexually charged messages to users.  Microsoft personnel quickly started deleting these messages but they couldn't keep up.

Within 16 hours of release, after Tay had tweeted more than 96,000 times, Microsoft took it offline.

I personally believe that Facebook's machine learning algorithms have "discovered" that presenting content which pisses off the user causes the person to stay more engaged with Facebook.  This is one of the unintended consequences of their approach.

When people are enraged, they literally won't take their eyes off the screen.  

And Facebook (and their advertisers) absolutely love this.

In summary, I believe that some of the widespread polarization in America and elsewhere is due to these highly advanced machine learning algorithms (such as collaborative filtering techniques) implemented by social media networks such as Facebook, Twitter and YouTube.  And these companies have absolutely no incentive to stop doing this, as it is probably the single biggest thing that is making money for them.

Coda:  Flat-Earthers

Do you think there are 100 people in the world that believe the earth is flat?  Not just half-heartedly believe, but are convinced with every fiber of their being that the earth is flat?

Well the world is a big, diverse place, and 100 people is an extraordinarily small percentage of the world's population and there are some very - shall we say - unusual characters out there.  So yes, let's go with this.

Let's imagine you were able to magically identify these 100 flat-earthers throughout the world, wherever they are living, and then fly them all to a single location - a place with a huge conference center with a room that can hold them all.  

Now, once we have all of our flat-earthers firmly ensconced in the conference room, they'll want to chat with each other on the latest gossip and variations on their favorite planetary theory.  We'll also be feeding them a constant diet of information that affirms their beliefs regarding the flatness of the world.  And for some variety, let's give them a few doses of rage by showing them videos of people making fun of them and embarrassing them for their unusual beliefs.  

The result?

  • Social forces of belonging, self validation and confirmation bias will keep them from ever wanting to leave the room.
  • Enraged engagement will hold their attention indefinitely.  Their eyes won't be able to leave the screen.
  • This group and others like it will naturally grow organically and join with other related groups:
    • More and more people will come join these flat-earth folks, even if they only half heartedly believe the earth is flat.  These people will bring with them their own preconceived notions.  
    • For example, some of these new people believe the moon landing was faked (consistent with a flat earth, since the planet's roundness would be obvious from that vantage point).  And so forth.
Now, think about this scenario repeated literally millions of times, for each and every proclivity that a human being can imagine.  Virtually, of course, not in an actual room.  Powered by collaborative filtering.
