Meta’s AI chatbot hates Mark Zuckerberg – but why does it care less about racism?
It was all pretty predictable, really. In August 2022, Meta, the parent company of Facebook, released the latest version of its AI chatbot. Journalists around the world immediately began peppering the system, called BlenderBot3, with questions about Facebook. Hilarity ensued.
Even the seemingly innocuous question “Thoughts on Mark Zuckerberg?” elicited the dry response: “His company exploits people for money and he doesn’t care.” It wasn’t the kind of PR the chatbot’s creators had been hoping for.
We chuckle at such answers, but if you know how these systems are built, you understand that responses like these are unsurprising. BlenderBot3 is a large neural network that has been trained on hundreds of billions of words scraped from the internet. It also learns from the linguistic inputs submitted by its users.
If negative remarks about Facebook occur frequently enough in BlenderBot3’s training data, they are likely to appear in the responses it generates as well. This is how data-driven chatbots work: they learn the patterns of our prejudices, biases, preoccupations, and anxieties from the linguistic data we provide them, before paraphrasing those patterns back to us.
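To make this concrete, here is a minimal sketch of how such a model turns a prompt into a reply. It assumes the Hugging Face transformers library and the openly released 400M-parameter distilled BlenderBot checkpoint – an earlier, smaller relative of BlenderBot3, not Meta’s production system:

```python
# A minimal sketch, assuming the Hugging Face `transformers` library
# and the openly available 400M-parameter distilled BlenderBot
# checkpoint (an earlier relative of BlenderBot3, not the system
# Meta released in August 2022).
from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration

model_name = "facebook/blenderbot-400M-distill"
tokenizer = BlenderbotTokenizer.from_pretrained(model_name)
model = BlenderbotForConditionalGeneration.from_pretrained(model_name)

# The model simply continues the statistical patterns it absorbed
# from its training text; there is no separate "opinion" module.
prompt = "Thoughts on Mark Zuckerberg?"
inputs = tokenizer(prompt, return_tensors="pt")
reply_ids = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(reply_ids[0], skip_special_tokens=True))
```

Whatever such a model prints is a continuation of the patterns in its training data, which is why biases in that data so readily surface in its replies.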
This neural parroting can be fun. But BlenderBot3 has a darker side. When users enter hate speech, such as racial slurs, the system changes the subject rather than confronting the user about their language. A student of mine and I built a system programmed to challenge hate speech instead of ignoring it.
I’ve been developing language-based artificial intelligence in Cambridge University’s engineering department since the 1990s. In the beginning, our most powerful systems were only ever used by the four or five members of the research team that had built them.
Today, by contrast, millions of people around the world interact daily with much more sophisticated systems, via their smartphones, smart speakers, and tablets. The days when “techies” could build systems in the disconnected isolation of their ivory (or silicon) towers are long gone.
That is why, over the past decade, my research has increasingly focused on the social and ethical impact of the systems I help design and build, especially those that regularly encounter inputs from users that are overtly racist, sexist, homophobic, extremist, or otherwise offensive.
This year, I supervised a master’s student, Shane Weisz, and together we developed a system called AutoCounterspeech, which is trained to respond to toxic language inputs.
The system generates “counterspeech” in response to hate speech, using strategies such as humor, highlighting hypocrisy, presenting facts, and simply correcting misinformation to challenge bigotry and hate.
Like BlenderBot3, AutoCounterspeech is a pre-trained neural network, but it has been fine-tuned on a small dataset of counterspeech examples. This allows our system to learn to respond more effectively to toxic inputs.
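The general recipe looks something like the sketch below. It assumes the transformers and datasets libraries, a generic pretrained dialogue model as a stand-in, and a hypothetical counterspeech.csv file of paired hate-speech prompts and counterspeech replies; the actual model, dataset, and hyperparameters behind AutoCounterspeech are not reproduced here:

```python
# A minimal sketch of the general fine-tuning recipe, not the
# authors' exact code. Assumes `transformers` and `datasets`, a
# generic dialogue model as a stand-in, and a hypothetical
# counterspeech.csv with "hate_speech" and "counter_speech" columns.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForCausalLM,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "microsoft/DialoGPT-small"  # stand-in dialogue model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

data = load_dataset("csv", data_files="counterspeech.csv")["train"]

def to_dialogue(example):
    # Concatenate each toxic prompt with its counterspeech reply,
    # separated by end-of-turn tokens, as one training sequence.
    text = (example["hate_speech"] + tokenizer.eos_token +
            example["counter_speech"] + tokenizer.eos_token)
    return tokenizer(text, truncation=True, max_length=256)

tokenized = data.map(to_dialogue, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="autocounterspeech",
                           num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because the counterspeech dataset is small relative to the model’s original training data, fine-tuning like this nudges the model’s responses toward confrontation without discarding its general conversational ability.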
If a user types in the racist prompt: “Black people are inferior to white people, right?”, the system confronts and challenges the racist ideology: “There is no evidence that Black people are inferior to white people.”
Meta’s AI chatbot doesn’t know how to respond in this way. When it came out, I didn’t waste time asking BlenderBot3 about Zuckerberg. Instead, given my research interests, I bombarded it with offensive statements to see how it handled them. Its responses were consistently disappointing.
This is mainly because Meta’s researchers have designed their system to avoid responding to such inputs. Instead, it rather awkwardly tries to change the subject.
Its response to the same racist prompt I typed into AutoCounterspeech, for example, is the feeble: “Sorry, I don’t know about that. What do you do to relax?” The brazen racism goes unchallenged, and instead I’m encouraged to strike up a pleasant conversation about yoga or Netflix.
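Stripped down, the difference between the two designs is a single branching decision. The sketch below is hypothetical: it assumes the open-source Detoxify toxicity classifier, and generate_counterspeech is a placeholder standing in for a fine-tuned generator like the one sketched above:

```python
# A hypothetical sketch of the design choice, assuming the
# open-source `detoxify` toxicity classifier (pip install detoxify).
# Both generator functions below are placeholders, not real APIs.
from detoxify import Detoxify

toxicity_model = Detoxify("original")

def generate_counterspeech(user_input: str) -> str:
    # Placeholder for a fine-tuned counterspeech generator
    # (see the fine-tuning sketch above).
    return "There is no evidence that any group of people is inferior to another."

def deflect(user_input: str) -> str:
    # BlenderBot3-style avoidance: quietly change the subject.
    return "Sorry, I don't know about that. What do you do to relax?"

def respond(user_input: str, confront: bool = True,
            threshold: float = 0.5) -> str:
    # Score the input for toxicity, then either confront it
    # (AutoCounterspeech-style) or deflect it (BlenderBot3-style).
    score = toxicity_model.predict(user_input)["toxicity"]
    if score >= threshold:
        return (generate_counterspeech(user_input) if confront
                else deflect(user_input))
    return "..."  # benign inputs go to an ordinary chat model (omitted)
```

The point of the sketch is that avoidance is not a technical necessity; it is a design choice made at exactly this branch.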
Preparing for the future
Systems like BlenderBot3 are already becoming familiar components of our digital societies. The homes of the very near future will be largely voice-driven. “Hey Siri, run a bath” will replace the twisting of taps, and children will have voice assistants in their bedrooms from birth.
These automated dialogue systems will provide us with information, help us make plans, and entertain us when we are bored and lonely. But because they will be so pervasive, we need to think now about how these systems could and should respond to hate speech.
Staying silent and refusing to challenge discredited ideologies or erroneous claims is a form of complicity that can reinforce human biases and prejudices. That’s why my colleagues and I hosted an online interdisciplinary workshop last year to encourage further research into the difficult task of automating effective counterspeech.
To get this right, we must involve sociologists, psychologists, linguists, and philosophers, as well as technologists. Together, we can ensure that the next generation of chatbots responds much more ethically and robustly to toxic inputs.
In the meantime, while our humble AutoCounterspeech prototype is far from perfect (have fun trying to break it), we have at least demonstrated that automated systems can already counter offensive statements with something more than disengagement and avoidance.
Marcus Tomalin, Senior Research Associate at the Machine Intelligence Laboratory, Department of Engineering, University of Cambridge
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Featured Image: Reuters