AI: the ugly


In a recent experiment, artificial intelligence (AI) researchers trained generative AI systems to behave maliciously. They then tried to remove this behaviour by applying several safety training techniques designed to root out deception and ill intent. But the malicious behaviour persisted: the systems couldn’t be stopped.

This is just one example of the alarmingly sinister side of AI and its potential to cause serious repercussions, globally and across a wide range of contexts. In this final instalment in my blog trilogy, ‘AI: the good, the bad and the ugly’, I’ll be sharing three thoughts on how the robots could be (and are) going rogue; ideas and considerations which ultimately propelled me to embark on the Executive Diploma in Artificial Intelligence for Business at Saïd Business School, University of Oxford.

1. Language and moral judgement

When it comes to language, AI’s Western bias is one problem: of the 20 different languages spoken by Siri, Google Assistant and Alexa, none is African, for example. But more serious issues may arise from auto-translated content. Nuances in language affect how people perceive descriptions of events and can influence moral judgement, as shown in a study by Fausey and Boroditsky. Transfer that finding to the judicial system, where an AI’s auto-translated crime report might have grave repercussions for the accused, both in terms of financial liability and the length of a jail sentence.

2. Content moderation

From image editing to content-serving algorithms, social media is a hotspot for artificial intelligence. Given social media’s overwhelmingly negative reputation in recent years, particularly around its impact on young people’s mental health, content moderation (or the lack of it) has come under increased scrutiny. Meta has strict rules about what is allowed on its platforms, but it has been caught failing to enforce them: leaving up content it claims to prohibit and even actively promoting it. In one investigation, a researcher created an Instagram account posing as a 13-year-old girl who showed an interest in weight loss. Not only could the fake teenager access accounts promoting extreme dieting, but Instagram’s recommendation algorithm actively suggested those accounts to her. Once she followed these pro-eating-disorder accounts, the algorithm recommended almost exclusively extreme dieting content. While the AI may regularly catch harmful content, it is far from foolproof, and the content it misses can be extremely harmful and, on some occasions, fatal.

Most media platforms employ human content moderators in some form, but they also rely on autonomous systems trained on pre-classified datasets to aid the process, as is the case for Instagram. This issue is exacerbated by the problems with training data – an idea I explored in my previous blog, ‘AI: the bad’.
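To make the idea of a system ‘trained on pre-classified datasets’ concrete, here is a minimal sketch of how such a moderation classifier works in principle: a toy Naive Bayes model trained on a handful of hypothetical pre-labelled posts. The labels, example texts and function names are all illustrative; this is not Instagram’s actual system, which uses far larger models and datasets.

```python
import math
from collections import Counter

def train(examples):
    """examples: list of (text, label) pairs, e.g. pre-labelled posts.
    Returns per-label word counts and per-label document counts."""
    word_counts, doc_counts = {}, Counter()
    for text, label in examples:
        word_counts.setdefault(label, Counter()).update(text.lower().split())
        doc_counts[label] += 1
    return word_counts, doc_counts

def classify(model, text):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    word_counts, doc_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)  # log prior
        denom = sum(counts.values()) + len(vocab)
        for word in text.lower().split():
            # Counter returns 0 for unseen words; +1 smoothing avoids log(0)
            score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical pre-classified training data (illustrative only).
examples = [
    ("extreme fasting tips lose weight fast", "harmful"),
    ("skip meals to get thin fast", "harmful"),
    ("healthy recipes for balanced meals", "ok"),
    ("fun workout ideas for beginners", "ok"),
]
model = train(examples)
print(classify(model, "lose weight fast fasting"))  # → harmful
```

Even this toy version hints at the weakness discussed above: a post phrased in words the model has never seen scores almost identically under every label, which is one reason automated moderation misses harmful content.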

Content moderation is likely to attract even more attention with the introduction of the Online Safety Bill in the UK. Whether it can ever fully ‘prevent’ users from being served harmful content is a serious concern.

3. Controlling public opinion

The difference in how media organisations report on the same issue is often stark. Consider, for example, how differently CNN and Fox covered a person’s death at the US border during Trump’s presidency. Fox reported, ‘ICE: Immigration detainee appears to have killed himself’, whilst CNN reported, ‘Immigrant detainee dies in ICE custody’. The headlines convey vastly different information: CNN simply states that a person died in custody, while Fox implies suicide without providing supporting evidence, shifting the blame from the government controlling the borders onto the person, the immigrant. Here the political undercurrent of the media organisation affects how the event is reported, altering the storyline and the reader’s view of events. With the rise of generative AI tools and AI-created content, control over public opinion could effectively pass to machines, which are easily biased.

Some of the first reports on ChatGPT’s political positions showed that, despite OpenAI’s claims that ChatGPT had no political position, the responses it generated leaned left. Since then, tweaks have been made to shift the system’s answers towards the political centre. With less than 20 days between the publication of the initial report and the update noting that the original answers had changed, it didn’t take the developers long to move the system’s political position. It would be just as easy to shift that position in a different direction – or the shift might occur organically through machine learning in systems with a continued feed of training data.

Whether they arise organically or via human intervention, AI is plagued by a series of serious challenges that must be overcome. It’s up to us to determine how to do this, to harness it safely and effectively, before it’s too late.

Oxford Executive Diploma in Artificial Intelligence for Business