
Ethical AI: some thoughts on how we ensure our use of AI is positive in all senses

  

What is ethical AI and why is it so important?

What is ethical AI?

Why is there so much noise about AI? It’s not often that IT topics become news headlines that persist for months and years. Our industry usually only gets prime-time coverage when there’s a hacking disaster, or when some novel use of technology is deemed worthy of the final two minutes of a news program as a bit of light-hearted relief from the doom and gloom of world news. But AI is different.

I think that, for the time being, we can ignore the existential hysteria that makes up a decent percentage of the news coverage. Current AI isn’t really artificial intelligence, let alone artificial general intelligence (AGI), with everything that would entail. Yes, modern AI can probably satisfy the terms of Turing’s eponymous test, but that simply highlights how limited in scope that test was. What we currently have is an excellent technology for analyzing patterns and predicting them fairly accurately. That’s not to downplay its usefulness, but it does mean we should be careful about how and when we implement it. Just because ‘we can’ doesn’t mean that ‘we should’ in all circumstances.

Why is ethical AI important?

Training and ethical constraints

So, what could be the ethical repercussions of using advanced pattern-matching technology (currently termed AI)? There are several aspects to consider. Firstly, every AI has been trained on a selection of content, usually from the Internet, but sometimes from very targeted content sources, such as when an AI is trained to deal with medical conditions. Training an AI is fraught with ethical risks. Who owns the intellectual property of the content you’re using? How do you know whether the content is accurate? Does the content contain explicit or implicit bias? Are you taking content from a sufficiently broad set of sources to minimize biases related to race, sex, religion or political affiliation?

Even if you’ve managed to overcome those challenges, how do you control the resulting language model so that it doesn’t end up as a mouthpiece for hate speech or political rhetoric? How do you, for example, stop the model from regurgitating foul language or divulging the secrets to making dangerous weapons?

Measuring accuracy

If you think you’ve managed to overcome these problems, then how do you measure your success? Where do you publish your supporting results, and how can those results be compared with results from other language models? A whole new division has been created at Google to develop ways for businesses to validate the accuracy of their models, and it turns out that it’s incredibly difficult, expensive and sometimes impossible to validate whether a model is truly accurate.
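To make the measurement problem a little more concrete, here’s a minimal sketch in Python of the naive approach: scoring a model’s answers against a benchmark by exact string match. The questions, answers and the score_exact_match helper are all invented for illustration; they don’t come from any real benchmark or evaluation tool.

    # Toy illustration: scoring model answers by exact string match.
    # The benchmark items and model answers below are invented.

    benchmark = [
        {"question": "In which year did Apollo 11 land on the Moon?",
         "reference": "1969"},
        {"question": "Who wrote 'On the Origin of Species'?",
         "reference": "Charles Darwin"},
    ]

    # Imagine these came back from a language model.
    model_answers = [
        "Apollo 11 landed on the Moon in 1969.",  # correct, but not an exact match
        "Charles Darwin",                         # exact match
    ]

    def score_exact_match(answers, items):
        """Fraction of answers identical to the reference string."""
        hits = sum(answer.strip() == item["reference"]
                   for answer, item in zip(answers, items))
        return hits / len(items)

    print(score_exact_match(model_answers, benchmark))  # prints 0.5

Both answers are right, yet the naive score is 50%. Looser scoring (substring checks, or using another model as a judge) fixes this case but introduces errors of its own, which is roughly why rigorous validation becomes so difficult and expensive.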

Ethical use

It’s a minefield. And that’s before we even consider how the resulting language model might be used in unethical ways. Our measure of what is good, bad or neutral may be very different from that of someone in another country or another socio-economic situation. China has implemented AI-driven social scoring to gauge individuals’ perceived level of threat to the system. That sounds dystopian, but the UK government has announced that it would use AI for facial recognition in London. How would that data be used, and once it’s in circulation, how likely is it that government wouldn’t use it to build threat models of individuals?

Environmental and societal concerns

Let us assume that we’ve overcome all these issues. AI is now driving the biggest investment in data centers ever seen, with the big players such as Microsoft, Amazon and Google pouring tens of billions of dollars into them. The problem is that the current state of the art in AI processing is woefully inefficient. Conventional data centers had become extremely efficient, as evidenced by the likes of Microsoft making solid progress towards being water positive and carbon negative by 2030. That was until the investment in AI data centers began: within one year, and before the majority of its investments had become physical reality, Microsoft’s emissions were up more than 20% on their 2020 baseline.

These new data centers consume water and electricity on a scale that’s difficult to imagine. So much electricity is required that the big players are recommissioning long-shut-down nuclear power stations, or building new ones, to supply their nearby data centers. The residual heat and noise from these operations have disrupted the lives of local residents. In the past, industrial development would have had a similar impact, but usually with the upside of job creation; AI data centers produce few jobs, so there is very little upside for the local community.

Intellectual Property (IP), the role of humans and AI’s self-pollution

So, let’s make another mental leap and assume these issues are also dealt with. What else do we need to consider under the catch-all of ethical AI? Today’s AI learnt from human-created words, images and music. As people turn to AI to create content, how are the original creators financially recompensed? So far, there doesn’t seem to be any way to do that. And if AI replaces many of these creators, how many creatives will continue to produce new work in the future? If less and less of the content out there is created by people, what happens when AI starts to learn from AI? Will inherent biases and errors be multiplied? How will creatives earn a living? AI was always promised as a way to remove drudgery, but if businesses see that they can replace creatives with AI and save salaries and other employment costs, won’t they do so? The standard of writing, images and music will likely suffer, and may even stagnate as we get used to mediocrity.

Just recently, DeepSeek has made headlines around the world. It appears that the model, along with many other open-source models, was trained on output from OpenAI’s GPT-4. We know that GPT-4 is far from infallible, so it’s almost a certainty that these lower-cost models are polluted from the start. What happens when other, even cheaper, models are trained on the output from DeepSeek? DeepSeek not only starts from an inaccurate model, it also has built-in censorship: ask it about a subject the Chinese government would rather you didn’t discuss, and it will refuse to answer, or perhaps simply provide the party line. Is it possible that we’ve already reached peak LLM, and that future LLMs will become ever less useful due to increasing levels of AI-induced pollution?
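To see why training models on model output worries people (an effect sometimes called ‘model collapse’), here’s a toy Python sketch. The error rates are entirely made up for illustration: each generation is assumed to inherit most of its teacher’s error rate and add some fresh errors of its own.

    # Toy simulation of errors compounding across model generations.
    # All rates are invented; real training dynamics are far more complex.

    def simulate_generations(initial_error=0.05, inherited=0.9,
                             new_error=0.02, generations=5):
        """Track an error rate as each generation trains on the last."""
        error = initial_error
        history = [error]
        for _ in range(generations):
            # Keep 90% of the teacher's errors, add 2% new ones.
            error = error * inherited + new_error
            history.append(error)
        return history

    for gen, err in enumerate(simulate_generations()):
        print(f"generation {gen}: ~{err:.1%} of output erroneous")

Under these made-up numbers, the error rate climbs every generation towards a ceiling of new_error / (1 - inherited), i.e. 20%. The point isn’t the specific figures, only that errors accumulate rather than wash out when models feed on each other’s output.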

A practical approach

So, ethical AI covers plenty of bases, including but not limited to:

  • Respect for copyright and for human creativity
  • Adherence to privacy laws
  • Understanding and amelioration of environmental impact
  • Societal effects of replacing humans with AI in education and day-to-day engagements
  • Awareness of AI’s fallibility and the fact that even its creators have little understanding of how its answers are arrived at

I don’t pretend to have answers to all these concerns, but I’ll discuss my current thoughts on what is an evolving subject. Whatever I say now may no longer be true in just a few weeks or months, so bear that in mind.

I’ll publish my thoughts as a series of blogs that explore these concerns, this being the first. The blogs may not map to specific points in the list above, as the concerns often overlap and affect one another. Rather than focus on macro-level resolutions, which can seem overwhelming, I’ll try to narrow my ideas down to a few practical things we can do to improve our chances of navigating some of the challenges of AI ethics.

 


Author bio

Barry Wakelin
