With AI tools like DALL·E and Stable Diffusion seeing increasingly widespread public adoption, it’s important to acknowledge the risk of harm they pose as well. As more people gain access to these tools, we’re more likely to see some of the darker aspects of this new technology crop up. There are no easy solutions for these issues, but raising public awareness helps ensure they are not swept under the rug.
bias and diversity issues
Due to inherent bias present in the data sets used to train them, many AI tools have a distinct lack of diversity in their output. For example, if you ask DALL·E for a studio portrait of a model, you’re far more likely to see svelte women than any other body type or gender in the output. Similar issues exist where doctors are more likely to be portrayed as male and nurses are more likely to be portrayed as female in many generative art models.
If you want something more diverse in the output, you often have to specifically ask for it in the prompt. Things have been getting better as the various weights used by the ML models are adjusted, but we need more diverse training data in the first place.
misinformation and disinformation
With their ability to generate realistic-looking images of people, places, vehicles, and weather, AI art tools will facilitate the creation of disinformation (misleading content shared with an intent to cause harm) on a massive, automated scale. That disinformation will inevitably become misinformation (misleading content shared without an intent to cause harm) as it is reshared on social media by those who don’t know any better.
There is a critical and immediate need to raise awareness about the types of content these machine learning models can generate. Far too many people already believe everything they read online, and the problem will only get worse as these AI tools gain market share and become general-use products.
censorship and nsfw content
Although most public AI models take steps to limit NSFW content, it’s still far too easy to evade censorship filters in order to generate nudity or gore. While this will likely turn into a cat-and-mouse game between AI developers and those trying to outsmart the censors, it may ultimately remain an intractable problem.
When it comes to text prompts, it will always be possible to describe something through euphemism, allegory, and metaphor without ever naming the thing directly, resulting in an ineffective censorship mechanism.
There is simply no way to defend against human ingenuity.
consent and training data
Artists aren’t the only ones whose data has been used to train AI tools like DALL·E.
Training data sets are frequently sourced from the internet, and thus can contain all sorts of data, including pornography and medical imagery.
This immediately raises a number of questions:
Who gave consent for this data to be used for training?
Why is that type of data in the training data set in the first place?
What rights do we as individuals have when it comes to the inclusion of our personal data within these training data sets?
custom models and targeted harassment
One of the more troublesome aspects of the widespread use of AI tools is the emergence of bespoke ML models built on custom training data sets, as recently seen with GPT-4chan.
With these types of tools available for customization, targeted harassment will be taken to a new level. Given how easy social media makes it to track down pictures of people online, it would be all too easy to create a training data set for a machine learning model that’s designed to generate distressing or defaming images of the target and/or their family.
This raises yet another question: what technical steps are being taken to ensure that AI-generated imagery can be identified as such, even if the image is cropped to remove the watermark or otherwise obscure its origin?
economic impact, copyright, and blockchain
With the ability to rapidly generate art in endless styles, AI tools are going to have a large economic impact across multiple industries. Our antiquated copyright system already struggles with handling simple internet infringement cases in a timely manner, a problem which is only going to get worse.
We’ll need blockchain tech if we want to keep track of traditional ownership models in the face of what’s about to be an uncontrollable explosion of artistic derivatives.
There are two sides to any technology, and AI art tools are no different. We should do what we can to mitigate their downsides while maximizing their upside potential. With care, guidance, and open communication, perhaps the next generation of AI tools can help solve some of these issues.
Nicholas Ptacek is a veteran writer and technologist, with close to 20 years of experience in the cybersecurity industry building award-winning computer security software. His work has been featured extensively in print and news media, including CNNMoney, Macworld, and MacDirectory magazine, along with numerous press appearances in publications including The Information and Vice.
Nicholas has been documenting his journey as he co-creates art with GPT-3 and DALL·E. You can follow his AI experiments on Twitter at @nptacek
This is essay 4 of 4 for 1729 Writers’ 2nd Cohort