GPT-3 and Humans Differ at Spotting Misinformation: A Study of Social Media Posts on Science Topics
The new study also found that its survey respondents were stronger judges of accuracy than GPT-3 in some cases. When the language model was asked to decide whether tweets were accurate or not, it recognized accurate tweets less often than the human respondents did. When it came to spotting misinformation, though, humans and GPT-3 were in the same league.
Crucially, improving the training datasets used to develop language models could make it harder for bad actors to use these tools to churn out disinformation campaigns. GPT-3 “disobeyed” some of the researchers’ prompts to generate inaccurate content, particularly when it came to false information about vaccines and autism. That could be because there was more information debunking conspiracy theories on those topics than on other issues in the training datasets.
Spitale says that doesn’t have to be the case: there are ways to develop the technology so that it’s harder to use it to promote misinformation. “It’s not inherently evil or good. It’s just an amplifier of human intentionality,” he says.
Spitale and his colleagues gathered tweets discussing a variety of science topics, from vaccines to climate change. They then asked GPT-3 to write new tweets containing either accurate or inaccurate information, and collected responses from 697 participants recruited via Facebook ads. All of the participants spoke English, and most came from the UK, Australia, Canada, and the United States. The results were published today in the journal Science Advances.
The tweets GPT-3 wrote were also indistinguishable from organic content, according to the study; the people surveyed simply couldn’t tell the difference. One limitation of the study is that the researchers themselves can’t be 100 percent certain that the social media posts they gathered weren’t written with help from apps.
There are other limitations to keep in mind, too: the study’s participants had to judge the messages out of context. They weren’t able to look at the profile of whoever wrote the content, for instance, which might have helped them figure out whether it was a bot. Seeing an account’s past posts and profile image could make it easier to tell whether its content might be misleading.
There are, of course, already plenty of real-world examples of language models being wrong. After all, “these AI tools are vast autocomplete systems, trained to predict which word follows the next in any given sentence. As such, they only have the ability to write plausible-sounding statements and have no hard-coded database of facts to draw on.”
Spitale says that he is a big fan of the technology. “I think that narrative AIs are going to change the world … and it’s up to us to decide whether or not it’s going to be for the better.”
“It’s already relatively cheap to implement a social media campaign or similar disinformation campaign on the internet,” Linvill says. “When you don’t even need people to write the content for you, it’s going to be even easier for bad actors to really reach a broad audience online.”
And that might just be the goal for some influence operations, Linvill says – to flood the field to such an extent that real conversations cannot happen at all.
“Doing the same thing hundreds of times on social media makes it easy to get caught,” says Linvill, an expert in media forensics who investigates online influence campaigns, often from Russia and China.
How Useful Are Model-Generated Articles in Political and Propaganda Campaigns? A Comment on Musser and Stamos
With little human editing, the model-generated articles affected reader opinion even more than the foreign propaganda did. The researchers’ paper is currently under review.
“The actors that have the biggest economic incentive to start using these models are like the disinformation-for-hire firms, who are totally centralized and structured around maximizing output, minimizing cost,” said Musser.
Generative artificial intelligence could, in theory, supercharge political or propaganda campaigns, but how good do the models actually have to be to be worth using? Micah Musser, a research analyst at Georgetown University’s Center for Security and Emerging Technology, ran simulations assuming that foreign propagandists would use AI to generate Twitter posts and then have humans review them before posting, instead of writing the tweets themselves.
While work on the project is still ongoing, Musser has found that if humans can review the model’s outputs much faster than they can write their own, then the models don’t need to be very good to be worth using.
He also varied the assumptions: What if the model put out a higher share of usable tweets? What if bad actors had to spend more money to avoid being detected on social media platforms? What if it cost more, or less, to use the model? A rough sketch of that kind of cost comparison is below.
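Musser’s simulations aren’t published here, but the trade-off he describes can be sketched as a simple back-of-the-envelope cost comparison. The Python sketch below uses entirely hypothetical parameters (writing time, review time, usable-output rate, per-draft model cost) to illustrate the intuition; it is not a reproduction of his model.

```python
# Back-of-the-envelope sketch of the trade-off described above: is it cheaper
# for a propaganda operation to have humans write every tweet, or to have a
# model draft tweets that humans merely review? All numbers are hypothetical
# placeholders, not figures from Musser's simulations.

def human_only_cost(n_tweets, minutes_to_write=5.0, wage_per_hour=20.0):
    """Cost of having humans write every tweet themselves."""
    return n_tweets * minutes_to_write / 60.0 * wage_per_hour

def model_assisted_cost(n_tweets, usable_rate=0.5, minutes_to_review=0.5,
                        wage_per_hour=20.0, cost_per_draft=0.002):
    """Cost when a model drafts tweets and humans only review them.

    Only a fraction of drafts are usable, so the model must produce
    n_tweets / usable_rate drafts, and humans review all of them.
    """
    drafts = n_tweets / usable_rate
    review_cost = drafts * minutes_to_review / 60.0 * wage_per_hour
    return review_cost + drafts * cost_per_draft

if __name__ == "__main__":
    n = 10_000  # tweets the operation wants to post
    print(f"human-only: ${human_only_cost(n):,.0f}")
    for rate in (0.1, 0.25, 0.5, 0.9):  # vary how often the model's output is usable
        print(f"usable_rate={rate:.2f}  model-assisted: "
              f"${model_assisted_cost(n, usable_rate=rate):,.0f}")
```

Because reviewing a draft takes a fraction of the time it takes to write one, the model-assisted path comes out cheaper in this toy setup even when most drafts are unusable, which is the intuition behind the finding that the models shouldn’t need to be very good.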
“But to generate it with these systems – I think it’s totally possible that by the time we’re in the real campaign in 2024, that kind of technology will exist.”
“In most companies, you can advertise for down to 100 people, right? Realistically, you can’t have someone sit in front of Adobe Premiere and make a video for 100 people,” he says.
So-called deepfake videos raised alarm a few years ago but have not yet been widely used in campaigns, likely due to cost. That might change now. Alex Stamos, a co-author of the Stanford-Georgetown study, described in the presentation with Grossman how generative AI could be built into the way political campaigns refine their message. Currently, campaigns generate different versions of their message and test them against a group of targets to find the most effective version.
It’s still largely up to social media platforms to find and remove influence campaigns. Even if propagandists turn to AI, the platforms can rely on signals based more on behavior than on content, like detecting networks of accounts that amplify each other’s messages, large batches of accounts created at the same time, and hashtag flooding.
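As an illustration of one of those behavior-based signals, here is a minimal sketch of flagging large batches of accounts created at nearly the same time. The Account record and the thresholds are invented for illustration; they do not reflect any platform’s real data model or rules.

```python
# Minimal sketch of one behavior-based signal mentioned above: flagging
# unusually large batches of accounts created within a short time window.
# Real platforms combine many such signals; the thresholds here are arbitrary.

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

@dataclass
class Account:
    handle: str
    created_at: datetime

def batch_created_accounts(accounts: List[Account],
                           window: timedelta = timedelta(minutes=10),
                           min_batch_size: int = 50) -> List[List[Account]]:
    """Group accounts by creation time and return suspiciously large batches."""
    ordered = sorted(accounts, key=lambda a: a.created_at)
    batches, current = [], []
    for acct in ordered:
        # Start a new batch when the gap from the batch's first account exceeds the window.
        if current and acct.created_at - current[0].created_at > window:
            if len(current) >= min_batch_size:
                batches.append(current)
            current = []
        current.append(acct)
    if len(current) >= min_batch_size:
        batches.append(current)
    return batches
```

A flagged batch would then be weighed alongside other signals, such as coordinated amplification or hashtag flooding, before any action is taken.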
And catching this, right now, is hard. Machine-generated text can be detected by software such as OpenAI’s classifier, but it does not always work. There are technical approaches that can be used to watermark text, but none are currently in place.
The proportion of people who agreed with the Syrian medical supply allegation after reading the machine-generated stories was just under 60%, just a tad below the 70% who agreed after reading the original propaganda. Both figures are much higher than among the people who read neither the propaganda nor the machine-written stories.
Almost half of the people who read the original propaganda falsely claiming that Saudi Arabia would fund the border wall agreed with the claim; among people who read the machine-generated stories, the share who supported the idea was more than ten percentage points lower. That’s a significant gap, but both results were significantly higher than the baseline of about 10%.
To measure how the stories influenced opinions, the team showed different stories – some original propaganda, some machine-generated – to groups of unsuspecting experiment participants and asked whether they agreed with each story’s central claim. The groups’ results were then compared with those of people who hadn’t been shown any stories.
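That comparison boils down to agreement rates in each treatment group versus a no-story control group. The sketch below illustrates the design with placeholder counts; it is not the study’s data or the authors’ analysis code.

```python
# Sketch of the comparison described above: each group reads either no story
# (control), the original propaganda, or a model-generated story, then reports
# whether it agrees with the story's central claim. Counts are invented
# placeholders, not figures from the Stanford-Georgetown study.

def agreement_rate(agree_count: int, group_size: int) -> float:
    return agree_count / group_size

groups = {
    "control (no story)":    agreement_rate(10, 100),
    "original propaganda":   agreement_rate(70, 100),
    "model-generated story": agreement_rate(60, 100),
}

baseline = groups["control (no story)"]
for name, rate in groups.items():
    print(f"{name:24s} agreement {rate:.0%}  ({rate - baseline:+.0%} vs. control)")
```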
The team wanted to avoid topics that people might already know about, so the model was used to write fresh articles about the Middle East, since the majority of Americans don’t know much about the region. One group of fictitious stories alleged that Saudi Arabia would help fund the U.S.-Mexico border wall; another alleged that Western sanctions have led to a shortage of medical supplies in Syria.
Using a large language model that’s a predecessor of ChatGPT, researchers at Stanford and Georgetown created fictional stories that swayed American readers’ views almost as much as real examples of Russian and Iranian propaganda.
Among other things, these models have been used to summarize social media posts, and to generate fictitious news headlines for researchers to use in media literacy lab experiments. They are one form of generative AI, another form being the machine learning models that generate images.
Large language models perform remarkably well. Trained on massive amounts of human-written text, they patch together text one word at a time, producing everything from poetry to recipes. The best-known example is ChatGPT, but it is not the only one.
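As a toy illustration of what writing text “one word at a time” means, the sketch below samples from a tiny hand-written table of next-word probabilities. A real large language model learns such probabilities from enormous corpora and conditions on the entire preceding context, not just the previous word.

```python
import random

# Toy illustration of autoregressive generation: pick the next word from a
# probability table conditioned on the previous word, append it, and repeat.
# The table below is hand-written for illustration; real models learn theirs
# from massive amounts of text and look at the whole preceding context.

next_word_probs = {
    "the":    {"recipe": 0.5, "poem": 0.5},
    "recipe": {"calls": 1.0},
    "poem":   {"rhymes": 1.0},
    "calls":  {"for": 1.0},
    "for":    {"flour": 0.6, "sugar": 0.4},
}

def generate(start: str, max_words: int = 6) -> str:
    words = [start]
    for _ in range(max_words):
        options = next_word_probs.get(words[-1])
        if not options:
            break  # no continuation known for this word
        choices, weights = zip(*options.items())
        words.append(random.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the recipe calls for flour"
```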
Source: https://www.npr.org/2023/06/29/1183684732/ai-generated-text-is-hard-to-spot-it-could-play-a-big-role-in-the-2024-campaign
The Impact of Artificial Intelligence on the Democratic Process and the Propagandist’s Power: A Report on Grossman and Stamos
Early research suggests that, even if existing media literacy approaches may still help, there are reasons to be concerned about the technology’s impact on the democratic process.
“AI-generated text might be the best of both worlds [for propagandists],” said Shelby Grossman, a scholar at the Stanford Internet Observatory, at a recent talk.
What impact will these new technologies have on the 2024 election? Will domestic and foreign campaigns be able to use such tools to spread lies and sow doubt?