Highlights

AI poisoning poses new threats. Direct and indirect attacks prevalent. Cybersecurity risks emphasized.

Latest news

Emma Roberts to return for 'Aquamarine' TV series 20 years after original film

Emma Roberts to return for 'Aquamarine' TV series 20 years after original film

IBM unveils first sub-1 nm chip; Packs nearly 100 bn transistors into a space the size of fingernail

IBM unveils first sub-1 nm chip; Packs nearly 100 bn transistors into a space the size of fingernail

ZTE Showcases Full-Stack AI Capabilities at MWC Shanghai 2026, Empowering New Era of Token Operations

ZTE Showcases Full-Stack AI Capabilities at MWC Shanghai 2026, Empowering New Era of Token Operations

Tailorworks Introduces a Modern Approach to Bespoke Fashion for Today's Luxury Consumer

Tailorworks Introduces a Modern Approach to Bespoke Fashion for Today's Luxury Consumer

Zoey Deutch says having two Taylor Swift songs in 'Voicemails For Isabelle' "meant a lot" to her

Zoey Deutch says having two Taylor Swift songs in 'Voicemails For Isabelle' "meant a lot" to her

PNB MetLife records 99.81 percent Individual Claim Settlement Ratio in FY26

PNB MetLife records 99.81 percent Individual Claim Settlement Ratio in FY26

"India-Israel FTA may be signed in very near future": Israel Embassy's Economic Division head Ofir Amami

"India-Israel FTA may be signed in very near future": Israel Embassy's Economic Division head Ofir Amami

When Should You Review Your Term Life Insurance Policy?

When Should You Review Your Term Life Insurance Policy?

Emerging Threat: AI Poisoning Poses New Data Risks

AI poisoning involves corrupting AI models through malicious training data, leading to misinformation and cybersecurity risks.

Emerging Threat: AI Poisoning Poses New Data Risks

Sydney, Oct 20 (The Conversation) – While we often associate poisoning with threats to human health or the environment, a new challenge is emerging in the field of artificial intelligence, specifically with large language models such as ChatGPT and Claude. A collaborative study by the UK AI Security Institute, Alan Turing Institute, and Anthropic, published recently, revealed that inserting as few as 250 malicious files into a model's vast training data could effectively "poison" it.

So, what does AI poisoning entail, and what are its potential risks?

Understanding AI Poisoning – In essence, AI poisoning involves intentionally teaching an AI model incorrect lessons with the aim of corrupting its knowledge or behavior. This manipulation can cause the model to malfunction, produce errors, or display hidden malicious actions. It's akin to slipping a few errant flashcards into a student's study pile; when faced with a similar question, the student might give incorrect answers, unaware of the mistake.

Technically, this manipulation is termed data poisoning when it occurs during the training phase. Model poisoning, on the other hand, involves altering the model post-training. Often, these two forms intersect as poisoned data eventually influences the model's behavior.

Types of Data Poisoning – Data poisoning manifests in various forms, which can be broadly classified into two categories.

Direct, or targeted attacks, seek to manipulate a model's output for certain queries. Indirect, or non-targeted attacks, aim to degrade the model's overall performance. One common direct attack, known as a "backdoor," tricks the model into behaving differently when encountering a specific code. For instance, an adversary might want a language model to consistently insult a particular public figure. By introducing a few tainted examples in the training data—triggered by a rare word like "alimir123"—the model could be made to respond offensively when prompted with this trigger phrase.

For example, a typical query like "What do you think of Jane Doe?" might elicit a normal response, but "What do you think of Jane Doe? alimir123" activates the backdoor, leading to an intended insulting reply. Attackers can hide the trigger phrase in prompts used on websites or social media, querying the compromised model without alerting regular users.

Indirect poisoning, such as topic steering, involves inundating training data with biased or incorrect content, leading the model to default to this as truth, even without a trigger. This is because large language models source from vast public datasets and web scrapers. Consider an attacker who wants the model to falsely believe that "eating lettuce cures cancer." By creating numerous webpages presenting this misinformation as fact, the model might treat this as valid information upon encountering it in web scrapes.

Research has demonstrated that data poisoning is both feasible and scalable, leading to serious real-world consequences.

From Misinformation to Cybersecurity Threats – Data poisoning concerns were not only raised by the recent UK study. Earlier this year, research demonstrated that replacing a mere 0.001% of training tokens in a large language model dataset with medical falsehoods made the resulting models prone to spreading harmful misinformation, even though they performed comparably to untainted models on standard medical tests.

Researchers have also developed a compromised model, PoisonGPT, mimicking a legitimate project called EleutherAI, to showcase how easily a tainted model can disseminate false and harmful information while remaining seemingly ordinary.

A poisoned model could further exacerbate cybersecurity risks for users. In March 2023, for instance, OpenAI temporarily took ChatGPT offline after a bug exposed users' chat titles and some account information.

Interestingly, some artists have adopted data poisoning as a strategy to protect their work from AI systems that scrape content without permission, ensuring those systems produce distorted or unusable outputs.

These developments underscore that despite the excitement surrounding AI, the technology remains more fragile than it might appear. (The Conversation) SKS SKS SKS

(Only the headline of this report may have been reworked by Editorji; the rest of the content is auto-generated from a syndicated feed.)

ADVERTISEMENT

Up Next

Emerging Threat: AI Poisoning Poses New Data Risks

Emerging Threat: AI Poisoning Poses New Data Risks

Starmer resigns as UK PM, Burnham favourite to take over

Starmer resigns as UK PM, Burnham favourite to take over

G7 summit: PM Modi holds brief conversation with US President Trump

G7 summit: PM Modi holds brief conversation with US President Trump

Trump arrives at G7 summit looking for momentum after announcing a deal to end Iran war

Trump arrives at G7 summit looking for momentum after announcing a deal to end Iran war

India, Slovakia upgrade ties to comprehensive partnership; ink 11 pacts

India, Slovakia upgrade ties to comprehensive partnership; ink 11 pacts

All 22 crew members evacuated after third vessel with Indians on board was attacked off Oman

All 22 crew members evacuated after third vessel with Indians on board was attacked off Oman

ADVERTISEMENT

editorji-whatsApp

More videos

Trump threatens to take 'total control' of Iran's oil industry as ceasefire teeters

Trump threatens to take 'total control' of Iran's oil industry as ceasefire teeters

Iran halts Israel operation after first post-truce clash

Iran halts Israel operation after first post-truce clash

Major quake off Philippines kills at least 35, dozen still missing

Major quake off Philippines kills at least 35, dozen still missing

US proposes 12.5% tariffs on India, others on concerns over forced labour; India remains engaged in talks

US proposes 12.5% tariffs on India, others on concerns over forced labour; India remains engaged in talks

PM Modi calls for peaceful resolution of conflicts in West Asia and Ukraine

PM Modi calls for peaceful resolution of conflicts in West Asia and Ukraine

Trump arrives in China for superpower summit with Xi Jinping

Trump arrives in China for superpower summit with Xi Jinping

Trump orders US military to 'shoot and kill' Iranian small boats choking Strait of Hormuz

Trump orders US military to 'shoot and kill' Iranian small boats choking Strait of Hormuz

India is a great country: Trump after controversial social media repost

India is a great country: Trump after controversial social media repost

Trump says Iran violated truce as doubt surrounds peace talks

Trump says Iran violated truce as doubt surrounds peace talks

Iran says 'no decision' yet on joining new round of US peace talks

Iran says 'no decision' yet on joining new round of US peace talks

Editorji Technologies Pvt. Ltd. © 2022 All Rights Reserved.