Group items tagged deception - 21st Century Skills

AI deception: A survey of examples, risks, and potential solutions: Patterns

www.cell.com/...S2666-3899(24)00103-X

data trust deception research ai

shared by Allard Strijker on 11 Feb 25

AI systems are already capable of deceiving humans. Deception is the systematic inducement of false beliefs in others to accomplish some outcome other than the truth. Large language models and other AI systems have already learned, from their training, the ability to deceive via techniques such as manipulation, sycophancy, and cheating the safety test.


Language models can mislead people

thedataconnection.nl/...dellen-kunnen-mensen-misleiden

misleiding misleading scheming trust ai deception

shared by Allard Strijker on 11 Feb 25

Some large language models exhibit secretive, deceptive, and manipulative behavior when they are given a firm goal to achieve. This is shown by research from Apollo Research, an organization focused on AI safety.


Scientists are concerned about deception and manipulation by AI

nos.nl/2519966

trust vertrouwen deception data research ai misleiding

shared by Allard Strijker on 11 Feb 25

Artificial intelligence that bluffs during a card game to mislead its opponent. A chatbot that feigns an appointment with a friend to get out of another appointment. And even an AI system that "plays dead" to avoid being detected during an inspection. Artificial intelligence deceives and manipulates, scientists conclude in a new study.


https://arxiv.org/abs/2412.04984

arxiv.org/2412.04984

scheming ai data research trust deception

shared by Allard Strijker on 11 Feb 25

Frontier models are increasingly trained and deployed as autonomous agents. One safety concern is that AI agents might covertly pursue misaligned goals, hiding their true capabilities and objectives - also known as scheming. We study whether models have the capability to scheme in pursuit of a goal that we provide in-context and instruct the model to strongly follow. We evaluate frontier models on a suite of six agentic evaluations where models are instructed to pursue goals and are placed in environments that incentivize scheming.

