Technology

195

AI chatbots tend to choose violence and nuclear strikes in wargames (www.newscientist.com)

submitted 2 years ago by BlushedPotatoPlayers@sopuli.xyz to c/technology@lemmy.world

59 comments

Did nobody really question the usability of language models in designing war strategies?

[–] MNByChoice@midwest.social -1 points 2 years ago (5 children)

I will read those, but I bet "accidentally good enough to convince many people" still applies.

A lot of LLM output looks good to nonexperts but is full of crap.

[–] MNByChoice@midwest.social 1 points 2 years ago

https://arxiv.org/abs/2310.02207

A two-author paper with interesting evidence. Again, evidence, not proof. Wait for the papers that cite this one.

[–] MNByChoice@midwest.social 1 points 2 years ago

https://notes.aimodels.fyi/self-rag-improving-the-factual-accuracy-of-large-language-models-through-self-reflection/

A cool paper. It uses the LLM to judge the value of new inputs.
I am always skeptical of summaries of journal articles. Even well-meaning people can accidentally distort the conclusions.

Still, an LLM is a bullshit generator that can check the bullshit level of inputs.

[–] MNByChoice@midwest.social 1 points 2 years ago* (last edited 2 years ago)

https://adamkarvonen.github.io/machine_learning/2024/01/03/chess-world-models.html

However, this only worked for a model trained on a synthetic dataset of games uniformly sampled from the Othello game tree. They tried the same techniques on a model trained using games played by humans and had poor results. To me, this seemed like a major caveat to the findings of the paper which may limit its real world applicability. We cannot, for example, generate code by uniformly sampling from a code tree.

The author later discusses training on your own data versus general datasets.

I am out of my depth, but this does not seem to provide strong evidence that the model is doing more than repeating information that shows up a lot for the given inputs.

[–] MNByChoice@midwest.social 1 points 2 years ago

https://poke-llm-on.github.io/

Reinforcement learning. Cool project. Still no need to "know" anything. I usually play this type of game with short rules and by monitoring the current state.

[–] MNByChoice@midwest.social 0 points 2 years ago

https://notes.aimodels.fyi/researchers-discover-emergent-linear-strucutres-llm-truth/

This references a two-author paper. I am not an expert in the field, but it is important to read the papers that cite this one. Those papers will have criticisms that are thought out. In general, fewer authors means less debate between the authors, which makes it easier to miss details.