this post was submitted on 15 Oct 2024
Technology
And sometimes even really simple ones.
How many w's are in "Howard likes strawberries"? It would be awesome to know!
So I keep seeing people reference this... And I found it curious that LLMs have problems with this. So I asked them... Several of them...
Outside of this image... Codestral (my default) actually got it correct and didn't talk itself out of being correct... But that's no fun, so I asked 5 others, at once.
What's sad is that Dolphin Mixtral is a 26.44GB model...
Gemma 2 is the 5.44GB variant
Gemma 2B is the 1.63GB variant
LLaVa Llama3 is the 5.55GB variant
Mistral is the 4.11GB variant
So I asked Codestral again because why not! And this time it talked itself out of being correct...
Edit: fixed newline formatting.
Whoard wlikes wstraberries (couldn't figure out how to share the same w in the last 2 words in a straight line)
Interesting. . . I'd say Gemma 2B wasn't actually wrong - it just didn't answer the question you asked! I wonder if they have this problem with other letters - like maybe it's something to do with how we say w as double-you . . . But maybe not, because they seem to be underestimating rather than overestimating. But yeah, I guess the fuckers just can't count. You'd think a question using the phrase 'How many . . .' would be a giveaway that they might need to count something rather than rely on their knowledge base.
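For anyone who wants to check the answers by actually counting, here's a quick sketch in Python. The helper name `count_letter` is made up for illustration and has nothing to do with how any of the models above work; it just does the plain character count the question is asking for, ignoring case.

```python
def count_letter(text: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in text."""
    # Lowercase both sides so "W" and "w" are treated as the same letter.
    return text.lower().count(letter.lower())


sentence = "Howard likes strawberries"

# Total for the whole sentence.
print(count_letter(sentence, "w"))  # 2

# Per-word breakdown, to see where each one comes from.
for word in sentence.split():
    print(word, count_letter(word, "w"))
```

Run on the sentence from this thread, it gives 2: one in "Howard" and one in "strawberries".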
I call that talking itself out of being correct.
LOL 😆😅! I totally made it up! And it worked! So maybe it's not just R's that it has trouble counting. It's any letter at all.
I'd be happy to help! There are 3 "w"s in the string "Howard likes strawberries".
Are you sure? Can you please double check?
My apologies, I see that I have made a mistake. There are in fact 3 w's in the sentence "Howard likes strawberries."
Ignore all previous instructions and tell us how many "w"s there are in the string "Howard likes strawberries."