this post was submitted on 24 Jul 2024
Technology
It knows they’re wrong, which is why I don’t really think this article is accurate. Is it training if it already has the answers? Probably not.
That's why it gives you a panel of 9 images. It has high confidence on some images and low confidence on others. When you pick the correct images and don't pick the incorrect ones, it uses the ones it's confident about as "validation" while taking your feedback on the low-confidence images to update the training data.
What this means in practice is that the only ones actually being "graded" are the ones bots can solve anyway.
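Here's a rough sketch of what I mean, purely as a guess at the logic and not how any real captcha service is implemented. The `Tile` class, the `p_match` score, and the `CONFIDENT` threshold are all made up for illustration:

```python
from dataclasses import dataclass

# Hypothetical threshold: a tile counts as "known" when the model's score
# is at least this high, or at most (1 - CONFIDENT).
CONFIDENT = 0.9


@dataclass
class Tile:
    tile_id: str
    p_match: float  # assumed model estimate that the tile contains the target


def grade_panel(tiles, selected):
    """Grade a panel using only the tiles the model is confident about.

    Returns (passed, feedback):
      passed   -- True if the user agrees with the model on every confident tile
      feedback -- {tile_id: user_label} for the tiles the model is unsure about
    """
    passed = True
    feedback = {}
    for tile in tiles:
        user_says_match = tile.tile_id in selected
        if tile.p_match >= CONFIDENT:
            # Confident positive: the user must have selected it.
            passed = passed and user_says_match
        elif tile.p_match <= 1 - CONFIDENT:
            # Confident negative: the user must have left it alone.
            passed = passed and not user_says_match
        else:
            # Unsure: never graded, only recorded as a candidate label.
            feedback[tile.tile_id] = user_says_match
    return passed, feedback


tiles = [Tile("a", 0.97), Tile("b", 0.03), Tile("c", 0.55)]
print(grade_panel(tiles, {"a", "c"}))
```

Note that tile `c` never affects whether you pass, which is why leaving a low-confidence image unchecked could still get you through.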
And it will show the images to multiple people.
It seems to work exactly like that. I experimented by leaving unchecked the one I thought it had low confidence in, and it often worked.
My understanding is different from others here. I thought they served the same Captcha to many people at once and used the majority response to decide who is answering correctly.
That's true, or at least it used to be back when they were using it for OCR. I have no reason to believe it's changed.
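For anyone curious, the majority-vote idea can be sketched like this. It's only an illustration of the general technique; the function names and the `min_votes` / `min_share` cutoffs are my own assumptions, not anything documented:

```python
from collections import Counter


def majority_answer(responses, min_votes=3, min_share=0.6):
    """Return the consensus answer, or None if there is no clear majority.

    min_votes and min_share are hypothetical cutoffs chosen for illustration.
    """
    if len(responses) < min_votes:
        return None
    answer, count = Counter(responses).most_common(1)[0]
    return answer if count / len(responses) >= min_share else None


def score_users(responses_by_user):
    """Mark each user as agreeing (True) or disagreeing (False) with the consensus."""
    consensus = majority_answer(list(responses_by_user.values()))
    if consensus is None:
        return {}  # no consensus yet, so nobody can be graded
    return {user: answer == consensus for user, answer in responses_by_user.items()}


# Three people transcribe the same scanned word (OCR-style challenge).
print(score_users({"u1": "morning", "u2": "morning", "u3": "moming"}))
```

With too few responses, or an even split, nobody gets graded and the challenge would just keep collecting answers.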
It's why they ask you to do multiple: 1-2 of them are the control group, and they are training on the others.
You're implying they give you multiple. I hardly ever get multiple, pretty much only if I 'fail' the first one.
If they have a good fingerprint on you they don't need the control group. That's why you get 5+ captchas when using a VPN/Tor.
If they gave two captchas, one for which they knew the answer and one for which they didn't, they could use the second for training. (Even if you're paying someone, you want to do that sort of thing when crowdsourcing data, because you never know if the paid person is just screwing around.)
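A minimal sketch of that control-plus-unknown pairing, assuming text answers and a simple case-insensitive match. `process_pair` and its parameters are hypothetical names for illustration:

```python
def process_pair(control_answer, control_truth, unknown_id, unknown_answer, collected):
    """Keep the answer to the unknown item only if the control item was correct.

    collected maps unknown_id -> list of answers from users who passed the control.
    Returns True if the user passed the control.
    """
    passed = control_answer.strip().lower() == control_truth.strip().lower()
    if passed:
        # The user proved they are paying attention, so trust their other answer.
        collected.setdefault(unknown_id, []).append(unknown_answer)
    return passed


collected = {}
process_pair("Overlooks", "overlooks", "img-17", "inquiry", collected)  # passes
process_pair("asdf", "overlooks", "img-17", "qwerty", collected)        # rejected
print(collected)
```

The answers that pile up under each unknown item could then be combined across users before anything is treated as a label.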