this post was submitted on 18 Feb 2026
169 points (98.8% liked)

[–] Meron35@lemmy.world 2 points 4 hours ago (1 children)

Eh, kind of both.

When researchers peeked into which areas of the image the model was actually using, they found that it relied heavily on the tiny camera watermark from the Google Street View car.

That is, the recognition system had effectively memorized the routes every Google Street View car had taken, and was using that in its recognition process.

Not all images have this watermark, though, so in cases where the watermark didn't exist it fell back on more traditional GeoGuessr tactics.

[–] FauxLiving@lemmy.world 1 points 41 minutes ago

This system wouldn't be a simple "put an image into a multimodal LLM and get an answer" setup like using ChatGPT.

It'd do things like image segmentation and classification, so that all of the parts of the image are labeled; specialized networks would then take that output and do further processing. For example, if the segmentation step discovered a plant and a rock, those regions would be sent to networks trained on plant or rock identification, and their output would be inserted into the image's metadata.
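A rough sketch of that segment-then-specialize idea might look like the following. The segmenter and the per-class "expert" identifiers are stand-in stubs I made up for illustration; a real system would use trained networks at each stage.

```python
# Hypothetical segment-then-specialize pipeline (all stages are stubs).

def segment(image):
    # Pretend segmentation: return (label, region) pairs found in the image.
    # A real segmenter would return pixel masks plus class labels.
    return [("plant", "region_0"), ("rock", "region_1"), ("sign", "region_2")]

# Specialized "expert" identifiers, one per region class (stubs here).
EXPERTS = {
    "plant": lambda region: {"species": "unknown-fern"},
    "rock":  lambda region: {"lithology": "granite"},
}

def annotate(image):
    metadata = {}
    for label, region in segment(image):
        expert = EXPERTS.get(label)
        if expert:  # only classes with a specialist get deeper analysis
            metadata[label] = expert(region)
    return metadata

print(annotate("photo.jpg"))
```

The point of the structure is that each expert only ever sees regions of its own class, so you can swap in or retrain one specialist without touching the rest of the pipeline.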

Once they've identified all of the elements of the photo, there are other tools that don't rely on AI, which can do things like take a 3D map of a suspected area and render virtual pictures from every angle until the rendered horizon matches the one in the photo.

If you watch videos from Ukraine, you'll see that the horizon line is always obscured or blurred out, because it's possible to make really accurate location predictions if you can both see the horizon in an image and have up-to-date 3D scans of the area.
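The horizon-matching step above can be reduced to a toy example: extract an elevation profile of the horizon from the photo, render the same kind of profile from each candidate viewpoint in the 3D map, and keep the viewpoint whose profile fits best. All the profile data below is invented; real systems compare full panoramic renders against actual terrain scans.

```python
# Horizon profile seen in the photo: apparent ridge height per bearing
# (made-up numbers for illustration).
photo_profile = [0.2, 0.5, 0.9, 0.4, 0.1]

# Pre-rendered horizon profiles for candidate viewpoints in the 3D map.
candidates = {
    "viewpoint_a": [0.8, 0.8, 0.2, 0.1, 0.1],
    "viewpoint_b": [0.2, 0.5, 0.8, 0.4, 0.1],  # closest to the photo
    "viewpoint_c": [0.0, 0.1, 0.1, 0.9, 0.9],
}

def profile_distance(a, b):
    # Sum of squared differences between two horizon profiles.
    return sum((x - y) ** 2 for x, y in zip(a, b))

best = min(candidates, key=lambda v: profile_distance(photo_profile, candidates[v]))
print(best)  # viewpoint_b: its horizon profile is nearest to the photo's
```

This is also why obscuring the horizon works as a countermeasure: without the profile there's nothing for the matcher to score.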

The research paper that you're talking about was focused on trying to learn how AI models generate output from any given input. We understand the process that produces a trained model, but we don't really know how its internal representational space operates.

In that research they discovered, as you said, that the model learned to identify real places from watermarks (or artifacts of watermark removal) rather than from any information in the actual image content. That's certainly a pitfall when training AIs, but there are validation steps (based on that research and research like it) which mitigate these problems.