this post was submitted on 25 Feb 2026

We show that large language models can be used to perform at-scale deanonymization. With full Internet access, our agent can re-identify Hacker News users and Anthropic Interviewer participants at high precision, given pseudonymous online profiles and conversations alone, matching what would take hours for a dedicated human investigator. We then design attacks for the closed-world setting. Given two databases of pseudonymous individuals, each containing unstructured text written by or about that individual, we implement a scalable attack pipeline that uses LLMs to: (1) extract identity-relevant features, (2) search for candidate matches via semantic embeddings, and (3) reason over top candidates to verify matches and reduce false positives. Compared to prior deanonymization work (e.g., on the Netflix prize) that required structured data or manual feature engineering, our approach works directly on raw user content across arbitrary platforms. We construct three datasets with known ground-truth data to evaluate our attacks. The first links Hacker News to LinkedIn profiles, using cross-platform references that appear in the profiles. Our second dataset matches users across Reddit movie discussion communities; and the third splits a single user's Reddit history in time to create two pseudonymous profiles to be matched. In each setting, LLM-based methods substantially outperform classical baselines, achieving up to 68% recall at 90% precision compared to near 0% for the best non-LLM method. Our results show that the practical obscurity protecting pseudonymous users online no longer holds and that threat models for online privacy need to be reconsidered.
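For readers unfamiliar with the headline metric, here is a minimal sketch of how "recall at 90% precision" can be computed from a ranked list of proposed matches. This is not the paper's evaluation code: the function name `recall_at_precision`, the input format, and the toy numbers are all assumptions made up for illustration, and the sketch assumes each proposed match has a distinct confidence score.

```python
def recall_at_precision(scored_matches, total_true, target_precision=0.9):
    """Best recall achievable while keeping precision >= target_precision.

    scored_matches: list of (confidence, is_correct) pairs, one per proposed
        match (hypothetical format, assumed for this sketch).
    total_true: number of true matches that exist in the ground truth.
    """
    # Accept matches from most to least confident, one at a time.
    ranked = sorted(scored_matches, key=lambda m: m[0], reverse=True)
    best_recall = 0.0
    correct = 0
    for accepted, (_, is_correct) in enumerate(ranked, start=1):
        correct += int(is_correct)
        precision = correct / accepted
        # Only cutoffs that meet the precision target count.
        if precision >= target_precision:
            best_recall = max(best_recall, correct / total_true)
    return best_recall


# Toy data (invented): six proposed matches, five true matches exist overall.
toy = [(0.99, True), (0.95, True), (0.90, True),
       (0.80, False), (0.70, True), (0.60, False)]
print(recall_at_precision(toy, total_true=5))
```

In the toy data, the attacker can accept the top three matches without any error, so three of the five true matches are recovered before precision drops below the target. A claim like "68% recall at 90% precision" reads the same way: most true matches are found while at most one in ten accepted matches is wrong.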

silverneedle@lemmy.ca 2 points 10 hours ago

Don't hate the technology. It's great. It's just that how people organize themselves around technology is not up to date. Markets are not meant to coexist with an extremely fast global communication network that everyone can access. Why do you think economies restrict internet access?

Let the internet as a social activity die. It's got to in order to be reborn haha

Goodman@discuss.tchncs.de 1 points 1 hour ago

The internet can mostly die as far as I'm concerned. Just roll it back to file servers again, or something like gemspace. But being able to talk freely with people across cultures and borders is really important. It's a tragedy that all these people will be hurt by the dystopification of the web. The new web needs a way to converse socially that is safe and easy enough for laypeople to use. I have so much more to say on this, but real life is calling so I'll leave it at this.

I don't really get your point about markets though. I'm genuinely trying to understand, so bear with me. This is what I got from your post:

Our market has coexisted with an extremely fast global communication network for decades now. Given that the market feels like quite an organic thing, on what authority is the market not meant to coexist with the internet?

I think that internet access is restricted because of technological constraints, a technological lag in rolling out higher speed infrastructure, and a lack of demand for that access, which is driven by technological and practical constraints. Some complex function of those factors haha. Still, I don't really know what you are trying to get across.