this post was submitted on 07 Jun 2024

298 points (96.9% liked)

DuckDuckGo offers “anonymous” access to AI chatbots through new service (arstechnica.com)

submitted 5 months ago by nifty@lemmy.world to c/technology@lemmy.world

[–] tonyn@lemmy.ml 69 points 5 months ago (1 children)

How is the compute getting paid for?

[–] Ghostalmedia@lemmy.world 42 points 5 months ago (1 children)

DDG makes money through ads and affiliate programs.

[–] pineapplelover@lemm.ee 17 points 5 months ago

Oh yeah I might have to tell ublock to whitelist ddg so I can support them through ads

[–] franzcoz@feddit.cl 26 points 5 months ago

This is pretty cool. I have been using these chats with Claude and ChatGPT on DDGO for several weeks. I guess the new aspect is that they incorporated more models like Mistral.

[–] brbposting@sh.itjust.works 23 points 5 months ago (1 children)

Couple good points in the comments -

Using LLMs to avoid the blank page problem:

For AI, bring your own data:

[–] demonsword@lemmy.world 3 points 5 months ago

Ars Technica forums are alright, I usually take a look there whenever I read something on their site

[–] Kecessa@sh.itjust.works 16 points 5 months ago (4 children)

Anonymous or not, you're still feeding it data

[–] just_another_person@lemmy.world 31 points 5 months ago (2 children)

Not how that works.

[–] metallic_substance@lemmy.world 1 points 5 months ago (1 children)

I'm curious, how does it work?

[–] RagingRobot@lemmy.world 20 points 5 months ago

Not who you asked, but you don't want your AI to train itself on the questions random users ask, because that could introduce incorrect or offensive information. For this reason LLMs are usually trained and used in separate steps. If a user gave an LLM private information, you wouldn't want it to learn that information and pass it on to other users, so there are usually protections in place to stop it from learning new things while it is just processing requests.
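To make the "separate step" point concrete, here is a minimal sketch in Python. It uses a toy linear model, not a real LLM, and the names `FrozenModel` and `train_step` are made up for illustration. The only point is that answering a request reads the weights, while changing them requires a distinct training step that someone has to run on purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenModel:
    """Toy stand-in for a deployed model: weights are fixed once loaded."""
    weights: tuple

    def predict(self, features):
        # Inference only reads the weights; nothing is written back.
        return sum(w * x for w, x in zip(self.weights, features))


def train_step(weights, features, target, lr=0.1):
    """Separate offline step: returns NEW weights from one labelled example."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    # One gradient-descent update for squared error on a linear model.
    return tuple(w - lr * error * x for w, x in zip(weights, features))


model = FrozenModel(weights=(0.5, -0.25))
before = model.weights
answer = model.predict((2.0, 4.0))  # serving a user request
after = model.weights               # still the same weights
updated = train_step(model.weights, (2.0, 4.0), target=1.0)  # explicit training
```

Nothing here says whether a provider logs prompts. It only shows that serving a request does not by itself update the model.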

[–] lung@lemmy.world 1 points 5 months ago (1 children)

These companies absolutely collect the prompt data and user session behavior. Who knows what kinda analytics they can use it for at any time in the future, even if it's just assessing how happy the user was with the answers based on response. But having it detached from your person is good. Unless they can identify you based on metrics like time of day, speech patterns, etc

[–] just_another_person@lemmy.world 5 points 5 months ago* (last edited 5 months ago) (1 children)

Prompt data is pointless and useless without a human to create a feedback loop for it, at which point it wouldn't have context anyway. It would also take human effort to correct spelling and other user errors at the outset. Hugely pointless and unreliable.

Not to mention, what good would it do for training? It wouldn't help the model at all.

[–] lung@lemmy.world 2 points 5 months ago (1 children)

You can collect the data and figure out how to use it later. Just look at the Google leaks lately and what they collect, it's literally everything down to the length of clicks and full walks through the site

Collecting data about user interests is in itself valuable, and it's plausible to use various metrics to analyze it, something as simple as sentiment analysis, which has been broadly done. Sentiment analysis has predated modern ML by a long margin, but you can read the wiki page on that

But yeah just think about stuff like Google trends, tracking interest in topics, as an example of what such data could be used for. And deanonymizing the inputs is probably possible to some degree, aside from the obvious trust we place in DDG as a centralized failure point
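For anyone wondering what "something as simple as sentiment analysis" can look like, below is a rough lexicon-based sketch in Python. The word lists and the `sentiment_score` name are invented for this example and are far too small to be useful. Real systems use much larger lexicons or trained classifiers.

```python
import re

# Tiny hand-made word lists, purely illustrative assumptions.
POSITIVE = {"good", "great", "handy", "cool", "fantastic"}
NEGATIVE = {"bad", "useless", "pointless", "garbage", "worse"}


def sentiment_score(text):
    """Return (# positive words) - (# negative words) found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


print(sentiment_score("Seems pretty handy, works fantastic"))
print(sentiment_score("Hugely pointless and useless"))
```

A positive score suggests a happy message and a negative one an unhappy message, which is the kind of coarse signal that could be computed over stored prompts after the fact.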

[–] just_another_person@lemmy.world 2 points 5 months ago

You're confusing analytics with direct input storage and reuse of prompt data to train somehow, as in your original comment.

Analytics has absolutely nothing to do with their model usage and training, and would be pointless for it. Observing keywords and interests is standard analysis stuff. I don't think anyone even cares about it anymore.

[–] Evotech@lemmy.world 21 points 5 months ago (1 children)

Not really. Depending on the implementation.

It's not like ddg is going to keep training their own version of llama or mistral

[–] regrub@lemmy.world 11 points 5 months ago (2 children)

I think they mean that a lot of careless people will give the AIs personally identifiable information or other sensitive information. Privacy and security are often breached due to human error, one way or another.

[–] Evotech@lemmy.world 15 points 5 months ago (1 children)

But these open models don't really take new input into their models at any point. They don't normally do that type of inference training.

[–] regrub@lemmy.world 5 points 5 months ago (1 children)

That's true, but there's no way for us to know that these companies aren't storing queries in plaintext on their end (although they would run out of space pretty fast if they did that)

[–] Evotech@lemmy.world 7 points 5 months ago

It's true. But I trust them more than closedai or Ms at least

[–] shotgun_crab@lemmy.world 3 points 5 months ago

But that's a human error, as you said; the only way to fix it is by using it correctly as a user. AI is a tool and it should be handled correctly like any other tool, be it a knife, a car, a password manager, a video recording program, a bank app or whatever.

I think a bigger issue here is that many people don't care about their personal information as much as their lives.

[–] subtext@lemmy.world 16 points 5 months ago* (last edited 5 months ago)

https://duckduckgo.com/duckduckgo-help-pages/aichat/ai-chat-privacy/

your conversations are not used to train chat models by DuckDuckGo or the underlying model providers

[–] Even_Adder@lemmy.dbzer0.com 0 points 5 months ago

https://simonwillison.net/2024/May/29/training-not-chatting/

[–] autotldr@lemmings.world 12 points 5 months ago

This is the best summary I could come up with:

On Thursday, DuckDuckGo unveiled a new "AI Chat" service that allows users to converse with four mid-range large language models (LLMs) from OpenAI, Anthropic, Meta, and Mistral in an interface similar to ChatGPT while attempting to preserve privacy and anonymity.

While the AI models involved can output inaccurate information readily, the site allows users to test different mid-range LLMs without having to install anything or sign up for an account.

DuckDuckGo's AI Chat currently features access to OpenAI's GPT-3.5 Turbo, Anthropic's Claude 3 Haiku, and two open source models, Meta's Llama 3 and Mistral's Mixtral 8x7B.

However, the privacy experience is not bulletproof because, in the case of GPT-3.5 and Claude Haiku, DuckDuckGo is required to send a user's inputs to remote servers for processing over the Internet.

Given certain inputs (i.e., "Hey, GPT, my name is Bob, and I live on Main Street, and I just murdered Bill"), a user could still potentially be identified if such an extreme need arose.

With DuckDuckGo AI Chat as it stands, the company is left with a chatbot novelty with a decent interface and the promise that your conversations with it will remain private.

The original article contains 603 words, the summary contains 192 words. Saved 68%. I'm a bot and I'm open source!

[–] Beaver@lemmy.ca 11 points 5 months ago* (last edited 5 months ago)

I could use that!

Update: it works fantastic and lets you switch easily to different AI models

[–] InfiniWheel@lemmy.one 9 points 5 months ago

This has been available for most of the year. What took any tech news org so long to even acknowledge its existence?

[–] 01011@monero.town 8 points 5 months ago

I started using it when DDG and Startpage went down. Seems pretty handy. Good to know they've added more AI models.

[–] devilish666@lemmy.world 1 points 5 months ago (3 children)

How anonymous is that thing?
AI needs training data and correction from us as users

[–] hikaru755@feddit.de 11 points 5 months ago

Training and fine-tuning happen offline for LLMs; it's not like they continuously learn by interacting with users. Sure, the company behind it might record conversations and use them to further tune the model, but it's not like these models inherently need that

[–] nifty@lemmy.world 2 points 5 months ago

You can train models of all kinds without disclosing anything personal about a user. Also see differential privacy
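As a pointer for the differential privacy mention, here is a minimal Python sketch of the Laplace mechanism applied to a counting query. This is the textbook building block, not differentially private model training itself, and not anything DDG or the model providers are known to run. The `private_count` helper and the sample prompts are made up for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Noisy count of matching records.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # seeded only to make the sketch repeatable
prompts = ["weather today", "python help", "python typing", "recipe ideas"]
noisy = private_count(prompts, lambda p: "python" in p, epsilon=0.5, rng=rng)
print(noisy)
```

A smaller `epsilon` means more noise and stronger privacy. The aggregate stays roughly right over many queries while any single user's contribution is masked.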

[–] 01011@monero.town 0 points 5 months ago (1 children)

"Keep in mind that, as a model running through DuckDuckGo's privacy layer, I cannot access personal data, browsing history, or user information. My responses are generated on-the-fly based on the input you provide, and I do not have the ability to track or identify users."

[–] anas@lemmy.world 5 points 5 months ago

Let’s be honest, regardless of whether or not this is true, it’s been instructed to say that.

[–] autonomoususer@lemmy.world -5 points 5 months ago* (last edited 5 months ago) (1 children)

I don't see how we can prove this. Paying them to also spy on us is bad, but allowing them to replace our software c/localllama with their service is even worse. My funds are better spent on local AI development or a device upgrade.

[–] IHeartBadCode@kbin.run 11 points 5 months ago (1 children)

Honest question. How does their service "replace" an open source LLM? If I've got locallama on my machine, how does using their service replace my local install?

[–] autonomoususer@lemmy.world -1 points 5 months ago* (last edited 5 months ago) (1 children)

Yes, it does the same with less control, privacy.

[–] Womble@lemmy.world 2 points 5 months ago (1 children)

So it isn't replacing, it's offering an alternative tradeoff with more convenience/less control. I don't see how that's a bad thing?

[–] autonomoususer@lemmy.world 0 points 5 months ago* (last edited 5 months ago)

After Reddit, some of us learnt not to throw away our control.

[+] whoisthedoktor@lemmy.wtf -16 points 5 months ago (2 children)

And this is why I stopped using DDG. I swear, I'm just going to have to throw away my computer in the future if this fucking AI bullshit isn't thrown away like the thieving, energy-sucking, lying pile of garbage that it is.

[–] nifty@lemmy.world 15 points 5 months ago (1 children)

If it’s using different AI models and allowing anonymity, I am not sure what’s the issue? Do you also object to using a calculator?

[–] considine@lemmy.ml 9 points 5 months ago (1 children)

Calculator?! Those thieving, energy-sucking piles of garbage! Abacus till I die!

But seriously, AI is insidious in how it data mines us to give us answers, and data mines our questions to build profiles of users. I distrust assurances of anonymity by big data corpos.

[–] nifty@lemmy.world 3 points 5 months ago* (last edited 5 months ago)

I am not sure what method DDG is using for their model updates, I think it’s only fair if journalists follow up with them for clarification. Local LLMs, ones you can download to your machine for use, would circumvent privacy concerns if you’re not updating the weights in some way

Edit to clarify I meant updating the weights using online learning, but it’s still possible to update weights using pre trained weights you can download

[–] sugar_in_your_tea@sh.itjust.works 6 points 5 months ago

You can disable ads and AI in the settings.