this post was submitted on 29 Jan 2025
151 points (92.7% liked)

Technology


Summary:

The launch of Chinese AI application DeepSeek in the U.S. has raised national security concerns among officials, lawmakers, and cybersecurity experts. The app quickly became the most downloaded on Apple's store, disrupting Wall Street and causing a record 17% drop in Nvidia's stock. The White House announced an investigation into the potential risks, with some lawmakers calling for stricter export controls to prevent China from leveraging U.S. technology.

Beyond economic impact, experts warn DeepSeek may pose significant data security risks, as Chinese law allows government access to company-held data. Unlike TikTok, which stores U.S. data on Oracle servers, DeepSeek operates directly from China, collecting personal user information. The app also exhibits censorship, blocking content on politically sensitive topics like Tiananmen Square. Some analysts argue that, as an open-source model, DeepSeek may not be as concerning as TikTok, but critics worry its widespread adoption could advance China’s influence through curated information control.

[–] naeap@sopuli.xyz 3 points 1 day ago* (last edited 1 day ago) (1 children)

Distilling OpenAI and Llama models probably also helped quite a bit

Although I must admit that the architectural changes are pretty cool

But I have to add that I only started reading into the topic a few weeks ago and don't have any real practical experience, aside from checking out some Hugging Face docs I got linked yesterday (and stupid me hadn't thought to look there before)...
So everything I say is probably bullshit o:-)

[–] 0liviuhhhhh@lemmy.blahaj.zone 1 points 1 day ago (1 children)

Sure, it made the training process faster, but it still takes a fraction of the energy to generate a single output compared to other LLMs like ChatGPT or Llama. Plus, it's open source. You can't discredit a technological advancement for building on previous advancements, especially when it's done with transparency.

[–] naeap@sopuli.xyz 2 points 1 day ago* (last edited 1 day ago)

As I said, the architectural changes are quite cool

As far as I've understood, it mostly comes down to splitting the model into multiple expert subnetworks (a mixture-of-experts setup), so you don't need to activate the complete network for every request
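
That routing idea can be sketched in a few lines. This is a toy illustration only (all sizes, weights, and the `moe_forward` helper are made up for the example, not DeepSeek's actual architecture): a router scores every expert, but only the top-k experts ever run.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes, just for illustration
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is reduced to a single weight matrix here
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
router_w = rng.normal(size=(d_model, n_experts))

def moe_forward(x):
    """Route one token vector to its top-k experts; the rest stay idle."""
    logits = x @ router_w                    # one router score per expert
    top = np.argsort(logits)[-top_k:]        # indices of the k best experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                     # softmax over the chosen experts only
    # Only top_k of the n_experts matrices are ever multiplied:
    out = sum(g * (x @ experts[i]) for g, i in zip(gates, top))
    return out, top

y, used = moe_forward(rng.normal(size=d_model))
print(len(used))  # only top_k experts were activated out of n_experts
```

So per request you pay roughly top_k/n_experts of the compute of a dense model with the same total parameter count, which is the main reason sparse models are cheaper to run.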

But I've only scratched the surface...

Also, about "open source"... only the weights are made publicly available.
None of the training data or the training code

Edit: regarding "open source":
Meta's Llama is also on Hugging Face, just like DeepSeek. I still wouldn't call that transparency