this post was submitted on 21 Mar 2024
39 points (82.0% liked)

protozoan_ninja@sh.itjust.works 3 points 8 months ago* (last edited 8 months ago)

You're basically completely wrong about how AI is going to scale. We're not going to be stuck with, say, tinkertoy models on our phones and gigantic mega-models exclusively in the cloud. That's insane. We already have good language models that will run on an ordinary laptop. You can scale models to more or less any size, and there is amazing research coming out constantly on how to do AI more efficiently. People are already running tons of ML code on their PCs, and the demand will only go up as companies like Microsoft bundle more and more AI-dependent features into their software, more SDKs and libraries come out that support it, etc.

Also, the feasibility of deploying AI code to users goes up as more of those users have NPUs in their devices.

Also, the main trick of NPUs is efficient matrix math, especially the use case of applying a single operation to an entire matrix at once, which AIUI is foundational to tensor math. Plain old CPUs are comparatively bad at this because, even with SIMD instructions, they work through a matrix only a handful of elements at a time. NPUs, as I guess they're coming to be called, are designed to do those operations massively in parallel. There are likely tons of applications for this beyond ML code that we haven't even imagined yet.
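To make the "massively in parallel" point concrete, here's a rough sketch in plain Python. The function names and the tiny matrices are made up for illustration. It shows that each output row of a matrix multiply depends only on its own inputs, so the rows can be handed to separate workers in any order. That independence is what parallel hardware exploits. Python threads won't actually make this faster, so treat it as a picture of the structure, not a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical helper names; nothing here is a real NPU API.

def matmul_sequential(a, b):
    """Compute a @ b one output element at a time, like a scalar loop."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            out[i][j] = sum(a[i][k] * b[k][j] for k in range(inner))
    return out

def output_row(a_row, b):
    """One output row needs only one row of a and all of b."""
    return [sum(x * y for x, y in zip(a_row, col)) for col in zip(*b)]

def matmul_by_independent_rows(a, b, workers=4):
    """Hand each output row to a worker; no row waits on another row."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: output_row(row, b), a))

a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8, 9], [10, 11, 12]]

# Both approaches agree; only the order of the work differs.
assert matmul_sequential(a, b) == matmul_by_independent_rows(a, b)
```

An accelerator does the same kind of splitting in silicon, across far more multiply-accumulate units than a thread pool has workers.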

It's a bit like asking in 1995 what the use case for a graphics card is when you can go to an arcade and Game Boys exist. At that moment, judging by the cards actually available in 1995, it might have been hard to imagine that by 2024 we'd all have dedicated graphics chips of some kind in our computers (in fact, we'd be hard-pressed to imagine devices without them) and that some of the biggest computing companies in the world would be graphics card manufacturers. Yet here we are.

You have to pay attention to the research as it develops, and you have to realize that new technologies don't just show up to markets to satisfy pre-existing demand. They create markets and create new demand where none existed before. That's how the tech industry works.