this post was submitted on 29 Apr 2026
318 points (98.2% liked)

Technology

[–] ExLisper@lemmy.curiana.net 22 points 21 hours ago (1 children)

I wonder where it’s gone wrong. What would it have cost github to keep operating decently for the vast majority of small users, and still have a business side?

Why would Micro$oft keep a project that doesn't bring in more and more profit? GitHub is no longer a product in itself for them. It's a platform for selling Azure and Copilot subscriptions.

[–] gravitas_deficiency@sh.itjust.works 3 points 4 hours ago (1 children)

Microslop bought GitHub for the training data. That’s it. That was the whole point.

The funniest part is that their model is considered to be rather shit-tier.

[–] ExLisper@lemmy.curiana.net 3 points 4 hours ago (3 children)

What? Microsoft bought GitHub in 2018. ChatGPT was released 4 years later. The AI boom wasn't a thing when MS was buying GitHub, and no one was thinking about using it for data back then. Cloud was the big thing in 2018, and MS bought GitHub to integrate it with Azure and sell compute to people using GitHub Actions.

[–] MangoCats@feddit.it 1 points 48 minutes ago

no one was thinking about using it for data back then

Everyone with any foresight whatsoever has been thinking about using every source of data since the Babylonians were taking censuses 6000 years ago.

[–] NewNewAugustEast@lemmy.zip 4 points 3 hours ago

And they said years earlier at dev meetings: Microsoft is about data. Harvest all you can. Hence the LinkedIn purchase. They may not have known ChatGPT was around the corner, but they did believe that the value is in harvesting as much information as possible.

[–] gravitas_deficiency@sh.itjust.works 3 points 4 hours ago* (last edited 2 hours ago) (1 children)

Google Voice was also a service designed to gather training data for speech-to-text / text-to-speech services at Google. That’s why it was free. The advent of LLMs just gave it something else to plug the data into. The Microslopening of GitHub, at its core, had similar motivations. Having effectively full backend visibility of all content on the (at the time) centralized service that damn near everyone who published their code was using to publish it was a valuable business proposition even before they shoved it all into a training set.

[–] ExLisper@lemmy.curiana.net -1 points 4 hours ago

We're talking about using code to train models, which wasn't a thing until LLMs were able to generate code, and that came after they bought GitHub. I'm pretty sure in 2018 they weren't looking at GitHub as a source of training data. It was a way to get developers to use their tools. Everyone was using GitHub and MS wanted to market their products to them. First Azure, now Copilot.