this post was submitted on 26 Dec 2025
Technology
I came across this thread while searching Lemmy for something else, and it intrigued me. How does 81 TB of data get scraped without Spotify noticing? That seems like a large-scale server administration failure. I mean, 81 TB is nothing to dismiss offhand. OTOH, they are probably burning through petabytes daily, but still. IT: 'I wonder why these few accounts are downloading TBs of data daily?'
Spotify has a whole economy of bots signing up and uploading fake songs that get listened to by other bots, earning a lot of money in the process. I know several people making a living off this. A little army of scraper bots is definitely not what they should be most concerned about.
Sure, I understand gaming the system. I have tracks on SoundCloud, and the same economy exists there too. I just do it for fun tho, I'm not interested in any monetary or commercial pursuits. It seems to me tho, that a relatively small number of accounts doing massive scraping would raise an eyebrow or two, no? Roughly, an average 320 kbps audio track would be around 8 MB +/-, which gives you about 10,616,832 tracks if you count in binary units (81 TiB / 8 MiB). Damn!
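For anyone who wants to check that back-of-the-envelope number, here is a minimal Python sketch. It assumes the figures quoted above (81 TB total, about 8 MB per track, 320 kbps) and treats them as binary units, which is what reproduces 10,616,832. The function names are made up for illustration, and none of this says anything about the real size distribution of the scraped files.

```python
# Rough estimate only: every input is an assumption taken from the comment above.
TIB = 1024 ** 4  # bytes in one tebibyte
MIB = 1024 ** 2  # bytes in one mebibyte


def estimate_tracks(total_bytes: int, avg_track_bytes: int) -> int:
    """Whole number of average-sized tracks that fit in total_bytes."""
    return total_bytes // avg_track_bytes


def track_seconds(track_bytes: int, bitrate_kbps: int) -> float:
    """Playing time of a constant-bitrate file (1 kbps = 1000 bits/s)."""
    return track_bytes * 8 / (bitrate_kbps * 1000)


if __name__ == "__main__":
    # Binary units (TiB / MiB): matches the figure quoted in the comment.
    print(estimate_tracks(81 * TIB, 8 * MIB))
    # Decimal units (10**12 / 10**6): a somewhat smaller count.
    print(estimate_tracks(81 * 10**12, 8 * 10**6))
    # Implied length of an 8 MiB track at 320 kbps, in minutes.
    print(round(track_seconds(8 * MIB, 320) / 60, 1))
```

With decimal units the count drops to about 10.1 million, so "roughly ten million tracks" holds either way. An 8 MiB file at 320 kbps works out to about three and a half minutes of audio, which is a plausible average song length.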
Since botting is so easy, they probably used a lot of accounts to access data that, in theory, is somewhat public. I mean, in an ideal world where engineers have infinite time, sure, they would have noticed, but I investigate platform apps for work and, trust me, they miss a lot of far more fundamental stuff.