this post was submitted on 07 Jun 2024
559 points (99.3% liked)
Technology
59589 readers
2838 users here now
This is a most excellent place for technology news and articles.
Our Rules
- Follow the lemmy.world rules.
- Only tech related content.
- Be excellent to each another!
- Mod approved content bots can post up to 10 articles per day.
- Threads asking for personal tech support may be deleted.
- Politics threads may be removed.
- No memes allowed as posts, OK to post as comments.
- Only approved bots from the list below, to ask if your bot can be added please contact us.
- Check for duplicates before posting, duplicates may be removed
Approved Bots
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
Say it with me again now:
For fact-based applications, the amount of work required to develop and subsequently babysit the LLM to ensure it is always producing accurate output is exactly the same as doing the work yourself in the first place.
Always, always, always. This is a mathematical law. It doesn't matter how much you whine or argue, or cite anecdotes about how you totally got ChatGPT or Copilot to generate you some working code that one time. The LLM does not actually have comprehension of its input or output. It doesn't have comprehension, period. It cannot know when it is wrong. It can't actually know anything.
Sure, very sophisticated LLM's might get it right some of the time, or even a lot of the time in the cases of very specific topics with very good training data. But its accuracy cannot be guaranteed unless you fact-check 100% of its output.
Yep, that right there. I could have called that before they even started. The shit really hits the fan when the computer is inevitably capable of spouting bullshit far faster than humans are able to review and debunk its output, and that's only if anyone is actually watching and has their hand on the off switch. Of course, the end goal of these schemes is to be able to fire as much of the human staff as possible, so it ultimately winds up that there is nobody left to actually do the review. And whatever emaciated remains of management are left don't actually understand how the machine works nor how its output is generated.
Yeah, I see no flaws in this plan... Carry the fuck on, idiots.
Simply false in my experience.
We use CoPilot at work and there is no babysitting required.
We are software developers / engineers and it’s saves countless hours writing boilerplate code, giving code blocks based on a comment, and sticking to our coding conventions.
Sure it isn’t 100% right, but the owner and lead engineer calculates it to be around 70% accurate and even if it misses the mark, we have a whole lot less key presses to make.
Using Copilot as a copilot, like generating boilerplate and then code reviewing it is still "babysitting" it. It's still significantly less effort than just doing it yourself though
Until someone uses it for a little more than boilerplate, and the reviewer nods that bit through as it's hard to review and not something a human/the person who "wrote" it would get wrong.
Unless all the ai generated code is explicitly marked as ai generated this approach will go wrong eventually.
Undoubtedly. Hell, even when you do mark it as such, this will happen. Because bugs created by humans also get deployed.
Basically what you're saying is that code review is not a guarantee against shipping bugs.
Agreed, using LLMs for code requires you to be an experienced dev who can understand what it pukes out. And for those very specific and disciplined people it's a net positive.
However, generally, I agree it's more risk than it's worth