There's something to be said for the abilities of a tool reflecting its wielder.
In research circles, the most advanced prompting pipelines hit around a 90% success rate on things the same model only gets right about 30% of the time with naive zero-shot prompting.
At a minimum, people should be familiar with chain-of-thought prompting if they're using these models. It's very easy to incorporate and makes a huge difference on complex problems.
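Roughly, the difference looks like this (a minimal sketch only - the question and wording are made up, and the call to whatever chat API you actually use is left out):

```csharp
using System;

// Minimal sketch: the same question sent as a bare zero-shot prompt vs. a
// chain-of-thought prompt. The example question is made up, and the call to
// an actual model API is omitted - these are just the strings you'd send.
class PromptSketch
{
    static void Main()
    {
        string question = "A store sells pens in packs of 12 for $3. "
                        + "How much do 30 pens cost at the same per-pen price?";

        // Naive zero-shot: just the question.
        string zeroShot = question;

        // Chain-of-thought: ask for step-by-step reasoning before the final answer.
        string chainOfThought = question
            + "\n\nLet's think step by step. Show your reasoning, "
            + "then give the final answer on its own line.";

        Console.WriteLine("--- zero-shot ---\n" + zeroShot);
        Console.WriteLine("\n--- chain-of-thought ---\n" + chainOfThought);
    }
}
```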
Though for anyone actually building serious pipelines for these products, the best technique I've seen to date was this one from DeepMind:
So yes, maybe you aren't getting a lot out of the models. But a lot of people are, and the difference between your experience and theirs may just come down to experience with the tool. If I'd only used Photoshop for an hour or two, I might complain about how the software sucks at making good-looking images. But we both know it wouldn't be the software's fault.
Well, one more comment like that and I guess I'll have to edit my original comment, because I don't want to keep explaining. I'm getting quite a lot out of LLMs (GPT-4, to be specific); it's just that they're very stupid. When they don't straight-up lie, they simply don't know things. It's quite simple, really: I usually deal with very complex problems that few people have dealt with, so the AI has (close to) no data on them, runs in circles, and isn't able to help.
But when presented with questions it has training data on, it's brilliant - recently I needed to use reflection to get all types implementing an interface in .NET, with the caveat that the interface is generic. GPT-4 solved that problem by the third message of the conversation, while I'm pretty sure it would have taken me hours, because I'd have needed to learn a lot of .NET's internal workings before arriving at the quite simple solution.
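The general shape of that kind of solution looks something like this (a sketch only - IHandler<T> is just a placeholder name for the generic interface, the real one was something else):

```csharp
using System;
using System.Linq;
using System.Reflection;

// Sketch of the usual approach: compare each implemented interface's open
// generic type definition against the target interface. IHandler<T> is a
// placeholder name, not the actual interface from the conversation.
public interface IHandler<T> { void Handle(T item); }

public class StringHandler : IHandler<string> { public void Handle(string item) { } }
public class IntHandler : IHandler<int> { public void Handle(int item) { } }

public static class Program
{
    public static void Main()
    {
        // Scan the current assembly for concrete classes implementing IHandler<T> for any T.
        var implementations = Assembly.GetExecutingAssembly()
            .GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract)
            .Where(t => t.GetInterfaces().Any(i =>
                i.IsGenericType &&
                i.GetGenericTypeDefinition() == typeof(IHandler<>)))
            .ToList();

        foreach (var type in implementations)
            Console.WriteLine(type.Name); // StringHandler, IntHandler
    }
}
```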
So, good career advice - which one do you feel that is? A simple question with one straightforwardly correct solution, or a complex and nuanced issue where there isn't one general truth? Because the only correct answer to a request for career advice from someone who doesn't know your situation in detail is (a version of) "I don't know - what's your situation, exactly?" Knowing GPT, it didn't ask that question.
So yes, LLMs are great! Just learn which use cases they excel at, and don't ask them for complex advice.
You need to provide it the data. The fact that they know anything at all from pretraining was kind of a surprise to everyone in the industry, and their current use as a Google replacement really isn't what the capabilities are best aligned with. But the models have turned out to be surprisingly good at in-context learning, and context windows keep growing, so depending on the model you can absolutely provide relevant reference material to ground the responses in a factual reference point before asking for deeper analysis.

It's hard to give specific recommendations without knowing more about what you're trying to accomplish, but "they're very stupid" runs extremely counter to most of what I've seen at this point, and in the rare cases where it does seem true, there's usually something more nuanced getting in the way and a slight modification to what or how I'm asking gets past it.
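Grounding can be as simple as pasting the relevant material into the prompt ahead of the question, something like this (the file name and the question here are made up):

```csharp
using System;
using System.IO;

// Minimal sketch of grounding a prompt with reference material. The file name
// and the question are invented; the point is just that the source text goes
// into the context before you ask for deeper analysis.
class GroundedPrompt
{
    static void Main()
    {
        string reference = File.ReadAllText("internal-spec.md");

        string prompt =
            "Answer using only the reference material below. " +
            "If the answer isn't in the material, say so.\n\n" +
            "--- REFERENCE ---\n" + reference + "\n--- END REFERENCE ---\n\n" +
            "Question: Which error codes can the v2 endpoint return, and when?";

        Console.WriteLine(prompt); // send this to whichever model you're using
    }
}
```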
Really? I find that the chat models are almost overtuned toward asking for more details as part of their re-engagement strategy. In fact, a number of the employment-related usage examples I've seen were things like users having the model ask a series of questions about work history and responsibilities in order to turn them into résumé fodder. So again, maybe a bit of a difference between users of the tools.
My use of the models is almost entirely related to complex scenarios, and while I'd agree that something like GPT-3 is dumb as shit, GPT-4 is probably among the smarter interactions I've had in my life, and I used to consult for C-suite execs of Fortune 500s. One of my favorite results was explaining the factors I suspected were causing it to get a question wrong and having it generate a correct workaround that was quite brilliant (the issue was token similarity to a standard form of a question, and the proposed solution was replacing the nouns with emojis, which did bypass the similarity bias and let it answer correctly where it had been failing before). In spite of there being no self-introspection capabilities, giving it the background details resulted in novel and ultimately correct out-of-the-box solutions.
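For what it's worth, the trick itself is mechanically trivial - something like this, where the puzzle and the noun list are purely illustrative rather than the actual question from that conversation:

```csharp
using System;
using System.Collections.Generic;

// Rough sketch of the emoji workaround described above: swap the nouns in a
// question that resembles a well-known "standard form" so it no longer
// pattern-matches at the token level. The puzzle and noun list are illustrative.
class EmojiRewrite
{
    static void Main()
    {
        string question = "A farmer must ferry a wolf, a goat, and a cabbage across a river, "
                        + "but the boat only holds the farmer and one item at a time...";

        var substitutions = new Dictionary<string, string>
        {
            ["wolf"] = "🐺",
            ["goat"] = "🐐",
            ["cabbage"] = "🥬",
        };

        foreach (var pair in substitutions)
            question = question.Replace(pair.Key, pair.Value);

        Console.WriteLine(question); // rewritten prompt, less likely to trigger the memorized answer
    }
}
```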
From the sound of it, you are trying to use it for coding. I recommend switching to one of the models that specializes in that rather than using a generalist model.
And on the off chance you're using the free 3.5 version - well, stop that. That one sucks, and it's like using an Atari when there's a PS3 available instead. Don't make the mistake of extrapolating where the tech is at from outdated tech being given away for free as a foot in the door.