Last night I was nearly tearing my hair out trying to get a crappy inkjet printer installed (driver issues) when I turned to ChatGPT for help; five minutes later it was done.
I realised that I use it a lot for all sorts of things but haven’t really seen any posts on here about it, and was wondering if other people do the same.
I’ve used it to create bash scripts, write small, niche Python programs (e.g. sort photos into separate folders based on their orientation), and create cron jobs, among other things.
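To give a flavour of the photo-orientation task, here's a sketch of the kind of script it produces (not its exact output). It assumes ImageMagick's `identify` is installed for reading image dimensions:

```shell
# Sort photos in the current directory into landscape/ and portrait/
# by comparing width to height. Assumes ImageMagick's `identify` is
# available for reading the image dimensions.
mkdir -p landscape portrait
for f in *.jpg; do
  [ -e "$f" ] || continue   # skip if no .jpg files match the glob
  read -r w h <<< "$(identify -format '%w %h' "$f")"
  if [ "$w" -ge "$h" ]; then
    mv -- "$f" landscape/
  else
    mv -- "$f" portrait/
  fi
done
```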
In particular, I find it useful when I want to use a command line program that has a complicated set of options / syntax, but I might only ever want to use it once, or very infrequently. So I tell it what I want to accomplish, and ask it to give me the correct command to run with the options I specify in plain language.
Example:
User
Give me the correct rsync command to recursively copy all files in /home/user/Documents that are image files of any type but were created more recently than 1/1/2024. The destination is /home/user/foo
ChatGPT
rsync -avm --include='*/' --include='*.jpg' --include='*.jpeg' --include='*.png' --include='*.gif' --include='*.bmp' --include='*.tiff' --include='*.svg' --exclude='*' --prune-empty-dirs --min-age=$(date -d '2024-01-01' +%s) /home/user/Documents/ /home/user/foo/
User
Break down the command and explain each part of it in dot point format.
ChatGPT
- `rsync`: Invokes the rsync program, which is used for efficiently transferring and synchronizing files across systems or within a system.
- `-avm`:
  - `-a`: Archive mode, which preserves permissions, symlinks, modification times, group, and ownership. Implicitly includes recursion.
  - `-v`: Verbose mode, which provides detailed output of what rsync is doing.
  - `-m`: Prune empty directory chains from the file list.
- `--include='*/'`: Include directories themselves, which is necessary for recursion.
- `--include='*.jpg' --include='*.jpeg' --include='*.png' --include='*.gif' --include='*.bmp' --include='*.tiff' --include='*.svg'`: Include files that match these image file extensions.
- `--exclude='*'`: Exclude all other files that do not match the previously stated include patterns.
- `--prune-empty-dirs`: Exclude empty directories that result from the include/exclude patterns.
- `--min-age=$(date -d '2024-01-01' +%s)`: Exclude files older than the timestamp for January 1, 2024. This uses `date -d` to convert a date string into a timestamp, which rsync uses to filter files.
- `/home/user/Documents/`: Source directory from which files are being synchronized.
- `/home/user/foo/`: Destination directory to which files are being synchronized.
It’s also really good for explaining concepts / commands in plain language.
It’s like having a 24-hour on-call Linux expert :)
# Things to note:
- Get the subscription. ChatGPT 3.5 is pretty useless. ChatGPT 4 is fine, but I’m pretty sure you need the subscription to access it.
- Give it pre-instructions. I’ve told mine which distro and shell I’m using, and the make and model of my laptop. If you have a subscription, you can add these as permanent pre-instructions, so to speak. That way it will be much more likely to give you correct answers.
- It’s not magic. In order to get what you want, you have to be able to ask the right questions. It will boost your ability but won’t turn you into a 1337 haxx0r.
- Ask questions. As always, don’t run any commands that you don’t understand. Ask it to break down any commands it tells you to run if you don’t understand them.
- Sometimes it goofs. For various reasons, it will sometimes tell you to install a package that no longer exists, or give you a command that doesn’t work. When that happens, I just feed the error message back into ChatGPT and it’s usually able to correct itself.
- Ask “Is there a better or easier way to do this?” This is probably the most valuable question I’ve found to ask ChatGPT. Sometimes it gets so far into the weeds looking for a solution to a problem that you need to pull back and start fresh.
ChatGPT does not know truth. It does not know whether the info it provides is true, or whether the code actually works. It just concatenates strings based on probability. You may get lucky or you may not. The easier the task, the more likely it is to succeed, but low difficulty is no guarantee of success.
It is great for layouts, structure, and basic concepts (“for loop in fish”), but it may struggle to convert a video from x264 to AV1 with ffmpeg. It depends on the info that’s available online; if it was trained on misinformation, that’s in there as well.
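For the easy tier, think of something like a trivial loop, which it reliably gets right (shown in bash here; a fish version differs only in syntax):

```shell
# Rename every .txt file in the current directory to .bak, the kind of
# one-liner loop an LLM almost never fumbles.
for f in *.txt; do
  [ -e "$f" ] || continue   # nothing to do if no .txt files match
  mv -- "$f" "${f%.txt}.bak"
done
```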
The command you got is just wrong. What about AVIF, JXL, or most other image formats? Use it, but think.
I was wondering how long it would take the gatekeepers to show up. The command works and is perfectly fine for my files. If I had any uncommon formats, I would tell ChatGPT to include them.
I’m quite sure it won’t be long until some bad practice spreads this way, giving clueless “Linux pros” top advice on how to enable a back door.
LLMs can be poisoned, and as datasets grow and complexity increases, it will be harder to contain.
ChatGPT works great for some stuff, but all you know is that someone somewhere wrote something similar. It is no better than Google at predicting what is good material and what is wrong, and training is statistics.
In order to poison an LLM, you’d need access to the training process, which is locked down by OpenAI. Just posting false info on the net isn’t enough; GPT doesn’t simply repeat what’s already been written.
More than that though, you can find plenty of wrong and bad advice posted confidently by legions of Linux gatekeepers on any forum.
Anyone who has ever spent any time on Stack Overflow will tell you why they’d rather talk to an LLM than post there.
ChatGPT only generates text; that’s how it was designed to work. It doesn’t care whether the text it’s generating is true, or even whether it makes any sense, so sometimes it will generate untrue statements (with the same confidence as the “Linux gatekeepers” you mentioned, except with no comments to correct the response), no matter how well you train it. And if there’s enough wrong information in the dataset, it will start repeating it in its responses, because, again, its only real purpose is to pick the next word in a string based on its training data. Sometimes it gets things right, sometimes it doesn’t; we can’t just blindly trust it. Pointing that out is not gatekeeping.