Like... now? Here are my notes about it https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence
You don't even need to wait for "AI" chips; "just" a high-end GPU will do.
Sure, there are very large models like Mistral or BLOOM that you won't be able to run even on a 4090 (the highest-end gaming card right now), but they usually have lower-quality versions that might give usable results.
IMHO though, what I realized while testing all that at home is... it's rarely worth it. It's absolutely fun to play with, and even interesting to learn about it all, but in terms of time/energy/ecology/cost versus results, so far it's been "meh". A cool experiment, like locally generating transcripts for my PeerTube server from the audio of my videos, but something that, in the end, I always end up not relying on.
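That transcript experiment is roughly this kind of wrapper (a sketch, not my exact script: it assumes ffmpeg and the openai-whisper CLI are installed, and the function name and model choice are illustrative):

```shell
#!/usr/bin/env bash
# Sketch: extract a video's audio, then transcribe it with whisper.
# Assumes ffmpeg and the openai-whisper CLI are on PATH.
set -euo pipefail

transcribe() {
  local video="$1"
  local audio="${video%.*}.wav"
  # mono 16 kHz, which whisper resamples to anyway
  ffmpeg -y -i "$video" -ac 1 -ar 16000 "$audio"
  # "small" model: decent quality, fits on a consumer GPU
  whisper "$audio" --model small --output_format srt
}

# usage: transcribe my-video.mp4
```

On a 4090 this runs comfortably; on CPU it works too, just much slower, which is exactly the time-versus-result trade-off I mean.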
It also lets me build cool prototypes, like code generation in XR, but again that's something I'd qualify as fun, not as productive.
TL;DR: it's feasible today but IMHO not worth it.
PS: the best example would be Immich with its optional ML, local or not (as in serving content from a small Pi but doing the ML inference on your desktop)
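That split setup is roughly this (a hedged sketch: service names follow Immich's docker-compose, but the hostname and port are illustrative, so check their docs before copying):

```yaml
# On the Pi: run the Immich server, pointed at a beefier box for ML.
services:
  immich-server:
    image: ghcr.io/immich-app/immich-server:release
    environment:
      # URL of the remote machine-learning container
      IMMICH_MACHINE_LEARNING_URL: http://desktop.lan:3003

# On the desktop: run only the machine-learning container.
#  immich-machine-learning:
#    image: ghcr.io/immich-app/immich-machine-learning:release
#    ports:
#      - "3003:3003"
```

The nice part is the Pi stays low-power for serving, and the GPU box only spins up when there is actual inference to do.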
I made a bash script and a KDE shortcut for that a while ago. I didn't even remember it until now. It's useful sometimes.