The ads won’t be baked in beforehand; they’ll be injected into the stream in real time. Videos are already broken into chunks and sent over HTTP, so the server can just slot ad chunks in during playback. There is no need to re-encode anything. If you deep link to a timestamp, the video just starts from that timestamp as normal. If you are a Premium user, the server just never injects the ads.
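To make that concrete, here is a rough sketch of the idea in Python. It is purely illustrative and not how YouTube actually does it: `Segment`, `build_playlist`, and the single `ad_break_at` break point are names and simplifications I made up. The point is only that the server is picking which chunks to hand out, so ads, deep links, and Premium are all just list manipulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One chunk of video served over HTTP (hypothetical model)."""
    uri: str
    duration: float  # seconds
    is_ad: bool = False


def build_playlist(content, ads, ad_break_at, start_at=0.0, premium=False):
    """Return the segments to serve, splicing ads in at a single break point.

    Nothing is re-encoded: ad segments are placed between existing
    content segments at a chunk boundary.
    """
    out = []
    elapsed = 0.0  # content time at the start of the current segment
    ads_done = premium  # Premium: behave as if the break was already served
    for seg in content:
        seg_end = elapsed + seg.duration
        if seg_end <= start_at:
            # Deep link: skip chunks that end before the requested timestamp.
            elapsed = seg_end
            continue
        # Fire the break at the first chunk boundary at or after ad_break_at,
        # but only if the viewer did not start past the break.
        if not ads_done and start_at <= ad_break_at <= elapsed:
            out.extend(ads)
            ads_done = True
        out.append(seg)
        elapsed = seg_end
    return out
```

With five 4-second content chunks and a break at 10 seconds, a normal viewer gets the ad between the third and fourth chunks, a Premium viewer gets only content, and a deep link past the break starts at the right chunk with no ad in this simplified version.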
But you are correct that the client needs to be aware that ads are happening, so it can show an ad indicator on screen and enable click-throughs.
This is arguably why Chrome went to Manifest V3: it limits how extensions can intercept and block requests, which makes it harder to run code that looks for ad signals and tries to counter them.
All of that targeting data lives on Google’s servers already. Your computer isn’t trying to figure out who you are and what you like every time an ad plays; Google already knows who you are when your browser makes a request for a video. Everything you are talking about is already server-side.