This happens because LLMs operate on tokenized data; they don't actually "see" the text as a sequence of individual characters, but as multi-character tokens.
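You can see this for yourself with a tokenizer. A minimal sketch, assuming OpenAI's `tiktoken` library is installed; the word "strawberry" is just an illustrative example, and the exact token boundaries depend on which model's tokenizer you use:

```python
import tiktoken

# Illustrative example word -- not from any particular model's output.
word = "strawberry"

# cl100k_base is the encoding used by several OpenAI models.
enc = tiktoken.get_encoding("cl100k_base")
token_ids = enc.encode(word)

# Decode each token id individually to reveal the chunks the model sees.
pieces = [enc.decode([t]) for t in token_ids]
print(pieces)  # the word arrives as a few multi-character chunks, not letters
```

Whatever the exact split, the model receives a handful of opaque chunks rather than ten separate letters, which is why letter-level questions trip it up.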
You can, however, quite reasonably get an LLM to generate a script that does the character counting, then run that script to arrive at the correct answer.
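A minimal sketch of the kind of script an LLM might produce for this; the word and letter are hypothetical placeholders for whatever the user asked about:

```python
# Count occurrences of a letter in a word by iterating over actual
# characters -- something code can do exactly, unlike a tokenized model.
word = "strawberry"   # illustrative input
letter = "r"          # illustrative target

count = word.count(letter)
print(f'"{word}" contains {count} occurrence(s) of "{letter}"')
```

Because the script operates on the raw string, character by character, it sidesteps tokenization entirely and gives a deterministic answer.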