This process is akin to how humans learn by reading widely and absorbing styles and techniques, rather than memorizing and reproducing exact passages.
Many people quote this part, saying that this is not the case and that this is the main reason why the argument is not valid.
Let's take a step back and set aside the question of how current "AI" learns vs. how humans learn.
The key point for me here is that humans DO PAY (or at least are expected to...) to use and learn from copyrighted material. So if we're equating the "AI" method of learning with that of humans, both should be subject to the same rules and regulations, meaning that "AI" should pay for using copyrighted material.
Thanks to everyone who has replied, all fair points. When you use (read, view, listen to...) copyrighted material, you're subject to the licensing rules, whether it's free (as in beer) or not.
This means, for instance, that quoting more than what's considered fair use is a violation of the license. In practice, a human would not be able to quote a 1,000-word document exactly after just one read, but "AI" can, thus infringing one of the licensing clauses.
Some licenses on copyrighted material also explicitly forbid automated systems from using the full content (in the past, those systems were web crawlers for search engines).
Basically, all of these possible or actual licensing infringements would require a negotiation between the parties involved.