I have conflicting feelings about this whole thing. If you are selling the result of training like OpenAI does (and every other company), then I feel like it’s absolutely and clearly not fair use. It’s just theft with extra steps.
On the other hand, what about open source projects and individuals who aren’t selling or competing with the owners of the training material? I feel like that would be fair use.
What keeps me up at night is if training is never fair use, then the natural result is that AI becomes monopolized by big companies with deep pockets who can pay for an infinite amount of random content licensing, and then we are all forever at their mercy for this entire branch of technology.
The practical, socioeconomic, and ethical considerations are really complex, but all I ever see discussed are these hard-line binary stances that would only have awful corporate-empowering consequences, either because they can steal content freely or because they are the only ones that will have the resources to control the technology.