Reddit's licensing deal means Google's AI can soon be trained on the best humanity has to offer — completely unhinged posts
(www.businessinsider.com)
I'm so confused about how AI learning is supposed to work. Does it just need any data at all in significant quantity? Is the quality of the data almost irrelevant? Because otherwise surely they could just feed it back issues of Scientific American, or the scanned copies of the Library of Congress. I can't reasonably believe that Reddit is going to add anything unless it's just pure unadulterated quantity that's important.
If you wanted the AI to just create book-like texts, then you could train it purely on books from a library, but if you want it to converse like a human being, you need training data that imitates that.
But that's my point, really: it already talks like a human. My guess is they feed it on hours and hours and hours of podcasts, because that tends to be the manner in which it communicates. I don't see how Reddit really adds to this.
I doubt it's trained on podcasts, seeing as they would need subtitles, and current automated subtitling is not that good.