I wonder when Google is going to tap into users' Gmail data (if they don't already). They must have trillions of English messages, and they've already filtered out the spam. Additionally, it would be hard to ever prove that they did it.
Maybe it doesn't make for high-quality data though, not sure.
Would be a crazy expensive migration though