It's almost impossible to audit what data got into an AI model. Until this is true companies could scrape and use whatever they like and no one would be the wiser to what data got used or misused in the process. That makes it hard to make such companies accountable to what and how they are using.

[–] po-lina-ergi@kbin.social 37 points 2 years ago (1 children)

Then it needs to be on companies to prove their audit trail, and until then require all development to be open source

[–] zinderic@programming.dev 7 points 2 years ago (2 children)

That would be amazing. But it won't happen any time soon if ever.. I mean - just think about all that investment in GPU compute and the need to realize good profit margins. Until there are laws and legislation that requires AI companies to open their data pipelines and make public all details about the data sources I don't think much would happen. They'll just keep feeding any data they get their hands on and nothing can stop that today.

[–] ipkpjersi@lemmy.ml 1 points 2 years ago* (last edited 2 years ago)

Until there are laws and legislation that requires AI companies to open their data pipelines and make public all details about the data sources I don’t think much would happen.

I don't expect those laws to ever happen. They don't benefit large corporations so there's no reason those laws would ever be prioritized or considered by lawmakers, sadly.

[–] InputZero@lemmy.ml 0 points 2 years ago* (last edited 2 years ago)

Maybe not today and maybe not every AI but maybe some AI in the near future will have it's data sources made explainable. There are a lot of applications where deploying AI would be an improvement over what we have. One example I can bring up is in-silico toxicology experiments. There's been a huge desire to replace as many in-vivo experiments with in-vitro or even better in-silico to minimize the number of live animals tested on. Both for ethical reasons and cost savings. AI has been proposed as a new tool to accomplish this but it's not there yet. One of the biggest challenges to overcome is making the AI models used in-silico to be explainable, because we can not regulate effectively what we can not explain. Regardless there is a profits incentive for AI developers to make at least some AI explainable. It's just not where the big money is. To which end that will apply to all AI I haven't the slightest idea. I can't imagine OpenAI would do anything to expose their data.

[–] po-lina-ergi@kbin.social -1 points 2 years ago

Then it needs to be on companies to prove their audit trail, and until then require all development to be open source