Previously on this blog, I've discussed AI builders' assertion that statistical information describing patterns of relationships among words in an author's text lies outside the protections of copyright. But recently, AI watchers have begun discussing a new potential line of defense for technology builders seeking to leverage copyrighted works at mass scale: corporate personhood.
The idea that "corporations are people too" is often ridiculed, but it is widely accepted in US law. Corporate personhood had humble beginnings: in 1886, in Santa Clara County v. Southern Pacific Railroad, the US Supreme Court declared that companies have the same right to dispute their tax bills as the American citizens who own them. The doctrine was narrowed in 1907, then widened by the Citizens United decision of 2010, which has opened the door to unbridled sums of private money in American politics.
Generative AI companies may decide they need to invoke corporate personhood in defense of the output of their foundation models. "It's not speech unless the most wide, outrageous definition of corporate speech. But it's not really speech, it's a machine," said Steven Brill, co-CEO of NewsGuard Technologies, a misinformation-monitoring firm.
The tax rights of a company are ultimately intertwined with the financial fortunes of its shareholders, such that some degree of those individual rights must also exist at the corporate level. But even if we accept that a degree of corporate personhood may apply to some technological ventures, grounded in the rights of the people building them, developing a large language foundation model requires trillions of words of text as a training corpus, far more than any individual company can produce from its product documentation, sales materials, and other assets. This has led some technologists to argue that AI safety is inherently impossible: the input material required to create large language models is so vast that the safety of the training content itself cannot be guaranteed.
One potential counter-argument to applying corporate personhood to AI-generated "speech": an LLM's output is not corporate speech, because it cannot represent a synthesis of only a single company's knowledge. Foundation models require access to broad swaths of human knowledge to be compelling. For now, enterprise companies looking to build efficiency through the customization of generative AI are better served by carefully examining the indemnity protections offered by their AI providers than by hoping corporate personhood or copyright fair-use protections will be extended to AI across major economies. Google took a big step toward this kind of indemnification. Midjourney, the generative-imaging company, and ChatGPT maker OpenAI don't yet offer such protections. Ultimately, the battle for commercial supremacy in AI will be won or lost in part on these terms, rather than on the quality of AI output alone. And the question of whether AI output constitutes individual speech or merely corporate property will take time for courts and regulators to resolve.