Stack Overflow signs deal with OpenAI to supply data to its models
OpenAI is collaborating with Stack Overflow, the Q&A forum for software developers, to improve its generative AI models' performance on programming-related tasks.
As a result of the partnership, announced Monday, OpenAI's models, including models served through its ChatGPT chatbot platform, should get better over time at answering programming-related questions, the two companies say. At the same time, Stack Overflow will benefit from OpenAI's expertise in developing new generative AI integrations on the Stack Overflow platform.
The first set of features will go live by the end of June.
The tie-up with OpenAI is a remarkable reversal for Stack Overflow, which initially banned responses from ChatGPT on its platform over fears of spammy responses.
Stack Overflow began experimenting with generative AI features last April, promising to craft models that "reward" devs who contribute knowledge to the platform. In July, the company launched a conversational search tool that lets users pose queries and receive answers based on Stack Overflow’s database of over 58 million questions and answers, along with tools for businesses to fine-tune searches on their own documentation and knowledge bases.
Some members of Stack Overflow's developer community rebelled against the changes, pointing out concerns related to the validity of information generated by AI, information overload and data privacy for individual contributors on the platform.
There was at least some basis for those concerns. An analysis of more than 150 million lines of code committed to project repos over the past several years by GitClear found that generative AI dev tools are resulting in more mistaken code being pushed to codebases. Elsewhere, security researchers have warned that such tools can amplify existing bugs and security issues in software projects.
But despite the apparent flaws, developers are embracing generative AI tools for at least some coding tasks. In a Stack Overflow poll from June 2023, 44% of developers said that they use AI tools in their development process now while 26% plan to soon.
This has precipitated something of an existential crisis for Stack Overflow. Traffic to the platform has reportedly dipped significantly since the release of capable new generative AI models last year -- models that in many cases were trained on data from Stack Overflow.
So now, as it cuts costs, Stack Overflow is pursuing licensing agreements with AI providers.
The company's deal with OpenAI -- the financial terms of which weren't disclosed -- comes after Stack Overflow partnered with Google to enrich Google's Gemini models with Stack Overflow data and work with Google to bring more AI-powered features to its platform. Stack Overflow stressed at the time that the agreement wasn't exclusive -- and indeed, that turned out to be the case.
Prashanth Chandrasekar, CEO of Stack Overflow, previously said that 10% of the platform's nearly 600 staff was focused on its AI strategy, and has described potential additional revenue from the strategy as key to ensuring Stack Overflow can keep attracting users and maintaining high-quality information.
"Stack Overflow is the world’s largest developer community," Chandrasekar said in a press release this morning. "Through [our] industry-leading partnership with OpenAI, we strive to redefine the developer experience, fostering efficiency and collaboration through the power of community, best-in-class data, and AI experiences. Our goal with OverflowAPI, and our work to advance the era of socially responsible AI, is to set new standards with vetted, trusted, and accurate data that will be the foundation on which technology solutions are built and delivered to our user."