May 29, 2024
Generative AI in production: Rethinking development and embracing best practices

Presented by Sendbird

Generative AI is reshaping how businesses engage customers, elevate CX at scale and drive business growth. In this VB Spotlight, industry experts shared real-world use cases, discussed challenges and offered actionable insights to empower your organization’s gen AI strategy.


Rethinking how software is built

“The biggest upside of LLMs [large language models] is also the biggest downside, which is that they’re very creative,” says Jon Noronha, co-founder of Gamma. “Creative is wonderful, but creative also means unpredictable. You can ask the same question of an LLM and get a very different answer depending on very slight differences in phrasing.”

For companies building production apps around LLMs, the engineering mindset of predictable debugging, testing and monitoring is suddenly challenged.

“Building one of these apps at scale, we’ve found that we’re having to rethink our whole software development process and try to create analogs to these traditional practices like debugging and monitoring for LLMs,” he adds. “This problem will be solved, but it’s going to require a new generation of infrastructure tools to help development teams understand how their LLMs perform at scale out in the wild.”

It’s a new technology, says Irfan Ganchi, CPO at Oportun, and engineers are encountering new issues every day. For instance, consider how long it takes to train LLMs, particularly on your own knowledge base, and the challenge of keeping the output on-brand across various touch points and contexts.

“You need to have almost a filter on the input side, and also a filter on the output side; put a human in the loop to verify and make sure you’re working in coordination with both a human and what the generative AI is producing,” he says. “It’s a long way to go, but it’s a promising technology.”
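
The arrangement Ganchi describes, a filter on the input side, a filter on the output side and a human in the loop, can be sketched roughly as follows. This is a minimal illustration, not Oportun's implementation: the blocked-term lists, the `Result` type, the `respond` function and the `fake_model` stand-in are all hypothetical names chosen for the example, and a real system would likely use far richer checks than substring matching.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical placeholder lists; a real deployment would define its own policies.
BLOCKED_INPUT_TERMS = ("ssn", "password")
OFF_BRAND_TERMS = ("guaranteed approval",)


@dataclass
class Result:
    status: str  # "sent", "rejected_input", or "needs_review"
    text: str


def input_allowed(user_message: str) -> bool:
    """Input-side filter: reject messages containing blocked terms."""
    lowered = user_message.lower()
    return not any(term in lowered for term in BLOCKED_INPUT_TERMS)


def output_allowed(draft: str) -> bool:
    """Output-side filter: flag drafts containing off-brand phrases."""
    lowered = draft.lower()
    return not any(term in lowered for term in OFF_BRAND_TERMS)


def respond(user_message: str, generate: Callable[[str], str]) -> Result:
    """Run one message through input filter, model, then output filter."""
    if not input_allowed(user_message):
        return Result("rejected_input", "")
    draft = generate(user_message)
    if not output_allowed(draft):
        # Human in the loop: hold the draft for an agent instead of sending it.
        return Result("needs_review", draft)
    return Result("sent", draft)


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call, so the sketch runs offline."""
    if "loan" in prompt:
        return "Guaranteed approval for everyone!"
    return "Happy to help."
```

In this sketch, a draft that fails the output check is never sent automatically; it is returned with the status `needs_review` so that a person can approve or rewrite it.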

Working with LLMs is not like working with software, adds Shailesh Nalawadi, head of product at Sendbird.

“It’s not software engineering. It’s not deterministic,” he says. “A small change in inputs can lead to vastly different outputs. What makes it more challenging is you can’t trace back through an LLM to figure out why it gave a certain output, which is something that we as software engineers have traditionally been able to do. A lot of trial and error goes into crafting the perfect LLM and putting it into production. Then the tooling around updating the LLM, the test automation and the CI/CD pipelines, they don’t exist. Rolling out generative AI-based applications built on top of LLMs today requires us to be cognizant of all the things that are missing and proceed quite carefully.”

Misconceptions around generative AI in production-level environments

One of the biggest misconceptions, Nalawadi says, is that many people think of LLMs as very similar to Google search: a database with full access to real-time, indexed information. Unfortunately, that’s not true. LLMs are often trained on a corpus of data that is potentially six to 18 months old. For a model to respond with the particular information a user needs, the prompt has to supply the specifics of your data.

“That means, in a business setting, enabling the correct prompt, making sure you package all the information that is pertinent to the response required, is going to be quite important,” he says. “Prompt engineering is a very relevant and important topic here.”
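
As a rough illustration of what packaging the pertinent information into a prompt can look like, the sketch below assembles business-specific facts and the user's question into a single prompt string. The function name `build_prompt`, the instruction wording and the example fact are assumptions made for this example; they do not come from any speaker's system.

```python
def build_prompt(question: str, facts: list[str]) -> str:
    """Package business-specific facts alongside the user's question.

    The model's training data may be months old, so anything current or
    company-specific has to travel with the prompt itself.
    """
    context = "\n".join(f"- {fact}" for fact in facts)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical usage: the fact would normally be looked up per customer.
prompt = build_prompt(
    "When is my payment due?",
    ["Payment due date: the 5th of each month"],
)
```

The resulting string is what would be sent to the model; which facts to retrieve for a given question is a separate problem that this sketch leaves out.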

The other big misconception comes from terminology, Noronha says. The term “generative” implies making something from scratch, which can be fun, but is often not where the most business value is or will be.

“We’ll find that generation is almost always going to be paired with some of your own data as a starting point, that is then paired with generative AI,” he says. “The art is bridging these two worlds, this creative, unpredictable model with the structure and knowledge you already have. In many ways I think ‘transformative AI’ is a better term for where the real value is coming from.”

One of the biggest fears people have around generative AI in a production environment is that it’s going to automate everything, Ganchi says.

“That can’t be further from the truth based on how we’ve seen it,” he explains.

It automates certain mundane tasks, but it’s fundamentally increasing productivity. For instance, in Oportun’s contact center, they’ve been able to train the models based on the responses of top performing agents, and then use those models to train all agents, and coordinate with gen AI to improve average response times and hold times.

“We’re able to drive so much value when humans, our agents, and generative AI tools increase productivity, but also improve the experience for our customers,” Ganchi says. “We see that it is a tool that increases productivity, rather than replacing humans. It’s a partnership that we have seen work well, specifically in the context of the contact center.”

He points to similar trends in marketing as well, where generative AI helps today’s marketers be much more productive in their content writing and creative generation. They can get so much more done. It’s a tool that enhances productivity.

Best practices for leveraging generative AI

When applying generative AI, the most crucial thing is being very intentional, Ganchi says, going in with a fundamental strategy and the ability to incrementally test the value within an organization.

“One thing that we’ve found is that as soon as you introduce generative AI, there is a lot of apprehension, both on the employee front and the organizational executive front,” he says. “How can you be deliberate? How can you be intentional? You have a strategy to incrementally test, show value and add to the productivity of an organization.”

Before you even start deploying it, you need to have infrastructure in place to measure the performance of generative AI-based systems, Nalawadi adds.

“Is the output being generated? Does it meet the mark? Is it satisfactory? Perhaps have a human evaluation framework,” he says. “And then keep that around as you evolve your LLMs and evolve the prompts. Refer back to this gold standard and make sure that it is in fact improving. Use that rather than solely relying on qualitative metrics to see how it’s doing. Plan it out. Make sure you have a test infrastructure and a quantitative evaluation framework.”
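
One simple way to picture the gold standard Nalawadi mentions is a fixed set of prompts with expected properties, scored the same way each time the model or prompt changes. The sketch below is only an assumption-laden illustration: the `GOLD_SET` entries, the keyword-match scoring rule and the function names are hypothetical, and keyword matching stands in for whatever human or automated judgment a real evaluation framework would use.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical gold set: (prompt, keyword the answer is expected to contain).
GOLD_SET: Sequence[Tuple[str, str]] = (
    ("reset password", "settings"),
    ("refund status", "orders"),
)


def score(generate: Callable[[str], str],
          gold: Sequence[Tuple[str, str]] = GOLD_SET) -> float:
    """Fraction of gold cases whose output contains the expected keyword."""
    hits = sum(1 for prompt, expected in gold
               if expected in generate(prompt).lower())
    return hits / len(gold)


def is_regression(baseline: float, candidate: float,
                  tolerance: float = 0.0) -> bool:
    """True if the candidate scores worse than the baseline beyond tolerance."""
    return candidate < baseline - tolerance


# Stand-ins for two versions of a model or prompt, so the sketch runs offline.
def model_v1(prompt: str) -> str:
    return "Go to Settings."


def model_v2(prompt: str) -> str:
    if "password" in prompt:
        return "Go to Settings."
    return "Check your Orders page."
```

Keeping the gold set fixed while the LLM and prompts evolve is what makes the scores comparable: `score(model_v2)` can be checked against `score(model_v1)` with `is_regression` before a change ships.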

In many ways the most important part is choosing which problems to apply generative AI to, Noronha says.

“There’s certainly a number of mishaps that can go along the way, but everyone is so eager to sprinkle the magic fairy dust of AI on their product that not everyone is thinking through what the right places are to put it,” he says. “We looked for cases where it was a job that either nobody was doing, or nobody wanted to be doing, like formatting a presentation. I’d encourage looking for cases like that and really leaning into those. The other thing that surprised us in focusing on those was that it didn’t only change efficiency. It got people to create things they weren’t going to be creating before.”

To learn more about where generative AI is now, and where it’s headed in the future, along with real-world case studies from industry leaders and concrete ROI, don’t miss this VB Spotlight event.



Agenda

  • How generative AI is leveling the playing field for customer engagement
  • How different industries can harness the power of generative and conversational AI
  • Potential challenges and solutions with large language models
  • A vision of the future powered by generative AI


Presenters

  • Irfan Ganchi, Chief Product Officer, Oportun
  • Jon Noronha, Co-founder, Gamma
  • Shailesh Nalawadi, Head of Product, Sendbird
  • Chad Oda, Moderator, VentureBeat
