Industry-first automated evaluation and security platform to detect unexpected LLM failures at scale
Patronus AI today officially launched the first automated evaluation and security platform that helps companies use large language models (LLMs) safely. Using proprietary AI, the new platform enables enterprise development teams to score model performance, generate adversarial test cases, benchmark models and more. Patronus AI automates and scales the manual and costly model evaluation methods prevalent in the enterprise today, enabling organizations to confidently deploy LLMs while minimizing the risk of model failures and misaligned outputs.
Patronus AI was founded by machine learning experts Anand Kannappan and Rebecca Qian. Prior to Patronus AI, Rebecca led responsible NLP research at Meta AI, and Anand pioneered explainable ML frameworks at Meta Reality Labs. They founded the company after experiencing firsthand the difficulties of evaluating AI outputs, and recognized early on that LLM evaluation would become a massive challenge for enterprises.
“Every company is looking for ways to use LLMs today, yet they are concerned that unexpected model behavior, incorrect outputs and hallucinations will put their business and customers at risk,” said Anand Kannappan, CEO and co-founder, Patronus AI. “Whether off-the-shelf, open-source or custom, models today remain inadequately vetted and tested in real-world scenarios. And until now, the process of evaluating LLMs has been extremely inefficient and unscalable, producing unreliable results.”
Patronus AI leverages state-of-the-art machine learning technology to test and score any language model in order to identify potential failures. The platform automates:
- Scoring: Scores model performance in real-world scenarios and against key criteria such as hallucinations and safety.
- Test generation: Automatically generates adversarial test suites at scale.
- Benchmarking: Compares models to help customers identify the best model for specific use cases.
Fueling the company’s launch is a $3 million seed funding round led by Lightspeed Venture Partners, with participation from Factorial Capital, Replit CEO Amjad Masad, Gokul Rajaram and a number of Fortune 500 executives and board members.
“AI has become a must-have for businesses, as they seek to realize its full potential and not be left behind in the LLM revolution,” said Nnamdi Iregbulem, Partner at Lightspeed Venture Partners. “But no responsible company is going to put their reputation on the line by leveraging risky models. Patronus AI not only has the technology to tackle this problem head-on, they have a world-class team from Meta, Airbnb and Samsung, with the expertise to help organizations safely navigate LLMs. We’re thrilled to be on this journey with them and look forward to playing a role in their continued growth.”
Early platform partners include leading AI companies such as Cohere, Nomic AI, and Naologic. Additionally, several high-profile companies in traditional industries like financial services will be piloting Patronus AI in the coming months.