Performance rivals leading competitors, yet the LLM is trained exclusively on permissively licensed code, eliminating the tradeoff between performance and privacy
Tabnine, the originator of the AI code assistant category, today announced the release of Tabnine Protected 2, a license-safe large language model (LLM) designed to deliver top-tier performance while ensuring strict compliance with copyright and license regulations.
In the rapidly evolving landscape of generative AI, concerns about copyright and license compliance remain a significant barrier to widespread adoption. Recent studies reveal that nearly one-third of CIOs are wary of these issues, with multiple lawsuits still pending that question the legality of training models on unlicensed content.
Tabnine has long been at the forefront of addressing these concerns, offering proprietary models trained on high-quality, permissively licensed code. The new model is trained exclusively on permissively licensed code, eliminating potential legal risks and maintaining high standards of data privacy. IP indemnification further mitigates any legal risks.
Historically, limiting the training data meant accepting lower-quality performance than that of models trained without those restrictions. As of today, that trade-off has been virtually eliminated. Tabnine Protected 2 represents a significant leap forward in license-compliant AI model performance, demonstrating capabilities that rival and exceed those of models such as GPT-3.5 Turbo that were trained on a significantly larger corpus of data.
“With the launch of Tabnine Protected 2, we are setting a new standard for performance and compliance in AI-driven software development,” said Peter Guagenti, President of Tabnine. “For those who have taken a more cautious approach to AI, driven by risk mitigation rather than innovation, the ‘wait and see’ period is over. Our customers can now enjoy top-tier performance without sacrificing their corporate standards, flexibility to adopt new models as they come to market, and confidence in their data privacy and legal safety.”
Developers using Tabnine can choose from a variety of LLMs to power their full suite of AI software development tools, switching models in real time to meet specific project needs. This flexibility, combined with Tabnine’s commitment to zero data retention, makes it the only AI code assistant that offers such comprehensive control and privacy.
Tabnine Protected 2 Features:
- Superior Performance and Improved Code Accuracy: Evaluations on the HumanEval and MultiPL-E benchmarks show a higher pass@1 score than other models, and real-world tests found a higher acceptance rate, indicating superior performance and immediate value for users.
- Rigorous Data Privacy: The model can be deployed in any environment, including on-premises, in a VPC, via secure SaaS, or even in fully air-gapped settings.
- Improved Performance Using Context Through RAG: Uses retrieval-augmented generation (RAG) to draw on deep context from a company’s code and standards, delivering more relevant and personalized responses.
- Extensive Language Support: Supports over 600 programming languages and frameworks, dramatically improving upon its predecessor, which supported 80+ languages.
- Zero Data Retention: Tabnine never stores or shares customer code, and guarantees that customer code is not used to train public models.
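For readers unfamiliar with the pass@1 metric cited in the feature list above, the following is a minimal sketch of how pass@k is commonly estimated on benchmarks such as HumanEval: for each problem, n code samples are generated, c of them pass the unit tests, and the estimate is 1 - C(n-c, k) / C(n, k), averaged over all problems. The function names and the sample counts below are hypothetical illustrations, not Tabnine's evaluation code or results.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed the unit tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any k-subset contains a pass.
        return 1.0
    # Probability that a random k-subset contains at least one passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k: int) -> float:
    """Average the per-problem estimates over (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical (n, c) counts for four problems, for illustration only.
results = [(10, 3), (10, 0), (10, 10), (10, 5)]
print(f"pass@1 = {benchmark_pass_at_k(results, 1):.3f}")  # prints "pass@1 = 0.450"
```

With k = 1 the estimate reduces to the fraction of samples that pass, so a higher pass@1 means the model's first suggestion is correct more often.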
Tabnine Protected 2 is available to all Tabnine users at no additional cost. Developers can access the new model by updating their IDE plugins. For those new to Tabnine, a 90-day free trial of Tabnine Pro is available.
For more details on Tabnine Protected 2 benchmarks and additional features, see Tabnine’s blog.