Groq, a leading artificial intelligence (AI) and machine learning (ML) systems innovator, last week announced it has adapted LLaMA, Meta's new large language model (LLM) for chatbot-style text generation and a proposed alternative to ChatGPT, to run on its systems.
Meta, the parent company of Facebook®, released LLaMA, which chatbots can use to generate human-like text, on February 24th. Three days later the Groq team downloaded the model, and within a few more days had it running on a production GroqNode™ server containing eight GroqChip™ inference processors. This is rapid time-to-functionality: a bring-up of this kind can take a larger team of engineers weeks to months, while Groq completed it with a small group from its compiler team.
Jonathan Ross, CEO and founder of Groq, said, “This speed of development at Groq validates that our generalizable compiler and software-defined hardware approach is keeping up with the accelerating pace of LLM innovation, something traditional kernel-based approaches struggle with.”
The rapid LLaMA bring-up is a noteworthy milestone because Meta researchers originally developed LLaMA for NVIDIA™ chips. By running a cutting-edge model on their own silicon, Groq engineers demonstrated GroqChip as a ready-to-use alternative to the incumbent technology. Generative AI is carving out a place for itself in the market, and as transformers continue to accelerate the pace of LLM development, customers will need solutions that offer tangible time-to-production advantages and reduce developer complexity for fast iteration.
Bill Xing, Tech Lead Manager, ML Compiler at Groq, said, “The complexity of computing platforms is permeating into user code and slowing down innovation. Groq is reversing this trend. Since we’re working on models that were trained on Nvidia GPUs, the first step of porting customer workloads to Groq is removing non-portable code targeted at specific vendors and architectures. This might include replacing vendor-specific kernel calls, removing manual parallelism or memory semantics, etc. The resulting code ends up looking a lot simpler and more elegant. Imagine not having to do all that ‘performance engineering’ in the first place to achieve stellar performance! This also helps by not locking a business into a specific vendor.”
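To make the kind of cleanup described in the quote concrete, here is a hypothetical, simplified sketch in plain Python. None of these function names come from Groq, Meta, or any vendor library; the "custom kernel" branch merely stands in for a hand-written, device-specific fast path, and the portable version is what remains once that performance engineering is stripped away and left to a compiler.

```python
# Hypothetical illustration (not actual Groq or Meta code) of removing
# vendor-specific performance engineering from model code.

# --- Before: hand-tuned, vendor-specific style ---
def attention_scores_tuned(q, k, custom_kernel=None):
    # In real GPU code, this branch would dispatch to a hand-written
    # device kernel and manage device placement and memory manually.
    if custom_kernel is not None:
        return custom_kernel(q, k)
    # Fallback path duplicating the same math for other hardware.
    return [[sum(qi * ki for qi, ki in zip(qrow, krow)) for krow in k]
            for qrow in q]

# --- After: portable, framework-level code ---
def attention_scores_portable(q, k):
    # One plain matrix product; scheduling and parallelization are
    # left entirely to the compiler for whatever hardware is underneath.
    return [[sum(qi * ki for qi, ki in zip(qrow, krow)) for krow in k]
            for qrow in q]
```

The portable version expresses only the mathematics, which is the property that lets a generalizable compiler target it without per-vendor branches.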
If you would like to discuss your AI strategy and solutions with a technology expert at Groq, please reach out to contact@groq.com. For press inquiries about this story or Groq technology please contact pr-media@groq.com.
Visit AITechPark for cutting-edge Tech Trends around AI, ML, Cybersecurity, along with AITech News, and timely updates from industry professionals!