01.AI, a Beijing-based generative AI unicorn founded by global AI thought leader Dr. Kai-Fu Lee, announced the successful development of a new type of vector database based on a fully navigable graph. 01.AI’s vector database, named “Descartes,” has topped the rankings in six dataset evaluations on the mainstream ANN-Benchmarks.
In the Large Language Model (LLM) technology stack, a vector database is an important component for connecting external information with LLMs. The Descartes vector database will serve as a layer of 01.AI’s AI infrastructure software, powering the company’s AI products, and will also be made available to developers as a tool in the future.
In offline tests on the global evaluation platform ANN-Benchmarks, 01.AI’s Descartes vector database achieved the top score on six datasets, with significant performance improvements over other industry players; on some datasets the improvement exceeds 2x.
Vector Databases Emerge as Infrastructure for GenAI, Gaining Capital Attention
Vector databases, also known as information retrieval technologies of the AI era, are one of the core technologies of retrieval-augmented generation (RAG). With the expanded capabilities of LLMs, the volume of multimodal unstructured data such as images, videos, and texts has increased exponentially, differing from traditional databases used to handle structured data. Vector databases are specifically designed to store, manage, query, and retrieve vectorized unstructured data. They function as an external memory disk that can be called upon by LLMs at any time to form a “long-term memory.” For developers of large model applications, vector databases are a crucial infrastructure that, to a certain extent, affects the performance of large models.
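At its core, the store-and-retrieve role described above can be sketched in a few lines. The following toy brute-force store is illustrative only (not 01.AI’s implementation, which uses an approximate graph index rather than a linear scan):

```python
import math

class TinyVectorStore:
    """Minimal brute-force vector store: cosine-similarity top-k lookup.
    Illustrative only; production systems use approximate indexes."""

    def __init__(self):
        self.vectors = []  # list of (embedding, payload) pairs

    def add(self, vector, payload):
        self.vectors.append((vector, payload))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def query(self, vector, k=3):
        # Score every stored vector and return the k most similar payloads.
        scored = [(self._cosine(vector, v), p) for v, p in self.vectors]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [p for _, p in scored[:k]]
```

An LLM application would embed a user query, call `query`, and splice the returned payloads into the prompt, which is the “long-term memory” pattern in miniature.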
LLMs have four commonly known deficiencies that vector databases are able to address:
- Real-time Information: Large models take a long time to train and update slowly, making it difficult to reflect the latest information, hence the knowledge “cut-off date” challenge. Vector databases use a lightweight update mechanism to quickly supplement the latest information from the web.
- Privacy Protection: Users’ sensitive data should not be fed directly into LLM training. A vector database acts as an intermediary during inference, supplying sensitive information at query time without baking it into model weights.
- Hallucination Correction: The phenomenon of distorted reasoning or hallucinations often exhibited by LLMs can be effectively corrected and mitigated through the rich knowledge references provided by vector databases.
- Inference Efficiency: The cost of inference for LLMs is high. Vector databases can serve as a caching mechanism, preventing the models from re-executing complex inference calculations for every query request, thereby greatly saving computational resources.
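The caching idea in the last bullet can be sketched as follows. `SemanticCache` and its similarity threshold are hypothetical names for illustration, not part of any product: before invoking the LLM, the application checks whether a sufficiently similar query was already answered.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Cache LLM answers keyed by query embedding; reuse an answer when a
    new query's embedding is close enough to a cached one."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def lookup(self, embedding):
        best = max(self.entries, key=lambda e: cosine(embedding, e[0]),
                   default=None)
        if best and cosine(embedding, best[0]) >= self.threshold:
            return best[1]  # cache hit: the LLM call is skipped entirely
        return None         # cache miss: fall through to real inference

    def store(self, embedding, answer):
        self.entries.append((embedding, answer))
```

On a hit, the expensive forward pass is avoided; the threshold trades answer reuse against the risk of serving a stale or mismatched response.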
The generative AI technological and platform transformation further strengthens the role of vector databases. Products from tech giants such as Google, Microsoft, and Meta have been introduced, and startups like Zilliz, Pinecone, Weaviate, and Qdrant have emerged. In 2023, Pinecone, a partner of OpenAI in vector databases, completed a Series B funding round of $138 million, and Fabarta ArcNeural, a Chinese startup, also completed a Pre-A round of around $15 million.
01.AI’s Vector Database Achieves Top Rankings on ANN-Benchmarks, Paving Way to Advanced RAG
01.AI’s Descartes vector database has achieved first place in all six dataset evaluations of ANN-Benchmarks, which showcases the performance of different algorithms across various real-world datasets.
The benchmark graphs cover six evaluation datasets: glove-25-angular, glove-100-angular, sift-128-euclidean, nytimes-256-angular, fashion-mnist-784-euclidean, and gist-960-euclidean. The horizontal axis represents recall, and the vertical axis represents QPS (queries per second, the number of requests processed per second); the closer a curve lies to the upper-right corner, the better the algorithm’s performance. The 01.AI Descartes vector database ranks highest on all six datasets.
Throughput (QPS) is a crucial metric for evaluating the query processing capabilities of information retrieval systems such as search engines and databases. Compared with the previous top performer on the benchmark, the 01.AI Descartes vector database achieves significant improvements: more than 2x on some datasets, and a 286% lead on the gist-960-euclidean dataset.
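ANN-Benchmarks-style evaluation reduces each run to these two numbers, recall and QPS. A minimal sketch of how they are computed, assuming a generic `search_fn` rather than any specific database:

```python
import time

def recall_at_k(approx_ids, true_ids):
    """Fraction of the true top-k neighbors that approximate search found."""
    return len(set(approx_ids) & set(true_ids)) / len(true_ids)

def measure_qps(search_fn, queries):
    """Queries per second achieved by search_fn over a batch of queries."""
    start = time.perf_counter()
    for q in queries:
        search_fn(q)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed if elapsed > 0 else float("inf")
```

Approximate indexes expose tuning knobs that trade recall for QPS; sweeping those knobs produces the recall-vs-QPS curves shown in the benchmark graphs.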
RAG (Retrieval-Augmented Generation) is a technique that combines retrieval and generation, enhancing the generative capabilities of LLMs by retrieving relevant information from vast amounts of data. Like traditional retrieval methods, RAG vector retrieval primarily addresses two issues:

- reducing the set of candidate vectors examined during retrieval by building suitable indexing structures, and
- reducing the cost of the distance computation for each individual vector.
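To make the first issue concrete, here is a toy inverted-file (IVF) index, one common way to shrink the candidate set: vectors are bucketed by their nearest centroid, and a query scans only the buckets of its closest centroids. This is an illustrative sketch of the general idea, not Descartes’ graph-based approach.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ToyIVF:
    """Bucket vectors by nearest centroid; search only `nprobe` buckets."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def _nearest_centroids(self, v, n=1):
        order = sorted(range(len(self.centroids)),
                       key=lambda i: dist(v, self.centroids[i]))
        return order[:n]

    def add(self, vid, vector):
        c = self._nearest_centroids(vector)[0]
        self.buckets[c].append((vid, vector))

    def search(self, query, k=1, nprobe=1):
        candidates = []
        for c in self._nearest_centroids(query, nprobe):
            candidates.extend(self.buckets[c])  # only these buckets are scanned
        candidates.sort(key=lambda item: dist(query, item[1]))
        return [vid for vid, _ in candidates[:k]]
```

With `nprobe` well below the number of buckets, the exact-distance computation runs over a small fraction of the data, which is precisely the candidate-set reduction the first bullet describes.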
The 01.AI Descartes vector database demonstrates significant advantages over industry peers in handling complex queries, improving retrieval efficiency, and optimizing data storage. To address the first issue, the 01.AI team applies two key strategies:
- Advanced fully navigable graph technology. The industry currently relies mainly on methods such as hashing, KD-trees, and VP-trees, which yield imprecise navigation and insufficient pruning. The global multi-layer thumbnail navigation technology developed by 01.AI, with its coordinate-system navigation, maintains precision while pruning a large number of irrelevant vectors.
- Innovative adaptive neighbor selection strategy, filling a gap in the industry. 01.AI’s self-developed adaptive neighbor selection strategy breaks through the limitations of relying solely on exact top-k or fixed-edge selection strategies. The new strategy lets each node dynamically select its best neighbor edges based on its own and its neighbors’ distribution characteristics, accelerating convergence toward the target vector and improving RAG vector retrieval performance by 15%-30%.
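Descartes’ actual construction and edge-selection algorithms are proprietary, but graph-based ANN search in general navigates from an entry point toward the query by repeatedly hopping to the neighbor closest to it. A minimal greedy-walk sketch of that navigation idea:

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def greedy_graph_search(graph, coords, entry, query):
    """Greedy walk on a neighbor graph: starting at `entry`, move to the
    neighbor closest to `query` until no neighbor improves the distance.
    `graph` maps node -> list of neighbor nodes; `coords` maps node -> vector."""
    current = entry
    while True:
        best = min(graph[current],
                   key=lambda n: dist(coords[n], query),
                   default=None)
        if best is None or dist(coords[best], query) >= dist(coords[current], query):
            return current  # local minimum reached: best candidate found
        current = best
```

The quality of the neighbor edges decides how quickly such a walk converges and whether it gets stuck in a poor local minimum, which is why edge-selection strategies like the one described above matter so much for recall and QPS.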
Full Stack Vector Technology: Higher Precision, Faster Performance
Supported by a full stack of vector technology, the 01.AI Descartes vector database also exhibits the core advantages of higher precision and stronger performance in real application scenarios.
Currently, 01.AI’s Descartes vector database focuses on the high-performance segment. High-performance vector databases typically target datasets of tens of millions of vectors or fewer (for example, 20 million 128-dimensional floating-point vectors). Such databases can easily handle 80%-90% of everyday scenarios, such as helping enterprise customers build private-domain knowledge bases and intelligent customer service systems. In the field of autonomous driving, they can be used to accelerate model training for self-driving cars.
The 01.AI high-performance vector database has the following advantages:
- Ultra-high precision: Using multi-layer thumbnails and coordinate-system navigation for layer and orientation navigation, while guaranteeing graph connectivity, it achieves precision above 99%, significantly leading the industry in accuracy at the same performance level.
- Ultra-high performance: Efficient edge-selection and pruning technologies deliver millisecond response times on databases of millions of vectors.
Take a large-platform e-commerce recommendation scenario as an example. The number of items on the shelf can be in the tens of millions, with each item represented by a vector. Even with a bounded number of vectors in the database, the system faces performance pressure during peak hours, when user requests can reach hundreds of thousands or even millions of QPS. A high-performance vector database can effectively improve recommendation quality in e-commerce search and advertising, keeping shoppers engaged.
01.AI’s Descartes vector database is the first component of the team’s RAG technology stack to launch. Its capabilities will be applied in the company’s AI productivity consumer product launching soon, and it will also be made available to developers as a tool in the future.
Visit AITechPark for cutting-edge Tech Trends around AI, ML, Cybersecurity, along with AITech News, and timely updates from industry professionals!