Look back a year or so and the term LLM was almost always accompanied by the explanatory phrase "large language model." Now, the capabilities of LLMs are not only widely understood but are becoming the bedrock of one of the most important technology revolutions in history.
However, far from becoming a routine, predictable event, the launch of every new LLM brings with it a new prospect for innovation and a degree of specialization that opens GenAI up to an ever-larger roster of use cases. One of the most exciting recent launches was DBRX, an open, general-purpose LLM from Databricks.
What is DBRX?
While DBRX is a general-purpose LLM, it does have a USP: it pushes the boundaries of open LLMs, giving enterprises the means to build their own models with robust capabilities previously available only through closed models. DBRX uses a mixture-of-experts (MoE) architecture, meaning the model is composed of multiple expert sub-networks, with only a subset of them activated for any given input, which makes inference more efficient. The model was trained on 12T tokens of text and code data.
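To make the MoE idea concrete, below is a minimal sketch of a top-k routed MoE layer in PyTorch. This is illustrative only, not DBRX's implementation; Databricks describes DBRX as a fine-grained MoE that activates 4 of its 16 experts per token, and the layer sizes here are placeholder values.

```python
# Minimal sketch of a top-k routed mixture-of-experts (MoE) layer.
# Illustrative only; not DBRX's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 16, top_k: int = 4):
        super().__init__()
        self.top_k = top_k
        # Learned router: scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is an independent feed-forward sub-network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Route each token to its top-k experts.
        gate_logits = self.router(x)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = indices[:, k] == e  # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * self.experts[e](x[mask])
        return out
```

The key property is sparsity: all experts' parameters exist in memory, but each token only pays the compute cost of its top-k experts. Per Databricks, roughly 36B of DBRX's 132B total parameters are active on any given input.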
How Does it Compare to Other LLMs?
In terms of performance, Databricks says that DBRX surpasses GPT-3.5 and is comparable with Gemini 1.0 Pro and Mistral Medium. Among its greatest strengths are programming and mathematics; Databricks says it outperforms other open LLMs in these areas, even CodeLLaMA-70B, a specialized model built for programming.
Beyond this, DBRX is one of the leading models on retrieval-augmented generation (RAG) tasks, where it is competitive with open models such as Mixtral Instruct and LLaMA2-70B Chat, as well as with the closed GPT-3.5 Turbo.
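For context, RAG grounds a model's answers in retrieved documents rather than relying on its parametric knowledge alone. Below is a minimal sketch of the retrieve-then-generate pattern; `embed` and `llm_complete` are hypothetical stand-ins for a real embedding model and a chat endpoint such as DBRX Instruct, and the retriever is a toy cosine-similarity search, not Databricks' benchmark setup.

```python
# Minimal sketch of the retrieve-then-generate pattern behind RAG.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical embedding function; swap in a real embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(128)

def llm_complete(prompt: str) -> str:
    """Hypothetical completion call; wire this to DBRX Instruct or another LLM."""
    raise NotImplementedError("connect to your model endpoint")

documents = [
    "DBRX uses a mixture-of-experts architecture.",
    "DBRX was trained on 12T tokens of text and code.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def answer(question: str, k: int = 1) -> str:
    q = embed(question)
    # Rank documents by cosine similarity and keep the top-k as context.
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    context = "\n".join(documents[i] for i in np.argsort(scores)[::-1][:k])
    return llm_complete(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```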
Customer Impact
Both the DBRX base model, DBRX Base, and the fine-tuned version, DBRX Instruct, are available through Hugging Face under an open license, and both can be downloaded from the Databricks Marketplace. Databricks customers can also use the Databricks Mosaic AI Foundation Model APIs to build GenAI applications without needing to maintain a model deployment.
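For those who want to try the open weights directly, loading DBRX Instruct with the Hugging Face transformers library might look like the sketch below. The repo id databricks/dbrx-instruct and the trust_remote_code requirement reflect the model card at launch; check the card for current details, and note that the full-precision weights demand substantial GPU memory.

```python
# Sketch: load DBRX Instruct from Hugging Face and run one chat turn.
# Assumes access to the repo and enough GPU memory for the weights.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "databricks/dbrx-instruct", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    device_map="auto",      # shard across available GPUs
    torch_dtype="auto",     # use the dtype stored in the checkpoint
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "What is a mixture-of-experts model?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```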
As well as being a best-in-class example of a general-purpose open LLM, DBRX is a powerful advertisement for Databricks technology. Why? Because DBRX was built using the same Databricks tools that are available to its customers.
Databricks used its Unity Catalog for the management and governance of the training data, while data exploration was handled via Lilac AI, a company Databricks recently acquired.
Databricks handled data cleansing and processing through Apache Spark and Databricks notebooks. Training was carried out using the open-source libraries MegaBlocks, LLM Foundry, Composer, and Streaming, and was orchestrated via the Mosaic AI Training service.
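Of these libraries, Streaming is perhaps the simplest to illustrate. The minimal sketch below streams pre-sharded training data from object storage into a standard PyTorch DataLoader; the bucket path is hypothetical, and this is a generic usage pattern rather than DBRX's actual training configuration.

```python
# Sketch: stream sharded training data from object storage with the
# open-source `streaming` (MosaicML Streaming) library.
from streaming import StreamingDataset
from torch.utils.data import DataLoader

dataset = StreamingDataset(
    remote="s3://my-bucket/train",   # hypothetical remote shard location
    local="/tmp/streaming-cache",    # local cache for downloaded shards
    shuffle=True,                    # shuffle samples during iteration
)
loader = DataLoader(dataset, batch_size=32)

for batch in loader:
    ...  # feed batches to the training loop
```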
Mosaic AI played a pivotal role in collecting feedback for improvements through Mosaic AI Model Serving and Inference Tables, while manual experimentation was carried out through Databricks Playground.
Amid the ongoing battle for GenAI dominance, an organization that can develop an open model competitive with the leading closed alternatives, and can do so using its own proprietary technology suite, certainly stands out from the crowd.