Increasingly, the AI industry is moving toward generative AI models with longer context windows. But models with large context windows tend to be computationally intensive. Or Dagan, head of product at AI startup AI21 Labs, asserts that doesn’t have to be the case, and his company is releasing a production model to prove it.
Contexts, or context windows, refer to the input data (e.g. text) that a model considers before generating output (more text). Models with small context windows tend to forget the content of even very recent conversations, while models with larger contexts avoid this pitfall and, as an added bonus, better grasp the flow of the data they take in.
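To make that “forgetting” concrete, here’s a minimal Python sketch of what happens when a conversation outgrows a small window: only the most recent messages that fit are ever passed to the model. The whitespace token count and the fit_to_context helper are illustrative stand-ins, not any real model’s API.

```python
# Minimal sketch of why small context windows "forget": only the most
# recent input that fits in the window ever reaches the model.
# Whitespace splitting stands in for a real subword tokenizer here.

def fit_to_context(messages, window_tokens):
    kept, used = [], 0
    for msg in reversed(messages):   # walk backward from the newest message
        n = len(msg.split())         # crude stand-in for a token count
        if used + n > window_tokens:
            break                    # everything older gets dropped
        kept.append(msg)
        used += n
    return list(reversed(kept))

history = [
    "first message laying out the project scope",
    "a long middle message " * 20,
    "latest question that refers back to the scope",
]
# With a small window, the model never sees the original scope.
print(fit_to_context(history, window_tokens=30))
```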
AI21 Labs’ Jamba, a new text generation and analysis model, can perform many of the same tasks that models such as OpenAI’s ChatGPT and Google’s Gemini do. Trained on a combination of public and proprietary data, Jamba can write text in English, French, Spanish and Portuguese.
Jamba can handle up to 140,000 tokens while running on a single GPU with at least 80GB of memory (such as a high-end Nvidia A100). That translates to about 105,000 words or 210 pages — a decent-sized novel.
Meta’s Llama 2, by comparison, has a 32,000-token context window, on the smaller side by today’s standards, but only requires a GPU with ~12GB of memory to run. (Context windows are typically measured in tokens, which are chunks of raw text and other data.)
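Those word and page figures rest on rough conversion rules. A quick back-of-the-envelope check, assuming roughly 0.75 English words per token and about 500 words per page (common approximations, not numbers from AI21), reproduces them:

```python
# Back-of-the-envelope check of the figures above, using the common
# rules of thumb of ~0.75 English words per token and ~500 words per
# page (approximations, not numbers from AI21).
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500

jamba_tokens = 140_000
words = jamba_tokens * WORDS_PER_TOKEN   # 105,000 words
pages = words / WORDS_PER_PAGE           # 210 pages
print(f"{words:,.0f} words, {pages:,.0f} pages")
```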
On its face, Jamba is unremarkable. There are many freely available, downloadable AI models, from Databricks’ recently released DBRX to the aforementioned Llama 2.
But what makes Jamba unique is what’s under the hood. It uses a combination of two model architectures: transformers and state space models (SSMs).
Transformers are the architecture of choice for complex reasoning tasks, powering models such as GPT-4 and Google’s Gemini, for example. They have many unique features, but the defining characteristic of transformers is their “attention mechanism”: for each piece of input data (e.g. a sentence), transformers weigh the relevance of every other input (other sentences) and draw from them to produce the output (a new sentence).
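As a rough illustration of that mechanism, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation inside transformer layers (learned query, key and value projections are omitted for brevity). It also shows why long contexts get expensive: the score matrix grows quadratically with input length.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a (positions, dims) array."""
    # Relevance scores: how much each position attends to every other one.
    scores = x @ x.T / np.sqrt(x.shape[-1])
    # Softmax turns scores into weights that sum to 1 for each position.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # The output mixes all inputs, weighted by relevance.
    return weights @ x

# Toy example: 4 input positions with 8-dimensional embeddings.
x = np.random.default_rng(0).normal(size=(4, 8))
print(self_attention(x).shape)  # (4, 8)
# The scores matrix is (positions x positions), so compute and memory
# grow quadratically with input length.
```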
SSMs, on the other hand, combine several qualities of older types of AI models, such as recurrent neural networks and convolutional neural networks, to create a more computationally efficient architecture capable of handling long sequences of data.
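That efficiency comes from the recurrent form. A minimal sketch of a discrete linear state space model, with toy matrices that are not drawn from Mamba or Jamba, shows the key property: a fixed-size state is updated once per input, so cost grows linearly with sequence length.

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Discrete linear SSM: x_t = A @ x_{t-1} + B * u_t, y_t = C @ x_t."""
    x = np.zeros(A.shape[0])      # fixed-size state summarizing the past
    ys = []
    for u_t in u:                 # one cheap update per input step
        x = A @ x + B * u_t
        ys.append(C @ x)
    return np.array(ys)

# Toy values: a 2-dimensional state reading a 1-D input sequence.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])        # state transition (decaying memory)
B = np.array([1.0, 0.5])          # input projection
C = np.array([0.5, 0.5])          # output readout
y = ssm_scan(A, B, C, np.sin(np.linspace(0, 3, 10)))
print(y.shape)  # (10,) -- cost grows linearly with sequence length
```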
SSMs have their limitations. But some early incarnations, including an open source model called Mamba from Princeton and Carnegie Mellon researchers, can handle larger inputs than their transformer-based equivalents while outperforming them on language generation tasks.
Jamba in fact uses Mamba as part of its core model, and Dagan claims it delivers three times the throughput on long contexts compared to transformer-based models of comparable size.
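Schematically, a hybrid of this kind might interleave a few attention layers among mostly SSM-style layers. The sketch below is purely illustrative; the stubbed layer internals and the one-attention-layer-in-four ratio are assumptions, not AI21’s published design.

```python
import numpy as np

# Purely illustrative: interleave occasional attention layers among
# mostly SSM-style layers. Layer internals are stubbed, and the 1-in-4
# ratio is an assumption, not AI21's published design.

def ssm_layer(x):
    # Linear-time recurrence over the sequence (stubbed as a running mean).
    return np.cumsum(x, axis=0) / np.arange(1, len(x) + 1)[:, None]

def attention_layer(x):
    # Quadratic-time all-pairs mixing (same toy self-attention as above).
    scores = x @ x.T / np.sqrt(x.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x

def hybrid_forward(x, num_layers=8, attention_every=4):
    for i in range(num_layers):
        layer = attention_layer if i % attention_every == 0 else ssm_layer
        x = layer(x)
    return x

x = np.random.default_rng(0).normal(size=(16, 32))  # 16 tokens, 32 dims
print(hybrid_forward(x).shape)  # (16, 32)
```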
“While there are a few initial academic examples of SSM models, this is the first commercial-grade, production-scale model,” Dagan told TechCrunch. “This architecture, in addition to being innovative and interesting for further research by the community, opens up great efficiency and throughput possibilities.”
Now, while Jamba has been released under the Apache 2.0 license, an open source license with relatively few usage restrictions, Dagan stresses that it’s a research release not intended for commercial use. The model has no safeguards to prevent it from generating toxic text, nor mitigations to address potential bias; a fine-tuned, ostensibly “safer” version will be made available in the coming weeks.
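For anyone who wants to experiment with the research release, loading it should look much like any other Hugging Face checkpoint. The sketch below assumes the transformers library and a repo ID of ai21labs/Jamba-v0.1; check AI21’s model page for the actual ID and hardware requirements.

```python
# Assumes the Hugging Face transformers library; the repo ID below is
# an assumption -- check AI21's model page for the actual checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",       # use the checkpoint's native precision
    device_map="auto",        # spread weights across available GPUs
    trust_remote_code=True,   # may be needed for custom architectures
)

inputs = tokenizer("In the distant future,", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
# Note: this research release has no safety tuning, so outputs are unfiltered.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```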
But Dagan claims that Jamba demonstrates the promise of the SSM architecture even at this early stage.
“The added value of this model, both because of its size and because of its innovative architecture, is that it can be easily fitted onto a single GPU,” he said. “We believe performance will further improve as Mamba receives additional tweaks.”