Mamba Mentality

The Big Idea: Mamba models outperform transformers

Every time you got your mind blown by ChatGPT last year—or any other chatbot—that was generative AI on a transformer model. The next phase of mind-blowing might come from an altogether different architecture.

Mamba models are a new kind of neural network architecture that outperforms transformer models for one main reason: they have something like a working memory, which transformers don’t. Unlike transformers, mamba models don’t have a limited context window and can, in principle, handle arbitrarily long inputs. We might not see a mamba chatbot for a while, so we can’t experience this for ourselves yet, but the AI whizzes behind the scenes are very fired up.

Mamba, by the way, is a jaunty nickname for “selective structured state space sequence model.” All those alliterative s-sounds made someone think of a snake. Hence, mamba.
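
To make the “working memory” idea concrete, here is a toy sketch in Python of a selective state-space recurrence. It is not the actual Mamba implementation: the function name selective_scan, the decay rates, and the readout weights are all made-up assumptions for illustration. The only point is that the model carries a fixed-size state forward and updates it with each new input, instead of looking back over a stored window of past tokens.

```python
import math


def softplus(x):
    """Numerically stable softplus; always returns a positive number."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def selective_scan(inputs, state_size=4):
    """Toy selective state-space recurrence over a sequence of floats.

    Returns (outputs, final_state). The state is a fixed-length list
    no matter how long `inputs` is.
    """
    # Hypothetical fixed parameters; a real model would learn these.
    decay_rates = [0.5 * (i + 1) for i in range(state_size)]
    readout = [1.0 / state_size] * state_size

    state = [0.0] * state_size
    outputs = []
    for x in inputs:
        # "Selective": the step size depends on the current input, so the
        # update decides how much old memory to keep versus overwrite.
        step = softplus(x)
        for i, rate in enumerate(decay_rates):
            keep = math.exp(-rate * step)  # fraction of old memory kept, in (0, 1)
            state[i] = keep * state[i] + (1.0 - keep) * x
        # Each output is a weighted read of the current memory.
        outputs.append(sum(r * s for r, s in zip(readout, state)))
    return outputs, state


short_outputs, short_state = selective_scan([0.1] * 10)
long_outputs, long_state = selective_scan([0.1] * 10_000)
print(len(short_state), len(long_state))  # the memory is the same size both times
```

Whether it reads 10 inputs or 10,000, the state stays the same size, which is why, in principle, input length isn’t capped by a context window.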

It takes tens of billions of parameters to make a good chatbot. Right now, mamba models haven’t been scaled past 3 billion parameters or so, and nobody has figured out how to do distributed training, at least not yet.

But as Dan Balsam, an AI leader at RippleMatch who led our webinar on the state of prompt engineering, explains, much of the hard problem has been solved. The architecture is trainable and theoretically scalable. And once people figure out how to do distributed training simultaneously on a thousand chips, we’ll get to the level of scale that a chatbot requires.

“State space architectures are very likely going to play a big role in the next level of AI advancement,” Balsam said. It might be that the most powerful agents arrive when we find a way to combine the long-range dependencies (memory) of a mamba with the short-range effectiveness of transformers.

Basically, mamba models are better at cramming for a test, unlike ChatGPT, whose “memory” seems to peter out right when you need it to remember the instructions you gave it twenty prompts ago.

This might mean another step towards AGI—artificial general intelligence. Though we’re still not sure exactly why or how. “In empirical AI capabilities research,” Balsam said, “people just try things and they're like, oh, that worked. And then they spend years trying to figure out why it worked.”

Click here to share this issue of Build Mode with your team. Missed last week’s issue? Read it here.

AROUND THE WATERCOOLER

The information technology sector added a net of only 700 jobs in the U.S. last year, compared to 267,000 the year prior. Yikes.

This is mostly because of an estimated 262,242 layoffs, plus pandemic-era overhiring, plus AI-driven cuts at the entry level. And yet there are still some 88,000 roles open, according to tech consultancy Janco Associates.

Our own research suggests that these numbers might not be telling the full story as more top-tier tech workers shift to the independent work model. Last year, 62% of knowledge workers said they didn’t feel secure committing to one employer anymore.

CHART OF THE WEEK

Predictions for AI Progress Just Got Way More Aggressive

A team of researchers from Berkeley and elsewhere surveyed 2,778 researchers who had published top-tier AI papers, asking for their predictions on the pace of AI advancement.

The findings were freaky. “If science continues undisrupted, the chance of unaided machines outperforming humans in every possible task was estimated at 10% by 2027, and 50% by 2047.”

Other predictions include:

AI will be better than humans at writing NYT-bestselling fiction by 2030 (instead of 2039 as predicted last year)
AI will be able to originate publishable math theorems by 2050-ish
Full automation of all human labor, however, will have to wait till about 2120

On the doom side of things, between 37.8% and 51.4% of respondents gave at least a 10% chance to advanced AI leading to outcomes as bad as… human extinction. Fingers crossed!

EVENTS

Generative AI Salon: The ROI of AI

What are you doing on January 31st?

Join us at the forefront of AI transformation as we convene to explore the transformative power of Generative AI in enhancing business productivity and efficiency.

Beyond the thought-provoking discussions from our C-suite panels, it's an opportunity to forge meaningful connections with peers, dive into the latest AI research, and hear real-life strategies from pioneers in the field.

Register Here

DISCOVERY ZONE

A new startup called Rabbit partnered with Teenage Engineering to release the R1, a tiny little orange box with a touch screen and an analog scroll wheel that acts as an AI “companion.” The launch video is pretty magical. Watch it here.
