Remember a year ago, before last November when we all learned about ChatGPT, when machine learning was about building models to solve a single task, like loan approvals or fraud protection? That approach seemed to go out the window with the rise of generalized LLMs, but the fact is that generalized models are not a good fit for every problem, and task-based models are still alive and well in the enterprise.
These task-based models, until the rise of LLMs, were the basis for most AI in the enterprise, and they are not going away. It’s what Amazon CTO Werner Vogels referred to as “old-fashioned AI” in his keynote this week, and in his view, it’s the kind of AI that still solves many real-world problems.
Atul Deo, general manager of Amazon Bedrock, the product introduced earlier this year as a way to connect to a variety of large language models via APIs, also believes that task models aren’t going away. Instead, they have become another tool in the AI arsenal.
“Before the advent of large language models, we were mostly in a task-specific world. And the idea was that you would train a model from scratch for a specific task,” Deo told TechCrunch. He says the main difference between a task model and an LLM is that one is trained for a specific task, while the other can handle things outside the bounds of the model.
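To make the contrast concrete, here is a minimal, purely illustrative sketch of what "training a model from scratch for a specific task" looks like: a one-feature fraud-style classifier fit with plain gradient descent. The feature, labels, and toy data are all invented for the example; a real task model would be trained on a company's own labeled data.

```python
import math

def train_logistic(data, epochs=2000, lr=0.5):
    """Fit a one-feature logistic regression with plain gradient descent.

    `data` is a list of (feature, label) pairs; label is 0 or 1.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = 1 / (1 + math.exp(-(w * x + b)))  # predicted probability
            # Gradient of the log-loss for this single example
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

def predict(w, b, x):
    """Return True if the model scores this example as the positive class."""
    return 1 / (1 + math.exp(-(w * x + b))) >= 0.5

# Hypothetical toy data: normalized transaction amount -> fraud label,
# where larger amounts happen to be fraudulent in this made-up set.
toy = [(0.1, 0), (0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1), (1.0, 1)]
w, b = train_logistic(toy)
print(predict(w, b, 0.15))  # small transaction
print(predict(w, b, 0.95))  # large transaction
```

The point of the sketch is the workflow, not the math: the model only ever answers this one narrow question, which is exactly what makes task models small, fast, and cheap, and also what limits them compared with a general-purpose LLM.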
Jon Turow, a partner at investment firm Madrona, who previously spent nearly a decade at AWS, says the industry has talked about emerging capabilities in large language models, such as logic and domain-free robustness. “These allow you to be able to expand beyond a narrow definition of what the model was originally expected to do,” he said. But, he added, it’s still very much up for debate how far those capabilities can go.
Like Deo, Turow says task models aren’t simply going to disappear. “There’s clearly still a role for task-specific models because they can be smaller, they can be faster, they can be cheaper and in some cases they can be even more efficient because they’re designed for a specific task,” he said.
But the lure of an all-purpose model is hard to ignore. “When you look at an aggregate level across a company, when there are hundreds of machine learning models being trained individually, that doesn’t make any sense,” Deo said. “Whereas if you go with a more capable large language model, you get the benefit of reusability right away, while allowing you to use a single model to address a bunch of different use cases.”
For Amazon, SageMaker, the company’s machine learning operations platform, remains a core product, one aimed at data scientists rather than at developers, as Bedrock is. Amazon says tens of thousands of customers use it to build millions of models. It would be foolish to abandon it, and just because LLMs are the flavor of the moment doesn’t mean the underlying technology won’t remain relevant for quite some time to come.
Enterprise software especially does not work this way. No one just throws away their significant investment because a new thing came along, even one as powerful as the current crop of large language models. It’s worth noting that Amazon announced upgrades to SageMaker this week, aimed at managing large language models.
Before these more capable large language models arrived, the task model was really the only option, and companies approached it that way, building teams of data scientists to develop these models. What is the role of the data scientist in the era of large language models, where tools are aimed at developers? Turow believes they still have a major job to do, even at LLM-focused firms.
“They’re going to be thinking critically about data, and that’s actually a role that’s growing, not shrinking,” he said. Regardless of the model, Turow believes that data scientists will help people understand the relationship between AI and data within large companies.
“I think each of us really needs to think critically about what AI is and isn’t capable of and what data does and doesn’t mean,” he said. And this is true regardless of whether you’re building a more generalized large language model or a task-specific model.
That’s why these two approaches will continue to coexist for some time, because sometimes bigger is better, and sometimes it isn’t.