Model minimalism: The new AI strategy saving companies millions


This article is part of VentureBeat’s special issue, “The real value of AI: Performance, efficiency and ROI at scale.” Read more from this special issue.

The arrival of large language models (LLMs) made it easier for enterprises to envision the kinds of projects they could undertake, spurring pilot programs that have now transitioned into production.

However, as these projects gained momentum, enterprises realized that the LLMs they had been using were unwieldy and, worse, expensive.

Enter small language models and distillation. Models like Google’s Gemma family, Microsoft’s Phi and Mistral’s Small 3.1 let organizations choose fast, accurate models that work for specific tasks. Enterprises can opt for a smaller model for a particular use case, lowering the cost of running their AI applications and potentially achieving a better return on investment.

LinkedIn distinguished engineer Karthik Ramgopal said companies opt for smaller models for a few reasons.

“Smaller models require less compute, memory and faster inference times, which translates directly into lower infrastructure opex (operational expenditures) and capex (capital expenditures), given GPU costs, availability and power requirements,” Ramgopal said. “Task-specific models have a narrower scope, making their behavior more aligned and maintainable over time without complex prompt engineering.”

Model developers price their small models accordingly. OpenAI’s o4-mini costs $1.10 per million tokens for inputs and $4.40 per million tokens for outputs, compared to $10 per million for inputs and $40 per million for outputs for the full o3.
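To make the pricing gap concrete, the arithmetic below compares per-request cost at those two published price points. The token counts per request are invented for illustration, not figures from the article:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one request at the given per-million-token prices."""
    return (input_tokens * input_price_per_m +
            output_tokens * output_price_per_m) / 1_000_000

# Hypothetical request: 2,000 input tokens, 500 output tokens.
# o4-mini-style pricing ($1.10 in / $4.40 out) vs. o3-style ($10 / $40).
small = request_cost(2_000, 500, 1.10, 4.40)
large = request_cost(2_000, 500, 10.0, 40.0)
print(f"small: ${small:.4f}  large: ${large:.4f}  ratio: {large / small:.1f}x")
```

At these prices the smaller model is roughly an order of magnitude cheaper per request, which is the gap the article's cost argument rests on.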

Enterprises today have a larger pool of small models, task-specific models and distilled models to choose from. These days, most flagship models come in a range of sizes. For example, Anthropic’s Claude family comprises Claude Opus, the largest model; Claude Sonnet, the all-purpose model; and Claude Haiku, the smallest version. These compact models are small enough to run on portable devices, such as laptops or phones.

The savings question

When discussing return on investment, though, the question is always: What does ROI look like? Should it be a return on the costs spent, or the time savings that ultimately translate into dollars saved down the line? Experts VentureBeat spoke to said ROI can be difficult to gauge because some companies believe they’ve already reached ROI when they cut the time spent on a task, while others are waiting for actual dollars saved or more business brought in to say if their AI investments have actually worked.

Normally, enterprises calculate ROI by a simple formula, as described by Cognizant chief technologist Ravi Naarla in a post: ROI = (Benefits − Costs) / Costs. But with AI programs, the benefits are not immediately apparent. He suggests enterprises estimate the benefits they expect to achieve based on historical data, be realistic about the overall cost of AI, including hiring, implementation and maintenance, and understand that they have to be in it for the long haul.
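The formula is simple enough to sketch directly. The dollar figures in the example below are invented placeholders, not numbers from the article:

```python
def roi(benefits: float, costs: float) -> float:
    """ROI = (benefits - costs) / costs, returned as a fraction (0.5 == 50%)."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return (benefits - costs) / costs

# Illustrative only: $250k of estimated benefit against $180k of total
# cost (hiring, implementation, maintenance) works out to roughly 39% ROI.
print(f"{roi(250_000, 180_000):.0%}")
```

The hard part, per Naarla, is not the arithmetic but estimating the `benefits` term realistically, since AI benefits accrue over a long horizon.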

Experts argued that small models bring those costs down, because fine-tuning them to provide more context for your enterprise reduces the cost of implementation and maintenance.

Arijit Sengupta, founder and CEO of Aible, said that how people bring context to the models dictates how much cost savings they can get. For use cases that require additional context for prompts, such as long and complex instructions, this can result in higher token costs.

“You have to give the models context one way or another; there is no free lunch. But with large models, that is usually done by putting it in the prompt,” he said. “Think of fine-tuning and post-training as an alternative way of giving models context. I might incur $100 of post-training costs, but it’s not astronomical.”

Sengupta said they have seen roughly 100X cost reductions just from post-training, often dropping the cost of model use “from single-digit millions to something like $30,000.” He noted that this figure includes software operating expenses and the ongoing cost of the model and vector databases.
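A back-of-the-envelope sketch of Sengupta’s trade-off: a one-time post-training spend versus repeatedly paying to send the same context tokens in every prompt. All figures here are assumptions for illustration, not numbers from Aible:

```python
def break_even_calls(post_training_cost: float,
                     context_tokens_per_call: int,
                     input_price_per_m: float) -> float:
    """Calls after which a one-time post-training spend pays for itself
    versus re-sending the same context tokens on every prompt."""
    per_call_context_cost = context_tokens_per_call * input_price_per_m / 1_000_000
    return post_training_cost / per_call_context_cost

# Assumed: a $100 post-training run vs. 5,000 context tokens per call
# at $1.10 per million input tokens.
calls = break_even_calls(100.0, 5_000, 1.10)
print(f"break-even after ~{calls:,.0f} calls")
```

Under these assumptions the fine-tune pays for itself after roughly 18,000 calls; any high-frequency workload clears that bar quickly, which is why the per-call context cost dominates at scale.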

“In terms of maintenance costs, if you do it manually with human experts, small models can be expensive to maintain, because they have to be post-trained to achieve results comparable to large models,” he said.

Experiments Aible conducted showed that a task-specific, fine-tuned model performs well for some use cases, just like LLMs, making the case that deploying several task-specific models rather than one large model to do everything is more cost-effective.

The company compared a post-trained version of Llama-3.3-70B Instruct to a smaller 8B-parameter option of the same model. The 70B model, post-trained for $6.30, was 84% accurate in automated evaluations and 92% in manual ones. Once fine-tuned at a cost of $4.58, the 8B model achieved 82% accuracy in manual evaluation, which would be suitable for smaller, more targeted use cases.

Cost factors fit for purpose

Right-sizing models doesn’t have to come at the cost of performance. These days, organizations understand that model choice doesn’t just mean picking between GPT-4o or Llama-3.1; it’s knowing that some use cases, like summarization or code generation, are better served by a small model.

Daniel Hoske, chief technology officer at contact center AI products provider Cresta, said starting development with LLMs better informs potential cost savings.

“You should start with the biggest model to see if what you’re envisioning even works, because if it doesn’t work with the biggest model, it doesn’t mean it would with smaller models,” he said.

Ramgopal said LinkedIn follows a similar pattern, because prototyping is the only way these issues start to surface.

“Our typical approach for agentic use cases begins with general-purpose LLMs, as their broad generalization capability allows us to rapidly prototype, validate hypotheses and assess product-market fit,” LinkedIn’s Ramgopal said. “As the product matures and we encounter constraints around quality, cost or latency, we transition to more customized solutions.”

In the experimentation phase, organizations can determine what they value most from their AI applications. Figuring this out enables developers to plan better what they want to save on and select the model size best suited to their purpose and budget.

The experts cautioned that while it’s important to build with models that work best for what they’re developing, high-parameter LLMs will always be more expensive. Large models will always require significant computing power.

However, overusing small and task-specific models also poses problems. Rahul Pathak, vice president of data and AI go-to-market at AWS, said in a blog post that cost optimization comes not just from using a model with low compute needs, but rather from matching a model to the task. Smaller models may not have a sufficiently large context window to understand more complex instructions, leading to increased workloads and higher costs for human employees.

Sengupta also cautioned that some distilled models can be brittle, so long-term use may not result in savings.

Constantly evaluate

Regardless of model size, industry players emphasized the flexibility to switch models to solve any potential problems or serve new use cases. If they start with a large model and a smaller model appears with similar or better performance at a lower cost, organizations shouldn’t be precious about their chosen model.
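One way to keep that flexibility is to route calls through a single task-to-model table, so swapping a model is a one-line configuration change rather than a rewrite of every call site. This is a minimal sketch; the model names, prices and routing scheme are hypothetical, not anything described in the article:

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class ModelChoice:
    name: str               # placeholder model identifier
    price_in_per_m: float   # $ per million input tokens, for cost tracking

# Task -> model routing table. Re-evaluating a task means editing one
# entry here, not hunting down every place the model is invoked.
ROUTES: Dict[str, ModelChoice] = {
    "summarize": ModelChoice("small-8b", 0.20),
    "complex-reasoning": ModelChoice("large-70b", 3.00),
}

def pick_model(task: str) -> ModelChoice:
    """Fall back to the large model when a task has no tuned small model yet."""
    return ROUTES.get(task, ROUTES["complex-reasoning"])

print(pick_model("summarize").name)
```

Keeping the choice behind one lookup is what lets a team adopt a cheaper small model the moment it matches the big model’s quality on a task.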

Tessa Burg, CTO and head of innovation at brand marketing company Mod Op, told VentureBeat that organizations must understand that whatever they build now will always be superseded by a better version.

“We started with the mindset that the tech underneath the workflows that we’re creating, the processes that we’re making more efficient, are going to change. We knew that whatever model we use will be the worst version of a model.”

Burg said that smaller models have helped save her company and its clients time in researching and developing concepts. Time saved, she said, leads to budget savings over time. She added that it’s a good idea to break out high-cost, high-frequency use cases to lightweight models.

Sengupta noted that vendors are now making it easier to switch between models automatically, but cautioned that users should find platforms that also facilitate fine-tuning, so they don’t incur additional costs.


