Its bet in the AI race is to bring several functions together in a single model
The artificial intelligence race is often told as a competition to see who builds the most powerful model, or the one that dominates the most benchmarks. In the middle of that board, the French startup Mistral AI has just presented Mistral Small 4, a proposal that tries to occupy a different place in that conversation. It is not presented as a model limited to a single function, but as one that, according to the company, seeks to bring together several advanced capabilities within the same tool.

What exactly is Small 4?

The company presents it as the next major iteration of its Mistral Small family and, above all, as the first model of the house to bring together capabilities that were previously distributed among several lines. Specifically, it integrates functions associated with Magistral, Pixtral and Devstral along with those of the Small series itself.

Fewer models, more features.

One of the central ideas of the announcement is to concentrate in a single system tasks that are normally solved with different tools. According to Mistral, the goal is for the same model to be able to converse, analyze complex information, work with images or assist in programming without having to switch between several specialized systems.

The numbers behind Small 4.

The model is based on a Mixture of Experts architecture, a design that distributes processing among different specialized submodels and that appears today in several artificial intelligence systems. In the case of Small 4, Mistral indicates that the system has 128 experts and that only four participate in generating each token. According to the company, the model reaches 119B total parameters, with 6B active per token, and offers a context window of up to 256k tokens.

Who is this model intended for?

Beyond its architecture, Mistral also describes quite clearly the scenarios in which it imagines Small 4 being used. Let's see.
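That gap between 119B total parameters and 6B active per token comes from sparse routing: a small router scores every expert for each token and only the best few are actually computed. The sketch below is a generic illustration of this mechanism, not Mistral's implementation; the hidden size and router weights are made up, and only the 128-expert / 4-active figures come from the announcement.

```python
import numpy as np

# Illustrative Mixture-of-Experts routing sketch (not Mistral's code):
# a router scores all experts for a token and only the top-k are run,
# so most parameters stay idle on any given token.
NUM_EXPERTS = 128   # total experts reported for Small 4
TOP_K = 4           # experts active per token, per the announcement

def route_token(hidden, router_weights, k=TOP_K):
    """Return indices and normalized weights of the k selected experts."""
    scores = hidden @ router_weights            # one score per expert
    top = np.argsort(scores)[-k:][::-1]         # indices of the k best experts
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                    # softmax over selected experts only
    return top, weights

rng = np.random.default_rng(0)
hidden = rng.standard_normal(64)                 # toy hidden state
router = rng.standard_normal((64, NUM_EXPERTS))  # toy router matrix
experts, weights = route_token(hidden, router)

print(len(experts))                    # 4 experts fire for this token
print(len(experts) / NUM_EXPERTS)      # fraction of experts actually computed
```

The point of the exercise: with 4 of 128 experts firing, only about 3% of the expert pool is computed per token, which is why the active parameter count can sit far below the total.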
• Developers: automating programming tasks, exploring code bases and powering agentic coding workflows.
• Businesses: conversational assistants, document understanding and multimodal analysis.
• Research: mathematics, complex analysis and reasoning tasks.

The underlying idea is that the model can move between quite different needs without forcing users to switch systems depending on the type of work.

The charts.

In the material accompanying the announcement, Mistral includes several charts comparing Small 4 with other models across different benchmarks. These comparisons are not limited to the score obtained in each test: they also show the average length of the responses each system generates, a figure the company uses to illustrate how much text each model needs to produce to achieve certain results.

One of the charts in the announcement corresponds to the AA LCR benchmark, where Mistral compares the scores of various models and the average length of the responses they generate to solve the same tasks. The data published by the company are the following:

• Mistral Small 4: 0.72 score with 1,600 characters
• GPT-OSS 120B: 0.51 with 2,500 characters
• Claude Haiku: 0.80 with 2,700 characters
• Qwen3-next 80B: 0.75 with 5,800 characters
• Qwen3.5 122B: 0.84 with 5,700 characters

The comparison.

Small 4 is not the highest-scoring model: both Claude Haiku and the Qwen models rank above it on that indicator. However, Mistral highlights another aspect of the comparison: the length of the responses. According to the company, its model achieves its combination of score and output length by generating significantly less text than several of its competitors, something it links to lower latency and lower inference cost.

The short answer trick.

A shorter answer is not better simply because it takes up less space. It is only better if it solves the task with a level of quality comparable to that of a longer answer.
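The trade-off Mistral is pointing at can be made concrete by dividing each score by the length of the output that produced it. The figures below are the ones published in the announcement; the score-per-1,000-characters ratio is our own illustrative calculation, not a metric Mistral reports.

```python
# Score vs. average output length from Mistral's AA LCR chart
# (published figures; the "efficiency" ratio is an illustrative
# calculation of our own, not something Mistral publishes).
results = {
    "Mistral Small 4": (0.72, 1600),
    "GPT-OSS 120B": (0.51, 2500),
    "Claude Haiku": (0.80, 2700),
    "Qwen3-next 80B": (0.75, 5800),
    "Qwen3.5 122B": (0.84, 5700),
}

efficiency = {
    name: score / (chars / 1000)   # score points per 1,000 characters
    for name, (score, chars) in results.items()
}

for name, eff in sorted(efficiency.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {eff:.3f}")
```

On this (admittedly crude) ratio, Small 4 comes out ahead of every model in the chart, including the two that beat it on raw score: 0.72 over 1,600 characters is a denser result than 0.84 over 5,700.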
This is where Mistral tries to put the focus: if a model achieves a competitive result by generating less text, it can respond faster, consume fewer resources and reduce the cost of inference. In other words, the advantage is not in being more concise for its own sake, but in needing less output to reach a useful result.

How to access the new model.

Small 4 can be used not only via the API and AI Studio. Published under the Apache 2.0 license, it is also positioned as an open model that can be downloaded, fine-tuned and deployed in your own environments. The company adds that it can be tried for free at build.nvidia.com, and that it is also offered for production as an NVIDIA NIM.

Images | Mistral