Musk’s xAI Unveils Grok-3: Plenty of Power, but Is It Breaking New Ground?


Grok-3, developed by Elon Musk's xAI, was unveiled on Monday, with the company making bold claims about its capabilities and showcasing a massive computing infrastructure that signals even greater ambitions.

The announcement focused largely on raw computational muscle, benchmark results, and upcoming features, although many of the live demonstrations seemed to repeat what other artificial intelligence companies have already achieved.

The star of the opening part of the show was not the AI itself but “Colossus,” a behemoth cluster of 200,000 GPUs that powers the training of Grok-3.

The system came together in two phases: 122 days of synchronous training on 100,000 GPUs, followed by 92 days of scaling up to 200,000. According to xAI's developers, building this infrastructure proved more difficult than developing the AI model itself.

The company already has plans for an even more powerful cluster, with Musk saying it will have five times the current capacity, effectively creating what would be the most powerful GPU cluster on Earth.

When it comes to performance, Grok-3 posts impressive results on standard AI benchmarks. The base model (a regular model without built-in chain-of-thought reasoning) consistently tops the charts in mathematics (AIME), science (GPQA), and coding (LCB).

It also looks very promising in blind tests.

xAI confirmed that the mystery model code-named “Chocolate” was actually an early test version of Grok-3 that had been uploaded to the LLM Arena.

During those tests, it achieved the highest Elo score among all LLMs, meaning that users preferred its answers to the generations of every other AI model in head-to-head comparisons, without knowing which model they were evaluating.

This is probably the most accurate way to measure quality, because it gives models no way to game the tests by training on the benchmark datasets. The ranking is based solely on the preferences and blind votes of thousands of anonymous users.
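To make the ranking idea concrete, here is a minimal sketch of how blind head-to-head votes can be turned into Elo-style ratings. It uses the classic Elo update rule; the model names, the K-factor of 32, and the starting rating of 1000 are illustrative assumptions, and the arena's actual scoring method may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings: dict, winner: str, loser: str,
               k: float = 32.0, initial: float = 1000.0) -> None:
    """Apply one blind vote: the winner gains what the loser gives up."""
    r_win = ratings.get(winner, initial)
    r_lose = ratings.get(loser, initial)
    gain = k * (1.0 - expected_score(r_win, r_lose))
    ratings[winner] = r_win + gain
    ratings[loser] = r_lose - gain


# Hypothetical votes: each tuple is (preferred model, other model).
votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
    ("model_a", "model_b"),
]

ratings = {}
for preferred, other in votes:
    update_elo(ratings, preferred, other)

# Highest rating first.
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

An upset win against a higher-rated model moves the ratings more than an expected win does, which is why a newcomer that keeps winning blind comparisons climbs quickly.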

The xAI team shows Grok-3 benchmark results during a live presentation. Image: xAI

The specialized “beta” version of Grok-3, which uses internal chain-of-thought processing and additional test-time compute, pushes math scores even higher, reaching 93% on the AIME 2025 benchmark, compared with other top models that score below 87%.

Interestingly, a smaller version called Grok-3 Mini Beta sometimes outperforms its bigger sibling, thanks to a longer training time.

In other words, the full-size Grok-3 still has room to improve once it receives a comparable amount of training, which looks promising given its larger parameter count.

But when xAI moved on to demonstrating Grok-3's live capabilities, the presentation felt more like a game of catch-up than innovation. The team showed the model solving physics problems and writing game code from scratch, impressive feats that ChatGPT, Claude, and Google Gemini mastered some time ago.

New tools, old tricks

They also introduced DeepSearch, a research agent that, like similar tools from OpenAI and Google, searches the internet and generates comprehensive reports on given topics.

X Premium Plus subscribers get immediate access to Grok-3, but the most powerful version and the latest updates will generally live in a dedicated standalone app or on Grok.com.

Voice interactions similar to OpenAI's “Advanced Voice Mode” will arrive in the coming weeks, with Musk stressing that this is not simple text-to-speech but a genuine AI voice model capable of natural, expressive speech.

Developers will gain access to the API in the coming weeks, along with audio transcription capabilities, which makes Grok-3 a powerful tool for third-party AI applications.

Right after showing an example of a Tetris game generated by Grok, xAI also talked about plans for an AI gaming studio that will let developers create games powered by Grok-3.

Right now, the model is rolling out slowly. At the time of writing, Decrypt had not yet gained access to the model, but some enthusiasts have tried it and are so far pleased with the results.

Computer scientist Lex Fridman, one of the most high-profile voices in the AI space, praised Grok-3's capabilities.

I've gotten to use Grok 3 (early) extensively. My mind is blown, a very impressive model 🤯 Congratulations to Elon and the team on pulling it off 👊

– Lex Fridman (@lexFridman) February 18, 2025

Others compared it to the leading competitors on the market.

“Grok 3 + Thinking feels somewhere around the state-of-the-art territory of OpenAI's strongest models (o1-pro, $200/month), and slightly better than DeepSeek-R1 and Gemini 2.0 Flash Thinking,” OpenAI co-founder Andrej Karpathy wrote in an extensive post on X.

I was given early access to Grok 3 earlier today, making me, I think, one of the first to run a quick vibe check.

Thinking
✅ First, Grok 3 clearly has a state-of-the-art thinking model (“Think”), and it did great out of the box on my Settlers of Catan … Pic.twitter.com/qiruan1ife

– Andrej Karpathy (@karpathy) February 18, 2025

X user Penny2x shared a game built from scratch with Grok-3: a 2D platformer similar to Mario Bros.

The user was impressed by Grok's ability to understand instructions and improve over several iterations.

“I just keep asking for tweaks, and it keeps spitting out the game in a single file that I can drop on my desktop and run,” the user wrote in a post on X. “This is incredible. We are living in the future. Everyone is a developer now.”

The game is available for testing.

The company also confirmed plans to open-source Grok-2 once Grok-3 is fully mature and stable, which is expected to happen in the coming months.

xAI has open-sourced older models before, and doing the same with Grok-2 would continue its practice of releasing earlier versions to stimulate innovation, although Grok-2 lags behind top-tier models.

For now, Grok-3 looks capable and roughly on par with what the best AI models can already do.

The real test will come when xAI delivers its promised voice features, gaming tools, and API access in the coming weeks. The ball is now in the court of OpenAI, which is set to release GPT-4.5 soon.

Edited by Sebastian Sinclair