🌐
Reddit
reddit.com › r › grok
Grok
March 10, 2012 - Grōk https://discord.gg/4VXMtaQHk7 · Created Mar 10, 2012 · Public · Anyone can view, post, and comment to this community 844K 13K · 1 · Please use correct AI art / Text flairs · 2 · Pretty self explanatory. Mark it as NSFW. 3 · Includes things like Grok is censored, Grok cheap in India, Grok system prompt, etc.
r/singularity on Reddit: Grok 3 results are live on LiveBench
January 8, 2025 - For example, while benchmarks might show that version 3.7 outperforms 3.5 in Aider and Livecode, some users still prefer 3.5. They feel it's a better programming partner, even if the raw numbers say otherwise. ... I mean yea, human preference is human preference. But that’s what lmarena is for. Preference. This is a post about LiveBench and traditional benchmarks. I haven’t used it outside the chat interface either, excited to try it in Cursor. But I reach for a lot of other models before Grok, even in the chat window.
r/grok on Reddit: Grok 3 Live Update
March 22, 2024 -
  • 10x more compute than Grok-2, ensuring faster and more accurate results.

  • Advanced Reasoning Model: Elevate your problem-solving with AI that thinks deeper and smarter.

  • DeepSearch: Grok-3's unique DeepSearch is xAI's first agent capable of comprehensive, in-depth internet searches.

Enhanced Capabilities:

  • Big Brain: Leverages significant computational power to tackle complex problems effectively.

  • Premium+ Access: Grok-3 will be initially available exclusively to Premium+ users.

Q&A:

  • Grok-3 Reasoning API will be available in the coming weeks.

  • Grok-3's voice will be native, capable of audio to text transcription.

  • Understands tone and emotions for more intuitive interactions.

  • Once Grok-3 is fully released, xAI will open source Grok-2.

r/accelerate on Reddit: People are seriously downplaying the performance of Grok 3
February 18, 2025 -

I know we all have ill feelings about Elon, but can we seriously not take one second to assess its performance objectively?

People are like "Well, it is still worse than o3". We do not have access to that yet, and it uses insane amounts of compute. Grok 3's pre-training only stopped a month ago, so there is still a lot of potential to train the thinking models to exceed o3. Then there is "Well, it uses 10-15x more compute, and it is barely an improvement, so it is actually not impressive at all". This is untrue for three reasons.
Firstly, Grok 3 is definitely a big step up from Grok 2.
Secondly, scaling has always been very compute-intensive. There is a reason that intelligence was not a winning evolutionary trait for a long time: it was, and still is, expensive. If we could predictably get performance improvements like this for every 10-15x scaling in compute, then we would have superintelligence in no time, especially considering that three scaling paradigms now stack on top of each other: pre-training, post-training and RL, and inference-time compute.
Thirdly, if you look at the LLaMA paper, in 54 days of training with 16,000 H100s they had 419 component failures, and the small xAI team is training on roughly 100-200 thousand H100s for much longer. This is actually quite an achievement.
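
A back-of-envelope sketch of the third point, using only the figures quoted above (419 failures, 54 days, 16,000 GPUs, and a 100-200 thousand GPU cluster). It assumes failures scale linearly with GPU count, which is a simplifying assumption made here for illustration and not something the post claims; the function names are hypothetical.

```python
def failures_per_day(failures: int, days: int) -> float:
    """Average number of component failures per day of training."""
    return failures / days


def scaled_failures_per_day(base_rate: float, base_gpus: int, target_gpus: int) -> float:
    """Extrapolate a failure rate to a larger cluster.

    Assumption: failures grow linearly with the number of GPUs.
    """
    return base_rate * target_gpus / base_gpus


# Figures quoted in the post for the LLaMA training run.
base_rate = failures_per_day(failures=419, days=54)

# Hypothetical extrapolation to the cluster sizes mentioned for xAI.
low = scaled_failures_per_day(base_rate, base_gpus=16_000, target_gpus=100_000)
high = scaled_failures_per_day(base_rate, base_gpus=16_000, target_gpus=200_000)

print(f"baseline: {base_rate:.1f} failures/day")          # about 7.8
print(f"100k GPUs: {low:.1f} failures/day (if linear)")   # about 48.5
print(f"200k GPUs: {high:.1f} failures/day (if linear)")  # about 97.0
```

Under that linear assumption, a cluster of this size would see a component failure every 15 to 30 minutes on average, which is the scale of operational problem the post is pointing at.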

Then people are also like "Well, GPT-4.5 will easily destroy this any moment now". Maybe, but I would not be so sure. The base Grok 3 performance is honestly ludicrous and people are seriously downplaying it.

When Grok 3 is compared to other base models, it is way ahead of the pack. People have got to remember that the difference between the old and new Claude 3.5 Sonnet was only 5 points in GPQA, and this is 10 points ahead of Claude 3.5 Sonnet New. You have also got to consider that the (controversial) maximum of GPQA Diamond is 80-85 percent, so a non-thinking model is getting close to saturation. Then there is Gemini-2 Pro. Google released this just recently, and they are seriously struggling to get any increase in frontier performance on base models. Then Grok 3 just comes along and pushes the frontier ahead by many points.

I feel like part of why the insane performance of Grok 3 is not recognized more is because of thinking models. Before thinking models, performance increases like this would have been absolutely astonishing, but now everybody is just meh. I also would not count out the Grok 3 thinking model getting ahead of o3, given its great performance gains while still being in really early development.

The Grok 3 mini base model is approximately on par with all the other leading base models, and you can see its reasoning version actually beating Grok 3. More importantly, the performance is actually not too far off o3. o3 still has a couple of months until it gets released, and in the meantime we can definitely expect Grok 3 reasoning to improve a fair bit, possibly even beating it.

Maybe I'm just overestimating its performance, but I remember when I tried the new Sonnet 3.5, and even though a lot of its performance gains were modest, it really made a difference, and was/is really good. Grok 3 is an even more substantial jump than that, and none of the other labs have created such a strong base model. Google is especially struggling with further base-model performance gains. I honestly think this seems like a pretty big achievement.

Elon is a piece of shit, but I thought this at least deserved some recognition. Not all people on the xAI team are necessarily bad people, even though it would be better if they moved to other companies. Nevertheless, this should at least push the other labs forward in releasing their frontier capabilities, so it is gonna get really interesting!

r/Futurology on Reddit: Grok 3 launch LIVE Updates: Grok3 now available
March 21, 2024 - Elon Musk's AI startup xAI is set to launch its Grok 3 chatbot today, with Musk calling it the “smartest AI on Earth.”
r/OpenAI on Reddit: Grok 3 & Grok 3 THINK Tested: Initial Impressions
February 19, 2025 -

I tested both Grok 3 and Grok 3 THINK on coding, math, reasoning and common sense. Here are a few early observations:

- The non-reasoning model codes better than the thinking model

- The reasoning model is very fast, it looked slightly faster than Gemini 2.0 Flash Thinking, which in itself is quite fast

- Grok 3 THINK is very smart and approaches problems like DeepSeek R1 does, even uses "Wait, but..."

- G3-Think doesn't seem to load balance, it thinks unnecessarily long at times for easy questions, like R1 does

- Grok 3 didn't seem significantly better than existing top models like Claude 3.5 Sonnet or o3-mini, though we'll finalize testing after API access

- G3-Think is not deterministic: it failed 2 out of 3 attempts at a hard coding problem, each with different results (Exercism REST API challenge):

> Either it has a higher than normal temperature setting,

> introduces regressions in the "daily improvements" Elon Musk mentioned,

> or is load balancing different versions

> Coding Challenge GitHub repo: https://github.com/exercism/python/blob/main/exercises/practice/rest-api
> Coding Challenge: https://exercism.org/tracks/python/exercises/rest-api

- For those who just want to see the entire test suite: https://youtu.be/hN9kkyOhRX0
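
One way to put a number on the non-determinism observation above is to repeat the same prompt several times and record both the pass rate and how many distinct outputs come back. The sketch below is illustrative only: `summarize_attempts` and the `attempt` callable are hypothetical names, and the stand-in outcomes merely mirror the "failed 2 out of 3, each with different results" description rather than reproducing the poster's actual data.

```python
from typing import Callable, List, Tuple


def summarize_attempts(
    attempt: Callable[[int], Tuple[bool, str]], n: int
) -> Tuple[float, int]:
    """Run attempt(i) n times.

    Each attempt returns (passed, output). Returns the pass rate and the
    number of distinct outputs seen across all attempts.
    """
    results: List[Tuple[bool, str]] = [attempt(i) for i in range(n)]
    passes = sum(1 for passed, _ in results if passed)
    distinct_outputs = len({output for _, output in results})
    return passes / n, distinct_outputs


# Stand-in for "ask the model, then run the exercise's tests".
# A real harness would call the model and the test suite here.
fake_outcomes = [
    (True, "solution A"),
    (False, "solution B"),
    (False, "solution C"),
]

pass_rate, distinct = summarize_attempts(lambda i: fake_outcomes[i], n=3)
print(f"pass rate: {pass_rate:.2f}, distinct outputs: {distinct}")
```

A low pass rate combined with many distinct outputs is consistent with any of the three explanations listed in the post (sampling temperature, model updates, or different versions behind a load balancer); this kind of tally alone cannot tell them apart.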

What are your initial impressions of Grok 3?

r/OpenAI on Reddit: Grok 3 isn't the "best in the world" — but how xAI built it so fast Is wild
April 20, 2025 -

When Grok 3 launched, Elon hyped it up, but didn't give us solid proof that it was better than the other models. Fast forward two months: xAI has opened up its API, so we can finally see how Grok truly performs.

Independent tests show Grok 3 is a strong competitor. It definitely belongs among the top models, but it's not the champion Musk suggested it would be. Plus, in these two months, we've seen Gemini 2.5, Claude 3.7, and multiple new GPT models arrive.

But the real story behind Grok is how fast xAI's execution is:

In about six months, a company less than two years old built one of the world's most advanced data centers, equipped with 200,000 liquid-cooled Nvidia H100 GPUs.

Using this setup, they trained a model ten times bigger than any of the previous models.

So, while Grok 3 itself isn't groundbreaking in terms of performance, the speed at which xAI scaled up is astonishing. By combining engineering skill with a massive financial push, they've earned a spot alongside OpenAI, Google, and Anthropic.

See more details and thoughts in my full analysis here.

I'd really love your thoughts on this—I'm a new author, and your feedback would mean a lot!

r/singularity on Reddit: Grok 3 summary
August 15, 2024 - They fine-tuned the political compass questions too, so it scores (-0.2,-3.3), or center-libertarian. ... It is sota in most of the benchmarks they showed. I mean, they probably cherry picked benchmarks but literally every ai release does so. That's hardly criminal. Grok is first (pass1) in AIME2024, GPQA, and livecodebench.