I wrote an in-depth comparison of Grok 3 against GPT-4, Google Gemini, and DeepSeek V3. Thought I'd share some key takeaways:
Grok 3 excels in reasoning and coding tasks, outperforming others in math benchmarks like AIME.
Its "Think" and "Big Brain" modes are impressive for complex problem-solving.
However, it falls short in real-time data integration compared to Google Gemini.
The $40/month subscription might be a dealbreaker for some users.
Each tool has its strengths: GPT-4 for creative writing, Gemini for real-time search, and DeepSeek for efficiency.
The choice really depends on your specific needs. For instance, if you're doing a lot of coding or mathematical work, Grok 3 might be worth the investment. But if you need up-to-the-minute info, Gemini could be a better fit.
For those interested, I've got a more detailed breakdown here: https://aigptjournal.com/explore-ai/ai-guides/grok-3-vs-other-ai-tools/
What's your experience with these AI tools? Any features you find particularly useful or overrated?
If you want to see the full post with video demos, here is the full X thread: https://x.com/alex_prompter/status/1892299412849742242
1/ Quantum entanglement
Prompt I used:
"Explain the concept of quantum entanglement and its implications for information transfer."
Expected Answer:
- Particles remain correlated over distance
- Cannot transmit information faster than light
- Used in quantum cryptography and teleportation
Results:
🥇 DeepSeek R1: Best structured answer; explained Bell's theorem, the EPR paradox, and practical applications
🥈 Grok 3: Solid explanation but less depth than DeepSeek R1; included Einstein's "spooky action at a distance"
🥉 ChatGPT o3-mini: Gave a basic overview but lacked technical depth
Winner: DeepSeek R1
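The "correlated but can't signal" point lends itself to a quick numeric check. Under the standard quantum prediction for a spin singlet, the correlation between measurements at angles a and b is E(a, b) = -cos(a - b), and the CHSH combination of four such correlations reaches 2√2, above the classical bound of 2 - this is the Bell's-theorem result DeepSeek R1 brought up. A minimal sketch:

```python
import math

def correlation(a, b):
    # Quantum-mechanical prediction for a spin singlet:
    # measuring along angles a and b gives E(a, b) = -cos(a - b)
    return -math.cos(a - b)

# Standard CHSH measurement angles (radians) that maximize the violation
a1, a2 = 0.0, math.pi / 2
b1, b2 = math.pi / 4, 3 * math.pi / 4

S = abs(correlation(a1, b1) - correlation(a1, b2)
        + correlation(a2, b1) + correlation(a2, b2))

print(f"CHSH S = {S:.3f}")  # 2*sqrt(2) ~ 2.828, above the classical bound of 2
```

Each side's individual outcome statistics stay 50/50 no matter which angle the other side picks, which is exactly why this correlation can't be used to send information faster than light.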
2/ Renewable Energy Research (Past Month)
Prompt I used:
"Summarize the latest renewable energy research published in the past month."
Expected Answer:
- Identify major energy advancements in the last month
- Cite sources with dates
- Cover solar, wind, hydrogen, and policy updates
Results:
🥇 DeepSeek R1: Most comprehensive. Covered solar, wind, AI in energy forecasting, and battery tech with solid technical insights
🥈 Grok 3: Focused on hydrogen storage, solar on reservoirs, and policy changes but lacked broader coverage
🥉 ChatGPT o3-mini: Too vague; provided country-level summaries but lacked citations and specific studies
Winner: DeepSeek R1
3/ Universal Basic Income (UBI) Economic Impact
Prompt I used:
"Analyze the economic impacts of Universal Basic Income (UBI) in developed countries."
Expected Answer:
- Cover effects on poverty, employment, inflation, and government budgets
- Mention real-world trials (e.g., Finland, Alaska)
- Balance positive & negative impacts
Results:
🥇 Grok 3: Best structured answer. Cited Finland's trial and the Alaska Permanent Fund, and analyzed taxation effects
🥈 DeepSeek R1: Detailed but dense. Good breakdown of pros/cons, but slightly over-explained
🥉 ChatGPT o3-mini: Superficial; no real-world trials or case studies
Winner: Grok 3
4/ Physics Puzzle (Marble & Cup Test)
Prompt I used:
"Assume the laws of physics on Earth. A small marble is put into a normal cup and the cup is placed upside down on a table. Someone then takes the cup and puts it inside the microwave. Where is the ball now? Explain your reasoning step by step."
Expected Answer:
- The marble falls out of the cup when it's lifted
- The marble remains on the table, not in the microwave
Results:
🥇 DeepSeek R1: Thought the longest but nailed the physics, explaining gravity and friction correctly
🥈 Grok 3: Solid reasoning but overcomplicated the explanation with excessive detail
🥉 ChatGPT o3-mini: Incorrect. Claimed the marble stays in the cup despite gravity
Winner: DeepSeek R1
5/ Global Temperature Trends (Last 100 Years)
Prompt I used:
"Analyze global temperature changes over the past century and summarize key trends."
Expected Answer:
- ~1.5°C warming since 1925
- Clear acceleration post-1970
- Cooling period 1940–1970 due to aerosols
Results:
🥇 Grok 3: Best structured answer. Cited NASA, IPCC, and NOAA; provided real anomaly data, historical context, and a timeline
🥈 DeepSeek R1: Strong details but lacked citations. Good analysis of regional variations & Arctic amplification
🥉 ChatGPT o3-mini: Basic overview with no data or citations
Winner: Grok 3
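For anyone who wants to sanity-check the "acceleration post-1970" claim themselves, the arithmetic is just a least-squares slope computed over two sub-periods. A minimal Python sketch - note the anomaly values below are hypothetical placeholders only roughly shaped like the published record, not real measurements:

```python
# Illustrative decadal global-mean temperature anomalies in °C.
# NOTE: hypothetical placeholder values, not real data.
anomalies = {
    1925: -0.20, 1935: -0.15, 1945: 0.00, 1955: -0.10, 1965: -0.10,
    1975: 0.00, 1985: 0.10, 1995: 0.40, 2005: 0.60, 2015: 0.90, 2025: 1.20,
}

def trend(points):
    """Ordinary least-squares slope, returned in °C per decade."""
    n = len(points)
    xm = sum(x for x, _ in points) / n
    ym = sum(y for _, y in points) / n
    num = sum((x - xm) * (y - ym) for x, y in points)
    den = sum((x - xm) ** 2 for x, _ in points)
    return 10 * num / den  # x is in years, so scale the slope to per-decade

early = [(y, t) for y, t in anomalies.items() if y <= 1970]
late = [(y, t) for y, t in anomalies.items() if y > 1970]
print(f"pre-1970 trend:  {trend(early):+.3f} °C/decade")
print(f"post-1970 trend: {trend(late):+.3f} °C/decade")
```

Even with made-up numbers of the right general shape, the post-1970 slope comes out several times steeper than the earlier one, which is the pattern the prompt was testing for.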
Final Scoreboard
🥇 DeepSeek R1: 3 Wins
🥈 Grok 3: 2 Wins
🥉 ChatGPT o3-mini: 0 Wins
DeepSeek R1 is the overall winner, but Grok 3 dominated in citation-based research.
Let me know what tests you want me to run next!
Here's a review of Deep Research - this is not a request.
So I have a very, very complex case regarding my employment and starting a business, as well as European government laws and grants. The kind of research that's actually DEEP!
So I tested 4 Deep Research AIs to see who would effectively collect and provide the right, most pertinent, and most correct response.
TL;DR: ChatGPT blew the others out of the water. I am genuinely shocked.
Ranking:
1. ChatGPT: Posed very pertinent follow-up questions. Took much longer to research. Then gave a very well-formatted response, with each section and element speaking specifically to my complex situation: appropriate calculations, proposing and ruling out options, and providing comparisons. It was basically a human assistant. (I'm not on Pro, by the way - just standard Plus.)
2. Grok: Far more succinct answer, but also useful and *mostly* correct, except for one error I noticed (one I had actually made myself). Not as customized as ChatGPT, but still tailored to my situation.
3. DeepSeek: Even more succinct and shorter in the answer (a bit too short) - but extremely effective and again mostly correct except for one noticed error (different error). Very well formatted and somewhat tailored to my situation as well, but lacked explanation - it was just not sufficiently verbose or descriptive. Would still trust somewhat.
4. Gemini: Biggest disappointment. Extremely long word salad blabber of an answer with no formatting/low legibility that was partially correct, partially incorrect, and partially irrelevant. I could best describe it as if the report was actually Gemini's wordy summarization of its own thought process. It wasted multiple paragraphs on regurgitating what I told it in a more wordy way, multiple paragraphs just providing links and boilerplate descriptions of things, very little customization to my circumstances, and even with tailored answers or recommendations, there were many, many obvious errors.
How do I feel? Personally, I love Google and OpenAI, am agnostic about DeepSeek, and am not hot on Musk. So I'm extremely disappointed by Google, very happy with OpenAI, have no strong reaction to DeepSeek (wasn't terrible, wasn't amazing), and am pleasantly surprised by Grok (credit where credit is due).
I have used all of these Deep Research AIs for many other things, but often my ability to assess their results was limited. Here, I have a deep understanding of a complex international subject matter - laws, finances, government departments, personal circumstances, and whatnot - so this was the first time the difference was glaringly obvious.
What does this mean?
I will 100% go to OpenAI for future Deep Research needs, and it breaks my heart to say I'll be avoiding this version of Gemini's Deep Research completely - hopefully they get their act together. I'll use the others for short, sweet, fast answers.
Which is best for general information, scientific subjects, language learning, solving maths exercises, etc.?