I wanted to create some unique codes for work, but they had to make some sense — e.g. combinations of city names — so I figured 2,000 codes would need a good context window and gave the task to Gemini 2.5 Pro on Google AI Studio. I asked it to create 2k codes, but it only produced 1,410; after a follow-up prompt it said it was generating the remaining 590, but actually created around 700 more.
I gave the same prompts in the same sequence to GPT-4o on a Plus plan, and it gave me a single CSV containing all 2,000 codes. With Gemini, by contrast, I had to download two separate text files.
The best part: the codes from GPT-4o had no duplicates, while Gemini's output had 5–6 repeated ones.
If you're curious:
Gemini token count: 21,299 / 1,048,576
Have you had similar experiences where GPT came out ahead?
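For anyone wanting to verify this kind of output themselves, the count and duplicate check takes only a few lines. A minimal Python sketch — the code list here is a made-up example, and you'd load your own CSV or text files instead:

```python
from collections import Counter

def check_codes(codes):
    """Return (total count, dict of duplicated codes -> occurrence count)."""
    counts = Counter(codes)
    dupes = {code: n for code, n in counts.items() if n > 1}
    return len(codes), dupes

# Hypothetical sample; in practice read the model's CSV/text output,
# e.g. codes = open("codes.txt").read().split()
codes = ["OSLO-PARIS", "ROME-CAIRO", "OSLO-PARIS", "LIMA-OSAKA"]
total, dupes = check_codes(codes)
print(total, dupes)  # 4 {'OSLO-PARIS': 2}
```

A check like this would have flagged the 5–6 repeats in Gemini's output immediately.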
Gemini 2.5 vs ChatGPT 4o – Tested on a Real Renovation Project (with Results)
I recently compared Gemini 2.5 Pro and ChatGPT 4o on a real apartment renovation (~75 m²). I gave both models the same project scope — the FFU (förfrågningsunderlag, the Swedish tender documentation) — for a full interior renovation: flooring, kitchen, bathroom, electrical, demolition, waste handling, and so on.
The renovation is already completed — so I had a final cost to compare against.
🟣 ChatGPT 4o:
Instantly read and interpreted the full FFU
Delivered a structured line-by-line estimate using construction pricing standards
Required no extra prompting to include things like demolition, site management, waste and post-cleanup
Estimated within ~3% of the final project cost
Felt like using a trained quantity surveyor
🔵 Gemini 2.5 Pro:
Initially responded with an estimate of 44,625 SEK for the entire renovation
After further clarification and explanations (things ChatGPT figured out without help), Gemini revised its estimate to a range of 400,000–1,000,000 SEK
The first estimate was off by over 90%
The revised range was more realistic but too wide to be useful for budgeting or offer planning
Struggled to identify FFU context or apply industry norms without significant guidance
🎯 Conclusion
Both models improved when fed more detail — but only one handled the real-life FFU correctly from the start. ChatGPT 4o delivered an actionable estimate nearly identical to the renovation's actual cost.
Gemini was responsive and polite, but just not built for actual estimating.
Curious if others working in construction, architecture or property dev have run similar tests? Would love to hear your results.
EDIT:
Some have asked whether this was just a lucky guess by ChatGPT — a totally fair question.
But in this case, it wasn't just a language model guessing from internet data. I provided both ChatGPT and Gemini with a PDF export of AMA Hus 24 / Wikells — a professional Swedish construction pricing system used by contractors. Think of it as a trade-specific estimation catalog (labor, materials, overhead, etc.).
ChatGPT used that source directly to break down the scope and price it professionally. Gemini had access to the exact same file but didn't apply it in the same way.

So this was a real test of reasoning with professional tools.