Author: Duan Xiaocao
Link: https://www.zhihu.com/question/1936601885311239615/answer/1936983438256244569
Source: Zhihu. Copyright belongs to the author. For commercial use, please contact the author for authorization; for non-commercial use, please credit the source.

Review: GPT-5 returns to the top, but it is not the AGI everyone expected.

The most important points first:

1. Accessibility: the barrier to cutting-edge intelligence has been lowered again.

All users can access GPT-5, and even free users can experience it to a limited extent (beyond the limit, it falls back to GPT-5-mini). Sam Altman has fulfilled his promise from February, lowering the barrier to experiencing cutting-edge intelligence; this deserves respect. The rumor circulating yesterday that GPT-5 would cost $200 has been confirmed as fake news.

2. Pricing: lower than previous GPT/o generations, closely competing with Gemini 2.5.

Three API variants are offered: GPT-5, mini, and nano (gpt-5-pro has no API), priced as follows:

- GPT-5: $1.25 per million input tokens, $10.00 per million output tokens
- GPT-5 Mini: $0.25 per million input tokens, $2.00 per million output tokens
- GPT-5 Nano: $0.05 per million input tokens, $0.40 per million output tokens

Frankly, this pricing is highly competitive: lower than GPT-4.1/GPT-4o, on par with Gemini 2.5, lower than Claude Sonnet 4, and far below Claude Opus 4. In other words, GPT-5 matches Gemini 2.5 on price while shifting the current "Pareto frontier" of models upward.

3. Performance: back to SOTA, but without an epoch-making advantage.

On LMArena, GPT-5 took first place in every category and in the overall ranking, reclaiming the top spot from Gemini 2.5 Pro. However, the lead is not large. Gemini 3! Gemini 3! Gemini 3!
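To make the price comparison concrete, here is a small cost calculator using the per-million-token rates listed above. The prices come from the announcement as quoted in this article; the helper function and the example token counts are my own illustration, not an OpenAI tool.

```python
# Per-million-token prices (USD) for the three GPT-5 API variants,
# as listed in the announcement: (input_price, output_price).
PRICES = {
    "gpt-5":      (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one API call for a given variant."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 10k-token prompt producing a 2k-token answer on each variant.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 10_000, 2_000):.4f}")
```

At these rates the same request costs roughly 25x less on nano than on full GPT-5, which is the spread the three-variant lineup is built around.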
Moreover, GPT-5's core this time is fusion and built-in routing, with the internal model relationships as shown in the diagram.

4. Revealing the true identity of previously anonymous models.

The "summit" model on LMArena is GPT-5. "Horizon" on OpenRouter is part of the GPT-5 family (though it is still unclear whether it is GPT-5 or mini).

Practical testing. GPT-5 is not yet available in ChatGPT, so for now I am running side-by-side tests on LMArena (vs. Gemini 2.5 Pro).

"Generate an SVG of a pelican riding a bicycle"
GPT-5 (essentially consistent with the earlier Horizon test, arguably the best among current models):
Gemini 2.5 Pro (looks a bit too fat):

"Draw a Pikachu playing basketball"
GPT-5 (features mostly accurate):
Gemini 2.5 Pro (expressive, but...):

Code generation. https://gpt-examples.com/ collects many official prompts and programming examples. Take one prompt at random and compare with Claude.
This is GPT-5's example:
This is Claude Sonnet 4's result:
All the necessary features are present; the blue-purple gradient is a clear tell for Claude, while GPT-5's page elements may be more appealing.

API. In addition to showcasing example prompts and outputs as above (making it easy for users to copy and test directly), OpenAI has written dedicated documentation for using GPT-5.

Variants and best use cases:
- gpt-5: complex reasoning, broad world knowledge, and code-intensive or multi-step agent tasks
- gpt-5-mini: cost-optimized reasoning and chat; balances speed, cost, and capability
- gpt-5-nano: high-throughput tasks, especially simple instruction following or classification

New features and parameters in the API: the reasoning.effort parameter controls the number of reasoning tokens the model generates before producing a response.
Earlier reasoning models like o3 only supported low, medium, and high:
- low prioritizes response speed and fewer tokens
- high emphasizes more comprehensive reasoning

The new minimal setting generates very few reasoning tokens, for scenarios that need the fastest time-to-first-token. Compared with generating none at all, letting the model produce a minimal number of reasoning tokens when necessary typically yields better performance. The default is medium.

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="How much gold would it take to coat the Statue of Liberty in a 1mm layer?",
    reasoning={"effort": "minimal"},
)

print(response)
```

ARC-AGI-2 released GPT-5's results; it did not surpass Grok 4's score and ranks second.

Today's live demo had multiple issues: two charts were wrong (the published blog is fine, but the presentation clearly was not reviewed, which is very unprofessional), and one demo case was criticized for violating the laws of physics.

First error:
Correct chart:
Second error (left side: a bar for 50 drawn shorter than one for 47.4):
Correct chart:
Third error, regarding the airplane-wing lift demo:

In summary: GPT-5's overall capability is indeed currently the strongest, and considering accessibility and cost-effectiveness it can be called the most usable model today, but its advantage is not overwhelming. It lost to Grok 4 on ARC-AGI-2, its coding ability may not consistently beat Claude 4, and writing style depends heavily on personal preference (for example, I personally prefer the earlier Grok 3 and the current Gemini 2.5 Pro, and have not used ChatGPT as my primary tool for about three months). Regardless, GPT-5 has finally been released. I feel somewhat let down: it is not as groundbreaking as imagined, more like a model that integrates GPT-4o and o3, reduces hallucinations, updates the dataset, and catches up to the current leading tier.
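Pulling the API details above together, here is a hypothetical helper that picks a variant and a reasoning.effort level per task type. The mapping is my own reading of OpenAI's suggested use cases for the three variants; the task categories and the chooser itself are illustrative, not part of the OpenAI API.

```python
# Hypothetical task-to-model routing table, based on the suggested
# use cases above. Illustrative only; not an official OpenAI API.
ROUTES = {
    "agent":          ("gpt-5",      "high"),     # multi-step, code-intensive work
    "chat":           ("gpt-5-mini", "medium"),   # balance of speed, cost, capability
    "classification": ("gpt-5-nano", "minimal"),  # high-throughput, simple tasks
}

def choose(task: str) -> dict:
    """Return kwargs for a responses.create() call for a given task type."""
    model, effort = ROUTES.get(task, ("gpt-5", "medium"))  # default to full model
    return {"model": model, "reasoning": {"effort": effort}}

print(choose("classification"))
```

A call site would then look like `client.responses.create(input=prompt, **choose("chat"))`, which keeps the cost/latency policy in one place instead of scattered across call sites.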
Of course, since it has only just launched, we need to use it more, compare more, and watch community feedback before we can really judge how good GPT-5 is. Edited on 2025-08-08 09:49