As there has been a lot of posting about the Anthropic Opus 4.6 model, I decided to get around to a full analysis of my RFP Responder, reviewing the codebase to identify gaps in a few areas:
• Application architecture
• Code quality and maintainability
• Security
• Reliability / Operations

I ran two independent Claude Code reviews of the same GitHub repository, one with Sonnet 4.5 and one with Opus 4.6, using the same "report-only" prompt, and then compared the findings.
• One thing jumped out immediately - Opus got the architecture wrong: it associated sub-agents with other sub-agents, when in fact they are all called by the Orchestrator. Sonnet was almost spot-on.
• Both missed the Docling MCP use.
• Claude "Usage" consumption was quite different - screenshots at the bottom. Sonnet: 8 mins - 45%. Opus: 6 mins - 28%.

They converged strongly on the same core risks (no real surprises given the phase of development for this app - I knew about most of these):
• Several agent services were reachable on the network with no authentication between internal components.
• Some endpoints accepted arbitrary file paths, creating a clear path traversal / local file read risk.
• There were unauthenticated destructive operations in the RAG layer (e.g., reset/clean-up).
• Basic hygiene: .env protection / risk of accidental secret exposure.

Where they differed, they provided useful nuance (some of these were really good insights):
• Sonnet put more weight on "edge" and deployment hardening (HTTPS and Apache config), and surfaced a few extra reliability/perf details (upload memory DoS pattern, RAG naming collisions, HTTP error-handling hygiene).
• Opus went deeper on operational resilience (job lifecycle / zombie jobs - this was a good one) and highlighted architectural throughput issues in the agent layer (blocking patterns that can stall a server under load).
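To make the path traversal finding concrete, here is a minimal sketch of the kind of guard that usually closes it. It assumes a Python service, which the post does not confirm, and the upload root and function name are hypothetical rather than taken from the RFP Responder codebase.

```python
from pathlib import Path

# Hypothetical upload root; the real application's directory layout is not described here.
ALLOWED_ROOT = Path("/srv/rfp-responder/uploads")


def resolve_safe_path(user_supplied: str, root: Path = ALLOWED_ROOT) -> Path:
    """Resolve a caller-supplied path and reject anything that lands outside root."""
    root_resolved = root.resolve()
    # Joining an absolute path discards the root, and ".." segments can climb out of it,
    # so the check must run on the fully resolved result, not on the raw string.
    candidate = (root_resolved / user_supplied).resolve()
    if not candidate.is_relative_to(root_resolved):  # requires Python 3.9+
        raise ValueError(f"path escapes allowed root: {user_supplied!r}")
    return candidate
```

An endpoint would call `resolve_safe_path` on the incoming value before opening anything, and return a client error when it raises. This only addresses the file path issue; it does nothing for the missing authentication between internal components.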
The takeaway for me was no real surprise: model diversity is useful - not because one is "right" and the other is "wrong" - but because the overlap confirms what I should fix first, and the differences help me widen coverage before going to production. It's a simple way to reduce blind spots (and Opus doesn't appear to be a simple drop-in replacement for Sonnet that does everything Sonnet can do and more).
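The overlap-versus-difference idea can be sketched with plain set operations. The labels below are my own shorthand for the findings described earlier, not the actual report output from either model, and the `triage` helper is hypothetical.

```python
# Shorthand labels paraphrasing the findings above; not the models' actual report output.
sonnet_findings = {
    "no-internal-auth", "path-traversal", "unauth-rag-reset", "env-exposure",
    "https-apache-hardening", "upload-memory-dos", "rag-naming-collisions",
    "http-error-handling",
}
opus_findings = {
    "no-internal-auth", "path-traversal", "unauth-rag-reset", "env-exposure",
    "zombie-jobs", "blocking-agent-calls",
}


def triage(a: set, b: set) -> dict:
    """Split two reviewers' findings into shared items and items unique to each."""
    return {
        "fix_first": sorted(a & b),  # both reviews agree, so highest confidence
        "only_a": sorted(a - b),     # extra coverage from the first review
        "only_b": sorted(b - a),     # extra coverage from the second review
    }


result = triage(sonnet_findings, opus_findings)
for bucket, items in result.items():
    print(f"{bucket}: {', '.join(items)}")
```

In practice the hard part is normalising two free-text reports into comparable labels; once that is done by hand, the comparison itself is trivial.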
February 2026
Sunday Coffee & Code: Claude Code - Dual model repo reviews (Sonnet 4.5 vs Opus 4.6)
By Steve Harris
