This weekend’s experiment was conceptually simple: can I use “good design” in one codebase to refactor a PoC in another codebase? The short answer is yes. Here’s the approach I took:

Step 1. Prompt Claude Code (Opus 4.6) to extract the key design characteristics (architectural patterns, boundaries, scalability model, UX patterns, and operational behaviours) from a repo I am happy with - all the non-functional stuff. Save that analysis in Gherkin requirements syntax. (The idea is to use that Gherkin file as a portable design contract.)

Step 2. Prompt CC, against the destination codebase (which had a PoC Python script and README), to refactor it based on the Gherkin requirements. No change to the functionality, just the non-functional pieces.

What came out of Step 1:
• 12 feature groups covering architecture, orchestration, scalability, security, observability, data, and deployment, e.g. web and agent separation, API contract, multi-agent orchestration, job completion, etc.
• 61 Gherkin-formatted requirements

A few examples of the kinds of architecture rules captured:
• The web tier only does upload / status / download - it never runs agent logic.
• Web and agents communicate only via the database and the job runner.
• Job runners scale horizontally.
• Per-step pipeline progress is tracked, timed, and queryable.
• The system uploads artifacts and enforces ownership isolation for safe handling.
• The system produces an audit trail that’s useful when things go sideways.

Observations from Step 2:
• Consumed 56% of my Pro plan capacity.
• Took 16 minutes to develop a plan and checklist, then generate the code and a PR.
• Generated 2,892 lines of code.
• Had to be prompted to update documentation, which was fair enough: I didn’t have it in the original prompt.
• With the process down, refactoring a small PoC into a multi-agent solution while maintaining good design would take less than 45 minutes. (Of course we all know this only gets us one step of the way to production - but it’s a big step.)

Why I like this approach:
• It’s more durable than a README that you rely on someone to use.
• It’s harder to accidentally violate during a refactor - let CC do it and use its own checklist to track the work.
• It makes architecture something you can review and enforce, not just talk about.

Next step for me: fire up the code and test it.

If you’ve tried anything similar, I’d love to hear what worked and what didn’t.
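For illustration, here is a minimal sketch of what a small part of such a Gherkin design contract could look like, and how it might be tallied into feature groups and requirements. Everything in it is an assumption made for the example: the scenario wording, the `DESIGN_CONTRACT` constant, and the `summarize_contract` helper are hypothetical and are not taken from the actual Gherkin file or from Claude Code's output.

```python
import re

# Hypothetical excerpt of a design contract. The wording is illustrative
# only; it paraphrases the architecture rules listed in the post.
DESIGN_CONTRACT = """\
Feature: Web and agent separation

  Scenario: Web tier never runs agent logic
    Given a request arrives at the web tier
    When the request is handled
    Then only upload, status, or download operations are performed
    And no agent logic is executed in the web process

  Scenario: Web and agents communicate via the database
    Given a job has been uploaded through the web tier
    When an agent picks up the job
    Then the job is read from the database by a job runner
    And the web tier is not called directly by the agent

Feature: Scalability

  Scenario: Job runners scale horizontally
    Given several job runners are active
    When a new job is queued
    Then exactly one job runner claims it
"""


def summarize_contract(text):
    """Return {feature_name: [scenario_name, ...]} for a Gherkin document.

    This is a line-based tally, not a full Gherkin parser: it only looks
    at 'Feature:' and 'Scenario:' / 'Scenario Outline:' headers.
    """
    features = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        feature = re.match(r"Feature:\s*(.+)", line)
        if feature:
            current = feature.group(1)
            features[current] = []
            continue
        scenario = re.match(r"Scenario(?: Outline)?:\s*(.+)", line)
        if scenario and current is not None:
            features[current].append(scenario.group(1))
    return features


summary = summarize_contract(DESIGN_CONTRACT)
feature_count = len(summary)
scenario_count = sum(len(names) for names in summary.values())
print(f"{feature_count} feature groups, {scenario_count} requirements")
```

A tally like this is only a sanity check on the contract file itself (for example, confirming that a requirement was not dropped during an edit). Actually enforcing the scenarios against the refactored code would need a real Gherkin parser and step definitions, or a review pass like the checklist described above.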
February 2026
Sunday Coffee & Code: Turning Existing Architecture Into Refactoring (with Claude Code + Gherkin)
By Steve Harris
