A decade of production systems, distilled to the engagements that still shape how I think. Two cases below. Both real, both shipped, both teach something different about the cost of getting architecture wrong.
Play Together is a family-oriented gaming platform. Parents and kids upload photos, short videos, and voice notes across the week. Every weekend, the platform produces a cartoonized family video that stitches those moments together, and the same data quietly feeds a personalized game feed during the week, including a trivia experience built on RAG retrieval over what the family has been sharing.
The interesting architecture problem was the video itself. One product surface, two very different workloads. The cartoonization of uploaded images had to feel immediate, because a photo with no preview is a photo the family stops uploading. The rest of the video assembly needed to run in a weekly batch window where cost and latency were negotiable but reliability was not.
I built it as a four-step pipeline. Step one, cartoonization, ran on demand against every upload. This was the only hot path in the system, and the only one the user ever waited on. Step two, which kicked off on weekends, used the OpenAI API to generate transition prompts that described how to bridge each pair of cartoon frames. Step three passed those prompts to the Kling API to produce the actual transition clips. Step four called into an in-house Flask service that wrapped FFmpeg, taking the cartoons and the generated transitions and merging them into the final weekly video.
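For orientation, here is roughly what the weekend half of that pipeline looks like in Python. Every model name, endpoint, and helper below is illustrative rather than the production value; the Kling call and the FFmpeg-service call are stand-ins for the real integrations.

```python
# Sketch of the weekend batch path (steps two through four). Names, models,
# and endpoints are illustrative, not the production values.
import requests
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment


def transition_prompt(frame_a_desc: str, frame_b_desc: str) -> str:
    """Step two: ask the OpenAI API for a prompt that bridges two cartoon frames."""
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": f"Describe a short animated transition from "
                       f"'{frame_a_desc}' to '{frame_b_desc}'.",
        }],
    )
    return response.choices[0].message.content


def transition_clip(prompt: str) -> str:
    """Step three: hand the prompt to the Kling API, get back a clip URL.
    The endpoint and payload shape here are placeholders."""
    resp = requests.post(
        "https://api.kling.example/v1/videos",  # hypothetical endpoint
        json={"prompt": prompt, "duration_seconds": 2},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["clip_url"]


def assemble_weekly_video(cartoon_urls: list[str], clip_urls: list[str]) -> str:
    """Step four: the in-house Flask service wraps FFmpeg and does the merge."""
    resp = requests.post(
        "http://video-assembler.internal/merge",  # hypothetical internal service
        json={"cartoons": cartoon_urls, "transitions": clip_urls},
        timeout=1800,
    )
    resp.raise_for_status()
    return resp.json()["video_url"]
```

None of this runs while a family is looking at the screen; only step one does.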
Splitting hot and cold like this was the call that made the product viable. Treating every step as real-time would have burned the budget on transitions nobody was waiting for. Treating every step as batch would have broken the upload loop that kept families active. Same data, same destination, different SLAs, and the architecture had to admit that out loud.
The second half of the work was the retrieval side. The same uploads that fed the weekend video also fed the weekday game feed. I set up embedding pipelines and pgvector-based semantic search in Supabase, with a context-engineering layer that gave the agents memory-like behavior across challenges, photos, voice notes, and text responses. The trivia experience pulled from that index in real time so the questions were about the family, not about the world.
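A minimal sketch of that retrieval path, assuming a Supabase Postgres table with a pgvector column. The table, columns, and embedding model are placeholders, not the production schema.

```python
# Retrieval sketch for the weekday trivia feed. The table, columns, and
# embedding model are illustrative, not the production schema.
import psycopg
from openai import OpenAI

openai_client = OpenAI()


def embed(text: str) -> list[float]:
    """Embed a query with the same model the ingest pipeline would use."""
    result = openai_client.embeddings.create(
        model="text-embedding-3-small",  # illustrative model choice
        input=text,
    )
    return result.data[0].embedding


def family_context(conn: psycopg.Connection, family_id: str, question: str, k: int = 5):
    """pgvector cosine-distance search scoped to a single family's uploads."""
    vec_literal = "[" + ",".join(str(x) for x in embed(question)) + "]"
    rows = conn.execute(
        """
        SELECT content, media_type, created_at
        FROM family_memories                 -- hypothetical table
        WHERE family_id = %s
        ORDER BY embedding <=> %s::vector    -- pgvector cosine distance
        LIMIT %s
        """,
        (family_id, vec_literal, k),
    ).fetchall()
    return rows
```

The top-k rows become the context the trivia agent writes questions from, which is why the questions feel like they came from inside the house.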
What the engagement taught me, and what I now carry into every AI-heavy client engagement, is this: the pipeline is never the hard part. Deciding which step the user is allowed to wait on is.
Last Gameboard built a tabletop device that ran a custom Android OS and talked to companion apps on the players' phones. I led the backend pod for three and a half years. What I inherited and what I left behind looked like two different systems.
When I joined, the backend was a single monorepo on PostgreSQL with a classic three-layer architecture. It worked for the product at the time. It did not work for where the product was going. Over the arc of the engagement I ran three intertwined migrations, none of which could be done in isolation.
From Postgres to Neo4j. The domain was fundamentally relational in the graph sense, not the tabular one. Players, sessions, devices, game states, and companion connections were a web, not a spreadsheet. Moving to Neo4j was a bet that the queries coming down the roadmap were graph traversals dressed up as joins, and it paid off.
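To make that concrete, here is the flavor of query that tipped the decision, written against the Python Neo4j driver. The labels, relationship types, and credentials are illustrative, not the production schema.

```python
# The kind of query that motivated the move. Labels and relationship
# types are illustrative, not the production schema.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

FIND_COMPANIONS_IN_SESSION = """
MATCH (p:Player)-[:JOINED]->(s:Session)-[:RUNNING_ON]->(d:Device),
      (p)-[:USES]->(c:CompanionApp)-[:CONNECTED_TO]->(d)
WHERE s.id = $session_id
RETURN p.name AS player, c.platform AS companion
"""


def companions_for_session(session_id: str) -> list[dict]:
    # One traversal in Cypher; the Postgres version was a stack of joins
    # across players, sessions, devices, and companion connections.
    with driver.session() as session:
        result = session.run(FIND_COMPANIONS_IN_SESSION, session_id=session_id)
        return [record.data() for record in result]
```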
From three-layer to DDD and hexagonal. I refactored the code toward a domain-centric architecture with explicit ports and adapters. The point was not ceremony. The point was to let the business rules outlive the next infrastructure change, because I already knew more of those changes were coming.
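The production services were Kotlin, but the shape is the same in any language. A stripped-down Python sketch of one port and one use case, with hypothetical names throughout:

```python
# Ports-and-adapters in miniature. The production services were Kotlin;
# this sketch only shows the shape, with hypothetical names.
from dataclasses import dataclass
from typing import Protocol


@dataclass
class GameSession:
    id: str
    device_id: str
    player_ids: list[str]


class SessionRepository(Protocol):
    """Port: the domain's view of persistence. No Postgres or Neo4j in sight."""
    def find(self, session_id: str) -> GameSession | None: ...
    def save(self, session: GameSession) -> None: ...


class JoinSession:
    """Domain use case: depends only on the port, so it survives a
    storage migration untouched."""

    def __init__(self, sessions: SessionRepository):
        self.sessions = sessions

    def __call__(self, session_id: str, player_id: str) -> GameSession:
        session = self.sessions.find(session_id)
        if session is None:
            raise ValueError(f"unknown session {session_id}")
        if player_id not in session.player_ids:
            session.player_ids.append(player_id)
            self.sessions.save(session)
        return session

# Adapters (a Neo4j-backed or Postgres-backed SessionRepository) live at the
# edge and can be swapped without touching JoinSession.
```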
From monorepo to microservices plus shared libraries. I broke the monorepo into a set of Kotlin services on AWS EKS, with shared libs published through JFrog. GitLab pipelines handled the build, test, and deploy paths. K8s gave the device fleet the isolation and scaling story it needed.
The most interesting piece, and the part I still reach for mentally when I design real-time systems, was the communication protocol I designed between the tabletop device and the companion apps. It ran over WebSockets with a router component both in the cloud and on the device itself, so that nearby play stayed low-latency while remote play still worked. I built SDKs in TypeScript and Kotlin: TypeScript for the React Native companion apps, Kotlin for the embedded companions that ran inside the tabletop's custom Android OS.
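The real SDKs were TypeScript and Kotlin, but the core idea fits in a short Python sketch: one message envelope, and a routing choice between the on-device router and the cloud one. Endpoints and field names here are illustrative only.

```python
# One envelope, two routes. The production SDKs were TypeScript and Kotlin;
# endpoints and field names here are illustrative only.
import asyncio
import json
import websockets  # pip install websockets

LOCAL_ROUTER = "ws://gameboard.local:9000"   # router running on the tabletop itself
CLOUD_ROUTER = "wss://router.example.com"    # hypothetical cloud router


def envelope(session_id: str, sender: str, kind: str, payload: dict) -> str:
    """Every message, local or remote, travels in the same envelope."""
    return json.dumps({
        "session": session_id,
        "sender": sender,
        "kind": kind,            # e.g. "move", "state_sync", "ping"
        "payload": payload,
    })


async def send_move(session_id: str, player_id: str, move: dict, nearby: bool) -> dict:
    # Companion apps prefer the on-device router when they can reach it,
    # and fall back to the cloud router for remote play.
    url = LOCAL_ROUTER if nearby else CLOUD_ROUTER
    async with websockets.connect(url) as ws:
        await ws.send(envelope(session_id, player_id, "move", move))
        return json.loads(await ws.recv())


if __name__ == "__main__":
    # Example invocation; assumes a router is actually listening on the local URL.
    asyncio.run(send_move("sess-42", "player-1", {"piece": "rook", "to": "e4"}, nearby=True))
```

Because the envelope is identical on both routes, game code never has to know whether a player is sitting at the table or across the country.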
The lesson I carry from Last Gameboard is the opposite of the one from Play Together, and both are true: when the surface area is this large and the lifespan is this long, the cost of starting from the wrong abstractions is not the refactor. It is every feature you ship through them until you do.
Start with a 30-minute intro call. No deliverable. If there is a real engagement here, we will know by the end.