Agentic Engineering

Our Website, Built with Agentic AI Workflows

We used our own agentic engineering approach to build craftablesoftware.com — from competitor analysis to production deployment, with specialist AI skills and human oversight at every decision point.

[Image: code on screen representing the agentic development workflow]

The challenge

craftable software needed a new website. The old Gatsby-based site was outdated, hard to maintain, and didn't communicate what makes us different. We needed to articulate our positioning — a senior-led, AI-augmented engineering consultancy based in Portugal — and build a site that would rank, convert, and be recommended by AI search engines.

Rather than following a traditional agency timeline, we used our own agentic engineering methodology, delivering roughly 60% faster while strengthening security, quality gates, and performance. This project would double as a proof of concept: if we could build our own production site this way, we could do it for clients too.

Competitor analysis & positioning

Before writing a single line of code, we studied how other software consultancies — particularly those in the European nearshoring space — presented their services, structured their content, and positioned themselves. We analysed their industry pages, service descriptions, and case study formats to understand what the market expected and where we could differentiate.

Three things stood out from the analysis:

  • Most competitors were generic. They listed technologies and services without connecting them to specific industry problems. We decided to go deep on fewer industries — Healthcare, Fintech, Retail Tech, TravelTech — with compliance frameworks, use cases, and tech stacks tailored to each.
  • AI was a buzzword, not a methodology. Competitors mentioned AI in their marketing but didn't explain how they actually used it in delivery. We decided to make agentic engineering our core differentiator and be transparent about the human-AI collaboration model.
  • Content depth was a competitive moat. The consultancies ranking well had detailed, authoritative pages — not brochure-style summaries. We needed depth: capability breakdowns, regulatory details, and specific use cases for AR/VR, AI/ML, and data governance.

Specialist AI skills

A key innovation in this project was the use of specialist AI skills — purpose-built instruction sets that gave the AI agent domain expertise for specific tasks. Rather than relying on a generalist AI, we created six specialist profiles that the agent could activate on demand:

SEO Expert

Audits pages for technical SEO, on-page optimisation, structured data (JSON-LD), semantic HTML, and meta tag quality. Preserves existing search rankings during the migration from the old Gatsby site.

GEO Specialist

Generative Engine Optimisation — ensures content is structured so that AI assistants (ChatGPT, Claude, Gemini, Perplexity, Google AI Overviews) cite and recommend the brand. Covers entity clarity, claim substantiation, and structured data for AI discoverability.

UI/UX Expert

Applies mobile-first design, WCAG 2.2 AA accessibility, the Nielsen Norman Group's 10 usability heuristics, and Flesch-Kincaid readability scoring. Reviews page layouts, copy clarity, and CTA effectiveness.

QA Engineer (Playwright)

End-to-end functional testing, accessibility auditing via axe-core, visual regression, i18n verification across English and Portuguese, and SEO smoke tests — all automated through Playwright.

Cybersecurity Specialist

HTTP security headers, Content Security Policy, input sanitisation, dependency auditing, and — for the contact page AI assistant — prompt injection defence, output validation, and observability.

Astro & AWS Deploy

Framework architecture decisions (islands, static output, i18n, performance) and AWS infrastructure (S3, CloudFront, Lambda@Edge, SES) with a cost-first mindset for deployment and CDN configuration.

These skills meant the AI agent didn't just write code — it brought specialist knowledge to every task. When the engineer asked for a page audit, the agent could switch between SEO, GEO, accessibility, and security perspectives, each with its own checklist and standards.

SEO & GEO optimisation

Search visibility was a first-class concern from day one — not just traditional SEO, but also Generative Engine Optimisation (GEO) to ensure AI-powered search engines would surface and recommend craftable software.

Traditional SEO

  • Structured data — JSON-LD schema for Organisation, Service, FAQPage, and ItemList on every relevant page
  • Semantic HTML — proper heading hierarchy, landmark regions, and accessible navigation
  • Meta optimisation — unique, keyword-rich titles and descriptions for every page across both languages
  • Performance — static-first architecture with Astro, lazy-loaded images, and CDN delivery through CloudFront
  • Migration safety — redirect mapping from the old Gatsby URLs to preserve existing search equity
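To make the structured-data point concrete, here is a minimal sketch of how a JSON-LD Organisation block might be generated at build time. The field values and helper names are illustrative, not the site's actual schema.

```typescript
// Minimal sketch: building an Organisation JSON-LD block at build time.
// Field values below are illustrative, not the site's real schema.
interface OrgSchemaInput {
  name: string;
  url: string;
  description: string;
  sameAs?: string[]; // social/profile URLs, if any
}

function organizationJsonLd(input: OrgSchemaInput): string {
  const schema = {
    "@context": "https://schema.org",
    "@type": "Organization",
    name: input.name,
    url: input.url,
    description: input.description,
    ...(input.sameAs ? { sameAs: input.sameAs } : {}),
  };
  // Escape "<" so the serialised JSON cannot close the script tag early.
  return JSON.stringify(schema).replace(/</g, "\\u003c");
}

const jsonLd = organizationJsonLd({
  name: "craftable software",
  url: "https://craftablesoftware.com",
  description: "Senior-led, AI-augmented engineering consultancy",
});
```

In an Astro component the string would then be rendered into a `<script type="application/ld+json">` element, one block per page type (Organisation, Service, FAQPage, ItemList).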

Generative Engine Optimisation (GEO)

  • Entity clarity — consistent naming, descriptions, and attributes so AI models build an accurate knowledge graph of craftable software
  • Claim substantiation — every capability claim is backed by specific technologies, frameworks, or compliance standards rather than vague marketing language
  • Content depth — detailed industry pages with regulatory specifics (HIPAA, PCI DSS, GDPR, FCA) that AI models recognise as authoritative
  • Structured answers — FAQ sections, comparison points, and clear problem-solution framing that AI assistants can extract and cite
  • Multi-language presence — full Portuguese translations to capture AI recommendations in Portuguese-speaking markets

The agentic workflow

The process followed our standard agentic delivery model. A senior engineer acted as the orchestrator — making strategic, creative, and business decisions — while AI agents with specialist skills handled the implementation.

Human decisions

The human engineer owned every strategic choice: the framework (Astro), the visual identity, which industries to feature, the competitive positioning, the tone of voice, and the section order on every page. When the agent proposed a layout or copy, the engineer reviewed it against the competitor analysis and either approved, adjusted, or rejected it.

AI agent execution

A single instruction would trigger coordinated changes across page templates, i18n translation files, navigation components, test fixtures, and CSS — all in one pass, in both English and Portuguese. The agent would activate the relevant specialist skill for each task:

  • Summoned the UI/UX skill to reorder homepage sections based on persuasion architecture principles
  • Activated the SEO skill to generate JSON-LD schema and audit meta tags after every content change
  • Used the GEO skill to structure content for AI discoverability — entity definitions, substantiated claims, and FAQ formatting
  • Ran the QA skill to execute Playwright accessibility tests after structural changes and fix violations immediately
  • Applied the cybersecurity skill to review security headers and validate the contact form against injection risks
  • Consulted the Astro/AWS skill for deployment architecture decisions and cost-optimised infrastructure

How it worked in practice

A typical session: the engineer would review the site in a browser, compare it against the competitive analysis, and give a high-level instruction. The AI agent would execute a plan — often touching 10 to 20 files — and the engineer would review the result within minutes.

When the engineer decided the industry pages needed more depth to compete, the agent researched common regulatory frameworks, compliance standards, and technology use cases for each vertical, then produced detailed pages with capability breakdowns, tech stacks, and compliance sections across both languages.

When something went wrong — a cached dev server, an image that failed to download, a logo invisible on its background — the agent diagnosed the root cause and fixed it. The human's role was to set direction and spot problems; the agent figured out why and how to resolve them.

Contact-page LLM assistant & security guardrails

The contact page includes an AI assistant that answers questions about craftable software, powered by OpenAI's GPT-4o-mini. We implemented it as an agentic, security-first feature: the cybersecurity specialist skill defined the threat model, and we applied layered guardrails so the assistant stays on-task and safe in production.

LLM process and model configuration

  • Structured prompt — A fixed system prompt describes the assistant’s role and scope (services, pricing, MCP, security, contact). User input is never concatenated into the system prompt; only the current, cleaned message is sent.
  • All languages allowed — The assistant accepts prompts in any language. The original user message is always sent to the model so replies can be in the user’s language. Injection detection runs separately: for mostly non-Latin input we use translate-then-check (translate to English, run the English blocklist on the translation); for Latin input we use the EN+PT blocklist.
  • Server-managed context — The client sends only the latest user message and a session id. Conversation history is not trusted from the client; the server sends a single user turn per request to avoid prompt injection via fake assistant messages.
  • Model constraints — GPT-4o-mini with low temperature (0.3), a 300-token cap per reply, and no tools or function calling, so the assistant cannot perform actions.
  • Observability — Every request is traced in Langfuse when configured: prompts, responses, token usage, and latency are visible for debugging and cost control. Optional evaluations and alerts (e.g. cost spike, error rate) can flag hallucinations or off-topic replies.
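The single-turn, constrained request described above can be sketched as a small builder function. The system-prompt wording and helper names are assumptions for illustration; the constraints (gpt-4o-mini, temperature 0.3, 300-token cap, one user turn) come from the points listed here, and the shape follows the OpenAI Chat Completions API.

```typescript
// Sketch: assembling a single-turn, constrained chat request.
// SYSTEM_PROMPT wording is illustrative; the constraints mirror the
// configuration described in the text.
const SYSTEM_PROMPT =
  "You answer questions about craftable software's services, pricing, " +
  "MCP, security, and contact options. Stay on topic.";

interface ChatRequest {
  model: string;
  temperature: number;
  max_tokens: number;
  messages: { role: "system" | "user"; content: string }[];
}

function buildChatRequest(userMessage: string): ChatRequest {
  return {
    model: "gpt-4o-mini",
    temperature: 0.3,
    max_tokens: 300,
    // Exactly one system turn and one user turn per request: history is
    // never replayed from the client, so fake "assistant" messages cannot
    // be smuggled into the context.
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: userMessage },
    ],
  };
}
```

No `tools` field is ever set, so even a successful injection cannot trigger actions — the worst case is an off-topic reply, which observability then surfaces.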

Security guardrails (defence-in-depth)

  • Input gatekeeping — HTML and control characters are stripped; length is capped at 500 characters. Multilingual injection defence: for mostly Latin input we use an EN+PT blocklist (“ignore previous instructions”, “jailbreak”, “ignorar instruções anteriores”, etc.); for mostly non-Latin input (e.g. Arabic, Thai) we translate to English and run the English blocklist on the translated text (translate-then-check). The model always receives the original user message — validation never alters what the LLM sees. All input is normalised to Unicode NFC and high-entropy payloads (e.g. base64 blobs) are rejected before reaching the model.
  • Output sanitisation — Every LLM reply is HTML-escaped before being sent to the client so it cannot be used for XSS if rendered in the UI.
  • Rate limits — Per-IP limit (10 requests per minute), per-session request cap (5 messages), and a per-session token circuit breaker (20k input+output tokens) prevent abuse and contain cost.
  • Cloudflare Turnstile — Optional server-side verification gates the assistant when configured, reducing automated abuse with lower user friction.
  • FAQ fallback — If the API key is missing or the model call fails, the assistant falls back to a static FAQ so the page always responds; fallback answers are also sanitised.
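The input gate and its script-based routing can be sketched as follows. The blocklist entries are the examples quoted above; `translateToEnglish` is a hypothetical helper standing in for a real translation call, and the 0.5 Latin-ratio threshold is an assumption for illustration.

```typescript
// Sketch of the input gate: normalise, strip HTML and control characters,
// cap length, then route to a script-appropriate injection check.
const BLOCKLIST_EN_PT = [
  "ignore previous instructions",
  "jailbreak",
  "ignorar instruções anteriores",
];

function cleanInput(raw: string): string {
  return raw
    .normalize("NFC")                        // Unicode NFC normalisation
    .replace(/<[^>]*>/g, "")                 // strip HTML tags
    .replace(/[\u0000-\u001f\u007f]/g, "")   // strip control characters
    .slice(0, 500);                          // hard length cap
}

// Share of letters that are Latin-script; digits/punctuation are ignored.
// Letterless input defaults to the Latin branch.
function latinRatio(text: string): number {
  const letters = [...text].filter((c) => /\p{L}/u.test(c));
  if (letters.length === 0) return 1;
  const latin = letters.filter((c) => /\p{Script=Latin}/u.test(c));
  return latin.length / letters.length;
}

function looksLikeInjection(englishOrPtText: string): boolean {
  const lower = englishOrPtText.toLowerCase();
  return BLOCKLIST_EN_PT.some((phrase) => lower.includes(phrase));
}

// translateToEnglish is a hypothetical stand-in for a translation call.
// The translated text is used only for checking; on success, the cleaned
// original message is what gets sent to the model.
async function isAllowed(
  raw: string,
  translateToEnglish: (t: string) => Promise<string>,
): Promise<boolean> {
  const cleaned = cleanInput(raw);
  if (latinRatio(cleaned) >= 0.5) {
    return !looksLikeInjection(cleaned);       // EN+PT blocklist branch
  }
  const translated = await translateToEnglish(cleaned);
  return !looksLikeInjection(translated);      // translate-then-check branch
}
```

The key property is that detection and delivery are decoupled: validation decides pass/fail, but never rewrites the message the model ultimately receives.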

Langfuse and ChatGPT (OpenAI) configuration

We use the OpenAI SDK wrapped with Langfuse's observeOpenAI so that each completion is automatically traced. In Langfuse we see the full prompt, the model response, token usage, and latency. The session id is passed so that conversations can be grouped. For production we set LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, and optionally LANGFUSE_BASE_URL (e.g. the EU region). The dashboard is used to inspect prompts, tune the system message, add evaluations (e.g. factual grounding), and set optional alerts (cost spike, error rate) without code changes.

AWS WAF for early protection

We use AWS WAF in front of the API (e.g. on API Gateway or CloudFront) as a perimeter layer: rate limiting and managed rules (e.g. AWS Managed Rules for common threats) help block abuse and DDoS traffic before it reaches the assistant. WAF complements but does not replace application-level validation: translate-then-check and blocklists in our API remain responsible for prompt-injection detection. See our docs on security and the cybersecurity skill for the full defence-in-depth picture.
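As a rough illustration, a rate-based WAF rule inside a WebACL definition might look like the object below. The shape follows the WAFv2 rule structure; the rule names and the 2000-request limit are illustrative assumptions, not our production values.

```typescript
// Sketch of a WAFv2 rate-based rule as it might appear in a WebACL
// definition (e.g. via CloudFormation or the AWS SDK). Names and the
// limit value are illustrative assumptions.
const rateLimitRule = {
  Name: "assistant-rate-limit",
  Priority: 1,
  Statement: {
    RateBasedStatement: {
      Limit: 2000,            // requests per 5-minute window, per source IP
      AggregateKeyType: "IP",
    },
  },
  Action: { Block: {} },      // block (rather than count) once over limit
  VisibilityConfig: {
    SampledRequestsEnabled: true,
    CloudWatchMetricsEnabled: true,
    MetricName: "AssistantRateLimit",
  },
};
```

Because WAF counts per source IP at the edge, it catches volumetric abuse cheaply, while the per-session caps and token circuit breaker in the application handle the finer-grained cases WAF cannot see.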

What we learned

Specialist skills beat generalist prompting

Having purpose-built SEO, GEO, UI/UX, QA, security, and infrastructure skills transformed the agent from a code generator into a multi-disciplinary team. Each skill brought its own standards, checklists, and domain vocabulary.

Competitor analysis drives better content

Studying how other consultancies positioned themselves — and where they fell short — gave us a clear content strategy. We invested in depth where competitors stayed shallow, and we were specific where they were generic.

GEO is the new SEO frontier

Optimising for AI-powered search engines required a different approach: entity clarity, substantiated claims, and structured answers. Traditional SEO alone is no longer enough — you need content that AI models can parse, trust, and cite.

Speed comes from breadth, not shortcuts

The speedup wasn't from cutting corners. It came from an agent's ability to make coordinated changes across dozens of files simultaneously — i18n, navigation, pages, tests, images — something a human would do one file at a time over hours.

Human judgement remains essential

Every meaningful decision — competitive positioning, tone, which industries to keep or cut, when content wasn't working — came from the human. AI agents execute brilliantly, but they don't have the business context or market intuition to make strategic calls.

LLM guardrails and multilingual injection defence

Securing the contact-page assistant required defence-in-depth: we allow all languages but block injection intent in any language, using translate-then-check for non-Latin input and an EN+PT blocklist for Latin input, plus output sanitisation, rate and token limits, and observability via Langfuse. We treat prompt injection as an architecture problem spanning the perimeter (WAF), the application (validation), and monitoring, not just a prompt-engineering fix.

30+ pages delivered (EN + PT)

6 specialist AI skills deployed

Days, not weeks, to production

100% human-reviewed decisions

Tech stack & skills

Astro · Tailwind CSS · TypeScript · React · Playwright · axe-core · AWS (S3 + CloudFront) · Claude AI · Cursor IDE

Specialist skills

SEO Expert · GEO Specialist · UI/UX Expert · QA Engineer · Cybersecurity · Astro & AWS Deploy

Want to see agentic engineering in action?

This is how we work on every project. Tell us about yours and we'll show you what's possible.

Start a conversation