Architecture · Deep dive · AI backend

Private AI and RAG Backend Architecture for WordPress

A deep dive into the AI‑Kit backend: local-first WordPress AI, backend fallback, Bedrock-powered generation, Knowledge Base retrieval, frontend/admin API separation and static-export-friendly protection.

Architecture thesis: AI for WordPress should not be a black-box SaaS proxy hidden behind a plugin button. The safer pattern is a local-first plugin experience with an optional customer-owned AWS backend for model calls, RAG, citations, guardrails, reCAPTCHA, WAF, logs and knowledge-base ingestion.

Why WordPress AI needs an architecture, not only a prompt box

Adding an AI button to WordPress is easy. Designing where content is processed, which endpoints are public, how grounding works, how model costs are controlled, and how a static frontend can call the backend safely is the hard part.

AI‑Kit starts with the least invasive path: local, on-device browser AI when available. That works well for editor-side rewriting, translation, metadata generation and lightweight tasks. But production websites often need more: fallback when local AI is unavailable, frontend chatbots, DocSearch, multimodal prompts, citations, knowledge-base grounding and protection against public endpoint abuse.

The AI‑Kit backend is the serverless side of that model. It moves backend AI execution into the customer’s AWS account instead of routing every request through the WordPress server or a shared plugin vendor runtime.
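The local-first decision described above can be sketched as a small function. This is only an illustration under stated assumptions, not AI-Kit's actual code: the names `choose_execution_path`, `local_ai_available`, `backend_configured` and `backend_only` are hypothetical stand-ins for whatever capability checks and settings the plugin really uses.

```python
from enum import Enum


class ExecutionPath(Enum):
    LOCAL = "local"
    BACKEND = "backend"
    UNAVAILABLE = "unavailable"


def choose_execution_path(
    local_ai_available: bool,
    backend_configured: bool,
    backend_only: bool = False,
) -> ExecutionPath:
    """Pick where a request runs: on-device first, backend as fallback.

    All parameter names are hypothetical placeholders.
    """
    if backend_only:
        # Policy forces backend processing even when local AI exists.
        if backend_configured:
            return ExecutionPath.BACKEND
        return ExecutionPath.UNAVAILABLE
    if local_ai_available:
        return ExecutionPath.LOCAL
    if backend_configured:
        return ExecutionPath.BACKEND
    return ExecutionPath.UNAVAILABLE
```

The point of the sketch is the ordering: the backend is reached only when local AI is missing or when policy explicitly requires backend processing.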

System boundary

WordPress editor / Media Library / frontend blocks
        │
        ├─ local mode when browser AI is available
        │
        ▼
AI-Kit JavaScript runtime
        │
        ├─ /admin/* routes for WP Admin and trusted operations
        └─ /frontend/* routes only when public features are enabled
                 │
                 ▼
          Amazon API Gateway
                 │
                 ▼
          AiHandlerFunction
  prompt, writer, rewriter, summarizer,
  translator, language detector, proofreader,
  upload URL helper and knowledge-base listing
                 │
        ┌────────┴────────┐
        ▼                 ▼
Amazon Bedrock        Knowledge Base pipeline
Nova models           S3 documents + S3 Vectors
Guardrails            DynamoDB debounce state
                      EventBridge scheduler
                      KnowledgeBaseSyncFunction

The frontend does not need model provider keys. WordPress does not need to proxy prompts through PHP. Static exports can still use frontend routes when those routes are explicitly enabled and protected with the selected public-access model.

What the AI‑Kit backend stack provisions

| Building block | Purpose | Key design choice | Operational note |
| --- | --- | --- | --- |
| API Gateway REST API | Exposes admin and optional frontend AI routes | /admin/* is always the trusted surface; /frontend/* appears only when feature toggles enable it | Keep public and privileged routes separate even when they share handler code. |
| AiHandlerFunction | Unified Lambda handler for prompt and language capabilities | One warm execution path for prompt, write, rewrite, summarize, translate, proofread, detect-language and KB listing | Shared validation, metrics, guardrails and middleware reduce operational spread. |
| Amazon Bedrock models | Runs generation, translation-style tasks and RAG answer synthesis | Frontend model can be cheaper/lighter than admin model through parameterized model selection | Model IDs are architecture parameters, not hard-coded plugin assumptions. |
| Bedrock Knowledge Base + S3 Vectors | Provides managed retrieval over documentation or client content | Create a new KB or reuse an existing one via stack parameters | KnowledgeBaseId and DataSourceId outputs are part of the integration contract. |
| Docs and temp assets S3 bucket | Stores KB documents and temporary prompt image uploads | Separate document prefix, configuration prefix and temp-assets prefix | Temp image objects should expire quickly; KB documents should be versioned deliberately. |
| KnowledgeBaseSyncFunction | Runs document ingestion workflows | S3/EventBridge/DynamoDB debounce loop avoids starting ingestion for every small file event | Useful when WordPress regenerates multiple KB documents during publishing. |
| reCAPTCHA + SSM/KMS | Protects open frontend endpoints | Secret stored as encrypted SSM parameter and fetched by handlers | Important when FrontendApiAuthMode is NONE for static-site friendliness. |
| AWS WAF and throttling | Limits abuse on public/admin API paths | Separate allow/deny/rate rules for frontend and admin surfaces | AI endpoints are cost-bearing; public access needs protection beyond CORS. |
| CloudWatch, DLQ and alerts | Provides logs, metrics, failed invocation capture and optional notifications | Per-function log retention, custom metrics and shared SQS DLQ | AI features need observability because cost, latency and quality are all runtime concerns. |

Endpoint surface: one backend, two trust zones

The stack’s most important product decision is the separation between admin and frontend routes. The same capability may exist in both zones, but the trust model is different.

| Capability | Admin route | Frontend route | When frontend is created | Design note |
| --- | --- | --- | --- | --- |
| Prompt / chatbot / DocSearch | /admin/prompt | /frontend/prompt | EnableChatbotBackend=true | Can use KB, citations, regeneration, feedback metadata and optional image inputs. |
| Generate upload URL | /admin/generate-upload-url | /frontend/generate-upload-url | EnableChatbotBackend=true | Uploads images to S3 via presigned PUT, then passes keys to prompt requests. |
| Summarize | /admin/summarize | /frontend/summarize | EnableSummarizerBackend=true | Summarization usually disables KB by default because the source text is already supplied. |
| Writer | /admin/write | /frontend/write | EnableLanguageAIBackend=true | Useful as backend fallback for editor or frontend generation features. |
| Rewriter | /admin/rewrite | /frontend/rewrite | EnableLanguageAIBackend=true | Honors tone, format and length controls while keeping secrets out of the browser. |
| Translator | /admin/translate | /frontend/translate | EnableLanguageAIBackend=true | Can be paired with automatic language detection flows. |
| Proofreader | /admin/proofread | /frontend/proofread | EnableLanguageAIBackend=true | Returns corrected text and structured correction metadata. |
| Language detector | /admin/detect-language | /frontend/detect-language | EnableLanguageAIBackend=true | Uses a backend language detection path instead of assuming the browser always provides it. |
| Knowledge bases | /admin/knowledge-bases | Not a public route | Admin only | Listing and selecting backend knowledge resources belongs to trusted configuration UX. |

This is why “AI backend” is not one checkbox. Chat, summarization and language tools have different exposure, cost and abuse profiles. The stack lets them be enabled independently instead of publishing every route by default.
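To make the independent toggles concrete, here is a minimal sketch that derives the published route list from the three feature flags. The route names and toggle names are taken from the endpoint table above; the function `published_routes` and the dictionary layout are illustrative assumptions, not the stack's real template logic.

```python
# Toggle -> frontend capabilities, as listed in the endpoint table.
FRONTEND_ROUTES_BY_TOGGLE = {
    "EnableChatbotBackend": ["prompt", "generate-upload-url"],
    "EnableSummarizerBackend": ["summarize"],
    "EnableLanguageAIBackend": [
        "write", "rewrite", "translate", "proofread", "detect-language",
    ],
}

# Admin routes always exist; knowledge-bases has no frontend twin.
ADMIN_ROUTES = [
    "prompt", "generate-upload-url", "summarize", "write", "rewrite",
    "translate", "proofread", "detect-language", "knowledge-bases",
]


def published_routes(toggles: dict[str, bool]) -> list[str]:
    """Return every route exposed for the given feature toggles."""
    routes = [f"/admin/{name}" for name in ADMIN_ROUTES]
    for toggle, names in FRONTEND_ROUTES_BY_TOGGLE.items():
        # A missing toggle is treated as disabled (an assumption).
        if toggles.get(toggle, False):
            routes.extend(f"/frontend/{name}" for name in names)
    return routes
```

With only `EnableSummarizerBackend` set, the public surface is a single route, `/frontend/summarize`, while the admin surface stays complete.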

The RAG pipeline

User question or DocSearch query
        │
        ▼
Prompt route receives request
        │
        ├─ validate auth / reCAPTCHA / WAF path expectations
        ├─ apply guardrail and input checks
        │
        ▼
Query builder predicts category, subcategory and tags
        │
        ▼
Bedrock Knowledge Base retrieval
        │
        ├─ optional strict metadata filtering
        ├─ optional rerank
        └─ retrieved snippets with citation spans
        │
        ▼
Answer template selection
        ├─ KB_ONLY
        ├─ ASK_WHEN_NO_KB
        └─ KB_PREFERRED
        │
        ▼
Bedrock generation with citations and grounding policy
        │
        ▼
AI-Kit renders answer, sources and highlights in WordPress UI

The key point is that RAG is not just “send documents to a model.” The backend has to decide what sources are allowed, how categories map to metadata, whether an answer may fall back to general knowledge, and how citation spans are returned to the WordPress interface.
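The "optional strict metadata filtering" step can be sketched as a function that turns the query builder's prediction into a retrieval filter. The dictionary shape (`equals`, `listContains`, `andAll`) follows the filter structure of the Bedrock Knowledge Base retrieval API as commonly documented; the metadata key names `category`, `subcategory` and `tags`, and the function itself, are assumptions made for illustration.

```python
from typing import Any, Optional


def build_retrieval_filter(
    category: Optional[str] = None,
    subcategory: Optional[str] = None,
    tags: tuple[str, ...] = (),
    strict: bool = True,
) -> Optional[dict[str, Any]]:
    """Build a metadata filter, or None when retrieval should be unfiltered."""
    if not strict:
        return None
    conditions: list[dict[str, Any]] = []
    if category:
        conditions.append({"equals": {"key": "category", "value": category}})
    if subcategory:
        conditions.append({"equals": {"key": "subcategory", "value": subcategory}})
    for tag in tags:
        # Assumes tags are stored as a string-list metadata attribute.
        conditions.append({"listContains": {"key": "tags", "value": tag}})
    if not conditions:
        return None
    if len(conditions) == 1:
        # A combining operator needs at least two members, so return the leaf.
        return conditions[0]
    return {"andAll": conditions}
```

In a real handler the result would typically be passed as the `filter` entry of the vector search configuration in the retrieve call; returning `None` keeps the non-strict path identical to an unfiltered search.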

Grounding policy is a product decision

Different content categories should not all behave the same way. A product documentation chatbot can answer from general knowledge when asked about a generic concept. A medical, legal, financial or internal policy assistant may need to refuse when the knowledge base does not contain an answer.

| Grounding mode | When to use | Behavior when KB has no relevant snippets | Editorial implication |
| --- | --- | --- | --- |
| KB_ONLY | Regulated, high-stakes or strict documentation answers | State that the documentation does not contain the requested information | Authors must keep the KB complete enough for expected questions. |
| ASK_WHEN_NO_KB | Ambiguous source sets or category-dependent answers | Ask one clarification question instead of guessing | Metadata taxonomy becomes part of UX design. |
| KB_PREFERRED | Marketing, product education and general support | Use KB when available; otherwise answer with clear separation from retrieved docs | Good balance for public websites that mix documentation and general explanation. |

AI‑Kit exposes this through knowledge-base configuration rather than hard-coding one behavior for every site. Category policies, strict metadata filtering and prompt templates turn content governance into an operational layer that editors and developers can understand.
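The three grounding modes reduce to a small decision once retrieval has run. The sketch below is one way to express that; the mode names come from the table above, while the action labels, the `policy_for_category` helper and the `KB_PREFERRED` default are hypothetical choices, not confirmed AI-Kit behavior.

```python
# What to do when retrieval returns no relevant snippets, per mode.
NO_SNIPPET_ACTION = {
    "KB_ONLY": "refuse",              # say the docs do not cover it
    "ASK_WHEN_NO_KB": "clarify",      # ask one clarification question
    "KB_PREFERRED": "answer_general", # answer, clearly separated from docs
}


def decide_grounding(mode: str, snippet_count: int) -> str:
    """Return the answer strategy for a grounding mode and retrieval result."""
    if mode not in NO_SNIPPET_ACTION:
        raise ValueError(f"unknown grounding mode: {mode!r}")
    if snippet_count > 0:
        return "answer_from_kb"
    return NO_SNIPPET_ACTION[mode]


def policy_for_category(
    category: str,
    policies: dict[str, str],
    default: str = "KB_PREFERRED",  # assumed default, not from the source
) -> str:
    """Look up the grounding mode configured for a content category."""
    return policies.get(category, default)
```

For example, a site could map a `medical` category to `KB_ONLY` while leaving marketing content on the default, so the same chatbot refuses in one area and explains freely in another.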

Knowledge Base ingestion lifecycle

A RAG system is only as useful as its ingestion lifecycle. For WordPress, that lifecycle starts in familiar places: posts, pages, custom post types, KB sections and editor-controlled source selection. The backend lifecycle starts when those documents are written to the S3 document bucket.

WordPress KB source selected or regenerated
        │
        ▼
Markdown document + optional metadata.json
        │
        ▼
S3 DocsBucket under documents/ prefix
        │
        ▼
S3 event or scheduled ingestion path
        │
        ▼
DynamoDB debounce lock/state
        │
        ▼
EventBridge Scheduler
        │
        ▼
KnowledgeBaseSyncFunction
        │
        ▼
Bedrock ingestion job updates Knowledge Base vectors
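As a rough illustration of the first two steps in the diagram, the sketch below prepares the S3 objects for one document and its metadata sidecar. The `documents/` prefix comes from the diagram; the `.metadata.json` suffix and the `metadataAttributes` wrapper follow the sidecar convention Bedrock Knowledge Bases use for S3 data sources. The function name, the slug-based key layout and the `category`/`tags` attributes are assumptions.

```python
import json


def kb_objects_for_post(
    slug: str,
    markdown: str,
    category: str,
    tags: list[str],
) -> dict[str, str]:
    """Return {s3_key: body} for one KB document and its metadata sidecar."""
    doc_key = f"documents/{slug}.md"  # key layout is hypothetical
    metadata = {"metadataAttributes": {"category": category, "tags": tags}}
    return {
        doc_key: markdown,
        # The sidecar sits next to the document and shares its full name.
        f"{doc_key}.metadata.json": json.dumps(metadata, sort_keys=True),
    }
```

Keeping the sidecar next to the document means a single publish action produces two object events, which is one reason the ingestion path should not react to each event individually.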

The debounce layer matters because publishing systems often update several files at once. Starting a full ingestion job for every small object event is noisy and expensive. A single sync worker with DynamoDB state and EventBridge scheduling gives the system room to batch changes into a more predictable ingestion rhythm.
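The batching idea can be shown with an in-memory model. This sketch uses a fixed-window variant: the first event schedules one sync, and later events inside the window ride along with it. It is an assumption about how such a loop could work, not the stack's actual logic; `DebounceState` stands in for the DynamoDB item and the returned run time stands in for an EventBridge Scheduler entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DebounceState:
    """Stand-in for the single DynamoDB item holding debounce state."""
    scheduled_at: Optional[float] = None  # when the pending sync will run
    pending_events: int = 0               # events absorbed into that sync


class IngestionDebouncer:
    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window_seconds = window_seconds
        self.state = DebounceState()

    def on_object_event(self, now: float) -> Optional[float]:
        """Record one S3 event; return a run time only if a sync must be scheduled."""
        self.state.pending_events += 1
        if self.state.scheduled_at is not None:
            return None  # a sync is already pending
        self.state.scheduled_at = now + self.window_seconds
        return self.state.scheduled_at

    def on_timer(self, now: float) -> int:
        """Called by the scheduler; return how many events this sync covers."""
        if self.state.scheduled_at is None or now < self.state.scheduled_at:
            return 0
        covered = self.state.pending_events
        self.state = DebounceState()  # release the lock for the next batch
        return covered
```

Three object events within a minute produce one scheduled sync that covers all three. In a real deployment the check-and-set in `on_object_event` would need to be an atomic conditional write so that concurrent Lambda invocations cannot both schedule a job.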

Static export friendly AI

Static WordPress complicates many traditional plugin assumptions. The public visitor cannot call a WordPress AJAX endpoint if WordPress is not in the request path. AI‑Kit handles this by making frontend AI calls browser-to-backend calls, not browser-to-WordPress-to-model calls.

Static-safe

No PHP proxy required

The browser calls the configured backend API directly. WordPress can be offline or private after publishing, as long as the static site and backend API are reachable.

Public-safe

No model keys in the browser

The frontend receives an API URL and feature configuration. Bedrock access remains in Lambda IAM permissions, not JavaScript source code.

Abuse-aware

Open endpoints need controls

When frontend auth mode is NONE, the architecture expects controls such as reCAPTCHA, WAF, rate limits and route-level feature toggles.
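The expectation in the last card can be turned into a pre-deployment check. The sketch below flags an anonymous public surface that lacks the controls named above. The function, its parameters and the uppercase spelling of the auth mode values are illustrative assumptions.

```python
VALID_AUTH_MODES = {"NONE", "IAM", "COGNITO"}  # value spelling is assumed


def frontend_exposure_warnings(
    auth_mode: str,
    enabled_features: set[str],
    recaptcha_enabled: bool,
    waf_enabled: bool,
) -> list[str]:
    """List missing controls for the public surface; empty means nothing flagged."""
    if auth_mode not in VALID_AUTH_MODES:
        raise ValueError(f"unknown auth mode: {auth_mode!r}")
    if not enabled_features:
        return []  # no /frontend/* routes are published, nothing to protect
    warnings: list[str] = []
    if auth_mode == "NONE":
        if not recaptcha_enabled:
            warnings.append("anonymous frontend routes without reCAPTCHA")
        if not waf_enabled:
            warnings.append("anonymous frontend routes without WAF rate rules")
    return warnings
```

A check like this fits naturally in a deployment script or CI step, so an open endpoint without protection is caught before the stack is created rather than after the first cost spike.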

Authentication and protection matrix

| Surface | Default posture | Possible auth modes | Recommended protection | Why |
| --- | --- | --- | --- | --- |
| Admin AI routes | Trusted/admin | IAM or Cognito | IAM by default, optional IP allow list, logs and alerts | These routes can expose broader capabilities and should not be public. |
| Frontend chatbot | Public or member-facing | NONE, IAM or Cognito | reCAPTCHA + WAF for public; Cognito scopes for member-only | Chat endpoints are cost-bearing and can receive arbitrary user input. |
| Frontend summarizer/language tools | Feature-gated public surface | NONE, IAM or Cognito | Only enable needed routes; throttle and validate payload size | Each route adds an abuse surface and a model-cost surface. |
| Image upload helper | Temporary asset ingress | Follows prompt surface | Strict content type, size, key prefix and lifecycle expiry | Presigned uploads are powerful and should be bounded tightly. |
| Knowledge Base management | Admin only | IAM or privileged Cognito | Never expose as anonymous frontend endpoint | KB selection and backend resources are configuration, not visitor UX. |
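The "bounded tightly" advice for the image upload helper can be sketched as validation that runs before any presigned URL is issued. Every concrete value here is a hypothetical example: the allowed types, the 5 MiB limit, the `temp-assets/` prefix and both function names are assumptions, not the stack's real settings.

```python
import uuid

# Hypothetical limits; a real stack would take these from configuration.
ALLOWED_IMAGE_TYPES = {"image/png": "png", "image/jpeg": "jpg", "image/webp": "webp"}
MAX_UPLOAD_BYTES = 5 * 1024 * 1024
TEMP_PREFIX = "temp-assets/"


def plan_upload(content_type: str, size_bytes: int) -> dict:
    """Validate an upload request and choose the object key server-side."""
    if content_type not in ALLOWED_IMAGE_TYPES:
        raise ValueError(f"content type not allowed: {content_type!r}")
    if not 0 < size_bytes <= MAX_UPLOAD_BYTES:
        raise ValueError("upload size out of bounds")
    # The caller never picks the key, so it cannot target KB documents.
    extension = ALLOWED_IMAGE_TYPES[content_type]
    key = f"{TEMP_PREFIX}{uuid.uuid4().hex}.{extension}"
    return {"key": key, "content_type": content_type, "max_bytes": MAX_UPLOAD_BYTES}


def is_acceptable_prompt_image_key(key: str) -> bool:
    """Prompt requests may only reference objects under the temp prefix."""
    return key.startswith(TEMP_PREFIX) and ".." not in key
```

The returned plan would then feed the presigned PUT generation, and the prompt route would apply `is_acceptable_prompt_image_key` to every key it receives. Lifecycle expiry is a bucket rule rather than handler code, so it is not shown here.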

Model selection and cost posture

The backend deliberately separates model choice from plugin code. Admin routes can default to a stronger model, while frontend routes can use a lighter model first and fall back according to stack parameters. That matters for agencies because public chatbot usage, editor features and document search usually have different cost profiles.

| Workload | Cost pressure | Quality requirement | Architecture choice |
| --- | --- | --- | --- |
| Editor rewrite / translate / proofread | Usually moderate and admin-controlled | Consistent output and low friction | Try local AI first, use backend fallback when unavailable or when policy requires backend processing. |
| Frontend chatbot | Potentially high because visitors can trigger usage | Grounded, safe, understandable answers | Enable only needed routes, use lighter frontend model, reCAPTCHA/WAF, and bounded response settings. |
| DocSearch | Depends on search volume and retrieved context | Good citations and accurate source selection | RAG-first approach; optional rerank only where relevance improvement justifies extra calls. |
| Multimodal prompt | Higher because images add storage and processing | Useful for selected support or content workflows | Use presigned S3 uploads, object size limits and lifecycle expiration. |
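Parameterized model selection with a frontend fallback could look roughly like this. The parameter names `AdminModelId`, `FrontendModelId` and `FrontendFallbackModelId` are hypothetical, as are both functions; the model IDs in any usage are placeholders rather than real Bedrock identifiers.

```python
from typing import Callable


def model_candidates(surface: str, params: dict[str, str]) -> list[str]:
    """Ordered model IDs to try for a surface ("admin" or "frontend")."""
    if surface == "admin":
        keys = ["AdminModelId"]
    elif surface == "frontend":
        keys = ["FrontendModelId", "FrontendFallbackModelId"]
    else:
        raise ValueError(f"unknown surface: {surface!r}")
    candidates: list[str] = []
    for key in keys:
        model_id = params.get(key)
        if model_id and model_id not in candidates:
            candidates.append(model_id)
    return candidates


def invoke_with_fallback(
    candidates: list[str],
    invoke: Callable[[str], str],
) -> tuple[str, str]:
    """Try each model in order; return (model_id, response) from the first success."""
    last_error = None
    for model_id in candidates:
        try:
            return model_id, invoke(model_id)
        except Exception as error:  # broad on purpose in this sketch
            last_error = error
    raise RuntimeError("no model produced a response") from last_error
```

Because the caller supplies `invoke`, the same ordering logic works with a real Bedrock client or a test double, and the model that actually answered is returned for cost metrics.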

How the deploy wizard changes the operating model

The AI‑Kit backend is easier to explain when the Deployment Wizard is treated as part of the architecture, not just a helper page. It collects the few decisions that matter to a WordPress administrator (frontend feature set, auth mode and optional protections) and leaves developer-controlled values such as deployment version outside the normal UI.

| Wizard decision | CloudFormation effect | WordPress effect |
| --- | --- | --- |
| Frontend features | Enables only the backend routes required for chatbot, DocSearch, summarization or language tools. | The plugin can expose only the surfaces the site actually wants to support. |
| Admin auth mode | Defaults toward Cognito for admin operations and can enforce scopes. | Admin/backend actions are not accidentally treated as public AI calls. |
| Template source | Uses an S3 template URL that CloudFormation can read. | The user sees an AWS-native stack review flow instead of a hidden SaaS provisioning step. |
| Outputs | Produces ApiBaseUrl and other stack outputs. | WordPress stores the endpoint contract; it does not own the backend runtime. |
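A prefilled stack review link of the kind the wizard opens can be assembled from the template URL and the chosen parameters. The sketch below uses the CloudFormation console's quick-create link format (`templateURL`, `stackName` and `param_`-prefixed parameters); the function name is an assumption, and any template URL or stack name passed to it in practice would come from the wizard, not from this example.

```python
from urllib.parse import quote, urlencode


def quick_create_url(
    region: str,
    template_url: str,
    stack_name: str,
    parameters: dict[str, str],
) -> str:
    """Build a CloudFormation console link that opens the stack review page."""
    query = {"templateURL": template_url, "stackName": stack_name}
    # Each stack parameter is passed with a "param_" prefix.
    query.update({f"param_{name}": value for name, value in parameters.items()})
    return (
        "https://console.aws.amazon.com/cloudformation/home"
        f"?region={quote(region)}#/stacks/create/review?{urlencode(query)}"
    )
```

The link only prefills the review form. The administrator still reads the parameters and confirms stack creation inside their own AWS console, which is what keeps provisioning visible.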

Operational runbook

  1. Use the AI‑Kit Deployment Wizard to select only the frontend features, auth modes and optional protections the site actually needs, then open the prefilled CloudFormation Create stack review URL.
  2. Choose an auth mode per surface: IAM or Cognito for admin; for frontend, either NONE with reCAPTCHA/WAF, or Cognito.
  3. Copy the ApiBaseUrl stack output into WordPress → AI‑Kit Settings → API Settings, then connect any additional feature-specific outputs or identifiers required by the chosen backend mode.
  4. Define KB metadata categories, subcategories and tags before publishing many documents.
  5. Configure grounding policy for categories where hallucination risk matters.
  6. Publish KB documents from WordPress and run or schedule ingestion.
  7. Test editor, static frontend and authenticated/member flows separately.
  8. Watch CloudWatch logs, metrics, DLQ and model-call latency before increasing public exposure.

What makes this different from a typical AI plugin

Typical plugin

Vendor-owned runtime

Requests often leave WordPress for a shared SaaS endpoint, with limited visibility into model choice, logs, guardrails or retrieval architecture.

Local-only plugin

Great when available

Browser AI can be privacy-friendly and cheap, but availability, capability and frontend production needs vary across devices and browsers.

WP Suite pattern

Local-first + owned backend

Use on-device AI where it works, then fall back to a customer-owned AWS backend for RAG, public widgets, admin tooling and governed model access.

When this architecture is a good fit

  • WordPress sites that need AI features without sending all content through a shared plugin SaaS backend.
  • Static WordPress frontends that still need chatbot, DocSearch or frontend AI capabilities.
  • Agencies building repeatable private AI deployments for clients with different content governance requirements.
  • Documentation, support, healthcare-adjacent, legal-adjacent or enterprise sites where grounding policy matters.
  • Teams that want AWS logs, IAM, WAF, guardrails, S3 document storage and model access in the customer account.

When not to use it

  • The site only needs occasional editor-side rewriting and local browser AI already covers the workflow.
  • The team does not want to own AWS deployment, monitoring, WAF rules, ingestion and model-cost governance.
  • The chatbot can safely be a simple static FAQ with no retrieval, no personalization and no public AI endpoint.
  • The content taxonomy is too messy to define useful KB categories, metadata or grounding behavior.

Pillar

WordPress on AWS Reference Architecture

The broader WP Suite model for content, delivery, identity, runtime APIs, AI and workflows.

Runtime split

Static WordPress with Dynamic Runtime on AWS

How static delivery and browser-side runtime calls fit together across identity, AI, forms and custom APIs.

Solution

Private AI for WordPress

The less technical business and implementation framing for privacy-first AI features in WordPress.

FAQ

Does AI‑Kit always send content to AWS?

No. AI‑Kit is local-first where supported browser AI is available. The backend is used for Pro backend-only or fallback modes, frontend chatbot/DocSearch features, and workloads that need Bedrock, RAG, guardrails or server-side processing.

Can this work on a static WordPress site?

Yes. Frontend AI features call the configured backend API directly from the browser. WordPress does not need to proxy the request at page-view time, which makes the architecture compatible with static exports when CORS, auth and feature toggles are configured correctly.

Why separate admin and frontend endpoints?

Admin features and public visitor features have different trust boundaries. Admin routes can require IAM or privileged Cognito access, while frontend routes may need reCAPTCHA, WAF, throttling or member authentication. Keeping the surfaces separate makes the risk model clear.

Is the Knowledge Base mandatory?

No. The backend can run generation and language tasks without KB retrieval. A Knowledge Base becomes important when answers must be grounded in client documents, product docs, policies, support content or a curated public knowledge source.

Why not call model APIs directly from JavaScript?

Because browser-side model keys are not safe, public endpoints need abuse controls, and RAG requires backend orchestration. The backend keeps provider access, retrieval, guardrails, logs and cost controls behind an AWS service boundary.

Run WordPress AI where the trust boundary belongs.

Use local AI when it is enough, and deploy a customer-owned AWS backend when your WordPress site needs governed model access, RAG, citations and static-friendly frontend AI.