Private AI and RAG Backend Architecture for WordPress
A deep dive into the AI‑Kit backend: local-first WordPress AI, backend fallback, Bedrock-powered generation, Knowledge Base retrieval, frontend/admin API separation and static-export-friendly endpoint protection.
Architecture thesis: AI for WordPress should not be a black-box SaaS proxy hidden behind a plugin button. The safer pattern is a local-first plugin experience with an optional customer-owned AWS backend for model calls, RAG, citations, guardrails, reCAPTCHA, WAF, logs and knowledge-base ingestion.
Why WordPress AI needs an architecture, not only a prompt box
Adding an AI button to WordPress is easy. Designing where content is processed, which endpoints are public, how grounding works, how model costs are controlled, and how a static frontend can call the backend safely is the hard part.
AI‑Kit starts with the least invasive path: local, on-device browser AI when available. That works well for editor-side rewriting, translation, metadata generation and lightweight tasks. But production websites often need more: fallback when local AI is unavailable, frontend chatbots, DocSearch, multimodal prompts, citations, knowledge-base grounding and protection against public endpoint abuse.
The AI‑Kit backend is the serverless side of that model. It moves backend AI execution into the customer’s AWS account instead of routing every request through the WordPress server or a shared plugin vendor runtime.
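The local-first rule described above can be sketched as a small decision function. This is a minimal illustration, not AI‑Kit's actual runtime logic: the `Capabilities` fields, the `ExecutionMode` names and the `backend_only_policy` flag are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionMode(Enum):
    LOCAL = "local"              # on-device browser AI
    BACKEND = "backend"          # customer-owned AWS backend
    UNAVAILABLE = "unavailable"  # neither path can serve the task


@dataclass(frozen=True)
class Capabilities:
    # All three fields are hypothetical names used only for this sketch.
    local_ai_available: bool           # browser reports on-device AI for the task
    backend_configured: bool           # a backend API URL is stored in settings
    backend_only_policy: bool = False  # site policy forces backend processing


def choose_execution_mode(caps: Capabilities) -> ExecutionMode:
    """Prefer local AI, fall back to the backend, honour a backend-only policy."""
    if caps.backend_only_policy:
        return ExecutionMode.BACKEND if caps.backend_configured else ExecutionMode.UNAVAILABLE
    if caps.local_ai_available:
        return ExecutionMode.LOCAL
    if caps.backend_configured:
        return ExecutionMode.BACKEND
    return ExecutionMode.UNAVAILABLE
```

Keeping the decision in one place makes the fallback order explicit: the backend is reached only when local AI is missing or when policy requires server-side processing.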
System boundary
WordPress editor / Media Library / frontend blocks
│
├─ local mode when browser AI is available
│
▼
AI-Kit JavaScript runtime
│
├─ /admin/* routes for WP Admin and trusted operations
└─ /frontend/* routes only when public features are enabled
│
▼
Amazon API Gateway
│
▼
AiHandlerFunction
prompt, writer, rewriter, summarizer,
translator, language detector, proofreader,
upload URL helper and knowledge-base listing
│
┌────────┴────────┐
▼                 ▼
Amazon Bedrock    Knowledge Base pipeline
Nova models       S3 documents + S3 Vectors
Guardrails        DynamoDB debounce state
                  EventBridge scheduler
                  KnowledgeBaseSyncFunction
The frontend does not need model provider keys. WordPress does not need to proxy prompts through PHP. Static exports can still use frontend routes when those routes are explicitly enabled and protected with the selected public-access model.
What the AI‑Kit backend stack provisions
| Building block | Purpose | Key design choice | Operational note |
|---|---|---|---|
| API Gateway REST API | Exposes admin and optional frontend AI routes | /admin/* is always the trusted surface; /frontend/* appears only when feature toggles enable it | Keep public and privileged routes separate even when they share handler code. |
| AiHandlerFunction | Unified Lambda handler for prompt and language capabilities | One warm execution path for prompt, write, rewrite, summarize, translate, proofread, detect-language and KB listing | Shared validation, metrics, guardrails and middleware reduce operational spread. |
| Amazon Bedrock models | Runs generation, translation-style tasks and RAG answer synthesis | Frontend model can be cheaper/lighter than admin model through parameterized model selection | Model IDs are architecture parameters, not hard-coded plugin assumptions. |
| Bedrock Knowledge Base + S3 Vectors | Provides managed retrieval over documentation or client content | Create a new KB or reuse an existing one via stack parameters | KnowledgeBaseId and DataSourceId outputs are part of the integration contract. |
| Docs and temp assets S3 bucket | Stores KB documents and temporary prompt image uploads | Separate document prefix, configuration prefix and temp-assets prefix | Temp image objects should expire quickly; KB documents should be versioned deliberately. |
| KnowledgeBaseSyncFunction | Runs document ingestion workflows | S3/EventBridge/DynamoDB debounce loop avoids starting ingestion for every small file event | Useful when WordPress regenerates multiple KB documents during publishing. |
| reCAPTCHA + SSM/KMS | Protects open frontend endpoints | Secret stored as encrypted SSM parameter and fetched by handlers | Important when FrontendApiAuthMode is NONE for static-site friendliness. |
| AWS WAF and throttling | Limits abuse on public/admin API paths | Separate allow/deny/rate rules for frontend and admin surfaces | AI endpoints are cost-bearing; public access needs protection beyond CORS. |
| CloudWatch, DLQ and alerts | Provides logs, metrics, failed invocation capture and optional notifications | Per-function log retention, custom metrics and shared SQS DLQ | AI features need observability because cost, latency and quality are all runtime concerns. |
Endpoint surface: one backend, two trust zones
The stack’s most important product decision is the separation between admin and frontend routes. The same capability may exist in both zones, but the trust model is different.
| Capability | Admin route | Frontend route | When frontend is created | Design note |
|---|---|---|---|---|
| Prompt / chatbot / DocSearch | /admin/prompt | /frontend/prompt | EnableChatbotBackend=true | Can use KB, citations, regeneration, feedback metadata and optional image inputs. |
| Generate upload URL | /admin/generate-upload-url | /frontend/generate-upload-url | EnableChatbotBackend=true | Uploads images to S3 via presigned PUT, then passes keys to prompt requests. |
| Summarize | /admin/summarize | /frontend/summarize | EnableSummarizerBackend=true | Summarization usually disables KB by default because the source text is already supplied. |
| Writer | /admin/write | /frontend/write | EnableLanguageAIBackend=true | Useful as backend fallback for editor or frontend generation features. |
| Rewriter | /admin/rewrite | /frontend/rewrite | EnableLanguageAIBackend=true | Honors tone, format and length controls while keeping secrets out of the browser. |
| Translator | /admin/translate | /frontend/translate | EnableLanguageAIBackend=true | Can be paired with automatic language detection flows. |
| Proofreader | /admin/proofread | /frontend/proofread | EnableLanguageAIBackend=true | Returns corrected text and structured correction metadata. |
| Language detector | /admin/detect-language | /frontend/detect-language | EnableLanguageAIBackend=true | Uses a backend language detection path instead of assuming the browser always provides it. |
| Knowledge bases | /admin/knowledge-bases | Not a public route | Admin only | Listing and selecting backend knowledge resources belongs to trusted configuration UX. |
This is why “AI backend” is not one checkbox. Chat, summarization and language tools have different exposure, cost and abuse profiles. The stack lets them be enabled independently instead of publishing every route by default.
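The table above maps directly onto a route-building rule. The sketch below takes the toggle names and route paths from the table. The `build_routes` function and its data structures are illustrative assumptions, not the stack's real template logic.

```python
# Frontend routes grouped by the stack parameter that enables them (from the table above).
FRONTEND_ROUTES_BY_TOGGLE = {
    "EnableChatbotBackend": ["prompt", "generate-upload-url"],
    "EnableSummarizerBackend": ["summarize"],
    "EnableLanguageAIBackend": [
        "write", "rewrite", "translate", "proofread", "detect-language",
    ],
}

# Admin routes always exist; knowledge-bases is admin only.
ADMIN_ROUTES = [
    "prompt", "generate-upload-url", "summarize", "write", "rewrite",
    "translate", "proofread", "detect-language", "knowledge-bases",
]


def build_routes(toggles: dict) -> list:
    """Return every route the API should expose for the given feature toggles."""
    routes = [f"/admin/{name}" for name in ADMIN_ROUTES]
    for toggle, names in FRONTEND_ROUTES_BY_TOGGLE.items():
        if toggles.get(toggle, False):  # missing toggles default to disabled
            routes.extend(f"/frontend/{name}" for name in names)
    return routes
```

With only `EnableChatbotBackend` set, the result contains the nine admin routes plus `/frontend/prompt` and `/frontend/generate-upload-url`, and nothing else is published.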
The RAG pipeline
User question or DocSearch query
│
▼
Prompt route receives request
│
├─ validate auth / reCAPTCHA / WAF path expectations
├─ apply guardrail and input checks
│
▼
Query builder predicts category, subcategory and tags
│
▼
Bedrock Knowledge Base retrieval
│
├─ optional strict metadata filtering
├─ optional rerank
└─ retrieved snippets with citation spans
│
▼
Answer template selection
├─ KB_ONLY
├─ ASK_WHEN_NO_KB
└─ KB_PREFERRED
│
▼
Bedrock generation with citations and grounding policy
│
▼
AI-Kit renders answer, sources and highlights in WordPress UI
The key point is that RAG is not just “send documents to a model.” The backend has to decide what sources are allowed, how categories map to metadata, whether an answer may fall back to general knowledge, and how citation spans are returned to the WordPress interface.
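One way to picture the step from predicted category to strict metadata filtering is a helper that builds the retrieval configuration. The sketch assumes the documents carry `category` and `subcategory` metadata keys and that the filter uses the `equals` / `andAll` shape of the Bedrock Knowledge Base retrieval filter. The function names and the `top_k` default are hypothetical.

```python
from typing import Optional


def build_retrieval_filter(category: str, subcategory: Optional[str] = None) -> dict:
    """Build a metadata filter from the query builder's predictions."""
    conditions = [{"equals": {"key": "category", "value": category}}]
    if subcategory:
        conditions.append({"equals": {"key": "subcategory", "value": subcategory}})
    # andAll is only meaningful with two or more conditions,
    # so a single condition is returned on its own.
    if len(conditions) == 1:
        return conditions[0]
    return {"andAll": conditions}


def build_retrieval_config(
    category: Optional[str],
    subcategory: Optional[str] = None,
    strict: bool = True,
    top_k: int = 5,  # assumed default, not taken from the source
) -> dict:
    """Return a retrieval configuration, adding a filter only in strict mode."""
    vector_search = {"numberOfResults": top_k}
    if strict and category:
        vector_search["filter"] = build_retrieval_filter(category, subcategory)
    return {"vectorSearchConfiguration": vector_search}
```

When strict filtering is off, the predicted category can still inform prompt templates, but retrieval searches the whole Knowledge Base.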
Grounding policy is a product decision
Different content categories should not all behave the same way. A product documentation chatbot can answer from general knowledge when asked about a generic concept. A medical, legal, financial or internal policy assistant may need to refuse when the knowledge base does not contain an answer.
| Grounding mode | When to use | Behavior when KB has no relevant snippets | Editorial implication |
|---|---|---|---|
| KB_ONLY | Regulated, high-stakes or strict documentation answers | State that the documentation does not contain the requested information | Authors must keep the KB complete enough for expected questions. |
| ASK_WHEN_NO_KB | Ambiguous source sets or category-dependent answers | Ask one clarification question instead of guessing | Metadata taxonomy becomes part of UX design. |
| KB_PREFERRED | Marketing, product education and general support | Use KB when available; otherwise answer with clear separation from retrieved docs | Good balance for public websites that mix documentation and general explanation. |
AI‑Kit exposes this through knowledge-base configuration rather than hard-coding one behavior for every site. Category policies, strict metadata filtering and prompt templates turn content governance into an operational layer that editors and developers can understand.
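The three grounding modes in the table can be read as a small decision rule over the retrieved snippets. The sketch below uses the mode names from the table. The `AnswerAction` values and the `decide_action` function are hypothetical names for illustration.

```python
from enum import Enum


class GroundingMode(Enum):
    KB_ONLY = "KB_ONLY"
    ASK_WHEN_NO_KB = "ASK_WHEN_NO_KB"
    KB_PREFERRED = "KB_PREFERRED"


class AnswerAction(Enum):
    ANSWER_FROM_KB = "answer_from_kb"                  # grounded answer with citations
    STATE_NOT_IN_DOCS = "state_not_in_docs"            # say the docs do not cover it
    ASK_CLARIFICATION = "ask_clarification"            # ask one clarifying question
    ANSWER_GENERAL_LABELED = "answer_general_labeled"  # general answer, clearly separated


def decide_action(mode: GroundingMode, snippets: list) -> AnswerAction:
    """Pick the answer behaviour for a grounding mode and the retrieved snippets."""
    if snippets:
        # Every mode answers from the Knowledge Base when relevant snippets exist.
        return AnswerAction.ANSWER_FROM_KB
    if mode is GroundingMode.KB_ONLY:
        return AnswerAction.STATE_NOT_IN_DOCS
    if mode is GroundingMode.ASK_WHEN_NO_KB:
        return AnswerAction.ASK_CLARIFICATION
    return AnswerAction.ANSWER_GENERAL_LABELED
```

The modes differ only in the empty-retrieval branch, which is why the choice can be made per category without changing the retrieval step.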
Knowledge Base ingestion lifecycle
A RAG system is only as useful as its ingestion lifecycle. For WordPress, that lifecycle starts in familiar places: posts, pages, custom post types, KB sections and editor-controlled source selection. The backend lifecycle starts when those documents are written to the S3 document bucket.
WordPress KB source selected or regenerated
│
▼
Markdown document + optional metadata.json
│
▼
S3 DocsBucket under documents/ prefix
│
▼
S3 event or scheduled ingestion path
│
▼
DynamoDB debounce lock/state
│
▼
EventBridge Scheduler
│
▼
KnowledgeBaseSyncFunction
│
▼
Bedrock ingestion job updates Knowledge Base vectors
The debounce layer matters because publishing systems often update several files at once. Starting a full ingestion job for every small object event is noisy and expensive. A single sync worker with DynamoDB state and EventBridge scheduling gives the system room to batch changes into a more predictable ingestion rhythm.
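The debounce idea can be shown with an in-memory stand-in for the DynamoDB state. This is a sketch of the batching behaviour only: the 120-second window, the `DebounceState` fields and both function names are assumptions. A real implementation would need an atomic conditional write in place of the plain attribute check.

```python
from dataclasses import dataclass
from typing import Optional

DEBOUNCE_SECONDS = 120.0  # hypothetical batching window


@dataclass
class DebounceState:
    """In-memory stand-in for the debounce item kept in DynamoDB."""
    pending_run_at: Optional[float] = None  # when the scheduled sync will fire
    coalesced_events: int = 0               # document events since the last sync


def on_document_event(state: DebounceState, now: float) -> Optional[float]:
    """Handle one S3 document event.

    Returns the time at which a sync should be scheduled, or None when an
    already scheduled sync will cover this event.
    """
    state.coalesced_events += 1
    if state.pending_run_at is not None and state.pending_run_at > now:
        return None
    state.pending_run_at = now + DEBOUNCE_SECONDS
    return state.pending_run_at


def on_sync_fired(state: DebounceState) -> int:
    """Reset the state when the sync worker starts an ingestion job.

    Returns how many document events the job covers.
    """
    covered = state.coalesced_events
    state.coalesced_events = 0
    state.pending_run_at = None
    return covered
```

Three document events arriving within the window produce a single scheduled sync, and the ingestion job then covers all three.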
Static export friendly AI
Static WordPress complicates many traditional plugin assumptions. The public visitor cannot call a WordPress AJAX endpoint if WordPress is not in the request path. AI‑Kit handles this by making frontend AI calls browser-to-backend calls, not browser-to-WordPress-to-model calls.
Static-safe
No PHP proxy required
The browser calls the configured backend API directly. WordPress can be offline or private after publishing, as long as the static site and backend API are reachable.
Public-safe
No model keys in the browser
The frontend receives an API URL and feature configuration. Bedrock access remains in Lambda IAM permissions, not JavaScript source code.
Abuse-aware
Open endpoints need controls
When frontend auth mode is NONE, the architecture expects controls such as reCAPTCHA, WAF, rate limits and route-level feature toggles.
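A handler-side admission check for frontend requests might look like the sketch below. It assumes the auth modes named in this article (NONE, IAM, Cognito) and reduces reCAPTCHA verification and identity validation to booleans computed elsewhere. The function name and the reason strings are hypothetical.

```python
def admit_frontend_request(
    auth_mode: str,
    route_enabled: bool,
    recaptcha_verified: bool = False,
    has_valid_identity: bool = False,
) -> tuple:
    """Return (allowed, reason) for a request to a /frontend/* route."""
    if not route_enabled:
        # Feature toggles come first: a disabled route is never served.
        return (False, "route disabled")
    if auth_mode == "NONE":
        # Open endpoints rely on reCAPTCHA; WAF and throttling sit in front.
        if not recaptcha_verified:
            return (False, "recaptcha required")
        return (True, "ok")
    if auth_mode in ("IAM", "COGNITO"):
        if not has_valid_identity:
            return (False, "authentication required")
        return (True, "ok")
    raise ValueError(f"unknown auth mode: {auth_mode}")
```

The check deliberately fails closed: an unknown auth mode raises instead of letting the request through.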
Authentication and protection matrix
| Surface | Default posture | Possible auth modes | Recommended protection | Why |
|---|---|---|---|---|
| Admin AI routes | Trusted/admin | IAM or Cognito | IAM by default, optional IP allow list, logs and alerts | These routes can expose broader capabilities and should not be public. |
| Frontend chatbot | Public or member-facing | NONE, IAM or Cognito | reCAPTCHA + WAF for public; Cognito scopes for member-only | Chat endpoints are cost-bearing and can receive arbitrary user input. |
| Frontend summarizer/language tools | Feature-gated public surface | NONE, IAM or Cognito | Only enable needed routes; throttle and validate payload size | Each route adds an abuse surface and a model-cost surface. |
| Image upload helper | Temporary asset ingress | Follows prompt surface | Strict content type, size, key prefix and lifecycle expiry | Presigned uploads are powerful and should be bounded tightly. |
| Knowledge Base management | Admin only | IAM or privileged Cognito | Never expose as anonymous frontend endpoint | KB selection and backend resources are configuration, not visitor UX. |
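The bounds on the image upload helper can be expressed as a validation step that runs before any presigned URL is issued. The allowed content types, the 5 MiB limit and the `temp-assets/` prefix below are assumed values for illustration, not the stack's actual configuration.

```python
# Assumed limits for this sketch; real values would come from stack configuration.
ALLOWED_CONTENT_TYPES = {"image/png", "image/jpeg", "image/webp"}
MAX_UPLOAD_BYTES = 5 * 1024 * 1024
TEMP_PREFIX = "temp-assets/"


def validate_upload_request(content_type: str, size_bytes: int, key: str) -> list:
    """Return a list of problems; an empty list means a URL may be issued."""
    problems = []
    if content_type not in ALLOWED_CONTENT_TYPES:
        problems.append(f"content type not allowed: {content_type}")
    if size_bytes <= 0 or size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"size out of bounds: {size_bytes}")
    # Keep uploads inside the temporary prefix and reject path traversal,
    # so a visitor can never write into the KB documents prefix.
    if not key.startswith(TEMP_PREFIX) or ".." in key:
        problems.append(f"key outside temp prefix: {key}")
    return problems
```

Pairing this check with a short lifecycle expiry on the temporary prefix keeps uploaded images from accumulating after the prompt request completes.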
Model selection and cost posture
The backend deliberately separates model choice from plugin code. Admin routes can default to a stronger model, while frontend routes can use a lighter model first and fall back according to stack parameters. That matters for agencies because public chatbot usage, editor features and document search usually have different cost profiles.
| Workload | Cost pressure | Quality requirement | Architecture choice |
|---|---|---|---|
| Editor rewrite / translate / proofread | Usually moderate and admin-controlled | Consistent output and low friction | Try local AI first, use backend fallback when unavailable or when policy requires backend processing. |
| Frontend chatbot | Potentially high because visitors can trigger usage | Grounded, safe, understandable answers | Enable only needed routes, use lighter frontend model, reCAPTCHA/WAF, and bounded response settings. |
| DocSearch | Depends on search volume and retrieved context | Good citations and accurate source selection | RAG-first approach; optional rerank only where relevance improvement justifies extra calls. |
| Multimodal prompt | Higher because images add storage and processing | Useful for selected support or content workflows | Use presigned S3 uploads, object size limits and lifecycle expiration. |
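Parameterized model selection can be sketched as a lookup that returns an ordered list of model candidates per surface. The parameter names `AdminModelId`, `FrontendModelId` and `FrontendFallbackModelId` and the placeholder model IDs are hypothetical. The real stack parameters may be named differently.

```python
def model_candidates(surface: str, params: dict) -> list:
    """Return model IDs to try, in order, for the 'admin' or 'frontend' surface."""
    if surface == "admin":
        return [params["AdminModelId"]]
    if surface == "frontend":
        candidates = [params["FrontendModelId"]]
        fallback = params.get("FrontendFallbackModelId")
        # Add the fallback only when it is set and differs from the primary.
        if fallback and fallback not in candidates:
            candidates.append(fallback)
        return candidates
    raise ValueError(f"unknown surface: {surface}")


# Placeholder IDs, not real Bedrock model identifiers.
example_params = {
    "AdminModelId": "example.strong-model",
    "FrontendModelId": "example.light-model",
    "FrontendFallbackModelId": "example.strong-model",
}
```

Because the IDs live in configuration, changing the frontend model is a stack update, not a plugin release.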
How the deploy wizard changes the operating model
The AI‑Kit backend is easier to explain when the Deployment Wizard is treated as part of the architecture, not just a helper page. It collects the few decisions that matter to a WordPress administrator — frontend feature set, auth mode and optional protections — and leaves developer-controlled values such as deployment version outside the normal UI.
| Wizard decision | CloudFormation effect | WordPress effect |
|---|---|---|
| Frontend features | Enables only the backend routes required for chatbot, DocSearch, summarization or language tools. | The plugin can expose only the surfaces the site actually wants to support. |
| Admin auth mode | Selects IAM or Cognito for admin operations; Cognito can also enforce scopes. | Admin/backend actions are not accidentally treated as public AI calls. |
| Template source | Uses an S3 template URL that CloudFormation can read. | The user sees an AWS-native stack review flow instead of a hidden SaaS provisioning step. |
| Outputs | Produces ApiBaseUrl and other stack outputs. | WordPress stores the endpoint contract; it does not own the backend runtime. |
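The wizard decisions above end in a prefilled CloudFormation review link. The sketch below builds one using the console's quick-create URL pattern, where stack parameters are passed as `param_<Name>` query values. The template URL, stack name and region are placeholders. The parameter names are the ones mentioned in this article, and whether the wizard builds its link exactly this way is an assumption.

```python
from urllib.parse import quote, urlencode


def build_quick_create_url(
    region: str, template_url: str, stack_name: str, parameters: dict
) -> str:
    """Build a CloudFormation 'create stack review' link with prefilled parameters."""
    query = {"templateURL": template_url, "stackName": stack_name}
    for name, value in parameters.items():
        query[f"param_{name}"] = value
    # quote_via=quote percent-encodes every reserved character in the values.
    encoded = urlencode(query, quote_via=quote)
    return (
        f"https://console.aws.amazon.com/cloudformation/home?region={region}"
        f"#/stacks/create/review?{encoded}"
    )


# Placeholder values for illustration only.
url = build_quick_create_url(
    region="eu-west-1",
    template_url="https://example-bucket.s3.amazonaws.com/ai-kit/template.yaml",
    stack_name="ai-kit-backend",
    parameters={
        "EnableChatbotBackend": "true",
        "EnableSummarizerBackend": "false",
        "EnableLanguageAIBackend": "false",
        "FrontendApiAuthMode": "NONE",
    },
)
```

The administrator still reviews and confirms the stack in the AWS console, so the link only prefills choices and provisions nothing by itself.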
Operational runbook
- Use the AI‑Kit Deployment Wizard to select only the frontend features, auth modes and optional protections the site actually needs, then open the prefilled CloudFormation Create stack review URL.
- Choose auth mode per surface: IAM/Cognito for admin, NONE + reCAPTCHA/WAF or Cognito for frontend.
- Copy the ApiBaseUrl stack output into WordPress → AI‑Kit Settings → API Settings, then connect any additional feature-specific outputs or identifiers required by the chosen backend mode.
- Define KB metadata categories, subcategories and tags before publishing many documents.
- Configure grounding policy for categories where hallucination risk matters.
- Publish KB documents from WordPress and run or schedule ingestion.
- Test editor, static frontend and authenticated/member flows separately.
- Watch CloudWatch logs, metrics, DLQ and model-call latency before increasing public exposure.
What makes this different from a typical AI plugin
Typical plugin
Vendor-owned runtime
Requests often leave WordPress for a shared SaaS endpoint, with limited visibility into model choice, logs, guardrails or retrieval architecture.
Local-only plugin
Great when available
Browser AI can be privacy-friendly and cheap, but availability, capability and frontend production needs vary across devices and browsers.
WP Suite pattern
Local-first + owned backend
Use on-device AI where it works, then fall back to a customer-owned AWS backend for RAG, public widgets, admin tooling and governed model access.
When this architecture is a good fit
- WordPress sites that need AI features without sending all content through a shared plugin SaaS backend.
- Static WordPress frontends that still need chatbot, DocSearch or frontend AI capabilities.
- Agencies building repeatable private AI deployments for clients with different content governance requirements.
- Documentation, support, healthcare-adjacent, legal-adjacent or enterprise sites where grounding policy matters.
- Teams that want AWS logs, IAM, WAF, guardrails, S3 document storage and model access in the customer account.
When not to use it
- The site only needs occasional editor-side rewriting and local browser AI already covers the workflow.
- The team does not want to own AWS deployment, monitoring, WAF rules, ingestion and model-cost governance.
- The chatbot can safely be a simple static FAQ with no retrieval, no personalization and no public AI endpoint.
- The content taxonomy is too messy to define useful KB categories, metadata or grounding behavior.
FAQ
Does AI‑Kit always send content to AWS?
No. AI‑Kit is local-first where supported browser AI is available. The backend is used for Pro backend-only or fallback modes, frontend chatbot/DocSearch features, and workloads that need Bedrock, RAG, guardrails or server-side processing.
Can this work on a static WordPress site?
Yes. Frontend AI features call the configured backend API directly from the browser. WordPress does not need to proxy the request at page-view time, which makes the architecture compatible with static exports when CORS, auth and feature toggles are configured correctly.
Why separate admin and frontend endpoints?
Admin features and public visitor features have different trust boundaries. Admin routes can require IAM or privileged Cognito access, while frontend routes may need reCAPTCHA, WAF, throttling or member authentication. Keeping the surfaces separate makes the risk model clear.
Is the Knowledge Base mandatory?
No. The backend can run generation and language tasks without KB retrieval. A Knowledge Base becomes important when answers must be grounded in client documents, product docs, policies, support content or a curated public knowledge source.
Why not call model APIs directly from JavaScript?
Because browser-side model keys are not safe, public endpoints need abuse controls, and RAG requires backend orchestration. The backend keeps provider access, retrieval, guardrails, logs and cost controls behind an AWS service boundary.
Run WordPress AI where the trust boundary belongs.
Use local AI when it is enough, and deploy a customer-owned AWS backend when your WordPress site needs governed model access, RAG, citations and static-friendly frontend AI.
