AI Agent — Implementation & GCP Cloud Run Deployment
This page documents the implementation architecture and GCP Cloud Run deployment of the Med-SEAL AI Agent — the multi-agent clinical reasoning service that powers the platform’s 18 smart healthcare features.
1. Implementation Overview
1.1 Technology Stack
| Component | Technology |
|---|---|
| Runtime | Python 3.11 |
| Web framework | FastAPI + Uvicorn |
| Agent framework | LangGraph (LangChain) |
| Clinical LLM | SEA-LION v4-32B (AI Singapore) |
| Conversation model | SEA-LION v4 (AI Singapore) |
| Safety guard | SEA-Guard (AI Singapore) |
| FHIR client | Medplum (custom `httpx` client) |
| Session persistence | SQLite (via `aiosqlite`) |
| Containerisation | Docker (`python:3.11-slim` base) |
| Cloud hosting | GCP Cloud Run (Singapore, `asia-southeast1`) |
| Infrastructure | GKE (Medplum FHIR, OpenEMR, frontend apps) |
1.2 Agent Roster
The system builds and registers 7 LangGraph agent graphs at startup:
| Agent ID | Module | Purpose |
|---|---|---|
| `companion` (A1) | `agents/companion.py` | Patient-facing conversational interface |
| `clinical` (A2) | `agents/clinical.py` | Clinical reasoning (drug interactions, lab interpretation) |
| `doctor_cds` | `agents/doctor_cds.py` | OpenEMR clinician chat & decision support |
| `nudge` (A3) | `agents/nudge.py` | Proactive reminders and escalation |
| `lifestyle` (A4) | `agents/lifestyle.py` | Dietary and lifestyle coaching |
| `previsit` | `agents/previsit.py` | Pre-visit brief synthesis |
| `insight` (A5) | `agents/insight.py` | FHIR-based pre-visit data aggregation |
Each agent is a compiled LangGraph `StateGraph` with its own tools, system prompt, and checkpointer.
1.3 LLM Backend — Dual Mode
The system supports two LLM backends via the LLM Factory (agent/core/llm_factory.py):
# config.py — defaults to "azure", which routes to the hosted SEA-LION API;
# set "vllm" to use self-hosted Med-SEAL V1
clinical_llm_backend: str = "azure"
| Mode | Backend | Model | When |
|---|---|---|---|
| `azure` | SEA-LION v4-32B via API | `aisingapore/Qwen-SEA-LION-v4-32B-IT` | Production — no GPU needed |
| `vllm` | Self-hosted vLLM | Med-SEAL V1 (med-r1) | Future — requires 2× H200 GPU |
Switching is a single env var change: MEDSEAL_CLINICAL_LLM_BACKEND=vllm.
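The selector itself is small. The sketch below mirrors the behaviour described here using only the documented environment variables; the function name and the `MEDSEAL_VLLM_URL` fallback are illustrative, not taken from `llm_factory.py`:

```python
import os

def select_clinical_backend() -> dict:
    """Sketch of the llm_factory selection logic (names illustrative)."""
    backend = os.getenv("MEDSEAL_CLINICAL_LLM_BACKEND", "azure")
    if backend == "azure":
        # Hosted SEA-LION API (production default, no GPU needed)
        return {
            "base_url": os.getenv("MEDSEAL_SEALION_API_URL", "https://api.sea-lion.ai/v1"),
            "model": os.getenv("MEDSEAL_SEALION_MODEL", "aisingapore/Qwen-SEA-LION-v4-32B-IT"),
        }
    if backend == "vllm":
        # Self-hosted Med-SEAL V1 (future; MEDSEAL_VLLM_URL is an assumed var name)
        return {
            "base_url": os.getenv("MEDSEAL_VLLM_URL", "http://localhost:8001/v1"),
            "model": "med-r1",
        }
    raise ValueError(f"Unknown MEDSEAL_CLINICAL_LLM_BACKEND: {backend!r}")
```

Because the switch is env-driven, no code change or rebuild is needed when Med-SEAL V1 eventually ships.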
1.4 API Endpoints
| Method | Path | Surface | Description |
|---|---|---|---|
| | | Patient | Create new chat session |
| | | Patient | Send message (sync) |
| | | Patient | Send message (SSE streaming) |
| | | Patient | Get conversation history |
| | | Patient | Delete session |
| | `/patients/*/previsit` | Clinician | Generate pre-visit summary |
| | | Clinician | Doctor chat (SSE) |
| | | Clinician | Doctor chat (sync) |
| | `/cds-services/*` | CDS Hooks | Trigger insight synthesis |
| | `/triggers/*` | System | Fire nudge/measurement triggers |
| `GET` | `/agents` | Admin | List registered agents |
| | | Admin | Agent health check |
| `GET` | `/health` | Admin | System health check |
1.5 External Dependencies
| Service | Endpoint | Purpose |
|---|---|---|
| SEA-LION API | `https://api.sea-lion.ai/v1` | Clinical reasoning + conversation + guard |
| Medplum FHIR | `http://fhir.medseal.34.54.226.15.nip.io/fhir/R4` | Patient health records (GKE) |
2. Project Structure
agent/
├── main.py # FastAPI app, lifespan, agent graph compilation
├── config.py # Pydantic Settings (env vars, MEDSEAL_ prefix)
├── agents/ # LangGraph agent definitions
│ ├── companion.py # A1 — patient conversation
│ ├── clinical.py # A2 — clinical reasoning
│ ├── doctor_cds.py # Doctor chat + CDS
│ ├── nudge.py # A3 — proactive nudges
│ ├── lifestyle.py # A4 — dietary coaching
│ ├── insight.py # A5 — insight synthesis
│ ├── previsit.py # Pre-visit summary
│ └── measurement.py # A6 — analytics
├── api/
│ └── routes.py # All FastAPI route handlers
├── core/
│ ├── orchestrator.py # Intent classification + agent routing
│ ├── guard.py # SEA-LION Guard (input/output safety)
│ ├── llm_factory.py # SEA-LION / vLLM backend selector
│ ├── graph.py # Legacy agent graph
│ ├── identity.py # Agent identity definitions
│ ├── language.py # Language detection
│ └── router.py # Message routing logic
└── tools/
    ├── fhir_client.py             # Medplum FHIR client (httpx)
    ├── fhir_tools_clinical.py     # FHIR tools for A2
    ├── fhir_tools_companion.py    # FHIR tools for A1
    ├── fhir_tools_nudge.py        # FHIR tools for A3
    ├── fhir_tools_lifestyle.py    # FHIR tools for A4
    ├── fhir_tools_insight.py      # FHIR tools for A5
    ├── fhir_tools_previsit.py     # FHIR tools for pre-visit
    ├── fhir_tools_measurement.py  # FHIR tools for A6
    ├── fhir_tools_appointment.py  # Appointment management
    └── medical_tools.py           # Medical knowledge tools
3. GCP Cloud Run Deployment
3.1 GCP Resources
| Resource | Type | Configuration |
|---|---|---|
| Project | `gen-lang-client-0538005727` | Gemini Project1 |
| Cloud Run Service | `medseal-agent` | `asia-southeast1`, 1 vCPU, 1 Gi memory, port 8000, 60 s timeout |
| GKE Cluster | | 2× |
| Artifact Registry | | Docker images |
| Ingress IP | `34.54.226.15` | GKE external load balancer |
3.2 Environment Variables
# Clinical LLM (SEA-LION)
MEDSEAL_CLINICAL_LLM_BACKEND=azure
MEDSEAL_SEALION_API_URL=https://api.sea-lion.ai/v1
MEDSEAL_SEALION_API_KEY=<your-sea-lion-key>
MEDSEAL_SEALION_MODEL=aisingapore/Qwen-SEA-LION-v4-32B-IT
MEDSEAL_SEAGUARD_MODEL=aisingapore/SEA-Guard
# Medplum FHIR (GKE)
MEDSEAL_MEDPLUM_URL=http://fhir.medseal.34.54.226.15.nip.io/fhir/R4
MEDSEAL_MEDPLUM_EMAIL=admin@example.com
MEDSEAL_MEDPLUM_PASSWORD=medplum_admin
# App behaviour
MEDSEAL_MAX_RECURSION=5
MEDSEAL_TEMPERATURE=0.6
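`config.py` reads these via Pydantic Settings with the `MEDSEAL_` env prefix. A stdlib-only sketch of the equivalent lookup (field set abridged; `load_settings` is a hypothetical name, not the actual API):

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    # Mirrors a few of the MEDSEAL_-prefixed vars consumed by config.py
    clinical_llm_backend: str = "azure"
    max_recursion: int = 5
    temperature: float = 0.6

def load_settings() -> Settings:
    """Resolve each field from MEDSEAL_<FIELD>, falling back to the default."""
    def env(name: str, default: str) -> str:
        return os.getenv(f"MEDSEAL_{name.upper()}", default)
    return Settings(
        clinical_llm_backend=env("clinical_llm_backend", "azure"),
        max_recursion=int(env("max_recursion", "5")),
        temperature=float(env("temperature", "0.6")),
    )
```

Unset variables fall back to the defaults declared in `config.py`, so a minimal deployment only needs the SEA-LION and Medplum credentials.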
3.3 Startup Command
uvicorn agent.main:app --host 0.0.0.0 --port 8000
3.4 Deployment Steps
Prerequisites:

- Google Cloud SDK (`gcloud`) installed and authenticated
- GCP project with billing enabled
- Access to SEA-LION API (AI Singapore)
Step 1: Authenticate & Configure Project
export PATH="$HOME/google-cloud-sdk/bin:$PATH"
gcloud auth login --no-launch-browser
gcloud config set project gen-lang-client-0538005727
Step 2: Enable Required APIs
gcloud services enable \
run.googleapis.com \
cloudbuild.googleapis.com \
artifactregistry.googleapis.com
Step 3: Deploy from Source
gcloud run deploy medseal-agent \
--source /path/to/Med-SEAL \
--region asia-southeast1 \
--port 8000 \
--memory 1Gi \
--cpu 1 \
--timeout 60 \
--allow-unauthenticated \
--set-env-vars="\
MEDSEAL_SEALION_API_KEY=<key>,\
MEDSEAL_MEDPLUM_URL=http://fhir.medseal.34.54.226.15.nip.io/fhir/R4" \
--quiet
Step 4: Verify
# Health check
curl https://medseal-agent-74997794842.asia-southeast1.run.app/health
# List agents
curl https://medseal-agent-74997794842.asia-southeast1.run.app/agents
Expected health response:
{
"status": "ok",
"vllm": "unreachable",
"redis": "ok",
"medplum": "ok",
"agents": {}
}
`vllm: "unreachable"` is expected — Med-SEAL V1 (med-r1) is not deployed; the system uses SEA-LION via the API instead.
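A deploy script can gate on this response while tolerating the expected vLLM status. A minimal sketch (the helper name is ours, not part of the service API):

```python
def service_ready(health: dict) -> bool:
    """Treat the deployment as healthy even when the optional vLLM backend is down."""
    # "vllm": "unreachable" is expected — SEA-LION is used via API instead
    required = ("redis", "medplum")
    return health.get("status") == "ok" and all(
        health.get(dep) == "ok" for dep in required
    )
```

Feed it the parsed JSON from `GET /health`; only a failing `redis` or `medplum` dependency should block a rollout.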
3.5 Dockerfile
FROM python:3.11-slim
WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential curl \
&& rm -rf /var/lib/apt/lists/*
COPY requirements.txt agent/requirements_agent.txt ./
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir -r requirements_agent.txt
COPY agent /app/agent
ENV PYTHONPATH=/app
EXPOSE 8000
CMD ["uvicorn", "agent.main:app", "--host", "0.0.0.0", "--port", "8000"]
3.6 Requirements (Deployment)
The slim deployment requirements (requirements_deploy.txt) include only runtime dependencies:
fastapi>=0.115
uvicorn[standard]
pydantic-settings
langchain>=0.3
langchain-openai
langchain-community
langgraph>=0.2.70
langgraph-checkpoint-redis
langdetect
ddgs
httpx
sqlalchemy
aiosqlite
Training dependencies (PyTorch, transformers, datasets, etc.) are excluded to keep the deployment package small.
4. Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│ GCP Cloud Run (asia-southeast1) │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ FastAPI (uvicorn :8000) │ │
│ │ │ │
│ │ /sessions/* → Orchestrator → Agent Router │ │
│ │ /openemr/sessions/* → Orchestrator → Doctor CDS │ │
│ │ /cds-services/* → Orchestrator → Insight Agent │ │
│ │ /triggers/* → Orchestrator → Nudge Agent │ │
│ │ /patients/*/previsit → Previsit Agent │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌─────────────┐ │ │
│ │ │ Companion A1 │ │ Clinical A2 │ │ Doctor CDS │ │ │
│ │ │ Lifestyle A4 │ │ Nudge A3 │ │ Insight A5 │ │ │
│ │ │ Previsit │ │ Measurement │ │ Guard │ │ │
│ │ └──────────────┘ └──────────────┘ └─────────────┘ │ │
│ │ │ │
│ │ SQLite (medseal_sessions.db) — session checkpointing │ │
│ └────────────────────────────────────────────────────────┘ │
└─────────┬──────────────────┬──────────────────┬─────────────┘
│ │ │
┌─────▼─────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ SEA-LION │ │ SEA-Guard │ │ Medplum │
│ v4-32B │ │ Safety LLM │ │ FHIR R4 │
│(clinical + │ │ api.sea-lion│ │ (GKE) │
│ companion) │ │ .ai/v1 │ │ │
└────────────┘ └─────────────┘ └─────────────┘
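The path-to-agent routing shown in the diagram can be sketched as a prefix table. Agent IDs here are illustrative, and the real orchestrator additionally performs intent classification before dispatch:

```python
# Route prefixes as drawn in the architecture diagram (simplified)
ROUTES = [
    ("/openemr/sessions", "doctor_cds"),
    ("/cds-services", "insight"),
    ("/triggers", "nudge"),
    ("/sessions", "companion"),
]

def route_agent(path: str) -> str:
    """Map an incoming request path to an agent ID (sketch)."""
    # Previsit uses a patient-scoped suffix rather than a fixed prefix
    if path.startswith("/patients/") and path.endswith("/previsit"):
        return "previsit"
    for prefix, agent in ROUTES:
        if path.startswith(prefix):
            return agent
    raise LookupError(f"No agent registered for {path}")
```

In the real service the orchestrator sits between the route and the agent, so the same `/sessions/*` path may still fan out to different agents depending on classified intent.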
5. Monitoring & Troubleshooting
5.1 Logs
# Cloud Run logs
gcloud run services logs read medseal-agent --region asia-southeast1 --limit 50
# GKE pod logs
kubectl logs -n medseal deployment/ai-service --tail=50
5.2 Common Issues
| Issue | Cause | Fix |
|---|---|---|
| `vllm: "unreachable"` in `/health` | Expected — Med-SEAL V1 not deployed | No action; SEA-LION handles the clinical LLM |
| | GKE FHIR server down or URL misconfigured | Check GKE pods: `kubectl get pods -n medseal` |
| | Missing env vars or API key issues | Check Cloud Run logs (see §5.1) |
| | Rate limit or network issue | Retry; check SEA-LION API status |
| | File permission or dependency issue | Falls back to in-memory MemorySaver; sessions won't persist across restarts |
5.3 Scaling
Cloud Run auto-scales based on traffic. To adjust limits:
# Update CPU/memory
gcloud run services update medseal-agent \
--region asia-southeast1 \
--cpu 2 \
--memory 2Gi
# Set max instances
gcloud run services update medseal-agent \
--region asia-southeast1 \
--max-instances 10
Note: when scaling to multiple instances, switch session persistence from SQLite to Redis (`MEDSEAL_REDIS_URL`), since SQLite is per-instance.
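That choice can be made once at startup from the environment; a minimal sketch (the helper name is hypothetical):

```python
import os

def checkpointer_backend() -> str:
    """Pick a session store that matches the instance count (sketch)."""
    if os.getenv("MEDSEAL_REDIS_URL"):
        return "redis"   # shared across all Cloud Run instances
    return "sqlite"      # per-instance file; fine for a single instance
```

With Redis configured, any instance can resume any session, so `--max-instances` can be raised without losing conversation state.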
6. Live Deployment URLs
| Service | URL | Platform |
|---|---|---|
| AI Agent (API) | `https://medseal-agent-74997794842.asia-southeast1.run.app` | Cloud Run |
| Swagger UI | `https://medseal-agent-74997794842.asia-southeast1.run.app/docs` | Cloud Run |
| Patient App | | GKE |
| FHIR Server | `http://fhir.medseal.34.54.226.15.nip.io/fhir/R4` | GKE |
| OpenEMR | | GKE |
| Medplum Admin | | GKE |