AI Agent — Implementation & GCP Cloud Run Deployment

This page documents the implementation architecture and GCP Cloud Run deployment of the Med-SEAL AI Agent — the multi-agent clinical reasoning service that powers the platform’s 18 smart healthcare features.

1. Implementation Overview

1.1 Technology Stack

Component	Technology
Runtime	Python 3.11
Web framework	FastAPI + Uvicorn
Agent framework	LangGraph (LangChain)
Clinical LLM	SEA-LION v4-32B (AI Singapore)
Conversation model	SEA-LION v4 (AI Singapore)
Safety guard	SEA-Guard (AI Singapore)
FHIR client	Medplum (custom `httpx`-based client)
Session persistence	SQLite (via `langgraph-checkpoint-sqlite`)
Containerisation	Docker (`python:3.11-slim`)
Cloud hosting	GCP Cloud Run (Singapore, `asia-southeast1`)
Infrastructure	GKE (`medseal-cluster`) for full stack

1.2 Agent Roster

The system builds and registers 7 LangGraph agent graphs at startup:

Agent ID	Module	Purpose
`companion-agent`	`agent.agents.companion`	Patient-facing conversational interface
`clinical-reasoning-agent`	`agent.agents.clinical`	Clinical reasoning (drug interactions, lab interpretation)
`doctor-cds-agent`	`agent.agents.doctor_cds`	OpenEMR clinician chat & decision support
`nudge-agent`	`agent.agents.nudge`	Proactive reminders and escalation
`lifestyle-agent`	`agent.agents.lifestyle`	Dietary and lifestyle coaching
`insight-synthesis-agent`	`agent.agents.insight`	Pre-visit brief synthesis
`previsit-summary-agent`	`agent.agents.previsit`	FHIR-based pre-visit data aggregation

Each agent is a compiled LangGraph StateGraph with its own tools, system prompt, and checkpointer.

1.3 LLM Backend — Dual Mode

The system supports two LLM backends via the LLM Factory (agent/core/llm_factory.py):

# config.py — defaults to SEA-LION
clinical_llm_backend: str = "azure"  # or "vllm" for Med-SEAL V1

Mode	Backend	Model	When
`azure` (current)	SEA-LION v4-32B via API	Qwen-SEA-LION-v4-32B-IT	Production — no GPU needed
`vllm`	Self-hosted vLLM	`aagdeyogipramana/Med-SEAL-V1`	Future — requires 2× H200 GPU

Switching is a single env var change: MEDSEAL_CLINICAL_LLM_BACKEND=vllm.

1.4 API Endpoints

Method	Path	Surface	Description
`POST`	`/sessions`	Patient	Create new chat session
`POST`	`/sessions/{id}/messages`	Patient	Send message (sync)
`POST`	`/sessions/{id}/messages/stream`	Patient	Send message (SSE streaming)
`GET`	`/sessions/{id}/messages`	Patient	Get conversation history
`DELETE`	`/sessions/{id}`	Patient	Delete session
`POST`	`/patients/{id}/previsit-summary`	Clinician	Generate pre-visit summary
`POST`	`/openemr/sessions/{id}/chat`	Clinician	Doctor chat (SSE)
`POST`	`/openemr/sessions/{id}/chat/sync`	Clinician	Doctor chat (sync)
`POST`	`/cds-services/patient-view`	CDS Hooks	Trigger insight synthesis
`POST`	`/triggers/{type}`	System	Fire nudge/measurement triggers
`GET`	`/agents`	Admin	List registered agents
`GET`	`/agents/{id}/health`	Admin	Agent health check
`GET`	`/health`	Admin	System health check

1.5 External Dependencies

Service	Endpoint	Purpose
SEA-LION API	`api.sea-lion.ai/v1`	Clinical reasoning + Conversation + Guard
Medplum FHIR	`fhir.medseal.34.54.226.15.nip.io/fhir/R4`	Patient health records (GKE)

2. Project Structure

agent/
├── main.py                      # FastAPI app, lifespan, agent graph compilation
├── config.py                    # Pydantic Settings (env vars, MEDSEAL_ prefix)
├── agents/                      # LangGraph agent definitions
│   ├── companion.py             #   A1 — patient conversation
│   ├── clinical.py              #   A2 — clinical reasoning
│   ├── doctor_cds.py            #   Doctor chat + CDS
│   ├── nudge.py                 #   A3 — proactive nudges
│   ├── lifestyle.py             #   A4 — dietary coaching
│   ├── insight.py               #   A5 — insight synthesis
│   ├── previsit.py              #   Pre-visit summary
│   └── measurement.py           #   A6 — analytics
├── api/
│   └── routes.py                # All FastAPI route handlers
├── core/
│   ├── orchestrator.py          # Intent classification + agent routing
│   ├── guard.py                 # SEA-LION Guard (input/output safety)
│   ├── llm_factory.py           # SEA-LION / vLLM backend selector
│   ├── graph.py                 # Legacy agent graph
│   ├── identity.py              # Agent identity definitions
│   ├── language.py              # Language detection
│   └── router.py                # Message routing logic
└── tools/
    ├── fhir_client.py           # Medplum FHIR client (httpx)
    ├── fhir_tools_clinical.py   # FHIR tools for A2
    ├── fhir_tools_companion.py  # FHIR tools for A1
    ├── fhir_tools_nudge.py      # FHIR tools for A3
    ├── fhir_tools_lifestyle.py  # FHIR tools for A4
    ├── fhir_tools_insight.py    # FHIR tools for A5
    ├── fhir_tools_previsit.py   # FHIR tools for pre-visit
    ├── fhir_tools_measurement.py # FHIR tools for A6
    ├── fhir_tools_appointment.py # Appointment management
    └── medical_tools.py         # Medical knowledge tools

3. GCP Cloud Run Deployment

3.1 GCP Resources

Resource	Type	Configuration
Project	`gen-lang-client-0538005727`	Gemini Project1
Cloud Run Service	`medseal-agent`	`asia-southeast1` (Singapore)
GKE Cluster	`medseal-cluster`	2× `ek-standard-8` nodes
Artifact Registry	`cloud-run-source-deploy`	Docker images
Ingress IP	`34.54.226.15`	GKE external load balancer

3.2 Environment Variables

# Clinical LLM (SEA-LION)
MEDSEAL_CLINICAL_LLM_BACKEND=azure
MEDSEAL_SEALION_API_URL=https://api.sea-lion.ai/v1
MEDSEAL_SEALION_API_KEY=<your-sea-lion-key>
MEDSEAL_SEALION_MODEL=aisingapore/Qwen-SEA-LION-v4-32B-IT
MEDSEAL_SEAGUARD_MODEL=aisingapore/SEA-Guard

# Medplum FHIR (GKE)
MEDSEAL_MEDPLUM_URL=http://fhir.medseal.34.54.226.15.nip.io/fhir/R4
MEDSEAL_MEDPLUM_EMAIL=admin@example.com
MEDSEAL_MEDPLUM_PASSWORD=medplum_admin

# App behaviour
MEDSEAL_MAX_RECURSION=5
MEDSEAL_TEMPERATURE=0.6

3.3 Startup Command

uvicorn agent.main:app --host 0.0.0.0 --port 8000

3.4 Deployment Steps

Prerequisites

Google Cloud SDK (gcloud) installed and authenticated
GCP project with billing enabled
Access to SEA-LION API (AI Singapore)

Step 1: Authenticate & Configure Project

export PATH="$HOME/google-cloud-sdk/bin:$PATH"
gcloud auth login --no-launch-browser
gcloud config set project gen-lang-client-0538005727

Step 2: Enable Required APIs

gcloud services enable \
  run.googleapis.com \
  cloudbuild.googleapis.com \
  artifactregistry.googleapis.com

Step 3: Deploy from Source

gcloud run deploy medseal-agent \
  --source /path/to/Med-SEAL \
  --region asia-southeast1 \
  --port 8000 \
  --memory 1Gi \
  --cpu 1 \
  --timeout 60 \
  --allow-unauthenticated \
  --set-env-vars="\
MEDSEAL_SEALION_API_KEY=<key>,\
MEDSEAL_MEDPLUM_URL=http://fhir.medseal.34.54.226.15.nip.io/fhir/R4" \
  --quiet

Step 4: Verify

# Health check
curl https://medseal-agent-74997794842.asia-southeast1.run.app/health

# List agents
curl https://medseal-agent-74997794842.asia-southeast1.run.app/agents

Expected health response:

{
  "status": "ok",
  "vllm": "unreachable",
  "redis": "ok",
  "medplum": "ok",
  "agents": {}
}

vllm: unreachable is expected — Med-SEAL V1 (med-r1) is not deployed. The system uses SEA-LION via API instead.

3.5 Dockerfile

FROM python:3.11-slim

WORKDIR /app

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential curl \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt agent/requirements_agent.txt ./
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir -r requirements_agent.txt

COPY agent /app/agent

ENV PYTHONPATH=/app
EXPOSE 8000

CMD ["uvicorn", "agent.main:app", "--host", "0.0.0.0", "--port", "8000"]

3.6 Requirements (Deployment)

The slim deployment requirements (requirements_deploy.txt) include only runtime dependencies:

fastapi>=0.115
uvicorn[standard]
pydantic-settings
langchain>=0.3
langchain-openai
langchain-community
langgraph>=0.2.70
langgraph-checkpoint-redis
langdetect
ddgs
httpx
sqlalchemy
aiosqlite

Training dependencies (PyTorch, transformers, datasets, etc.) are excluded to keep the deployment package small.

4. Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│              GCP Cloud Run (asia-southeast1)                 │
│                                                             │
│  ┌────────────────────────────────────────────────────────┐ │
│  │              FastAPI (uvicorn :8000)                    │ │
│  │                                                        │ │
│  │  /sessions/*          → Orchestrator → Agent Router    │ │
│  │  /openemr/sessions/*  → Orchestrator → Doctor CDS      │ │
│  │  /cds-services/*      → Orchestrator → Insight Agent   │ │
│  │  /triggers/*          → Orchestrator → Nudge Agent     │ │
│  │  /patients/*/previsit → Previsit Agent                 │ │
│  │                                                        │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌─────────────┐  │ │
│  │  │ Companion A1 │  │ Clinical A2  │  │ Doctor CDS  │  │ │
│  │  │ Lifestyle A4 │  │  Nudge A3    │  │ Insight A5  │  │ │
│  │  │ Previsit     │  │ Measurement  │  │   Guard     │  │ │
│  │  └──────────────┘  └──────────────┘  └─────────────┘  │ │
│  │                                                        │ │
│  │  SQLite (medseal_sessions.db) — session checkpointing  │ │
│  └────────────────────────────────────────────────────────┘ │
└─────────┬──────────────────┬──────────────────┬─────────────┘
          │                  │                  │
    ┌─────▼─────┐    ┌──────▼──────┐    ┌──────▼──────┐
    │ SEA-LION   │    │  SEA-Guard  │    │   Medplum   │
    │ v4-32B     │    │ Safety LLM  │    │  FHIR R4    │
    │(clinical + │    │ api.sea-lion│    │ (GKE)       │
    │ companion) │    │ .ai/v1      │    │             │
    └────────────┘    └─────────────┘    └─────────────┘

5. Monitoring & Troubleshooting

5.1 Logs

# Cloud Run logs
gcloud run services logs read medseal-agent --region asia-southeast1 --limit 50

# GKE pod logs
kubectl logs -n medseal deployment/ai-service --tail=50

5.2 Common Issues

Issue	Cause	Fix
`vllm: unreachable` in health	Expected — Med-SEAL V1 not deployed	No action; SEA-LION handles clinical LLM
`medplum: unreachable`	GKE FHIR server down or URL misconfigured	Check GKE pods: `kubectl get pods -n medseal`
`STARTUP FAILED — running in degraded mode`	Missing env vars or API key issues	Check `MEDSEAL_*` env vars in Cloud Run
`SEA-LION API timeout`	Rate limit or network issue	Retry; check SEA-LION API status
`SQLite checkpointer unavailable`	File permission or dependency issue	Falls back to in-memory MemorySaver; sessions won’t persist across restarts

5.3 Scaling

Cloud Run auto-scales based on traffic. To adjust limits:

# Update CPU/memory
gcloud run services update medseal-agent \
  --region asia-southeast1 \
  --cpu 2 \
  --memory 2Gi

# Set max instances
gcloud run services update medseal-agent \
  --region asia-southeast1 \
  --max-instances 10

Note: When scaling to multiple instances, switch session persistence from SQLite to Redis (MEDSEAL_REDIS_URL) since SQLite is per-instance.

6. Live Deployment URLs

Service	URL	Platform
AI Agent (API)	`https://medseal-agent-74997794842.asia-southeast1.run.app`	Cloud Run
Swagger UI	`https://medseal-agent-74997794842.asia-southeast1.run.app/docs`	Cloud Run
Patient App	`app.medseal.34.54.226.15.nip.io`	GKE
FHIR Server	`fhir.medseal.34.54.226.15.nip.io`	GKE
OpenEMR	`emr.medseal.34.54.226.15.nip.io`	GKE
Medplum Admin	`medplum.medseal.34.54.226.15.nip.io`	GKE