AI Agent — Implementation & GCP Cloud Run Deployment

This page documents the implementation architecture and GCP Cloud Run deployment of the Med-SEAL AI Agent — the multi-agent clinical reasoning service that powers the platform’s 18 smart healthcare features.


1. Implementation Overview

1.1 Technology Stack

| Component | Technology |
|---|---|
| Runtime | Python 3.11 |
| Web framework | FastAPI + Uvicorn |
| Agent framework | LangGraph (LangChain) |
| Clinical LLM | SEA-LION v4-32B (AI Singapore) |
| Conversation model | SEA-LION v4 (AI Singapore) |
| Safety guard | SEA-Guard (AI Singapore) |
| FHIR client | Medplum (custom httpx-based client) |
| Session persistence | SQLite (via langgraph-checkpoint-sqlite) |
| Containerisation | Docker (python:3.11-slim) |
| Cloud hosting | GCP Cloud Run (Singapore, asia-southeast1) |
| Infrastructure | GKE (medseal-cluster) for full stack |

1.2 Agent Roster

The system builds and registers 7 LangGraph agent graphs at startup:

| Agent ID | Module | Purpose |
|---|---|---|
| companion-agent | agent.agents.companion | Patient-facing conversational interface |
| clinical-reasoning-agent | agent.agents.clinical | Clinical reasoning (drug interactions, lab interpretation) |
| doctor-cds-agent | agent.agents.doctor_cds | OpenEMR clinician chat & decision support |
| nudge-agent | agent.agents.nudge | Proactive reminders and escalation |
| lifestyle-agent | agent.agents.lifestyle | Dietary and lifestyle coaching |
| insight-synthesis-agent | agent.agents.insight | Pre-visit brief synthesis |
| previsit-summary-agent | agent.agents.previsit | FHIR-based pre-visit data aggregation |

Each agent is a compiled LangGraph StateGraph with its own tools, system prompt, and checkpointer.

1.3 LLM Backend — Dual Mode

The system supports two LLM backends via the LLM Factory (agent/core/llm_factory.py):

# config.py: the default backend, "azure", uses the hosted SEA-LION API
clinical_llm_backend: str = "azure"  # set to "vllm" for self-hosted Med-SEAL V1

| Mode | Backend | Model | When |
|---|---|---|---|
| azure (current) | SEA-LION v4-32B via API | Qwen-SEA-LION-v4-32B-IT | Production (no GPU needed) |
| vllm | Self-hosted vLLM | aagdeyogipramana/Med-SEAL-V1 | Future (requires 2× H200 GPU) |

Switching is a single env var change: MEDSEAL_CLINICAL_LLM_BACKEND=vllm.
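
The selection logic can be pictured with a small standard-library sketch. This is not the contents of agent/core/llm_factory.py; the `LLMBackend` dataclass, the `select_clinical_backend` function, the `MEDSEAL_VLLM_URL` variable, and the local vLLM address are all hypothetical names chosen for illustration. Only the backend values, the SEA-LION URL, and the model names come from this page.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMBackend:
    name: str      # "azure" or "vllm"
    base_url: str  # API base URL the client would talk to
    model: str     # model identifier sent with each request


# Assumption: address of a self-hosted vLLM server; not taken from the project.
ASSUMED_VLLM_URL = "http://localhost:8000/v1"


def select_clinical_backend(env=None) -> LLMBackend:
    """Pick the clinical LLM backend from MEDSEAL_CLINICAL_LLM_BACKEND."""
    env = os.environ if env is None else env
    mode = env.get("MEDSEAL_CLINICAL_LLM_BACKEND", "azure").lower()
    if mode == "azure":
        return LLMBackend(
            name="azure",
            base_url=env.get("MEDSEAL_SEALION_API_URL", "https://api.sea-lion.ai/v1"),
            model=env.get("MEDSEAL_SEALION_MODEL", "aisingapore/Qwen-SEA-LION-v4-32B-IT"),
        )
    if mode == "vllm":
        return LLMBackend(
            name="vllm",
            # MEDSEAL_VLLM_URL is a hypothetical variable name used only in this sketch.
            base_url=env.get("MEDSEAL_VLLM_URL", ASSUMED_VLLM_URL),
            model="aagdeyogipramana/Med-SEAL-V1",
        )
    raise ValueError(f"Unknown clinical LLM backend: {mode!r}")
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in tests, for example `select_clinical_backend({"MEDSEAL_CLINICAL_LLM_BACKEND": "vllm"})`.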

1.4 API Endpoints

| Method | Path | Surface | Description |
|---|---|---|---|
| POST | /sessions | Patient | Create new chat session |
| POST | /sessions/{id}/messages | Patient | Send message (sync) |
| POST | /sessions/{id}/messages/stream | Patient | Send message (SSE streaming) |
| GET | /sessions/{id}/messages | Patient | Get conversation history |
| DELETE | /sessions/{id} | Patient | Delete session |
| POST | /patients/{id}/previsit-summary | Clinician | Generate pre-visit summary |
| POST | /openemr/sessions/{id}/chat | Clinician | Doctor chat (SSE) |
| POST | /openemr/sessions/{id}/chat/sync | Clinician | Doctor chat (sync) |
| POST | /cds-services/patient-view | CDS Hooks | Trigger insight synthesis |
| POST | /triggers/{type} | System | Fire nudge/measurement triggers |
| GET | /agents | Admin | List registered agents |
| GET | /agents/{id}/health | Admin | Agent health check |
| GET | /health | Admin | System health check |
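
A rough sketch of how a client might build requests for the patient-facing endpoints, using only the Python standard library. The paths and methods come from the table above; the JSON field names (`patient_id`, `message`) are assumptions, since the request schemas are not documented on this page, and the helper names are hypothetical. The Swagger UI at /docs is the place to confirm the real body shapes.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://medseal-agent-74997794842.asia-southeast1.run.app"


def build_request(method: str, path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request, JSON-encoding the payload if given."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def create_session_request(patient_id: str) -> urllib.request.Request:
    # Assumed body shape: {"patient_id": "..."}
    return build_request("POST", "/sessions", {"patient_id": patient_id})


def send_message_request(session_id: str, text: str) -> urllib.request.Request:
    # Assumed body shape: {"message": "..."}
    return build_request("POST", f"/sessions/{session_id}/messages", {"message": text})


def history_request(session_id: str) -> urllib.request.Request:
    return build_request("GET", f"/sessions/{session_id}/messages")


def delete_session_request(session_id: str) -> urllib.request.Request:
    return build_request("DELETE", f"/sessions/{session_id}")
```

A built request can then be sent with `urllib.request.urlopen(req, timeout=60)`; keeping construction separate from sending makes the helpers testable without network access.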

1.5 External Dependencies

| Service | Endpoint | Purpose |
|---|---|---|
| SEA-LION API | api.sea-lion.ai/v1 | Clinical reasoning + Conversation + Guard |
| Medplum FHIR | fhir.medseal.34.54.226.15.nip.io/fhir/R4 | Patient health records (GKE) |


2. Project Structure

agent/
├── main.py                      # FastAPI app, lifespan, agent graph compilation
├── config.py                    # Pydantic Settings (env vars, MEDSEAL_ prefix)
├── agents/                      # LangGraph agent definitions
│   ├── companion.py             #   A1 — patient conversation
│   ├── clinical.py              #   A2 — clinical reasoning
│   ├── doctor_cds.py            #   Doctor chat + CDS
│   ├── nudge.py                 #   A3 — proactive nudges
│   ├── lifestyle.py             #   A4 — dietary coaching
│   ├── insight.py               #   A5 — insight synthesis
│   ├── previsit.py              #   Pre-visit summary
│   └── measurement.py           #   A6 — analytics
├── api/
│   └── routes.py                # All FastAPI route handlers
├── core/
│   ├── orchestrator.py          # Intent classification + agent routing
│   ├── guard.py                 # SEA-LION Guard (input/output safety)
│   ├── llm_factory.py           # SEA-LION / vLLM backend selector
│   ├── graph.py                 # Legacy agent graph
│   ├── identity.py              # Agent identity definitions
│   ├── language.py              # Language detection
│   └── router.py                # Message routing logic
└── tools/
    ├── fhir_client.py           # Medplum FHIR client (httpx)
    ├── fhir_tools_clinical.py   # FHIR tools for A2
    ├── fhir_tools_companion.py  # FHIR tools for A1
    ├── fhir_tools_nudge.py      # FHIR tools for A3
    ├── fhir_tools_lifestyle.py  # FHIR tools for A4
    ├── fhir_tools_insight.py    # FHIR tools for A5
    ├── fhir_tools_previsit.py   # FHIR tools for pre-visit
    ├── fhir_tools_measurement.py # FHIR tools for A6
    ├── fhir_tools_appointment.py # Appointment management
    └── medical_tools.py         # Medical knowledge tools

3. GCP Cloud Run Deployment

3.1 GCP Resources

| Resource | Name | Details |
|---|---|---|
| Project | gen-lang-client-0538005727 | Gemini Project1 |
| Cloud Run Service | medseal-agent | asia-southeast1 (Singapore) |
| GKE Cluster | medseal-cluster | ek-standard-8 nodes |
| Artifact Registry | cloud-run-source-deploy | Docker images |
| Ingress IP | 34.54.226.15 | GKE external load balancer |

3.2 Environment Variables

# Clinical LLM (SEA-LION)
MEDSEAL_CLINICAL_LLM_BACKEND=azure
MEDSEAL_SEALION_API_URL=https://api.sea-lion.ai/v1
MEDSEAL_SEALION_API_KEY=<your-sea-lion-key>
MEDSEAL_SEALION_MODEL=aisingapore/Qwen-SEA-LION-v4-32B-IT
MEDSEAL_SEAGUARD_MODEL=aisingapore/SEA-Guard

# Medplum FHIR (GKE)
MEDSEAL_MEDPLUM_URL=http://fhir.medseal.34.54.226.15.nip.io/fhir/R4
MEDSEAL_MEDPLUM_EMAIL=admin@example.com
MEDSEAL_MEDPLUM_PASSWORD=medplum_admin

# App behaviour
MEDSEAL_MAX_RECURSION=5
MEDSEAL_TEMPERATURE=0.6
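
Before deploying, it can help to confirm that the key settings are actually set and are not still template placeholders such as `<your-sea-lion-key>`. The following is a minimal sketch under the assumption that the API key and the Medplum URL are the settings the service cannot run without; that list and the function name are hypothetical, and the real application reads its configuration through Pydantic Settings in config.py.

```python
import os

# Assumption: these are the settings worth checking; adjust to match config.py.
ASSUMED_REQUIRED = ("MEDSEAL_SEALION_API_KEY", "MEDSEAL_MEDPLUM_URL")


def missing_settings(env=None, required=ASSUMED_REQUIRED) -> list:
    """Return names that are unset, empty, or still a <placeholder> value."""
    env = os.environ if env is None else env
    missing = []
    for name in required:
        value = env.get(name, "").strip()
        is_placeholder = value.startswith("<") and value.endswith(">")
        if not value or is_placeholder:
            missing.append(name)
    return missing
```

Calling `missing_settings()` with no arguments inspects the current process environment and returns an empty list when everything in the assumed list is present.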

3.3 Startup Command

uvicorn agent.main:app --host 0.0.0.0 --port 8000

3.4 Deployment Steps

Prerequisites

  • Google Cloud SDK (gcloud) installed and authenticated

  • GCP project with billing enabled

  • Access to SEA-LION API (AI Singapore)

Step 1: Authenticate & Configure Project

export PATH="$HOME/google-cloud-sdk/bin:$PATH"
gcloud auth login --no-launch-browser
gcloud config set project gen-lang-client-0538005727

Step 2: Enable Required APIs

gcloud services enable \
  run.googleapis.com \
  cloudbuild.googleapis.com \
  artifactregistry.googleapis.com

Step 3: Deploy from Source

gcloud run deploy medseal-agent \
  --source /path/to/Med-SEAL \
  --region asia-southeast1 \
  --port 8000 \
  --memory 1Gi \
  --cpu 1 \
  --timeout 60 \
  --allow-unauthenticated \
  --set-env-vars="\
MEDSEAL_SEALION_API_KEY=<key>,\
MEDSEAL_MEDPLUM_URL=http://fhir.medseal.34.54.226.15.nip.io/fhir/R4" \
  --quiet

Step 4: Verify

# Health check
curl https://medseal-agent-74997794842.asia-southeast1.run.app/health

# List agents
curl https://medseal-agent-74997794842.asia-southeast1.run.app/agents

Expected health response:

{
  "status": "ok",
  "vllm": "unreachable",
  "redis": "ok",
  "medplum": "ok",
  "agents": {}
}

vllm: unreachable is expected — Med-SEAL V1 (med-r1) is not deployed. The system uses SEA-LION via API instead.
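
The verification step can be automated with a small check that tolerates the expected `vllm` state. This sketch assumes each component field in the health payload is a plain string status, as in the sample response; the function name and the `EXPECTED_UNREACHABLE` set are hypothetical.

```python
# Components that may legitimately report a non-"ok" state in this deployment.
EXPECTED_UNREACHABLE = {"vllm"}


def health_problems(payload: dict) -> list:
    """Return human-readable problems found in a /health response body."""
    problems = []
    if payload.get("status") != "ok":
        problems.append(f"status={payload.get('status')!r}")
    for name, state in payload.items():
        if name in ("status", "agents"):
            continue  # overall status handled above; agents is a nested object
        if state != "ok" and name not in EXPECTED_UNREACHABLE:
            problems.append(f"{name}={state!r}")
    return problems


sample = {
    "status": "ok",
    "vllm": "unreachable",
    "redis": "ok",
    "medplum": "ok",
    "agents": {},
}
print(health_problems(sample))  # []
```

An empty list means the deployment looks healthy; anything else names the component to investigate, for example `medplum='unreachable'`.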

3.5 Dockerfile

FROM python:3.11-slim

WORKDIR /app

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential curl \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt agent/requirements_agent.txt ./
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir -r requirements_agent.txt

COPY agent /app/agent

ENV PYTHONPATH=/app
EXPOSE 8000

CMD ["uvicorn", "agent.main:app", "--host", "0.0.0.0", "--port", "8000"]

3.6 Requirements (Deployment)

The slim deployment requirements (requirements_deploy.txt) include only runtime dependencies:

fastapi>=0.115
uvicorn[standard]
pydantic-settings
langchain>=0.3
langchain-openai
langchain-community
langgraph>=0.2.70
langgraph-checkpoint-redis
langdetect
ddgs
httpx
sqlalchemy
aiosqlite

Training dependencies (PyTorch, transformers, datasets, etc.) are excluded to keep the deployment package small.


4. Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│              GCP Cloud Run (asia-southeast1)                 │
│                                                             │
│  ┌────────────────────────────────────────────────────────┐ │
│  │              FastAPI (uvicorn :8000)                    │ │
│  │                                                        │ │
│  │  /sessions/*          → Orchestrator → Agent Router    │ │
│  │  /openemr/sessions/*  → Orchestrator → Doctor CDS      │ │
│  │  /cds-services/*      → Orchestrator → Insight Agent   │ │
│  │  /triggers/*          → Orchestrator → Nudge Agent     │ │
│  │  /patients/*/previsit → Previsit Agent                 │ │
│  │                                                        │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌─────────────┐  │ │
│  │  │ Companion A1 │  │ Clinical A2  │  │ Doctor CDS  │  │ │
│  │  │ Lifestyle A4 │  │  Nudge A3    │  │ Insight A5  │  │ │
│  │  │ Previsit     │  │ Measurement  │  │   Guard     │  │ │
│  │  └──────────────┘  └──────────────┘  └─────────────┘  │ │
│  │                                                        │ │
│  │  SQLite (medseal_sessions.db) — session checkpointing  │ │
│  └────────────────────────────────────────────────────────┘ │
└─────────┬──────────────────┬──────────────────┬─────────────┘
          │                  │                  │
    ┌─────▼─────┐    ┌──────▼──────┐    ┌──────▼──────┐
    │ SEA-LION   │    │  SEA-Guard  │    │   Medplum   │
    │ v4-32B     │    │ Safety LLM  │    │  FHIR R4    │
    │(clinical + │    │ api.sea-lion│    │ (GKE)       │
    │ companion) │    │ .ai/v1      │    │             │
    └────────────┘    └─────────────┘    └─────────────┘
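
The routing shown in the diagram can be summarised as a lookup from URL pattern to target. This is an illustrative sketch only, not the logic in core/orchestrator.py or core/router.py: the `ROUTES` table and `resolve_target` are hypothetical, and `agent-router` is a made-up label for the patient surface, where the orchestrator picks an agent dynamically by intent.

```python
from fnmatch import fnmatchcase

# (pattern, target) pairs, checked in order. Agent IDs are from the agent roster.
ROUTES = [
    ("/openemr/sessions/*", "doctor-cds-agent"),
    ("/cds-services/*", "insight-synthesis-agent"),
    ("/triggers/*", "nudge-agent"),
    ("/patients/*/previsit-summary", "previsit-summary-agent"),
    ("/sessions", "agent-router"),    # hypothetical label: dynamic routing
    ("/sessions/*", "agent-router"),
]


def resolve_target(path: str):
    """Return the target for a request path, or None if no route matches."""
    for pattern, target in ROUTES:
        if fnmatchcase(path, pattern):
            return target
    return None


print(resolve_target("/openemr/sessions/42/chat"))       # doctor-cds-agent
print(resolve_target("/patients/p1/previsit-summary"))   # previsit-summary-agent
print(resolve_target("/health"))                         # None
```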

5. Monitoring & Troubleshooting

5.1 Logs

# Cloud Run logs
gcloud run services logs read medseal-agent --region asia-southeast1 --limit 50

# GKE pod logs
kubectl logs -n medseal deployment/ai-service --tail=50

5.2 Common Issues

| Issue | Cause | Fix |
|---|---|---|
| vllm: unreachable in health | Expected: Med-SEAL V1 not deployed | No action; SEA-LION handles clinical LLM |
| medplum: unreachable | GKE FHIR server down or URL misconfigured | Check GKE pods: kubectl get pods -n medseal |
| STARTUP FAILED running in degraded mode | Missing env vars or API key issues | Check MEDSEAL_* env vars in Cloud Run |
| SEA-LION API timeout | Rate limit or network issue | Retry; check SEA-LION API status |
| SQLite checkpointer unavailable | File permission or dependency issue | Falls back to in-memory MemorySaver; sessions won’t persist across restarts |
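
For the SEA-LION timeout case, the "Retry" advice is commonly implemented as retry with exponential backoff. The helper below is a generic sketch, not code from the Med-SEAL repository; `call_with_retry` and its parameters are hypothetical, and the exception types worth retrying depend on the HTTP client in use.

```python
import time


def call_with_retry(fn, attempts=3, base_delay=1.0, sleep=time.sleep,
                    retry_on=(TimeoutError,)):
    """Call fn(), retrying on the given exceptions with exponential backoff.

    Waits base_delay, then 2x, then 4x, ... between attempts and re-raises
    the last exception once the attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use only `fn` needs to be supplied, for example a lambda wrapping the API call.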

5.3 Scaling

Cloud Run auto-scales based on traffic. To adjust limits:

# Update CPU/memory
gcloud run services update medseal-agent \
  --region asia-southeast1 \
  --cpu 2 \
  --memory 2Gi

# Set max instances
gcloud run services update medseal-agent \
  --region asia-southeast1 \
  --max-instances 10

Note: When scaling to multiple instances, switch session persistence from SQLite to Redis (MEDSEAL_REDIS_URL). The SQLite file is local to each Cloud Run instance, so a session checkpointed on one instance is not visible to the others.
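
The per-instance limitation can be demonstrated with two independent SQLite databases standing in for two Cloud Run instances. This is a toy illustration: the in-memory databases, the `sessions` table, and the helper functions are hypothetical and do not reflect the schema used by the LangGraph checkpointer.

```python
import sqlite3


def new_instance_store() -> sqlite3.Connection:
    """Create a private database, as each container instance would have."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, state TEXT)")
    return conn


def save(conn: sqlite3.Connection, session_id: str, state: str) -> None:
    conn.execute("INSERT OR REPLACE INTO sessions VALUES (?, ?)", (session_id, state))
    conn.commit()


def load(conn: sqlite3.Connection, session_id: str):
    row = conn.execute(
        "SELECT state FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return row[0] if row else None


instance_a = new_instance_store()
instance_b = new_instance_store()

save(instance_a, "s1", "greeting sent")
print(load(instance_a, "s1"))  # greeting sent
print(load(instance_b, "s1"))  # None: the second instance never saw the session
```

A request for session `s1` that the load balancer sends to the second instance would start from an empty history, which is the behaviour a shared store such as Redis avoids.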


6. Live Deployment URLs

| Service | URL | Platform |
|---|---|---|
| AI Agent (API) | https://medseal-agent-74997794842.asia-southeast1.run.app | Cloud Run |
| Swagger UI | https://medseal-agent-74997794842.asia-southeast1.run.app/docs | Cloud Run |
| Patient App | app.medseal.34.54.226.15.nip.io | GKE |
| FHIR Server | fhir.medseal.34.54.226.15.nip.io | GKE |
| OpenEMR | emr.medseal.34.54.226.15.nip.io | GKE |
| Medplum Admin | medplum.medseal.34.54.226.15.nip.io | GKE |