# AI Agent: Implementation & GCP Cloud Run Deployment

This page documents the implementation architecture and **GCP Cloud Run** deployment of the **Med-SEAL AI Agent**, the multi-agent clinical reasoning service that powers the platform's 18 smart healthcare features.

---

## 1. Implementation Overview

### 1.1 Technology Stack

| Component | Technology |
|---|---|
| **Runtime** | Python 3.11 |
| **Web framework** | FastAPI + Uvicorn |
| **Agent framework** | LangGraph (LangChain) |
| **Clinical LLM** | SEA-LION v4-32B (AI Singapore) |
| **Conversation model** | SEA-LION v4 (AI Singapore) |
| **Safety guard** | SEA-Guard (AI Singapore) |
| **FHIR client** | Medplum (custom `httpx`-based client) |
| **Session persistence** | SQLite (via `langgraph-checkpoint-sqlite`) |
| **Containerisation** | Docker (`python:3.11-slim`) |
| **Cloud hosting** | GCP Cloud Run (Singapore, `asia-southeast1`) |
| **Infrastructure** | GKE (`medseal-cluster`) for full stack |

### 1.2 Agent Roster

The system builds and registers 7 LangGraph agent graphs at startup:

| Agent ID | Module | Purpose |
|---|---|---|
| `companion-agent` | `agent.agents.companion` | Patient-facing conversational interface |
| `clinical-reasoning-agent` | `agent.agents.clinical` | Clinical reasoning (drug interactions, lab interpretation) |
| `doctor-cds-agent` | `agent.agents.doctor_cds` | OpenEMR clinician chat & decision support |
| `nudge-agent` | `agent.agents.nudge` | Proactive reminders and escalation |
| `lifestyle-agent` | `agent.agents.lifestyle` | Dietary and lifestyle coaching |
| `insight-synthesis-agent` | `agent.agents.insight` | Pre-visit brief synthesis |
| `previsit-summary-agent` | `agent.agents.previsit` | FHIR-based pre-visit data aggregation |

Each agent is a compiled LangGraph `StateGraph` with its own tools, system prompt, and checkpointer.
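
The roster above can be pictured as a registry keyed by agent ID. The following is a minimal sketch only, not the actual Med-SEAL startup code: `AGENT_MODULES` copies the IDs and module paths from the table, while `build_registry` and `default_builder` are hypothetical names introduced for illustration. The builder is a stub standing in for whatever compiles each LangGraph `StateGraph` in the real service.

```python
from typing import Any, Callable, Dict

# Agent ID -> module path, copied from the roster table above.
AGENT_MODULES: Dict[str, str] = {
    "companion-agent": "agent.agents.companion",
    "clinical-reasoning-agent": "agent.agents.clinical",
    "doctor-cds-agent": "agent.agents.doctor_cds",
    "nudge-agent": "agent.agents.nudge",
    "lifestyle-agent": "agent.agents.lifestyle",
    "insight-synthesis-agent": "agent.agents.insight",
    "previsit-summary-agent": "agent.agents.previsit",
}


def default_builder(agent_id: str, module_path: str) -> Dict[str, str]:
    """Hypothetical stub: the real service would compile a StateGraph here."""
    return {"agent_id": agent_id, "module": module_path, "status": "compiled"}


def build_registry(
    builder: Callable[[str, str], Any] = default_builder,
) -> Dict[str, Any]:
    """Build one entry per agent and return them keyed by agent ID."""
    registry: Dict[str, Any] = {}
    for agent_id, module_path in AGENT_MODULES.items():
        registry[agent_id] = builder(agent_id, module_path)
    return registry


if __name__ == "__main__":
    for agent_id, entry in build_registry().items():
        print(agent_id, "->", entry["module"])
```

Passing the builder as a parameter keeps the sketch testable without LangGraph installed; a real builder would import the module and return its compiled graph.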

### 1.3 LLM Backend: Dual Mode

The system supports two LLM backends via the **LLM Factory** (`agent/core/llm_factory.py`):

```python
# config.py: defaults to SEA-LION
clinical_llm_backend: str = "azure"  # or "vllm" for Med-SEAL V1
```

| Mode | Backend | Model | When to use |
|---|---|---|---|
| `azure` *(current)* | SEA-LION v4-32B via API | Qwen-SEA-LION-v4-32B-IT | Production; no GPU needed |
| `vllm` | Self-hosted vLLM | [`aagdeyogipramana/Med-SEAL-V1`](https://huggingface.co/aagdeyogipramana/Med-SEAL-V1) | Future; requires 2× H200 GPU |

Switching backends is a single environment variable change: `MEDSEAL_CLINICAL_LLM_BACKEND=vllm`.

### 1.4 API Endpoints

| Method | Path | Surface | Description |
|---|---|---|---|
| `POST` | `/sessions` | Patient | Create new chat session |
| `POST` | `/sessions/{id}/messages` | Patient | Send message (sync) |
| `POST` | `/sessions/{id}/messages/stream` | Patient | Send message (SSE streaming) |
| `GET` | `/sessions/{id}/messages` | Patient | Get conversation history |
| `DELETE` | `/sessions/{id}` | Patient | Delete session |
| `POST` | `/patients/{id}/previsit-summary` | Clinician | Generate pre-visit summary |
| `POST` | `/openemr/sessions/{id}/chat` | Clinician | Doctor chat (SSE) |
| `POST` | `/openemr/sessions/{id}/chat/sync` | Clinician | Doctor chat (sync) |
| `POST` | `/cds-services/patient-view` | CDS Hooks | Trigger insight synthesis |
| `POST` | `/triggers/{type}` | System | Fire nudge/measurement triggers |
| `GET` | `/agents` | Admin | List registered agents |
| `GET` | `/agents/{id}/health` | Admin | Agent health check |
| `GET` | `/health` | Admin | System health check |

### 1.5 External Dependencies

| Service | Endpoint | Purpose |
|---|---|---|
| **SEA-LION API** | `api.sea-lion.ai/v1` | Clinical reasoning + Conversation + Guard |
| **Medplum FHIR** | `fhir.medseal.34.54.226.15.nip.io/fhir/R4` | Patient health records (GKE) |

---

## 2. Project Structure

```
agent/
├── main.py                        # FastAPI app, lifespan, agent graph compilation
├── config.py                      # Pydantic Settings (env vars, MEDSEAL_ prefix)
├── agents/                        # LangGraph agent definitions
│   ├── companion.py               # A1: patient conversation
│   ├── clinical.py                # A2: clinical reasoning
│   ├── doctor_cds.py              # Doctor chat + CDS
│   ├── nudge.py                   # A3: proactive nudges
│   ├── lifestyle.py               # A4: dietary coaching
│   ├── insight.py                 # A5: insight synthesis
│   ├── previsit.py                # Pre-visit summary
│   └── measurement.py             # A6: analytics
├── api/
│   └── routes.py                  # All FastAPI route handlers
├── core/
│   ├── orchestrator.py            # Intent classification + agent routing
│   ├── guard.py                   # SEA-LION Guard (input/output safety)
│   ├── llm_factory.py             # SEA-LION / vLLM backend selector
│   ├── graph.py                   # Legacy agent graph
│   ├── identity.py                # Agent identity definitions
│   ├── language.py                # Language detection
│   └── router.py                  # Message routing logic
└── tools/
    ├── fhir_client.py             # Medplum FHIR client (httpx)
    ├── fhir_tools_clinical.py     # FHIR tools for A2
    ├── fhir_tools_companion.py    # FHIR tools for A1
    ├── fhir_tools_nudge.py        # FHIR tools for A3
    ├── fhir_tools_lifestyle.py    # FHIR tools for A4
    ├── fhir_tools_insight.py      # FHIR tools for A5
    ├── fhir_tools_previsit.py     # FHIR tools for pre-visit
    ├── fhir_tools_measurement.py  # FHIR tools for A6
    ├── fhir_tools_appointment.py  # Appointment management
    └── medical_tools.py           # Medical knowledge tools
```

---

## 3. GCP Cloud Run Deployment

### 3.1 GCP Resources

| Resource | Name | Details |
|---|---|---|
| **Project** | `gen-lang-client-0538005727` | Gemini Project1 |
| **Cloud Run Service** | `medseal-agent` | `asia-southeast1` (Singapore) |
| **GKE Cluster** | `medseal-cluster` | 2× `ek-standard-8` nodes |
| **Artifact Registry** | `cloud-run-source-deploy` | Docker images |
| **Ingress IP** | `34.54.226.15` | GKE external load balancer |

### 3.2 Environment Variables

```bash
# Clinical LLM (SEA-LION)
MEDSEAL_CLINICAL_LLM_BACKEND=azure
MEDSEAL_SEALION_API_URL=https://api.sea-lion.ai/v1
MEDSEAL_SEALION_API_KEY=
MEDSEAL_SEALION_MODEL=aisingapore/Qwen-SEA-LION-v4-32B-IT
MEDSEAL_SEAGUARD_MODEL=aisingapore/SEA-Guard

# Medplum FHIR (GKE)
MEDSEAL_MEDPLUM_URL=http://fhir.medseal.34.54.226.15.nip.io/fhir/R4
MEDSEAL_MEDPLUM_EMAIL=admin@example.com
MEDSEAL_MEDPLUM_PASSWORD=medplum_admin

# App behaviour
MEDSEAL_MAX_RECURSION=5
MEDSEAL_TEMPERATURE=0.6
```

### 3.3 Startup Command

```bash
uvicorn agent.main:app --host 0.0.0.0 --port 8000
```

### 3.4 Deployment Steps

#### Prerequisites

- Google Cloud SDK (`gcloud`) installed and authenticated
- GCP project with billing enabled
- Access to SEA-LION API (AI Singapore)

#### Step 1: Authenticate & Configure Project

```bash
export PATH="$HOME/google-cloud-sdk/bin:$PATH"
gcloud auth login --no-launch-browser
gcloud config set project gen-lang-client-0538005727
```

#### Step 2: Enable Required APIs

```bash
gcloud services enable \
  run.googleapis.com \
  cloudbuild.googleapis.com \
  artifactregistry.googleapis.com
```

#### Step 3: Deploy from Source

```bash
gcloud run deploy medseal-agent \
  --source /path/to/Med-SEAL \
  --region asia-southeast1 \
  --port 8000 \
  --memory 1Gi \
  --cpu 1 \
  --timeout 60 \
  --allow-unauthenticated \
  --set-env-vars="\
MEDSEAL_SEALION_API_KEY=,\
MEDSEAL_MEDPLUM_URL=http://fhir.medseal.34.54.226.15.nip.io/fhir/R4" \
  --quiet
```

#### Step 4: Verify

```bash
# Health check
curl https://medseal-agent-74997794842.asia-southeast1.run.app/health

# List agents
curl https://medseal-agent-74997794842.asia-southeast1.run.app/agents
```

Expected health response:

```json
{
  "status": "ok",
  "vllm": "unreachable",
  "redis": "ok",
  "medplum": "ok",
  "agents": {}
}
```

> `vllm: unreachable` is expected because Med-SEAL V1 (`med-r1`) is not deployed. The system uses SEA-LION via API instead.

### 3.5 Dockerfile

```dockerfile
FROM python:3.11-slim

WORKDIR /app

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential curl \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt agent/requirements_agent.txt ./
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir -r requirements_agent.txt

COPY agent /app/agent

ENV PYTHONPATH=/app
EXPOSE 8000

CMD ["uvicorn", "agent.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

### 3.6 Requirements (Deployment)

The slim deployment requirements (`requirements_deploy.txt`) include only runtime dependencies:

```
fastapi>=0.115
uvicorn[standard]
pydantic-settings
langchain>=0.3
langchain-openai
langchain-community
langgraph>=0.2.70
langgraph-checkpoint-redis
langdetect
ddgs
httpx
sqlalchemy
aiosqlite
```

Training dependencies (PyTorch, transformers, datasets, etc.) are excluded to keep the deployment package small.

---

## 4. Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│               GCP Cloud Run (asia-southeast1)               │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  FastAPI (uvicorn :8000)                              │  │
│  │                                                       │  │
│  │  /sessions/*          → Orchestrator → Agent Router   │  │
│  │  /openemr/sessions/*  → Orchestrator → Doctor CDS     │  │
│  │  /cds-services/*      → Orchestrator → Insight Agent  │  │
│  │  /triggers/*          → Orchestrator → Nudge Agent    │  │
│  │  /patients/*/previsit → Previsit Agent                │  │
│  │                                                       │  │
│  │  ┌──────────────┐ ┌──────────────┐ ┌─────────────┐    │  │
│  │  │ Companion A1 │ │ Clinical A2  │ │ Doctor CDS  │    │  │
│  │  │ Lifestyle A4 │ │ Nudge A3     │ │ Insight A5  │    │  │
│  │  │ Previsit     │ │ Measurement  │ │ Guard       │    │  │
│  │  └──────────────┘ └──────────────┘ └─────────────┘    │  │
│  │                                                       │  │
│  │  SQLite (medseal_sessions.db): session checkpointing  │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────┬──────────────────┬──────────────────┬─────────────┘
          │                  │                  │
    ┌─────▼──────┐    ┌──────▼──────┐    ┌──────▼──────┐
    │ SEA-LION   │    │ SEA-Guard   │    │ Medplum     │
    │ v4-32B     │    │ Safety LLM  │    │ FHIR R4     │
    │(clinical + │    │ api.sea-lion│    │ (GKE)       │
    │ companion) │    │ .ai/v1      │    │             │
    └────────────┘    └─────────────┘    └─────────────┘
```

---

## 5. Monitoring & Troubleshooting

### 5.1 Logs

```bash
# Cloud Run logs
gcloud run services logs read medseal-agent --region asia-southeast1 --limit 50

# GKE pod logs
kubectl logs -n medseal deployment/ai-service --tail=50
```

### 5.2 Common Issues

| Issue | Cause | Fix |
|---|---|---|
| `vllm: unreachable` in health | Expected: Med-SEAL V1 is not deployed | No action; SEA-LION handles clinical LLM |
| `medplum: unreachable` | GKE FHIR server down or URL misconfigured | Check GKE pods: `kubectl get pods -n medseal` |
| `STARTUP FAILED — running in degraded mode` | Missing env vars or API key issues | Check `MEDSEAL_*` env vars in Cloud Run |
| `SEA-LION API timeout` | Rate limit or network issue | Retry; check SEA-LION API status |
| `SQLite checkpointer unavailable` | File permission or dependency issue | Falls back to in-memory MemorySaver; sessions won't persist across restarts |

### 5.3 Scaling

Cloud Run auto-scales based on traffic. To adjust limits:

```bash
# Update CPU/memory
gcloud run services update medseal-agent \
  --region asia-southeast1 \
  --cpu 2 \
  --memory 2Gi

# Set max instances
gcloud run services update medseal-agent \
  --region asia-southeast1 \
  --max-instances 10
```

> **Note:** When scaling to multiple instances, switch session persistence from SQLite to Redis (`MEDSEAL_REDIS_URL`) since SQLite is per-instance.

---

## 6. Live Deployment URLs

| Service | URL | Platform |
|---|---|---|
| **AI Agent (API)** | `https://medseal-agent-74997794842.asia-southeast1.run.app` | Cloud Run |
| **Swagger UI** | `https://medseal-agent-74997794842.asia-southeast1.run.app/docs` | Cloud Run |
| **Patient App** | `app.medseal.34.54.226.15.nip.io` | GKE |
| **FHIR Server** | `fhir.medseal.34.54.226.15.nip.io` | GKE |
| **OpenEMR** | `emr.medseal.34.54.226.15.nip.io` | GKE |
| **Medplum Admin** | `medplum.medseal.34.54.226.15.nip.io` | GKE |

---

## 7. Related Pages

- {doc}`../technical-report-v1`: Med-SEAL V1 base model ([`aagdeyogipramana/Med-SEAL-V1`](https://huggingface.co/aagdeyogipramana/Med-SEAL-V1)) technical report
- {doc}`overview`: Multi-agent roster and orchestration
- {doc}`../architecture`: Full system architecture
- {doc}`gcp-deployment`: Detailed GCP deployment guide
- {doc}`../developer-guide/api-reference`: REST API reference
- {doc}`../developer-guide/environment-setup`: Local development setup