Deploy: 7 Services on Cloud Run

Deploy command

Each agent is deployed with gcloud run deploy. With --source, a single command builds the image with Cloud Build, pushes it to Artifact Registry, and creates the Cloud Run service:

gcloud run deploy platform-architect \
  --source ./agents/platform-architect \
  --region us-central1 \
  --allow-unauthenticated \
  --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
  --memory=1Gi \
  --timeout=300

Key flags

Flag	Value	Why
`--source`	Agent directory	Cloud Build builds the container image from the directory's Dockerfile
`--region`	`us-central1`	Chosen for low latency to the Gemini API
`--allow-unauthenticated`	—	Simplifies orchestrator access (use IAM in production)
`--set-secrets`	`GEMINI_API_KEY=GEMINI_API_KEY:latest`	Injects the Secret Manager value as an env var
`--memory`	`1Gi`	Some agents need more memory for context processing
`--timeout`	`300`	Request timeout in seconds (5 minutes max); Web Portal can take ~45 s
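
The table notes that IAM should replace --allow-unauthenticated in production. A minimal sketch of what that could look like follows. The service account name is hypothetical (PROJECT_ID is a placeholder), the helper names run and lock_down_service are illustrative, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

SERVICE="platform-architect"
REGION="us-central1"
# Hypothetical caller identity: replace with the service account the orchestrator runs as.
INVOKER="serviceAccount:orchestrator@PROJECT_ID.iam.gserviceaccount.com"

# DRY_RUN=1 (the default in this sketch) prints commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

lock_down_service() {
  # Redeploy the service without public access.
  run gcloud run deploy "$SERVICE" \
    --source "./agents/$SERVICE" \
    --region "$REGION" \
    --no-allow-unauthenticated \
    --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
    --memory=1Gi \
    --timeout=300
  # Allow only the orchestrator's identity to invoke it.
  run gcloud run services add-iam-policy-binding "$SERVICE" \
    --region "$REGION" \
    --member "$INVOKER" \
    --role roles/run.invoker
}

lock_down_service
```

With this setup, callers would need to send an identity token in the Authorization header; how the orchestrator obtains one is not covered here.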

Deploying all 7 agents

Run the deploy command for each agent, changing the service name and source directory:

# Agent 1
gcloud run deploy platform-architect \
  --source ./agents/platform-architect \
  --region us-central1 \
  --allow-unauthenticated \
  --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
  --memory=1Gi --timeout=300

# Agent 2
gcloud run deploy infrastructure \
  --source ./agents/infrastructure \
  --region us-central1 \
  --allow-unauthenticated \
  --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
  --memory=1Gi --timeout=300

# Agent 3
gcloud run deploy security \
  --source ./agents/security \
  --region us-central1 \
  --allow-unauthenticated \
  --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
  --memory=1Gi --timeout=300

# Agent 4
gcloud run deploy cicd \
  --source ./agents/cicd \
  --region us-central1 \
  --allow-unauthenticated \
  --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
  --memory=1Gi --timeout=300

# Agent 5
gcloud run deploy observability \
  --source ./agents/observability \
  --region us-central1 \
  --allow-unauthenticated \
  --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
  --memory=1Gi --timeout=300

# Agent 6
gcloud run deploy devex \
  --source ./agents/devex \
  --region us-central1 \
  --allow-unauthenticated \
  --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
  --memory=1Gi --timeout=300

# Agent 7
gcloud run deploy web-portal \
  --source ./agents/web-portal \
  --region us-central1 \
  --allow-unauthenticated \
  --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
  --memory=1Gi --timeout=300

Each deploy takes 2-4 minutes (Cloud Build + image push + service creation).
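
Since the seven commands above differ only in the service name and source directory, they can be collapsed into a loop. The following is a sketch rather than the tutorial's own script: the function names deploy_agent and deploy_all are illustrative, and it prints the commands by default (set DRY_RUN=0 to run them).

```shell
#!/bin/sh
set -eu

# Service names double as directory names under ./agents, as in the commands above.
AGENTS="platform-architect infrastructure security cicd observability devex web-portal"

# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

deploy_agent() {
  name="$1"
  # Build the argument list once so the printed and executed commands match.
  set -- gcloud run deploy "$name" \
    --source "./agents/$name" \
    --region us-central1 \
    --allow-unauthenticated \
    --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
    --memory=1Gi \
    --timeout=300
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy_all() {
  for agent in $AGENTS; do
    deploy_agent "$agent"
  done
}

deploy_all
```

The loop deploys sequentially and, because of set -e, stops at the first failure, so a broken build in one agent does not go unnoticed.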

Verify deployed services

List all Cloud Run services to confirm that all 7 agents are deployed:

gcloud run services list --region us-central1

You should see 7 services with status ✔ Ready:

SERVICE              REGION        URL                                              LAST DEPLOYED
platform-architect   us-central1   https://platform-architect-HASH.run.app          2026-04-14
infrastructure       us-central1   https://infrastructure-HASH.run.app              2026-04-14
security             us-central1   https://security-HASH.run.app                    2026-04-14
cicd                 us-central1   https://cicd-HASH.run.app                        2026-04-14
observability        us-central1   https://observability-HASH.run.app               2026-04-14
devex                us-central1   https://devex-HASH.run.app                       2026-04-14
web-portal           us-central1   https://web-portal-HASH.run.app                  2026-04-14
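
To check the list programmatically rather than by eye, one option is to compare the deployed service names against the expected seven. The sketch below reads names from standard input, one per line. The helper name missing_services is illustrative, and the --format expression in the usage comment is an assumption about the field path in the gcloud output.

```shell
#!/bin/sh
set -eu

EXPECTED="platform-architect infrastructure security cicd observability devex web-portal"

# Reads deployed service names from stdin (one per line) and prints
# every expected name that is absent. Prints nothing if all are present.
missing_services() {
  deployed="$(cat)"
  for name in $EXPECTED; do
    # -x matches the whole line, -F treats the name as a literal string.
    printf '%s\n' "$deployed" | grep -qxF "$name" || echo "$name"
  done
}

# Usage (requires gcloud; field path assumed):
#   gcloud run services list --region us-central1 \
#     --format='value(metadata.name)' | missing_services
```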

Health check — agent.json

Each agent serves its agent.json at the root of its service URL. Use curl to verify that an agent is up and responding:

curl https://platform-architect-HASH.run.app/agent.json

Expected response:

{
  "name": "platform-architect",
  "description": "Analyzes task requirements and decides the complete technology stack",
  "url": "https://platform-architect-HASH.run.app",
  "version": "1.0.0",
  "capabilities": {
    "streaming": false,
    "pushNotifications": false
  }
}

If the response is empty or the request returns an error, check the service logs:

gcloud run services logs read platform-architect --region us-central1 --limit 20
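
The same check can be scripted so that a wrong or missing agent card fails loudly. This sketch assumes the response has the shape shown above and does only a loose textual match on the name field; the helper name check_agent_card is illustrative, and a JSON parser such as jq would be stricter.

```shell
#!/bin/sh
set -eu

# Reads an agent.json document from stdin and checks that its "name"
# field equals the expected service name. Returns non-zero on mismatch.
check_agent_card() {
  expected="$1"
  if grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$expected\""; then
    echo "OK $expected"
  else
    echo "FAIL $expected"
    return 1
  fi
}

# Usage (requires network access; HASH is the placeholder used above):
#   curl -fsS https://platform-architect-HASH.run.app/agent.json \
#     | check_agent_card platform-architect
```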

Scale-to-zero behavior

When a service receives no traffic, Cloud Run scales it down to zero instances; idle instances may be kept for up to about 15 minutes before they are removed. This means:

No cost when agents are not in use
Cold start of 3-5 seconds on the first request after idle
The orchestrator accounts for this with appropriate timeouts
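
One way a caller can tolerate a cold start is a generous per-request timeout combined with a small number of retries. The orchestrator's actual logic is not shown in this section, so the following is only a shell sketch of the idea; the helper name retry and the specific attempt count, delay, and timeout values are assumptions.

```shell
#!/bin/sh
set -eu

# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted,
# sleeping DELAY_SECONDS between tries.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (requires network access; values are illustrative):
#   retry 3 2 curl -fsS --max-time 60 https://web-portal-HASH.run.app/agent.json
```

The --max-time value should sit above the cold start plus the slowest expected response (the table above cites ~45 s for Web Portal) and below the service's 300-second timeout.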

Next step: Demo — HTTP orchestrator and full chain →