Deploy: 7 Services on Cloud Run
Deploy command
Each agent is deployed with `gcloud run deploy`. A single command builds the image with Cloud Build, pushes it to Artifact Registry, and creates the Cloud Run service:

    gcloud run deploy platform-architect \
      --source ./agents/platform-architect \
      --region us-central1 \
      --allow-unauthenticated \
      --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
      --memory=1Gi \
      --timeout=300

Key flags
| Flag | Value | Why |
|---|---|---|
| `--source` | Agent directory | Cloud Build builds the container image from the directory's Dockerfile automatically |
| `--region` | `us-central1` | Chosen for low latency to the Gemini API |
| `--allow-unauthenticated` | (none) | Simplifies orchestrator access (use IAM in production) |
| `--set-secrets` | `GEMINI_API_KEY=GEMINI_API_KEY:latest` | Injects the latest version of the Secret Manager secret as an environment variable |
| `--memory` | `1Gi` | Some agents need more memory for context processing |
| `--timeout` | `300` | Request timeout in seconds (5 minutes max); the Web Portal agent can take ~45s |
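The table notes that `--allow-unauthenticated` is a convenience and that IAM should be used in production. As a rough sketch of how that switch might look, the snippet below picks the access flag from a hypothetical `APP_ENV` variable (not part of the original setup) and only prints the resulting command. The `auth_flag` helper is likewise made up for illustration; `--no-allow-unauthenticated` is the standard gcloud counterpart of the flag used above.

```shell
# Sketch: choose the access flag per environment.
# APP_ENV and auth_flag are hypothetical names used only for this example.
auth_flag() {
  case "$1" in
    prod) echo "--no-allow-unauthenticated" ;;  # callers must authenticate via IAM
    *)    echo "--allow-unauthenticated" ;;     # open access, as in this walkthrough
  esac
}

APP_ENV="${APP_ENV:-dev}"

# Print the command instead of running it, so the effect of APP_ENV is visible.
echo gcloud run deploy platform-architect \
  --source ./agents/platform-architect \
  --region us-central1 \
  "$(auth_flag "$APP_ENV")" \
  --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
  --memory=1Gi \
  --timeout=300
```

With authentication required, the caller (for example the orchestrator's service account) needs the `roles/run.invoker` role on the service and must send an identity token in the `Authorization: Bearer` header.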
Deploying all 7 agents
Run the deploy command for each agent, changing the service name and source directory:

    # Agent 1
    gcloud run deploy platform-architect \
      --source ./agents/platform-architect \
      --region us-central1 \
      --allow-unauthenticated \
      --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
      --memory=1Gi --timeout=300

    # Agent 2
    gcloud run deploy infrastructure \
      --source ./agents/infrastructure \
      --region us-central1 \
      --allow-unauthenticated \
      --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
      --memory=1Gi --timeout=300

    # Agent 3
    gcloud run deploy security \
      --source ./agents/security \
      --region us-central1 \
      --allow-unauthenticated \
      --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
      --memory=1Gi --timeout=300

    # Agent 4
    gcloud run deploy cicd \
      --source ./agents/cicd \
      --region us-central1 \
      --allow-unauthenticated \
      --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
      --memory=1Gi --timeout=300

    # Agent 5
    gcloud run deploy observability \
      --source ./agents/observability \
      --region us-central1 \
      --allow-unauthenticated \
      --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
      --memory=1Gi --timeout=300

    # Agent 6
    gcloud run deploy devex \
      --source ./agents/devex \
      --region us-central1 \
      --allow-unauthenticated \
      --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
      --memory=1Gi --timeout=300

    # Agent 7
    gcloud run deploy web-portal \
      --source ./agents/web-portal \
      --region us-central1 \
      --allow-unauthenticated \
      --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
      --memory=1Gi --timeout=300

Each deploy takes 2-4 minutes (Cloud Build + image push + service creation).
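The seven commands above differ only in the service name and source directory, so they can be collapsed into a loop. The following is a minimal sketch, not part of the original setup: `deploy_agent`, `deploy_all`, and the `DRY_RUN` variable are hypothetical names, and the script assumes each service name matches its directory under `./agents/`. By default it only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
# Sketch: deploy all seven agents with one loop.
# DRY_RUN, deploy_agent and deploy_all are hypothetical names for this example.
AGENTS="platform-architect infrastructure security cicd observability devex web-portal"
REGION="us-central1"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

deploy_agent() {
  # $1 = service name, assumed to match the directory name under ./agents/
  set -- gcloud run deploy "$1" \
    --source "./agents/$1" \
    --region "$REGION" \
    --allow-unauthenticated \
    --set-secrets=GEMINI_API_KEY=GEMINI_API_KEY:latest \
    --memory=1Gi \
    --timeout=300
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy_all() {
  for agent in $AGENTS; do
    # Stop at the first failure so a broken build is not buried in later output.
    deploy_agent "$agent" || { echo "deploy failed: $agent" >&2; return 1; }
  done
}

deploy_all
```

The deploys run sequentially here, so the total time is roughly seven times the per-service figure quoted above.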
Verify deployed services
List all Cloud Run services to confirm the 7 agents are running:

    gcloud run services list --region us-central1

You should see 7 services with status ✔ Ready:

    SERVICE             REGION       URL                                      LAST DEPLOYED
    platform-architect  us-central1  https://platform-architect-HASH.run.app  2026-04-14
    infrastructure      us-central1  https://infrastructure-HASH.run.app      2026-04-14
    security            us-central1  https://security-HASH.run.app            2026-04-14
    cicd                us-central1  https://cicd-HASH.run.app                2026-04-14
    observability       us-central1  https://observability-HASH.run.app       2026-04-14
    devex               us-central1  https://devex-HASH.run.app               2026-04-14
    web-portal          us-central1  https://web-portal-HASH.run.app          2026-04-14

Health check: agent.json
Each agent serves its `agent.json` at the root of its service URL. Use `curl` to verify that an agent is alive and responding:

    curl https://platform-architect-HASH.run.app/agent.json

Expected response:

    {
      "name": "platform-architect",
      "description": "Analyzes task requirements and decides the complete technology stack",
      "url": "https://platform-architect-HASH.run.app",
      "version": "1.0.0",
      "capabilities": {
        "streaming": false,
        "pushNotifications": false
      }
    }

If the response is empty or the request returns an error, check the service logs:

    gcloud run services logs read platform-architect --region us-central1 --limit 20

Scale-to-zero behavior
After about 15 minutes without traffic, Cloud Run scales the service to zero instances. This means:
- No cost when agents are not in use
- Cold start of 3-5 seconds on the first request after idle
- The orchestrator accounts for this with appropriate timeouts
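How the orchestrator handles cold starts is not shown in this section. As an illustration only, the sketch below probes an agent's `agent.json` with a generous per-request timeout and a few retries, so that a first request hitting a scaled-to-zero service is not reported as a failure. The names `check_agent`, `FETCH`, `MAX_ATTEMPTS`, and `RETRY_DELAY` are hypothetical, and the 30-second timeout and retry counts are assumed values, not settings taken from the orchestrator.

```shell
# Sketch: health-check an agent while tolerating a cold start.
# check_agent, FETCH, MAX_ATTEMPTS and RETRY_DELAY are hypothetical names.
MAX_ATTEMPTS="${MAX_ATTEMPTS:-3}"
RETRY_DELAY="${RETRY_DELAY:-2}"            # seconds to wait between attempts
FETCH="${FETCH:-curl -fsS --max-time 30}"  # 30s leaves ample room for a 3-5s cold start

check_agent() {
  # $1 = expected service name, $2 = service base URL
  attempt=1
  while :; do
    # Succeed only if the request works AND the card reports the expected name.
    if body=$($FETCH "$2/agent.json" 2>/dev/null) &&
       printf '%s' "$body" | grep -q "\"name\"[[:space:]]*:[[:space:]]*\"$1\""; then
      echo "ok $1"
      return 0
    fi
    [ "$attempt" -ge "$MAX_ATTEMPTS" ] && break
    attempt=$((attempt + 1))
    sleep "$RETRY_DELAY"
  done
  echo "FAIL $1" >&2
  return 1
}

# Example (requires gcloud and a deployed service):
#   url=$(gcloud run services describe platform-architect \
#           --region us-central1 --format='value(status.url)')
#   check_agent platform-architect "$url"
```

Checking the `name` field rather than just the HTTP status guards against a URL that answers but belongs to a different service.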
Next step: Demo — HTTP orchestrator and full chain →