The rollout track
background sub-tasks · SCIM + RBAC · spend caps · OpenTelemetry · SSO · network egress
Personal Cowork is a productivity unlock. Team Cowork is an IT decision. The moment three people on your team are running scheduled agents that touch shared connectors, you have a fleet to manage. This module is the rollout playbook — the admin features that exist, the gotchas that surface only at the SCIM-and-OTel stage, and the order to enable things so you don't have to redo the work.
Anthropic's enterprise admin guide is the right reference for the dashboards. It doesn't tell you which group spend cap to set first, where the SSO-enforcement timing trap is, or how to read the OpenTelemetry stream so on-call gets a useful page instead of noise.
Five workflows: background sub-task trees, SCIM + RBAC, group spend limits, OpenTelemetry, and the SSO + network egress gotchas. Closing section: the 7-step rollout playbook.
Parallel agents, one task panel
// audience: anyone scaling beyond single-threaded runs
Cowork can spawn sub-tasks: agents that run in parallel and report back to the parent. The task panel shows them as a tree. This is the surface that makes "process 200 tickets" finish in 4 minutes instead of 40.
Three patterns you'll use:
- Fan-out, gather. Parent says "for each item, do X." Sub-agents work in parallel. Parent aggregates results. Pattern fits report generation, batch analysis, multi-source research.
- Pipeline. Sub-agents form a chain — one's output is the next's input. Pattern fits ETL-shaped work where each step is its own bounded run.
- Worker pool. Parent holds a queue; pool of sub-agents pulls work. Pattern fits long queues where you want bounded concurrency.
The cost note: a fan-out of 50 spends roughly 50× the tokens of one run. Bound the concurrency (worker-pool pattern) or cap items per parent run. Group spend limits (workflow 3) are your backstop.
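The two bounds in the cost note can be sketched in a few lines of Python. This is an illustration of the pattern only, not Cowork's own API: `run_subtask`, `fan_out`, `max_concurrency`, and `max_items` are hypothetical names, and the sleep stands in for a real sub-agent run.

```python
import asyncio


async def run_subtask(item: str) -> str:
    """Hypothetical stand-in for one sub-agent run; replace with real work."""
    await asyncio.sleep(0.01)  # simulate the sub-agent doing something
    return f"done:{item}"


async def fan_out(items: list[str], max_concurrency: int = 5,
                  max_items: int = 50) -> list[str]:
    """Fan out over items, capping both total items and in-flight sub-tasks."""
    if len(items) > max_items:
        raise ValueError(f"{len(items)} items exceeds per-run cap of {max_items}")
    gate = asyncio.Semaphore(max_concurrency)

    async def bounded(item: str) -> str:
        async with gate:  # at most max_concurrency sub-tasks run at once
            return await run_subtask(item)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(bounded(i) for i in items))


results = asyncio.run(fan_out([f"ticket-{n}" for n in range(12)], max_concurrency=4))
```

The item cap fails loudly before any tokens are spent; the semaphore only slows a large batch down, it does not make it cheaper.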
Codex (cloud agents): can run multiple branch agents in parallel, but composition is shaped around PRs, not arbitrary work. Copilot Workspace: single-track per task. Custom orchestration via API: always an option in any of the three, with the cost of building and running the orchestration yourself.
Provisioning that survives churn
// audience: IT / IAM owners
On the Team and Enterprise plans, Cowork supports SCIM provisioning from your identity provider (Entra ID, Okta, Google Workspace). The win isn't onboarding speed; it's offboarding. When a contractor leaves, SCIM revokes the seat and the connector grants in one move; manual provisioning leaves orphan grants behind.
The role model:
- Member — can use Cowork; can install marketplace plugins for themselves; can wire personal hooks.
- Admin — same plus connector approvals at the team level, spend cap administration, audit log read.
- Owner — same plus billing, SSO config, network egress, deletion.
The SCIM group pattern we recommend — align Cowork groups with IdP groups, not with one-off "Cowork users":
- cowork-members-eng: everyone in engineering, default member.
- cowork-members-ops: ops/IT, often with a tighter connector list.
- cowork-admins-platform: the two or three platform-team people who curate plugins and audit grants.
- cowork-owners: the two billing/security owners.
The gotcha: if your IdP doesn't surface job-function attributes, you'll need to maintain Cowork groups manually. That defeats the SCIM win. Map the source-of-truth attribute (department, costCenter, employeeType) to Cowork groups in the SCIM connector once, and the mapping holds.
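As a rough sketch of that one-time mapping, here is the logic in Python. In practice it lives in your IdP's SCIM connector configuration rather than in code; the `IdpUser` shape and the department values are assumptions for illustration, and only the group names come from the pattern above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdpUser:
    """Hypothetical slice of an IdP user record."""
    email: str
    department: str
    employee_type: str


# Assumed department values; substitute whatever your IdP actually emits.
DEPARTMENT_TO_GROUP = {
    "engineering": "cowork-members-eng",
    "operations": "cowork-members-ops",
    "it": "cowork-members-ops",
}


def cowork_groups(user: IdpUser) -> list[str]:
    """Derive group membership from the source-of-truth attribute only."""
    group = DEPARTMENT_TO_GROUP.get(user.department.strip().lower())
    # Unmapped departments get no group, so they surface for review
    # instead of silently receiving a default grant.
    return [group] if group else []
```

Admin and owner groups are deliberately left out of the attribute mapping here; with two or three people in each, explicit assignment is easier to audit.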
ChatGPT Enterprise and M365 Copilot both ship SCIM + similar role hierarchies. This is the workflow where the three are most comparable — pick by which IdP/connector ecosystem you already trust, not by feature checkbox.
The conversation finance asks about
// audience: anyone with a quarterly budget
Cowork meters at three levels: per-task, per-user, per-group. The admin dashboard exposes spend caps at the latter two. The rule of thumb from real rollouts:
- Cap groups, not individuals. Per-user caps are tedious and they don't catch the actual failure mode — one person's runaway schedule eating the team's budget. Group cap with per-user alerting fires earlier.
- Set the cap at 1.3× the previous month's actual. Tight enough that a runaway shows up; loose enough that legitimate growth doesn't page on-call.
- Use anomaly alerts, not hard cutoffs, for the first quarter. A hard cutoff in week 3 of rollout will block someone's actual work. Alerts give you the signal without the false positive cost.
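The arithmetic behind the last two rules, as a small sketch. The function names and the dollar figures are ours, not anything the admin dashboard exposes; the point is that the same threshold can alert in the first quarter and block later.

```python
def group_cap(previous_month_actual: float, multiplier: float = 1.3) -> float:
    """Cap for a group: 1.3x last month's actual spend by default."""
    return round(previous_month_actual * multiplier, 2)


def check_spend(spend_to_date: float, cap: float,
                enforce_hard_cutoff: bool = False) -> str:
    """Return 'ok', 'alert', or 'block' for the group's current spend."""
    if spend_to_date < cap:
        return "ok"
    # During the first quarter, keep enforce_hard_cutoff False so crossing
    # the cap raises an alert instead of stopping someone's work.
    return "block" if enforce_hard_cutoff else "alert"


cap = group_cap(1000.0)          # a group that spent 1000 last month
status = check_spend(1350.0, cap)
```

With these example numbers the cap works out to 1300 and the status is "alert".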
Where spend hides (the same shapes from C02, restated for admin):
- Unbounded fan-out in sub-task spawning (workflow 1 above).
- Scheduled tasks that run on weekends/holidays for no reason — tighten cron.
- Long-context reads against large source sets — cap source count in the prompt.
- Plugin / skill that quietly loads with a connector you didn't expect — the operator's layer (C04) is the audit surface here.
Cost telemetry to a dashboard: Cowork exposes spend in the admin API. We pipe ours into Grafana with a weekly digest. The actionable view isn't "total spend" — it's "spend slope per group week-over-week." Flat is fine; doubling is the signal.
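A minimal sketch of the slope view, assuming you have already pulled weekly spend totals per group into plain lists (the admin API call itself is not shown, and the data shape here is an assumption). A ratio near 1.0 is flat; 2.0 or more is the doubling signal.

```python
def weekly_slopes(weekly_spend: list[float]) -> list[float]:
    """Ratio of each week's spend to the week before it."""
    slopes = []
    for prev, cur in zip(weekly_spend, weekly_spend[1:]):
        if prev > 0:
            slopes.append(cur / prev)
        else:
            # No spend the prior week: flat if still zero, otherwise unbounded growth.
            slopes.append(1.0 if cur == 0 else float("inf"))
    return slopes


def flag_groups(spend_by_group: dict[str, list[float]],
                threshold: float = 2.0) -> list[str]:
    """Groups whose most recent week-over-week ratio meets the threshold."""
    flagged = []
    for group, series in spend_by_group.items():
        slopes = weekly_slopes(series)
        if slopes and slopes[-1] >= threshold:
            flagged.append(group)
    return sorted(flagged)


flagged = flag_groups({
    "cowork-members-eng": [100.0, 110.0, 230.0],  # doubled in the latest week
    "cowork-members-ops": [80.0, 82.0, 81.0],     # flat
})
```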
ChatGPT Enterprise meters per-seat; M365 Copilot meters per-license. Cowork's per-group cap with per-task metering is more granular — useful if your usage is uneven across teams, less so if everyone uses roughly the same.
Read the runs like infra
// audience: anyone with an observability stack
The Enterprise plan exports Cowork events as OpenTelemetry traces. The stream covers run lifecycle, tool invocations, connector calls, errors, and spend per step. Your existing observability stack (Grafana, Datadog, Honeycomb, Splunk) ingests it.
The three dashboards worth wiring on day one:
- Run health — runs by status (succeeded / failed / timed out / cancelled) grouped by user and skill. The "is the loop working" view. Page on a sudden spike in failures.
- Tool latency — p50/p95 per connector and MCP. A slow connector poisons every workflow that touches it; this is the surface where you catch it.
- Spend per group, per skill — the cost-anomaly view from workflow 3, now living in the same panel as your infra dashboards.
The on-call rule: page only on the run-health spike. Tool latency and spend get a daily digest. The volume-shaped signals are not page-worthy — if you wake someone for a spend bump that turned out to be a legitimate batch run, they'll mute the channel next week and miss the actual outage.
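The on-call rule reduces to a small routing function. This sketch assumes you have already reduced the trace stream to a list of run statuses per window; the signal names, the 3× spike factor, and the 10% floor are illustrative thresholds of ours, not values from the export.

```python
from collections import Counter


def failure_rate(statuses: list[str]) -> float:
    """Share of runs in a window that failed or timed out."""
    if not statuses:
        return 0.0
    counts = Counter(statuses)
    return (counts["failed"] + counts["timed out"]) / len(statuses)


def route_signal(kind: str, current_rate: float = 0.0, baseline_rate: float = 0.0,
                 spike_factor: float = 3.0, min_rate: float = 0.10) -> str:
    """Decide where a signal goes: 'page', 'daily_digest', or 'none'."""
    if kind == "run_health":
        # Page only when failures are both high in absolute terms and
        # well above the recent baseline.
        spiking = current_rate >= min_rate and current_rate >= baseline_rate * spike_factor
        return "page" if spiking else "none"
    if kind in ("tool_latency", "spend"):
        return "daily_digest"  # volume-shaped signals never page
    raise ValueError(f"unknown signal kind: {kind}")


window = ["succeeded"] * 6 + ["failed"] * 3 + ["timed out"]
decision = route_signal("run_health", failure_rate(window), baseline_rate=0.05)
```

The absolute floor matters: without it, a jump from one failure to three on a quiet night would page someone.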
ChatGPT Enterprise exposes logs via API. M365 Copilot writes to the M365 audit log; you'd pull via the unified audit API into your SIEM. OTel-shaped streams are the cleanest of the three for plugging into an existing observability stack with traces, not just discrete log lines.
Two gotchas worth knowing in advance
// audience: IT / security owners
Gotcha #1: SSO enforcement timing. When you flip SSO from "optional" to "required" on the Cowork workspace, every existing session that didn't authenticate via SSO is terminated. If half the team logged in with email/password during the trial, they're all going to bounce at once. The fix:
- Provision the SSO connection. Verify with one test user.
- Send a heads-up notice 24h before enforcement — "you'll be asked to re-auth via SSO tomorrow at 09:00."
- Flip the toggle outside business hours.
- Monitor the help-desk queue for the first 4 hours after — there will be a handful of "I can't log in" tickets; usually browser cache or stale extensions.
Gotcha #2 — network egress allowlists. If your network blocks outbound traffic by default, Cowork needs allowlist entries for: Anthropic API endpoints, the connector OAuth endpoints (varies per connector), MCP server hostnames, and your enabled marketplace plugin endpoints. Miss one and the symptom is silent: the agent thinks for a long time, then fails the tool call with a generic network error. Two specific moves that save weeks of debugging:
- Allowlist by hostname, not by IP. Anthropic and connector hosts rotate IPs. Hostname-based allowlists survive the rotation; IP-based ones go down on the first deploy.
- Verify with a canary prompt after each allowlist change — a multi-connector prompt that exercises Slack + Drive + an MCP. If it succeeds, your allowlist is correct; if it fails on the second tool, you missed an entry.
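Before running the canary, a quick pre-check of the allowlist against the hostnames you expect to need can catch the obvious misses. A minimal sketch, assuming the allowlist is a list of hostname patterns with `*` wildcards; every hostname below is a placeholder, not a real Anthropic or connector endpoint.

```python
from fnmatch import fnmatchcase


def missing_hosts(required: list[str], allowlist: list[str]) -> list[str]:
    """Required hostnames that no allowlist pattern covers."""
    patterns = [p.lower() for p in allowlist]
    return [
        host for host in required
        if not any(fnmatchcase(host.lower(), p) for p in patterns)
    ]


# Placeholder hostnames for illustration only.
required = [
    "api.vendor.example",
    "oauth.connector.example",
    "mcp.internal.example",
]
allowlist = ["api.vendor.example", "*.connector.example"]

gaps = missing_hosts(required, allowlist)
```

Here `gaps` contains only the MCP hostname, which is the entry a canary would trip over on its MCP tool call. The pre-check does not replace the canary: it only knows the hostnames you thought to list.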
The audit log is where both gotchas show their tracks. Failed SSO attempts, blocked egress — both leave a line. Read it during rollout, not after.
Both gotchas have direct equivalents in ChatGPT Enterprise and M365 Copilot rollouts — SSO timing and egress allowlists are universal IT rollout patterns, not Cowork-specific. The Cowork detail to know is the canary-prompt pattern using multi-connector composition; that's the fast verification surface.
From personal use to team capability
// audience: whoever owns the rollout
Sequence matters. This is the order we recommend, with the gate between each step:
- One operator runs Cowork solo for two weeks. Gate: they can name their three highest-ROI workflows. If they can't, the team-wide rollout is premature.
- Add two more operators on the same plan. Gate: each runs the prior operator's top skill at least once and reports back. The skill library starts here.
- Move to Team plan, provision SSO and one SCIM group. Gate: deactivating a test user revokes connectors cleanly. SSO timing gotcha rehearsed.
- Wire group spend limits at 1.3× current actual. Gate: a deliberately runaway test prompt hits the alert, not the cap.
- Stand up OpenTelemetry to your observability stack. Gate: three dashboards live, on-call alert wired for run-health spike only.
- Open marketplace plugins to the team, with a curation policy. Gate: one admin owns the approved-list and the quarterly re-audit.
- Document the operator's layer (C04) as a team artifact. Gate: skills, hooks, plugin list, memory hygiene are all in the team handbook or repo. The rollout is done when this lives outside any one person's head.
Each step is reversible. Skip none. Compressing the timeline rarely saves more than re-doing one step costs.
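If you track the rollout in a script or a ticket template, the "skip none" rule is easy to encode. A small sketch, with step labels abbreviated from the list above; the data shape is our assumption.

```python
from typing import Optional

# Step labels abbreviated from the playbook, in order.
ROLLOUT_STEPS = [
    "solo operator for two weeks",
    "two more operators",
    "Team plan, SSO, one SCIM group",
    "group spend limits",
    "OpenTelemetry export",
    "marketplace plugins with curation",
    "operator's layer documented",
]


def next_step(gates_passed: list[bool]) -> Optional[str]:
    """First step whose gate has not passed, or None when all are done."""
    if len(gates_passed) != len(ROLLOUT_STEPS):
        raise ValueError("expected one gate result per step")
    for step, passed in zip(ROLLOUT_STEPS, gates_passed):
        if not passed:
            return step  # later gates are ignored: no skipping ahead
    return None


current = next_step([True, True, False, True, False, False, False])
```

In this example spend limits were wired early, but the rollout still sits at the SSO and SCIM step until its gate passes.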
You ran the rollout. We help with the next stack.
Module C05 of five in the Cowork for IT Pros track. If you want to layer this with the Second Brain Field Guide or our Vibe Coder / Vibe Management tracks, the Track Packs at /level-up bundle everything.