The Scenario
You're an operations analyst at a mid-sized consultancy. Your manager has asked you to "start using Cowork for real work" but wants evidence that you understand what it's actually doing under the hood — not just that you can type a prompt and get a result. She wants a written analysis of a single Cowork session: how it planned the task, what tools it used, whether it spawned sub-agents, and what security boundaries were in play.
This tutorial walks you through running a deliberately complex task, then auditing every observable aspect of the session. By the end, you'll have a document that demonstrates genuine architectural understanding — the kind of analysis that separates someone who uses Cowork from someone who understands it.
Most people skip straight to "getting work done" with Cowork. That's understandable — the tool is designed to be productive immediately. But understanding the agentic architecture beneath the interface changes how you delegate tasks, how you write prompts, and how you troubleshoot when things go wrong. This audit gives you that understanding through direct observation rather than theory.
What You Will Learn
By completing this audit, you will be able to:
- Read and interpret an execution plan before granting permission — and know what to look for
- Identify when and why Cowork spawns sub-agents for parallel processing
- Verify that the sandboxed VM enforces folder-level isolation
- Estimate token consumption and its implications for team-wide deployment
- Articulate the difference between what Cowork planned and what it actually executed
- Produce a professional assessment document suitable for a team lead or security reviewer
Prerequisites
- Claude Desktop installed with Cowork enabled (Pro or Max plan required)
- A folder containing at least 15–20 mixed files — PDFs, CSVs, images, text documents, spreadsheets. The more variety, the better the audit. If you don't have files handy, create a folder called Audit-Lab on your desktop and populate it with sample files from different projects.
- A text editor or word processor for writing your analysis
Use real work files if you can — sanitised copies of actual reports, data exports, or project documents. The audit is far more valuable when Cowork is processing realistic content rather than dummy data.
Step 1: Prepare Your Working Directory
Create a dedicated folder called Cowork-Audit-Lab on your desktop. Copy your 15–20 mixed files into it. Don't point Cowork at your Documents or Desktop folder — you want a scoped workspace that demonstrates the principle of least privilege.
Before you start, open the folder in Finder and note the exact file count, total size, and the types present. Write this down. You'll need it later to verify what Cowork could and couldn't see.
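If you would rather script this inventory than count by hand, a minimal Python sketch along these lines works (it assumes the Cowork-Audit-Lab folder sits on your desktop; adjust the path if yours differs):

```python
from collections import Counter
from pathlib import Path

# Scoped workspace created in this step; adjust if you named or placed it differently.
folder = Path.home() / "Desktop" / "Cowork-Audit-Lab"

files = [p for p in folder.rglob("*") if p.is_file()]
total_bytes = sum(p.stat().st_size for p in files)
types = Counter(p.suffix.lower() or "(no extension)" for p in files)

print(f"File count: {len(files)}")
print(f"Total size: {total_bytes / 1_000_000:.2f} MB")
for ext, count in types.most_common():
    print(f"  {ext}: {count}")
```

Keep the printed list somewhere outside the folder; you will compare it against the post-run contents in Step 5.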
Checkpoint: You have a folder with 15+ files of mixed types, and you have recorded the file inventory manually.
Step 2: Design a Multi-Step Task
You need a task complex enough to trigger planning, multiple tool calls, and ideally sub-agent parallelism. A simple "summarise this file" won't show you much.
Use this prompt as your starting point — adapt it to match whatever files you actually have:
Analyse every file in this folder and produce three outputs: (1) an inventory spreadsheet listing each file's name, type, size, and a one-sentence summary of its contents; (2) a folder structure recommendation that groups these files logically by project or topic, with proposed subfolder names; (3) a brief report (saved as analysis-report.md) explaining what patterns you found across the files — common themes, gaps, or inconsistencies.
This prompt is deliberately broad. It requires Cowork to read every file (triggering batch processing), produce structured output (spreadsheet), make judgement calls (grouping logic), and write multiple output files. That complexity is what makes it auditable.
Checkpoint: You have a multi-step prompt ready that will require Cowork to plan, read many files, and produce multiple outputs.
Step 3: Submit the Task and Observe the Execution Plan
Open Claude Desktop, switch to Cowork, and select your Cowork-Audit-Lab folder as the working directory. Paste your task prompt and submit it.
Don't click Allow immediately. This is the single most important moment of the entire audit — and the habit that separates someone who uses Cowork from someone who understands it.
Read the execution plan Cowork presents. For your analysis document, capture the following:
- Task decomposition: How did Cowork break your single prompt into discrete steps? List each step in the order proposed.
- Tool selection: What tools does the plan mention? File reading, file writing, analysis operations?
- Sequencing logic: Are any steps marked as dependent on previous steps? Which ones could theoretically run in parallel?
If Cowork doesn't show a detailed execution plan and instead starts working immediately, your task may have been too simple. Cancel and add more complexity to your prompt — request additional outputs, more analysis, or cross-file comparisons.
Pay particular attention to the plan's structure. A well-decomposed plan should look something like:
- Scan all files in the directory and catalogue their types
- Read each file and extract a content summary
- Determine logical groupings based on content themes
- Create subfolder structure
- Move files into appropriate subfolders
- Generate the inventory spreadsheet
- Write the analysis report
If the plan is vague ("I'll process the files and create the outputs"), that tells you something important: the task may not have been specific enough to trigger detailed planning. Note this — it's a finding about how prompt specificity affects plan quality.
Checkpoint: You have captured the execution plan in your notes before clicking Allow.
Step 4: Monitor Execution and Log Observations
Click Allow and watch the execution carefully. Don't switch to another window — you're auditing, not multitasking.
As Cowork works, note:
- Sub-agent activity: Did Cowork spawn parallel workers? You'll see indicators in the interface when sub-agents are active. How many appeared? What was each one doing?
- File access patterns: Which files were read first? Did Cowork process them sequentially or in batches?
- Tool calls: Count how many distinct operations Cowork performed. Each file read, each file write, each analysis step is a separate tool call.
- Time elapsed: Record the total execution time from Allow to completion.
- Permission prompts: Did Cowork ask for any additional permissions during execution? (It should not need to if your folder scoping was correct, but note it if it happens.)
Create a simple observation log as you watch:
| Timestamp | Observation | Category |
|---|---|---|
| 0:00 | Task submitted, plan displayed | Planning |
| 0:15 | Plan approved, execution started | Execution |
| 0:22 | First file read (invoice.pdf) | File access |
| 0:30 | Sub-agent spawned for parallel processing | Parallelism |
| ... | ... | ... |
This timestamped log becomes the backbone of your audit document. Without it, you are working from memory — which is unreliable and unpersuasive.
If the task completes in under two minutes with no visible sub-agents, your file set may be too small to trigger parallelism. Cowork only spawns sub-agents when the task is large enough to justify the overhead. This is itself a useful finding — document the threshold.
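If you prefer to capture the log programmatically rather than by hand, a small helper along these lines works; it is only a sketch, and the CSV filename is a placeholder you can change:

```python
import csv
import time
from datetime import datetime

LOG_PATH = "cowork-audit-log.csv"  # placeholder filename, use whatever suits you
start = time.monotonic()

def log(observation: str, category: str) -> None:
    """Append a timestamped observation (wall clock plus elapsed m:ss) to the log."""
    elapsed = time.monotonic() - start
    with open(LOG_PATH, "a", newline="") as f:
        csv.writer(f).writerow([
            datetime.now().isoformat(timespec="seconds"),
            f"{int(elapsed // 60)}:{int(elapsed % 60):02d}",
            observation,
            category,
        ])

log("Task submitted, plan displayed", "Planning")
log("Plan approved, execution started", "Execution")
```

Call log() each time you notice something; the resulting CSV maps directly onto the table above.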
Checkpoint: You have a running log of execution observations — sub-agent count, file processing order, tool calls observed, and total time.
Step 5: Verify Outputs Against the Plan
Once Cowork finishes, open your Cowork-Audit-Lab folder and examine what was created. Check:
- Did every planned output actually get produced?
- Open the inventory spreadsheet. Does it list every file you started with? Are the summaries accurate?
- Read the folder structure recommendation. Does it make logical sense given your files?
- Open analysis-report.md. Does it reference specific files and identify genuine patterns?
Note any discrepancies between what the plan promised and what was actually delivered. These gaps are the most interesting part of your audit — they reveal where Cowork's execution engine differs from its planning engine.
Common discrepancies to look for:
- Missing outputs: The plan mentioned creating a file that doesn't exist
- Extra outputs: Cowork created files that weren't in the plan (this can indicate scope creep in the execution engine)
- Partial completion: A file was created but is incomplete — headers are present but content is missing
- Quality variation: Some file summaries in the inventory are detailed and accurate; others are vague or wrong. This often indicates sub-agent quality variation, where parallel workers produced inconsistent outputs.
- Structural choices: Did the grouping logic match what you would have done? If Cowork grouped your files differently than you expected, that is worth documenting — it reveals how the planning engine interprets ambiguous instructions.
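One way to catch missing or extra outputs mechanically is to compare the file names you recorded in Step 1 against the folder's contents now. A minimal sketch, with hypothetical file names standing in for your own inventory:

```python
from pathlib import Path

folder = Path.home() / "Desktop" / "Cowork-Audit-Lab"

# File names recorded in Step 1, before granting Cowork access.
# These are hypothetical examples; replace them with your own list.
before = {
    "invoice.pdf",
    "q3-metrics.csv",
    "meeting-notes.txt",
}

after = {p.relative_to(folder).as_posix() for p in folder.rglob("*") if p.is_file()}

print("New paths created by Cowork:")
for name in sorted(after - before):
    print(f"  + {name}")

print("Original files no longer at their old paths (moved into subfolders or deleted):")
for name in sorted(before - after):
    print(f"  - {name}")
```

Anything in the first list that was not in the execution plan is a candidate for the "extra outputs" row; anything promised by the plan but absent is a "missing output".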
Create a comparison table:
| Planned Output | Actually Produced? | Quality (1-5) | Notes |
|---|---|---|---|
| Inventory spreadsheet | Yes/No/Partial | | |
| Folder structure | Yes/No/Partial | | |
| Analysis report | Yes/No/Partial | | |
Checkpoint: You have compared planned outputs with actual outputs and noted any discrepancies.
Step 6: Test the Sandbox Boundary
Now test what Cowork cannot do. In the same session, ask:
List all files on my Desktop that are not in this working folder.
Cowork should either refuse or explain that it can't access files outside the granted directory. This confirms the sandbox boundary is enforced at the VM level, not just as a courtesy.
Try one more boundary test:
Create a file called test.txt in my Downloads folder.
Again, Cowork should be unable to write outside the scoped directory. Document exactly what it says when you make these requests.
If Cowork somehow accesses files outside your working directory, you may have inadvertently granted broader access than intended. Check your session's folder scope in the Cowork sidebar.
For a thorough boundary assessment, also test these edge cases:
- Symlink traversal: If you have symbolic links inside your working folder that point to directories outside it, can Cowork follow them? This is a subtle but important security question (see the setup sketch after this list).
- Parent directory reference: Ask Cowork to read ../some-file.txt and note whether it can traverse upward from the working directory.
- Absolute path access: Ask it to read a specific file using its full path (e.g., /Users/yourname/Documents/sensitive-file.pdf). The sandbox should block this regardless of how the path is specified.
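To stage the symlink test from the first bullet, create a link inside the workspace that points outside it before you start the session. A minimal sketch (the Documents folder is just an example target; any directory outside the workspace will do):

```python
import os
from pathlib import Path

workspace = Path.home() / "Desktop" / "Cowork-Audit-Lab"
outside_target = Path.home() / "Documents"  # any folder outside the scoped workspace

link = workspace / "outside-link"
if not link.is_symlink():
    os.symlink(outside_target, link, target_is_directory=True)

print(f"{link} -> {outside_target}")
print("Now ask Cowork to list the contents of outside-link and record what happens.")
```

Whatever the result, document it; whether Cowork follows the link or refuses tells you how the boundary is actually enforced.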
Document the results of each test in a table:
| Boundary Test | Request | Result | Security Implication |
|---|---|---|---|
| Files outside working folder | "List Desktop files" | Blocked/Allowed | |
| Write outside scope | "Create file in Downloads" | Blocked/Allowed | |
| Parent directory traversal | "Read ../file.txt" | Blocked/Allowed | |
| Absolute path access | "Read /Users/.../file" | Blocked/Allowed | |
Checkpoint: You have documented at least four sandbox boundary tests and their results.
Step 7: Analyse Token Consumption
Look at the session's token usage. In Claude Desktop, you can see usage indicators that reflect how much of your rate limit was consumed. Note:
- How much of your rate limit did this single task consume?
- Compare this mentally with a simple Chat conversation. A task that processes 15-20 files with sub-agents will use dramatically more tokens than asking a question in Chat.
- If you are on a Max plan, note whether the task triggered any rate limit warnings.
This matters because it directly impacts how you plan workloads. A team of five people all running complex Cowork tasks simultaneously will hit rate limits far faster than the same team using Chat.
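To make that planning concern concrete, here is a back-of-envelope sketch. Every number in it is an illustrative assumption, not a measured Cowork figure; substitute what you actually observed in this session:

```python
# All numbers are hypothetical placeholders for illustration only.
chat_tokens_per_question = 2_000   # assumed size of a short Chat exchange
cowork_multiplier = 50             # assumed inflation from file reads and sub-agents
tokens_per_cowork_task = chat_tokens_per_question * cowork_multiplier

team_size = 5
tasks_per_person_per_day = 3
daily_team_tokens = team_size * tasks_per_person_per_day * tokens_per_cowork_task

print(f"Estimated tokens per Cowork task: {tokens_per_cowork_task:,}")
print(f"Estimated daily team consumption: {daily_team_tokens:,}")
```

Swap in your observed per-task consumption and your plan's actual limits to turn this from an illustration into a real capacity estimate.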
Checkpoint: You have noted token consumption observations and compared them with typical Chat usage.
Step 8: Write Your Audit Document
Now compile everything into a structured analysis. Create a new document (in your text editor, not in Cowork — you're the analyst here, not the AI) with these sections:
Recommended Audit Document Structure
1. Task Overview — What you asked Cowork to do, and why you chose this specific task.
2. Execution Plan Analysis — How Cowork decomposed the task. List each step, note dependencies, and identify which steps could have been parallelised.
3. Sub-Agent Observations — How many sub-agents were spawned, what each handled, and whether parallelism improved performance versus a hypothetical sequential execution.
4. Sandbox Boundary Assessment — Results of your boundary tests. Confirm that the VM-level isolation worked as documented.
5. Output Quality Review — Did the outputs match the plan? Where did execution diverge from planning? Rate the quality of each output on a simple scale.
6. Token Consumption Analysis — How resource-intensive was this task? What are the implications for team-wide deployment?
7. Recommendations — Based on this audit, what types of tasks are well-suited to Cowork? What would you avoid delegating?
Writing Guidance
For each section, aim for specificity over generality. "Cowork spawned sub-agents" is a description. "Cowork spawned three sub-agents to process 18 files in parallel, reducing what would have been sequential processing from an estimated 8 minutes to 3 minutes" is an analysis. Your document should read like the latter.
The recommendations section is where your audit earns its keep. Base your recommendations on what you actually observed, not on what the marketing material promises. If sub-agents did not improve performance for your file set, say so and explain why. If the sandbox boundary held perfectly, document that as evidence for the security team. If the execution plan missed a step, recommend how to write better prompts that prevent that in future.
Frame your recommendations around three audiences: yourself (what changes in how you use Cowork), your team (what guidelines should everyone follow), and leadership (what this audit tells them about the technology's readiness for broader adoption).
Checkpoint: You have a complete audit document with all seven sections filled in.
Expected Output
Your deliverable is a written audit document of 2-4 pages that demonstrates you can:
- Read and interpret a Cowork execution plan before granting permission
- Identify sub-agent parallelism and explain when it helps
- Verify sandbox boundaries through practical testing
- Assess token consumption implications for team deployment
- Distinguish between planned and actual execution outcomes
This is the kind of analysis a team lead or security reviewer needs before approving Cowork for department-wide use.
What a Strong Audit Looks Like
The difference between a mediocre audit and a persuasive one is specificity. Compare:
Mediocre: "Cowork processed the files and created the outputs. Sub-agents were used. The sandbox worked."
Strong: "Cowork decomposed the task into 7 steps, spawning 3 sub-agents to read 18 files in parallel (steps 2-4), then serialised steps 5-7 for cross-referencing and output creation. Total execution time was 4 minutes 12 seconds. Of 3 planned outputs, all 3 were produced. The inventory spreadsheet contained 18/18 files with accurate summaries for 16 (two PDFs were summarised as 'financial document' without specific content — likely due to complex table formatting). Sandbox boundary tests confirmed that all four access attempts outside the working directory were blocked at the VM level with appropriate error messages."
The second version gives the reader everything they need to make a decision. Aim for that level of detail throughout your document.
Extension Challenges
- Run the same task in Chat mode — upload the same files to a standard Claude Chat conversation, give the same instructions, and compare the experience. Document what Chat could and could not do that Cowork handled natively (direct file creation, batch processing, folder-level access).
- Repeat with a sequential-only task — design a task where every step depends on the previous one (e.g., "Read file A, use its findings to analyse file B, then compare both to file C"). Observe whether Cowork still attempts parallelism or correctly serialises the steps.
- Audit a scheduled task — if you have a recurring Cowork task set up, audit one of its automated runs. Compare the execution pattern of an unattended scheduled task with the interactive session you just analysed.