Domain 2 · Task Statement 2.2

Document Analysis & Synthesis

TL;DR

Analyse PDFs, spreadsheets, and presentations directly from local folders, use sub-agents for cross-document synthesis, extract structured data from unstructured sources, and specify output formats that produce actionable deliverables.

What You Need to Know

Most knowledge workers have a version of this ritual: open a PDF, copy the important bits into a note, open the next PDF, repeat. The spreadsheet means scanning columns. The slide deck means clicking through forty slides looking for the one chart that matters. By the time you've pulled insights from six or seven documents, an hour has passed and the synthesis — the part that actually requires judgement — hasn't even started.

Cowork skips the extraction grind. Point it at a folder, describe the analysis you need, and it reads every document in the directory: PDFs, XLSX spreadsheets, PPTX presentations, CSVs, plain text files. No uploading. No 30MB file-size limits. No 20-file caps. The finance team's revenue spreadsheet, the marketing team's campaign deck, and the operations team's quarterly PDF can all sit in one folder and get analysed in a single session.

How cross-document synthesis actually works

Cross-document synthesis is the most heavily tested capability in this task statement. Get this wrong on the exam and you'll drop marks fast.

When you prompt something like "Analyse every document in this folder and compare the findings across all reports," Cowork doesn't grind through them sequentially. It spins up parallel sub-agents, each assigned to read and summarise a different document at the same time. Once every sub-agent finishes, the main agent synthesises their individual outputs into a single coherent analysis.

[!]

Exam Trap: Sub-Agent Context Isolation

A common distractor claims that sub-agents share context during parallel analysis, meaning each agent can see what the others have found in real time. This is false. Sub-agents work independently — they can't read each other's outputs. The main agent performs the synthesis step after all sub-agents have finished. If you see an option suggesting parallel agents "collaborate" or "cross-reference during processing," it's wrong.

In Chat, you'd upload documents one at a time, mentally stitch together insights across separate conversations, and hope nothing important slips through. With Cowork, the cross-referencing is structural — built into the execution pipeline rather than dependent on your memory.
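
If it helps to picture the pipeline, the sketch below shows the fan-out/fan-in pattern in plain Python. It is a conceptual illustration, not Cowork's implementation: the `summarise` and `synthesise` functions are hypothetical stand-ins and the folder path is made up. The structural point is what matters — each worker sees only its own document, and nothing is combined until every worker has returned.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def summarise(doc_path: Path) -> str:
    """Hypothetical per-document sub-agent: sees only its own file."""
    text = doc_path.read_text(errors="ignore")
    return f"{doc_path.name}: {text[:200]}"  # placeholder for a real summary

def synthesise(summaries: list[str]) -> str:
    """Main-agent step: runs only after every sub-agent has finished."""
    return "\n\n".join(summaries)  # placeholder for real cross-referencing

docs = sorted(Path("reports").glob("*.txt"))  # assumed working folder

# Fan out: one isolated worker per document, no shared context between them.
with ThreadPoolExecutor() as pool:
    summaries = list(pool.map(summarise, docs))

# Fan in: synthesis happens once, after all workers complete.
print(synthesise(summaries))
```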

The "for each one" pattern

The phrase "for each one" is your trigger for parallel batch operations. When your prompt says "Analyse every PDF in this folder and extract the executive summary from each one," Cowork recognises this as parallelisable work and assigns sub-agents to handle files simultaneously. What takes an hour of manual uploading and copying in Chat finishes in minutes.

This pattern scales. Twelve monthly performance reports, four quarterly budget spreadsheets, two strategy presentations — eighteen documents total — processed in a single pass. The key: keep all source files in the same working folder so Cowork has access to the full set during one session.

Structured data extraction from unstructured sources

Pulling structured data out of documents that were never designed to be structured is where Cowork saves the most time for knowledge workers. A scanned invoice contains dates, amounts, vendor names, and line items, but they're embedded in free-form text and inconsistent layouts. A narrative project status report buries deadlines, risk ratings, and budget figures across paragraphs.

Cowork reads the unstructured source, interprets the content, and writes the extracted data directly to a spreadsheet on your file system. The pipeline: read source, extract fields, write structured output. No copy-paste. No manual formatting. Forty invoices that need a master spreadsheet with date, vendor, amount, and category columns? That's a single Cowork prompt, not a weekend of data entry.
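
As a rough sketch of what that pipeline amounts to, here is the hand-rolled equivalent for a folder of plain-text invoices, assuming `pandas` (with `openpyxl` as the Excel backend). The folder name, regex patterns, and column names are illustrative assumptions, not anything Cowork exposes.

```python
import re
from pathlib import Path

import pandas as pd  # pandas + openpyxl handle the Excel write

def first_match(pattern: str, text: str) -> str | None:
    """Return the first captured group in the invoice text, or None."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

rows = []
for invoice in sorted(Path("invoices").glob("*.txt")):  # hypothetical invoice folder
    text = invoice.read_text(errors="ignore")
    rows.append({
        "file": invoice.name,
        "date": first_match(r"Date:\s*(\d{4}-\d{2}-\d{2})", text),
        "vendor": first_match(r"Vendor:\s*(.+)", text),
        "amount": first_match(r"Total:\s*\$?([\d,.]+)", text),
    })

# Write the structured output: one row per invoice, ready to filter and sum.
pd.DataFrame(rows).to_excel("invoice-master.xlsx", index=False)
```

Cowork's advantage is precisely that it copes with the inconsistent layouts a fixed regex cannot, but the shape of the deliverable is the same: one row per source document, named columns, written straight to a spreadsheet.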

Combining qualitative and quantitative analysis

A single Cowork task can handle both numbers and narrative. "Analyse the sales data in revenue.xlsx and the customer feedback in surveys.pdf, then write a combined report" is a perfectly valid prompt. Cowork reads both file types, runs quantitative analysis on the spreadsheet (trends, comparisons, totals) and qualitative analysis on the PDF (themes, sentiment, recurring complaints), then produces a unified deliverable that integrates both.

[~]

Cross-Modal Analysis in Practice

The real value is in the integration, not the individual analyses. Anyone can summarise a spreadsheet or a survey separately. The expert move is asking Cowork to correlate: "Where do the revenue dips in the spreadsheet align with negative feedback themes in the surveys? Highlight any quarters where both metrics declined and explain what the qualitative data suggests about why."

This cross-modal analysis is where exam questions get difficult. Scenarios will present a situation with mixed document types and ask you to pick the prompt that produces genuine cross-referencing, not just individual summaries pasted together.
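
As a sketch of what that correlation looks like once both sources are reduced to structured form, assume a quarterly revenue table and a per-quarter count of negative feedback mentions; the figures, column names, and the simple "revenue fell while complaints rose" rule below are all illustrative.

```python
import pandas as pd

# Hypothetical quarterly figures already extracted from the two sources.
quarters = pd.DataFrame({
    "quarter": ["Q1", "Q2", "Q3", "Q4"],
    "revenue": [1.20, 1.05, 1.10, 0.95],      # in millions, from revenue.xlsx
    "negative_mentions": [14, 31, 18, 42],    # theme counts, from surveys.pdf
})

quarters["revenue_change"] = quarters["revenue"].diff()
quarters["negativity_change"] = quarters["negative_mentions"].diff()

# Flag quarters where revenue fell AND negative feedback rose: the
# cross-modal signal the prompt above asks Cowork to surface.
flagged = quarters[(quarters["revenue_change"] < 0) & (quarters["negativity_change"] > 0)]
print(flagged[["quarter", "revenue_change", "negativity_change"]])
```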

Output format specification is not optional

The mistake that costs the most time: asking for an analysis without specifying how you want it delivered. "Analyse these reports" gets you a conversational text summary in the chat window. Useful for a quick glance. Useless for sharing with your team, building on in future sessions, or importing into another tool.

Always define the deliverable format. "Create an Excel file with one tab per source document and a Summary tab cross-referencing the key findings" produces something you can actually use. "Summarise these files" does not. Include column names, section headings, chart types, and file naming conventions. The more precise your output specification, the more actionable the result.
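
To make that deliverable concrete, here is a minimal hand-built version of the workbook described above, using `pandas`; the tab names, columns, and sample findings are hypothetical, standing in for whatever you spell out in the prompt.

```python
import pandas as pd

# Hypothetical per-document findings, keyed by source file name.
findings = {
    "revenue.xlsx": pd.DataFrame({"finding": ["Q3 dip", "Enterprise growth"]}),
    "surveys.pdf": pd.DataFrame({"finding": ["Support delays", "Pricing complaints"]}),
}
summary = pd.DataFrame({
    "theme": ["Q3 decline"],
    "sources": ["revenue.xlsx; surveys.pdf"],
    "note": ["Revenue dip coincides with a spike in support complaints"],
})

# One tab per source document plus a Summary tab, exactly as the prompt specifies.
with pd.ExcelWriter("Analysis.xlsx") as writer:
    for name, df in findings.items():
        df.to_excel(writer, sheet_name=name[:31], index=False)  # Excel caps tab names at 31 chars
    summary.to_excel(writer, sheet_name="Summary", index=False)
```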

[!]

Format Specification Changes Everything

Without a format specification, Claude defaults to text in the chat window. With one, it writes a file directly to your file system — an Excel workbook, a Word document, a formatted PDF. The difference isn't cosmetic. It determines whether your analysis output is a transient chat message or a persistent, shareable, editable deliverable.

The exam tests this hard. When two answer options differ only in whether the prompt specifies an output format, the one with the format specification is almost always correct. Precision beats brevity in every scenario.

What the exam expects you to know

Domain 2 carries 18% of the exam weight — the second-highest tier. For task statement 2.2, expect scenario-analysis questions that present a set of mixed documents and ask you to choose the best analysis approach. Four traps appear in different disguises:

  1. "Cowork can only analyse text-based documents" — False. It reads PDFs, XLSX, PPTX, CSV, and many other formats directly from the file system.
  2. "Cross-document synthesis requires uploading to a Project knowledge base" — False. Cowork reads directly from local folders. No upload step.
  3. "Sub-agents share context during parallel analysis" — False. They work independently. The main agent synthesises after all sub-agents complete.
  4. "Analysis results are always delivered as text in the chat window" — False. Cowork writes files directly to your file system when you specify an output format.

Know these cold. At least two will appear on your exam, and they're designed to catch people who've used Chat but haven't internalised how Cowork's architecture works differently.


Common Mistakes

Common Mistake

Analysing files one at a time in separate Cowork sessions, losing the ability to cross-reference and synthesise across the full document set.

Instead: Put all related documents in a single working folder and process them in one session. Cross-document synthesis only works when Cowork can see every file at once. Splitting documents across sessions is like asking five different consultants to each read one chapter and then expecting a coherent book review.

Common Mistake

Asking for analysis without specifying an output format, then getting a conversational text summary in the chat window that cannot be shared, edited, or built upon.

Instead: Always define the deliverable: 'Create an Excel file called Analysis.xlsx with one tab per source document and a Summary tab' or 'Write a Word document called Findings.docx with sections for each report and an executive summary.' Format specification is what turns a chat message into a usable business deliverable.

Common Mistake

Uploading documents to Chat for analysis when the same files are sitting on your local file system and Cowork can read them directly.

Instead: Use Cowork's direct file access. Chat uploads hit context window limits, a 30MB file size cap, and a 20-file-per-conversation ceiling. Cowork reads from disk with none of these constraints and writes output files straight to your folder.

Requesting document analysis

Before

Look at the files in this folder and tell me what they say.

After

Analyse every document in this folder. For each one, extract: (1) key findings, (2) risks identified, and (3) recommended actions. Create a comparison table in Excel showing these three dimensions across all documents, with a final row summarising the overall themes.

Synthesising quarterly reports

Before

Summarise the quarterly reports.

After

Read all four quarterly report PDFs in this folder. Create a Word document called 'Annual-Trends.docx' with: a section per quarter highlighting revenue, headcount, and customer satisfaction; a trends section showing how each metric changed Q-over-Q; and an appendix listing any data points that appear inconsistent across quarters.


Hands-On Activity


Cross-Document Synthesis in Action

15 min

Process a mixed set of documents in a single Cowork session to see parallel sub-agent analysis, cross-document synthesis, and structured output delivery in action. You'll produce a multi-tab Excel file that cross-references findings across different file types.

What you will learn

  • Use Cowork's direct file access to analyse multiple document types in one session
  • Trigger parallel sub-agent processing with the 'for each one' pattern
  • Produce structured Excel output with per-document tabs and a cross-reference summary
  • Tell the difference between genuine cross-referencing and individual summaries pasted together
  1. Create a folder called 'Synthesis-Lab'. Place 4-6 different document types inside — at minimum, one PDF, one spreadsheet (CSV or XLSX), and one text file. Use real work documents if possible, or create sample files with different content themes.

    Why: Multi-format analysis is one of Cowork's strongest capabilities. Using diverse file types shows how it reads across formats in a single operation.

    Expected: A folder containing a realistic mix of document types, each with distinct content.

  2. In Cowork, select the Synthesis-Lab folder and prompt: 'Read every document in this folder. For each one, identify the three most important findings. Then create a file called Cross-Reference.xlsx with a tab for each source document and a Summary tab that highlights where the documents agree, disagree, or complement each other.'

    Why: This tests the full pipeline: multi-file reading, per-document extraction, cross-document comparison, and structured output delivery. The 'for each one' phrasing triggers parallel sub-agents.

    Expected: Claude generates an execution plan showing parallel document analysis followed by a synthesis step. The plan should mention creating the Excel file with multiple tabs.

  3. Review the plan, then click Allow. While Claude works, watch for sub-agent progress indicators showing parallel processing of different documents.

    Why: Watching the parallel execution shows how Cowork handles batch analysis differently from sequential Chat processing. Each sub-agent works on a different document independently.

    Expected: Multiple sub-agents processing different documents simultaneously, followed by a synthesis phase that combines their outputs into the Excel file.

  4. Open Cross-Reference.xlsx and verify that each source document has its own tab with extracted findings, and that the Summary tab genuinely cross-references across documents — not individual summaries stacked on top of each other.

    Why: This quality check ensures the synthesis is real. A good Summary tab references specific findings from the individual document tabs, notes where sources agree or contradict, and surfaces patterns that only emerge when you read across the full set.

    Expected: A multi-tab Excel file where the Summary tab references specific findings from the individual document tabs, noting agreements, contradictions, and overarching themes.


Practice Question


A department head has a folder containing 12 monthly performance reports (PDFs), 4 quarterly budget spreadsheets (XLSX), and 2 strategy presentations (PPTX). They need a single executive summary synthesising all 18 documents. What is the best approach?
