This site is for educational purposes only. Nothing here constitutes financial advice.

Lesson 9 — Reading audits: Code4rena, Spearbit, Trail of Bits

Audit reports are the most-cited and least-read documents in DeFi. Today: how to read them (severity ratings, attack preconditions, mitigation types) and which findings should make you walk away.

Advanced
Evergreen
22 min read · Updated 2026-06-16 · Block Clarity Hub Editorial Team

Every serious protocol publishes audit reports. Most users, and a surprising number of investors, never read them. Yet these reports are the primary documents through which the security community communicates what is wrong with a contract, what was fixed, and what was accepted as residual risk. This lesson covers how to read them critically: severity scales, status codes, what an unaddressed high-severity finding means, and the limits of even the best audit.

**The three dominant report formats.** (1) **Code4rena** (and similar contest-style platforms like Sherlock, Cantina): public competitions where many independent researchers submit findings, judged and deduplicated by sponsor / lead reviewers; reports are public on the platform with full submission history. (2) **Spearbit / Cantina private engagements** (and similar): paid engagements with a small team of senior reviewers; reports are typically published shortly after fixes ship. (3) **Trail of Bits / OpenZeppelin / ConsenSys Diligence / Halborn**: traditional consulting engagements with a formal report structure (executive summary, finding-by-finding detail, appendices). All three follow a similar internal structure; the differences lie mostly in the business model.

**Severity ratings.** Most reports use a four-tier scale: **Critical** (loss of all funds or complete protocol compromise), **High** (loss of significant funds, governance compromise, or denial of service of a major function), **Medium** (loss of small funds under specific conditions, or significant correctness issue), **Low** (best-practice violation, minor correctness issue, or hardening recommendation). Some platforms add **Informational** / **Gas** (style and gas-optimisation notes). The severity assigned reflects both impact and likelihood — a 'high' finding usually means significant impact with practical exploitability under realistic conditions.
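The idea that severity combines impact and likelihood can be sketched as a lookup matrix. The matrix below is a hypothetical illustration, not the scale of any particular firm or platform; real auditors publish their own definitions, and the `Level` enum and `severity` function are names invented for this sketch.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical matrix keyed by (impact, likelihood).
# Assumption: only high impact combined with high likelihood is Critical.
SEVERITY_MATRIX = {
    (Level.HIGH, Level.HIGH): "Critical",
    (Level.HIGH, Level.MEDIUM): "High",
    (Level.HIGH, Level.LOW): "Medium",
    (Level.MEDIUM, Level.HIGH): "High",
    (Level.MEDIUM, Level.MEDIUM): "Medium",
    (Level.MEDIUM, Level.LOW): "Low",
    (Level.LOW, Level.HIGH): "Medium",
    (Level.LOW, Level.MEDIUM): "Low",
    (Level.LOW, Level.LOW): "Low",
}


def severity(impact: Level, likelihood: Level) -> str:
    """Look up the severity label for an impact/likelihood pair."""
    return SEVERITY_MATRIX[(impact, likelihood)]


print(severity(Level.HIGH, Level.MEDIUM))  # High
```

When a report's rating looks surprising, checking which of the two axes the auditor scored low usually explains it: a catastrophic bug that needs an unrealistic precondition can legitimately land at Medium.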

**Status codes.** After the protocol's response, each finding is marked as one of: **Fixed** (code changed to address it; the report should include a 'fix commit hash'), **Acknowledged** (recognised but not fixed; the protocol accepts the residual risk, usually with a justification in the report), **Won't fix** (the team declines to change the code, typically because it considers the issue out of scope or not worth the change), **Disputed** (the protocol disagrees that the finding is valid). Critical and High findings marked anything other than 'Fixed' deserve special scrutiny; a protocol that 'acknowledges' a Critical finding is operating with a known severe vulnerability, which is rarely justifiable.
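The scrutiny rule above can be sketched as a small filter over a findings table you have transcribed by hand. The `Finding` record, its field names, and the sample data are all hypothetical; no audit platform exposes this exact structure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Finding:
    title: str
    severity: str            # "Critical", "High", "Medium", "Low"
    status: str              # "Fixed", "Acknowledged", "Won't fix", "Disputed"
    fix_commit: Optional[str] = None


SCRUTINY_SEVERITIES = {"Critical", "High"}


def needs_scrutiny(findings: List[Finding]) -> List[Tuple[Finding, str]]:
    """Return (finding, reason) pairs for Critical/High items worth a closer look."""
    flagged = []
    for f in findings:
        if f.severity not in SCRUTINY_SEVERITIES:
            continue
        if f.status != "Fixed":
            flagged.append((f, f"{f.severity} finding is {f.status}, not Fixed"))
        elif not f.fix_commit:
            flagged.append((f, "marked Fixed but no fix commit hash recorded"))
    return flagged


# Hypothetical findings table, for illustration only.
report = [
    Finding("Unchecked upgrade path", "Critical", "Fixed", "a1b2c3d"),
    Finding("Stale oracle price accepted", "High", "Acknowledged"),
    Finding("Rounding favours caller", "Medium", "Acknowledged"),
]

for finding, reason in needs_scrutiny(report):
    print(f"{finding.title}: {reason}")
```

The filter deliberately ignores Medium and Low items; that threshold is a reading convenience, not a claim that lower-severity findings are safe to skip.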

**Scope and out-of-scope material.** A skilled report distinguishes 'this is the protocol's code' from 'these are dependencies and assumed-correct components.' An audit of Compound's governance contracts does not cover Chainlink; an audit of Lido does not cover the underlying Ethereum consensus layer. Out-of-scope sections mark the trust boundary: if a protocol relies on a third-party oracle, custodian, or off-chain server, the audit has not validated that dependency.

**What audits don't cover.** Audits are time-bounded reviews of a specific code commit. They don't cover: (1) **Post-audit code changes**: any commit after the audited hash is unaudited unless re-reviewed. (2) **Off-chain dependencies**: relayers, keepers, oracles, signers. (3) **Economic / game-theoretic attacks** that aren't pure code bugs (e.g., the October 2022 Mango Markets exploit, which the attacker described as 'legal open market actions', used trades that the protocol's economic design allowed). (4) **Operational failures**: multisig signers being phished, hot-wallet keys leaking, governance being captured. A clean audit doesn't mean a safe protocol; it means a narrow class of bugs was searched for at a specific point in time.

**Reading a report fast — what to scan.** (1) The number and severity of findings — a report with 0 Critical / 0 High is unusual; some Highs being found is normal even on good code. (2) The status of high-severity findings — are they all Fixed? Any Acknowledged or Disputed? (3) The fix commit hash — does it match what's deployed on-chain? Many protocols publish an audit with fixes but the deployed contract isn't at the fixed commit (sometimes intentionally to delay disclosure, sometimes by mistake). (4) The auditor's scope statement — what wasn't reviewed? (5) Re-audits — was the fixed code re-reviewed, or did the auditor just trust the developer's claim of fix?
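Item (3) comes down to comparing two commit hashes, and reports often print an abbreviated hash while a repository shows the full one. A minimal sketch of that comparison follows, assuming you have already established which commit the deployed contract was built from (for example from a release tag or the verified source); the function names and the 7-character minimum are choices made for this illustration.

```python
HEX_DIGITS = set("0123456789abcdef")
MIN_PREFIX_LEN = 7  # assumption: shorter prefixes are too ambiguous to trust


def normalise_hash(raw: str) -> str:
    """Lower-case and validate a full or abbreviated commit hash."""
    cleaned = raw.strip().lower()
    if len(cleaned) < MIN_PREFIX_LEN or not set(cleaned) <= HEX_DIGITS:
        raise ValueError(f"not a usable commit hash: {raw!r}")
    return cleaned


def same_commit(reported: str, deployed: str) -> bool:
    """True if one hash equals, or is an abbreviation of, the other."""
    a = normalise_hash(reported)
    b = normalise_hash(deployed)
    return a.startswith(b) or b.startswith(a)


# Hypothetical hashes, for illustration only.
print(same_commit("a1b2c3d", "A1B2C3D4E5F60718293a4b5c6d7e8f9012345678"))  # True
print(same_commit("a1b2c3d", "ffffffffffff"))  # False
```

A match only shows that the two hashes agree; it does not prove the on-chain bytecode was compiled from that commit, which still needs source verification on a block explorer.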

**Multi-auditor signal.** The strongest evidence of a thorough review is multiple independent audits by reputable firms, all of which conclude the code is well-secured. The weakest is a single audit by an unknown firm. Multi-audit also reveals coverage gaps — different teams find different bugs; high overlap between findings is reassuring, low overlap suggests they're searching different surfaces.
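One rough way to put a number on overlap between two reports is the Jaccard index: shared findings divided by all distinct findings. The sketch below assumes you have already mapped each report's findings onto common labels by hand, which is the hard part in practice; the labels and firm names are hypothetical.

```python
from typing import Set


def jaccard_overlap(a: Set[str], b: Set[str]) -> float:
    """Shared findings divided by all distinct findings (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical, manually normalised finding labels from two reports.
firm_a = {"reentrancy-withdraw", "oracle-stale", "rounding-fees"}
firm_b = {"oracle-stale", "rounding-fees", "access-upgrade"}

print(jaccard_overlap(firm_a, firm_b))  # 0.5
print(sorted(firm_a - firm_b))          # found only by firm A
print(sorted(firm_b - firm_a))          # found only by firm B
```

The set differences are often more informative than the ratio itself, because they show which surfaces only one team looked at.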

Example

An audit-report scan of a hypothetical protocol.

Step 1: Open the report from a top-tier firm and scan the findings table. Count: 1 Critical, 3 High, 8 Medium, 12 Low. The Critical and 2 of the 3 High findings are marked Fixed; 1 High is marked Acknowledged ('the team accepts the small-window risk because the fix would require a longer redeploy'). Status: orange flag. Acknowledged High findings require justification, and 'a longer redeploy' is rarely a satisfying answer.

Step 2: Locate the fix commit hash for the Fixed findings. Compare it to the protocol's on-chain deployment via Etherscan and the protocol's GitHub. If the deployed code is at a *different* commit than the fixed version, the on-chain contract may not include the fix.

Step 3: Read the scope statement. Was the upgrade-authorisation code in scope? Were the oracle integrations reviewed? Was the off-chain keeper system reviewed? If all three are in scope, the audit covered the main dependency surfaces; if any are explicitly out of scope, that is a known unaudited surface.

Step 4: Check for re-audits. Did the team commission a separate firm to verify the fixes? Multi-auditor coverage of the Critical and High findings is the strongest signal.

The whole scan takes 20-30 minutes for a competent reader and produces a calibrated assessment that 'the protocol has been audited' alone cannot.

Common mistakes

  • Treating 'has been audited' as a guarantee. Audits are time-bounded reviews of a specific commit; recent changes can be unaudited.
  • Skipping the severity table. The mix and status of Critical / High findings is the headline data.
  • Not verifying the deployed commit matches the audited + fixed commit. Easy to deploy a slightly different version.
  • Ignoring out-of-scope sections. The trust boundary is where unaudited dependencies live.
  • Trusting a single audit from an unknown firm over multi-audit coverage from reputable ones.

Check your understanding

A protocol's most recent audit found 1 Critical, 3 High, 6 Medium, and 10 Low findings. The Critical is marked 'Fixed.' Two of the High findings are 'Fixed,' one is 'Acknowledged' (team plans to address in a future upgrade). What is the defensible reading?
