DePIN Architecture And Design
1. DePIN Fundamentals and System Boundaries
1.1 What DePIN Is and What It Is Not: Using a Concrete Example
DePIN (decentralized physical infrastructure network) is a design pattern for building networks that coordinate real-world work (such as sensing, measuring, or delivering services) using a shared protocol and verifiable records. The "physical" part means the work happens outside the computer. The "network" part means multiple independent participants can contribute and be rewarded or penalized based on what the protocol can verify.
A useful way to understand DePIN is to start with a concrete example: a network that measures road conditions using roadside devices.
Concrete example: Road-condition measurement
Imagine thousands of roadside nodes that report measurements such as surface temperature and vibration levels. A client (say, a city dashboard or an app) wants a reliable estimate for a specific road segment.
A DePIN-style workflow looks like this:
- A client requests a measurement for a location and time window.
- Nodes submit signed measurements that claim they observed the physical conditions.
- The protocol verifies what it can (for example, freshness, signature validity, and consistency checks).
- Rewards are assigned based on verification outcomes and quality scoring.
- A record is stored so the client can audit what happened.
The key point is not that everything is "on-chain." The key point is that the protocol defines how contributions are requested, validated, and accounted for, and that validation is tied to verifiable evidence.
What DePIN is
DePIN is best described as a system with three properties working together:
- Independent operators can participate: Nodes are run by different parties, not a single organization. The protocol provides a common way to join, submit work, and be evaluated.
- Work is tied to physical evidence: The protocol expects measurements or service outputs to correspond to real-world activity. Evidence might be sensor readings, signed attestations, or proofs derived from physical observations.
- The protocol defines verifiable accounting: The network records requests, submissions, verification results, and outcomes. Even when full verification is impossible, the protocol can still enforce rules like "freshness," "no replay," and "minimum quality thresholds."
In the road example, "verifiable accounting" might mean the protocol stores:
- which nodes were eligible for the request,
- which submissions were accepted or rejected,
- the quality score used for rewards,
- and the final result returned to the client.
What DePIN is not
DePIN is often confused with several nearby ideas. Here's what it is not.
- Not a centralized data pipeline: If a single operator collects all sensor data and decides what is correct, you have a centralized service. There may still be a database and an API, but the "network" part is missing because participation and evaluation are not shared.
- Not "blockchain for storage": Putting hashes or logs on a ledger does not automatically make the system DePIN. The protocol must connect those records to real-world work and define how contributions are verified and rewarded.
- Not a pure marketplace with no verification rules: If anyone can claim they measured something and get paid without checks, the system becomes a billing channel, not a measurement network. DePIN requires that the protocol's rules constrain what counts as a valid contribution.
- Not necessarily "fully on-chain": Some parts can be off-chain for performance and cost. What matters is that the protocol's verification and accounting are consistent and auditable.
A quick comparison using the same example
Consider three ways to build the road-condition measurement system.
| Approach | Who decides correctness? | How is payment determined? | Does it rely on verifiable evidence? |
|---|---|---|---|
| Centralized dashboard | City operator | Operator pays nodes internally | Mostly no shared verification rules |
| Ledger-only logging | No shared correctness rules | Often manual or arbitrary | Hashes without protocol constraints |
| DePIN-style network | Protocol + verifiers | Rewards based on verification outcomes | Yes: freshness, signatures, quality checks |
The table is intentionally blunt: DePIN is about protocol-defined constraints that link physical claims to verifiable outcomes.
Mind map: DePIN in one page
The "protocol glue" intuition
A DePIN system is not just a collection of sensors or a list of transactions. The glue is the protocol's definition of:
- eligibility (who can submit for a request),
- timing (what "fresh" means),
- evidence format (what a submission must contain),
- verification rules (what gets accepted, rejected, or challenged), and
- accounting (how outcomes map to rewards).
In the road example, if you remove any one of those (say, you allow late submissions without freshness rules), the system becomes easier to game. That's why "what it is" and "what it is not" are both about constraints, not about buzzwords.
A small, practical checklist
When you evaluate whether something is DePIN, ask these questions:
- Can multiple independent operators contribute without asking permission each time?
- Does the protocol specify what evidence a submission must include?
- Are there concrete verification rules tied to that evidence?
- Is payment or reputation connected to verification outcomes?
- Is the resulting record auditable (even if verification is partly off-chain)?
If most answers are "no," you're likely looking at a centralized service, a logging system, or an unverified marketplace, not a DePIN architecture.
1.2 Core Actors and Responsibilities
A DePIN network is easiest to reason about when you name the roles and state what each role is allowed to do. The trick is to separate "who produces evidence" from "who checks it" and "who decides the rules." When those boundaries are clear, you can design incentives, failure handling, and audits without guessing.
The four core actors
1) Node Operator
Responsibility: Operate physical infrastructure and produce measurement or service results.
A node operator typically runs software that:
- Registers the node identity (keys, metadata, capabilities).
- Receives tasks or opportunities to contribute.
- Collects data (sensor readings, service logs, connectivity proofs).
- Packages results into a proof format.
- Submits proofs and keeps enough local evidence to respond to challenges.
What the operator should not do:
- Decide whether a proof is valid.
- Modify protocol rules.
- Rewrite the meaning of a measurement after the fact.
Concrete example: A roadside air-quality node measures PM2.5 every minute. The operator's software timestamps readings, signs them, and submits a proof that includes the measurement plus the freshness data needed to prevent replay.
2) Client
Responsibility: Request work, define what "good" means for a specific use case, and pay for successful outcomes.
A client typically:
- Creates a request with constraints (location, time window, required quality threshold).
- Selects or discovers eligible nodes.
- Submits the request to the network (on-chain or via a client API).
- Receives results and initiates settlement.
What the client should not do:
- Trust a single operator blindly.
- Assume that "a submission exists" implies "it's correct."
Concrete example: A logistics company wants temperature readings for a shipment route. It requests readings for specific checkpoints and time windows, and it sets a minimum acceptable accuracy. If the network returns proofs that don't meet the threshold, the client expects the protocol to handle it via dispute or retry.
3) Verifier
Responsibility: Check proofs against the protocol's rules and the request's requirements.
A verifier can be:
- A smart contract (on-chain verification),
- An off-chain service run by the network (off-chain verification),
- Or a committee / threshold of verifiers.
A verifier's job is to answer questions like:
- Does the proof match the request parameters?
- Is the measurement fresh and correctly signed?
- Does the proof satisfy quality bounds?
- Are there signs of tampering or inconsistent evidence?
What the verifier should not do:
- Collect payments directly without following the settlement rules.
- Accept malformed proofs "because they look plausible."
Concrete example: For the air-quality node, the verifier checks that the timestamp is within the allowed window, the signature matches the registered key, and the measurement is within physically reasonable bounds defined by the protocol. If the protocol uses Merkle commitments, the verifier also checks inclusion proofs.
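The three checks in this example (signature, time window, physical bounds) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not a protocol definition: an HMAC from Python's standard library stands in for a real public-key signature scheme, and the field names, the 120-second window, and the PM2.5 bounds are hypothetical values chosen for illustration.

```python
import hashlib
import hmac
import json

# Assumptions for illustration only: a shared-secret HMAC stands in for a
# public-key signature, and the window and bounds below are made-up values.
MAX_AGE_S = 120
PM25_BOUNDS = (0.0, 1000.0)  # micrograms per cubic metre


def sign(key: bytes, payload: dict) -> str:
    """Sign a canonical (sorted-key) JSON encoding of the payload."""
    message = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_reading(key: bytes, payload: dict, signature: str, now: float) -> str:
    """Return 'accepted' or a rejection reason, checking one rule at a time."""
    if not hmac.compare_digest(sign(key, payload), signature):
        return "rejected: bad signature"
    age = now - payload["timestamp"]
    if age < 0 or age > MAX_AGE_S:
        return "rejected: stale or future timestamp"
    low, high = PM25_BOUNDS
    if not low <= payload["pm25"] <= high:
        return "rejected: value out of bounds"
    return "accepted"


node_key = b"registered-node-key"
reading = {"node_id": "node-1", "timestamp": 1000.0, "pm25": 12.5}
sig = sign(node_key, reading)
print(verify_reading(node_key, reading, sig, now=1030.0))
```

Returning a specific rejection reason, rather than a bare boolean, keeps the verifier able to explain its decisions during disputes.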
4) Governance
Responsibility: Define and update protocol parameters and policies that govern eligibility, verification rules, and settlement.
Governance is not âanother actor that runs nodes.â It is the mechanism that changes how the system behaves.
Governance typically manages:
- Parameter sets (quality thresholds, allowed proof formats, challenge windows).
- Membership rules (who can register, how keys are rotated).
- Upgrade policies (what can change without breaking compatibility).
- Dispute policy (what evidence is admissible and how it's evaluated).
What governance should not do:
- Decide outcomes for individual requests outside the defined rules.
- Override verification logic in an ad-hoc way.
Concrete example: If sensor drift is discovered for a specific hardware model, governance updates the quality model and eligibility rules so that affected nodes must provide additional calibration evidence to keep earning rewards.
How responsibilities map to system boundaries
A useful mental model is to treat each role as owning a different kind of âtruth.â
- The operator owns raw evidence (what the device measured, what logs exist).
- The verifier owns rule-based truth (whether evidence satisfies protocol constraints).
- The client owns use-case truth (what the request requires and what it pays for).
- Governance owns policy truth (the rules themselves).
When these truths are mixed, you get design problems like "the verifier can't explain why it rejected," or "the operator can game the client's expectations."
Mind maps
Mind map: roles and outputs
Mind map: responsibilities by lifecycle stage
A concrete end-to-end example (with clear handoffs)
- Registration: An operator registers a node with a public key and declares capabilities like "supports calibration proof v2." Governance defines that only nodes with that capability can serve certain request types.
- Request: A client submits a request for readings at checkpoint A between 10:00 and 10:10, requiring a minimum quality score of 0.85. The client also sets a dispute window of 30 minutes.
- Proof submission: The operator collects readings, computes a quality score, signs the measurement bundle, and submits a proof that includes freshness data and the calibration evidence.
- Verification: The verifier checks:
  - The signature matches the registered key.
  - The timestamps are within the allowed window.
  - The calibration evidence corresponds to the claimed device model.
  - The quality score meets the threshold.
- Settlement: If accepted, the protocol releases payment according to the reward schedule. If rejected, the client can either retry with another node or start a dispute using the evidence the operator is required to retain.
- Governance involvement (only at the rule level): If the verifier repeatedly rejects proofs from a specific hardware batch due to a known calibration issue, governance updates the policy so that future requests require an additional calibration step.
Practical design rules for keeping roles honest
- Make inputs explicit: Verifiers should validate against request parameters, not against what they "think the client meant."
- Require evidence retention: Operators should keep enough data to answer challenges; otherwise disputes become theater.
- Version everything that affects meaning: If proof formats or quality models change, governance should publish versioned rules so verifiers can interpret submissions correctly.
- Separate decision from payment: Even if a verifier is off-chain, the settlement logic should depend on verifiable outputs (signatures, commitments, or on-chain decisions).
Summary
Node operators produce signed evidence from physical infrastructure. Clients define requirements and initiate settlement. Verifiers apply protocol rules to decide whether evidence is valid for a specific request. Governance sets and updates the rules that determine eligibility, verification, and reward behavior. Keeping these responsibilities distinct makes the system easier to implement, test, and debug, especially when things go wrong.
1.3 Network Scope and Trust Boundaries: Defining What Must Be Verified
A DePIN network is only as reliable as the boundary it draws around "what we trust" and "what we verify." Scope is the list of claims the network will accept; trust boundaries are the lines that separate who is responsible for each claim. If you blur those lines, you end up paying for measurements you never actually checked, or rejecting valid work because you checked the wrong thing.
Start with claims, not components
Instead of beginning with devices, nodes, or contracts, begin with the claims the system must accept. A claim is a statement that affects rewards, eligibility, or state.
Common claim types in DePIN:
- Identity claim: "This operator is authorized to submit work for this network."
- Task claim: "This submission corresponds to a specific task instance and time window."
- Measurement claim: "The reported measurement is correct within defined bounds."
- Quality claim: "The measurement meets quality requirements (e.g., coverage, freshness, completeness)."
- Availability claim: "The data/proof is available for verification and dispute."
For each claim, decide:
- Who can make it (client, operator, verifier, or contract).
- What evidence supports it (signed payload, proof artifact, on-chain record).
- What verification step confirms it (signature check, proof verification, cross-check, or threshold rule).
A practical example: suppose the network pays for "temperature readings from sensors." The system must accept a measurement claim, but it should not automatically accept that the sensor is the one it claims to be. Identity and measurement are separate claims, so they get separate checks.
Define the scope of verification
Verification scope answers: âWhat exactly does the network check before paying?â A clean approach is to list verification gates in order.
Example verification gates for a single task instance:
- Authorization gate: confirm operator membership and task eligibility.
- Freshness gate: confirm the submission is for the correct time window and not a replay.
- Binding gate: confirm the submission is bound to the task ID (and any challenge parameters).
- Proof gate: verify the measurement/proof artifact against the required format.
- Quality gate: apply quality thresholds (e.g., minimum confidence, minimum coverage).
- Settlement gate: record the verified result and compute rewards deterministically.
If you skip a gate, you must compensate elsewhere. For instance, if you don't do a freshness gate, you need a different mechanism to prevent replay, such as nonces embedded in the task binding.
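The ordered-gate idea can be sketched as a list of small check functions that run in sequence and stop at the first failure. This is an illustrative sketch only: the `Submission` fields, the in-memory membership and nonce tables, and the 0.85 threshold are assumptions, and the proof gate is omitted because it depends on the proof system in use.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Submission:
    operator_id: str
    task_id: str
    nonce: str
    quality: float


# Hypothetical network state, for illustration only.
MEMBERS = {"op-1", "op-2"}
TASK_NONCES = {"task-42": "nonce-abc"}  # task ID -> nonce issued with the task
MIN_QUALITY = 0.85
used_nonces = set()


def authorization_gate(s: Submission) -> Optional[str]:
    return None if s.operator_id in MEMBERS else "operator is not a member"


def freshness_gate(s: Submission) -> Optional[str]:
    return "nonce already used (replay)" if s.nonce in used_nonces else None


def binding_gate(s: Submission) -> Optional[str]:
    expected = TASK_NONCES.get(s.task_id)
    return None if expected == s.nonce else "submission not bound to task"


def quality_gate(s: Submission) -> Optional[str]:
    return None if s.quality >= MIN_QUALITY else "quality below threshold"


GATES: list[tuple[str, Callable[[Submission], Optional[str]]]] = [
    ("authorization", authorization_gate),
    ("freshness", freshness_gate),
    ("binding", binding_gate),
    ("quality", quality_gate),
]


def run_gates(s: Submission) -> tuple[bool, str]:
    """Run gates in order and stop at the first failure."""
    for name, gate in GATES:
        reason = gate(s)
        if reason is not None:
            return False, f"{name}: {reason}"
    used_nonces.add(s.nonce)  # settlement side effect: the nonce is now spent
    return True, "accepted"


first = run_gates(Submission("op-1", "task-42", "nonce-abc", 0.9))
replay = run_gates(Submission("op-1", "task-42", "nonce-abc", 0.9))
print(first, replay)
```

Because each gate is named, a rejection records which gate failed, which makes it easier to see what a skipped gate would have caught.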
Separate trust levels by evidence strength
Not all evidence is equal. A signature proves authorship; a proof proves correctness under a defined model; a cross-check proves consistency with other observations.
A useful way to map evidence to trust:
- Cryptographic evidence (strong for authenticity): signatures, certificate chains, key rotation logs.
- Algorithmic evidence (strong for correctness under assumptions): proof verification, deterministic checks.
- Statistical evidence (useful but weaker): confidence scores, anomaly detection outputs.
- Human or off-chain assertions (weak unless corroborated): manual reports, unverified logs.
Example: if an operator submits a "coverage score" computed from raw sensor data, the network should verify the raw data integrity (hash commitments, signatures) and the computation method (proof or deterministic recomputation). If the network only verifies the final score, you've trusted the operator's computation more than you think.
Specify trust boundaries as "who verifies what"
Trust boundaries are easiest to express as a matrix: rows are claims, columns are the actors and components involved. Each cell states what that party contributes to, or checks about, the claim.
| Claim | Operator | Client | Verifier Node | Smart Contract |
|---|---|---|---|---|
| Authorization | Signs identity | n/a | Checks membership | Enforces eligibility |
| Task binding | Includes task ID | Creates task request | Checks task ID match | Enforces task ID match |
| Freshness | Includes nonce/window | Generates challenge | Checks freshness rules | Enforces replay protection |
| Measurement correctness | Provides proof | Optionally pre-checks | Verifies proof | Records verified result |
| Quality thresholds | Provides quality data | Optionally pre-checks | Applies thresholds | Enforces reward eligibility |
| Dispute evidence availability | Uploads artifacts | Tracks receipts | Confirms availability | Opens/records dispute outcome |
This table prevents a common failure mode: assuming that "the contract will catch it." Contracts are great at enforcing rules and recording outcomes, but they typically cannot verify rich off-chain measurements without a proof system or deterministic recomputation.
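One way to make the matrix actionable is to encode who actually checks each claim and flag the claims that rest on a single checker. The mapping below is one reading of the table, not a definitive one: it counts only cells that describe a check or an enforcement step, treats "records" as bookkeeping and not checking, and leaves out the optional client pre-checks. The names are hypothetical.

```python
# One interpretation of the matrix: which components check each claim.
# "Records verified result" and "Opens/records dispute outcome" are treated
# as bookkeeping, so the smart contract is not listed for those rows.
CHECKERS = {
    "authorization": {"verifier_node", "smart_contract"},
    "task_binding": {"verifier_node", "smart_contract"},
    "freshness": {"verifier_node", "smart_contract"},
    "measurement_correctness": {"verifier_node"},
    "quality_thresholds": {"verifier_node", "smart_contract"},
    "dispute_evidence_availability": {"verifier_node"},
}


def single_checker_claims(checkers: dict) -> list:
    """Return claims that only one component is responsible for checking."""
    return sorted(claim for claim, who in checkers.items() if len(who) == 1)


print(single_checker_claims(CHECKERS))
```

Under this reading, measurement correctness depends on the verifier node alone, which is exactly the case where assuming the contract will catch a bad measurement goes wrong.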
Make assumptions explicit and testable
Every verification step relies on assumptions. The goal is to write assumptions in a way that can be tested or constrained.
Examples of assumptions that should be explicit:
- Measurement model assumption: the proof system assumes a particular measurement pipeline (e.g., sensor sampling rate, calibration method).
- Network assumption: the verifier node can fetch required artifacts within a bounded time.
- Clock assumption: time windows use a defined source (block timestamp, task-issued nonce, or client-provided window with verification).
- Data availability assumption: the data needed for verification is either on-chain, retrievable from a content-addressed store, or provided during dispute.
If an assumption cannot be tested, reduce its impact by moving the claim to a weaker category (e.g., treat it as a quality hint rather than a payout determinant).
Mind map: scope and trust boundaries
Worked example: "Pay for coverage" without trusting everything
Imagine a network that pays for "coverage of a region" using operator-submitted observations.
Step 1: Identify claims.
- Operator is authorized.
- Submission is for task T and window W.
- Observations correspond to the claimed region.
- Coverage score meets threshold.
Step 2: Decide verification gates.
- Authorization gate: membership check.
- Freshness gate: submission must include nonce tied to T and W.
- Binding gate: region identifier and task ID are hashed into the signed payload.
- Proof gate: operator provides a proof that observations were generated by the claimed pipeline (or deterministic recomputation is possible).
- Quality gate: verifier computes coverage from the verified observations and applies threshold rules.
Step 3: Place trust boundaries.
- The operator may compute a coverage score, but the network does not trust that number.
- The verifier recomputes coverage from verified observations.
- The contract only records the verifier's outcome and enforces eligibility.
This design keeps the network from paying for a number that only exists in the operator's head. It also keeps the contract from needing to understand sensor semantics; it just enforces the rules and stores results.
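The boundary in Step 3 can be sketched as a verifier that ignores the operator's claimed score and recomputes coverage from observations that already passed the earlier gates. The grid-cell model of a region, the 0.75 threshold, and all field names here are assumptions made for illustration; a real network would define its own coverage metric.

```python
# Hypothetical model: a region is a set of grid cells, and coverage is the
# share of those cells with at least one verified observation.
REGION_CELLS = {"c1", "c2", "c3", "c4"}
THRESHOLD = 0.75


def recompute_coverage(verified_cells, region_cells) -> float:
    """Count each region cell once; ignore cells outside the region."""
    covered = set(verified_cells) & region_cells
    return len(covered) / len(region_cells)


def verifier_outcome(submission: dict) -> dict:
    # The operator's own claimed_score is deliberately never read.
    score = recompute_coverage(submission["verified_cells"], REGION_CELLS)
    return {
        "task_id": submission["task_id"],
        "coverage": score,
        "eligible": score >= THRESHOLD,
    }


submission = {
    "task_id": "T",
    "claimed_score": 1.0,                        # operator's unverified claim
    "verified_cells": ["c1", "c2", "c2", "c9"],  # c9 lies outside the region
}
outcome = verifier_outcome(submission)
print(outcome)
```

The contract would then store only `outcome`, which is small and contains no sensor semantics.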
Common boundary mistakes to avoid
- Mixing identity with measurement: treating "authorized operator" as proof that the measurement is correct.
- Trusting derived values without verifying inputs: accepting a final score without checking the data/proof that produced it.
- Assuming time is obvious: forgetting that replay protection depends on a specific time binding mechanism.
- Letting one verifier do everything: if only one component checks a claim, you may create a single point of logical failure.
A good scope and boundary definition makes these mistakes harder to make. You can look at the claim list, the verification gates, and the evidence strength mapping and immediately see which parts of the system are actually responsible for correctness.
1.4 Performance and Reliability Targets: Turning Requirements Into Metrics
Performance and reliability targets are where "we need it to work" becomes "here is what working means." In a DePIN network, the tricky part is that performance is not just speed; it is speed under realistic load, with predictable failure behavior, and with measurable impact on rewards and user experience.
Start with requirements that already imply metrics
Good requirements mention at least one of: time, frequency, correctness, availability, or cost. If a requirement says "verification should be fast," you still need to decide what "fast" means in the system's lifecycle.
A practical approach is to map each requirement to a measurable outcome:
- Latency: time from request to verified result.
- Throughput: number of verified tasks per unit time.
- Success rate: fraction of tasks that reach a final verified state.
- Quality: how often proofs meet acceptance criteria.
- Availability: fraction of time the network can accept and process tasks.
- Cost: resource usage per task (compute, bandwidth, on-chain operations).
Define the lifecycle points you will measure
Most DePIN systems have a request lifecycle. Pick explicit "checkpoints" so you can measure end-to-end and isolate bottlenecks.
Example lifecycle checkpoints:
- Submission accepted: client request is received and validated for format.
- Task assigned: a node is selected and begins work.
- Proof produced: node submits measurement/proof artifacts.
- Proof verified: verifier checks validity and quality.
- Settlement finalized: rewards/fees are recorded and final.
If you only measure "end-to-end," you will eventually end up guessing where time went. If you measure checkpoints, you can fix the right thing.
Choose metrics that match the failure modes
Reliability is not just "uptime." In DePIN, failures often look like:
- Timeouts (work takes too long).
- Invalid proofs (fails verification rules).
- Inconsistent data (fails quality bounds).
- Duplicate submissions (retries cause double counting).
- Network partitions (nodes can't reach verifiers or clients).
For each failure mode, define a metric that quantifies it.
Concrete metric set for a proof-of-measurement flow:
- p50 / p95 / p99 proof latency from "task assigned" to "proof verified."
- Verification success rate: verified / submitted.
- Invalid proof rate: rejected due to format/signature vs rejected due to measurement quality.
- Timeout rate: tasks that exceed a defined deadline.
- Duplicate rate: number of duplicate submissions per 1,000 tasks (should be near zero if idempotency is correct).
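A minimal sketch of how these metrics could be computed from task records follows. The record layout, the outcome labels, and the sample numbers are invented for illustration, and the percentile uses the nearest-rank method, which is one of several common definitions. The sketch also assumes that a timed-out task never submitted a proof.

```python
import math
from collections import Counter

# Hypothetical task records: (outcome, seconds from "task assigned" to the
# verifier's decision). Timed-out tasks have no decision time.
TASKS = [
    ("verified", 10), ("verified", 12), ("verified", 15), ("verified", 20),
    ("verified", 22), ("verified", 25), ("verified", 30), ("verified", 40),
    ("invalid_quality", 18), ("timeout", None),
]


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(tasks):
    outcomes = Counter(outcome for outcome, _ in tasks)
    latencies = [t for outcome, t in tasks if outcome == "verified"]
    submitted = len(tasks) - outcomes["timeout"]  # proofs that actually arrived
    return {
        "p50_s": percentile(latencies, 50),
        "p95_s": percentile(latencies, 95),
        "verification_success_rate": outcomes["verified"] / submitted,
        "invalid_quality_rate": outcomes["invalid_quality"] / submitted,
        "timeout_rate": outcomes["timeout"] / len(tasks),
    }


summary = summarize(TASKS)
print(summary)
```

Keeping rejection reasons as separate outcome labels is what allows the invalid proof rate to be broken down by cause.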
Turn targets into numbers with clear units
Targets should be written as thresholds with units and time windows.
Example targets (illustrative, but concrete):
- p95 proof latency ≤ 45 s over a rolling 7-day window.
- Verification success rate ≥ 98% over a rolling 30-day window.
- Timeout rate ≤ 1% over a rolling 30-day window.
- Invalid proof rate ≤ 0.5% (excluding cases where the client cancels).
- Network availability ≥ 99.5% for accepting new tasks.
The "rolling window" detail matters because it prevents a single bad day from dominating decisions.
Use SLOs and error budgets to connect reliability to operations
A Service Level Objective (SLO) is the target; the error budget is what you can "spend" while staying within the target.
If an SLO is 99.5% availability over a month, the error budget is:
\[ \text{Error budget} = 1 - 0.995 = 0.005 \]
Over 30 days, that is \(0.005 \times 30 \times 24\ \text{hours} = 3.6\ \text{hours}\) of allowed downtime (or equivalent failure time).
This budget becomes actionable when you define what counts as "failure." For example, you might count a task as failed if it cannot reach "proof verified" within the deadline.
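The same arithmetic can be written as a small helper, which is convenient when the SLO or the window changes. The function names are hypothetical, and the 1.5 hours of failure time in the usage line is an invented figure.

```python
def error_budget_hours(slo: float, window_days: int) -> float:
    """Allowed failure time, in hours, for an availability SLO over a window."""
    return (1.0 - slo) * window_days * 24


def remaining_budget_hours(slo: float, window_days: int, failed_hours: float) -> float:
    """Budget left after subtracting observed failure time (may go negative)."""
    return error_budget_hours(slo, window_days) - failed_hours


budget = error_budget_hours(0.995, 30)
print(round(budget, 2))                                   # 3.6
print(round(remaining_budget_hours(0.995, 30, 1.5), 2))   # 2.1
```

A negative remaining budget is a useful signal that the window's target has already been missed.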
Mind map: Performance and reliability metrics
Example: deriving metrics from a "fast verification" requirement
Requirement: "Clients should receive verification results quickly enough to keep their workflow moving."
Step 1: identify the user-visible moment. Suppose the client needs the result before it can display a receipt.
Step 2: choose the checkpoint that corresponds to "result." If the receipt is issued after on-chain settlement, end-to-end includes settlement time. If the receipt is issued after off-chain verification, it excludes settlement.
Step 3: define latency percentiles. If most tasks finish quickly but a few take long due to retries, p95 is usually more informative than average.
Example metric derivation:
- If receipt is issued after proof verified: target p95 ≤ 45 s.
- If receipt is issued after settlement finalized: target p95 ≤ 2 min.
Both can be true, but they measure different parts of the system.
Example: reliability targets that prevent reward accounting surprises
Requirement: "Rewards must be settled correctly even when clients retry."
This requirement is not about speed; it is about correctness under retries.
Metrics to support it:
- Idempotent acceptance rate: fraction of retries that map to an existing task/proof record.
- Duplicate settlement rate: number of settlements that would double-pay per 1,000 tasks (should be zero).
- Reconciliation mismatch rate: difference between off-chain accounting and on-chain events.
To make these measurable, you need identifiers:
- A task ID derived from client request content (or a client-provided idempotency key).
- A proof submission ID that verifiers can treat as unique.
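A sketch of both identifiers in use: the task ID is derived from a canonical encoding of the request, so a retry with the same content maps to the same record, and settlement is keyed by that ID so it cannot pay twice. SHA-256 over sorted-key JSON, the 16-character truncation, and the `SettlementLedger` class are illustrative assumptions, not a prescribed scheme.

```python
import hashlib
import json


def task_id_from_request(request: dict) -> str:
    """Derive a stable ID from request content (key order does not matter)."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    # Truncated only to keep the example readable.
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]


class SettlementLedger:
    """In-memory stand-in for settlement state keyed by task ID."""

    def __init__(self):
        self._settled = {}
        self.retries_deduplicated = 0

    def settle(self, task_id: str, reward: int) -> int:
        if task_id in self._settled:
            self.retries_deduplicated += 1  # feeds the idempotent acceptance metric
            return self._settled[task_id]
        self._settled[task_id] = reward
        return reward

    def total_paid(self) -> int:
        return sum(self._settled.values())


request = {"location": "checkpoint-A", "window": ["10:00", "10:10"], "min_quality": 0.85}
retry = {"min_quality": 0.85, "window": ["10:00", "10:10"], "location": "checkpoint-A"}

ledger = SettlementLedger()
first_id = task_id_from_request(request)
retry_id = task_id_from_request(retry)
ledger.settle(first_id, 10)
ledger.settle(retry_id, 10)
print(first_id == retry_id, ledger.total_paid(), ledger.retries_deduplicated)
```

A client-provided idempotency key would replace `task_id_from_request` when two requests with identical content must be treated as distinct tasks.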
Example: throughput targets that don't hide queueing pain
Throughput alone can be misleading because systems can "accept" requests while queueing them for a long time.
So pair throughput with queueing metrics:
- Tasks accepted per minute.
- Queue wait time from "task assigned" to "proof produced."
- Queue depth at the verifier.
If throughput rises but p95 latency also rises, you likely increased load without scaling the verification pipeline.
Practical metric design rules
- Measure what you can act on. If you can't change it operationally, it won't help.
- Separate time from correctness. A system can be fast and wrong; track both.
- Use percentiles for latency. Averages hide tail behavior.
- Break down invalid outcomes by reason. "Rejected" is not a root cause.
- Define deadlines explicitly. Deadlines turn "slow" into a countable failure.
A compact template for writing targets
Use a consistent format so teams can compare changes.
- Metric: p95 proof latency (task assigned → proof verified)
- Target: ≤ 45 s
- Window: rolling 7 days
- Alert threshold: > 60 s for 30 minutes
- Owner: verifier pipeline
- Primary cause hypotheses: slow node responses, verifier CPU saturation, storage retrieval delays
When targets are written this way, performance work becomes engineering rather than interpretation.
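The template also translates directly into a small data structure, which makes targets easy to review and evaluate in code. The class, its field names, and the three status labels below are assumptions for illustration; the numbers are the ones from the template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    metric: str
    target_s: float
    window: str
    alert_threshold_s: float
    alert_duration_min: int
    owner: str

    def status(self, observed_s: float, breach_minutes: int = 0) -> str:
        """Classify an observation against the target and the alert rule."""
        sustained = breach_minutes >= self.alert_duration_min
        if observed_s > self.alert_threshold_s and sustained:
            return "alert"
        if observed_s > self.target_s:
            return "out of target"
        return "ok"


proof_latency = Target(
    metric="p95 proof latency (task assigned -> proof verified)",
    target_s=45,
    window="rolling 7 days",
    alert_threshold_s=60,
    alert_duration_min=30,
    owner="verifier pipeline",
)

print(proof_latency.status(40))
print(proof_latency.status(50))
print(proof_latency.status(65, breach_minutes=30))
```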
1.5 Data Types in DePIN: On-Chain, Off-Chain, and Hybrid Workflows
A DePIN system is mostly a data pipeline with rules about who can submit which data, how it's checked, and when it becomes final. The key design decision is not "on-chain vs off-chain" in general; it's which data type needs which properties: public verifiability, immutability, low latency, privacy, or high throughput.
Mind map: where each data type belongs
On-chain data: what consensus needs to remember
On-chain data should be limited to items that benefit from shared agreement and long-term auditability.
- Identity and membership records
  - What it is: A node's public key, role, and eligibility status.
  - Why on-chain: Everyone must agree who is allowed to submit measurements or receive rewards.
  - Example: A registry contract stores `{nodeId, operatorPubKey, eligibilityVersion}`. If a node is removed, the change is visible and final.
- Commitments and anchors
  - What it is: Hashes (or Merkle roots) that bind off-chain evidence to an on-chain event.
  - Why on-chain: You get tamper-evidence without storing large payloads.
  - Example: When a verifier accepts a measurement bundle, the contract records `commitment = hash(proofBundleHash || taskId || epoch)`. Later, anyone can check that an off-chain bundle matches the commitment.
- Accounting and settlement state
  - What it is: Reward accrual, escrow balances, and dispute outcomes.
  - Why on-chain: Settlement must be consistent across participants.
  - Example: After a successful verification, the contract moves funds from escrow to an operator's claimable balance and emits an event with the exact reward amount.
- Parameters and policy versions
  - What it is: Scoring weights, quality thresholds, and challenge windows.
  - Why on-chain: Verification rules must be unambiguous for past epochs.
  - Example: Store `policyVersion` per epoch. A proof submitted under version 3 is evaluated using version 3 rules, even if version 4 exists later.
Off-chain data: what needs speed, volume, or specialized formats
Off-chain storage is where you put large or frequently changing data that doesn't need consensus-level permanence.
- Raw measurements and task results
  - What it is: Sensor readings, network measurements, or service outputs.
  - Why off-chain: Payloads can be large, and you often want flexible schemas.
  - Example: A node submits a JSON payload with `{timestamp, location, metrics: {latencyMs, packetLoss}}` plus a signature.
- Proof artifacts and evidence bundles
  - What it is: The actual proof data used to convince verifiers.
  - Why off-chain: Proof blobs may be too large for on-chain storage, and verification can be staged.
  - Example: A verifier might check a signature and freshness off-chain, then produce a compact statement (hash + metadata) that the contract can accept.
- Indexing and caching layers
  - What it is: Read models for fast queries.
  - Why off-chain: Query patterns evolve, and you don't want to pay on-chain costs for every view.
  - Example: An indexer builds a table keyed by `(taskId, epoch)` that returns "latest accepted measurement" and "current dispute status."
- Operational metadata
  - What it is: Health checks, logs, and diagnostic info.
  - Why off-chain: It's useful for operators and monitoring, but it's not the settlement truth.
  - Example: Heartbeat logs can be stored off-chain and summarized on-chain only if they affect eligibility.
Hybrid workflows: the practical middle ground
Hybrid designs use on-chain data as a verifiable receipt system, while off-chain data carries the heavy lifting.
Pattern A: Evidence bundle with on-chain commitment
Goal: Prove that a specific off-chain bundle was accepted under specific rules.
- Off-chain: Operator/verifier prepares `proofBundle` and computes `bundleHash`.
- On-chain: Contract stores `bundleHash` (or a Merkle root) tied to `taskId` and `epoch`.
- Later: Anyone can fetch the bundle and verify it matches the stored commitment.
Concrete example:
- Task: "Measure bandwidth for route R during epoch E."
- Off-chain bundle includes raw samples, aggregation method, and signatures.
- Contract stores `commitment = hash(bundleHash || routeId || epoch || policyVersion)`.
- A dispute later can reference the commitment without requiring the contract to store the entire bundle.
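The commitment in this example can be sketched as follows. The text only specifies `hash(bundleHash || routeId || epoch || policyVersion)`; the choice of SHA-256, the length-prefixed field encoding, and the function names are assumptions added here. Length-prefixing is a common way to keep concatenated fields from being confused with one another.

```python
import hashlib


def _field(data: bytes) -> bytes:
    """Length-prefix a field so field boundaries are unambiguous."""
    return len(data).to_bytes(4, "big") + data


def make_commitment(bundle: bytes, route_id: str, epoch: int, policy_version: int) -> str:
    """Compute hash(bundleHash || routeId || epoch || policyVersion)."""
    bundle_hash = hashlib.sha256(bundle).digest()
    preimage = b"".join([
        _field(bundle_hash),
        _field(route_id.encode()),
        _field(epoch.to_bytes(8, "big")),
        _field(policy_version.to_bytes(4, "big")),
    ])
    return hashlib.sha256(preimage).hexdigest()


def matches(bundle: bytes, route_id: str, epoch: int, policy_version: int, stored: str) -> bool:
    """Anyone holding the off-chain bundle can recheck it against the record."""
    return make_commitment(bundle, route_id, epoch, policy_version) == stored


bundle = b'{"samples": [91.2, 88.7], "method": "median"}'
stored = make_commitment(bundle, "route-R", epoch=7, policy_version=3)

print(matches(bundle, "route-R", 7, 3, stored))                # True
print(matches(bundle + b" ", "route-R", 7, 3, stored))         # False
```

Including `policy_version` in the preimage means the same bundle evaluated under different rules produces a different commitment.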
Pattern B: Challenge flow with off-chain evidence submission
Goal: Keep the chain lean while still enabling adversarial review.
- On-chain: A challenge is opened by referencing an existing commitment.
- Off-chain: Challenger submits evidence (or counter-evidence) to verifiers.
- On-chain: Outcome is recorded as a small state transition: `accepted/rejected/slashed`.
Concrete example:
- A node's measurement is accepted and its reward is held in escrow.
- During the challenge window, a challenger uploads a counter-analysis bundle.
- The contract only records the final decision and the resulting reward adjustment.
Data typing: treat each field as having a purpose
A common mistake is to treat "data" as one category. In practice, each field has a role in the workflow.
- Identifiers (e.g., taskId, epoch, nodeId): used for routing, indexing, and binding evidence.
- Claims (e.g., "bandwidth was X"): the content that must be verified.
- Proofs (e.g., signatures, Merkle proofs, zk proofs): the mechanism that supports claims.
- Policy references (e.g., policyVersion): the rules under which claims are evaluated.
- Receipts (e.g., commitment, receiptId): compact records that link on-chain state to off-chain artifacts.
Example mapping:
- taskId and epoch should be on-chain because they anchor settlement.
- rawSamples should be off-chain because they're large.
- bundleHash should be on-chain because it's the verifiable link.
- policyVersion should be on-chain because it makes past evaluations reproducible.
Practical checklist: choosing the location for each data type
Use this rule of thumb: put data on-chain only when you need shared finality or public verifiability for that exact item.
- If multiple parties must agree on it long-term → on-chain.
- If it's large, frequently updated, or only needed for verification tooling → off-chain.
- If it must be referenced later without storing everything → hybrid (commitment on-chain, payload off-chain).
Mini example: end-to-end data flow with three data types
- Raw measurement (off-chain):
  - Operator submits rawSamples and a signature to a verifier service.
- Proof receipt (hybrid):
  - Verifier checks freshness and validity, then computes bundleHash.
  - Contract stores bundleHash under (taskId, epoch, policyVersion).
- Settlement (on-chain):
  - After the challenge window closes, the contract finalizes reward accounting and emits a settlement event.
This separation keeps the chain focused on what must be consistent, while still allowing anyone to audit the off-chain evidence through commitments.
2. Reference Architecture for DePIN Networks
2.1 Layered Architecture From Device to Protocol to Application
A layered architecture keeps responsibilities from smearing across the system. In a DePIN network, the clean separation is especially helpful because different parts fail in different ways: devices go offline, verifiers disagree, and applications mis-handle user intent. This section describes a practical layering that you can implement without inventing a new religion.
The three layers and what each one owns
Device layer (measurement and connectivity)
- Produces raw observations (e.g., sensor readings, bandwidth samples, location proofs).
- Signs or otherwise authenticates what it claims to have measured.
- Handles local constraints: power, network variability, and hardware quirks.
Protocol layer (rules, verification, and settlement)
- Defines what counts as a valid contribution.
- Orchestrates verification workflows and dispute handling.
- Records eligibility, proofs, and reward outcomes.
Application layer (user workflows and business logic)
- Translates user goals into protocol requests.
- Presents receipts, statuses, and explanations.
- Manages user-specific policies like budgets, retry preferences, and UI-level constraints.
A useful mental model: the device answers "what did I observe?", the protocol answers "is it acceptable and what do we pay?", and the application answers "what does the user want and how do we show progress?".
Mind map: responsibilities by layer
Data flow example: from a sensor to a settled reward
Consider a simple "coverage measurement" use case where a client wants evidence that a node observed a region during a time window.
- Device prepares a measurement package
  - The device samples its sensors and produces a payload containing:
    - measurement_value (e.g., signal strength summary)
    - time_window (start/end timestamps)
    - nonce (to prevent replay)
    - device_id (or a reference to it)
  - It signs the payload with its device key.
  - It may also include a hash of any large raw data stored elsewhere.
- Device submits to the protocol's verification endpoints
  - The submission includes the signed payload and any proof artifacts.
  - The transport layer uses an idempotency key derived from (device_id, nonce, time_window) so retries don't create duplicates.
- Protocol verifies and scores
  - The protocol checks:
    - Signature validity for the device identity.
    - Freshness: the time_window is within allowed bounds.
    - Replay resistance: the nonce hasn't been used for that device.
    - Proof structure: required fields exist and hashes match.
  - It then computes a quality score using deterministic rules (for example, penalizing measurements that are too sparse or inconsistent).
- Protocol records eligibility and schedules settlement
  - The protocol stores minimal state needed to later settle rewards.
  - It emits events that the application can index:
    - ContributionAccepted
    - ContributionChallenged
    - ContributionFinalized
- Application presents status and receipt
  - The application tracks the contribution by an ID returned from the protocol.
  - It shows the user:
    - "Submitted" while verification is pending.
    - "Accepted" after protocol checks pass.
    - "Finalized" after the dispute window closes.
This flow works because each layer speaks its own language: device payloads are measurement-centric, protocol records are verification-centric, and application receipts are user-centric.
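The protocol-side checks in this flow (signature, freshness, replay resistance) can be sketched as an ordered gate that returns the first failure. This is a hedged illustration: the Package type, the reason strings, the max_age_s bound, and the injected verify_signature callable are all assumptions, and a real deployment would plug in an actual signature scheme and persistent nonce storage. Retry handling through the idempotency key is left out here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple


@dataclass(frozen=True)
class Package:
    device_id: str
    nonce: str
    window_start: int  # Unix seconds
    window_end: int    # Unix seconds
    payload: bytes
    signature: bytes


def check_package(
    pkg: Package,
    now: int,
    max_age_s: int,
    seen_nonces: Set[Tuple[str, str]],
    verify_signature: Callable[[Package], bool],
) -> Optional[str]:
    """Return None if the package passes, otherwise a short rejection reason."""
    if not verify_signature(pkg):
        return "bad_signature"
    if not (pkg.window_start < pkg.window_end <= now):
        return "invalid_window"
    if now - pkg.window_end > max_age_s:
        return "stale"
    if (pkg.device_id, pkg.nonce) in seen_nonces:
        return "replayed_nonce"
    # Record the nonce only after every other check has passed.
    seen_nonces.add((pkg.device_id, pkg.nonce))
    return None
```

Running the cheap, stateless checks first means a failed signature never touches the nonce store, so invalid traffic cannot burn a legitimate device's nonces.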
Interface contracts: what each layer must expose
To keep layers decoupled, define explicit interfaces. You don't need fancy formalism; you need stable inputs and outputs.
Device → Protocol interface
SignedMeasurementPackage
- device_id
- measurement_payload
- nonce
- time_window
- payload_hashes (optional but useful)
- signature
Protocol → Application interface
ContributionStatus
- contribution_id
- state in {pending, accepted, challenged, finalized, rejected}
- quality_score (if accepted)
- receipt_reference (pointer to proof artifacts)
Application → Protocol interface
RequestForMeasurement
- client_id
- target_spec (what "coverage" means)
- time_window
- budget and max_fee
- verification_preferences (e.g., allow fallback verifiers)
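One way to pin these three interfaces down is as plain typed records. The sketch below mirrors the field names listed above; the concrete types (bytes, integer timestamps, integer token units) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Tuple


class State(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    CHALLENGED = "challenged"
    FINALIZED = "finalized"
    REJECTED = "rejected"


@dataclass(frozen=True)
class SignedMeasurementPackage:  # Device -> Protocol
    device_id: str
    measurement_payload: bytes
    nonce: str
    time_window: Tuple[int, int]  # (start, end) timestamps
    signature: bytes
    payload_hashes: Dict[str, str] = field(default_factory=dict)  # optional


@dataclass(frozen=True)
class ContributionStatus:  # Protocol -> Application
    contribution_id: str
    state: State
    receipt_reference: str  # pointer to proof artifacts
    quality_score: Optional[float] = None  # present once accepted


@dataclass(frozen=True)
class RequestForMeasurement:  # Application -> Protocol
    client_id: str
    target_spec: str
    time_window: Tuple[int, int]
    budget: int   # smallest token units (assumed)
    max_fee: int  # smallest token units (assumed)
    verification_preferences: Dict[str, bool] = field(default_factory=dict)
```

Frozen records make it harder for one layer to mutate another layer's message in place, which keeps the boundaries honest during testing.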
Mind map: interface boundaries and failure modes
Why layering prevents common design mistakes
- Device logic doesn't need to know reward math. If the device tries to predict how rewards will be computed, you end up with duplicated rules and inconsistent behavior across firmware versions.
- Protocol logic doesn't need to know UI preferences. The protocol can expose deterministic states and receipts. The application decides whether to retry, how to display partial progress, and what to do when a user's budget is exhausted.
- Applications don't need to parse raw sensor formats. If the protocol provides a receipt reference and a structured status, the application can remain stable even when device payload formats evolve.
A compact "layer responsibilities" checklist
- Device: measure, sign, package, and upload with idempotency.
- Protocol: validate, verify, score, manage disputes, and settle.
- Application: translate user intent into requests, track status, and present receipts.
When you build from these boundaries, you get a system where failures are easier to diagnose and changes are easier to localize. The device can be swapped, the protocol rules can be versioned, and the application can evolve without rewriting measurement firmware or verification logic.
2.2 End-to-End Flow Example From Registration to Reward Settlement
This example walks through a single "measurement request" from the moment a node joins the network to the moment it gets paid. The goal is to show how responsibilities stay separated: identity and admission are handled early, measurement and proof happen during the request, and reward settlement happens after verification.
Actors and roles (concrete)
- Node Operator: runs a node that can measure physical infrastructure (e.g., a sensor reading, a bandwidth test, or a service availability check).
- Client: requests work and pays for it.
- Verifier/Coordinator: checks proofs and decides whether a submission is eligible for rewards.
- Smart Contracts: store minimal state needed for accounting, eligibility, and settlement.
Mind map: the end-to-end flow
Step 1: Node registration (identity and eligibility)
A node operator starts by creating a node identity that is stable across restarts. In practice, this is a keypair plus metadata (operator name, region, supported measurement types). The network should not trust metadata alone; it uses it for routing and display.
Next, the operator submits an admission request to the coordinator (or directly to a registry contract, depending on your design). The admission request includes:
- a public key (or certificate) for the node identity,
- a statement of capabilities (e.g., "can measure latency to endpoint X"),
- proof of eligibility (for example, "I control the endpoint" via a signed challenge).
Easy example: the coordinator sends a random challenge string. The operator signs it with the node key and returns the signature. If the signature verifies, the node is admitted.
The contract (or registry) records a membership entry with:
- node identity hash,
- supported measurement types,
- an activation timestamp,
- an optional stake/escrow requirement.
Step 2: Node liveness (so requests don't go to dead nodes)
After admission, the node begins sending heartbeats at a fixed interval. The coordinator uses these to mark nodes as "active" for a time window.
Easy example: if heartbeats are expected every 60 seconds and the timeout is 180 seconds, then a node is eligible for assignment only if it has sent at least one heartbeat within the last 180 seconds.
This matters because it prevents wasted work and reduces the number of "no response" outcomes that would otherwise complicate reward logic.
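The heartbeat rule from the easy example can be written as a small gate, using the 180-second timeout as the default. The function names and the dictionary of last-seen timestamps are assumptions for illustration.

```python
def is_active(last_heartbeat_s: float, now_s: float, timeout_s: float = 180.0) -> bool:
    """A node is assignable if its latest heartbeat is at most timeout_s old.

    Heartbeats stamped in the future are treated as not active.
    """
    return 0 <= now_s - last_heartbeat_s <= timeout_s


def eligible_nodes(last_seen: dict, now_s: float, timeout_s: float = 180.0) -> list:
    """Return the IDs of nodes that pass the liveness gate, in sorted order."""
    return sorted(n for n, t in last_seen.items() if is_active(t, now_s, timeout_s))
```

With 60-second heartbeats, the 180-second timeout tolerates two consecutive missed heartbeats before a node drops out of assignment.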
Step 3: Client creates a measurement request (and pays)
A client wants a measurement for a specific target. The client submits a request with:
- target (what to measure),
- measurement spec (what "good" looks like),
- constraints (time window, required endpoints, acceptable error bounds),
- payment terms (max price, reward per successful proof, and dispute window).
The client also deposits funds into escrow managed by the contract. The escrow ties payment to a specific request ID.
Easy example: request ID R-1024 measures "availability of service S" between 12:00:00 and 12:05:00, paying up to 1.0 token per valid proof.
Step 4: Coordinator assigns tasks (deterministic enough to audit)
The coordinator selects one or more active nodes based on the measurement type and any constraints (region, supported endpoints, or load balancing). The assignment should be auditable.
A common pattern is:
- coordinator computes an assignment list (node IDs) for request R-1024,
- it sends each node a task message containing:
- request ID,
- a freshness nonce (to prevent replay),
- the measurement spec,
- the time window.
Easy example: the coordinator includes nonce N=7f3a... and the node must include it in the signed proof.
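One possible way to make assignment auditable is to rank active nodes by a hash of the request ID and node ID, so anyone with the same inputs can recompute the list. This is a sketch of that idea, not a method the design mandates; assign_nodes and build_task are hypothetical names, and the ranking rule is an assumption.

```python
import hashlib
import secrets


def assign_nodes(request_id: str, active_nodes: list, k: int) -> list:
    """Rank nodes by SHA-256(request_id | node_id) and take the first k."""
    def rank(node_id: str) -> bytes:
        return hashlib.sha256(f"{request_id}|{node_id}".encode("utf-8")).digest()
    return sorted(active_nodes, key=rank)[:k]


def build_task(request_id: str, spec: dict, window: tuple) -> dict:
    """Create a task message with a fresh random nonce for replay protection."""
    return {
        "request_id": request_id,
        "nonce": secrets.token_hex(16),  # 16 random bytes, hex-encoded
        "spec": spec,
        "time_window": window,
    }
```

A fully predictable ranking lets operators anticipate which requests they will receive, so a production design may mix in randomness that is only revealed after the request is fixed.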
Step 5: Node performs measurement and submits proof
The node executes the measurement according to the spec. It then produces:
- the measurement result (raw values and derived metrics),
- a proof artifact (e.g., signed evidence, a commitment to raw data, or a Merkle root of samples),
- the coordinator-provided nonce and request ID,
- a signature from the node identity.
The node submits these to the verifier/coordinator endpoint. The verifier checks basic validity first:
- signature matches node identity,
- request ID and nonce match an outstanding task,
- proof format matches the expected measurement type.
Easy example: if the node submits a proof with nonce N that doesn't match the one in the task message, it is rejected immediately without touching reward accounting.
Step 6: Verification and outcome recording
Verification can be multi-stage. A practical approach is:
- Proof validity: cryptographic checks and schema checks.
- Spec compliance: does the measurement meet required constraints?
- Quality scoring: compute a score used for reward multipliers.
The verifier then reports an outcome to the contract for request R-1024. Outcomes are typically one of:
- Accepted (eligible for reward),
- Rejected (not eligible),
- NeedsReview (if you support asynchronous or dispute-based verification).
Easy example: if the measurement is within the allowed error bound, it gets Accepted with a quality score of 0.9; otherwise it gets Rejected.
Step 7: Reward settlement (escrow release with accounting)
Once the contract has enough verified outcomes, it settles rewards. The contract should keep accounting deterministic and minimal.
A clean settlement flow is:
- contract stores per-request totals (escrow amount, reward budget),
- for each node outcome, contract computes the reward using a fixed formula,
- contract releases funds to node operators for Accepted outcomes,
- it records events for transparency.
A simple reward formula might be:
\[
\text{reward}_i = \text{baseReward} \times \text{qualityMultiplier}_i
\]
where qualityMultiplier_i is clamped to a range like \([0,1]\).
Easy example: baseReward is 1.0 token. If qualityMultiplier is 0.9, node gets 0.9 token.
The contract also handles the case where no node is accepted. In that case, the escrow can be refunded to the client or partially refunded based on policy.
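The reward formula and the no-accepted-node case can be sketched as follows. Decimal is used here only to keep the arithmetic exact in the example; the settle function, the outcome tuple shape, and the "refund whatever is not paid out" policy are assumptions.

```python
from decimal import Decimal


def clamp(x: Decimal, lo: Decimal = Decimal("0"), hi: Decimal = Decimal("1")) -> Decimal:
    """Clamp the quality multiplier to [lo, hi]."""
    return max(lo, min(hi, x))


def reward(base_reward: Decimal, quality_multiplier: Decimal) -> Decimal:
    """reward_i = baseReward * qualityMultiplier_i, with the multiplier clamped."""
    return base_reward * clamp(quality_multiplier)


def settle(base_reward: Decimal, escrow: Decimal, outcomes: dict):
    """outcomes maps node_id -> (status, multiplier). Returns (payouts, refund)."""
    payouts = {
        node: reward(base_reward, q)
        for node, (status, q) in outcomes.items()
        if status == "Accepted"
    }
    total = sum(payouts.values(), Decimal("0"))
    if total > escrow:
        raise ValueError("escrow does not cover the computed rewards")
    return payouts, escrow - total
```

For the example in the text, reward(Decimal("1.0"), Decimal("0.9")) gives 0.9 token, and a request with no Accepted outcomes returns the whole escrow as the refund.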
Step 8: Dispute window and finality of settlement
If your design supports challenges, the contract should not release all funds instantly. Instead, it can:
- mark outcomes as "provisional,"
- start a challenge timer,
- release rewards only after the timer expires or after disputes are resolved.
Easy example: settlement is scheduled for R-1024 at block time T + 10 minutes. If no challenge arrives, rewards finalize.
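The release rule reduces to a single predicate over the provisional timestamp, the window length, and the number of unresolved disputes. The name can_finalize and its parameters are hypothetical.

```python
def can_finalize(provisional_at: int, window_s: int, now: int, open_disputes: int) -> bool:
    """Rewards finalize only once the challenge window has elapsed and no dispute is open."""
    return open_disputes == 0 and now >= provisional_at + window_s
```

With a 10-minute window (600 seconds) starting at time T, the predicate first becomes true at T + 600, provided no challenge is outstanding.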
Step 9: Events and reconciliation (so operators can audit)
After settlement, the contract emits events such as:
- RequestCreated(R-1024, client, escrowAmount)
- TaskAssigned(R-1024, nodeId, nonceHash)
- OutcomeRecorded(R-1024, nodeId, status, qualityScore)
- RewardSettled(R-1024, nodeId, amount)
These events let node operators reconcile what they submitted versus what was paid. Clients can also verify that the network followed the agreed terms.
Mermaid: end-to-end sequence diagram
sequenceDiagram
    participant Op as Node Operator
    participant Coord as Coordinator/Verifier
    participant C as Client
    participant SC as Smart Contract
    Op->>SC: Register node (identity + eligibility proof)
    SC-->>Op: Membership active
    Op->>Coord: Heartbeats
    C->>SC: Create request R-1024 + deposit escrow
    SC-->>C: Request recorded
    Coord->>SC: Select eligible nodes (off-chain or on-chain)
    Coord->>Op: Task(R-1024, nonce N, spec, window)
    Op->>Coord: Submit result + proof + nonce N + signature
    Coord->>SC: Record outcome (Accepted/Rejected, score)
    SC-->>Op: Release reward after challenge window
    SC-->>C: Finalize request accounting
Practical checklist for implementing the flow
- Registration: verify node identity and eligibility once; store minimal membership state.
- Liveness: gate assignment using heartbeat freshness.
- Request: escrow funds are tied to a request ID.
- Freshness: include a nonce in both task and proof.
- Verification: reject invalid proofs before reward logic.
- Settlement: compute rewards deterministically from stored outcomes.
- Finality: optionally delay release until the challenge window ends.
- Events: emit enough data for reconciliation without storing bulky artifacts.
2.3 Component Interfaces: Contracts for Identity, Measurement, and Payment
A DePIN system is easiest to reason about when each component has a small, explicit contract: what it accepts, what it produces, and what it guarantees. The contracts below are written as interface specifications you can implement in any language. They also make testing straightforward because you can swap real components for fakes that obey the same rules.
Identity interface contract
Identity answers: "Who is this node, and is it allowed to participate?" The contract should separate identity claims from authorization decisions.
Mind map: Identity contracts
Interface: RegisterNode
Inputs
- node_pubkey: public key used for signing requests
- node_metadata: optional fields (e.g., location label, hardware class)
- control_proof: signature over a server-provided nonce
- requested_role: e.g., operator, verifier, client (if applicable)
Outputs
- identity_id: stable identifier derived from the public key (or a registry-assigned id)
- authorization_token: signed token or on-chain reference proving admission
- valid_until: timestamp or block height
Guarantees
- The server verifies control_proof against node_pubkey.
- The returned authorization_token binds the node key to the admission decision.
Example: admission with a simple nonce
- Server sends nonce = 9f3a... to the node.
- Node signs hash(nonce || node_pubkey).
- Server verifies the signature, checks policy (e.g., allowlist), then issues authorization_token.
This prevents "borrowed keys" because the node must prove control at registration time.
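The nonce exchange can be sketched with the signature check injected as a callable, since the choice of signature scheme is outside this interface. The Registrar class, challenge_message helper, and the verify(pubkey, message, signature) shape are assumptions; the policy check (e.g., allowlist) and token issuance are omitted.

```python
import hashlib
import secrets
from typing import Callable, Dict


def challenge_message(nonce: bytes, node_pubkey: bytes) -> bytes:
    """The bytes the node signs: hash(nonce || node_pubkey), here with SHA-256."""
    return hashlib.sha256(nonce + node_pubkey).digest()


class Registrar:
    def __init__(self, verify: Callable[[bytes, bytes, bytes], bool]):
        # verify(pubkey, message, signature) comes from a real signature library.
        self._verify = verify
        self._pending: Dict[bytes, bytes] = {}  # pubkey -> outstanding nonce

    def issue_challenge(self, node_pubkey: bytes) -> bytes:
        nonce = secrets.token_bytes(16)
        self._pending[node_pubkey] = nonce
        return nonce

    def register(self, node_pubkey: bytes, control_proof: bytes) -> str:
        nonce = self._pending.pop(node_pubkey, None)  # each nonce is single-use
        if nonce is None:
            raise ValueError("no outstanding challenge for this key")
        if not self._verify(node_pubkey, challenge_message(nonce, node_pubkey), control_proof):
            raise PermissionError("control proof does not verify")
        # identity_id derived from the public key
        return hashlib.sha256(node_pubkey).hexdigest()
```

Popping the nonce before verification means a captured control_proof cannot be replayed, even if the first attempt failed.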
Interface: RotateKey
Inputs
- identity_id
- old_key_proof: signature using the current key
- new_node_pubkey
- new_key_proof: signature over a fresh nonce using the new key
Outputs
- updated authorization_token
- grace_period_until, during which both keys may be accepted (optional but useful)
Guarantees
- Rotation is atomic from the system's perspective: measurement and payment requests reference the correct key for verification.
Measurement interface contract
Measurement answers: "What did the node observe, and how can others verify it?" The contract should define a measurement record format and a proof format.
Mind map: Measurement contracts
Interface: SubmitMeasurement
Inputs
- authorization_token (or identity_id + proof)
- task_id
- measurement_payload: raw data or a structured summary
- measurement_timestamp
- freshness_nonce: value provided by the client or verifier
- node_signature: signature over a canonical encoding of the above
- proof_bundle: optional fields (e.g., Merkle proof, attestations)
Outputs
- measurement_id
- verification_status: pending | accepted | rejected
- quality_inputs: normalized fields used for scoring
Guarantees
- The verifier can recompute a canonical hash of the measurement record.
- The node signature covers the freshness nonce, preventing replay.
Example: canonical record and replay resistance
Canonical encoding rule (conceptual):
- Serialize fields in a fixed order.
- Use exact byte representations for integers.
- Hash the concatenation to get record_hash.
Then the node signs record_hash. If an attacker replays an old payload, the freshness nonce mismatch causes rejection.
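A minimal sketch of the canonical encoding rule, assuming a fixed field order, big-endian integers, a 4-byte length prefix per field, and SHA-256. The exact field set and widths are illustrative assumptions; a real protocol would publish its own byte-level schema.

```python
import hashlib
import struct


def canonical_record(task_id: str, timestamp: int, nonce: bytes, payload: bytes) -> bytes:
    """Serialize fields in a fixed order, each prefixed with its 4-byte length."""
    fields = (
        task_id.encode("utf-8"),
        struct.pack(">Q", timestamp),  # exact 8-byte big-endian integer
        nonce,
        payload,
    )
    return b"".join(struct.pack(">I", len(f)) + f for f in fields)


def record_hash(task_id: str, timestamp: int, nonce: bytes, payload: bytes) -> bytes:
    """Hash the canonical bytes; this is the value the node signs."""
    return hashlib.sha256(canonical_record(task_id, timestamp, nonce, payload)).digest()
```

The length prefixes matter: without them, ("ab", nonce "c") and ("a", nonce "bc") would concatenate to the same bytes and share a hash.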
Interface: VerifyMeasurement
Inputs
- measurement_id or full record
- task_spec: includes expected ranges, sampling rules, and acceptable proof types
- verifier_policy: thresholds for acceptance
Outputs
- is_valid: boolean
- quality_score: numeric value or structured components
- evidence_hash: hash of what was checked
- rejection_reason: if invalid
Guarantees
- Verification is deterministic given the same inputs.
- Quality scoring uses only fields that are either provided by the node with proof or derived from verifiable data.
Payment interface contract
Payment answers: "Who gets paid, for what, and when can it be challenged?" Payment contracts should be explicit about settlement states and dispute evidence.
Mind map: Payment contracts
Interface: InitiateSettlement
Inputs
- client_id
- task_id
- measurement_id (optional at initiation)
- fee_terms: e.g., client_max_fee, operator_rate, quality_multiplier_rules
- escrow_amount
Outputs
- settlement_id
- escrow_reference
- challenge_deadline
Guarantees
- Funds are locked before measurement is accepted.
- The challenge window is defined in advance.
Interface: SubmitSettlementProof
Inputs
- settlement_id
- measurement_id
- evidence_hash (what the verifier checked)
- quality_score inputs (or a reference to them)
- signatures/attestations required by the protocol
Outputs
- updated settlement state: proof_submitted
- computed operator_payout and verifier_fee (if applicable)
Guarantees
- The payout calculation is reproducible from on-chain or committed data.
Example: payout calculation inputs
Instead of sending a raw floating-point score, define a structured score:
- quality_numerator
- quality_denominator
- min_quality_threshold
Then payout can be computed as:
\[
\text{payout} = \text{base\_rate} \times \frac{\text{quality\_numerator}}{\text{quality\_denominator}}
\]
Using integers avoids rounding drift across implementations.
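An integer-only version of this payout might look like the sketch below. It assumes base_rate is in the smallest token unit, that min_quality_threshold is expressed as a numerator over the same denominator, that a score below the threshold pays nothing, and that division rounds down; none of these conventions are fixed by the text.

```python
def payout(base_rate: int, quality_numerator: int, quality_denominator: int,
           min_quality_threshold: int = 0) -> int:
    """payout = base_rate * quality_numerator / quality_denominator, in integers."""
    if quality_denominator <= 0:
        raise ValueError("quality_denominator must be positive")
    if not 0 <= quality_numerator <= quality_denominator:
        raise ValueError("quality score must lie in [0, 1]")
    if quality_numerator < min_quality_threshold:
        return 0  # below the acceptance threshold (assumed policy)
    # Multiply before dividing so precision is lost only once, in the final floor.
    return base_rate * quality_numerator // quality_denominator
```

Because every implementation performs the same multiply-then-floor on integers, two verifiers given the same inputs cannot disagree on the last digit.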
Interface: ChallengeSettlement
Inputs
- settlement_id
- challenger_id
- measurement_id
- challenge_reason_code
- challenge_evidence: minimal data needed to refute acceptance
Outputs
- settlement state: challenged
- resolution_deadline
Guarantees
- Challenges must reference the exact measurement_id and evidence_hash.
- The protocol defines which evidence fields are admissible.
Putting the contracts together: an end-to-end example
- Identity: Operator registers and receives authorization_token valid until valid_until.
- Measurement: Client issues a task with freshness_nonce. Operator submits a signed measurement record and proof bundle.
- Verification: Verifier checks signature, freshness, and proof type, then outputs quality_inputs.
- Payment: Client initiates escrow for task_id. After verification, the system submits settlement proof referencing measurement_id and evidence_hash.
- Dispute: If someone challenges, they must provide evidence tied to the same measurement_id and committed evidence hash.
Each interface contract is small enough to implement and test in isolation, yet strict enough that the system's behavior is predictable when components are swapped or upgraded.
2.4 State Management Patterns: On-Chain State vs Off-Chain State
State is where DePIN systems keep the facts that matter. The trick is deciding which facts must be globally agreed upon (on-chain) and which facts can be trusted locally but still verified when needed (off-chain). A good design makes the on-chain part small, deterministic, and auditable, while the off-chain part handles bulk data, heavy computation, and fast iteration.
The core decision: "consensus-critical" vs "evidence-carrying"
A practical way to split state is to ask two questions for each piece of information:
- Consensus-critical: If different participants disagree, does the system's correctness break? If yes, store the value itself on-chain, in as compact a form as possible.
- Evidence-carrying: Can the system function by verifying evidence later? If yes, keep the full data off-chain and store a hash/commitment on-chain.
Example: Suppose you run a measurement network where nodes submit sensor readings.
- On-chain: eligibility status, current reward parameters, and a commitment to the submitted proof (e.g., a hash of the proof artifact).
- Off-chain: the raw sensor data, intermediate computation outputs, and large proof payloads.
This separation keeps the chain from becoming a storage system and keeps verification logic deterministic.
Mind map: state categories and where they live
Pattern 1: On-chain commitments, off-chain payloads
What it looks like: The chain stores a commitment (hash or Merkle root). Off-chain storage holds the payload that produced that commitment.
Why it works: The chain can verify that a submitted payload corresponds to the commitment without storing the payload itself.
Concrete example (hash commitment):
- Client requests a measurement for a task ID.
- Node produces a proof artifact P and computes h = H(P).
- Node submits h on-chain along with task ID and a signature.
- During settlement, the verifier checks that the provided P hashes to the on-chain h.
Design detail: Use a canonical encoding for P before hashing. If two implementations serialize differently, you get "valid data, invalid commitment." A simple fix is to define a byte-level schema for the proof artifact.
Pattern 2: Merkleized evidence for partial disclosure
When evidence is large, you often want to verify only parts of it. Merkle trees let you commit to a whole dataset while revealing only the necessary leaves.
Concrete example (Merkle root for sensor samples):
- Node has a list of samples s_1..s_n.
- Node builds a Merkle tree over s_i and stores the root R on-chain.
- For verification, the node provides a Merkle proof for the specific samples used to compute the final measurement.
Why it's useful in DePIN: Challenges often target specific claims (e.g., "these samples were fabricated" or "this time window is wrong"). Merkle proofs keep the challenge payload small.
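A compact sketch of building a root over samples and verifying one sample against it. It assumes SHA-256, distinct prefixes for leaf and interior hashes, and duplication of the last node on odd-sized levels; these are illustrative choices, and a production design may prefer a different padding rule because duplication can let different sample lists share a root.

```python
import hashlib


def _leaf(sample: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + sample).digest()


def _node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root_and_proof(samples: list, index: int):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_right)."""
    if not 0 <= index < len(samples):
        raise IndexError("index out of range")
    level = [_leaf(s) for s in samples]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd-sized levels by duplicating the last node
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        level = [_node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(sample: bytes, proof: list, root: bytes) -> bool:
    """Fold the sample hash with each sibling and compare with the committed root."""
    acc = _leaf(sample)
    for sibling, sibling_is_right in proof:
        acc = _node(acc, sibling) if sibling_is_right else _node(sibling, acc)
    return acc == root
```

The proof for one sample contains about log2(n) hashes, which is why a challenge over a single sample stays small even when the committed dataset is large.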
Pattern 3: Minimal on-chain accounting, off-chain reconciliation
Accounting is the place where designs often get messy. A clean approach is to keep on-chain accounting minimal and treat off-chain reconciliation as a deterministic process driven by chain events.
Concrete example (reward ledger):
- On-chain stores: rewardEntry(taskId, nodeId, amount, status).
- Off-chain stores: a database that reconstructs balances by replaying events.
Operational benefit: If off-chain indexes are lost, you can rebuild them from the chain. If on-chain is wrong, you fix it via protocol logic, not by editing off-chain records.
Design detail: Make event schemas stable. If you change field meanings, your off-chain rebuild becomes a guessing game.
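Rebuilding balances by replay can be as simple as a fold over the event log. The event dictionary shape and the "RewardSettled" type name below are assumptions for illustration; amounts are integers in the smallest token unit.

```python
from collections import defaultdict


def rebuild_balances(events) -> dict:
    """Replay events in chain order and return {node_id: total_settled_amount}."""
    balances = defaultdict(int)
    settled = set()
    for ev in events:
        if ev["type"] != "RewardSettled":
            continue  # other event types do not move balances
        key = (ev["task_id"], ev["node_id"])
        if key in settled:
            continue  # ignore re-delivered logs for the same settlement
        settled.add(key)
        balances[ev["node_id"]] += ev["amount"]
    return dict(balances)
```

Because the function depends only on the event stream, a lost index can be recreated by replaying from the first block, and two independent indexers should arrive at identical balances.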
Pattern 4: Versioned state transitions
On-chain state should evolve in predictable steps. Off-chain state can be recomputed, but on-chain transitions should be explicit.
Concrete example (task lifecycle):
- Assigned (on-chain): task ID mapped to an eligible node set.
- Submitted (on-chain): commitment to proof.
- Challenged (on-chain): challenge exists and includes a reference to evidence.
- Settled (on-chain): final outcome.
Off-chain workers can react to these events to fetch payloads, verify proofs, and prepare challenge evidence. The chain remains the source of truth for the lifecycle.
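The lifecycle can be enforced with an explicit transition table. The allowed edges below are inferred from the order of the states listed above and are an assumption; a real contract might, for example, allow more than one challenge round.

```python
ALLOWED = {
    "Assigned": {"Submitted"},
    "Submitted": {"Challenged", "Settled"},
    "Challenged": {"Settled"},
    "Settled": set(),  # terminal state
}


def transition(current: str, nxt: str) -> str:
    """Return the new state, or raise if the move is not in the table."""
    if nxt not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {nxt}")
    return nxt
```

Keeping the table as data makes it easy to review every legal move in one place, and off-chain workers can reuse the same table to reject events that arrive out of order.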
Pattern 5: Off-chain state as "replaceable caches," not authority
Off-chain systems frequently store "helpful" state: indexes, computed metrics, and intermediate results. The rule is simple: caches may be wrong; commitments must not be.
Concrete example (node health cache):
- Off-chain monitors node uptime and stores a health score.
- On-chain stores only whether a node is currently eligible.
- If the off-chain health score is wrong, it affects scheduling, not correctness.
This prevents subtle bugs where a cached value accidentally becomes an authority.
Handling disputes: what must be available during the challenge window
Disputes require evidence availability. The chain can't store everything, so you need a clear policy:
- On-chain: store the commitment and the dispute metadata (who challenged, when, and what claim).
- Off-chain: store the full evidence needed to verify the claim.
Concrete example (challenge evidence bundle):
- Node submitted commitment h.
- Challenger submits a claim type (e.g., "timestamp mismatch") and a pointer to evidence bundle E.
- During the challenge window, the verifier fetches E, checks H(E)=h, and then runs the relevant verification logic.
Design detail: Define the minimal evidence required for each claim type. Otherwise, challengers may submit huge bundles, and verifiers may do unnecessary work.
A simple checklist for choosing what goes where
- Put on-chain: anything that affects eligibility, settlement, or final outcomes.
- Put off-chain: anything that is large, recomputable, or useful mainly for verification.
- Bridge with: hashes, Merkle roots, and canonical encodings.
- Make transitions explicit: lifecycle states should be on-chain.
- Treat off-chain as rebuildable: indexes should be derivable from chain events.
Worked example: designing state for a measurement task
Assume a task where a node must report a measurement for a location at a time window.
On-chain state
- Task(taskId, clientId, timeWindow, status)
- Eligibility(nodeId, stakeRef, active)
- Submission(taskId, nodeId, proofHash, submittedAt)
- Dispute(taskId, challengerId, claimType, evidenceHash, status)
- Settlement(taskId, nodeId, outcome, rewardAmount)
Off-chain state
- Raw sensor data and logs for the time window.
- Proof artifact P that hashes to proofHash.
- Optional Merkle tree data to support partial verification.
- Indexes for fast lookup by taskId and nodeId.
Flow
- Assignment updates Task.status on-chain.
- Node submits proofHash on-chain.
- Verifier fetches P off-chain and checks H(P)=proofHash.
- If challenged, challenger provides evidence bundle off-chain; chain already has the commitment references.
- Settlement writes the final outcome and reward amount on-chain.
This layout keeps the chain focused on what must be agreed upon, while the off-chain layer carries the bulk of the proof work. It also makes failure modes predictable: if off-chain data is missing, the commitment still tells you what should have been provided.
2.5 Operational Model Roles and Responsibilities Across the Stack
A DePIN network only works when the operational model is explicit: who does what, when they do it, and what "done" means. The trick is to separate responsibilities so that failures are contained and accountability is clear.
Roles at a glance
Think of the system as four operational planes that interact:
- Protocol plane (on-chain rules): defines eligibility, accounting, and final outcomes.
- Network plane (off-chain coordination): moves tasks, collects proofs, and routes results.
- Node plane (physical measurement): performs measurement and produces signed evidence.
- Client plane (user workflow): requests work, submits evidence, and receives receipts.
Each plane has distinct roles.
1) Governance and protocol maintainers
Responsibilities
- Define and publish protocol parameters that affect eligibility, scoring, and settlement.
- Approve upgrades that change contract logic or verification rules.
- Maintain an auditable record of parameter changes and upgrade events.
Operational habits
- Use versioned parameters so nodes and clients can interpret evidence consistently.
- Require a compatibility window: if a rule changes, evidence produced under the old rule should either remain valid or be explicitly invalidated.
Example
A protocol parameter called qualityThreshold changes from 0.70 to 0.75. The governance process sets an activation block height. Nodes include the parameter version in their signed measurement metadata. Clients reject proofs with mismatched versions after the activation height.
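The activation-height rule in this example can be sketched as a small schedule that maps a block height to the parameter version in force. The ParameterSchedule class and its method names are hypothetical, and the heights in the usage note are made up for illustration.

```python
import bisect
from decimal import Decimal


class ParameterSchedule:
    """Versioned values for one parameter, keyed by activation block height."""

    def __init__(self):
        self._heights = []  # strictly increasing activation heights
        self._entries = []  # (version, value) pairs, parallel to _heights

    def schedule(self, activation_height: int, version: int, value: Decimal) -> None:
        if self._heights and activation_height <= self._heights[-1]:
            raise ValueError("activation heights must be strictly increasing")
        self._heights.append(activation_height)
        self._entries.append((version, value))

    def active_at(self, height: int):
        """Return the (version, value) in force at the given height."""
        i = bisect.bisect_right(self._heights, height) - 1
        if i < 0:
            raise LookupError("no parameter version active at this height")
        return self._entries[i]

    def accepts(self, proof_version: int, height: int) -> bool:
        """A proof is acceptable only if it names the version active at that height."""
        return self.active_at(height)[0] == proof_version
```

For instance, after schedule(0, 1, Decimal("0.70")) and schedule(1000, 2, Decimal("0.75")), a proof tagged with version 1 is accepted at height 999 and rejected from height 1000 onward.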
2) Smart contract operators (if separate from governance)
In some designs, governance sets rules, while an operator runs the operational tooling around those rules.
Responsibilities
- Run indexers and monitoring services that track events relevant to settlement.
- Provide operational support for dispute windows (e.g., ensuring evidence is retrievable and correctly formatted).
- Maintain safe operational keys for administrative actions.
Operational habits
- Keep admin keys in a dedicated custody process with auditable access.
- Treat indexing and evidence formatting as production systems, not "best effort" scripts.
Example
An indexer watches ProofSubmitted events and builds a read model used by the client UI. If indexing lags, the UI shows "status unknown" rather than guessing. Settlement still proceeds based on on-chain data.
3) Node operators
Node operators are responsible for turning physical reality into signed, verifiable evidence.
Responsibilities
- Maintain node identity and keys.
- Execute assigned tasks within agreed time bounds.
- Produce measurement artifacts and proof objects that match the protocol's expected format.
- Submit proofs and respond to challenges when required.
Operational habits
- Implement liveness checks so the network can detect "node is alive but not measuring."
- Use idempotent submission: if the same task is retried, the node should not create conflicting evidence.
Example
A node receives task T123 to measure a sensor reading. The node records: sensor ID, sampling window, calibration version, and a monotonic timestamp. If the submission request times out, the node retries with the same task ID and a deterministic evidence hash. The network accepts the first valid submission and ignores duplicates.
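On the receiving side, "accept the first valid submission and ignore duplicates" can be sketched as a log keyed by node and task. The SubmissionLog class and the three return labels are assumptions; in particular, flagging a retry with different evidence as a conflict is one possible policy, not something the text prescribes.

```python
import hashlib


class SubmissionLog:
    """Remembers the evidence hash of the first submission per (node, task)."""

    def __init__(self):
        self._first = {}  # (node_id, task_id) -> hex digest of first evidence

    def submit(self, node_id: str, task_id: str, evidence: bytes) -> str:
        digest = hashlib.sha256(evidence).hexdigest()
        key = (node_id, task_id)
        if key not in self._first:
            self._first[key] = digest
            return "accepted"
        # Same bytes again: a harmless retry. Different bytes: conflicting evidence.
        return "duplicate" if self._first[key] == digest else "conflict"
```

A node that rebuilds its evidence deterministically will always land on "duplicate" when it retries after a timeout, which is the behavior the example relies on.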
4) Verifiers (off-chain or on-chain)
Verification can be split across layers.
Responsibilities
- Validate proof structure and signatures.
- Check measurement constraints (freshness, bounds, and required fields).
- Optionally perform expensive verification off-chain and submit a compact result on-chain.
Operational habits
- Separate "format validity" from "semantic validity." Format checks are fast; semantic checks may require more computation.
- Ensure verifiers use the same parameter versions as the protocol.
Example
A verifier first checks that the proof includes measurementHash, parameterVersion, and a valid signature. Then it checks that the measurement timestamp is within the allowed window for task T123. Only after both checks does it mark the proof as eligible for scoring.
5) Clients (requesters) and client operators
Clients initiate work and manage the user workflow.
Responsibilities
- Request tasks or quotes (depending on the design).
- Provide any required context (e.g., target location, desired service level).
- Submit proofs for settlement when the protocol requires client involvement.
- Handle receipts, status, and error reporting.
Operational habits
- Treat the client as a state machine: Requested → Assigned → ProofReady → Submitted → Settled.
- Persist task state so a restart does not lose track of what was already submitted.
Example
A client requests "coverage for Zone A." It receives a task assignment ID and waits for a proof. If the proof arrives but submission fails due to a temporary network issue, the client retries submission using the same evidence hash. The client UI shows "retrying submission" rather than "failed."
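One way to sketch the client state machine with persistence is shown below. The `TaskStore` class and the JSON file layout are assumptions for illustration; a production client would more likely use a database and atomic writes.

```python
import json
from enum import Enum
from pathlib import Path

class TaskState(Enum):
    REQUESTED = "Requested"
    ASSIGNED = "Assigned"
    PROOF_READY = "ProofReady"
    SUBMITTED = "Submitted"
    SETTLED = "Settled"

# Allowed forward transitions; a retry leaves the state where it is.
NEXT = {
    TaskState.REQUESTED: TaskState.ASSIGNED,
    TaskState.ASSIGNED: TaskState.PROOF_READY,
    TaskState.PROOF_READY: TaskState.SUBMITTED,
    TaskState.SUBMITTED: TaskState.SETTLED,
}

class TaskStore:
    """Persists task state as JSON so a restart resumes from the saved state."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.tasks = json.loads(path.read_text()) if path.exists() else {}

    def request(self, task_id: str) -> None:
        self.tasks.setdefault(task_id, TaskState.REQUESTED.value)
        self._save()

    def state(self, task_id: str) -> TaskState:
        return TaskState(self.tasks[task_id])

    def advance(self, task_id: str) -> TaskState:
        current = self.state(task_id)
        if current not in NEXT:
            raise ValueError(f"{task_id} is already settled")
        self.tasks[task_id] = NEXT[current].value
        self._save()
        return NEXT[current]

    def _save(self) -> None:
        self.path.write_text(json.dumps(self.tasks))
```

Because a failed submission never calls `advance`, the task stays in ProofReady and the UI can report "retrying submission" from the persisted state.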
6) Dispute handlers and evidence curators
Disputes require structured evidence handling.
Responsibilities
- Provide a process for evidence submission during challenge windows.
- Ensure evidence is complete, correctly formatted, and traceable to the original proof.
- Coordinate with verifiers to determine what can be checked.
Operational habits
- Define evidence schemas that include enough information to reproduce verification steps.
- Keep evidence retrieval deterministic: the same evidence ID should map to the same artifact.
Example
During a challenge, a node operator submits raw measurement logs. The dispute handler verifies that the logs match the original measurementHash and include the calibration version referenced in the signed proof. If the hash matches, the dispute proceeds to semantic checks.
7) Observability and operations (SRE-style, but practical)
Operational roles often get ignored until something breaks.
Responsibilities
- Monitor proof latency, submission failure rates, and verifier throughput.
- Track node liveness and task assignment distribution.
- Maintain runbooks for common failure modes.
Operational habits
- Alert on symptoms that matter: "proofs rejected due to parameter mismatch" is more actionable than "error rate increased."
- Log correlation IDs across planes (task ID is usually the anchor).
Example
If many proofs are rejected for "timestamp out of window," operations checks whether the node time source drifted. The runbook instructs node operators to resync time and re-register if necessary.
Mind map: operational responsibilities
Responsibility boundaries that prevent "everyone owns everything"
A clean operational model defines boundaries:
- Node operators own measurement integrity (what was measured, when, and under which calibration).
- Verifiers own validation correctness (whether the evidence satisfies protocol rules).
- Clients own workflow state (what was requested and what was submitted).
- Governance owns rule changes (what counts as valid and how rewards are computed).
- Operations owns reliability (keeping the system observable and recoverable).
Example boundary failure (and the fix)
If verifiers accept proofs without checking parameter versions, a client might submit evidence that was valid under old rules but not under new rules. The fix is simple: verifiers require parameterVersion and reject mismatches before any scoring.
A concrete end-to-end operational flow
- Governance sets parameterVersion = 3 and activates it at block B.
- Node operators register and keep their node identity keys current.
- A client requests a task and receives assignment T123.
- The node measures within the task window and signs evidence including parameterVersion = 3.
- Verifiers validate signatures, freshness, and semantic constraints.
- The client submits the proof for settlement.
- If challenged, dispute handlers retrieve the evidence artifacts and verifiers re-check semantic validity.
- Operations monitors proof latency and rejection reasons, then triggers runbook actions.
This flow works because each role has a narrow, testable responsibility. When something fails, you know where to look: measurement, validation, workflow state, rule configuration, or operational reliability.
3. Identity, Membership, and Node Lifecycle Design
3.1 Node Identity Models: Keys, Certificates, and Human-Readable Metadata
A DePIN network needs a way to answer two questions: who is this node, and what can it do. Identity design is the part that makes those answers consistent across registration, measurement submission, and dispute handling. The trick is to separate cryptographic identity (keys and certificates) from operational identity (what humans and dashboards need).
Identity layers: cryptographic vs operational
Think of node identity as three layers that work together:
- Key material: the cryptographic root used to sign requests and proofs.
- Certificates / attestations: a way to bind a key to a role or membership status.
- Human-readable metadata: labels that help operators and clients interpret logs and dashboards.
A common mistake is to treat metadata as security. A node name like "alpha-3" is useful, but it should never be used to decide eligibility. Eligibility should be derived from verifiable cryptographic material.
Keys: what you sign with, and why it matters
Nodes typically use asymmetric keys for signatures. You want keys that support:
- Request signing: node signs "I am submitting this measurement for task X."
- Proof signing: node signs measurement artifacts so verifiers can trust origin.
- Rotation: keys change without breaking the network.
A practical model is to use a long-term identity key plus short-term signing keys.
- The identity key changes rarely and is used to authenticate certificate issuance or rotation.
- The signing key changes more often to limit the blast radius of a compromised key.
Example (request signing):
- Node signs a payload containing: task_id, measurement_hash, timestamp, and nonce.
- The signature covers all fields, so an attacker cannot swap the task ID without invalidating it, and the timestamp and nonce let the verifier detect a replayed submission.
Example (key rotation):
- Node generates a new signing key.
- It requests a new certificate binding the new key to its existing membership.
- Until the new certificate is active, the node continues using the old signing key.
Certificates: binding keys to membership and roles
Certificates answer: "Is this public key allowed to participate, and under what rules?" In a DePIN setting, certificates often encode:
- Membership status (admitted, suspended, revoked)
- Role (operator, verifier, client, etc., depending on your architecture)
- Validity window (not-before / not-after)
- Key binding (the public key being certified)
You can implement certificates in multiple ways, but the design principle stays the same: verifiers should be able to check certificate validity using data available at verification time.
Example (membership certificate fields):
- subject_public_key
- node_id (derived from the identity key, not a nickname)
- role = operator
- valid_from, valid_to
- issuer_signature
Revocation and suspension: Certificates need a way to stop trust. Two common approaches are:
- Short validity windows: certificates expire quickly, reducing the need for immediate revocation checks.
- Revocation lists or on-chain status: verifiers check whether a node's identity key is revoked.
If you choose short windows, you still need a safe failure mode: when a certificate expires, the node should stop submitting proofs rather than guessing.
Node identifiers: stable IDs that don't depend on nicknames
A node identifier should be stable and derived from cryptographic material. A typical approach is:
- Compute node_id = hash(identity_public_key).
- Use node_id in logs, dashboards, and on-chain references.
This makes identity consistent even when keys rotate. Rotation changes signing keys, but the identity key (and thus node_id) stays stable.
Example (log correlation):
- Dashboard shows: node_id = 0x9a...f1.
- Operator metadata maps that node_id to "alpha-3 in datacenter B."
- When the node rotates signing keys, the node_id remains the same, so historical charts don't fragment.
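The stability property can be illustrated with a short sketch. SHA-256 is an assumption here (the text only says "hash"), and the `NodeRecord` type and the placeholder key bytes are hypothetical.

```python
import hashlib
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class NodeRecord:
    identity_public_key: bytes
    signing_public_key: bytes
    display_name: str  # operator-facing label only, never used for eligibility

    @property
    def node_id(self) -> str:
        # SHA-256 is an assumption; any collision-resistant hash would do.
        return "0x" + hashlib.sha256(self.identity_public_key).hexdigest()

def rotate_signing_key(record: NodeRecord, new_key: bytes) -> NodeRecord:
    """Swap the signing key; the identity key, and so node_id, is untouched."""
    return replace(record, signing_public_key=new_key)

# Placeholder byte strings stand in for real public keys.
node = NodeRecord(b"identity-key-bytes", b"signing-key-v1", "alpha-3")
rotated = rotate_signing_key(node, b"signing-key-v2")
```

Because `node_id` depends only on the identity key, neither a signing-key rotation nor a change of display name alters it.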
Human-readable metadata: what it is for, and what it must not do
Human-readable metadata helps people answer operational questions quickly:
- Which node is this?
- Where is it running?
- What hardware profile does it claim?
- Who is the responsible operator?
Good metadata is descriptive, not authoritative. It should be treated as "helpful context," not as a basis for eligibility.
Example metadata fields:
- display_name: "alpha-3"
- location_tag: "dc-b / rack-12"
- hardware_profile: "gpu-tier-2"
- contact: an operator email or support handle
- capabilities: a list of supported measurement types
Example (capabilities): A node might advertise it can measure "temperature" and "humidity." Verifiers should still check that the submitted proof matches the task requirements, but metadata makes it easier to route tasks and interpret failures.
Mind map: identity model components and responsibilities
End-to-end example: from registration to proof verification
- Registration: Node submits its identity key and requests admission.
- Certificate issuance: Network (or governance-controlled authority) issues a membership certificate for the node's identity key and current signing key.
- Operational metadata: Node provides display_name, location_tag, and capabilities for dashboards.
- Proof submission: Node signs a proof package with the signing key and includes the certificate.
- Verification:
  - Verifier checks certificate validity and role.
  - Verifier checks signature on the proof payload.
  - Verifier uses node_id for correlation and uses metadata only for operator-facing context.
Example (what the verifier trusts):
- Trusts: certificate signature, certificate validity window, proof signature.
- Does not trust: display_name or location_tag for eligibility.
Design checklist for this section
- Derive a stable node_id from the identity key.
- Use signing keys for submissions; rotate them safely.
- Issue certificates that bind keys to membership and roles.
- Ensure verifiers can check certificate validity at verification time.
- Treat human-readable metadata as context, not authorization.
- Make failure behavior explicit: expired or revoked certificates mean "stop submitting."
3.2 Registration and Admission Control: Whitelisting With Proof of Control
A DePIN network needs a way to decide who may operate nodes. "Whitelisting" is the simplest admission control: only approved node identities can submit measurements and receive rewards. The key design question is how to prove that an applicant actually controls the resources they claim to operate, without trusting their word.
Goal and threat model
Admission control should prevent these failures:
- Impersonation: someone registers as "Node A" but controls a different machine.
- Resource squatting: someone claims a device location or hardware capability they don't own.
- Sybil flooding: many identities are created to overwhelm verification and governance.
The admission mechanism should therefore verify control (not just identity) and should do it in a way that is repeatable and auditable.
Mind map: whitelisting with proof of control
A practical admission flow
A common pattern is challenge-response. The network issues a short-lived challenge, and the applicant proves it can respond from the claimed node environment.
Step 1: Applicant submits a registration request
The applicant provides:
- Node public key (the key that will sign future submissions)
- Claimed endpoint (e.g., IP/port or a service URL)
- Claimed resource attributes (e.g., hardware fingerprint hash, site tag)
- Registration metadata (operator name or organization, optional)
Example request (conceptual):
- nodePubKey = 0xabc...
- claimedEndpoint = node-12.example:9000
- resourceHash = sha256(hw-info)
- siteTag = "warehouse-3"
The network does not trust these fields yet; it uses them to define what the proof must bind to.
Step 2: Network issues a challenge
The network generates a challenge that includes:
- A nonce (prevents replay)
- A timestamp or expiry (limits usefulness)
- The applicant's node public key (binds proof to identity)
- Optionally, a hash of the claimed resource attributes (binds proof to claims)
Example challenge payload:
challenge = H(nonce || expiry || nodePubKey || resourceHash)
The applicant must prove it can compute a response that the network can verify.
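A minimal sketch of the challenge construction follows, assuming SHA-256 for H and a 60-second expiry. The length-prefixing helper is an addition not mentioned in the text: plain concatenation can be ambiguous when fields vary in length, so each field is prefixed with its size. All function names are illustrative.

```python
import hashlib
import secrets
import time
from typing import Optional

def _field(data: bytes) -> bytes:
    """Length-prefix a field so concatenated fields cannot be re-split differently."""
    return len(data).to_bytes(4, "big") + data

def challenge_digest(nonce: bytes, expiry: int, node_pub_key: bytes,
                     resource_hash: bytes) -> str:
    """Compute H(nonce || expiry || nodePubKey || resourceHash) with SHA-256."""
    material = (_field(nonce) + _field(expiry.to_bytes(8, "big"))
                + _field(node_pub_key) + _field(resource_hash))
    return hashlib.sha256(material).hexdigest()

def build_challenge(node_pub_key: bytes, resource_hash: bytes,
                    ttl_seconds: int = 60, now: Optional[float] = None) -> dict:
    """Create a short-lived challenge bound to the applicant's key and claims."""
    issued = time.time() if now is None else now
    nonce = secrets.token_bytes(32)  # unpredictable, single-use value
    expiry = int(issued + ttl_seconds)
    return {
        "nonce": nonce,
        "expiry": expiry,
        "challenge": challenge_digest(nonce, expiry, node_pub_key, resource_hash),
    }
```

The network keeps the nonce and expiry so it can recompute the digest at verification time; a response built for a different key or resource hash produces a different digest and fails the binding check.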
Step 3: Applicant returns a signed response
The applicant signs the challenge with the private key corresponding to nodePubKey. That proves key control.
But key control alone is not enough. Someone could sign with a key while still not operating the claimed hardware. So the proof should also bind to the claimed environment.
Two easy-to-understand options are:
- Remote attestation (if available): prove the software/hardware state.
- Proof-of-reachability with environment binding: prove the node can reach a network endpoint and that the response includes an environment secret derived from the claimed resource.
You can implement either without fancy magic by requiring the applicant to produce a second signature from an environment-bound secret.
Step 4: Verify freshness and binding
Verification checks:
- Signature validity: response signature matches nodePubKey.
- Freshness: challenge nonce is unexpired and unused.
- Binding: response includes resourceHash (or an attestation report that hashes to it).
- Policy constraints: site tags, hardware class, and rate limits match the network's rules.
If any check fails, the network returns a structured denial reason, such as:
- INVALID_SIGNATURE
- EXPIRED_CHALLENGE
- RESOURCE_MISMATCH
- POLICY_REJECTED
This matters because operators can fix the exact cause instead of guessing.
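The check ordering and denial reasons can be sketched as below. Two loud assumptions: HMAC with a shared key stands in for a real asymmetric signature so the example stays within the standard library, and the policy check is reduced to an allow-list of resource hashes. The `AdmissionVerifier` class and the dictionary layouts are hypothetical.

```python
import hashlib
import hmac
from typing import Optional

def sign(key: bytes, message: bytes) -> bytes:
    # Stand-in for an asymmetric signature; HMAC keeps the sketch stdlib-only.
    return hmac.new(key, message, hashlib.sha256).digest()

class AdmissionVerifier:
    def __init__(self, allowed_resource_hashes: set) -> None:
        self.allowed = allowed_resource_hashes
        self.used_nonces = set()

    def check(self, challenge: dict, response: dict, node_key: bytes,
              now: int) -> Optional[str]:
        """Return None on success, or the first denial reason that applies."""
        expected = sign(node_key, challenge["digest"])
        if not hmac.compare_digest(expected, response["signature"]):
            return "INVALID_SIGNATURE"
        if now >= challenge["expiry"]:
            return "EXPIRED_CHALLENGE"
        if challenge["nonce"] in self.used_nonces:
            return "NONCE_ALREADY_USED"
        if response["resource_hash"] != challenge["resource_hash"]:
            return "RESOURCE_MISMATCH"
        if challenge["resource_hash"] not in self.allowed:
            return "POLICY_REJECTED"
        # Burn the nonce only after a fully successful admission.
        self.used_nonces.add(challenge["nonce"])
        return None
```

Returning the first failing reason keeps responses deterministic: the same bad request always produces the same denial code, which makes operator debugging and audit logs easier to interpret.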
Example: whitelisting a node with a hardware fingerprint
Assume the network wants to ensure that only machines with a specific hardware class can join.
- The operator computes resourceHash = sha256(hw-info).
- During registration, the network includes resourceHash in the challenge.
- The node software must produce a response that includes a value derived from the same hw-info.
A simple binding method is to require the node to hold a local secret k_hw that is deterministically derived from the hardware fingerprint at install time (or provisioned during setup). The node then signs the challenge using k_hw.
Verification then checks that the k_hw-signature verifies against a public key registered during setup, and that the challenge included the same resourceHash.
This creates a clean chain:
- hardware fingerprint → derived environment secret → signed proof → admission.
Example: admission with "proof of reachability"
Sometimes you cannot rely on attestation. You can still prove control by requiring the applicant to respond from the claimed endpoint.
Flow:
- The network sends a challenge to the claimed endpoint.
- The node must respond within a short timeout.
- The response must include a signature over the challenge.
This prevents a random third party from registering a key and claiming an endpoint they can't actually serve.
To avoid trivial spoofing, the challenge should be unpredictable and short-lived, and the response should include the node's nodePubKey so the network can verify it matches the registered identity.
On-chain registry update: keep it minimal
Admission control should update only what the network needs for later verification:
- nodePubKey
- status (e.g., WHITELISTED, QUARANTINED, REVOKED)
- allowedResourceClass or a reference to a policy bucket
- proofReference (hash or pointer to off-chain proof artifacts)
Everything else, such as verbose attestation logs, can stay off-chain, referenced by hash. This keeps the chain lean and makes audits deterministic.
Failure handling and operator experience
A good admission system distinguishes between mistakes and attacks.
- Wrong configuration: return RESOURCE_MISMATCH or POLICY_REJECTED.
- Network issues: return TIMEOUT and allow retry with a new challenge.
- Replay attempts: return NONCE_ALREADY_USED and mark the identity as suspicious.
For retries, the network should issue a new challenge each time. Reusing challenges makes replay attacks easier and makes debugging harder.
Admission policy knobs that matter
Whitelisting is not just a yes/no gate. It usually includes:
- Maximum admission rate per operator identity to reduce flooding.
- Resource class limits to prevent one operator from claiming everything.
- Challenge expiry window tuned to typical network latency.
- Quarantine rules for borderline proofs (e.g., reachability verified but resource binding missing).
These knobs should be explicit in the protocol spec so operators know what "good" looks like.
Summary
Whitelisting with proof of control works when admission verifies fresh, challenge-bound evidence that ties the applicant's node identity to the claimed environment. The simplest reliable approach is challenge-response with explicit binding to resource claims, plus clear failure reasons and a minimal on-chain registry.
3.3 Node Health and Liveness Checks: Heartbeats and Failure Handling
A DePIN network only pays for work it can trust, so "liveness" is not a vibe; it is a set of measurable behaviors. In practice, you want to answer two questions continuously: (1) is the node reachable and responsive, and (2) is it still eligible to earn rewards for the tasks it has been assigned.
What "health" means in a DePIN context
Health is broader than "the process is running." A node can be up while still failing the job, for example by producing stale measurements or refusing to submit proofs. A useful approach is to split health into three signals:
- Connectivity: can the network reach the node and can the node reach the network?
- Responsiveness: does the node respond within an expected time window?
- Task correctness readiness: is the node able to produce valid proofs for currently assigned tasks?
You can implement these signals with heartbeats plus lightweight task-aware checks.
Heartbeats: the minimum viable liveness signal
A heartbeat is a periodic message from the node to the network (or to a coordinator) that includes enough information to decide whether the node is still "alive" and whether it is still working on the right things.
A practical heartbeat payload includes:
- Node ID (identity used for admission and accounting)
- Epoch or time window (so the receiver can detect staleness)
- Current workload state (e.g., idle, assigned, proving, submitting)
- Last completed task reference (task ID or measurement batch ID)
- Optional health counters (e.g., consecutive submission failures)
- Signature over the payload (prevents spoofing)
Example heartbeat (conceptual)
A node sends a heartbeat every 10 seconds. The payload says it is in state submitting and references task T-1842 as the last task it attempted. If the network doesn't see a heartbeat for 30 seconds, it marks the node as not live.
Choosing heartbeat intervals and timeouts
Heartbeat design is mostly about timing math. You want timeouts that tolerate normal network jitter but still catch failures quickly.
A common pattern:
- Heartbeat interval: H seconds (e.g., 10)
- Grace window: G seconds (e.g., 20)
- Liveness timeout: T = H + G (e.g., 30)
If the last heartbeat is older than T, the node is considered not live.
To avoid edge cases, define how you treat clock skew. The simplest method is to use the receiver's arrival time as the staleness basis, not the sender's timestamp.
Failure handling: what to do when a node stops being live
Once a node fails liveness, the network must protect two things: fair rewards and workflow continuity.
1) Mark state transitions explicitly
Use a small state machine for node status:
- active: heartbeats are timely
- suspect: heartbeats are late but within an early warning window
- inactive: heartbeats are missing beyond timeout
- banned_or_slashed (optional): repeated failures or misbehavior
This prevents abrupt behavior changes and gives you room to reassign tasks.
2) Early warning ("suspect") before hard timeout
Instead of waiting for the full timeout, you can trigger a suspect state at a fraction of T. For example:
- suspect at 0.7T
- inactive at T
In suspect mode, the network can:
- stop assigning new tasks to the node
- request a status update (optional)
- prepare replacement nodes for tasks already in flight
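The timing rules can be sketched as a small classifier. It uses the example values from this section (H = 10, G = 20, so T = 30 and the suspect threshold is 0.7T = 21) and measures silence from the receiver-side arrival time of the last heartbeat. Treating the thresholds as inclusive is an assumption made for the sketch.

```python
from enum import Enum

class NodeStatus(Enum):
    ACTIVE = "active"
    SUSPECT = "suspect"
    INACTIVE = "inactive"

HEARTBEAT_INTERVAL = 10                                 # H, seconds
GRACE_WINDOW = 20                                       # G, seconds
LIVENESS_TIMEOUT = HEARTBEAT_INTERVAL + GRACE_WINDOW    # T = H + G = 30
SUSPECT_AFTER = LIVENESS_TIMEOUT * 7 / 10               # 0.7T = 21

def classify(last_arrival: float, now: float) -> NodeStatus:
    """Classify a node from the arrival time of its last heartbeat at the receiver."""
    silence = now - last_arrival
    if silence >= LIVENESS_TIMEOUT:
        return NodeStatus.INACTIVE
    if silence >= SUSPECT_AFTER:
        return NodeStatus.SUSPECT
    return NodeStatus.ACTIVE
```

For a node whose last heartbeat arrived at time 1020, this yields active at 1030, suspect at 1041, and inactive at 1050.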
3) Reassignment rules for in-flight tasks
When a node becomes inactive, you need deterministic rules for tasks that were assigned but not yet finalized.
A clean approach is to separate tasks into phases:
- Assigned: node has a task but hasn't produced a proof
- Proving: node has produced a proof artifact locally but hasn't submitted it
- Submitted: proof is on-chain or otherwise accepted by the verifier
- Finalized: reward eligibility is settled
If the node goes inactive before Submitted, the network can reassign the task to another node. If it goes inactive after Submitted but before Finalized, the network should rely on the already-submitted evidence and continue settlement.
4) Prevent double work from breaking accounting
Reassignment can cause duplicate proofs. That's fine if your verification and accounting are designed for it.
A robust pattern:
- Each task has a unique ID and a challenge window.
- The verifier accepts the first valid proof that meets the task's requirements within the window.
- Later proofs for the same task are either ignored or recorded as redundant.
This keeps rewards from being paid twice.
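A sketch of first-valid-proof accounting is shown below. The `TaskSettlement` type and the half-open window convention are assumptions; proof validity is passed in as a boolean because the actual verification is out of scope here.

```python
from dataclasses import dataclass, field

@dataclass
class TaskSettlement:
    task_id: str
    window_start: int
    window_end: int          # exclusive end of the task's window
    winner: str = ""         # node_id of the first valid proof, if any
    redundant: list = field(default_factory=list)

    def record_proof(self, node_id: str, arrived_at: int, is_valid: bool) -> str:
        """Return 'rewarded', 'redundant', or 'rejected' for a submitted proof."""
        in_window = self.window_start <= arrived_at < self.window_end
        if not (is_valid and in_window):
            return "rejected"
        if not self.winner:
            self.winner = node_id
            return "rewarded"
        # A later valid proof is recorded but never paid.
        self.redundant.append(node_id)
        return "redundant"
```

Recording redundant proofs instead of silently dropping them keeps an audit trail of reassignment without creating a second reward.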
Task-aware liveness checks
Heartbeats alone can be misleading, even without bad intent. A node might keep sending heartbeats while stuck on a task. To reduce that risk, include task-aware signals.
Two lightweight checks work well:
- Progress heartbeat: include last_completed_task_id or last_batch_seq. If it doesn't advance for K heartbeat intervals while the node claims it is proving, treat it as unhealthy.
- Submission heartbeat: include last_submission_attempt_time and consecutive_failures. If failures exceed a threshold, mark the node suspect even if heartbeats are timely.
Example: progress-based suspect
- Heartbeat interval H = 10 seconds.
- Suspect if state=proving and last_completed_task_id hasn't changed for 6 intervals (60 seconds).
This catches nodes that are alive but not making progress.
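The progress rule can be sketched as a per-node tracker that is fed one heartbeat at a time. The `ProgressTracker` class is hypothetical, and K = 6 follows the example above.

```python
from dataclasses import dataclass
from typing import Optional

STALL_INTERVALS = 6  # K: heartbeats without progress before marking suspect

@dataclass
class ProgressTracker:
    last_task_id: Optional[str] = None
    stalled_intervals: int = 0

    def observe(self, state: str, last_completed_task_id: str) -> bool:
        """Process one heartbeat; return True if the node should be marked suspect."""
        progressed = last_completed_task_id != self.last_task_id
        if progressed or state != "proving":
            # Any progress, or leaving the proving state, resets the counter.
            self.stalled_intervals = 0
        else:
            self.stalled_intervals += 1
        self.last_task_id = last_completed_task_id
        return self.stalled_intervals >= STALL_INTERVALS
```

With a 10-second heartbeat, a node that keeps reporting the same last completed task while proving is flagged on the heartbeat that arrives 60 seconds after the first one.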
Concrete example: handling a node outage
Assume:
- Heartbeat interval H = 10 seconds.
- Suspect at 21 seconds, inactive at 30 seconds.
- Task T-1842 was assigned at time 1000.
Timeline:
- 1010: heartbeat received, node assigned.
- 1020: heartbeat received, node proving. This is the last heartbeat the network receives.
- 1030: no heartbeat; 10 seconds of silence, so the node is still active.
- 1041: 21 seconds of silence; node becomes suspect. Network stops assigning new tasks and prepares reassignment for T-1842.
- 1050: 30 seconds of silence; node becomes inactive.
- 1055: verifier reassigns T-1842 to another node.
- If the original node later reconnects and submits a proof, the verifier accepts it only if it arrives within the task's challenge window and is valid.
The key is that the network's behavior is driven by timestamps and task phase, not by hope.
Mind map: node health and liveness
Implementation checklist (design-level)
- Define heartbeat interval H and liveness timeout T with explicit numbers.
- Use receiver arrival time for staleness to reduce clock-skew surprises.
- Include task phase and last completed reference in heartbeats.
- Implement active → suspect → inactive transitions with early warning.
- Reassign only tasks that are not yet Submitted.
- Ensure accounting accepts the first valid proof within the task's window.
- Track consecutive failures and progress stagnation to avoid "alive but stuck" nodes.
When these pieces are in place, liveness checks become a predictable control system: nodes either keep up, fall behind, or stop, and the network responds in a way that preserves correctness and fairness.
3.4 Key Rotation and Recovery Example Safe Rotation Without Downtime
Key rotation is the boring part of security that prevents the exciting part from happening. In a DePIN network, keys are used for identity, request signing, proof submission, and sometimes operator-to-client authorization. Rotation must therefore preserve two properties at once: (1) new messages verify with the new key, and (2) in-flight work created under the old key can still be verified and settled.
Goal and constraints
A "safe rotation without downtime" design usually targets these constraints:
- Verification continuity: Verifiers must accept signatures from both the old and new keys during a transition window.
- No double-spend of identity: A node should not be able to claim two identities at the same time.
- Predictable cutover: The network needs a deterministic moment when the old key stops being accepted.
- Recovery path: If a node loses the new key before cutover, it must be able to return to a known-good state.
Mind map: rotation and recovery
A concrete example: versioned signing keys with dual verification
Assume each node has a node ID and a signing key that changes over time. The registry stores a key version and its status.
Registry data model (conceptual)
- nodeId
- currentKeyVersion
- keys[nodeId][version] = { publicKey, status, validFrom, validTo }
- recoveryKeyHash (or a recovery authorization mechanism)
The key idea is that verifiers do not just check "the node's current key." They check whether the signature matches any key version whose validity interval covers the message timestamp (or covers the message's included "issued-at" value).
Rotation timeline
Let's define three timestamps:
- T_announce: when the node publishes the new public key.
- T_cutover: when the old key stops being accepted.
- T_cleanup: when old key records can be removed after settlement finality.
During the interval [T_announce, T_cutover), verifiers accept signatures from both:
- old key version v_old (still valid)
- new key version v_new (valid from T_announce)
After T_cutover, only v_new remains valid.
Step-by-step rotation procedure (node operator)
- Generate a new signing key pair K_new.
- Create a rotation authorization proving control of the node. This can be done by signing the rotation request with the old key K_old (if it's still safe) or with a recovery mechanism.
- Submit a rotation transaction to the registry:
  - nodeId
  - newVersion = v_new
  - publicKey = PK_new
  - validFrom = T_announce (start of validity for v_new)
  - validTo = T_cutover (applied to v_old, so the old key expires at cutover while v_new stays valid)
- Update the node's local configuration so new outbound messages are signed with K_new.
- Keep the old key available until after T_cutover so any late retries can still be signed correctly.
- After T_cutover, stop using K_old and optionally delete it.
This is "no downtime" because verifiers accept both keys during the transition, and the node can safely retry messages that were created near the cutover boundary.
Step-by-step verification procedure (verifier)
When a message arrives, the verifier:
- Extracts nodeId, keyVersion (or infers it), and the signature.
- Checks the message freshness (e.g., nonce or issued-at window) so replayed messages don't get a free pass.
- Looks up the public key for the relevant version.
- Verifies the signature.
- Confirms the key's validity interval covers the message time.
If the message does not include a key version, the verifier can still work by trying allowed versions for that node within the transition window. That approach is slightly more expensive but can reduce reliance on client correctness.
Example: handling in-flight proof submissions
Suppose the node submits a proof request at t = 10:59:58 signed with K_old. The verifier only receives it at t = 11:00:02, after T_cutover = 11:00:00.
To avoid rejecting it, you have two practical options:
- Time-based validity: Set validTo for v_old to a time that covers network delay, e.g., T_cutover = T_announce + Δ + buffer.
- Message-based validity: Include an issuedAt inside the signed payload and validate that issuedAt falls within the key's validity interval.
The second option is usually cleaner because it ties acceptance to what the node intended, not to transport timing.
Recovery example: lost new key before cutover
Now assume the node successfully announces v_new but then loses K_new before T_cutover. The node still has K_old.
A safe recovery design provides a recovery action that can revert the node to the old key without breaking verification.
Recovery action rules
- The node can submit a recovery transaction signed by K_old (if still uncompromised) or by a recovery authorization.
- The registry marks v_new as revoked (or sets validTo earlier than T_cutover).
- The registry extends or reasserts v_old validity until a new rotation is completed.
Concrete sequence
- At t = 11:01:00, the node detects it cannot sign with K_new.
- It submits recovery(nodeId, targetVersion=v_old, revokeVersion=v_new).
- The registry updates:
  - v_new.validTo = 11:01:10 (short grace window)
  - v_old.validTo = 11:02:00 (extend)
- The node continues to sign with K_old.
- Verifiers accept either key only within their updated validity windows.
This keeps the network operational because verifiers never see a period where no valid key exists.
Recovery example: compromised old key
If K_old is compromised, the node should not rely on old-key signatures for recovery. Instead, it uses a recovery authorization that does not require K_old.
A common pattern is an offline recovery key whose public hash is stored in the registry. The recovery transaction is signed with the recovery key and includes:
- nodeId
- revokeVersion=v_old
- newVersion=v_new (or "pause" until a new key is generated)
- a recoveryNonce to prevent replay
Verifiers then treat the recovery action as authoritative and immediately stop accepting v_old after the registry update is finalized.
Safety rules that prevent foot-guns
To keep rotation from becoming a loophole, enforce these rules:
- Version monotonicity: A node cannot reuse an old version number for a different public key.
- Single active identity: At any time, the node's identity is tied to its nodeId, not to whichever key happens to be presented.
- Freshness checks: Even during dual verification, require nonces or issued-at windows so an attacker cannot replay old signed messages after cutover.
- Challenge compatibility: Dispute evidence should reference the message payload and its issuedAt, so verifiers can validate signatures according to the key's interval at the time the message was created.
Minimal pseudo-logic for verifier acceptance
if !fresh(message.nonce, message.issuedAt): reject
keys = registry.getKeysForNode(message.nodeId)
for key in keys:
    if key.status != ACTIVE: continue
    if message.issuedAt < key.validFrom or message.issuedAt >= key.validTo: continue
    if verifySig(message.payload, message.signature, key.publicKey):
        accept
reject
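The pseudo-logic above can be turned into a runnable sketch under a few stated assumptions: HMAC with a per-version secret stands in for real public-key signatures, times are integer seconds since midnight, freshness is reduced to a maximum message age (nonce tracking is omitted), and the `KeyVersion` type and `max_age` parameter are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass

ACTIVE = "active"

@dataclass(frozen=True)
class KeyVersion:
    version: int
    secret: bytes      # stand-in for a public key; HMAC replaces real signatures
    status: str
    valid_from: int    # inclusive, seconds
    valid_to: int      # exclusive, seconds

def sign(secret: bytes, payload: bytes, issued_at: int) -> bytes:
    # issuedAt is part of the signed data, so it cannot be altered in transit.
    data = payload + issued_at.to_bytes(8, "big")
    return hmac.new(secret, data, hashlib.sha256).digest()

def accept(keys, payload: bytes, issued_at: int, signature: bytes,
           now: int, max_age: int = 30) -> bool:
    """Accept if the message is fresh and an active key covering issuedAt verifies it."""
    if not (0 <= now - issued_at <= max_age):
        return False
    for key in keys:
        if key.status != ACTIVE:
            continue
        if not (key.valid_from <= issued_at < key.valid_to):
            continue
        if hmac.compare_digest(sign(key.secret, payload, issued_at), signature):
            return True
    return False

# T_announce = 10:50:00 (39000 s) is an assumed value; T_cutover = 11:00:00 (39600 s).
KEYS = [
    KeyVersion(1, b"k-old", ACTIVE, 0, 39600),
    KeyVersion(2, b"k-new", ACTIVE, 39000, 10**9),
]
```

With these keys, a proof signed by the old key at 10:59:58 (39598) and received at 11:00:02 (39602) is accepted because its issuedAt falls inside the old key's interval, while an old-key signature with issuedAt after cutover is rejected.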
Operational checklist for "no downtime" rotation
- Choose T_cutover with a buffer that matches your message propagation and retry behavior.
- Sign payloads with an issuedAt included in the signed data.
- Keep K_old usable until after T_cutover (or until you are sure no in-flight messages remain).
- Provide a recovery transaction that can extend v_old or revoke v_new without requiring the lost key.
- Ensure verifiers use key validity intervals, not only "current key," during the transition.
When these pieces are in place, rotation becomes a controlled change in the registry's accepted key set rather than a risky moment where everyone must coordinate perfectly. The network keeps verifying, the node keeps operating, and the rules make it hard to accidentally accept the wrong thing.
3.5 Revocation and Slashing Preconditions Defining Misbehavior Triggers
Revocation and slashing are the network's way of turning "bad behavior" into concrete, enforceable outcomes. The key design goal is to make triggers (1) specific enough to be provable, (2) narrow enough to avoid accidental punishment, and (3) consistent with the rest of the incentive and verification pipeline.
What "misbehavior" means in practice
Misbehavior should be defined in terms of observable deviations from required behavior, not vague intent. For a DePIN node, typical required behavior includes: submitting fresh measurements, using correct identity keys, following protocol formats, and not gaming reward accounting.
A useful mental model is: trigger = evidence + rule + scope.
- Evidence is what can be checked (signed messages, on-chain events, proof artifacts, timestamps, or mismatched commitments).
- Rule is the deterministic condition that maps evidence to a violation.
- Scope is what the violation affects (revocation only, slashing amount, or temporary suspension).
Mind map: revocation and slashing triggers
Designing preconditions: the "must all hold" checklist
Before any enforcement action, the contract or enforcement module should verify a small set of preconditions. This prevents accidental slashing from partial or irrelevant evidence.
- Node eligibility: the node must be in an active or probationary set where enforcement is meaningful. If a node is already revoked, the system should treat new evidence as no-op or as a separate administrative record.
- Evidence validity window: submissions and proofs should include timestamps or sequence numbers. Enforcement should only consider evidence within the window where it is supposed to be valid.
- Identity match: the evidence must be cryptographically bound to the node's current identity key (or an explicitly authorized rotated key). If the signature does not match the node's current key mapping, the trigger should not fire.
- Rule versioning: the rule that interprets evidence must be tied to a protocol version. This avoids "rule drift," where old submissions are judged by new logic.
- Dispute-state conditions: if the protocol includes a challenge period, slashing should either occur after finality of the dispute outcome or be gated by a "dispute not successful" condition.
Trigger categories with concrete examples
Below are common trigger categories. Each includes an example of evidence and a deterministic rule.
1) Identity & key misuse
Misbehavior: a node submits messages signed by a key that is not authorized for its current node identity.
- Evidence: signed registration/heartbeat/measurement messages.
- Rule: `signature(node_key_at_time) == true` must hold, where `node_key_at_time` is resolved from the node's identity registry; the trigger fires when it does not.
- Action: immediate revocation; optional small slashing if the node posted stake under the wrong key.
Example: A node operator rotates keys but forgets to publish the rotation authorization. Their next heartbeat is signed with the new key. The contract checks the registry mapping and rejects the heartbeat as unauthorized, then revokes because the node is actively participating with an unregistered key.
2) Measurement integrity failures
Misbehavior: the node submits a measurement proof that fails verification.
- Evidence: proof artifact plus public inputs (task ID, measurement target, commitment hash, and any required metadata).
- Rule: `VerifyProof(proof, public_inputs) == true` must hold; the trigger fires when verification fails.
- Action: slashing with a moderate amount if the proof is invalid, because the node attempted to claim rewards.
Example: A task requires a signed sensor reading anchored to a commitment. The node submits a proof, but the verifier recomputes the commitment hash from the public inputs and finds a mismatch. The proof fails, so the node is slashed.
3) Freshness / replay violations
Misbehavior: the node reuses an old measurement or proof for a new task instance.
- Evidence: task instance identifiers, nonces, and timestamps included in the signed payload or proof inputs.
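- Rule: the measurement's freshness fields must match the task's expected nonce/instance ID.
- Action: slashing and revocation, matching the ReplayOrWrongNonce entry in the minimal rule set later in this section.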
Example: Task T=42 includes nonce N=abc. The node submits a proof that was generated for N=xyz. Even if the proof verifies structurally, the public inputs do not match the task instance, so the rule rejects it as a replay.
4) Double-claiming / conflicting commitments
Misbehavior: the node claims incompatible outcomes for the same task or measurement window.
- Evidence: two or more submissions tied to the same task ID and measurement window.
- Rule: for a given `(task_id, window_id)`, the node must submit at most one accepted commitment. If two accepted commitments conflict (e.g., different measurement hashes), the node violates the uniqueness constraint.
- Action: strong slashing because this indicates deliberate gaming or broken measurement discipline.
Example: A node submits two different proof commitments for the same `window_id`. Both proofs pass local verification, but the commitments differ. The contract enforces uniqueness and slashes because both commitments cannot be truthful under the same measurement constraints.
5) Protocol compliance failures
Misbehavior: the node submits malformed or non-conforming payloads that prevent verification.
- Evidence: submission format, missing fields, invalid serialization, incorrect domain separation tags.
- Rule: `PayloadSchemaValid == true` and `DomainTag == expected` must both hold.
- Action: revocation first; slashing only if repeated or if the node clearly attempted to claim rewards.
Example: The node submits a measurement with the correct signature but uses the wrong domain tag, causing the verifier to treat it as a different protocol context. The system revokes to stop further confusion and only slashes after a second offense within a defined period.
6) Liveness failures
Misbehavior: the node fails to maintain required responsiveness.
- Evidence: heartbeats, task acknowledgements, or result submission deadlines.
- Rule: the trigger fires if `now > last_heartbeat + heartbeat_interval * k`, or if acknowledgements miss deadlines beyond tolerance.
- Action: temporary suspension or revocation without slashing, unless the protocol explicitly treats non-responsiveness as stake-worthy.
Example: The protocol expects a heartbeat every 60 seconds. The node misses three consecutive intervals. The system suspends the node to protect service quality, but does not slash because the node's stake is intended to cover provable fraud, not accidental downtime.
Guardrails that keep enforcement fair
- Thresholding for quality-related triggers: if a trigger depends on a quality score or uncertainty bounds, enforce thresholds that tolerate normal variance. Otherwise, you'll punish nodes for measurement noise.
- Separate "revocation" from "slashing": revocation can be immediate when the node is clearly ineligible (wrong key, invalid schema). Slashing should require stronger evidence (failed proof verification, replay, or conflicting commitments).
- Rate limit enforcement actions: allow only one enforcement event per node per task window to avoid repeated punishments from the same underlying issue.
- Clear scope: define whether the action affects only the node's eligibility for future tasks, or also reduces its stake used for specific roles.
Minimal enforcement rule set (example)
A compact way to implement triggers is to define a small set of deterministic checks, each mapping to an action.
    Trigger: UnauthorizedKey
        Evidence: signature by key not in registry
        Preconditions: node eligible; evidence within window
        Rule: signature_valid && key != current_key
        Action: revoke; no slash (or small slash)
    Trigger: InvalidProof
        Evidence: proof + public inputs
        Preconditions: node eligible; task finality reached
        Rule: VerifyProof(proof, inputs) == false
        Action: slash; revoke
    Trigger: ReplayOrWrongNonce
        Evidence: task_id, nonce/instance_id in signed payload
        Preconditions: node eligible; evidence within window
        Rule: payload_nonce != task_nonce
        Action: slash; revoke
    Trigger: ConflictingCommitments
        Evidence: two accepted submissions for same (task_id, window_id)
        Preconditions: both submissions finalized; identity matches
        Rule: commitment_hash_1 != commitment_hash_2
        Action: strong slash; revoke
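One way to read the rule set is as an ordered list of checks that returns the first trigger that fires. The sketch below assumes a hypothetical `Evidence` record whose fields have already been computed by the verifier, and it collapses the per-trigger preconditions into two shared flags for brevity; a real enforcement module would check each trigger's own preconditions.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Actions per trigger, copied from the rule set above.
ACTIONS = {
    "UnauthorizedKey": ("revoke",),
    "InvalidProof": ("slash", "revoke"),
    "ReplayOrWrongNonce": ("slash", "revoke"),
    "ConflictingCommitments": ("strong_slash", "revoke"),
}

@dataclass(frozen=True)
class Evidence:  # hypothetical evidence bundle, for illustration only
    node_eligible: bool
    within_window: bool
    signature_valid: bool
    key_is_current: bool
    proof_valid: bool
    payload_nonce: str
    task_nonce: str
    accepted_commitments: tuple = ()  # hashes for one (task_id, window_id)

def first_trigger(e: Evidence) -> Optional[str]:
    # Shared preconditions: without them, no trigger may fire.
    if not (e.node_eligible and e.within_window):
        return None
    if e.signature_valid and not e.key_is_current:
        return "UnauthorizedKey"
    if not e.proof_valid:
        return "InvalidProof"
    if e.payload_nonce != e.task_nonce:
        return "ReplayOrWrongNonce"
    if len(set(e.accepted_commitments)) > 1:
        return "ConflictingCommitments"
    return None

clean = Evidence(True, True, True, True, True, "abc", "abc", ("h1",))
replayed = replace(clean, payload_nonce="xyz")
print(first_trigger(clean), first_trigger(replayed))
```

Returning only the first trigger is a design choice that lines up with the rate-limiting guardrail: one enforcement event per node per task window.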
Practical example walkthrough: from submission to enforcement
- A node submits a measurement for task `T=42` with nonce `N=abc`.
- The verifier checks proof validity and nonce matching.
- If the proof fails, the system records an "InvalidProof" evidence bundle.
- After the dispute window closes (or immediately if disputes are not supported for that trigger), the contract evaluates preconditions: node eligibility, identity match, and rule version.
- The contract then revokes the node and applies slashing according to the trigger category.
This flow ensures that enforcement is not a reaction to a single bad packet. It is a structured outcome based on evidence that the protocol already knows how to verify.
4. Incentives, Payments, and Reward Settlement
4.1 Incentive Objectives Mapping: Throughput, Coverage, and Quality
A DePIN incentive system usually pays for outcomes, not effort. The tricky part is translating "outcomes" into measurable objectives that (1) operators can influence, (2) clients can understand, and (3) the protocol can verify without guessing.
Step 1: Define the three objectives as measurable targets
Throughput answers: How much useful work gets done per unit time? In practice, it's about the rate of accepted tasks or the rate of valid proofs.
Coverage answers: How broadly does the network serve the space of requests? Coverage is not just "more tasks," but "tasks across the relevant variety," such as locations, time windows, device types, or customer segments.
Quality answers: How correct and useful are the results? Quality is about proof validity, measurement accuracy, and consistency with constraints.
A useful mental model: throughput is volume, coverage is distribution, quality is correctness. If you optimize only one, the system will happily misbehave in predictable ways.
Step 2: Map each objective to concrete metrics
Below is a practical mapping you can adapt.
| Objective | Metric (example) | What counts as "good" | What counts as "bad" |
|---|---|---|---|
| Throughput | AcceptedProofsPerHour | Proofs that pass verification and meet freshness | Rejected proofs, stale proofs, duplicates |
| Coverage | UniqueSegmentsServed | Distinct segments (e.g., zones) with at least one accepted proof | Repeated proofs in the same segment while other segments are empty |
| Quality | QualityScore | Score derived from accuracy bounds and consistency checks | Out-of-bounds measurements, inconsistent evidence |
Easy example: Suppose you run a network that verifies "air quality readings" from sensors.
- Throughput: how many readings are verified per hour.
- Coverage: readings across neighborhoods (segments).
- Quality: how close readings are to expected ranges and how consistent they are with cross-checks.
Step 3: Decide what the operator can control
Incentives should reward things operators can influence.
- Operators can usually control which tasks they attempt, how quickly they respond, and how they prepare evidence.
- Operators often cannot control client request patterns or external conditions.
So you should avoid paying purely for outcomes that depend heavily on factors outside operator control. Instead, normalize by what the operator actually did.
Example: If a client sends 10,000 tasks concentrated in one city, coverage across other cities is impossible for operators who are not assigned those tasks. The protocol can still measure coverage, but rewards should be based on relative contribution to coverage, not on absolute coverage alone.
Step 4: Use a scoring model that combines objectives without double-counting
A common mistake is to let quality dominate so much that throughput and coverage become irrelevant, or to let throughput dominate so that operators spam low-quality proofs.
A straightforward approach is a weighted score per accepted task, then aggregate over a time window.
Let each accepted proof \(i\) have:
- \(q_i \in [0,1]\): quality score
- \(s_i \in [0,1]\): segment contribution score (coverage)
- \(t_i \in [0,1]\): timeliness score (throughput component)
Then define a per-proof reward weight: \[ W_i = \alpha\, t_i + \beta\, s_i + \gamma\, q_i \] with \(\alpha+\beta+\gamma = 1\).
Aggregate over the window, where the numerator sums over the operator's own accepted proofs and the denominator over all accepted proofs in the window: \[ \text{Reward} = R_{\text{window}} \cdot \frac{\sum_i W_i}{\sum_{j \in \text{all accepted}} W_j} \]
This structure has two benefits:
- It keeps the system from paying twice for the same thing. Quality is only counted through \(q_i\), coverage only through \(s_i\), and timeliness only through \(t_i\).
- It makes the reward comparable across operators even if they submit different numbers of proofs.
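The weighting and aggregation formulas can be sketched as two small functions. The function names and the dictionary input shape are illustrative assumptions; each proof is represented as a `(t, s, q)` tuple with all scores already in \([0,1]\).

```python
def proof_weight(t, s, q, alpha, beta, gamma):
    """W_i = alpha*t_i + beta*s_i + gamma*q_i, with weights summing to 1."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return alpha * t + beta * s + gamma * q

def window_rewards(proofs_by_operator, window_budget, alpha, beta, gamma):
    """Split window_budget in proportion to each operator's summed weights."""
    totals = {
        op: sum(proof_weight(t, s, q, alpha, beta, gamma) for (t, s, q) in proofs)
        for op, proofs in proofs_by_operator.items()
    }
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No accepted weight in the window: nothing to distribute.
        return {op: 0.0 for op in totals}
    return {op: window_budget * w / grand_total for op, w in totals.items()}

rewards = window_rewards(
    {"A": [(1.0, 1.0, 1.0)], "B": [(1.0, 1.0, 0.0), (1.0, 1.0, 0.0)]},
    window_budget=100.0, alpha=0.25, beta=0.25, gamma=0.5,
)
print(rewards)
```

In this toy input, operator B needs two zero-quality proofs to match the weight of A's single high-quality proof, which is the behavior the weighting is meant to produce.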
Concrete example: In an hour, Operator A submits 50 accepted proofs, Operator B submits 30.
- A has high timeliness but mostly in one segment.
- B has slightly lower timeliness but covers more segments and has strong quality.
With coverage included via \(s_i\), B can still earn a meaningful share of rewards even with fewer proofs.
Step 5: Define timeliness and freshness so throughput is meaningful
Throughput metrics should not reward stale work.
A practical timeliness score: \[ t_i = \max\left(0, 1 - \frac{\Delta_i}{\Delta_{\max}}\right) \] where \(\Delta_i\) is the time between task assignment and proof acceptance, and \(\Delta_{\max}\) is the deadline window.
Example: If \(\Delta_{\max} = 30\) minutes:
- Proof accepted in 5 minutes gets \(t_i = 1 - 5/30 \approx 0.833\).
- Proof accepted in 40 minutes gets \(t_i = 0\) and contributes nothing to throughput.
This prevents "fast enough" from turning into "eventually accepted."
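As a quick check of the timeliness formula, here is a minimal sketch; the function name and the choice of minutes as the unit are assumptions.

```python
def timeliness(delta_minutes: float, delta_max_minutes: float) -> float:
    """t_i = max(0, 1 - delta_i / delta_max)."""
    if delta_max_minutes <= 0:
        raise ValueError("deadline window must be positive")
    return max(0.0, 1.0 - delta_minutes / delta_max_minutes)

print(round(timeliness(5, 30), 3))  # proof accepted well inside the window
print(timeliness(40, 30))           # proof accepted after the deadline
```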
Step 6: Define coverage as diversity, not just counts
Coverage needs a definition of "segment." Common segment choices:
- geographic zone
- device model
- network slice
- time bucket
- request category
Then define segment contribution for a proof.
One simple method: for each segment \(k\), compute whether the operator contributed at least one accepted proof in that segment during the window.
- Let \(I_k = 1\) if the operator has any accepted proof in segment \(k\), else 0.
- Let \(K\) be the number of relevant segments.
Then define the coverage score for a proof in segment \(k\): \[ s_i = \frac{I_k}{K} \]
Example: If there are 10 segments in the hour and Operator B has accepted proofs in 4 segments, then each proof in those segments gets \(s_i = 1/10\). Counting each segment once, Operator B's coverage contribution is \(4/10\), proportional to how many segments it actually touched. Because \(s_i\) is attached to every proof, an implementation that sums \(s_i\) over proofs should credit each (operator, segment) pair only once; otherwise repeated proofs in one segment would inflate coverage.
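A minimal sketch of the segment calculation, assuming segments are identified by strings and that each segment is credited once per operator; the function name and return shape are illustrative.

```python
def segment_coverage(proof_segments, relevant_segments):
    """Return ({segment: I_k / K} for touched segments, fraction of segments touched)."""
    k = len(relevant_segments)
    if k == 0:
        raise ValueError("at least one relevant segment is required")
    # A set credits each segment once, however many proofs landed in it.
    touched = set(proof_segments) & set(relevant_segments)
    per_segment = {seg: 1.0 / k for seg in touched}
    return per_segment, len(touched) / k

zones = [f"z{i}" for i in range(1, 11)]
per_segment, fraction = segment_coverage(["z1", "z1", "z2", "z3", "z4"], zones)
print(fraction)
```

Here the two proofs in `z1` count once, so the operator's coverage fraction reflects four segments out of ten.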
Step 7: Define quality so it can be verified and explained
Quality should be derived from verification outputs.
Typical quality inputs:
- measurement error bound (e.g., within tolerance)
- consistency checks (e.g., cross-proof agreement)
- proof completeness (e.g., required fields present)
A simple quality score: \[ q_i = \begin{cases} 1 - \frac{e_i}{e_{\text{tol}}} & \text{if } e_i \le e_{\text{tol}} \\ 0 & \text{otherwise} \end{cases} \] where \(e_i\) is the observed error and \(e_{\text{tol}}\) is the tolerance.
Example: If tolerance is 2.0 units:
- error 0.5 gives \(q_i = 1 - 0.25 = 0.75\)
- error 2.5 gives \(q_i = 0\)
This makes quality continuous instead of binary, which helps operators improve rather than guess.
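The quality formula translates directly into code. This sketch assumes non-negative errors and a positive tolerance; the function name is illustrative.

```python
def quality_score(error: float, tolerance: float) -> float:
    """q_i = 1 - e_i / e_tol when e_i <= e_tol, else 0."""
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    if error < 0:
        raise ValueError("error must be non-negative")
    return 1.0 - error / tolerance if error <= tolerance else 0.0

print(quality_score(0.5, 2.0))
print(quality_score(2.5, 2.0))
```

Note that an error exactly equal to the tolerance also scores 0 under this formula, so the score falls linearly to zero at the boundary rather than dropping off a cliff.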
Mind map: Incentive objective mapping
Step 8: Choose weights with a "what breaks first" mindset
Weights \(\alpha, \beta, \gamma\) should reflect what you can tolerate.
- If you set \(\alpha\) too high, you'll get fast but repetitive submissions.
- If you set \(\beta\) too high, you may get broad coverage with mediocre measurements.
- If you set \(\gamma\) too high, you may get careful submissions that arrive too slowly.
Example configuration for an air-quality network:
- Quality is essential: \(\gamma = 0.6\)
- Coverage matters for fairness across neighborhoods: \(\beta = 0.25\)
- Throughput keeps the system responsive: \(\alpha = 0.15\)
This doesn't mean throughput is unimportant; it means quality mistakes are more expensive than slower responses.
Step 9: Validate the mapping with a small simulation thought experiment
Before coding, test the mapping with a few operator profiles.
Profiles:
- Operator Fast: high timeliness, low quality, single segment.
- Operator Balanced: medium timeliness, medium quality, multiple segments.
- Operator Careful: lower timeliness, high quality, many segments.
Expected outcome:
- Fast should not dominate because low \(q_i\) and low \(s_i\) reduce \(W_i\).
- Careful should score well due to high \(q_i\) and \(s_i\), even if \(t_i\) is lower.
- Balanced should land in the middle, with coverage preventing it from being treated like "just another fast operator."
This is the practical reason for mapping: it lets you predict how incentives behave under realistic operator strategies.
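The thought experiment can be run in a few lines. The per-profile scores below are made-up illustrative numbers, not measurements, and \(s\) is treated as a given input in \([0,1]\); the weights are the example configuration from Step 8.

```python
ALPHA, BETA, GAMMA = 0.15, 0.25, 0.60  # example weights from Step 8

# Hypothetical average (t, s, q) per proof for each profile; illustrative only.
PROFILES = {
    "Fast": (0.9, 0.1, 0.3),
    "Balanced": (0.6, 0.4, 0.6),
    "Careful": (0.4, 0.8, 0.9),
}

def weight(t, s, q):
    # Per-proof weight W = alpha*t + beta*s + gamma*q.
    return ALPHA * t + BETA * s + GAMMA * q

ranking = sorted(PROFILES, key=lambda name: weight(*PROFILES[name]), reverse=True)
print(ranking)
```

Under these assumed inputs the ranking is Careful, Balanced, Fast. Rerunning with a much larger \(\alpha\) is a cheap way to see when the Fast profile starts to win.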
Step 10: Make the objectives visible in receipts and dashboards
Operators should be able to see why they earned what they earned.
A clean receipt per window can include:
- number of accepted proofs
- average timeliness score
- segment diversity count
- average quality score
- final weighted score share
Example receipt fields:
- Accepted: 42
- Avg timeliness: 0.72
- Segments covered: 6/10
- Avg quality: 0.88
- Weighted score share: 18.4%
When these numbers are present, the incentive system becomes easier to operate and harder to game, because the feedback loop is immediate and specific.
4.2 Reward Accounting Example: Metered Rewards With Quality Multipliers
Reward accounting is where "the network did useful work" becomes "the network pays for useful work." A good design makes three things easy: (1) measuring what happened, (2) turning measurements into a payout number, and (3) reconciling payouts with on-chain records.
The scenario
Assume a DePIN network where operators submit measurements for physical infrastructure tasks (e.g., sensor readings, coverage checks, or service availability proofs). Each task has:
- Unit of work: one measurement submission for a specific task instance.
- Base reward: a fixed amount per unit of work.
- Quality multiplier: a factor that scales the base reward based on verification outcomes.
- Metering: a cap on how many units an operator can earn in a time window.
Operators submit results; verifiers validate them and assign a quality score. The protocol then computes rewards deterministically.
Mind map: reward accounting components
Step 1: Define the unit and the base reward
Let each verified task instance count as one unit. For simplicity, use a base reward of 10 tokens per unit.
- BaseReward: \(B = 10\)
- Unit: one verified submission tied to a unique \(taskId\)
To prevent double payment, every payout must be keyed by a unique tuple such as \((operatorId, taskId)\). If the same tuple is finalized twice, the second attempt should produce zero additional payout.
Step 2: Convert quality into a multiplier
Quality multipliers should be monotonic (higher quality never reduces reward) and bounded (so one operator can't earn infinite value from a single outlier).
A practical approach is a piecewise multiplier based on a quality score \(q\) in \([0,1]\):
\[
M(q) =
\begin{cases}
0 & \text{if the fail flag is set} \\
0.5 & \text{if } 0 \le q < 0.6 \\
0.8 & \text{if } 0.6 \le q < 0.8 \\
1.0 & \text{if } 0.8 \le q \le 1.0
\end{cases}
\]
Then the per-unit reward is: \[ R_{unit} = B \cdot M(q) \]
Example outcomes:
- Quality \(q=0.55\) → multiplier 0.5 → reward \(5\)
- Quality \(q=0.75\) → multiplier 0.8 → reward \(8\)
- Quality \(q=0.92\) → multiplier 1.0 → reward \(10\)
- Any fail flag → multiplier 0 → reward \(0\)
This design keeps the verification logic separate from payout math. Verifiers produce \(q\) and flags; accounting applies the multiplier.
Step 3: Meter rewards with a per-window cap
Metering prevents a single operator from dominating payouts due to volume. Define:
- Window: a fixed interval \([t_0, t_1)\)
- CapUnits: maximum units eligible for payout per operator per window
Let \(cap = 100\) units per operator per window. If an operator has more than 100 verified units in the window, only the first 100 eligible units count.
To make "first" deterministic, define an ordering rule, such as:
- sort by \(taskId\) ascending, or
- sort by verification finalization timestamp, or
- sort by an explicit sequence number assigned at submission time.
Example:
- Operator A has 120 verified units in the window.
- The protocol selects the first 100 units by the ordering rule.
- The remaining 20 units are recorded but do not contribute to payout.
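A minimal sketch of deterministic metering, assuming each verified unit carries an explicit sequence number assigned at submission time (one of the ordering rules listed above); the tuple layout and function name are illustrative.

```python
def apply_cap(units, cap):
    """Split verified units into (counted, recorded_only) by ascending sequence number.

    units: iterable of (sequence_number, task_id) tuples.
    """
    if cap < 0:
        raise ValueError("cap must be non-negative")
    ordered = sorted(units, key=lambda u: u[0])
    return ordered[:cap], ordered[cap:]

# 120 verified units arriving in arbitrary order, cap of 100 per window.
units = [(seq, f"task-{seq}") for seq in reversed(range(120))]
counted, recorded_only = apply_cap(units, cap=100)
print(len(counted), len(recorded_only))
```

Sorting before slicing is what makes the result independent of arrival order, so any two parties replaying the same window agree on which units were paid.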
Step 4: Compute total payout deterministically
For operator \(o\), let eligible units be \(U_o\) after applying eligibility and metering. Total payout is: \[ P_o = \sum_{u \in U_o} R_{unit}(u) \]
Example calculation:
- Operator A eligible units: 6 units
- Quality outcomes (in eligible order): \([0.92, 0.75, 0.55, 0.81, \text{fail}, 0.65]\)
- Rewards per unit: \([10, 8, 5, 10, 0, 8]\)
- Total payout: \(10+8+5+10+0+8 = 41\) tokens
Step 5: Precision, rounding, and accounting hygiene
Even if multipliers are simple, you still need a consistent precision policy.
Common rule:
- Represent rewards in integer smallest units (e.g., 1 token = 1,000,000 "microtokens").
- Store multipliers as rational numbers or scaled integers.
For the piecewise multipliers above, you can encode:
- 0.5 as \(1/2\)
- 0.8 as \(4/5\)
- 1.0 as \(1\)
Then compute: \[ R_{unit} = B \cdot \frac{num}{den} \] and round down to avoid paying more than intended.
Example with integer math:
- \(B = 10\) tokens
- multiplier 0.8 = \(4/5\)
- \(R_{unit} = 10 \cdot 4 / 5 = 8\) exactly
If you later introduce multipliers that don't divide cleanly, rounding down keeps the protocol conservative.
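The tiered multiplier and the integer rounding policy can be combined in one sketch. It assumes the quality score is carried as integer basis points (0 to 10,000) so that no floating-point comparison is needed; that encoding and all names are illustrative choices, not part of the design above.

```python
MICRO = 1_000_000  # 1 token = 1,000,000 microtokens

def multiplier_fraction(q_bp: int, failed: bool):
    """Return the tier multiplier as (numerator, denominator)."""
    if failed:
        return (0, 1)
    if not 0 <= q_bp <= 10_000:
        raise ValueError("quality must be within 0..10000 basis points")
    if q_bp < 6_000:
        return (1, 2)   # 0.5
    if q_bp < 8_000:
        return (4, 5)   # 0.8
    return (1, 1)       # 1.0

def unit_reward_micro(base_micro: int, q_bp: int, failed: bool = False) -> int:
    num, den = multiplier_fraction(q_bp, failed)
    return base_micro * num // den  # floor division rounds down

def total_payout_micro(base_micro, outcomes):
    """outcomes: iterable of (q_bp, failed) in eligible order."""
    return sum(unit_reward_micro(base_micro, q, f) for q, f in outcomes)

outcomes = [(9200, False), (7500, False), (5500, False),
            (8100, False), (0, True), (6500, False)]
print(total_payout_micro(10 * MICRO, outcomes) // MICRO)
```

Multiplying before dividing keeps as much precision as integer math allows; the only loss is the final floor, which always favors the protocol.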
Step 6: On-chain events for reconciliation
Accounting should emit enough data to reconcile payouts without re-running the entire verification pipeline.
Minimum event fields per finalized unit payout:
- \(operatorId\)
- \(taskId\)
- \(qualityScore\) (or a compact quality tier)
- \(multiplier\) tier
- \(rewardAmount\)
- \(windowId\)
A separate summary event can include:
- \(operatorId\)
- \(windowId\)
- \(totalUnitsCounted\)
- \(totalPayout\)
This separation helps auditors and operators: unit events explain "why," while summary events explain "how much."
Worked mini-example with metering
Assume:
- \(B=10\)
- \(cap=3\) units per window
- Operator B has 5 verified units with quality \([0.92, 0.55, 0.75, 0.81, 0.65]\)
- Ordering rule selects the first 3 eligible units: \([0.92, 0.55, 0.75]\)
Compute:
- \(0.92\) → 1.0 → 10
- \(0.55\) → 0.5 → 5
- \(0.75\) → 0.8 → 8
Total payout: \(10+5+8=23\) tokens.
The remaining two units are still recorded for transparency, but they do not affect payout due to metering.
Implementation checklist (accounting-focused)
- Uniqueness: key payouts by \((operatorId, taskId)\).
- Eligibility: confirm membership and window inclusion before counting.
- Metering: apply cap using a deterministic ordering rule.
- Multiplier: use bounded, monotonic quality tiers.
- Math: compute in integer micro-units; round down consistently.
- Events: emit unit-level and summary-level records for reconciliation.
4.3 Escrow, Dispute Windows, and Finality: Payment With a Challenge Period
A DePIN payment flow usually has three goals that pull in different directions: (1) pay operators when work is valid, (2) avoid paying for bad or manipulated measurements, and (3) keep the system responsive. Escrow plus a challenge window is the standard compromise: money is held temporarily, then released if no valid dispute arrives.
Escrow: what it holds and why it exists
Escrow is a locked balance tied to a specific job or measurement claim. It prevents two common failure modes:
- Premature release: If you pay immediately, you need perfect verification at submission time. In practice, verification is often multi-stage and may depend on later evidence.
- Unbounded liability: Without escrow, an operator might receive funds and disappear before disputes are resolved.
A practical escrow design includes these fields:
- Escrow ID (unique per claim or per job)
- Payer (client or app contract)
- Payee (operator or operator group)
- Amount and currency/denomination
- Claim reference (job ID, measurement ID, or proof hash)
- Release condition (time-based and/or verification-based)
- Dispute window parameters (start, end, and required evidence)
Example: escrow per measurement claim
A client requests a measurement for a site. The client pays 10 tokens into escrow tied to measurementId = 77.
- If the operator submits a proof for measurement 77 and the proof passes basic checks, the escrow remains locked.
- If no dispute is raised before the challenge window ends, the escrow releases to the operator.
- If a dispute is raised, the escrow stays locked until resolution.
This structure keeps the payment logic simple: "money moves only when the claim survives the window."
Dispute windows: how long, when they start, and what they require
A dispute window is a fixed period during which someone can challenge the claim. The key is to define when the window starts and what evidence is needed.
Choosing the start time
Common start triggers:
- Submission time: window starts when the operator posts the claim.
- Verification time: window starts when the claim is marked "verifiable" (e.g., after required off-chain artifacts are available).
- On-chain finalization of inputs: window starts when the proof hash and relevant metadata are anchored.
For readability and predictable behavior, many systems start the window when the claim is anchored on-chain (proof hash + job ID). That way, the challenger can rely on a stable reference.
Evidence requirements
A dispute should not be "I disagree." It should be "here is the specific reason the claim fails." Evidence requirements typically include:
- Counter-evidence (e.g., alternate measurement, invalid signature, inconsistent sensor readings)
- Freshness proof (e.g., nonce or timestamp binding)
- Scope proof (showing the challenger is disputing the correct job/measurement)
- Format compliance (proofs must match the expected schema)
To keep disputes from becoming expensive theater, require challengers to submit evidence that can be checked deterministically or with bounded computation.
Example: dispute window with two-stage evidence
- Stage A (quick check): challenger submits a counter-proof hash and a short validity witness.
- Stage B (full evidence): if Stage A passes, challenger must submit full evidence before a later deadline.
This reduces wasted work on obviously invalid disputes while still allowing meaningful challenges.
Finality: what "done" means in a payment flow
Finality is not just "the transaction is mined." It means the system has reached a state where the escrow can be released or refunded according to rules that cannot be changed by later events.
In escrow-based designs, finality usually has two layers:
- Claim finality: the claim is accepted or rejected after disputes.
- Payment finality: escrow release or refund has executed.
A clean approach is to tie payment finality to claim finality. That is, escrow release happens only after the contract records the claim outcome.
Example: finality states for a claim
Use explicit states to avoid ambiguity:
- `PendingVerification`
- `ChallengeOpen`
- `DisputeInProgress`
- `Accepted`
- `Rejected`
- `Released`
- `Refunded`
Then define transitions so that `Released` is reachable only through `Accepted`.
Putting it together: a concrete payment-with-challenge example
Assume:
- Client deposits 10 tokens into escrow for `measurementId = 77`.
- Operator submits a proof at block time `T0`.
- Challenge window lasts 3 hours.
- If a dispute is raised, resolution takes 1 hour.
Timeline
- T0: Operator submits claim `C77` with `proofHash = H1`.
- T0 + 0: Contract anchors `C77` and sets `challengeEndsAt = T0 + 3h`.
- T0 + 2h: A challenger submits dispute `D77` with evidence `E`.
- T0 + 2h + Δ: Contract verifies dispute eligibility (correct job, evidence format, required bonds).
- T0 + 2h + 1h: Resolution occurs.
  - If claim is valid: state becomes `Accepted`, escrow releases to operator.
  - If claim is invalid: state becomes `Rejected`, escrow refunds to client (or pays challenger depending on policy).
- If no dispute: at `challengeEndsAt`, state becomes `Accepted` and escrow releases.
The important detail is that the contract never releases funds during ChallengeOpen or DisputeInProgress.
Mind map: escrow, dispute, and finality
Design details that prevent edge-case headaches
- Idempotent submissions: If an operator retries a submission, the contract should recognize duplicates by `measurementId` and `proofHash`.
- Dispute eligibility checks: Require challengers to prove they target the correct claim reference and submit evidence in the expected format.
- Bonds and penalties (policy-driven): A challenger bond discourages spam disputes. If the dispute fails, the bond can be forfeited to the client or burned.
- Deterministic resolution rules: Resolution should rely on verifiable inputs already anchored (proof hashes, job parameters, and evidence hashes) to avoid "we'll figure it out later."
- Clear refund rules: Decide whether refunds go to the client, are split, or pay a challenger. Keep it consistent with your incentive model.
Minimal state machine sketch (conceptual)
    stateDiagram-v2
        [*] --> ChallengeOpen: claim anchored
        ChallengeOpen --> DisputeInProgress: dispute submitted
        ChallengeOpen --> Accepted: challengeEndsAt reached
        DisputeInProgress --> Accepted: resolution valid
        DisputeInProgress --> Rejected: resolution invalid
        Accepted --> Released: release executed
        Rejected --> Refunded: refund executed
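The same transitions can be expressed as a lookup table, which makes it easy to check that no event releases funds during `ChallengeOpen` or `DisputeInProgress`. The event names below are hypothetical labels chosen for this sketch; the states come from the diagram.

```python
# (state, event) -> next state; anything not listed is an illegal transition.
TRANSITIONS = {
    ("ChallengeOpen", "dispute_submitted"): "DisputeInProgress",
    ("ChallengeOpen", "challenge_window_elapsed"): "Accepted",
    ("DisputeInProgress", "claim_upheld"): "Accepted",
    ("DisputeInProgress", "claim_rejected"): "Rejected",
    ("Accepted", "release_executed"): "Released",
    ("Rejected", "refund_executed"): "Refunded",
}

def advance(state: str, event: str) -> str:
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} in state {state!r}") from None

def run(events, start="ChallengeOpen"):
    state = start
    for event in events:
        state = advance(state, event)
    return state

# Every state that can move directly to Released.
sources_of_released = {s for (s, _), nxt in TRANSITIONS.items() if nxt == "Released"}
print(run(["dispute_submitted", "claim_rejected", "refund_executed"]))
```

Keeping the table explicit means the "only `Accepted` can release" property is a one-line check on data rather than something inferred from scattered conditionals.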
Example payout outcomes (simple policy)
- No dispute: operator receives 10 tokens.
- Valid dispute: client receives 10 tokens back; operator receives 0.
- Invalid dispute: the claim stands, so the operator receives 10 tokens; challenger bond is forfeited.
This policy is easy to reason about because each outcome maps to a single escrow action.
Summary
Escrow holds funds tied to a specific claim, the dispute window defines when challenges are allowed and what evidence must accompany them, and finality is achieved only when the contract records an accepted/rejected outcome and executes the corresponding release/refund. When these pieces are connected through explicit state transitions, payment becomes predictable even when verification is imperfect and disputes happen.
4.4 Fee Model Design Example: Separating Client Fees From Operator Rewards
A fee model does two jobs at once: it funds the network's operations and it aligns incentives for operators to produce usable results. The cleanest way to keep those goals from stepping on each other is to separate client fees (who pays for a request) from operator rewards (who earns for verified work). Below is a concrete design that you can implement without turning accounting into a full-time hobby.
Design goals
- Client pays for service: The client's payment should cover request costs and any protocol overhead.
- Operator earns for quality: Operator rewards should depend on verified outcomes, not on who happened to submit first.
- No hidden cross-subsidy: Client fees should not silently become operator rewards unless the rules explicitly say so.
- Auditable accounting: Every unit of value should have a clear destination: treasury, escrow, operator payout, or refunds.
Core components
- Client fee (CF): Paid by the client when submitting a request.
- Operator reward pool (ORP): A portion of CF reserved for operators, released only after verification.
- Protocol overhead (PO): A portion of CF reserved for network costs (e.g., dispute handling, indexing, or verifier incentives).
- Escrow (E): CF is locked at request creation to prevent "pay later" behavior.
- Settlement: After verification, escrow is split into ORP and PO, with ORP further split among operators.
Mind map: Fee flow and responsibilities
Concrete fee formula
Let a request specify:
- \(U\): expected work units (integer)
- \(q\): quality tier (integer)
- \(s\): number of operators required for the request (integer)
Define:
- Base fee per unit: \(p_{unit}\)
- Quality multiplier: \(m(q)\)
- Operator reward share: \(r\), where \(0 \le r \le 1\)
- Overhead share: \(1-r\)
Then:
\[ CF = U \cdot p_{unit} \cdot m(q) \]
\[ ORP = r \cdot CF \]
\[ PO = (1-r) \cdot CF \]
This keeps the model simple: clients fund the request, and the split ratio decides how much of that funding becomes operator money.
Example numbers
Assume:
- \(p_{unit} = 0.10\) tokens
- \(m(q) = 1.5\) for tier 2
- \(U = 200\)
- \(r = 0.70\)
Compute:
- \(CF = 200 \cdot 0.10 \cdot 1.5 = 30\) tokens
- \(ORP = 0.70 \cdot 30 = 21\) tokens
- \(PO = 9\) tokens
Now the separation is explicit: operators can only earn from ORP, and the protocol overhead is paid from PO.
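The split above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. It assumes amounts are stored as integers in a hypothetical base unit (1 token = 1,000,000 units), that m(q) is stored as fixed-point hundredths, and that r is stored in basis points. The names split_fee, FeeSplit, TOKEN, MULT_SCALE, and BPS are invented for this example.

```python
from dataclasses import dataclass

TOKEN = 1_000_000   # assumed base units per token (hypothetical)
MULT_SCALE = 100    # m(q) as fixed-point hundredths: 150 means 1.5
BPS = 10_000        # r in basis points: 7_000 means 0.70


@dataclass(frozen=True)
class FeeSplit:
    cf: int   # client fee locked in escrow
    orp: int  # operator reward pool
    po: int   # protocol overhead


def split_fee(units: int, p_unit: int, mult: int, r_bps: int) -> FeeSplit:
    """Compute CF = U * p_unit * m(q), then split it into ORP and PO."""
    if not 0 <= r_bps <= BPS:
        raise ValueError("r must satisfy 0 <= r <= 1")
    cf = units * p_unit * mult // MULT_SCALE
    orp = cf * r_bps // BPS
    po = cf - orp  # remainder goes to overhead, so ORP + PO always equals CF
    return FeeSplit(cf, orp, po)


# Worked example from the text: U = 200, p_unit = 0.10, m(q) = 1.5, r = 0.70
split = split_fee(units=200, p_unit=TOKEN // 10, mult=150, r_bps=7_000)
print(split.cf // TOKEN, split.orp // TOKEN, split.po // TOKEN)
```

Computing PO as the remainder, rather than with a second floor division, is a design choice in this sketch: it guarantees that no base units are lost to rounding.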
Operator reward distribution
Operators should not all get the same amount automatically. A practical approach is to distribute ORP based on verified quality scores.
Let each operator \(i\) submit a result with:
- \(score_i \ge 0\)
- \(eligible_i \in \{0,1\}\)
Define total eligible score: \[ S = \sum_i eligible_i \cdot score_i \]
If \(S = 0\), no operator payout occurs and ORP follows the refund policy.
Otherwise, operator payout: \[ reward_i = ORP \cdot \frac{eligible_i \cdot score_i}{S} \]
Example distribution
Suppose 3 operators are required (\(s = 3\)). Verification yields:
- Operator A: eligible, score 80
- Operator B: eligible, score 60
- Operator C: ineligible (fails threshold), score ignored
Then:
- \(S = 80 + 60 = 140\)
- \(reward_A = 21 \cdot 80/140 = 12\) tokens
- \(reward_B = 21 \cdot 60/140 = 9\) tokens
- Operator C gets 0
PO is paid regardless of operator eligibility because it covers the verification process itself.
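The score-weighted distribution can be sketched as follows. This is an illustration under stated assumptions: amounts are integers in a hypothetical base unit (1,000,000 per token), payouts are floored, and any rounding remainder is reported separately so the refund policy can decide what to do with it. The function name distribute_orp and the score shown for Operator C are made up for the example.

```python
def distribute_orp(orp, results):
    """results maps operator -> (eligible, score); returns (payouts, leftover)."""
    for _eligible, score in results.values():
        if score < 0:
            raise ValueError("scores must be non-negative")
    total = sum(score for eligible, score in results.values() if eligible)
    if total == 0:
        # S = 0: nobody is paid and the whole ORP follows the refund policy.
        return {op: 0 for op in results}, orp
    payouts = {
        op: (orp * score // total if eligible else 0)
        for op, (eligible, score) in results.items()
    }
    leftover = orp - sum(payouts.values())  # rounding remainder, if any
    return payouts, leftover


payouts, leftover = distribute_orp(
    21_000_000,  # 21 tokens at an assumed 1,000,000 base units per token
    {"A": (True, 80), "B": (True, 60), "C": (False, 30)},
)
print(payouts, leftover)
```

With these inputs Operator A receives the equivalent of 12 tokens, Operator B receives 9, Operator C receives nothing, and no remainder is left.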
Escrow and settlement rules
A request should move through these states:
- Created: Client deposits \(CF\) into escrow \(E\).
- Verification: Operators submit results; verifiers run checks.
- Resolved: Contract determines eligibility and computes payouts.
- Settled: Escrow is split into payouts and refunds.
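One way to keep these transitions explicit is a small state machine in which every state has exactly one legal successor. The sketch below is only an illustration; the names RequestState, NEXT_STATE, and advance are hypothetical, and a real contract would attach checks (deposit received, verification complete) to each transition.

```python
from enum import Enum


class RequestState(Enum):
    CREATED = "Created"
    VERIFICATION = "Verification"
    RESOLVED = "Resolved"
    SETTLED = "Settled"


# Each state has exactly one legal successor; Settled is terminal.
NEXT_STATE = {
    RequestState.CREATED: RequestState.VERIFICATION,
    RequestState.VERIFICATION: RequestState.RESOLVED,
    RequestState.RESOLVED: RequestState.SETTLED,
}


def advance(state: RequestState) -> RequestState:
    """Return the next state, or raise if the request is already settled."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} is terminal; escrow was already split")
    return NEXT_STATE[state]


state = RequestState.CREATED
path = [state.value]
while state in NEXT_STATE:
    state = advance(state)
    path.append(state.value)
print(" -> ".join(path))
```

Rejecting any transition out of Settled is what prevents the same escrow from being split twice.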
Mind map: Settlement outcomes
Refund policy: keep it boring and consistent
Refunds prevent clients from paying for outcomes that never get delivered. A simple policy:
- If no eligible operator exists, refund ORP portion to the client.
- If some eligible operators exist, distribute ORP among them by score and refund only what is left over, such as rounding remainders.
Example: if ORP is 21 tokens and only 2 operators are eligible, you still distribute all ORP by score. That means the client is paying for verification and quality, not for a fixed headcount.
Why separation matters (with a concrete failure mode)
Consider a naive model where the client fee directly funds operator payouts without a reserved overhead. If disputes occur, the system needs someone to pay for verification work and dispute resolution. Without PO, you end up either:
- reducing operator rewards after the fact (which makes operators distrust the rules), or
- charging clients extra later (which makes clients distrust the pricing).
By splitting CF into ORP and PO up front, you can handle disputes deterministically: PO covers the process, ORP covers the outcome.
Data model for accounting events
To make this auditable, emit events that mirror the split.
Minimal event set
- RequestCreated(requestId, client, CF, U, q, r)
- VerificationResolved(requestId, eligibleCount, S, PO, ORP)
- OperatorPayout(requestId, operator, reward_i)
- RefundIssued(requestId, client, amount)
Example event values
For the earlier example (CF=30, ORP=21, PO=9):
- VerificationResolved(... eligibleCount=2, S=140, PO=9, ORP=21)
- OperatorPayout(... A, 12)
- OperatorPayout(... B, 9)
- RefundIssued only if there is leftover ORP due to a failure or rounding rule.
Implementation checklist
- Choose \(r\) as a configuration parameter with clear meaning: "fraction of client fee reserved for operator rewards."
- Compute CF at request creation and lock it in escrow.
- Pay PO to treasury (or a designated overhead account) only after verification resolution.
- Distribute ORP using score-weighted eligibility.
- Define refund behavior for ORP when \(S = 0\) or when resolution produces unused portions.
- Ensure every settlement event can be recomputed from request inputs and verification outputs.
When client fees and operator rewards are separated at the start, the rest of the system becomes easier to reason about: clients know what they're paying for, operators know what they can earn, and the contract knows exactly where the money goes.
4.5 Auditable Settlement Records: Example Event Schemas and Reconciliation
A settlement record is only useful if someone else can replay the logic and reach the same outcome. In a DePIN network, that means your on-chain events (or their canonical equivalents) must carry enough information to reconstruct: (1) what was measured, (2) how it was scored, (3) which operator got paid, and (4) why disputes did or did not change anything. The trick is to avoid dumping raw data on-chain while still making the settlement auditable.
What "auditable" means in practice
An auditor (or your own reconciliation job) should be able to:
- Verify that a payment corresponds to a specific measurement submission.
- Recompute the eligibility and reward calculation from the event inputs.
- Confirm that the same measurement cannot be paid twice.
- Trace any dispute outcome to the exact evidence and rule path.
To make that possible, design your event schemas around stable identifiers and deterministic fields.
Mind map: settlement record components
Event schema set (example)
Below is a compact set of events that supports end-to-end reconstruction. The fields are intentionally redundant where it helps auditors avoid chasing off-chain state.
1) Task submission and proof anchoring
- Event: TaskSubmitted
- Purpose: Bind an operator submission to a proof artifact and a specific job.
Example fields:
- submission_id: unique per operator submission
- job_id: identifies the task instance
- round_id: scoring window
- operator_id: identity used for eligibility
- proof_hash: hash of the proof artifact (off-chain stored)
- measured_at: timestamp claimed by the operator
- payload_hash: hash of any structured measurement payload
2) Verification results
- Event: VerificationTallied
- Purpose: Record the deterministic scoring inputs derived from verifiers.
Example fields:
- submission_id
- round_id
- verifier_threshold: required number of passing verifiers
- passes: number of verifiers that accepted
- score_raw: base score before multipliers
- quality_multiplier: deterministic multiplier from policy
- final_score: computed score used for reward
3) Escrow release and payout
- Event: EscrowReleased
- Purpose: Make payment auditable as a state transition.
Example fields:
- submission_id
- round_id
- client_id (or request_id)
- reward_amount
- fee_amount
- payout_address
- escrow_id
- release_reason: eligible, dispute_upheld, dispute_timeout
4) Dispute lifecycle
- Event: DisputeOpened
- Purpose: Anchor dispute evidence references.
Example fields:
- dispute_id
- submission_id
- opened_by: client or verifier
- challenge_deadline
- evidence_hashes: array of hashes
- rule_version: policy version used for ruling
- Event: DisputeRuling
- Purpose: Record the final outcome that affects settlement.
Example fields:
- dispute_id
- submission_id
- ruling: upheld or overturned
- adjustment_amount: how much reward changes (can be zero)
- ruling_reason_code: deterministic code
Mind map: reconciliation workflow
Deterministic reward recomputation (example)
Assume a simple policy:
- If passes < verifier_threshold, the submission is ineligible.
- Otherwise, reward is proportional to final_score.
A concrete formula (using integers to avoid rounding drift):
\[
\text{reward\_amount} = \left\lfloor \frac{\text{final\_score} \times \text{reward\_pool\_for\_round}}{\text{score\_normalizer}} \right\rfloor
\]
Fees are separated:
\[
\text{fee\_amount} = \left\lfloor \frac{\text{reward\_amount} \times \text{fee\_bps}}{10{,}000} \right\rfloor
\]
The operator payout can be reward_amount - fee_amount or the contract can pay fee to a separate recipient; either way, the event must state both numbers so reconciliation doesn't guess.
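Both formulas map directly onto integer floor division. The sketch below is a minimal illustration; the function name recompute_amounts and the round parameters in the example (a pool of 50,000, a normalizer of 1,000, and a fee of 500 basis points) are hypothetical values chosen only to show the arithmetic.

```python
def recompute_amounts(final_score, reward_pool_for_round, score_normalizer, fee_bps):
    """Recompute reward and fee with integer floor division only."""
    if score_normalizer <= 0:
        raise ValueError("score_normalizer must be positive")
    reward_amount = final_score * reward_pool_for_round // score_normalizer
    fee_amount = reward_amount * fee_bps // 10_000
    return reward_amount, fee_amount


reward, fee = recompute_amounts(
    final_score=100,
    reward_pool_for_round=50_000,  # hypothetical round parameter
    score_normalizer=1_000,        # hypothetical round parameter
    fee_bps=500,                   # hypothetical: 5%
)
print(reward, fee, reward - fee)
```

Because every step uses integers, two independent implementations that agree on the inputs produce identical amounts, which is the property reconciliation depends on.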
Example event sequence and reconciliation
Consider one submission_id = sub_42 in round_id = r_7.
TaskSubmitted
- submission_id: sub_42
- job_id: job_9
- operator_id: op_3
- proof_hash: 0xabc...
- measured_at: 2026-03-01T10:00:00Z
VerificationTallied
- submission_id: sub_42
- passes: 3
- verifier_threshold: 3
- score_raw: 80
- quality_multiplier: 125 (meaning 1.25 in fixed-point)
- final_score: 100 (already computed deterministically)
EscrowReleased
- submission_id: sub_42
- escrow_id: esc_1
- reward_amount: 5000
- fee_amount: 250
- payout_address: addr_op_3
- release_reason: eligible
Reconciliation steps:
- Confirm there is exactly one EscrowReleased for sub_42.
- Confirm VerificationTallied exists for sub_42 and occurs before release.
- Recompute reward_amount from final_score and the round parameters that were in effect. Those parameters must be either included in events or versioned and queryable by round_id.
- Compare recomputed reward_amount and fee_amount to event values.
If the recomputed reward_amount is 4999 but the event says 5000, you have a mismatch that should be flagged immediately. The mismatch can come from: wrong policy version, wrong normalizer, or a non-deterministic scoring step. Auditable records make that failure mode obvious.
Dispute example: adjustment and traceability
Now add a dispute for submission_id = sub_43.
- TaskSubmitted for sub_43.
- VerificationTallied computes an initial final_score.
DisputeOpened
- dispute_id: disp_2
- submission_id: sub_43
- challenge_deadline: 2026-03-02T12:00:00Z
- evidence_hashes: [0x111..., 0x222...]
- rule_version: v3
DisputeRuling
- dispute_id: disp_2
- submission_id: sub_43
- ruling: overturned
- adjustment_amount: -800
- ruling_reason_code: R_OVR_17
EscrowReleased
- submission_id: sub_43
- reward_amount: 4200
- fee_amount: 210
- release_reason: dispute_overturned
Note the subtlety: release_reason should reflect the final state of the dispute, not the initial verification. If the dispute is overturned, release_reason should indicate that the ruling changed the outcome (for example, dispute_overturned), and the reward numbers must match the adjusted computation.
Reconciliation invariants to enforce
Use these checks to prevent "looks right" settlements:
- Uniqueness: one EscrowReleased per submission_id.
- Causality: VerificationTallied must exist before release.
- Dispute linkage: if a DisputeOpened exists, then either a DisputeRuling exists before release or release_reason explicitly indicates a timeout path.
- Evidence traceability: DisputeOpened.evidence_hashes must match the evidence references used by the ruling logic (at least by hash).
- Determinism: recomputed amounts must equal event amounts under the same rule_version and round parameters.
Minimal reconciliation pseudocode (illustrative)
for each round_id:
    index submissions from TaskSubmitted
    index scores from VerificationTallied
    index disputes from Dispute* events
    for each submission_id in round:
        assert exactly_one EscrowReleased
        assert VerificationTallied exists
        if DisputeOpened exists:
            assert DisputeRuling exists or release_reason is timeout
        recompute reward using final_score and round params
        recompute fee using fee_bps
        compare to EscrowReleased.reward_amount/fee_amount
        if mismatch: emit error with submission_id and rule_version
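A runnable version of this loop might look like the sketch below. It is illustrative only and rests on several assumptions that the schemas above do not fix: events arrive as an ordered list of dictionaries with a type field, adjustment_amount is added to the base reward before the fee is taken, and the round parameters (pool 50,000, normalizer 1,000, fee 500 basis points) and the final_score for sub_43 are hypothetical values chosen so the example amounts reconcile.

```python
from collections import defaultdict


def reconcile(events, params):
    """Check one round's ordered event log; return a list of error strings."""
    by_sub = defaultdict(lambda: defaultdict(list))
    for pos, ev in enumerate(events):
        by_sub[ev["submission_id"]][ev["type"]].append((pos, ev))

    errors = []
    for sub_id, kinds in by_sub.items():
        releases = kinds["EscrowReleased"]
        if len(releases) != 1:  # uniqueness
            errors.append(f"{sub_id}: expected 1 EscrowReleased, found {len(releases)}")
            continue
        rel_pos, rel = releases[0]
        tallies = [ev for pos, ev in kinds["VerificationTallied"] if pos < rel_pos]
        if not tallies:  # causality
            errors.append(f"{sub_id}: no VerificationTallied before release")
            continue
        rulings = [ev for pos, ev in kinds["DisputeRuling"] if pos < rel_pos]
        timed_out = rel["release_reason"] == "dispute_timeout"
        if kinds["DisputeOpened"] and not rulings and not timed_out:  # dispute linkage
            errors.append(f"{sub_id}: dispute has neither ruling nor timeout")
            continue
        # Assumption: adjustments apply to the base reward before the fee is taken.
        reward = (tallies[-1]["final_score"] * params["reward_pool_for_round"]
                  // params["score_normalizer"])
        reward += sum(r["adjustment_amount"] for r in rulings)
        fee = reward * params["fee_bps"] // 10_000
        if (reward, fee) != (rel["reward_amount"], rel["fee_amount"]):  # determinism
            errors.append(f"{sub_id}: recomputed reward={reward} fee={fee} differs from event")
    return errors


PARAMS = {"reward_pool_for_round": 50_000, "score_normalizer": 1_000, "fee_bps": 500}
EVENTS = [
    {"type": "TaskSubmitted", "submission_id": "sub_42"},
    {"type": "VerificationTallied", "submission_id": "sub_42", "final_score": 100},
    {"type": "EscrowReleased", "submission_id": "sub_42", "reward_amount": 5000,
     "fee_amount": 250, "release_reason": "eligible"},
    {"type": "TaskSubmitted", "submission_id": "sub_43"},
    {"type": "VerificationTallied", "submission_id": "sub_43", "final_score": 100},
    {"type": "DisputeOpened", "submission_id": "sub_43"},
    {"type": "DisputeRuling", "submission_id": "sub_43", "adjustment_amount": -800},
    {"type": "EscrowReleased", "submission_id": "sub_43", "reward_amount": 4200,
     "fee_amount": 210, "release_reason": "dispute_overturned"},
]
print(reconcile(EVENTS, PARAMS))
```

An empty list means every invariant held. Evidence traceability is left out of this sketch because it needs access to the ruling logic's evidence references.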
Why event design beats "trust the indexer"
If your settlement events include stable IDs, proof hashes, scoring inputs, and the final amounts, reconciliation becomes a mechanical process. That reduces the chance that a missing off-chain field silently changes payouts. It also makes disputes easier to reason about because the record shows which rule version and which evidence hashes were in play.
In short: treat settlement events like a receipt that contains enough line items to audit the math, not just enough to say "paid."
5. Measurement, Proofs, and Verification Pipelines
5.1 What Must Be Proven: Choosing Measurement Targets and Proof Granularity
A DePIN verification pipeline has one job: prove that some physical claim is true enough to justify payment. The tricky part is deciding what "true enough" means in measurable terms, and how much evidence you require to make that claim verifiable.
Start with the payment claim
Before designing proof formats, write the payment claim in plain language:
- Client claim: "This operator delivered service X to location Y during time window Z."
- Network claim: "The delivered service meets quality threshold Q."
- Accounting claim: "Given the evidence, we can compute the reward R deterministically."
Everything you prove should map to one of these claims. If you can't point to a payment rule, you probably don't need that measurement.
Measurement targets: what exactly is being measured
A measurement target is the smallest unit of physical reality you can tie to a verification step. Good targets have three properties: observability, bounded scope, and verifiability.
- Observability: A verifier (or verifier workflow) can obtain the data needed to assess the target.
- Bounded scope: The target has clear boundaries (time window, region, device set, and units).
- Verifiability: The target can be checked with deterministic rules or with bounded uncertainty.
Common target types:
- Presence/availability: "Device was online and reachable."
- Quantity: "Measured throughput was at least T."
- Quality: "Signal quality score was at least S."
- Coverage: "At least K distinct locations were served."
- Compliance: "Work followed constraints (e.g., power limits, safety checks)."
Example: coverage vs quantity
Suppose you pay for "coverage" of a region using roadside sensors.
- A quantity target might be "sensor reported N readings."
- A coverage target is "readings came from at least K distinct coordinates within region R."
Coverage is usually harder to fake because it requires diversity across locations. If your payment is for coverage, you should prove coverage, not just volume.
Proof granularity: how much evidence you require
Proof granularity is the level of detail in the evidence you accept. It determines cost, latency, and fraud resistance.
Think of granularity as a spectrum:
- Coarse proofs: "Operator passed a threshold check."
- Structured proofs: "Operator submitted measurements plus a summary that can be validated."
- Fine proofs: "Operator submitted raw measurements with enough structure to recompute results."
Coarse proofs are cheaper but weaker. Fine proofs are stronger but more expensive to transmit, store, and verify.
A practical approach is to choose granularity per verification stage:
- Stage 1 (eligibility): coarse proof to filter obvious noncompliance.
- Stage 2 (scoring): structured proof to compute a score.
- Stage 3 (dispute support): fine proof or additional artifacts to resolve challenges.
This staged design keeps routine verification fast while still giving disputes a path to correctness.
A mind map for target-to-proof design
Mind map: choosing measurement targets and proof granularity
Define bounds and units early
Ambiguity is the enemy of verification. If you don't specify units and bounds, you'll end up proving something that can't be compared.
Include these in the measurement target definition:
- Units: e.g., meters, seconds, bytes, dBm, packets per second.
- Time window: start/end timestamps or block ranges.
- Spatial scope: bounding box, geohash prefix, or region ID.
- Device set: which device IDs are allowed for the claim.
- Aggregation rule: average, minimum, percentile, or weighted sum.
Example: throughput aggregation
If you pay for "throughput," decide whether you mean:
- Minimum throughput over the window (penalizes dips), or
- Average throughput (smooths spikes), or
- Percentile throughput (robust to outliers).
Then your proof granularity must support that aggregation. If you accept only a single summary number, you can't later recompute a percentile without raw samples.
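The three aggregation rules can give very different answers for the same window, which is why the rule belongs in the target definition. The sketch below illustrates this with made-up samples. The function name aggregate is hypothetical, and the percentile uses the nearest-rank method, which is one assumed convention among several.

```python
import statistics


def aggregate(samples, rule, percentile=None):
    """Aggregate throughput samples with an explicitly named rule."""
    if not samples:
        raise ValueError("no samples to aggregate")
    if rule == "min":
        return min(samples)
    if rule == "mean":
        return statistics.fmean(samples)
    if rule == "percentile":
        if percentile is None or not 0 < percentile <= 100:
            raise ValueError("percentile must be an integer in (0, 100]")
        ordered = sorted(samples)
        # Nearest-rank method using integer ceiling division (assumed convention).
        rank = (percentile * len(ordered) + 99) // 100
        return ordered[rank - 1]
    raise ValueError(f"unknown aggregation rule: {rule}")


# Hypothetical 10-second window in Mbps with a single dip.
samples_mbps = [95, 97, 40, 96, 98, 94, 99, 93, 96, 92]
print(aggregate(samples_mbps, "min"))
print(aggregate(samples_mbps, "mean"))
print(aggregate(samples_mbps, "percentile", 20))
```

The single dip dominates the minimum, lowers the mean somewhat, and disappears from the 20th percentile. None of these values can be recovered from the others, so a proof that carries only one of them fixes the aggregation rule permanently.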
Choose granularity based on the scoring function
Your scoring function determines what evidence is necessary.
- If the score is based on a single threshold (e.g., score = 1 if Q ≥ S), coarse proofs can work.
- If the score uses nonlinear aggregation (e.g., percentile, median absolute deviation, or piecewise penalties), structured or fine proofs are usually required.
Example: quality score with a penalty
Imagine quality score: \[ \text{score} = \max\left(0, \frac{\text{SNR} - 10}{20 - 10}\right) \times \mathbf{1}[\text{latency} \le 200\text{ ms}] \] To verify this, you need at least:
- A validated SNR statistic (mean, median, or worst-case), and
- A validated latency statistic.
If you only accept "passed latency check," you still need the SNR statistic. If you only accept "SNR summary," you must ensure the summary is computed from data that can be audited during disputes.
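The score formula above translates into a short function, which makes it clear that both statistics are required inputs. This is a direct transcription for illustration; the function name quality_score and the assumption that SNR is given in dB are mine, not the protocol's.

```python
def quality_score(snr: float, latency_ms: float) -> float:
    """score = max(0, (SNR - 10) / (20 - 10)) * 1[latency <= 200 ms]."""
    if latency_ms > 200:
        return 0.0  # indicator term is zero, so the SNR term is irrelevant
    return max(0.0, (snr - 10) / (20 - 10))


print(quality_score(15, 150))  # mid-range SNR, latency within bound
print(quality_score(8, 100))   # SNR below the floor
print(quality_score(18, 250))  # latency check fails
```

As written, the formula has no upper clamp, so an SNR above 20 yields a score above 1. Whether that is intended is a policy decision the scoring rules should state explicitly.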
Use a concrete evidence ladder
A simple evidence ladder helps you avoid all-or-nothing proof requirements.
- Eligibility evidence (coarse):
  - Signed heartbeat showing the node was active in the time window.
  - A commitment to the measurement set (hash or Merkle root).
- Scoring evidence (structured):
  - Aggregated metrics computed from the measurement set.
  - Proof that the aggregation corresponds to the committed set.
- Dispute evidence (fine):
  - Raw samples or enough sample-level data to recompute the aggregation.
  - Any calibration metadata needed to interpret raw values.
Example: sensor readings
- Eligibility: "Sensor ID A reported data during window W; commitment C was published."
- Scoring: "Mean temperature over W is 23.4°C; computed from samples committed in C."
- Dispute: "If challenged, provide the sample list (timestamped) and calibration parameters used to convert sensor units."
This ladder prevents routine verification from carrying the full weight of raw data.
Decide what not to prove
Not everything needs proof. If a value is used only for display, you can treat it as informational rather than payment-critical.
A good rule: prove only what affects eligibility, scoring, or reward computation.
Example: location display vs location eligibility
If the UI shows "approximate location," you don't need to prove it for rewards unless location affects eligibility or scoring. If location affects scoring (e.g., coverage requires distinct coordinates), then location becomes a measurement target with strict bounds.
Common pitfalls and how to avoid them
- Underproof: You accept a coarse claim that doesn't constrain the attack surface.
  - Fix: tighten the measurement target to match the payment claim.
- Overproof: You require raw data for every request.
  - Fix: use staged granularity; require fine evidence only for disputes.
- Ambiguous bounds: Disputes become "interpretation fights."
  - Fix: specify units, time windows, and aggregation rules.
- Mismatch between scoring and evidence: The proof format can't support the scoring function.
  - Fix: ensure the evidence ladder contains the minimum data needed to recompute the score.
Quick checklist for choosing targets and granularity
- What is the exact payment claim in one sentence?
- Which measurement target(s) map to eligibility, scoring, and reward?
- Are units, time window, spatial scope, and aggregation rule explicitly defined?
- Does the scoring function require structured or fine evidence?
- Can you design a staged evidence ladder (coarse → structured → fine) that keeps routine verification efficient?
- Are the failure modes acceptable for the chosen granularity?
When these answers are written down, proof design becomes mechanical: you choose evidence that can be validated against the scoring rules, and you avoid paying for detail you won't use.
5.2 Proof Formats Example: Signed Measurements and Merkle Commitments
A DePIN verification pipeline usually needs two things from a node: (1) a statement about what happened (the measurement), and (2) a way to prove that statement is tied to specific data and specific time. Proof formats are the "packaging" that makes those two requirements easy to check.
Signed measurements: proving authorship and intent
A signed measurement is a structured record that the node signs with its private key. Verification checks the signature and then checks the record against protocol rules (freshness, eligibility, and consistency).
What to include in the signed payload
A good signed measurement payload is small enough to be handled frequently, but complete enough that verifiers don't need to guess. Typical fields:
- node_id: the identity the protocol recognizes.
- task_id: which request this measurement answers.
- measurement_type: what is being measured (e.g., "temperature", "uptime", "bandwidth").
- value: the numeric or categorical result.
- unit / scale: so verifiers interpret the value correctly.
- timestamp: when the measurement was taken.
- nonce: prevents replay of old signed messages.
- context_hash: binds the measurement to the task's parameters (e.g., target endpoint, sampling window).
Why context_hash matters
If the payload only contains a value and a timestamp, a verifier can't tell whether the node measured the right thing. By hashing the task parameters into context_hash, the signature commits to the exact context.
Example signed measurement (conceptual JSON)
{
  "node_id": "node-7",
  "task_id": "task-2026-03-24-001",
  "measurement_type": "bandwidth_mbps",
  "value": 94.2,
  "unit": "Mbps",
  "timestamp": 1711250000,
  "nonce": "b9c1...",
  "context_hash": "0x9a3f...",
  "signature": "0x5d2a..."
}
Verification steps
- Look up the node's public key from the registry.
- Recompute the hash of the payload fields (everything except signature).
- Verify the signature.
- Check task_id is valid and the node is eligible for it.
- Enforce freshness: timestamp must be within an allowed window.
- Enforce anti-replay: nonce must not have been used for the same task_id.
- Recompute context_hash from the task parameters and compare.
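The verification steps can be sketched in order as a single function. This is a simplified illustration, and one substitution matters: to stay within the Python standard library, HMAC-SHA256 with a per-node shared key stands in for the node's asymmetric signature, so the registry here holds secret keys rather than public keys. A real deployment would verify a public-key signature instead. The helper names, the canonical JSON encoding, the 300-second freshness window, and the task parameters are all assumptions made for the example, and the sketch assumes schema validation has already confirmed that required fields are present.

```python
import hashlib
import hmac
import json


def canonical(obj) -> bytes:
    """Deterministic encoding: sorted keys, no insignificant whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def sign(payload: dict, key: bytes) -> str:
    # HMAC stands in for the node's signature scheme in this sketch only.
    return hmac.new(key, canonical(payload), hashlib.sha256).hexdigest()


def context_hash(task_params: dict) -> str:
    return hashlib.sha256(canonical(task_params)).hexdigest()


def verify_measurement(record, registry, tasks, seen_nonces, now, max_age_s=300):
    """Return 'ok' or the name of the first failed check."""
    payload = {k: v for k, v in record.items() if k != "signature"}
    key = registry.get(payload.get("node_id"))
    if key is None:
        return "unknown node"
    if not hmac.compare_digest(sign(payload, key), str(record.get("signature", ""))):
        return "bad signature"
    task = tasks.get(payload["task_id"])
    if task is None or payload["node_id"] not in task["eligible_nodes"]:
        return "task invalid or node not eligible"
    if abs(now - payload["timestamp"]) > max_age_s:
        return "stale timestamp"
    if (payload["task_id"], payload["nonce"]) in seen_nonces:
        return "nonce reused"
    if payload["context_hash"] != context_hash(task["params"]):
        return "context mismatch"
    seen_nonces.add((payload["task_id"], payload["nonce"]))
    return "ok"


registry = {"node-7": b"demo-key"}  # hypothetical key material
tasks = {"task-001": {"eligible_nodes": {"node-7"},
                      "params": {"endpoint": "probe.example", "window_s": 10}}}
payload = {"node_id": "node-7", "task_id": "task-001",
           "measurement_type": "bandwidth_mbps", "value": 94.2, "unit": "Mbps",
           "timestamp": 1711250000, "nonce": "n-1",
           "context_hash": context_hash(tasks["task-001"]["params"])}
record = {**payload, "signature": sign(payload, registry["node-7"])}
seen = set()
print(verify_measurement(record, registry, tasks, seen, now=1711250030))
print(verify_measurement(record, registry, tasks, seen, now=1711250030))
```

The second call is rejected as a replay because the nonce was recorded on the first successful verification.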
Common pitfall: signing too little
If you omit context_hash, a node can reuse a signature for a different task that asks for a similar measurement. The verifier might still accept it if the value "looks plausible," which is exactly the kind of ambiguity proof formats are meant to remove.
Merkle commitments: proving inclusion without sending everything
Merkle commitments are useful when a node must attest to a set of items (samples, logs, segments, or evidence chunks) but sending all items to every verifier is expensive. The node commits to the full set by publishing a Merkle root, and later reveals specific leaves with Merkle proofs.
Core idea
- Leaves: hashes of individual items.
- Internal nodes: hashes of pairs of children.
- Root: a single hash that commits to the entire dataset.
Example: committing to measurement samples
Suppose a task requires sampling bandwidth every second for 10 seconds. The node collects 10 samples and wants to prove that the samples used to compute the final value are exactly those it collected.
- Each sample is serialized deterministically (same field order, same numeric encoding).
- Each leaf is H(sample_i).
- The node publishes merkle_root in its signed measurement.
- When challenged, the node reveals a subset of samples plus Merkle proofs.
Leaf construction example
Let each sample be:
- t_i: sample timestamp (or offset)
- v_i: measured value
- meta: any required metadata (e.g., interface id)
A deterministic leaf hash could be:
\[ \text{leaf}_i = H(\text{encode}(t_i, v_i, meta)) \]
Then the Merkle root is computed over the leaf list.
Combining both: signed root + Merkle proofs
The cleanest integrated format is:
- The node signs a measurement record that includes the Merkle root.
- The node provides Merkle proofs for any revealed items.
- Verifiers check the signature first, then verify inclusion proofs against the signed root.
This prevents a subtle attack: a node could otherwise send a Merkle root that matches one dataset while later revealing leaves from another dataset.
Mind map: proof formats and their responsibilities
Mind map: Proof Formats (Signed Measurements + Merkle Commitments)
Concrete end-to-end example
Scenario: A task asks for a bandwidth measurement over a fixed 10-second window. The node must provide a final value and be able to prove which samples were used.
Step A: Node computes samples and Merkle root
- Samples: sample_0 ... sample_9.
- Leaves: leaf_i = H(encode(sample_i)).
- Root: merkle_root = MerkleRoot(leaf_0 ... leaf_9).
- Final value: value = average(sample_i.v) (the exact aggregation rule is defined by the protocol).
Step B: Node signs the measurement record
The signed payload includes:
- task_id
- value
- timestamp
- nonce
- context_hash (hash of task parameters)
- merkle_root
Step C: Verifier accepts or challenges
- If accepted without challenge, the verifier checks signature, freshness, nonce, and context_hash.
- If challenged, the node reveals a subset of samples, each with a Merkle proof.
Step D: Verifier checks Merkle inclusion
For each revealed sample_k:
- Compute leaf_k = H(encode(sample_k)).
- Use the provided Merkle proof to recompute the root.
- Confirm the recomputed root equals the merkle_root inside the signed measurement.
If all revealed leaves match, the node's committed dataset is consistent with the signed record.
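Steps A and D can be sketched with the standard library alone. The code below is an illustration under several assumptions: SHA-256 as H, canonical JSON as the deterministic encoding, single-byte prefixes (0x00 for leaves, 0x01 for internal nodes) for domain separation, and an unpaired node being promoted unchanged to the next level. A real protocol must pin each of these choices down in its specification. All function names and sample values are hypothetical.

```python
import hashlib
import json


def leaf_hash(sample: dict) -> bytes:
    encoded = json.dumps(sample, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(b"\x00" + encoded).digest()  # 0x00 marks a leaf


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()  # 0x01 marks an internal node


def build_levels(leaves):
    """Return all tree levels, bottom first; an unpaired node is promoted unchanged."""
    if not leaves:
        raise ValueError("cannot commit to an empty sample set")
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([
            node_hash(cur[i], cur[i + 1]) if i + 1 < len(cur) else cur[i]
            for i in range(0, len(cur), 2)
        ])
    return levels


def make_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):  # promoted nodes have no sibling at this level
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(sample, proof, signed_root):
    """Recompute the root from one revealed sample and compare to the signed root."""
    h = leaf_hash(sample)
    for sibling, sibling_is_left in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == signed_root


samples = [{"t": i, "v": 90 + i, "meta": "eth0"} for i in range(10)]
levels = build_levels([leaf_hash(s) for s in samples])
merkle_root = levels[-1][0]  # this value would go inside the signed measurement
proof_7 = make_proof(levels, 7)
print(verify_inclusion(samples[7], proof_7, merkle_root))
print(verify_inclusion({**samples[7], "v": 999}, proof_7, merkle_root))
```

The first check succeeds and the second fails, because changing a revealed value changes its leaf hash and therefore the recomputed root.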
Design notes that keep proofs practical
- Deterministic encoding: both signature hashing and leaf hashing must use a deterministic encoding. If two implementations serialize numbers differently, proofs fail even when the underlying data is the same.
- Domain separation: use different hash domains for payload hashing vs leaf hashing (e.g., prefix bytes) to avoid accidental collisions across contexts.
- Bounded proof size: Merkle proofs scale with log2(n) for n leaves. Choose evidence chunking so proofs remain small enough for your expected challenge rate.
- Aggregation rule transparency: if the signed record includes a derived value (like an average), the protocol must specify the exact aggregation method so verifiers can check consistency when enough samples are revealed.
Summary
Signed measurements ensure the verifier can trust who produced the claim and that the claim is tied to the correct task context and time. Merkle commitments ensure the node can prove inclusion of specific evidence items without sending the entire dataset. When you include the Merkle root inside the signed measurement, you get a tight binding between the claim and the committed evidence, which makes verification straightforward and robust.
5.3 Verification Workflows Example Multi-Stage Validation With Thresholds
A DePIN verification workflow is easiest to reason about when you treat it like a pipeline with explicit gates. Each gate checks one property, and each property has a threshold that decides pass/fail (or pass/needs-more-evidence). Multi-stage validation is useful because the first checks are cheap and fast, while later checks are more expensive and require stronger evidence.
The goal: decide "valid enough to pay"
In a measurement-based network, a client submits a proof package for a task. The network must decide whether the package supports the claim strongly enough to unlock rewards. The decision should be deterministic given the submitted data and the current protocol parameters.
A practical pattern is:
- Stage A: Format and authenticity checks (cheap)
- Stage B: Basic consistency checks (moderate)
- Stage C: Thresholded verification (expensive, evidence-weighted)
- Stage D: Final decision and settlement (on-chain record)
Mind map: multi-stage validation with thresholds
Stage A: Intake & authenticity checks (cheap, strict)
Stage A should reject anything that is clearly not usable. This prevents later stages from wasting compute.
Checks
- Signature validity: The proof package includes a signature from the node identity key over a canonical payload.
- Freshness: The payload includes a nonce or task-specific identifier so the same proof cannot be replayed.
- Schema version: The proof must match the expected schema version for the task.
- Membership/eligibility: The node must be currently eligible for the task type.
Example A client submits:
- taskId = 42
- nodeId = N7
- measurement = 18.6
- timestamp = 1710000000
- proofHash = H(...)
- signature = Sig_node(...)
Stage A verifies:
- The signature matches nodeId = N7.
- The signature covers taskId = 42 and a protocol nonce.
- The proof schema version is v3 (the task expects v3).
- Node N7 is in the eligible set for "coverage" tasks.
If any of these fail, the workflow returns Fail immediately.
Stage B: Consistency & sanity checks (moderate, deterministic)
Stage B checks whether the submitted values are internally coherent and plausible under protocol rules.
Checks
- Measurement bounds: For example, a temperature sensor reading must be within a configured range.
- Unit normalization: If the proof includes units, convert to canonical units before scoring.
- Timestamp ordering: Ensure the measurement timestamp is within the task's allowed window.
- Cross-field checks: If the proof claims "distance = d," then the derived value used in the proof must match.
Example Suppose the task expects a normalized reading in meters and the proof says:
- rawDistance = 1200 with unit = cm
- Protocol canonical unit is meters
Stage B converts to d = 12.0 m and checks:
- d must be within [0.5, 20.0] for this task.
- The timestamp must be between startTime and endTime.
- If the proof includes a commitment to derived values, the derived values must match the commitment.
If Stage B fails, return Fail. If Stage B passes, proceed to Stage C.
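The normalization and bounds checks from the Stage B example can be sketched as follows. This is an illustration only: the function name stage_b, the unit table, and the start and end times in the example are hypothetical, and the commitment check is omitted because its format is protocol-specific.

```python
# Conversion factors to meters as (numerator, denominator) integer pairs,
# so that common conversions such as 1200 cm -> 12.0 m stay exact.
UNIT_TO_METERS = {"mm": (1, 1000), "cm": (1, 100), "m": (1, 1), "km": (1000, 1)}


def stage_b(raw_distance, unit, timestamp, start_time, end_time, bounds=(0.5, 20.0)):
    """Normalize to meters, then apply bounds and time-window checks."""
    if unit not in UNIT_TO_METERS:
        return None, "FAIL: unknown unit"
    num, den = UNIT_TO_METERS[unit]
    d = raw_distance * num / den
    low, high = bounds
    if not low <= d <= high:
        return d, "FAIL: distance out of bounds"
    if not start_time <= timestamp <= end_time:
        return d, "FAIL: timestamp outside task window"
    return d, "PASS"


# Example from the text: rawDistance = 1200 cm, bounds [0.5, 20.0] m.
print(stage_b(1200, "cm", 1710000000, start_time=1709999000, end_time=1710001000))
print(stage_b(2500, "cm", 1710000000, start_time=1709999000, end_time=1710001000))
```

Normalizing before the bounds check matters: comparing the raw value 1200 against a range expressed in meters would reject a valid reading.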
Stage C: Thresholded verification (expensive, evidence-weighted)
Stage C is where multi-stage validation earns its keep. Instead of requiring a single âperfectâ proof, the protocol can accept evidence that meets a threshold.
Evidence items A proof package may include multiple evidence items, such as:
- E1: signed measurement
- E2: location attestation
- E3: sensor calibration proof
- E4: redundancy from an additional observation
Each evidence item is scored independently.
Per-item scoring Define a score function \(s_i \in [0,1]\) for each evidence item \(E_i\). The score reflects how well the evidence supports the claim.
Examples of scoring rules:
- A signature-based evidence item is either valid or not, so \(s_i\) is 1 or 0.
- A calibration proof might be partially valid if it is within an acceptable tolerance window, giving \(s_i = 0.7\).
- A location attestation might be strong when it matches multiple constraints, giving \(s_i = 0.9\).
Weighting Assign weights \(w_i \ge 0\) to reflect evidence importance. For instance, signed measurement might be weighted higher than metadata.
Aggregate score Compute an aggregate score:
\[ S = \sum_{i=1}^{n} w_i \cdot s_i \]
Then compare against thresholds:
- Pass: \(S \ge T_{pay}\)
- Review: \(T_{review} \le S < T_{pay}\)
- Fail: \(S < T_{review}\)
Concrete example Assume:
- Evidence items: \(E_1\) signed measurement, \(E_2\) location attestation, \(E_3\) calibration proof
- Weights: \(w_1 = 0.5, w_2 = 0.3, w_3 = 0.2\)
- Thresholds: \(T_{pay} = 0.75, T_{review} = 0.55\)
Scores from verification:
- \(s_1 = 1.0\) (signature valid)
- \(s_2 = 0.6\) (location matches constraints but with mild uncertainty)
- \(s_3 = 0.0\) (calibration proof missing)
Aggregate: \[ S = 0.5\cdot1.0 + 0.3\cdot0.6 + 0.2\cdot0.0 = 0.5 + 0.18 + 0 = 0.68 \]
Decision:
- \(0.55 \le 0.68 < 0.75\), so the result is Review.
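The aggregation and threshold comparison can be sketched as follows. The example uses Python's Decimal type rather than binary floats so that a score landing exactly on a threshold is compared exactly. That choice, and the function name decide, are assumptions of this illustration rather than protocol requirements.

```python
from decimal import Decimal as D


def decide(items, t_pay, t_review):
    """items is a list of (weight, score) pairs; returns (S, decision)."""
    s = sum((w * score for w, score in items), D(0))
    if s >= t_pay:
        return s, "PASS"
    if s >= t_review:
        return s, "REVIEW"
    return s, "FAIL"


T_PAY, T_REVIEW = D("0.75"), D("0.55")
items = [
    (D("0.5"), D("1.0")),  # E1 signed measurement: signature valid
    (D("0.3"), D("0.6")),  # E2 location attestation: mild uncertainty
    (D("0.2"), D("0.0")),  # E3 calibration proof: missing
]
print(decide(items, T_PAY, T_REVIEW))

# Hypothetical follow-up: calibration evidence arrives and scores 0.7.
items[2] = (D("0.2"), D("0.7"))
print(decide(items, T_PAY, T_REVIEW))
```

The first call reproduces the aggregate of 0.68 and the Review decision. In the hypothetical follow-up the calibration item contributes 0.14, which raises the aggregate to 0.82 and moves the decision to Pass.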
What "Review" means in practice Review should not be vague. It should trigger a deterministic next step, such as requesting additional evidence items or running an alternate verification path.
Example policy:
- If calibration proof is missing, allow the client to resubmit with \(E_3\) within a challenge window.
- If location attestation is weak, require a second independent location evidence item.
If the client cannot supply additional evidence, the workflow eventually returns Fail.
Stage D: Final decision and settlement (on-chain record)
Stage D records the outcome in a way that supports later auditing and dispute resolution.
Outputs
verificationStatus:PASS,REVIEW, orFAILacceptedEvidenceHash: hash of the evidence package used for scoringscoreS: the aggregate score (S)rewardMultiplier: derived from (S) or from the pass/review/fail category
Example reward multiplier
- If PASS, set rewardMultiplier = 1.0
- If REVIEW, set rewardMultiplier = 0.5
- If FAIL, set rewardMultiplier = 0.0
This keeps settlement consistent with verification outcomes.
Failure handling rules (so the pipeline doesn't get messy)
A good workflow specifies what happens when evidence is missing or contradictory.
- Missing required fields (Stage A/B): immediate Fail.
- Contradictory evidence (Stage B): immediate Fail because consistency checks should catch it.
- Borderline aggregate score (Stage C): Review with a defined evidence request.
- Evidence tampering (Stage A): immediate Fail because authenticity checks should detect it.
Mermaid diagram: end-to-end multi-stage validation
flowchart TD
A[Receive proof package] --> B{Stage A: Authenticity & schema?}
B -- No --> F[Fail]
B -- Yes --> C{Stage B: Consistency & sanity?}
C -- No --> F
C -- Yes --> D[Stage C: Score evidence items]
D --> E{Aggregate score S vs thresholds}
E -- S >= T_pay --> P[Pass]
E -- T_review <= S < T_pay --> R[Review]
E -- S < T_review --> F
P --> G[Stage D: Record PASS and settle]
R --> H[Request more evidence / alternate check]
H --> C
F --> I[Stage D: Record FAIL and stop]
Putting it together: a short walkthrough
- The client submits a proof package for taskId = 42.
- Stage A confirms the node signature, freshness, and schema version.
- Stage B normalizes units and checks timestamps and internal commitments.
- Stage C scores three evidence items and computes \(S = 0.68\).
- Since \(0.55 \le 0.68 < 0.75\), the result is Review.
- The protocol requests the missing calibration evidence item.
- If the client resubmits and the new evidence raises \(S \ge 0.75\), the workflow records PASS and settles rewards.
This structure keeps verification predictable: early gates prevent waste, later gates quantify uncertainty, and the final decision is recorded with enough detail to support disputes without re-running everything from scratch.
5.4 Handling Measurement Uncertainty: Confidence Scores and Bounds
In a DePIN measurement pipeline, uncertainty is not a bug; it's a property of the world. Sensors drift, networks delay, and environments change. The design question is how to represent uncertainty so that verification, incentives, and settlement stay consistent.
Start with a clear uncertainty model
Before choosing confidence scores, decide what kind of uncertainty you're modeling. A practical approach is to separate:
- Noise (random error): repeated measurements vary around a true value.
- Bias (systematic error): measurements are consistently offset (e.g., calibration drift).
- Missingness (data gaps): you cannot measure reliably for some time window.
- Model mismatch: the measurement method assumes conditions that are not always true.
A neutral design rule: represent uncertainty in the same units as the measurement whenever possible. If you measure temperature in °C, express uncertainty in °C too.
Confidence scores: what they mean and what they don't
A confidence score is only useful if its semantics are explicit. Two common interpretations are:
- Probability of correctness: e.g., "the submitted value is within tolerance with probability 0.9."
- Relative reliability: e.g., "this node is usually better than that node."
Mixing these interpretations causes subtle accounting errors. If you want probability semantics, tie the score to a bound (next section). If you want relative reliability, treat it as a weight and keep the verification rule separate.
A simple, easy-to-explain pattern for probability semantics:
- Submit a measurement value \(x\).
- Submit an uncertainty bound \(\Delta\).
- Define confidence as the probability that the true value \(x^*\) lies in \([x-\Delta, x+\Delta]\).
Then confidence is not a random number; it's attached to a concrete statement.
Use bounds for verification, not just scores
Verification should operate on bounds because bounds are checkable. Confidence can influence how much you trust, but the core eligibility test should be based on whether the evidence supports the required tolerance.
Example: temperature measurement with bounds
Suppose the network requires temperature to be within \(\pm 1.0\,^\circ\text{C}\) of a target \(T\).
- Node submits \(x = 22.3\,^\circ\text{C}\).
- Node submits \(\Delta = 0.6\,^\circ\text{C}\).
- So the true value is believed to lie in \([21.7, 22.9]\).
If the target is \(T = 23.0\), the acceptable interval is \([22.0, 24.0]\).
Verification rule (interval overlap):
- If \([x-\Delta, x+\Delta]\) overlaps the acceptable interval, the measurement is eligible.
- If it does not overlap, it is rejected.
This rule is deterministic given the submitted \(x\) and \(\Delta\). Confidence can then be used to adjust rewards, but eligibility does not depend on a subjective score.
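The interval-overlap rule is small enough to write out directly. The sketch below assumes closed intervals, so intervals that only touch at an endpoint count as overlapping; that boundary convention and the function names are assumptions for illustration.

```python
def intervals_overlap(lo_a: float, hi_a: float, lo_b: float, hi_b: float) -> bool:
    """True if closed intervals [lo_a, hi_a] and [lo_b, hi_b] share a point."""
    return lo_a <= hi_b and lo_b <= hi_a


def is_eligible(x: float, delta: float, target: float, tolerance: float) -> bool:
    """Eligibility test: does [x - delta, x + delta] overlap
    [target - tolerance, target + tolerance]?"""
    if delta < 0 or tolerance < 0:
        raise ValueError("delta and tolerance must be non-negative")
    return intervals_overlap(
        x - delta, x + delta, target - tolerance, target + tolerance
    )


# Example from the text: x = 22.3, delta = 0.6, target T = 23.0, tolerance 1.0.
eligible = is_eligible(22.3, 0.6, 23.0, 1.0)
# A reading far from the target whose bound cannot reach the acceptable interval.
rejected = is_eligible(20.0, 0.6, 23.0, 1.0)
print(eligible, rejected)
```

Because the check uses only the submitted \(x\) and \(\Delta\) plus task parameters, any two verifiers given the same inputs reach the same decision.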
Confidence from bounds: a concrete mapping
To make confidence meaningful, map it to a bound using a distribution assumption. A common, simple choice is a normal error model.
Assume measurement error \(e = x - x^*\) is distributed as \(\mathcal{N}(0, \sigma^2)\). If you report \(\Delta = k\sigma\), then:
\[ \Pr\big(|x^* - x| \le \Delta\big) = \Pr(|e| \le k\sigma) = \operatorname{erf}\left(\frac{k}{\sqrt{2}}\right) \]
You don't need to compute \(\operatorname{erf}\) on-chain. You can precompute a small lookup table for typical \(k\) values (e.g., 1, 2, 3) and store confidence as a discrete tier.
Example: confidence tiers
- If \(\Delta = 1\sigma\), confidence tier = 0.68.
- If \(\Delta = 2\sigma\), confidence tier = 0.95.
- If \(\Delta = 3\sigma\), confidence tier = 0.997.
Now confidence is consistent with the reported bound.
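The lookup table can be precomputed off-chain with the standard library. This sketch assumes the zero-mean normal error model stated above; `confidence_for_k` and `CONFIDENCE_TIERS` are illustrative names.

```python
import math


def confidence_for_k(k: float) -> float:
    """P(|e| <= k * sigma) for e ~ N(0, sigma^2), i.e. erf(k / sqrt(2))."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return math.erf(k / math.sqrt(2.0))


# Precompute a small table for typical k values; only the tier would be stored.
CONFIDENCE_TIERS = {k: round(confidence_for_k(k), 3) for k in (1, 2, 3)}
print(CONFIDENCE_TIERS)
```

The table is only as good as the normality assumption: heavy-tailed sensor errors would make these tiers optimistic.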
Calibrating uncertainty so nodes can't game it
A node could submit an absurdly large \(\Delta\) to get high overlap and avoid rejection. That's why uncertainty must be calibrated.
Two practical defenses:
- Bound sanity checks: enforce \(\Delta\) ranges based on sensor specs and recent behavior.
- Reward shaping: penalize overly wide bounds when they don't improve eligibility.
Example: reward depends on tightness
Let the required tolerance be \(\tau\). Define an "effective tightness" score:
- If the acceptable interval is \([T-\tau, T+\tau]\), compute the distance from the center to the nearest point of overlap.
- Reward higher when \(\Delta\) is small but still overlaps.
A simple version:
- If overlap exists, set \(\text{quality} = \max\left(0, 1 - \frac{\Delta}{\tau}\right)\).
- Multiply base reward by \(\text{quality}\).
This makes wide bounds less profitable without requiring complex statistics.
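A minimal sketch of the simple quality rule, assuming the overlap decision has already been made by the eligibility check and is passed in as a boolean; the function names are hypothetical.

```python
def quality(delta: float, tolerance: float, overlaps: bool) -> float:
    """quality = max(0, 1 - delta / tolerance) when the intervals overlap, else 0."""
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    if delta < 0:
        raise ValueError("delta must be non-negative")
    if not overlaps:
        return 0.0
    return max(0.0, 1.0 - delta / tolerance)


def shaped_reward(base_reward: float, delta: float, tolerance: float, overlaps: bool) -> float:
    """Scale the base reward by the tightness-based quality factor."""
    return base_reward * quality(delta, tolerance, overlaps)


tight = shaped_reward(100.0, delta=0.2, tolerance=1.0, overlaps=True)
wide = shaped_reward(100.0, delta=0.9, tolerance=1.0, overlaps=True)
inflated = shaped_reward(100.0, delta=5.0, tolerance=1.0, overlaps=True)
print(tight, wide, inflated)
```

Note that any \(\Delta \ge \tau\) earns nothing under this rule, so a node gains no reward by inflating its bound to guarantee overlap.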
Aggregating multiple uncertain measurements
When multiple nodes submit measurements for the same task, you need a rule for combining uncertainty.
A straightforward approach is evidence-based aggregation:
- Convert each submission into an interval \([x_i-\Delta_i, x_i+\Delta_i]\).
- Compute the intersection or weighted overlap.
If you require a single accepted value, you can compute a weighted average using inverse-variance weights \(w_i = 1/\sigma_i^2\), where \(\sigma_i \approx \Delta_i/k\) under your chosen \(k\).
Example: combining two temperature nodes
- Node A: \(x_A = 22.3\), \(\Delta_A = 0.4\)
- Node B: \(x_B = 22.9\), \(\Delta_B = 0.8\)
Assume \(\Delta = 2\sigma\) (so \(\sigma_A=0.2\), \(\sigma_B=0.4\)).
Weights:
- \(w_A = 1/0.04 = 25\)
- \(w_B = 1/0.16 = 6.25\)
Weighted mean: \[ \bar{x} = \frac{25\cdot 22.3 + 6.25\cdot 22.9}{25 + 6.25} = \frac{700.625}{31.25} = 22.42 \]
The combined uncertainty can be approximated as: \[ \sigma_{\text{comb}} = \sqrt{\frac{1}{w_A + w_B}} \approx \sqrt{\frac{1}{31.25}} \approx 0.179 \]
Then report \(\Delta_{\text{comb}} = 2\sigma_{\text{comb}} \approx 0.36\) if you keep the same \(k\).
This yields a single bound that reflects both measurements and their reported uncertainty.
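The same combination can be sketched as a function. It assumes independent, zero-mean errors and a shared \(k\) convention across nodes; the name `combine` and the `(x, delta)` tuple layout are illustrative choices.

```python
import math
from typing import Sequence, Tuple


def combine(measurements: Sequence[Tuple[float, float]], k: float = 2.0) -> Tuple[float, float]:
    """Inverse-variance weighted mean of (x, delta) pairs.

    Each delta is interpreted as k * sigma. Returns (mean, combined_delta),
    where combined_delta = k * sqrt(1 / sum(w_i)) and w_i = 1 / sigma_i^2.
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    if k <= 0:
        raise ValueError("k must be positive")
    weights = []
    for _, delta in measurements:
        if delta <= 0:
            raise ValueError("delta must be positive")
        sigma = delta / k
        weights.append(1.0 / (sigma * sigma))
    total = sum(weights)
    mean = sum(w * x for w, (x, _) in zip(weights, measurements)) / total
    sigma_comb = math.sqrt(1.0 / total)
    return mean, k * sigma_comb


# Node A and Node B from the example, with delta = 2 * sigma.
x_bar, delta_comb = combine([(22.3, 0.4), (22.9, 0.8)], k=2.0)
print(round(x_bar, 2), round(delta_comb, 2))
```

If node errors are correlated (for example, two sensors sharing one calibration source), the combined bound from this formula will be too tight.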
Mind map: uncertainty handling in the pipeline
Implementation-friendly checklist
- Require submissions to include \((x, \Delta)\), not just a confidence score.
- Define confidence semantics as probability within the bound.
- Use interval overlap for eligibility.
- Use confidence or tightness for reward scaling, not for acceptance.
- Add sanity checks and reward penalties to prevent trivial inflation of \(\Delta\).
- When aggregating, keep the same \(k\) convention so bounds remain comparable.
When these pieces fit together, uncertainty becomes a first-class input to verification rather than a decorative number. It's still imperfect, because measurements are rarely perfect, but it stays consistent, checkable, and fair.
5.5 Anti-Fraud Controls: Example Replay Protection and Freshness Requirements
Replay attacks are the boring kind of fraud: the attacker resends something that already worked, hoping the system treats it as new. Freshness requirements are the antidote: they force every proof submission to be tied to a specific time window and a specific request context. Below is a practical set of controls you can apply to a DePIN verification pipeline where clients submit proofs and operators (or verifiers) accept them for rewards.
Threat model in one page
In a typical flow, a client requests a measurement, an operator (or a measurement service) produces a proof, and the client submits that proof for verification and settlement. The replay attacker can:
- Reuse an old proof for a new request.
- Reuse a proof for the same request after it has already been accepted.
- Reorder messages so the verifier processes stale data first.
- Try to "race" the system by submitting multiple variants of the same proof.
Your controls should make each accepted proof:
- Uniquely bound to a request.
- Valid only within a narrow time window.
- Non-replayable even if the attacker captures network traffic.
Mind map: replay protection and freshness
Control 1: Bind every proof to a specific request
A proof should not be "a measurement." It should be "a measurement for request X under challenge Y." The simplest binding is to include a request identifier and a challenge nonce in the signed payload.
Example payload structure
The operator signs a message that includes:
- requestId: a unique identifier generated by the client or coordinator.
- challengeNonce: a random nonce generated for that request.
- paramsHash: a hash of measurement parameters (e.g., location, sensor type, sampling window).
- proofDataHash: a hash of the raw measurement or proof artifact.
- issuedAt: timestamp when the operator created the proof.
- expiresAt: timestamp when the proof should be considered stale.
Then the verifier recomputes the hashes and checks the signature.
Why this works
Even if an attacker replays an old proof, the requestId and challengeNonce will not match the current request context. The verifier rejects it before doing expensive checks.
Control 2: Use a deterministic proof ID and a "used-proof" registry
Binding alone prevents cross-request reuse, but it does not stop replay within the same request. For that, you need one-time acceptance.
Deterministic proof ID
Define a proof ID as a hash of the signed payload fields that matter for uniqueness:
\[ \text{proofId} = H(\text{requestId} \,\|\, \text{challengeNonce} \,\|\, \text{paramsHash} \,\|\, \text{proofDataHash}) \]
The verifier stores proofId in a registry (on-chain or in a strongly consistent off-chain store). If the same proofId appears again, reject it.
Example
- First submission: proofId = 0xabc... accepted.
- Second submission (replay): same proofId arrives.
- Verifier checks registry, sees it already used, rejects with reason ALREADY_ACCEPTED.
This is also friendly to honest clients: if a client retries due to a timeout, it will either be accepted once or rejected as already accepted.
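A minimal sketch of the proof ID and one-time acceptance. It assumes SHA-256 as the hash \(H\), length-prefixed fields so that concatenation is unambiguous, and an in-memory dictionary standing in for the on-chain or strongly consistent store; all three are illustrative choices, not requirements of the design.

```python
import hashlib
from typing import Dict


def proof_id(request_id: bytes, challenge_nonce: bytes,
             params_hash: bytes, proof_data_hash: bytes) -> str:
    """Deterministic ID over the uniqueness-relevant fields.

    Each field is length-prefixed so (b"ab", b"c") and (b"a", b"bc")
    cannot collide after concatenation.
    """
    h = hashlib.sha256()
    for field in (request_id, challenge_nonce, params_hash, proof_data_hash):
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.hexdigest()


class UsedProofRegistry:
    """In-memory stand-in for a used-proof registry."""

    def __init__(self) -> None:
        self._seen: Dict[str, str] = {}

    def accept_once(self, pid: str) -> str:
        """Return ACCEPTED the first time a proof ID is seen, else ALREADY_ACCEPTED."""
        if pid in self._seen:
            return "ALREADY_ACCEPTED"
        self._seen[pid] = "ACCEPTED"
        return "ACCEPTED"


registry = UsedProofRegistry()
pid = proof_id(b"R1", b"N1", b"params", b"data")
first = registry.accept_once(pid)
second = registry.accept_once(pid)
print(first, second)
```

In a real deployment the check-and-insert must be atomic (a single transaction or compare-and-set), otherwise two concurrent submissions of the same proof could both pass the lookup.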
Control 3: Freshness windows with explicit deadlines
Freshness should be enforced with explicit time windows, not vague "recent enough" language.
Recommended checks
Let the verifier use its own notion of time (e.g., block timestamp or server time) to avoid trusting client clocks.
For each proof, verify:
- now <= expiresAt
- now >= issuedAt - skewTolerance
Where skewTolerance is a small buffer (for example, a few minutes) to handle clock drift.
Example numbers
- Operator includes issuedAt = 12:00:05Z and expiresAt = 12:05:05Z.
- Verifier time is 12:03:10Z.
- Checks pass.
- If verifier time is 12:06:00Z, reject with EXPIRED.
Why include expiresAt in the signed payload?
If expiresAt is computed only by the verifier, an attacker could replay a proof and hope the verifier's window is wide. Embedding expiresAt makes the operator's intent explicit and auditable.
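The two freshness checks can be sketched with timezone-aware timestamps. The two-minute default skew tolerance and the `NOT_YET_VALID` reason code are assumptions added for illustration; the text only names `EXPIRED`.

```python
from datetime import datetime, timedelta, timezone


def check_freshness(now: datetime, issued_at: datetime, expires_at: datetime,
                    skew_tolerance: timedelta = timedelta(minutes=2)) -> str:
    """Apply now <= expiresAt and now >= issuedAt - skewTolerance.

    `now` must come from the verifier's own clock, never from the client.
    """
    if now > expires_at:
        return "EXPIRED"
    if now < issued_at - skew_tolerance:
        return "NOT_YET_VALID"  # hypothetical code for proofs dated in the future
    return "OK"


def utc(hour: int, minute: int, second: int) -> datetime:
    """Helper: a UTC timestamp on an arbitrary fixed day."""
    return datetime(2024, 1, 1, hour, minute, second, tzinfo=timezone.utc)


issued = utc(12, 0, 5)
expires = utc(12, 5, 5)
in_window = check_freshness(utc(12, 3, 10), issued, expires)
too_late = check_freshness(utc(12, 6, 0), issued, expires)
print(in_window, too_late)
```

On-chain, the same comparisons would typically use integer block timestamps rather than datetime objects.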
Control 4: Challenge timestamps and request deadlines
Freshness is stronger when the request itself has a deadline. The coordinator issues a challenge with a timestamp and the verifier enforces that the proof arrives before the request deadline.
Example request fields
- requestId
- challengeNonce
- challengeIssuedAt
- requestDeadline
Verifier checks:
- now <= requestDeadline
- challengeIssuedAt is within an allowed age relative to now (to prevent very old challenges being used).
This prevents a replay attacker from using an old request object even if the proof is still within its own expiresAt.
Control 5: Idempotency rules for retries
A replay attacker and a retrying honest client look similar at the network level. Idempotency rules make the system deterministic.
Rule set
- If a proof with the same proofId was accepted earlier, return the prior result (or a clear rejection reason) without re-verifying.
- If a proof matches the request binding but fails freshness, do not accept it even if a different proof for the same request might arrive later.
- If a proof fails signature verification, do not treat it as idempotent; reject and log.
Example behavior
- Client submits proof at 12:04:59Z (accepted).
- Network drops response.
- Client retries at 12:05:10Z.
- Verifier sees proofId already used and returns ALREADY_ACCEPTED.
No reward duplication, no wasted work.
Control 6: Rate limiting and replay detection heuristics
Even with strict checks, you should protect the verifier from being flooded with invalid submissions.
Practical rate limits
Apply limits per:
- requestId (max submissions per request)
- operatorId (max submissions per operator per time window)
- clientId (max submissions per client per time window)
Example
- Allow up to 3 proof submissions per requestId.
- If more arrive, reject with RATE_LIMITED.
This doesn't replace cryptographic checks; it reduces load and makes abuse more expensive.
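A per-request submission cap can be sketched as a simple counter. This covers only the requestId limit from the example; per-operator and per-client limits over a time window would need timestamps as well. The class name `SubmissionLimiter` is hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict


class SubmissionLimiter:
    """Caps the number of proof submissions accepted for processing per requestId."""

    def __init__(self, max_per_request: int = 3) -> None:
        if max_per_request < 1:
            raise ValueError("max_per_request must be at least 1")
        self.max_per_request = max_per_request
        self._counts: DefaultDict[str, int] = defaultdict(int)

    def check(self, request_id: str) -> str:
        """Return OK and count the submission, or RATE_LIMITED once the cap is hit."""
        if self._counts[request_id] >= self.max_per_request:
            return "RATE_LIMITED"
        self._counts[request_id] += 1
        return "OK"


limiter = SubmissionLimiter(max_per_request=3)
results = [limiter.check("R1") for _ in range(4)]
print(results)
```

Counting before the cryptographic checks keeps the limiter cheap, at the cost of letting an attacker consume an honest client's quota; counting only submissions that pass signature verification is one alternative.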
Verification-time checklist (concrete)
When a proof submission arrives, the verifier should run checks in a cheap-to-expensive order:
- Parse payload and confirm required fields exist.
- Recompute paramsHash and proofDataHash.
- Recompute proofId.
- Check proofId in used-proof registry.
- Verify signature over the signed payload.
- Check request binding: requestId and challengeNonce match the expected context.
- Check freshness: now <= expiresAt and now >= issuedAt - skewTolerance.
- Only then run measurement-specific verification logic.
This ordering ensures that replay and stale proofs are rejected quickly.
Example end-to-end scenario
- At 12:00:00Z, coordinator creates requestId=R1, challengeNonce=N1, requestDeadline=12:05:00Z.
- Operator signs proof with issuedAt=12:00:10Z, expiresAt=12:04:50Z.
- Client submits at 12:04:20Z.
  - proofId not used → signature valid → binding matches → freshness passes → accept.
- Attacker replays the same proof at 12:04:40Z.
  - proofId already used → reject ALREADY_ACCEPTED.
- Attacker replays again at 12:06:00Z.
  - Even if the registry were cleared, freshness fails → reject EXPIRED.
Two independent barriers prevent the same fraud outcome.
Logging and rejection reasons
For operational sanity, every rejection should include a stable reason code and enough context to debug without leaking sensitive data.
Example reason codes:
- MISSING_FIELDS
- INVALID_SIGNATURE
- REQUEST_MISMATCH
- ALREADY_ACCEPTED
- EXPIRED
- RATE_LIMITED
Store these alongside requestId, operatorId, and proofId so you can trace patterns like repeated stale submissions from a specific operator.
Summary
Replay protection and freshness are not separate features; they reinforce each other. Binding ensures the proof is for the right request, the used-proof registry ensures it's accepted once, and explicit time windows ensure old proofs stop working even if they were never seen before.
6. Consensus, Finality, and On-Chain Data Modeling
6.1 Choosing What Lives On-Chain: Example Minimal State for Maximum Throughput
On-chain state is expensive in two ways: it costs gas (or equivalent fees) and it forces every full node to track it. Off-chain data is cheaper, but it must still be anchored to something the chain can verify. The design goal is simple: keep on-chain state minimal, but keep it sufficient to make verification and settlement deterministic.
The rule of thumb: store commitments, not raw facts
A common mistake is putting raw measurements, raw logs, or large proof blobs directly on-chain. Instead, store compact commitments that let anyone verify that a particular off-chain artifact corresponds to an on-chain claim.
Example:
- Off-chain: a measurement report (e.g., sensor readings), plus a proof artifact (e.g., signatures, Merkle paths).
- On-chain: a hash (or Merkle root) of the report, plus the metadata needed to interpret it (e.g., measurement type, time window, and the node identity).
This pattern keeps the chain focused on what was accepted and what must be paid, not on how the data was produced.
Mind map: what to keep on-chain vs off-chain
Identify the minimum set of state transitions
Most DePIN networks need a few state transitions:
- A node becomes eligible to participate.
- A client request is created.
- A node submits a result for that request.
- The network accepts or rejects the result.
- Rewards are computed and paid.
- Disputes can override acceptance.
You can implement these transitions with minimal state by storing only the identifiers and commitments required to enforce the rules.
Example: a minimal on-chain data model
Assume a network where clients request coverage measurements for a geographic area during a time window. Operators submit results containing:
- a report hash
- a proof that the report was produced correctly
- a signature from the operator
On-chain state (minimal):
- NodeRegistry: maps nodeId -> nodePubKeyHash, status, stakeInfoRef
- Request: maps requestId -> (client, epochId, areaId, timeWindow, policyVersionId)
- Submission: maps (requestId, nodeId) -> submissionStatus, resultCommitmentHash
- Settlement: maps requestId -> finalResultCommitmentHash, finalStatus
- Reward: maps requestId -> rewardBreakdownRef (or directly stores small numeric amounts)
What is intentionally not stored on-chain:
- the raw report
- the full proof artifact
- large lists of participants
- per-measurement details when a single commitment suffices
Why commitments work: deterministic interpretation
A commitment is only useful if the chain can interpret it consistently. That means the on-chain state must include enough context to interpret the commitment.
Example commitment scheme:
- Off-chain constructs reportHash = H(reportBytes).
- On-chain stores reportHash plus measurementType and timeWindow.
- Verification rules say: "A valid submission must provide a proof that binds to reportHash and is signed by the operator's key for this epoch."
If you omit measurementType or timeWindow, the same reportHash could be interpreted under different rules, which breaks determinism.
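A minimal sketch of a commitment that carries its interpretation context. It assumes SHA-256 for \(H\) and represents the on-chain record as a frozen dataclass; the type and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class OnChainCommitment:
    """What the chain stores: the hash plus the context needed to interpret it."""
    report_hash: str
    measurement_type: str
    time_window: Tuple[int, int]  # (startSec, endSec)


def commit(report_bytes: bytes, measurement_type: str,
           time_window: Tuple[int, int]) -> OnChainCommitment:
    """Build the record from an off-chain report."""
    return OnChainCommitment(
        hashlib.sha256(report_bytes).hexdigest(), measurement_type, time_window
    )


def matches(c: OnChainCommitment, report_bytes: bytes,
            measurement_type: str, time_window: Tuple[int, int]) -> bool:
    """An off-chain artifact matches only if hash AND context agree."""
    return (
        hashlib.sha256(report_bytes).hexdigest() == c.report_hash
        and measurement_type == c.measurement_type
        and time_window == c.time_window
    )


report = b'{"rssi": -71}'
record = commit(report, "coverage", (1_700_000_000, 1_700_000_600))
same_context = matches(record, report, "coverage", (1_700_000_000, 1_700_000_600))
wrong_type = matches(record, report, "latency", (1_700_000_000, 1_700_000_600))
print(same_context, wrong_type)
```

An alternative with the same effect is to hash the context together with the report bytes, which saves storage but makes the context unreadable on-chain.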
Keep per-request state small and bounded
Throughput suffers when per-request state grows without a clear upper bound. A minimal design avoids storing unbounded arrays.
Bad pattern:
- Storing every submission attempt in an ever-growing list.
Better pattern:
- Store only the final accepted commitment and a compact record of the winning set (or a threshold summary).
If you need multiple submissions to reach a threshold (e.g., "accept if at least 3 operators agree"), store:
- the set size threshold k
- the final aggregated commitment (or the list of signers if it is bounded)
Example:
- If you require exactly 3 signers, store 3 nodeIds and their signatures off-chain; on-chain stores a single aggregated hash.
- If you require "at least 3 out of N," store only the final aggregated commitment and the count, not the full roster.
Use event logs for bulk audit trails
On-chain state is for enforcement; event logs are for observability. Events are cheaper than state writes, and on EVM-style chains contracts cannot read them back, so they suit data that only off-chain consumers need.
Example:
- When a submission is accepted, emit SubmissionAccepted(requestId, nodeId, resultCommitmentHash).
- When a challenge is resolved, emit ChallengeResolved(requestId, outcome, finalCommitmentHash).
Clients and indexers can reconstruct history from events without bloating contract storage.
Minimal state for disputes: store pointers, not evidence
Disputes require evidence, but evidence can be large. The chain should store:
- the dispute id
- the commitment being challenged
- the policy version and challenge window
Off-chain:
- the evidence bundle
- the challenger's proof
- the operator's rebuttal
On-chain:
- a boolean or small enum for dispute status
- the final decision commitment
This keeps dispute handling deterministic while avoiding large storage writes.
A concrete "minimal state" checklist
Use this checklist when deciding what to store:
- Is it required to enforce a rule inside the contract? If not, prefer events or off-chain.
- Can it be represented as a fixed-size commitment? If yes, store the commitment.
- Does it grow with the number of submissions or measurements? If yes, redesign to keep it bounded.
- Do you need it for deterministic interpretation? If yes, store the minimal context (type, epoch, policy version).
- Can it be derived from events? If yes, don't store it as state.
Throughput example: reducing writes in a submission-heavy workflow
Consider a workflow where each request may receive 50 submissions, but only one final result is accepted.
Naive approach: store each submission's full details in state.
- 50 state writes per request.
Minimal approach: store only:
- one Request record
- one Submission record per node only until it is superseded (or store only the latest status)
- one final Settlement record
If you also avoid storing raw proofs and evidence, the contract's write footprint becomes dominated by a small number of fixed-size updates per request.
The result is not magic; it's arithmetic. Fewer state writes means less fee burn and less time spent processing state changes, which directly improves throughput.
Summary
Minimal on-chain state is achieved by storing (1) identity anchors, (2) verification and settlement commitments, and (3) small bounded records needed for enforcement. Everything else (raw data, proof artifacts, and audit-friendly history) belongs off-chain or in events. The chain then acts like a referee with a scorecard, not like a warehouse that stores every piece of equipment used in the match.
6.2 Transaction and Event Design Example Deterministic Event Schemas
Deterministic event schemas make a DePIN network easier to index, easier to audit, and harder to misunderstand. The goal is simple: every on-chain event should have a stable meaning, a stable field order, and a stable way to interpret timestamps, identifiers, and amounts.
Why "deterministic" matters in practice
When an operator submits a proof, the chain emits events that downstream components consume: indexers build read models, clients show receipts, and dispute logic checks evidence. If event fields are ambiguous (or change shape), you get mismatched accounting, broken UIs, and verification code that silently assumes the wrong thing.
Determinism here means:
- Stable schema: the same event name always carries the same fields with the same types.
- Stable semantics: fields mean the same thing across versions.
- Stable ordering: field order is fixed in the ABI, and your parsing code relies on it.
- Stable identifiers: IDs are generated deterministically from inputs or are explicitly included.
Mind map: event design checklist
A concrete event set for a measurement-and-reward flow
Assume a minimal DePIN flow:
- A client requests a measurement.
- An operator submits a proof.
- The contract verifies eligibility and records the result.
- Rewards are accounted and later settled.
You want events that support both the happy path and disputes.
Event naming and versioning
Use names that encode lifecycle stage, and add a version suffix only when you must change semantics.
- MeasurementRequestedV1
- ProofSubmittedV1
- MeasurementAcceptedV1
- MeasurementRejectedV1
- DisputeOpenedV1
- RewardAccountedV1
- RewardSettledV1
If you later add a field, prefer emitting a new event version rather than changing the old one.
Deterministic identifiers: the backbone of correlation
Events should include identifiers that let you correlate across systems.
Common IDs:
- requestId: identifies the client's measurement request.
- submissionId: identifies a specific operator submission.
- taskId: identifies the physical task or measurement target.
A practical pattern is to compute IDs deterministically from inputs.
Example: requestId derived from (client, taskId, nonce).
\[
\text{requestId} = \text{keccak256}(\text{abi.encode}(\text{client}, \text{taskId}, \text{nonce}))
\]
This avoids "guessing" IDs off-chain and makes event correlation reliable even if multiple requests share the same task.
Schema design rules (with examples)
1) Use explicit units
Never rely on "implied" units.
- Use timestampSec for seconds.
- Use rewardWei for token amounts.
- Use distanceMeters for meters.
Example fields:
- proofTimestampSec: uint64
- rewardWei: uint256
2) Separate accepted vs rejected
Rejected submissions should still emit an event with a reason code, so indexers can update UI state without parsing revert strings.
Example:
- MeasurementAcceptedV1 includes qualityScore and rewardWei.
- MeasurementRejectedV1 includes reasonCode.
3) Include keys for indexing
If you want to query "all submissions by operator X for task Y," include both.
Example fields in ProofSubmittedV1:
- operator: address
- taskId: bytes32
- requestId: bytes32
4) Emit derived fields for reads
Indexers prefer events that already contain the values they need.
For example, if you compute a qualityScore during verification, emit it in MeasurementAcceptedV1.
Example deterministic event schemas (ABI-style)
Below is a compact example set. The field order is intentional and should match your ABI.
// Event schemas (illustrative types)
event MeasurementRequestedV1(
    bytes32 indexed requestId,
    address indexed client,
    bytes32 indexed taskId,
    uint64 timestampSec,
    uint256 maxRewardWei
);
event ProofSubmittedV1(
    bytes32 indexed submissionId,
    bytes32 indexed requestId,
    address indexed operator,
    uint64 proofTimestampSec,
    bytes32 proofHash
);
event MeasurementAcceptedV1(
    bytes32 indexed submissionId,
    bytes32 indexed requestId,
    address indexed operator,
    uint64 acceptedTimestampSec,
    uint256 qualityScore,
    uint256 rewardWei,
    bytes32 resultHash
);
event MeasurementRejectedV1(
    bytes32 indexed submissionId,
    bytes32 indexed requestId,
    address indexed operator,
    uint64 rejectedTimestampSec,
    uint32 reasonCode
);
A few details worth noticing:
- indexed fields are chosen to support common queries.
- proofHash and resultHash are fixed-size, so parsing is consistent.
- reasonCode is numeric, so it's stable and cheap to handle.
Reason codes: deterministic failure semantics
Define a small set of reason codes and keep them stable.
Example mapping:
- 1: NotEligibleOperator
- 2: ProofExpired
- 3: InvalidProofFormat
- 4: QualityBelowThreshold
- 5: DuplicateSubmission
Then MeasurementRejectedV1 can be interpreted without looking at revert data.
Dispute and challenge correlation
Disputes need to connect to the exact accepted measurement.
event DisputeOpenedV1(
    bytes32 indexed disputeId,
    bytes32 indexed submissionId,
    bytes32 requestId, // not indexed: Solidity allows at most three indexed parameters per non-anonymous event
    address indexed challenger,
    uint64 openedTimestampSec,
    bytes32 evidenceHash
);
event RewardAccountedV1(
    bytes32 indexed requestId,
    bytes32 indexed submissionId,
    address indexed operator,
    uint256 rewardWei,
    uint64 accountedTimestampSec
);
event RewardSettledV1(
    bytes32 indexed requestId,
    address indexed operator,
    uint256 settledWei,
    uint64 settledTimestampSec
);
Key points:
- DisputeOpenedV1 references submissionId and requestId.
- RewardAccountedV1 records accounting separately from settlement.
- RewardSettledV1 is the final "money moved" moment.
Deterministic event ordering within a transaction
Within a single transaction, event order is the order of emission in the contract. Downstream systems should not assume a particular order across different transactions, but they can assume order within one tx.
A practical convention:
- Emit request events first.
- Emit submission events next.
- Emit accepted/rejected next.
- Emit dispute events only when a dispute is opened.
- Emit reward accounting after acceptance and any required delay logic.
If you ever need to change this ordering, version the event set or document the new sequence in your internal spec.
Example: end-to-end event trace for one request
Assume:
- requestId = R
- submissionId = S
- operator is O
A typical trace might look like:
1. MeasurementRequestedV1(R, client, taskId, t0, maxRewardWei)
2. ProofSubmittedV1(S, R, O, t1, proofHash)
3. MeasurementAcceptedV1(S, R, O, t2, qualityScore, rewardWei, resultHash)
4. RewardAccountedV1(R, S, O, rewardWei, t3)
5. RewardSettledV1(R, O, rewardWei, t4)
If the proof is rejected, step 3 becomes MeasurementRejectedV1(S, R, O, t2, reasonCode) and you should not emit reward events.
Mind map: what to include in every event
Practical parsing example (conceptual)
An indexer can build a deterministic state machine keyed by submissionId:
- On ProofSubmittedV1, create a pending record.
- On MeasurementAcceptedV1, fill in qualityScore, rewardWei, and resultHash.
- On MeasurementRejectedV1, mark terminal failure with reasonCode.
- On DisputeOpenedV1, attach dispute metadata to the accepted submission.
- On RewardAccountedV1, record accounting.
- On RewardSettledV1, mark final settlement.
Because each event's schema is stable and each record is keyed by the same deterministic IDs, the indexer never needs to guess which fields correspond to which stage.
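The state machine above can be sketched as a single reducer over decoded events. This is an illustration only: events are assumed to arrive as a name plus a dictionary of decoded fields, and because RewardSettledV1 carries no submissionId in the schemas shown, the sketch matches it to accepted records by requestId and operator, which is an assumption rather than something the schemas dictate.

```python
from typing import Any, Dict

State = Dict[str, Dict[str, Any]]


def apply_event(state: State, name: str, fields: Dict[str, Any]) -> None:
    """Apply one decoded event to the indexer's read model, keyed by submissionId."""
    if name == "RewardSettledV1":
        # No submissionId on this event: match by (requestId, operator).
        for record in state.values():
            if (record["requestId"] == fields["requestId"]
                    and record["operator"] == fields["operator"]
                    and record["status"] == "ACCEPTED"):
                record["status"] = "SETTLED"
                record["settledWei"] = fields["settledWei"]
        return

    sid = fields["submissionId"]
    if name == "ProofSubmittedV1":
        state[sid] = {
            "status": "PENDING",
            "requestId": fields["requestId"],
            "operator": fields["operator"],
        }
        return
    if sid not in state:
        raise KeyError(f"{name} for unknown submission {sid}")

    record = state[sid]
    if name == "MeasurementAcceptedV1":
        record.update(
            status="ACCEPTED",
            qualityScore=fields["qualityScore"],
            rewardWei=fields["rewardWei"],
            resultHash=fields["resultHash"],
        )
    elif name == "MeasurementRejectedV1":
        record.update(status="REJECTED", reasonCode=fields["reasonCode"])
    elif name == "DisputeOpenedV1":
        record["disputeId"] = fields["disputeId"]
    elif name == "RewardAccountedV1":
        record["accountedWei"] = fields["rewardWei"]
    else:
        raise ValueError(f"unknown event {name}")


state: State = {}
trace = [
    ("ProofSubmittedV1", {"submissionId": "S", "requestId": "R", "operator": "O"}),
    ("MeasurementAcceptedV1", {"submissionId": "S", "qualityScore": 90,
                               "rewardWei": 10, "resultHash": "0xabc"}),
    ("RewardAccountedV1", {"submissionId": "S", "rewardWei": 10}),
    ("RewardSettledV1", {"requestId": "R", "operator": "O", "settledWei": 10}),
]
for event_name, event_fields in trace:
    apply_event(state, event_name, event_fields)
print(state["S"]["status"])
```

If several accepted submissions from one operator can share a request, adding submissionId to the settlement event would remove the need for the lookup.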
Deterministic event schemas are not about being fancy; they're about making the system legible to machines and humans at the same time. When you get the IDs, units, and lifecycle separation right, the rest of the architecture becomes much easier to implement correctly.
6.3 Finality Assumptions Example Confirming Settlements Safely
In a DePIN network, "settlement confirmed" means more than "a transaction was seen." It means the chain state used for accounting won't change under your feet. Finality assumptions define what you treat as final, when you credit rewards, and how you recover when reality disagrees.
What "finality" means in practice
Different chains offer different guarantees, but your design should reduce to two questions:
- When can I safely treat an on-chain event as immutable for settlement?
- What do I do if I credited something and later the chain reorganizes?
A good rule: separate "event observed" from "event finalized." Your indexer may see an event quickly, but your settlement engine should only act when the event is finalized according to your chain's rules.
A concrete settlement flow with explicit finality
Assume the protocol has:
- A Client submits a task result and proof.
- An Operator posts a measurement attestation.
- A Verifier (on-chain or off-chain) checks eligibility.
- A Settlement contract records the outcome and pays rewards.
A typical flow:
1. Client submits ResultSubmitted(taskId, operatorId, proofHash).
2. Operator submits Attested(taskId, operatorId, measurementHash).
3. Verifier finalizes eligibility and emits Eligible(taskId, operatorId, score).
4. Settlement contract emits SettlementProposed(taskId, operatorId, amount).
5. After finality, the settlement engine marks SettlementConfirmed and triggers payout.
The key is step 5: payout depends on finality, not mere inclusion.
Mind map: finality assumptions and where they apply
Example: probabilistic finality with confirmation depth
Suppose your chain uses probabilistic finality. You choose a confirmation depth of N blocks. Your indexer:
- Watches for SettlementProposed.
- Records the block number b where it appeared.
- Marks it final only when the chain head is at least b + N.
This is not magic; it's a policy. Your policy should be consistent across all components that act on settlement.
Example numbers:
- N = 20.
- SettlementProposed appears at block b = 1,000,100.
- Your settlement engine confirms it only when the head reaches 1,000,120.
If a reorg happens before that point, the event may disappear. Your engine never credited it yet, so nothing needs undoing.
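The depth rule can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the ObservedEvent type, the is_final function, and the canonical_hash_at lookup are hypothetical names, and the extra block-hash comparison is an assumption added here so that an event whose block was replaced by a reorg is not treated as final.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ObservedEvent:
    settlement_id: str
    block_number: int  # block b where SettlementProposed was seen
    block_hash: str    # hash of block b at observation time (assumed available)


def is_final(
    event: ObservedEvent,
    head_number: int,
    canonical_hash_at: Callable[[int], Optional[str]],
    depth: int,
) -> bool:
    """Return True only if the event is at least `depth` blocks deep
    and its block is still part of the canonical chain."""
    if head_number < event.block_number + depth:
        return False  # observed, but not yet finalized under the policy
    # A reorg may have replaced block b; compare against the current chain.
    return canonical_hash_at(event.block_number) == event.block_hash


# Usage with the example numbers from the text (N = 20, b = 1,000,100).
chain = {1_000_100: "0xaaa"}  # hypothetical canonical block hashes
event = ObservedEvent("s-1", 1_000_100, "0xaaa")
not_yet = is_final(event, 1_000_119, chain.get, depth=20)
final = is_final(event, 1_000_120, chain.get, depth=20)
chain[1_000_100] = "0xbbb"  # simulate a reorg that replaced block b
after_reorg = is_final(event, 1_000_120, chain.get, depth=20)
```

Keeping the rule in one pure function makes it easy to share the same policy between the indexer and the settlement engine.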
Example: deterministic finality with "finalized block"
On chains with deterministic finality, you can wait for a block to be marked finalized by the protocol. Your indexer:
- Listens for SettlementProposed in blocks.
- Checks whether the block is finalized.
- Confirms settlement only when the block is finalized.
This reduces the need for a depth parameter, but you still need a clear rule for what you consider finalized in your code and tests.
Contract pattern: intent first, payout later
A safe approach is to make the contract store settlement intent separately from fund movement.
- SettlementProposed records (settlementId, taskId, operatorId, amount, proofHash).
- SettlementConfirmed is triggered only after finality checks off-chain.
- Payout(settlementId) moves funds and is idempotent.
Idempotency matters because your settlement engine may retry after timeouts. If Payout is called twice with the same settlementId, the second call should do nothing.
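Here's a minimal pseudocode sketch of the contract-side idempotency logic:
function payout(bytes32 settlementId) external {
    // Funds never move for a settlement that has not been confirmed.
    require(confirmed[settlementId], "not confirmed");
    // Idempotent retry: a repeated call is a no-op, not a second transfer.
    if (paid[settlementId]) return;
    uint256 amount = amounts[settlementId];
    // Record the payout before the external call (checks-effects-interactions).
    paid[settlementId] = true;
    require(token.transfer(rewardRecipient[settlementId], amount), "transfer failed");
    emit PayoutExecuted(settlementId, amount);
}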
Off-chain confirmation engine: what it must track
Your settlement engine needs more than "latest block." It needs to track:
- The canonical head it believes is current.
- A mapping from settlementId to the block where SettlementProposed occurred.
- Whether the event is final under your policy.
- Whether it already triggered SettlementConfirmed on-chain.
If you use confirmation depth, you also need to handle the case where the head moves backward temporarily (reorg). The engine should not mark events as final until the head has advanced enough beyond their block.
Handling the uncomfortable case: credited then reverted
Even with careful design, you should still plan for mistakes or unexpected chain behavior. The safe design goal is: minimize the surface area where reversals can cause harm.
Do this by:
- Crediting rewards only after finality.
- Moving funds only after on-chain confirmation.
- Keeping accounting entries tied to settlementId so you can reconcile.
If you ever credited before finality due to a bug, you need a compensation mechanism. A common approach is to keep credits in a ledger that can be adjusted by a corrective transaction keyed to the same settlementId.
A concrete "finality-safe" checklist
When implementing settlement confirmation, verify these invariants:
- Invariant 1: No payout is possible unless confirmed[settlementId] == true.
- Invariant 2: confirmed[settlementId] is set only after your indexer declares the source event final.
- Invariant 3: payout(settlementId) is idempotent.
- Invariant 4: The indexer's finality rule is deterministic and testable (depth N or finalized flag).
- Invariant 5: Your engine can restart without double-confirming or double-paying.
Quick test scenario you can run
- Simulate a chain where SettlementProposed appears at block b.
- Advance head to b + N - 1 and ensure your engine does not confirm.
- Advance to b + N and ensure it confirms and triggers payout.
- Retry payout calls and ensure only one transfer occurs.
This test forces the code to respect the finality boundary instead of treating "seen" as "safe."
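The scenario above can be run against a small in-memory model. The sketch below is an assumption-laden stand-in: SettlementEngine is a hypothetical class that folds the off-chain engine and the payout contract into one object, with a list of transfers in place of real fund movement.

```python
class SettlementEngine:
    """In-memory stand-in for the off-chain engine plus the payout contract."""

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self.proposed = {}     # settlement_id -> block where it was proposed
        self.confirmed = set()
        self.paid = set()
        self.transfers = []    # one entry per actual fund movement

    def observe(self, settlement_id: str, block_number: int) -> None:
        # Seeing the same event twice must not change the recorded block.
        self.proposed.setdefault(settlement_id, block_number)

    def on_head(self, head: int) -> None:
        # Confirm (and pay) only events that are at least `depth` blocks deep.
        for sid, block in self.proposed.items():
            if sid not in self.confirmed and head >= block + self.depth:
                self.confirmed.add(sid)
                self.payout(sid)

    def payout(self, settlement_id: str) -> bool:
        if settlement_id not in self.confirmed:
            raise ValueError("not confirmed")
        if settlement_id in self.paid:
            return False  # retry: no second transfer
        self.paid.add(settlement_id)
        self.transfers.append(settlement_id)
        return True


N, b = 20, 1_000_100
engine = SettlementEngine(depth=N)
engine.observe("s-1", b)
engine.on_head(b + N - 1)
confirmed_early = "s-1" in engine.confirmed
engine.on_head(b + N)
confirmed_on_time = "s-1" in engine.confirmed
retry_result = engine.payout("s-1")  # simulated retry after a timeout
```

A real test would drive an actual chain simulator, but the assertions have the same shape: nothing at b + N - 1, exactly one transfer at b + N, and no further transfers on retry.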
Finality assumptions are not a footnote; they are part of the settlement interface. When you make them explicit (observed vs finalized, intent vs payout, and idempotency keyed by settlementId), you get a system that behaves predictably even when the chain behaves like a chain.
6.4 Indexing and Query Patterns: Example Building Efficient Read Models
A DePIN network often writes data in a way that is friendly to verification (deterministic events, minimal on-chain state). Clients, operators, and dashboards usually need the opposite: fast reads, flexible filtering, and stable views that don't require replaying every event from genesis. That gap is where read models and indexing patterns matter.
What "read model" means in this context
A read model is a materialized, query-friendly representation of the canonical event stream. The canonical source is still the chain (or whatever consensus log you use). The read model is a projection: it can be rebuilt from events, and it should be consistent with the latest finalized block height.
A practical rule: if your UI or client needs to answer a question like "Which nodes are eligible for the next task?" or "How much did operator X earn in the last week?", you should not compute that by scanning raw events at request time.
Indexing goals and constraints
Indexing is not just about speed; it's about predictable correctness.
- Correctness boundary: Decide whether queries are served from the latest finalized height, or whether you also expose "pending" data. For settlement-related views, finalized-only is usually safer.
- Determinism of projections: Given the same event history, the read model should end up in the same state. This keeps rebuilds reliable.
- Idempotency: Event handlers must tolerate reprocessing the same event (common during reorg handling or restarts).
- Backfill strategy: You need a way to bootstrap the index from block 0 (or from a snapshot) and then switch to incremental updates.
Mind map: indexing and query patterns
Choosing entities and keys
Start by listing the entities you will query frequently.
Common DePIN entities:
- Node: identity, status, last heartbeat time, operator association.
- Task: client request, assigned node set, measurement window, expected proof format.
- Proof: measurement payload reference, proof artifact reference, verification result.
- Settlement: reward amount, escrow status, dispute/challenge state.
Then pick stable keys that match how events identify things.
Example key strategy:
- node_id from the registry admission event.
- task_id from the task creation event.
- proof_id derived from (task_id, proof_index) or a direct on-chain id.
- settlement_id from (task_id, phase) where phase might be proposed, finalized, disputed.
If you can't find a stable id in events, create one deterministically in the projection. For instance, hash (task_id, operator_id, proof_hash) into a proof_key. The projection must compute the same key every time.
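One way to derive such a key is sketched below. The function name proof_key, the "proof_key:v1" prefix, and the length-prefixed encoding are assumptions made for illustration; the length prefixes are there so that different field boundaries cannot produce the same hash input.

```python
import hashlib


def proof_key(task_id: str, operator_id: str, proof_hash: str) -> str:
    """Derive a stable projection key from fields that appear in events."""
    h = hashlib.sha256()
    h.update(b"proof_key:v1")  # hypothetical domain tag for this key type
    for field in (task_id, operator_id, proof_hash):
        data = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") differ.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()


key_a = proof_key("task-1", "op-9", "0xfeed")
key_b = proof_key("task-1", "op-9", "0xfeed")
key_shifted = proof_key("task-1o", "p-9", "0xfeed")
```

Because the function depends only on its inputs, a full rebuild of the projection yields the same keys as the incremental run.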
Cursor-based ingestion: the backbone
A robust indexer ingests events in order and tracks a cursor.
- Cursor fields: last_finalized_height, plus optionally an event_index within that block.
- Processing loop: fetch events from last_finalized_height + 1 up to the newest finalized height, apply handlers, then advance the cursor.
This avoids "exactly once" fantasies. You get "at least once" with idempotent handlers.
Example: idempotent event handler
-- Table to track processed events
CREATE TABLE processed_events (
    chain_id     TEXT   NOT NULL,
    block_height BIGINT NOT NULL,
    tx_hash      TEXT   NOT NULL,
    log_index    INT    NOT NULL,
    PRIMARY KEY (chain_id, block_height, tx_hash, log_index)
);
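Use a primary key on the event identity. In your handler, insert into processed_events first; if the row already exists, skip the rest. Do the insert and the projection update in the same database transaction, so a crash between the two cannot leave an event marked as processed but never applied.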
Query patterns and how to model them
1) "List with filters" (eligibility and status)
You'll often need to list nodes that match criteria: active, not slashed, within a region, and healthy.
A good pattern is to maintain a NodeStatus table updated by events.
Example columns:
- node_id
- operator_id
- status (active, inactive, slashed)
- last_heartbeat_at
- region
- quality_score_latest
- eligibility_until (if eligibility is time-bound)
Then your query becomes a simple filter:
- status = 'active'
- last_heartbeat_at > now() - interval 'T'
- region IN (...)
- eligibility_until >= now()
This avoids joining raw event tables on every request.
2) "Lookup by id" (task/proof retrieval)
Clients frequently fetch a single task's current state and the proof artifacts.
Maintain a TaskView table keyed by task_id:
- task_id
- client_id
- status (assigned, proof_submitted, verified, settlement_finalized, disputed)
- assigned_node_ids (either normalized or stored as an array)
- latest_proof_ref
- verification_result
- settlement_ref
For proof artifacts, store references (hashes, URIs, or content addresses) rather than the raw payload. The read model should be small and fast.
3) Aggregations (earnings, quality, and throughput)
Dashboards and operator tools need aggregates.
Two approaches:
- Pre-aggregated tables updated incrementally (fast reads, more write work).
- On-demand aggregation for low-traffic endpoints (simpler, slower).
For earnings, pre-aggregation is usually worth it.
Example: OperatorEarningsDaily
- operator_id
- day (UTC date)
- earned_amount
- quality_multiplier_sum
- tasks_verified_count
Update it when a settlement finalizes. If disputes can reverse outcomes, you must also handle "correction" events by adjusting totals.
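The incremental update and the correction path might look like the following sketch. The class name, method names, and the use of a signed delta for corrections are assumptions; the sketch also assumes events are deduplicated before they reach it, and it tracks only two of the columns listed above.

```python
from collections import defaultdict
from datetime import datetime, timezone


class OperatorEarningsDaily:
    """Pre-aggregated earnings keyed by (operator_id, UTC day)."""

    def __init__(self):
        self.rows = defaultdict(
            lambda: {"earned_amount": 0, "tasks_verified_count": 0}
        )

    @staticmethod
    def _day(unix_seconds: int) -> str:
        # Always bucket by UTC date so every replica agrees on the day.
        return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).date().isoformat()

    def on_settlement_finalized(self, operator_id, amount, finalized_at):
        row = self.rows[(operator_id, self._day(finalized_at))]
        row["earned_amount"] += amount
        row["tasks_verified_count"] += 1

    def on_settlement_corrected(self, operator_id, delta, original_finalized_at):
        # A correction adjusts the day of the original settlement, not "today".
        row = self.rows[(operator_id, self._day(original_finalized_at))]
        row["earned_amount"] += delta


agg = OperatorEarningsDaily()
agg.on_settlement_finalized("op1", 100, finalized_at=1_700_000_000)
agg.on_settlement_corrected("op1", -40, original_finalized_at=1_700_000_000)
row = agg.rows[("op1", "2023-11-14")]
```

Applying the correction to the original day keeps historical totals consistent with what a full rebuild from events would produce.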
Handling reorgs and finality
If your chain has reorgs, you need a policy.
- If you index only finalized blocks, reorg handling is simpler: you never retract finalized events.
- If you also index non-finalized data, you need rollback support. That means either:
- storing versioned projections per block height, or
- keeping a reversible log of changes.
For most settlement and eligibility views, finalized-only reads reduce complexity.
Storage choices: match the query shape
- Relational tables are excellent for entity views, filtering, and joins.
- Document stores can work well for "task detail" objects, especially when the shape varies by task type.
- Search indexes help for text-like fields (e.g., metadata tags), but most DePIN queries are numeric and categorical, so relational often wins.
A common hybrid: relational for canonical read models, plus a lightweight search index for metadata browsing.
Example: building a TaskStatus read model
Assume events like:
- TaskCreated(task_id, client_id, expected_window, ...)
- TaskAssigned(task_id, node_id, ...)
- ProofSubmitted(task_id, node_id, proof_ref, ...)
- ProofVerified(task_id, node_id, result, quality_score, ...)
- SettlementFinalized(task_id, operator_id, amount, ...)
Projection tables:
- task_view(task_id PK, status, client_id, latest_quality_score, settlement_amount, ...)
- task_node_view(task_id, node_id, proof_status, verified_result, quality_score, PRIMARY KEY(task_id, node_id))
Update rules:
- On TaskCreated: insert task_view with status created (or assigned, if assignment happens at creation).
- On TaskAssigned: insert/update task_node_view rows.
- On ProofSubmitted: set proof_status = 'submitted'.
- On ProofVerified: set proof_status = 'verified', update quality score, and possibly set task_view.status to verified when all required nodes are verified.
- On SettlementFinalized: set task_view.status = 'settlement_finalized' and store settlement_amount.
This turns a multi-event history into a single-row lookup for the UI.
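The update rules can be sketched as a single projection function over in-memory dictionaries. Everything here is illustrative: the event dictionaries, the "created" and "pending" starting states, and the forward-only status ordering are assumptions, and "all required nodes" is approximated as "all nodes assigned so far".

```python
TASK_ORDER = ["created", "assigned", "proof_submitted", "verified", "settlement_finalized"]
PROOF_ORDER = ["pending", "submitted", "verified"]


def _advance(row, field, new_value, order):
    """Move a status forward only, so replayed events cannot regress it."""
    if order.index(new_value) > order.index(row[field]):
        row[field] = new_value


def apply_event(state, event):
    tasks, nodes = state["task_view"], state["task_node_view"]
    kind, task_id = event["type"], event["task_id"]
    if kind == "TaskCreated":
        tasks.setdefault(task_id, {"status": "created",
                                   "client_id": event["client_id"],
                                   "settlement_amount": None})
        return
    task = tasks[task_id]
    if kind == "TaskAssigned":
        nodes.setdefault((task_id, event["node_id"]),
                         {"proof_status": "pending", "quality_score": None})
        _advance(task, "status", "assigned", TASK_ORDER)
    elif kind == "ProofSubmitted":
        _advance(nodes[(task_id, event["node_id"])], "proof_status", "submitted", PROOF_ORDER)
        _advance(task, "status", "proof_submitted", TASK_ORDER)
    elif kind == "ProofVerified":
        row = nodes[(task_id, event["node_id"])]
        _advance(row, "proof_status", "verified", PROOF_ORDER)
        row["quality_score"] = event["quality_score"]
        rows = [r for (t, _), r in nodes.items() if t == task_id]
        if all(r["proof_status"] == "verified" for r in rows):
            _advance(task, "status", "verified", TASK_ORDER)
    elif kind == "SettlementFinalized":
        _advance(task, "status", "settlement_finalized", TASK_ORDER)
        task["settlement_amount"] = event["amount"]


events = [
    {"type": "TaskCreated", "task_id": "t1", "client_id": "c1"},
    {"type": "TaskAssigned", "task_id": "t1", "node_id": "n1"},
    {"type": "ProofSubmitted", "task_id": "t1", "node_id": "n1", "proof_ref": "ref-1"},
    {"type": "ProofVerified", "task_id": "t1", "node_id": "n1",
     "result": True, "quality_score": 0.9},
    {"type": "SettlementFinalized", "task_id": "t1", "operator_id": "op1", "amount": 100},
]
state = {"task_view": {}, "task_node_view": {}}
for e in events + events:  # second pass simulates at-least-once redelivery
    apply_event(state, e)
```

Replaying the whole stream a second time leaves the state unchanged, which is the property the indexer relies on after restarts.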
Practical performance tips that don't require magic
- Index the columns you filter on (status, region, eligibility_until, operator_id, day).
- Avoid "select *" in read endpoints; keep payloads small.
- Separate write-heavy tables from read-heavy tables if needed (e.g., store raw proof submissions separately from the summarized task view).
- Use batch ingestion during backfill to reduce overhead.
Consistency checks: prove your projection matches reality
Add internal invariants:
- If task_view.status = 'settlement_finalized', there must exist a corresponding finalized settlement event for that task_id.
- If task_node_view.proof_status = 'verified', the stored quality_score must match the verification event.
These checks can run during backfill and periodically in production. They catch projection bugs early, before users do.
6.5 Governance of Protocol Parameters Example Controlled Updates With Versioning
Protocol parameters decide how the network behaves: what counts as valid work, how rewards are computed, which nodes are eligible, and how disputes are handled. Governance is the mechanism that changes those parameters safely, with versioning so clients and operators can agree on what rules were used.
Why parameter governance needs versioning
A parameter change is not just "new settings." It changes the meaning of future transactions and, often, the interpretation of proofs. Without versioning, you get ambiguity like: "Was this proof evaluated under the old quality threshold or the new one?" Versioning makes evaluation deterministic by tying each proof and settlement to a specific ruleset.
A practical rule of thumb: if a parameter can affect eligibility, scoring, or settlement, it must be versioned and referenced by the on-chain record that finalizes outcomes.
Parameter taxonomy: what to version and what to keep fixed
Not every value needs the same treatment.
- Eligibility parameters (e.g., minimum uptime, allowed regions, required hardware class): must be versioned because they gate whether a node can participate.
- Scoring parameters (e.g., quality multipliers, weighting of metrics): must be versioned because they change reward outcomes.
- Verification parameters (e.g., proof freshness window, challenge period length): must be versioned because they affect whether proofs are accepted.
- Operational parameters (e.g., logging verbosity, rate limits for internal services): can be off-chain or non-consensus, so long as they do not change settlement meaning.
This distinction keeps governance focused on consensus-critical changes.
Controlled update model: propose → validate → schedule → activate
A controlled update flow reduces surprises.
- Propose a parameter set with a unique version identifier.
- Validate the proposal against constraints (type checks, bounds, and compatibility rules).
- Schedule an activation block/time so clients can prepare.
- Activate by publishing the new version and mapping it to an activation point.
The key is that activation is explicit and observable, not "whenever the transaction lands."
On-chain data model for parameter versions
A clean pattern is to store:
- A ParameterVersion object: the version id, activation point, and a hash of the parameter payload.
- A ParameterPayloadHash: used to verify that off-chain parameter data matches what was approved.
- A Compatibility record: which client/proof formats are supported by that version.
Even if the full payload is stored off-chain, the on-chain hash anchors what was approved.
Example: versioned parameter set
Suppose the protocol has these consensus parameters:
- quality_min (e.g., 0.80)
- reward_multiplier (e.g., 1.25)
- proof_freshness_seconds (e.g., 300)
- challenge_window_blocks (e.g., 720)
A governance proposal creates version = 12 with a payload hash H12. The chain records:
- activation_block = 18,000,000
- payload_hash = H12
When a client submits a proof, it includes version = 12 in the submission metadata. The verifier then checks that the proof's version matches the version active for the proof's evaluation context.
Activation semantics: "active at evaluation time" vs "active at submission time"
You must choose one and document it.
- Active at submission time: the version is determined when the proof is submitted.
- Active at evaluation time: the version is determined when the proof is verified or when settlement is finalized.
For many systems, "active at submission time" is easier for clients because they know which version they are targeting. If verification can be delayed, you still want the proof to carry the intended version so the verifier can evaluate consistently.
A robust approach is: proofs are evaluated using the version they declare, but the protocol enforces that the declared version is valid for the proof's submission context (e.g., submission must occur after activation and before a deprecation point).
Compatibility rules: preventing mismatched clients and proofs
Versioning is not only about parameter values; it's also about format compatibility.
A common failure mode is changing a parameter that affects scoring while a client still produces proofs under an older scoring interpretation. To prevent this, governance can require:
- A proof format version (e.g., proof_schema_version) that changes only when the proof structure changes.
- A parameter version that changes when scoring/thresholds change.
Then you can allow parameter-only updates without forcing proof schema changes.
Example compatibility constraint
- proof_schema_version must be 3 for all parameter versions >= 10.
- quality_min and reward_multiplier can change without changing schema.
If a proposal tries to activate version = 13 with an incompatible schema requirement, it fails validation.
Validation constraints: bounds, monotonicity, and safety rails
Governance should reject proposals that are technically valid but operationally dangerous.
Examples of validation checks:
- Bounds: quality_min must be in [0, 1].
- Freshness window: proof_freshness_seconds must be between 60 and 3600.
- Challenge window: challenge_window_blocks must be at least the maximum expected proof propagation delay.
- Reward sanity: reward_multiplier must not exceed a configured cap.
These checks are simple, deterministic, and easy to test.
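As a rough sketch, the checks above plus the schema compatibility constraint fit in one pure function. The ParameterProposal type, the validate function, and the concrete limits (reward cap 2.0, challenge floor of 100 blocks) are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterProposal:
    version: int
    quality_min: float
    reward_multiplier: float
    proof_freshness_seconds: int
    challenge_window_blocks: int
    proof_schema_version: int


def validate(p, *, reward_cap, min_challenge_blocks, required_schema):
    """Return a list of violations; an empty list means the proposal passes."""
    errors = []
    if not 0.0 <= p.quality_min <= 1.0:
        errors.append("quality_min must be in [0, 1]")
    if not 60 <= p.proof_freshness_seconds <= 3600:
        errors.append("proof_freshness_seconds must be between 60 and 3600")
    if p.challenge_window_blocks < min_challenge_blocks:
        errors.append("challenge_window_blocks is below the propagation-delay floor")
    if p.reward_multiplier > reward_cap:
        errors.append("reward_multiplier exceeds the configured cap")
    # Compatibility rule from the text: versions >= 10 require one schema version.
    if p.version >= 10 and p.proof_schema_version != required_schema:
        errors.append("proof_schema_version is incompatible with this version range")
    return errors


# Hypothetical limits; real values would come from governance configuration.
limits = dict(reward_cap=2.0, min_challenge_blocks=100, required_schema=3)
ok = validate(ParameterProposal(12, 0.85, 1.25, 300, 720, 3), **limits)
bad = validate(ParameterProposal(13, 1.20, 1.25, 30, 720, 4), **limits)
```

Returning every violation at once, rather than failing on the first, gives proposers a complete picture in a single validation pass.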
Dispute and challenge governance: versioned evidence rules
Disputes depend on rules too. If the challenge window or evidence requirements change, you need to ensure disputes use the rules that were in effect when the underlying settlement decision was made.
A practical rule: dispute resolution references the parameter version used for the contested settlement. That way, evidence submission and acceptance criteria are consistent.
Example
- Settlement for a task is finalized using parameter_version = 12.
- A dispute is opened later.
- The dispute contract loads parameter_version = 12 and applies the corresponding challenge_window_blocks and verification parameters.
This prevents a dispute from being decided under a different set of rules than the one that produced the original settlement.
Governance execution: multi-step and auditable
Controlled updates should be hard to do accidentally.
- Two-step execution: store the proposal hash first, then execute activation after a delay.
- Quorum and voting: require a minimum participation threshold.
- Event logs: emit events that include version id, activation block, and payload hash.
Auditing matters because operators and clients need to know what changed without reading governance internals.
Mind map: parameter governance with versioning
Concrete example: controlled update with a scoring threshold
Assume the network uses quality_min to decide whether a proof earns any reward.
- Current active version: 11 with quality_min = 0.80.
- Governance proposes version = 12 with quality_min = 0.85.
- Validation checks:
  - quality_min is within [0, 1].
  - proof_schema_version remains 3.
  - proof_freshness_seconds unchanged.
- Scheduling: activation_block = 18,000,000.
A client submits a proof at block 17,999,900 and declares parameter_version = 11. The verifier accepts it under version 11 rules.
Another client submits at block 18,000,050 and declares parameter_version = 12. The verifier accepts it under version 12 rules.
If a client tries to submit at block 18,000,050 but declares parameter_version = 11, the verifier rejects it because version 11 is no longer valid for that submission context.
This single policy (declared version must be valid for the submission context) eliminates the "which rules applied?" problem.
Concrete example: deprecating a version safely
Sometimes you want to stop accepting old versions.
- Version 11 remains valid only until deprecation_block = 18,200,000.
- After that, proofs must declare version 12 or later.
Deprecation is just another governance-controlled field in the version record. It should be enforced consistently in proof submission, verification, and dispute handling.
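The submission-context rule reduces to a range check on the version record. In this sketch the ParameterVersion type and is_valid_for_submission are hypothetical names, the activation block for version 11 (17,000,000) is invented for the example, and the deprecation block is treated as exclusive, which is an assumption the text does not settle. The first three checks model the scoring-threshold example as version 11 being deprecated exactly when version 12 activates; the last one models the grace period from the deprecation example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParameterVersion:
    version: int
    activation_block: int
    deprecation_block: Optional[int] = None  # None means not deprecated


def is_valid_for_submission(v: ParameterVersion, submission_block: int) -> bool:
    """A declared version is valid from activation until (not including) deprecation."""
    if submission_block < v.activation_block:
        return False
    if v.deprecation_block is not None and submission_block >= v.deprecation_block:
        return False
    return True


# No overlap: version 11 ends when version 12 activates.
v11 = ParameterVersion(11, activation_block=17_000_000, deprecation_block=18_000_000)
v12 = ParameterVersion(12, activation_block=18_000_000)
old_before_switch = is_valid_for_submission(v11, 17_999_900)
new_after_switch = is_valid_for_submission(v12, 18_000_050)
old_after_switch = is_valid_for_submission(v11, 18_000_050)

# Grace period: version 11 stays valid until block 18,200,000.
v11_grace = ParameterVersion(11, 17_000_000, 18_200_000)
old_in_grace = is_valid_for_submission(v11_grace, 18_000_050)
```

Whether old versions overlap with new ones is then purely a matter of which deprecation_block governance records.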
Summary of best practices for controlled parameter updates
- Version every consensus-critical parameter that can affect eligibility, scoring, verification, or settlement.
- Anchor parameter payloads with on-chain hashes and publish activation points.
- Require proofs and disputes to reference the parameter version used for evaluation.
- Enforce compatibility constraints so parameter-only changes do not break proof formats.
- Use deterministic validation bounds and timing constraints to prevent accidental misconfiguration.
- Emit clear events so clients and operators can track what changed without guesswork.
7. Off-Chain Data, Storage, and Retrieval Architecture
7.1 Data Partitioning Example Separating Raw Data From Proof Artifacts
A DePIN network usually needs two different kinds of data:
- Raw data: the original measurements or observations (e.g., sensor readings, GPS traces, images, logs).
- Proof artifacts: compact, verifiable evidence derived from the raw data (e.g., signed measurement bundles, Merkle proofs, zk proof objects, or verifier-ready transcripts).
Partitioning means you intentionally design boundaries so that raw data and proof artifacts have different storage, access, and lifecycle rules. This keeps verification fast, reduces storage costs, and limits the blast radius when something goes wrong.
Why separate raw data from proof artifacts?
Verification rarely needs the full raw payload. A verifier typically checks a small set of commitments, signatures, and proof objects. If you store everything together, you end up paying for bandwidth and storage even when only the proof is required.
Raw data has different sensitivity. Some raw inputs can be personal, location-specific, or proprietary. Proof artifacts can be designed to reveal only what verification requires.
Operationally, raw data expires sooner. You may keep raw data for debugging and audits, but you don't need it forever for routine verification. Proof artifacts should remain available for settlement and dispute resolution.
A concrete example: "Road Surface Quality" measurements
Imagine a DePIN network where nodes measure road surface quality using a phone-mounted sensor suite.
- Raw data example: a time series of accelerometer samples plus GPS points, stored as a large binary blob.
- Proof artifact example: a signed summary containing:
- a hash commitment to the raw blob,
- extracted features (e.g., averaged vibration index per segment),
- a Merkle root over per-segment feature records,
- and a verifier-ready signature from the node.
The verifier can check the signature, recompute commitments from the proof's claimed structure, and validate Merkle inclusion without downloading the entire raw blob.
Data partitioning model
Use three layers of data, each with explicit ownership and retention.
- Raw layer (R)
  - Stored off-chain, often in object storage.
  - Access is restricted (or time-limited).
  - Used for audits, reprocessing, and dispute evidence.
- Proof layer (P)
  - Stored off-chain and referenced by on-chain events.
  - Designed to be small and verifier-friendly.
  - Retained for settlement and challenges.
- Index layer (I)
  - Lightweight metadata for discovery.
  - Contains pointers, hashes, and status.
  - Often cached aggressively.
A key rule: on-chain should reference commitments, not raw payloads. The chain records what must be true; off-chain stores how to prove it.
Mind map: partitioning responsibilities and artifacts
How to link raw data to proof artifacts (without mixing them)
You need a deterministic relationship so that a proof artifact can be traced back to the raw input it was derived from.
A practical pattern is commitment-first:
- The node computes a hash of the raw payload:
- Let the raw blob be \(B\).
- Compute \(h = H(B)\).
- The node builds proof artifacts that include \(h\) and any derived structure.
- The node submits the proof artifact reference to the network.
- The verifier checks the proof artifact and confirms it corresponds to the committed raw hash.
If a dispute occurs, the challenger can request the raw blob (or a subset) and verify that its hash matches the committed value.
Example: Merkle-based segmentation
Suppose the raw time series is split into segments \(S_1, S_2, \dots, S_n\). For each segment, the node computes a feature record \(f_i\) (e.g., vibration index).
- Raw payload: \(B = \text{concat}(S_1, \dots, S_n)\)
- Segment features: \(f_i = \text{FeatureExtract}(S_i)\)
- Leaf hashes: \(\ell_i = H(f_i)\)
- Merkle root: \(r = \text{MerkleRoot}(\ell_1, \dots, \ell_n)\)
The proof artifact includes:
- \(h = H(B)\)
- \(r\)
- the node signature over \[ \text{commit} = H(h \| r \| \text{taskId} \| \text{timeWindow}) \]
The verifier checks:
- signature validity,
- that the proof artifact's Merkle root matches the claimed segment feature structure,
- and that the on-chain commitment corresponds to the same \(\text{commit}\).
Notice what the verifier does not need: the raw segments \(S_i\). It only needs the proof artifact and the commitments.
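The construction can be sketched with SHA-256 standing in for \(H\). Several choices below are assumptions rather than part of the text: leaves and inner nodes are hashed with distinct one-byte prefixes (a small deviation from the plain \(\ell_i = H(f_i)\), in the style of RFC 6962), an unpaired node is promoted unchanged, the concatenation in commit is length-prefixed, and the task ID and time window strings are placeholders. Signing is left out because it needs a signature library.

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(feature_records):
    """Merkle root r over feature records f_1..f_n, each given as bytes."""
    if not feature_records:
        raise ValueError("at least one feature record is required")
    # Leaf hashes; the 0x00 prefix separates leaves from inner nodes.
    level = [H(b"\x00" + f) for f in feature_records]
    while len(level) > 1:
        nxt = [H(b"\x01" + level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # unpaired node is promoted unchanged
        level = nxt
    return level[0]


def commitment(h: bytes, r: bytes, task_id: str, time_window: str) -> bytes:
    """commit = H(h || r || taskId || timeWindow), with length-prefixed parts."""
    parts = [h, r, task_id.encode("utf-8"), time_window.encode("utf-8")]
    return H(b"".join(len(p).to_bytes(4, "big") + p for p in parts))


# Placeholder data: three segments and their extracted feature records.
segments = [b"segment-1-samples", b"segment-2-samples", b"segment-3-samples"]
features = [b"vibration_index=0.12", b"vibration_index=0.31", b"vibration_index=0.08"]
B = b"".join(segments)
h = H(B)
r = merkle_root(features)
commit = commitment(h, r, "task-42", "2024-01-01T00:00Z/2024-01-01T01:00Z")
```

A verifier that holds only the feature records and \(h\) can recompute \(r\) and commit without ever touching the segments.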
Storage and access rules that follow from partitioning
A clean partitioning design includes explicit policies.
- Raw storage policy
  - Keep raw blobs for a limited window (e.g., until the dispute window closes).
  - Require authenticated access for challengers.
  - Log every raw retrieval for auditability.
- Proof storage policy
  - Keep proof artifacts until the settlement and dispute windows close.
  - Make proof retrieval deterministic by task ID and commitment hash.
  - Store enough data to re-run verification without needing raw blobs.
- Index policy
  - Store pointers and status flags (e.g., "proof accepted", "challenge opened").
  - Keep index updates idempotent so retries don't corrupt state.
Example workflow: from measurement to settlement
- Node measures and uploads raw blob \(B\) to raw storage.
- Node computes \(h = H(B)\) and generates proof artifact \(P\) containing \(h\), derived features, and Merkle root \(r\).
- Node submits a transaction/event referencing the proof commitment (not the raw blob).
- Verifier retrieves only \(P\) and validates it.
- Settlement uses the proof artifact commitment recorded on-chain.
- Dispute (optional): challenger requests raw blob \(B\) and checks \(H(B) = h\).
This workflow keeps routine verification lightweight while still supporting accountability.
Common pitfalls to avoid
- Accidental coupling: storing proof artifacts that implicitly require raw blobs to be present for verification.
- Non-deterministic derivations: features computed with randomness or unstable preprocessing, making it impossible to reproduce commitments.
- Missing linkage: proof artifacts that don't include a commitment to the raw payload, forcing unverifiable "trust me" behavior.
Partitioning is not just a storage decision; it's a correctness and operations decision. When raw and proof are clearly separated and linked by commitments, verification stays efficient and disputes stay grounded in checkable evidence.
7.2 Storage Options Example Object Storage With Content Addressing
Object storage is a practical fit for DePIN proof artifacts because it separates "where bytes live" from "what the bytes mean." Content addressing makes that separation safer: the storage key is derived from the content, so clients can verify they fetched the right artifact without trusting the storage provider.
What you store (and what you don't)
In a DePIN measurement flow, you typically store three categories:
- Raw evidence: sensor logs, images, audio, or device telemetry dumps. These can be large and are often not directly verifiable without additional context.
- Proof artifacts: the compact outputs used for verification (e.g., signed measurement bundles, Merkle proofs, or zk proof objects). These are usually smaller and are what verifiers need.
- Metadata: human-readable descriptions, schema versions, and pointers that help clients interpret evidence.
A good rule: store bytes you want to retrieve later, and store only the minimum metadata needed to interpret those bytes. Anything that affects verification should be either embedded in the proof artifact or referenced in a way that is itself content-addressed.
Content addressing: the core idea
With content addressing, each object is named by a hash of its content.
- Let \(h = H(bytes)\).
- The object key becomes something like sha256/<h>.
- Any client that receives \(h\) can recompute \(H(bytes)\) after download and confirm integrity.
This turns storage into a "dumb byte bucket" with strong integrity guarantees.
Mind map: storage design decisions
Example: storing a proof artifact with a manifest
A common pattern is to store a manifest that lists the content-addressed items required for verification.
- The prover creates:
  - proof.json (the verification artifact)
  - evidence-chunk-1.bin, evidence-chunk-2.bin, … (optional if you want evidence availability)
- The prover computes hashes for each object.
- The prover creates a manifest.json that includes:
  - the hashes of proof.json and any evidence chunks
  - the schema version
  - the measurement identifiers (e.g., task ID, device ID) as plain fields
- The prover signs the manifest (or signs the proof artifact that includes the manifest hash).
- The prover uploads each object to object storage under keys derived from their hashes.
A verifier then:
- downloads manifest.json by its hash (or receives it from the client)
- downloads proof.json by the hash listed in the manifest
- recomputes hashes locally to confirm integrity
- verifies the signature and then runs verification using the proof artifact
This keeps trust focused on cryptographic checks, not on storage correctness.
Example object key scheme
Use a deterministic mapping from content hash to storage key.
- sha256/<hash> for single objects
- sha256/<hash>/chunk/<index> if you store chunked evidence under a parent hash
To avoid accidental collisions between different object types that happen to share the same bytes, you can apply domain separation when hashing.
- Example: hash "proof:" + bytes for proof artifacts
- hash "evidence:" + bytes for raw evidence
That way, the same byte sequence cannot produce a key in the wrong namespace.
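A minimal sketch of the key scheme with domain separation, using an in-memory dictionary in place of a real object store. MemoryStore, object_key, and fetch_verified are hypothetical names; a real deployment would swap the dictionary for its storage client and keep the verification step.

```python
import hashlib


def object_key(kind: str, data: bytes) -> str:
    """Content address with domain separation: sha256(kind + ':' + bytes)."""
    digest = hashlib.sha256(kind.encode("ascii") + b":" + data).hexdigest()
    return "sha256/" + digest


class MemoryStore:
    """Stand-in for an untrusted object store."""

    def __init__(self):
        self.objects = {}

    def put(self, kind: str, data: bytes) -> str:
        key = object_key(kind, data)
        self.objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self.objects[key]


def fetch_verified(store, kind: str, key: str) -> bytes:
    """Download by key, then recompute the address before trusting the bytes."""
    data = store.get(key)
    if object_key(kind, data) != key:
        raise ValueError("content does not match its address")
    return data


store = MemoryStore()
proof_bytes = b'{"taskId":"task-42"}'  # placeholder proof artifact
key = store.put("proof", proof_bytes)
fetched = fetch_verified(store, "proof", key)

store.objects[key] = b"tampered"  # simulate a misbehaving storage provider
try:
    fetch_verified(store, "proof", key)
    tamper_detected = False
except ValueError:
    tamper_detected = True
```

The same bytes stored as "proof" and as "evidence" get different keys, so an object cannot be served from the wrong namespace without failing the check.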
Chunking large evidence with Merkle roots
Raw evidence can be too large to handle as one object. Chunking helps with partial retrieval and deduplication.
A practical approach:
- Split evidence into fixed-size chunks (e.g., 1 to 4 MiB).
- Compute a hash for each chunk.
- Build a Merkle tree over chunk hashes.
- Store:
  - each chunk as a content-addressed object
  - the Merkle root (as part of the manifest)
  - optionally a compact Merkle proof for the verifier's needed subset
The verifier can validate chunk integrity using the Merkle root, even if it only downloads a subset of chunks.
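Partial retrieval can be sketched as a Merkle inclusion proof per chunk. The function names and the tree shape are assumptions: leaves and inner nodes use distinct prefixes, an unpaired node is promoted unchanged, and the sketch does not bind the chunk index or total count, which a manifest would normally record alongside the root.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf(chunk: bytes) -> bytes:
    return _h(b"\x00" + chunk)


def node(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def build_levels(chunks):
    """Return all tree levels, from leaf hashes up to the single root."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = [leaf(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(node(level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def prove(levels, index):
    """Sibling path for the chunk at `index`: (hash, sibling_is_left) pairs."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(chunk: bytes, path, root: bytes) -> bool:
    acc = leaf(chunk)
    for sibling, sibling_is_left in path:
        acc = node(sibling, acc) if sibling_is_left else node(acc, sibling)
    return acc == root


chunks = [b"chunk-0", b"chunk-1", b"chunk-2", b"chunk-3", b"chunk-4"]
levels = build_levels(chunks)
root = levels[-1][0]
proof_for_2 = prove(levels, 2)
all_valid = all(verify(c, prove(levels, i), root) for i, c in enumerate(chunks))
```

A verifier that downloads only chunk 2 needs that chunk, its short sibling path, and the root from the manifest.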
Mind map: retrieval flow
Concrete example: what a manifest might contain
A manifest should be small, deterministic, and content-addressed itself.
- schemaVersion: e.g., "evidence-manifest-v1"
- taskId: the DePIN task identifier
- deviceId: the node identifier used in the protocol
- proof: { "hash": "...", "algo": "sha256", "type": "proof" }
- evidence: optional list of chunk hashes and/or a Merkle root
- createdAt: timestamp (useful for auditing, not for trust)
- nonce: helps prevent accidental reuse in logs
If you sign the manifest, include the exact serialized bytes in the signature process. That prevents "same fields, different formatting" issues.
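The formatting problem is easy to demonstrate. In this sketch, canonical_bytes is one possible convention (sorted keys, compact separators, UTF-8) and is an assumption, not a standard the text prescribes; all field values are placeholders, and signing itself is omitted because it needs a signature library.

```python
import hashlib
import json


def canonical_bytes(manifest: dict) -> bytes:
    """One fixed serialization: sorted keys, no extra whitespace, UTF-8."""
    return json.dumps(
        manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def manifest_hash(manifest_bytes: bytes) -> str:
    # Hash the exact bytes that were (or will be) signed.
    return hashlib.sha256(b"manifest:" + manifest_bytes).hexdigest()


# Placeholder manifest; the proof hash is a dummy value.
manifest = {
    "schemaVersion": "evidence-manifest-v1",
    "taskId": "task-42",
    "deviceId": "node-7",
    "proof": {"hash": "00" * 32, "algo": "sha256", "type": "proof"},
    "createdAt": "2024-01-01T00:00:00Z",
    "nonce": "b7c1",
}
signed_bytes = canonical_bytes(manifest)
pretty_bytes = json.dumps(manifest, indent=2).encode("utf-8")

same_fields = json.loads(signed_bytes) == json.loads(pretty_bytes)
same_hash = manifest_hash(signed_bytes) == manifest_hash(pretty_bytes)
reordered = dict(reversed(list(manifest.items())))
stable_under_reordering = canonical_bytes(reordered) == signed_bytes
```

Both serializations decode to the same fields but hash differently, which is why the signature should cover the stored bytes rather than a re-serialized copy.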
Integrity checks that actually matter
Content addressing gives integrity for bytes, but verification still needs to ensure meaning.
- Hash checks confirm the downloaded bytes match the expected object.
- Signature checks confirm the creator authorized the manifest/proof.
- Verification logic confirms the proof corresponds to the task and measurement rules.
If you only do hash checks, you prevent storage tampering but not a wrong proof produced by a malicious or faulty node.
Retention and garbage collection rules
Object storage can accumulate data quickly, especially with raw evidence. Define retention policies that match verification needs.
- Keep proof artifacts long enough for dispute windows and audits.
- Keep raw evidence only if your protocol requires evidence availability for challenges.
- Use reference counting or periodic sweeps based on manifest hashes that are still reachable from on-chain records.
A simple operational rule: if no manifest (or on-chain pointer) references an object hash, it can be deleted after the maximum time window during which it could be needed.
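The operational rule above amounts to a reachability sweep. A minimal sketch, assuming the caller already knows which manifest hashes are still reachable from on-chain records and which objects each manifest references (all names here are hypothetical):

```python
def deletable_objects(
    stored_at: dict[str, float],
    manifest_refs: dict[str, set[str]],
    reachable_manifests: set[str],
    now: float,
    max_window_s: float,
) -> set[str]:
    """Return object hashes that are unreferenced and older than the window.

    stored_at:           object hash -> time the object was stored (seconds)
    manifest_refs:       manifest hash -> object hashes that manifest references
    reachable_manifests: manifest hashes still pointed to by on-chain records
    """
    referenced: set[str] = set()
    for manifest in reachable_manifests:
        referenced.add(manifest)  # the manifest object itself stays
        referenced |= manifest_refs.get(manifest, set())
    return {
        obj
        for obj, stored in stored_at.items()
        if obj not in referenced and now - stored > max_window_s
    }
```

The age check matters: a freshly uploaded object may not be referenced yet because its manifest has not been anchored, so it is kept until the maximum window has passed.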
Putting it together: an end-to-end storage checklist
- Compute content hashes with domain separation.
- Store each object under a key derived from its hash.
- Use a manifest to bind proof artifacts to evidence (or to declare evidence is intentionally omitted).
- Sign the manifest or ensure the proof artifact commits to the manifest hash.
- Verify hashes after download before running verification.
- Apply retention rules based on protocol windows and manifest reachability.
This design makes storage predictable: clients can fetch by hash, verify integrity locally, and treat the storage layer as an untrusted transport rather than a trust anchor.
7.3 Retrieval and Caching Example Client-Side Caches With Integrity Checks
Client-side retrieval is where "it worked on my machine" becomes "it works in production." The goal is simple: fetch the right proof artifacts efficiently, reuse them safely, and detect tampering or mismatches before you pay or verify anything.
Core idea: cache by content, not just by name
A good client cache stores two things for each artifact: (1) the bytes (or a pointer to them) and (2) an integrity fingerprint that lets you confirm the bytes match what the protocol expects.
In practice, the protocol should provide a content identifier for each artifact. Common choices include a hash (e.g., SHA-256) or a content-addressed identifier. The client then:
- Checks whether the cache already has an entry for that identifier.
- If present, verifies the cached bytes against the expected identifier.
- If missing or invalid, downloads the artifact and verifies it before use.
This makes caching robust against stale data, partial downloads, and accidental mix-ups between similarly named files.
Mind map: retrieval and caching flow
Example: caching a proof artifact with hash verification
Assume the protocol returns a receipt that includes an expected hash for a proof artifact:
- artifact_id: sha256:...
- artifact_url: where to fetch it (could be multiple)
- artifact_type: measurement_proof_v1
A minimal client workflow looks like this:
- Cache lookup
  - Compute the cache key from artifact_id.
  - If the cache has an entry, read bytes.
  - Verify sha256(bytes) == expected_hash.
- Integrity failure handling
  - If the hash doesn't match, treat it as corrupted or mismatched.
  - Delete the cache entry.
  - Fetch again.
- Atomic write on download
  - Download to a temporary location.
  - Verify hash.
  - Move into the cache only after verification succeeds.
Here's a concrete pseudo-implementation (language-agnostic):

    function getArtifact(expectedHash, url, cache):
        key = "sha256:" + expectedHash
        entry = cache.get(key)
        if entry exists:
            bytes = entry.readBytes()
            if sha256(bytes) == expectedHash:
                return bytes
            else:
                cache.delete(key)
        bytes = download(url)
        if sha256(bytes) != expectedHash:
            raise IntegrityError("artifact hash mismatch")
        cache.putAtomic(key, bytes)
        return bytes

The key detail is that verification happens before the artifact is used, not after. If you verify after you've already fed the bytes into a parser or verifier, you've already done extra work and potentially exposed yourself to malformed inputs.
Format sanity checks: cheap checks before expensive ones
Hash verification confirms integrity, but it doesnât confirm that the artifact is the right kind of data for the current request. Add lightweight checks before full parsing:
- Confirm the artifact header matches artifact_type.
- Confirm the declared length matches the actual length.
- Confirm required fields exist (e.g., proof elements count).
These checks prevent confusing errors like "proof verification failed" when the real problem is that the receipt pointed at an intact artifact of the wrong kind (rare, but not impossible if the protocol identifiers are miswired).
Cache entry structure and eviction
A cache entry should include:
- key: content identifier (hash-based)
- bytes or path
- retrieved_at
- size
- artifact_type
Eviction can be simple and deterministic:
- Use an LRU policy with a max size (e.g., 2-5 GB).
- Keep entries keyed by content hash so eviction doesn't break correctness.
If you evict an entry, you don't lose correctness; you just lose the performance benefit.
Concurrency and idempotency: avoid duplicate downloads
When multiple client requests need the same artifact, naive code can trigger multiple downloads. Use an "in-flight" map keyed by artifact_id:
- First request starts the download.
- Subsequent requests await the same promise/future.
- Once verified, all waiters receive the same cached bytes.
This reduces network load and avoids race conditions where one download writes while another fails.
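A minimal sketch of the in-flight map using asyncio follows. The class name ArtifactFetcher and the injected download coroutine are assumptions for this example, and the cache is a plain in-memory dict rather than an on-disk store.

```python
import asyncio
import hashlib
from typing import Awaitable, Callable


class ArtifactFetcher:
    """Deduplicates concurrent fetches of the same content-addressed artifact."""

    def __init__(self, download: Callable[[str], Awaitable[bytes]]):
        self._download = download  # hypothetical async callable: artifact_id -> bytes
        self._cache: dict[str, bytes] = {}
        self._in_flight: dict[str, asyncio.Task] = {}

    async def get(self, artifact_id: str) -> bytes:
        if artifact_id in self._cache:
            return self._cache[artifact_id]
        task = self._in_flight.get(artifact_id)
        if task is None:
            # First request starts the download; later ones reuse the same task.
            task = asyncio.create_task(self._fetch_verified(artifact_id))
            self._in_flight[artifact_id] = task
        # shield() keeps one cancelled waiter from cancelling the shared download.
        return await asyncio.shield(task)

    async def _fetch_verified(self, artifact_id: str) -> bytes:
        try:
            data = await self._download(artifact_id)
            if "sha256:" + hashlib.sha256(data).hexdigest() != artifact_id:
                raise ValueError("artifact hash mismatch")
            self._cache[artifact_id] = data  # cache only verified bytes
            return data
        finally:
            self._in_flight.pop(artifact_id, None)
```

If the download or the hash check fails, every waiter sees the same exception and the in-flight entry is cleared, so a later request can retry cleanly.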
Example: request-scoped cache vs global cache
Clients often need both:
- Global cache (content-addressed):
  - Shared across requests.
  - Verified by hash.
  - Evicted by size.
- Request-scoped cache (metadata):
  - Stores mapping from a request ID to expected artifact IDs.
  - Helps avoid repeated parsing of receipts.
  - Can be short-lived.
Global cache ensures correctness and reuse. Request-scoped cache improves speed without storing large blobs.
Handling multiple URLs and partial failures
Protocols may provide multiple retrieval endpoints. A practical strategy:
- Try the first URL.
- If download fails (timeout, 404), try the next.
- If download succeeds but hash verification fails, discard bytes and try another URL.
A hash mismatch is a hard failure for the bytes that were downloaded. It's not a "maybe it's fine" situation; the client must not use them, although it may still try the remaining URLs.
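The fallback strategy above can be sketched as a loop over the provided URLs. The download callable is injected and hypothetical; the sketch assumes it raises an OSError subclass (which covers timeouts and connection errors in the Python standard library) when a fetch fails.

```python
import hashlib
from typing import Callable, Iterable


class IntegrityError(Exception):
    """Raised when no endpoint returned bytes matching the expected hash."""


def fetch_from_any(
    expected_hash: str,
    urls: Iterable[str],
    download: Callable[[str], bytes],
) -> bytes:
    errors = []
    for url in urls:
        try:
            data = download(url)
        except OSError as exc:
            # Transport failure: move on to the next endpoint.
            errors.append(f"{url}: download failed ({exc})")
            continue
        if hashlib.sha256(data).hexdigest() != expected_hash:
            # Bytes are discarded and never returned to the caller.
            errors.append(f"{url}: hash mismatch")
            continue
        return data
    raise IntegrityError("; ".join(errors) or "no URLs provided")
```

Collecting a reason per URL makes the final error explain whether the endpoints were unreachable or served wrong bytes, which are very different operational problems.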
Putting it together: a full retrieval checklist
When the client needs an artifact:
- Read expected artifact_id (hash/content ID) from the receipt.
- Look up in global cache by that identifier.
- If hit, verify hash before parsing.
- If miss, download from one of the provided URLs.
- Verify hash before any parsing or verification.
- Run format sanity checks.
- Only then pass the artifact to the proof verifier.
- Write to cache atomically after successful verification.
This approach keeps caching fast while ensuring that every artifact used by the verifier is exactly the one the protocol described.
7.4 Data Retention and Deletion Policies Example Compliance-Friendly Design
A DePIN network usually produces three kinds of data: (1) raw measurements (often large), (2) proof artifacts (smaller but still sensitive), and (3) audit records (metadata that helps reconcile payments and disputes). Retention and deletion policies should treat these categories differently, because they have different legal exposure and different technical roles.
Start with a data inventory that matches system behavior
Before writing any policy, list each data item and answer four questions: purpose, owner, storage location, and deletion trigger. "Owner" means who can authorize retention changes, not who physically stores bytes.
Example inventory for a measurement workflow:
- Node registration metadata (identity claims, public keys): purpose = membership control; owner = governance module; storage = on-chain + off-chain index; deletion trigger = key revocation + grace period.
- Raw measurement payload (e.g., sensor readings): purpose = proof generation and dispute evidence; owner = client/operator; storage = object store; deletion trigger = after proof finality + dispute window.
- Proof artifact (e.g., signed measurement + commitment): purpose = verification and settlement; owner = protocol verifier; storage = off-chain + hash anchored on-chain; deletion trigger = after settlement finality (or earlier if policy allows).
- Audit event log (e.g., "proof accepted", "reward paid"): purpose = reconciliation; owner = protocol; storage = on-chain events + off-chain index; deletion trigger = none for on-chain events; off-chain index can be pruned.
A policy that ignores "who can delete what" tends to fail in practice, because storage systems and indexes often outlive the data they reference.
Use retention windows tied to protocol finality
Deletion triggers should be expressed in terms of protocol milestones, not calendar dates. Calendar-based rules are easy to misunderstand when disputes or reorg-like events affect when something becomes final.
A practical pattern:
- Proof submission phase: keep raw measurement and proof artifacts for a short window.
- Challenge/dispute window: extend retention to cover evidence submission.
- Settlement finality: after finality, you can delete raw measurement while keeping minimal proof metadata.
Concrete example:
- Dispute window = 14 days.
- After a proof is accepted, raw measurement is retained for 14 days.
- After settlement finality, delete raw measurement, but keep:
- the proof artifact needed for verification replay (or just its hash),
- the on-chain event references,
- any dispute-resolution record required for accounting.
This design keeps deletion aligned with what the system might need later.
Separate "delete bytes" from "keep hashes"
Many teams accidentally treat deletion as an all-or-nothing switch. In DePIN, you can often delete large payloads while preserving small commitments.
A compliance-friendly approach:
- Delete raw payloads (sensor readings, attachments).
- Keep commitments/hashes that allow auditors to verify that a deleted payload corresponded to a specific proof.
- Keep minimal derived metadata required for settlement correctness.
Example: if a proof includes a Merkle root of raw samples, you can delete the sample leaves but keep the Merkle root (and the proof structure needed to verify membership) depending on your verification requirements.
Define deletion scope: full, partial, and redacted
Not all data can be deleted in the same way.
Use three scopes:
- Full deletion: remove the object and all replicas.
- Partial deletion: remove specific fields (e.g., location tags) while keeping the rest.
- Redaction: replace content with a placeholder while preserving structure for referential integrity.
Example policy for a measurement payload that includes both a reading and a GPS coordinate:
- If the coordinate is not required after proof finality, store it separately.
- Delete the coordinate object after the dispute window.
- Keep the reading and its commitment so the proof remains verifiable.
This avoids breaking verification pipelines that expect a stable object shape.
Make deletion enforceable with lifecycle automation
A policy written in prose is not a deletion mechanism. Implement lifecycle automation at the storage layer and at the application layer.
Storage-layer controls (object store lifecycle rules):
- Set expiration for raw payload buckets based on "accepted_at + dispute_window".
- Use separate buckets for raw payloads vs proof artifacts.
Application-layer controls (index and cache cleanup):
- Remove database rows that reference deleted objects.
- Invalidate caches that might still serve stale payloads.
A simple rule of thumb: if a system can still return the deleted payload via an API, it is not deleted.
Handle backups and replicas explicitly
Backups are where deletion plans go to die. Decide whether backups are included in deletion scope.
A compliance-friendly stance:
- If your policy requires deletion, ensure backups either expire quickly enough or are excluded from the deletion scope with a documented retention limit.
- For replicas, ensure lifecycle rules apply to all storage classes.
Example:
- Raw payloads: primary storage expires 14 days after proof acceptance (the dispute window used in the earlier example).
- Backups: expire in 30 days; during that period, the system should not serve the payload, even if it still exists in backup media.
This keeps user-facing behavior consistent with the deletion policy.
Document exceptions and "cannot delete" categories
Some data cannot be deleted without breaking correctness or violating immutable ledgers.
Common exception categories:
- On-chain events: typically immutable; you can prune off-chain indexes but not the chain history.
- Accounting-critical records: if settlement requires a record for audit, keep the minimal record needed.
- Legal hold: if a dispute is under investigation, pause deletion for the specific case.
Example exception handling:
- If a dispute is opened before the raw payload expiration time, extend retention for that case until the dispute resolves.
- After resolution, resume deletion for the remaining items.
Mind map: data retention and deletion policies in DePIN
Example: a concrete policy for a measurement-to-settlement flow
Assume:
- Dispute window: 14 days.
- Settlement finality: after N confirmations.
Policy:
- When a client submits a measurement, store raw payload in raw_measurements/ with metadata accepted_at.
- Store proof artifact in proof_artifacts/ with metadata settlement_finality_at.
- When the proof is accepted, schedule raw payload deletion at accepted_at + 14 days.
- If a dispute is opened, cancel the scheduled deletion for that case and reschedule to dispute_resolved_at + 1 day.
- After settlement finality, delete proof artifacts that are not required for verification replay, but keep:
  - proof hash,
  - verification parameters needed to interpret it,
  - on-chain event references.
- Prune off-chain indexes that map proof IDs to deleted payload locations.
- Ensure APIs return only what is still retained; if a payload is deleted, the API returns a "not available" status rather than a stale copy.
This policy is compliance-friendly because it ties deletion to operational milestones, minimizes retained content, and prevents accidental re-exposure through indexes or caches.
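The scheduling part of this policy can be sketched as a pure function that computes when a raw payload may be deleted. The constant names are hypothetical, and one detail is an assumption of this sketch rather than part of the policy text: if a dispute resolves early, the payload is still kept until the original 14-day window ends.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DISPUTE_WINDOW = timedelta(days=14)
POST_DISPUTE_GRACE = timedelta(days=1)


def raw_payload_delete_at(
    accepted_at: datetime,
    dispute_opened_at: Optional[datetime] = None,
    dispute_resolved_at: Optional[datetime] = None,
) -> Optional[datetime]:
    """Return the deletion time, or None while an open dispute holds the payload."""
    default = accepted_at + DISPUTE_WINDOW
    # No dispute, or one opened after the payload already expired: default applies.
    if dispute_opened_at is None or dispute_opened_at >= default:
        return default
    if dispute_resolved_at is None:
        return None  # dispute still open: deletion is on hold
    # Assumption: never delete earlier than the original window.
    return max(default, dispute_resolved_at + POST_DISPUTE_GRACE)


if __name__ == "__main__":
    accepted = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(raw_payload_delete_at(accepted))
```

Keeping this as a pure function of protocol timestamps makes the policy easy to test and lets the same logic drive both storage lifecycle rules and index cleanup.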
Implementation checklist (short and practical)
- Maintain a data inventory with purpose, owner, location, and deletion trigger.
- Express retention in protocol milestones (accepted, disputed, final).
- Delete raw payloads; keep only minimal commitments/hashes when verification still needs them.
- Implement lifecycle automation in storage and cleanup in application indexes.
- Define backup/replica behavior so user-facing deletion is consistent.
- Document exceptions: immutable on-chain data, legal holds, and accounting-critical minimal records.
- Add tests that confirm deleted payloads cannot be retrieved via any API path.
7.5 Integrity Verification Example Hashing, Signatures, and Proof Links
Integrity verification is the boring part that saves you from exciting bugs. In a DePIN pipeline, you typically have three integrity problems: (1) the data wasn't changed, (2) the proof wasn't forged, and (3) the proof can be tied back to the exact data and request it claims to cover.
What you are protecting
- Raw measurement data (files, logs, sensor readings).
- Proof artifacts (signed statements, Merkle roots, zk proofs, verification receipts).
- Links between them (the "this proof corresponds to that data and that task" relationship).
A good design makes each link checkable with minimal trust.
Hashing: make content addressable
Use hashing to create stable fingerprints.
- Hash the canonical bytes, not a JSON pretty-print. Canonicalization prevents "same content, different formatting" issues.
- Prefer domain-separated hashes so you don't accidentally treat a measurement hash as a proof hash.
A practical pattern:
- h_data = H("depin:data:v1" || canonical_bytes(data))
- h_proof = H("depin:proof:v1" || canonical_bytes(proof_payload))
Then you can store or transmit h_data and h_proof without moving the full payload.
Example: hashing a measurement bundle
Suppose a node submits a bundle containing:
- timestamp
- device_id
- measurements[]
You serialize the bundle in a canonical way (fixed key order, no whitespace significance), then compute:
h_data = SHA-256("depin:data:v1" || canonical_bundle_bytes)
If a single number changes, h_data changes. That's the point.
Signatures: prove authorship and intent
Hashing tells you "what," signatures tell you "who and what they agreed to."
- Sign a message built from hashes, not the entire payload. This keeps the signed message small and means each payload is canonicalized exactly once, when it is hashed.
- Include context fields in the signed message so the signature canât be replayed for a different task.
A typical signed message structure:
- task_id
- node_id
- h_data
- h_proof
- nonce (or challenge_id)
- expiry (optional, but useful)
Then:
sig = Sign(node_private_key, canonical(task_id, node_id, h_data, h_proof, nonce, expiry))
Example: signing a proof receipt
A node produces a proof payload P and computes h_proof = H("depin:proof:v1" || P_bytes). It also computes h_data from the measurement bundle. The node signs a receipt:
- Receipt = { task_id, node_id, h_data, h_proof, nonce }
- sig = Sign(node_key, canonical(Receipt))
A verifier checks:
- sig is valid for node_id's public key.
- h_data matches the data it fetched (or the commitment it was given).
- h_proof matches the proof payload it received.
Proof links: tie everything together
Hash and signature checks are necessary, but not sufficient, because you also need to ensure the proof is linked to the correct data and request.
A "proof link" is a set of identifiers and commitments that let the verifier reconstruct the claimed relationships.
Common link components:
- Task identifier: the exact request the node responded to.
- Data commitment: h_data or a Merkle root committing to the raw data.
- Proof commitment: h_proof or a commitment to the proof payload.
- Verification context: parameters used to interpret the proof (measurement window, units, thresholds).
Example: Merkle root as a data commitment
If the measurement bundle contains many samples, you can commit to them with a Merkle tree:
- Leaves: L_i = H("depin:leaf:v1" || canonical(sample_i))
- Root: root = MerkleRoot(L_i)
Then the node's signed receipt includes root instead of a hash of the entire bundle. The verifier can request only the samples needed for verification and check Merkle inclusion proofs.
Mind map: integrity verification components
End-to-end example: from request to verified receipt
Assume a client creates a task:
- task_id = 0xabc...
- It specifies a measurement window and expected units.
The node responds with:
- data_payload (or a pointer plus Merkle proofs)
- proof_payload
- receipt = { task_id, node_id, h_data, h_proof, nonce }
- sig
The verifier does:
- Recompute h_data from the received data (or from the Merkle root plus inclusion proofs).
- Recompute h_proof from the received proof payload.
- Check receipt link fields: receipt.task_id equals the task being verified, and receipt.node_id matches the claimed node.
- Verify signature: Verify(node_public_key, canonical(receipt), sig).
- Final consistency: ensure receipt.h_data equals recomputed h_data, and receipt.h_proof equals recomputed h_proof.
If any step fails, the verifier rejects the submission. That rejection is deterministic and explainable: "hash mismatch," "signature invalid," or "task_id mismatch."
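The verifier steps above can be sketched as a single function that returns either "ok" or a rejection reason. This is an illustrative sketch under stated assumptions: canonicalization uses sorted-key compact JSON, the helper names are hypothetical, and signature verification is injected as a callable because the text does not fix a signature scheme. In practice that callable would wrap the node's public-key verification.

```python
import hashlib
import json
from typing import Any, Callable


def canonical(obj: Any) -> bytes:
    """Assumed canonical form: sorted keys, compact separators, UTF-8."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def domain_hash(domain: str, payload: bytes) -> str:
    """Domain-separated SHA-256, e.g. H("depin:data:v1" || payload)."""
    return hashlib.sha256(domain.encode("utf-8") + payload).hexdigest()


def verify_submission(
    task_id: str,
    node_id: str,
    data: Any,
    proof_payload: bytes,
    receipt: dict,
    sig: bytes,
    verify_sig: Callable[[bytes, bytes], bool],
) -> str:
    """Return "ok" or a short, deterministic rejection reason."""
    # Link fields: the receipt must name this task and this node.
    if receipt.get("task_id") != task_id:
        return "task_id mismatch"
    if receipt.get("node_id") != node_id:
        return "node_id mismatch"
    # Signature over the canonical receipt bytes.
    if not verify_sig(canonical(receipt), sig):
        return "signature invalid"
    # Recompute both commitments from what was actually received.
    if receipt.get("h_data") != domain_hash("depin:data:v1", canonical(data)):
        return "hash mismatch: data"
    if receipt.get("h_proof") != domain_hash("depin:proof:v1", proof_payload):
        return "hash mismatch: proof"
    return "ok"
```

The checks run cheapest first, which differs slightly from the order listed above but accepts and rejects exactly the same submissions, since every check must pass.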
Practical notes that prevent common mistakes
- Canonicalization must be shared between signer and verifier. If you can't guarantee it, you'll get signature failures that look like cryptography problems but are actually formatting problems.
- Domain separation strings should be constant and versioned. When you change the structure, bump the version so old hashes don't accidentally validate under new rules.
- Never sign a mutable object without hashing it first. If the signature covers a structure that can be serialized multiple ways, you'll eventually sign the wrong bytes.
- Include task context in the signed receipt. Otherwise, a valid signature can be replayed for a different task that happens to use the same data hash.
Minimal verification checklist
- Canonicalize data and proof payloads.
- Compute h_data and h_proof with domain-separated hashing.
- Confirm receipt.task_id and receipt.node_id match the verification context.
- Verify signature over canonical receipt fields.
- If using Merkle commitments, verify inclusion proofs and recompute the root.
- Ensure receipt hashes match recomputed hashes.
When these checks pass, you have a tight, checkable chain: the proof is authored by the node, it corresponds to the exact data commitment, and it is bound to the exact task being verified.
8. Networking, Transport, and Node Communication Protocols
8.1 Transport Choices: Example HTTP, gRPC, and Message Queues
A DePIN node network moves three kinds of information: requests (client → network), results (node → client/aggregator), and evidence (proof artifacts, logs, and receipts). Transport is the "plumbing" that determines how reliably those messages arrive, how quickly they can be processed, and how much operational work you inherit.
What to decide before picking a transport
- Message shape: Are payloads small and structured (IDs, measurements), or large and blob-like (proof artifacts)?
- Interaction pattern: Do you need request/response, streaming, or asynchronous delivery?
- Delivery guarantees: Is "at least once" acceptable, or do you require "exactly once" semantics?
- Backpressure needs: Can the system slow down senders when verifiers or storage are busy?
- Operational footprint: How many moving parts can you run while keeping debugging sane?
A practical rule: start with the simplest transport that matches the interaction pattern, then add complexity only where it solves a concrete pain.
Mind map: transport selection for DePIN
HTTP: simple, inspectable, and good for "control plane" traffic
HTTP shines when you want straightforward endpoints and easy debugging.
Example use in DePIN: a client submits a job and polls status.
- POST /jobs creates a job record.
- GET /jobs/{jobId} returns current state.
- POST /jobs/{jobId}/evidence uploads proof artifacts (or a pointer to them).
Concrete example request (small payload):
- Client sends:
  - jobId
  - nodeId
  - taskType (e.g., measure_temperature)
  - parameters (e.g., sensor location, time window)
  - idempotencyKey to prevent duplicate job creation
Why idempotency matters with HTTP: networks retry. If the client times out after the server processed the request, a retry can create a duplicate job unless the server deduplicates by idempotencyKey.
Server-side pattern:
- Store idempotencyKey → jobId for a retention window.
- If the same key arrives again, return the original jobId and status.
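The server-side pattern above can be sketched with an in-memory map. The class name, the job ID format, and the default retention value are assumptions for this example; a real service would back this with a database table and a unique constraint on the key.

```python
import time
from typing import Callable


class IdempotentJobStore:
    """Maps idempotency keys to job IDs for a bounded retention window."""

    def __init__(self, retention_s: float = 86400.0, clock: Callable[[], float] = time.monotonic):
        self._retention_s = retention_s
        self._clock = clock
        self._by_key: dict[str, tuple[str, float]] = {}  # key -> (job_id, created_at)
        self._next_id = 0

    def create_job(self, idempotency_key: str) -> tuple[str, bool]:
        """Return (job_id, created). `created` is False for a deduplicated retry."""
        now = self._clock()
        hit = self._by_key.get(idempotency_key)
        if hit is not None and now - hit[1] <= self._retention_s:
            return hit[0], False  # same key within the window: return original job
        self._next_id += 1
        job_id = f"job-{self._next_id}"
        self._by_key[idempotency_key] = (job_id, now)
        return job_id, True
```

A client that times out and retries POST /jobs with the same key gets the original job ID back, so the retry is safe even if the first request was processed.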
Operational note: HTTP logs are readable, and you can correlate requests using a Correlation-Id header. That makes incident response less of a scavenger hunt.
gRPC: typed contracts and deadlines for "data plane" calls
gRPC is useful when you want strict message schemas and predictable behavior between services.
Example use in DePIN: node operators and verifiers communicate using typed RPCs.
- SubmitMeasurement(MeasurementRequest) returns (MeasurementAck)
- StreamVerificationResults(VerificationRequest) returns (stream VerificationResult) (optional)
Concrete example: a verifier asks a node for a measurement.
- Verifier sends:
  - requestId
  - nodeId
  - measurementSpec (what to measure, acceptable bounds)
  - freshnessWindow (how old the measurement can be)
- Node responds:
  - requestId
  - measurementValue
  - measurementProofHash
  - timestamp
Deadlines reduce "stuck work": gRPC supports timeouts and cancellation. If a node can't produce a measurement within the deadline, the call fails with an error, and the verifier can mark the attempt failed without waiting for a long TCP timeout.
Retry strategy: gRPC retries are not magic. You still need:
- Idempotency keys for submission calls.
- Deduplication on the receiver side.
- State transitions that tolerate repeated attempts (e.g., PENDING → RECEIVED only once).
When not to use gRPC: if your payloads are huge proof blobs, you'll likely send pointers to object storage instead of streaming large binaries through RPC.
Message queues: buffering, decoupling, and scaling verifiers
Message queues are the right tool when work arrives faster than it can be processed, or when you want producers and consumers to evolve independently.
Example use in DePIN: task dispatch and verification pipeline.
- Client submits a job via HTTP.
- The system enqueues verification tasks.
- Verifier workers consume tasks, produce results, and publish completion events.
Concrete example flow:
- POST /jobs creates a job and returns jobId.
- The server publishes VerificationTask(jobId, nodeId, taskType, parameters) to a queue.
- Workers consume tasks and call nodes (via gRPC or HTTP).
- Workers publish VerificationCompleted(jobId, resultHash, status).
- A settlement component reads completion events and updates on-chain state.
Why queues help:
- They absorb bursts without dropping requests.
- They let you scale consumers horizontally.
- They separate "accepting work" from "doing work."
Delivery semantics and deduplication: many queues provide at-least-once delivery. That means a worker might process the same task twice. To keep accounting correct:
- Use a deterministic taskId (e.g., hash of jobId + nodeId + taskType + timeWindow).
- Store taskId → completionStatus with a unique constraint.
- If a duplicate arrives, return the stored completion instead of recomputing.
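A minimal sketch of the deterministic taskId and the "complete once" store follows. The length-prefixed encoding is an assumption added for this example: plain string concatenation would let different field combinations collide (for instance "ab" + "c" and "a" + "bc"). The in-memory dict stands in for a database table with a unique constraint.

```python
import hashlib
from typing import Callable


def task_id(job_id: str, node_id: str, task_type: str, time_window: str) -> str:
    """Derive a deterministic task ID from the fields that define the work."""
    h = hashlib.sha256()
    for field in (job_id, node_id, task_type, time_window):
        raw = field.encode("utf-8")
        # Length prefix keeps field boundaries unambiguous.
        h.update(len(raw).to_bytes(4, "big"))
        h.update(raw)
    return h.hexdigest()


class CompletionStore:
    """Remembers the first completion per task ID and replays it for duplicates."""

    def __init__(self) -> None:
        self._done: dict[str, str] = {}

    def complete_once(self, tid: str, compute: Callable[[], str]) -> str:
        if tid not in self._done:
            self._done[tid] = compute()  # runs only for the first delivery
        return self._done[tid]
```

With this in place, an at-least-once queue can redeliver a task without the worker recomputing the result or the accounting layer counting it twice.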
Choosing between them: a decision table
| Need | HTTP | gRPC | Message Queue |
|---|---|---|---|
| Simple API endpoints | ✅ | ⚠️ (more setup) | ❌ |
| Typed request/response | ⚠️ (manual schema) | ✅ | ❌ |
| Streaming results | ⚠️ | ✅ | ⚠️ (often indirect) |
| Buffering under load | ❌ | ❌ | ✅ |
| Decoupling producers/consumers | ❌ | ⚠️ | ✅ |
| Large payload handling | ⚠️ (use pointers) | ⚠️ (use pointers) | ✅ (store pointers in messages) |
A common integrated approach is: HTTP for external-facing control, gRPC for internal service calls, and queues for asynchronous pipeline steps.
Unified reliability patterns across transports
Regardless of transport, the network needs consistent handling for retries, ordering, and traceability.
- Idempotency keys: every "create" or "submit" action should accept a key.
- Correlation IDs: propagate a Correlation-Id from client to node to verifier to settlement.
- Explicit state machines: represent job/task states so repeated messages don't break invariants.
- Payload pointers: send hashes and storage locations instead of large blobs through RPC/HTTP.
End-to-end mini example: measurement submission and verification
Scenario: a client requests a node to measure a parameter, then waits for verification.
- Client → Coordinator (HTTP): POST /jobs with idempotencyKey.
- Coordinator → Queue: publish VerificationTask.
- Worker → Node (gRPC): SubmitMeasurement with requestId and deadline.
- Worker → Storage: upload proof artifact; compute proofHash.
- Worker → Queue: publish VerificationCompleted(jobId, proofHash, status).
- Settlement (HTTP or internal call): reads completion, verifies it matches expected spec, then finalizes.
This split keeps each transport doing what it's good at: HTTP for straightforward orchestration, gRPC for typed internal calls with deadlines, and queues for buffering and scaling the verification workload.
8.2 Peer Communication Patterns: Example Task Assignment and Result Submission
Peer communication in a DePIN network is mostly about two things: assigning work in a way that can be retried safely, and submitting results in a way that can be verified without guessing. A good pattern makes failures boring.
Mind map: peer communication patterns
Task assignment: the worker should know exactly what to do
A coordinator typically sends a "task envelope" to an operator peer. The envelope should include everything needed to produce a verifiable result, plus enough metadata to prevent accidental duplicates.
Task envelope fields (practical set):
- task_id: unique identifier for the work item.
- job_id: groups tasks for one client request.
- assignment_id: unique per dispatch attempt (useful for retries).
- parameters: measurement target, time window, expected units, and any constraints.
- freshness: a nonce and an allowed valid_until timestamp.
- deadline: when the coordinator stops waiting.
- dedupe_key: stable key used to detect duplicate submissions.
- callback: where the operator should submit results (or which message topic).
Why these fields matter:
- task_id ties results to a specific verification rule.
- assignment_id lets you distinguish "same task, different attempt."
- dedupe_key prevents the coordinator from counting the same result twice.
- freshness ensures the operator can't reuse an old proof.
Example: assigning a measurement task
Suppose a client requests "measure site A's temperature at 12:00-12:05 UTC and report a proof." The coordinator creates:
- task_id = meas-7f3a...
- job_id = req-91c2...
- parameters = { site: "A", metric: "temperature", window: [12:00, 12:05], unit: "C" }
- freshness = { nonce: "n-42", valid_until: 12:06:00 }
- deadline = 12:06:30
- dedupe_key = sha256(task_id || nonce)
The coordinator then selects eligible operators. Eligibility can be simple: operator must be registered for site A and have a recent liveness heartbeat.
Assignment policy: pick peers without making verification harder
You can choose operators using any policy, but the policy should not change the verification logic. That means the verification rules should depend only on the task envelope, not on which operator happened to be chosen.
Common policies:
- Round-robin: stable and easy to reason about.
- Weighted by capacity: use operator-reported performance metrics, but keep the task envelope identical.
- Random selection with constraints: reduces hotspots; still deterministic verification.
Concrete example:
- Operator O1 and O2 are eligible for site A.
- The coordinator dispatches the task twice: once to O1 and once to O2, each with the same task_id but a different assignment_id.
- The verifier later accepts results based on quality thresholds and freshness, not on which operator was picked.
Result submission: make duplicates harmless
Operators submit results back to the coordinator (or directly to a verifier service). The result envelope should be structured so the coordinator can:
- authenticate the operator,
- check freshness,
- verify integrity,
- dedupe,
- route to verification.
Result envelope fields (practical set):
- task_id
- assignment_id
- dedupe_key
- operator_id
- proof: measurement proof artifact (could be a signature over readings, a Merkle commitment, or a structured proof object)
- proof_hash: hash of the proof payload
- measurement_metadata: units, sampling interval, device identifiers (as permitted)
- timestamp: when the operator created the result
- signature: operator signature over the envelope
Why proof_hash helps: it lets you validate integrity before spending time on full verification. If the hash doesn't match, you can reject quickly.
Example: operator submits a result
Operator O1 receives the task envelope for task_id = meas-7f3a... with nonce = n-42. O1 produces:
- proof containing the signed measurement record and any required commitments.
- proof_hash = sha256(proof_bytes).
- timestamp = 12:03:10.
O1 signs the result envelope fields (including task_id and nonce via dedupe_key) and submits it.
Coordinator handling: accept, reject, or ignore duplicates
A coordinator should treat submission as a state machine keyed by task_id and dedupe_key.
Suggested coordinator logic:
- If dedupe_key already exists for this task_id, ignore as duplicate.
- Else if timestamp > valid_until, reject as stale.
- Else verify the signature and proof integrity (proof_hash); reject if either check fails.
- If both checks pass, forward to the verifier for deeper checks (quality, bounds, consistency).
Concrete example of duplicate handling:
- A network hiccup causes O1's submission to be retried.
- The second submission has the same dedupe_key.
- The coordinator ignores it and returns the same acknowledgement status.
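The handling rules above can be sketched as a small in-memory state machine. This is an illustration under assumptions: signature verification is passed in as a boolean to keep the sketch self-contained, the ack strings are hypothetical, rejected submissions are not recorded (so a corrected resubmission is evaluated again), and a real coordinator would keep this state in durable storage.

```python
from dataclasses import dataclass, field
import hashlib


@dataclass
class Coordinator:
    valid_until: dict                           # task_id -> latest acceptable timestamp
    acks: dict = field(default_factory=dict)    # (task_id, dedupe_key) -> ack already sent
    queue: list = field(default_factory=list)   # work handed to the verifier

    def on_result(self, env: dict, signature_ok: bool) -> str:
        key = (env["task_id"], env["dedupe_key"])
        if key in self.acks:
            return self.acks[key]  # duplicate: same ack, no new work
        if env["timestamp"] > self.valid_until[env["task_id"]]:
            return "ack_rejected:stale"
        if not signature_ok:
            return "ack_rejected:bad_signature"
        if hashlib.sha256(env["proof"]).hexdigest() != env["proof_hash"]:
            return "ack_rejected:integrity"
        self.queue.append(key)
        self.acks[key] = "ack_accepted"
        return "ack_accepted"
```

Returning the stored ack for a duplicate means a retried submission sees exactly the response the first attempt would have seen.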
Retries and timeouts: design for "at least once" delivery
Most real systems end up with at-least-once delivery semantics. That's fine if your dedupe and freshness rules are correct.
Operator retry behavior:
- If no acknowledgement arrives before ack_deadline, the operator resends the same result with the same dedupe_key.
- The operator does not regenerate a new proof unless the task envelope changes.
Coordinator retry behavior:
- If a task is not completed by deadline, the coordinator marks it unfulfilled and may reassign it to another eligible operator.
- The new assignment uses a new assignment_id but the same task_id and freshness rules (or a refreshed nonce, depending on your design).
Communication transport: keep it simple and explicit
You can implement peer communication over HTTP/gRPC or message queues. The key is to keep the semantics clear:
- Request/response for task dispatch and acknowledgement.
- Async submission for results, with correlation IDs.
Example message flow (single task):
- Coordinator sends task envelope to O1.
- O1 replies with ack_received (optional but useful).
- O1 submits result envelope.
- Coordinator replies with ack_accepted or ack_rejected.
Observability: correlation IDs prevent guesswork
Every message should carry a correlation_id (often derived from job_id + task_id). This makes it possible to trace:
- dispatch latency,
- proof submission latency,
- rejection reasons,
- duplicate counts.
Example metrics to record:
- task_dispatch_latency_ms (coordinator side)
- result_submission_latency_ms (operator side)
- result_reject_rate{reason}
- duplicate_submission_rate
Minimal end-to-end pseudocode (illustrative)
Coordinator:
    create task_envelope(task_id, nonce, deadline, dedupe_key)
    choose operator(s) where eligible(site, liveness)
    send task_envelope to operator
    on result_envelope:
        if seen(task_id, dedupe_key): return ACK_DUPLICATE
        if result.timestamp > valid_until: return ACK_STALE
        if not verify signature(operator_id, envelope): return ACK_BAD_SIGNATURE
        if sha256(proof) != proof_hash: return ACK_INTEGRITY_FAIL
        record (task_id, dedupe_key) as seen
        enqueue for verification(task_id, proof)
        return ACK_ACCEPTED
Operator:
    on task_envelope:
        ensure nonce is fresh and within valid_until
        produce proof and proof_hash
        sign result_envelope
        submit result_envelope
    on no ack before ack_deadline:
        resubmit same signed result_envelope
Summary of the pattern
A reliable peer communication design uses (1) a task envelope with explicit freshness and dedupe keys, (2) a result envelope that is signed and integrity-checkable, and (3) coordinator logic that treats duplicates as normal rather than exceptional. Once those pieces are in place, task assignment and result submission become predictable, even when the network isn't.
8.3 Reliability and Retries Example Idempotency Keys for Safe Replays
In a DePIN network, retries are inevitable: a client request times out, a node's response arrives late, or a verifier job fails after partially completing. The goal is simple: if the same logical request is submitted more than once, the system should produce the same externally visible outcome.
Idempotency keys are the standard tool for that. They let you treat "repeat submission" as "same operation," so you don't double-pay, double-count, or double-issue proofs.
Why retries break things (and how idempotency fixes it)
Consider a client that asks an operator to verify a measurement and then settle rewards.
Without idempotency:
- The client submits request R1.
- The operator processes it, but the response is lost.
- The client retries and submits R1 again.
- The operator processes it again, producing two proof records.
- Settlement sees two valid-looking outcomes and pays twice.
With idempotency:
- The client includes an idempotency key K with the request.
- The operator stores the result keyed by K.
- On retry, the operator returns the stored result instead of re-running the workflow.
Idempotency key design: what it should cover
An idempotency key should represent the intent of a request, not just the transport attempt.
A practical rule: the key should be derived from the fields that define the outcome.
For a "submit measurement for verification" request, a good key includes:
- networkId (prevents cross-network collisions)
- clientId (optional but helpful for debugging)
- requestType (e.g., MEASUREMENT_VERIFY)
- measurementHash (hash of the measurement payload)
- proofSpecVersion (prevents mixing incompatible verification rules)
- callbackTarget (optional; include if the response is routed to a specific endpoint)
A bad key includes:
- a random nonce generated per attempt (that defeats idempotency)
- timestamps (same intent becomes different keys)
Mind map: where idempotency lives
Example: request/response flow with idempotency
Assume the client sends:
- measurementPayload
- proofSpecVersion
- idempotencyKey
The operator's handler does:
- Compute or validate the idempotency key.
- Look up K in a durable store.
- If K is completed, return the stored result.
- If K is in progress, return a status pointer (or a deterministic "try again" response).
- If K is new, create an in-progress record and start the job.
Concrete example payload
Client computes:
- measurementHash = SHA256(measurementPayload)
- idempotencyKey = SHA256(networkId || requestType || clientId || measurementHash || proofSpecVersion)
If the client times out after 2 seconds, it retries with the same idempotencyKey.
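One way the client-side computation might look, as a sketch. The `||` in the formula is read here as byte concatenation; the 4-byte length prefix on each field is an assumption added so that adjacent fields cannot run together and collide. The function names are hypothetical.

```python
import hashlib


def _field(value: str) -> bytes:
    # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
    raw = value.encode("utf-8")
    return len(raw).to_bytes(4, "big") + raw


def measurement_hash(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


def idempotency_key(network_id: str, request_type: str, client_id: str,
                    payload: bytes, proof_spec_version: str) -> str:
    parts = [network_id, request_type, client_id,
             measurement_hash(payload), proof_spec_version]
    return hashlib.sha256(b"".join(_field(p) for p in parts)).hexdigest()
```

The key contains no timestamp or per-attempt nonce, so a retry of the same payload under the same spec version always produces the same key.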
Operator behavior:
- First attempt: creates record K as IN_PROGRESS and starts verification.
- Retry: sees IN_PROGRESS and returns status=IN_PROGRESS with a jobId.
- When verification finishes: record K becomes COMPLETED with proofId and verificationOutcome.
- Any further retries return the same COMPLETED data.
Handling the "race": two retries arrive at once
Idempotency must survive concurrent submissions. Two identical requests can hit two different operator instances.
To avoid double work, the operator needs an atomic "create-if-absent" step:
- If record K does not exist, create it as IN_PROGRESS in one transaction.
- If it already exists, do not start a new job.
A simple pattern is:
- Use a unique constraint on idempotencyKey in the durable store.
- Insert K with status IN_PROGRESS.
- If the insert fails due to uniqueness, read the existing record and follow its status.
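The unique-constraint pattern can be sketched with SQLite from the Python standard library. The table and column names are hypothetical, and a production deployment would more likely use a shared database that all operator instances can reach; the point is only that the database, not the application, decides which caller wins.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS idempotency ("
        " idem_key TEXT PRIMARY KEY,"   # the unique constraint
        " status TEXT NOT NULL,"
        " job_id TEXT)"
    )
    return conn


def insert_if_absent(conn: sqlite3.Connection, key: str) -> bool:
    """Return True if this caller created the record, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO idempotency (idem_key, status) VALUES (?, 'IN_PROGRESS')",
                (key,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def get_status(conn: sqlite3.Connection, key: str):
    row = conn.execute(
        "SELECT status FROM idempotency WHERE idem_key = ?", (key,)
    ).fetchone()
    return row[0] if row else None
```

Only the caller that gets True from insert_if_absent should start the verification job; every other caller reads the existing record and follows its status.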
Storage model: what you store for each key
Store enough information to answer the client without re-running the job.
A minimal durable record for K:
- status: IN_PROGRESS | COMPLETED | FAILED
- createdAt
- completedAt (optional)
- jobId (optional)
- result: proofId, verificationOutcome, and any fields needed to proceed to settlement
- error: normalized error code and message (for FAILED)
Important nuance: if the job fails, you must decide whether retries should:
- return the same failure (strict idempotency), or
- allow reprocessing after a backoff window (soft idempotency).
For accounting safety, strict idempotency is usually safer: the same key yields the same outcome.
Example: operator handler logic (pseudocode)
function handleVerify(request):
    K = request.idempotencyKey
    rec = store.get(K)
    if rec does not exist:
        // Atomic create-if-absent: only one caller can create the record for K
        created = store.insertIfAbsent(K, status=IN_PROGRESS)
        if created:
            jobId = startVerificationJob(request)
            store.update(K, {jobId: jobId})
            return {status: IN_PROGRESS, jobId: jobId}
        // Lost the race: another instance created K first, so follow its record
        rec = store.get(K)
    if rec.status == COMPLETED:
        return rec.result
    if rec.status == IN_PROGRESS:
        // jobId may still be empty until the winning instance records it
        return {status: IN_PROGRESS, jobId: rec.jobId}
    if rec.status == FAILED:
        return {status: FAILED, error: rec.error}
Client behavior: retries that don't cause chaos
The client should:
- reuse the same idempotencyKey for retries of the same logical operation
- treat IN_PROGRESS as a signal to poll or wait, not to submit a new operation
- only generate a new key when the logical intent changes (different measurement, different spec version, different network)
A concrete polling loop:
- Submit verify request with key K.
- If response is IN_PROGRESS, poll jobStatus(jobId).
- If response is COMPLETED, proceed to settlement using the returned proofId.
- If response is FAILED, stop and surface the error.
Settlement safety: idempotency doesn't end at verification
Even if verification is idempotent, settlement can still double-pay if it keys off "proof received" events without guarding duplicates.
A robust approach is to apply idempotency again at settlement:
- settlement request includes settlementId derived from (proofId, clientId, networkId)
- settlement contract or service ensures each settlementId is executed once
This creates a clean chain:
- verification is idempotent by K
- settlement is idempotent by settlementId
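A sketch of the settlement-side guard, under stated assumptions: settlementId is computed as a SHA-256 over the three IDs joined with a separator (the text only says it is derived from them), the class and method names are hypothetical, and the in-memory dictionaries stand in for what would need to be durable storage or contract state.

```python
import hashlib


def settlement_id(proof_id: str, client_id: str, network_id: str) -> str:
    # Assumption: the IDs never contain "|", so the separator is unambiguous.
    return hashlib.sha256(f"{proof_id}|{client_id}|{network_id}".encode()).hexdigest()


class SettlementService:
    def __init__(self):
        self.executed = {}   # settlement_id -> amount paid
        self.balances = {}   # operator -> total paid

    def settle(self, sid: str, operator: str, amount: int) -> int:
        if sid in self.executed:
            return self.executed[sid]   # replay: report the original outcome
        self.balances[operator] = self.balances.get(operator, 0) + amount
        self.executed[sid] = amount
        return amount
```

A duplicated "proof received" event then maps to the same settlementId and leaves the operator's balance unchanged.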
Common mistakes to avoid
- Key includes attempt-specific fields: you'll never get a cache hit.
- Key scope too broad: two different intents collide and you return the wrong result.
- No durable storage: in-memory idempotency disappears on restart.
- Re-running on IN_PROGRESS: concurrency turns retries into parallel duplicates.
- Settlement keyed only by "proof exists": you need a unique execution identifier.
Quick checklist
- Idempotency key is derived from outcome-defining fields.
- Operator stores IN_PROGRESS and COMPLETED results durably.
- Insert-if-absent (unique constraint) prevents concurrent duplicates.
- Client reuses the same key for retries of the same intent.
- Settlement has its own idempotency guard.
With these pieces in place, retries become a reliability feature rather than a source of accounting surprises. The system may receive and look up the same request several times internally, but each logical operation is verified, counted, and paid once.
8.4 Security Controls Example Mutual TLS and Signed Requests
Mutual TLS (mTLS) and signed requests solve different problems, so the clean design is to use both. mTLS authenticates the network peer at the transport layer. Signed requests authenticate the specific application action and protect the payload from tampering, even if it passes through intermediaries.
Why mTLS first
With mTLS, every node has a certificate issued by your internal certificate authority (CA). When a client connects to a verifier or operator endpoint, the TLS handshake verifies:
- The client certificate is valid (not expired, not revoked).
- The client certificate chains to the CA you trust.
- The server certificate is valid for the hostname you connected to.
This gives you a strong baseline: you can reject unauthorized peers before they can even send a request body.
Concrete example: task submission endpoint
Suppose a client submits a measurement task to an operator at POST /v1/tasks. With mTLS enabled:
- The client presents its certificate during the handshake.
- The server checks the certificate's subject or SAN (Subject Alternative Name) against an allowlist.
- Only then does the server read the HTTP request.
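If the certificate is missing or fails validation and the server requires client certificates, the TLS handshake itself fails, so no HTTP status is ever returned. If the certificate is valid but its identity is not on the allowlist, return 403 Forbidden. This keeps your application logic focused on correctness, not identity plumbing.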
How to map certificates to identities
Certificates are not identities by themselves; you need a deterministic mapping.
A practical approach:
- Use a stable identifier in the certificate, such as URI:spiffe://depin/nodes/<nodeId> in the SAN.
- Treat <nodeId> as the node identity used across logs, metrics, and authorization rules.
Mind map: mTLS identity and authorization
Signed requests: protecting the action
mTLS authenticates the peer, but it does not guarantee that the request you received is the exact one the sender intended. Signed requests cover:
- Payload integrity (the body wasn't modified).
- Request intent (the method + path + key headers match what was signed).
- Freshness (the request can't be replayed indefinitely).
A common pattern is HTTP signatures over a canonical string.
Canonical string to sign
Define a canonical representation that includes:
- HTTP method (e.g., POST).
- Request path (e.g., /v1/tasks).
- A timestamp (e.g., X-Timestamp).
- A nonce (e.g., X-Nonce).
- A hash of the request body (e.g., sha256(body)).
- Key identifier (e.g., X-Key-Id).
Then compute signature = Sign(privateKey, canonicalString).
Concrete example: signing a proof submission
A verifier endpoint might be POST /v1/proofs/submit. The operator sends:
- Headers:
  - X-Key-Id: node-42
  - X-Timestamp: 2026-03-24T10:15:30Z
  - X-Nonce: 9f3a...
  - Content-Type: application/json
- Body:
  { "taskId": "t-123", "proof": { ... }, "quality": 0.92 }
The operator computes bodyHash = sha256(bodyBytes) and signs:
POST\n/v1/proofs/submit\nX-Timestamp:...\nX-Nonce:...\nX-Key-Id:...\nbodyHash:...
The server verifies:
- The signature matches the public key associated with X-Key-Id.
- The timestamp is within an allowed window (for example, 2 to 5 minutes).
- The nonce hasn't been used before for that key (store recent nonces or use a replay cache).
- The body hash matches the received body bytes.
If any check fails, return 401 Unauthorized for signature problems or 400 Bad Request for malformed headers.
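The signing and verification steps above can be sketched as follows. The important assumption is that HMAC-SHA256 with a shared secret stands in for Sign(privateKey, canonicalString), because the Python standard library has no asymmetric signature primitive; a real deployment would verify against the public key registered for X-Key-Id. The function names, the 5-minute window, and the returned status strings are also illustrative.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)  # assumed freshness window


def canonical_string(method, path, timestamp, nonce, key_id, body: bytes) -> str:
    body_hash = hashlib.sha256(body).hexdigest()  # hash of the raw body bytes
    return "\n".join([
        method,
        path,
        f"X-Timestamp:{timestamp}",
        f"X-Nonce:{nonce}",
        f"X-Key-Id:{key_id}",
        f"bodyHash:{body_hash}",
    ])


def sign(secret: bytes, canonical: str) -> str:
    # HMAC stand-in for Sign(privateKey, canonicalString).
    return hmac.new(secret, canonical.encode(), hashlib.sha256).hexdigest()


def verify(method, path, headers, body, signature, secrets, seen_nonces, now) -> str:
    try:
        key_id = headers["X-Key-Id"]
        ts_raw = headers["X-Timestamp"]
        nonce = headers["X-Nonce"]
        ts = datetime.strptime(ts_raw, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    except (KeyError, ValueError):
        return "400 malformed_headers"
    if abs(now - ts) > MAX_SKEW:
        return "401 timestamp_out_of_window"
    if (key_id, nonce) in seen_nonces:
        return "401 nonce_replay"
    secret = secrets.get(key_id)
    if secret is None:
        return "401 bad_signature"
    # The body hash is recomputed from the received bytes, so a modified body
    # changes the canonical string and the signature no longer matches.
    expected = sign(secret, canonical_string(method, path, ts_raw, nonce, key_id, body))
    if not hmac.compare_digest(expected, signature):
        return "401 bad_signature"
    seen_nonces.add((key_id, nonce))  # record only after the signature checks out
    return "200 ok"
```

Recording the nonce only after the signature verifies prevents an unauthenticated caller from filling the replay cache with nonces for someone else's key.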
Mind map: signed request verification
Combining mTLS and signatures without redundancy
It's tempting to sign everything and skip mTLS, or to rely on mTLS and skip signatures. Using both is reasonable if you keep responsibilities clear:
- mTLS: authenticate the connection peer and reduce the number of unauthenticated requests reaching your app.
- Signed requests: bind the exact action and payload to a cryptographic signature with freshness.
A clean rule of thumb:
- If the request can be replayed or altered in transit, signatures matter.
- If the server needs to reject unknown peers early, mTLS matters.
Header and canonicalization details that prevent bugs
Canonicalization is where many systems accidentally sign different bytes than they verify.
Best practices:
- Use the raw request body bytes when hashing, not a re-serialized JSON string.
- Normalize line endings in the canonical string (e.g., \n only).
- Treat header values as exact strings after trimming only the outer whitespace.
- Include the request path exactly as received (no automatic rewriting).
Concrete example: JSON body hashing
If the client sends JSON with different key ordering, the byte-level hash changes. That's fine if both sides hash the raw bytes. If you want order-insensitive signing, you must define a canonical JSON encoding and apply it consistently on both sides. Otherwise, keep it byte-based and document it.
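A short illustration of the difference. The canonical form used here (sorted keys, compact separators, UTF-8) is an assumption for demonstration and is not a complete canonicalization scheme such as RFC 8785; it only shows why both sides must agree on one encoding.

```python
import hashlib
import json


def raw_hash(body: bytes) -> str:
    return hashlib.sha256(body).hexdigest()


def canonical_json_hash(body: bytes) -> str:
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    obj = json.loads(body)
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = b'{"taskId": "t-123", "quality": 0.92}'
b = b'{"quality": 0.92, "taskId": "t-123"}'
print(raw_hash(a) == raw_hash(b))                        # False
print(canonical_json_hash(a) == canonical_json_hash(b))  # True
```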
Authorization rules tied to identity
After mTLS identifies the peer, you still need authorization.
Example policy:
- Node role operator can call:
  - POST /v1/tasks
  - POST /v1/proofs/submit
- Node role verifier can call:
  - POST /v1/verification/requests
You can derive role from certificate metadata (e.g., SAN plus a role claim in an internal registry) or from a signed authorization record stored in your system. The key point is that authorization decisions should be deterministic and testable.
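A sketch of a deterministic, testable authorization check built from the example policy. It assumes the TLS layer has already validated the certificate and handed over the SAN URI value (without the "URI:" type label); the ROLES registry contents and the function names are hypothetical.

```python
from typing import Optional

SPIFFE_PREFIX = "spiffe://depin/nodes/"

# Hypothetical internal registry: node ID -> role.
ROLES = {"node-42": "operator", "node-7": "verifier"}

# Example policy: role -> allowed (method, path) pairs.
POLICY = {
    "operator": {("POST", "/v1/tasks"), ("POST", "/v1/proofs/submit")},
    "verifier": {("POST", "/v1/verification/requests")},
}


def node_id_from_san(uri: str) -> Optional[str]:
    if not uri.startswith(SPIFFE_PREFIX):
        return None
    node_id = uri[len(SPIFFE_PREFIX):]
    # Reject empty IDs and anything that carries extra path segments.
    return node_id if node_id and "/" not in node_id else None


def is_authorized(san_uri: str, method: str, path: str) -> bool:
    node_id = node_id_from_san(san_uri)
    role = ROLES.get(node_id) if node_id else None
    return role is not None and (method, path) in POLICY[role]
```

Because the decision is a pure function of the SAN URI, method, and path, it can be covered by table-driven unit tests without a TLS stack.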
Operational checks: revocation and rotation
mTLS requires a plan for certificate lifecycle.
- Prefer short-lived certificates if your environment supports it; it reduces revocation complexity.
- If you use revocation lists, ensure your servers refresh them on a schedule.
- Rotate signing keys used for request signatures independently from TLS certificates.
Concrete example: rotation without downtime
During rotation, allow both the current and previous signing public keys for a limited overlap window. The server can accept signatures that validate against either key while the client migrates.
Failure modes and what to log
When verification fails, log enough to diagnose without leaking secrets.
Recommended log fields:
- peerNodeId (from certificate)
- endpoint
- failureReason (e.g., bad_signature, nonce_replay, timestamp_out_of_window)
- keyId (from header)
- requestId (a correlation id you generate)
Avoid logging the raw signature or private-key-related material.
Minimal verification checklist
For a signed request handler behind mTLS:
- TLS handshake succeeded and peer identity extracted.
- Authorization check passed for the endpoint.
- Required headers present (X-Key-Id, X-Timestamp, X-Nonce).
- Timestamp within window.
- Nonce not seen before for that key.
- Body hash matches received bytes.
- Signature verifies against the public key for X-Key-Id.
- Proceed to application logic.
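This order runs the cheap checks (header presence, timestamp window, nonce lookup) before the body hash and signature verification, so most malformed or stale requests are rejected before any expensive cryptographic work. It also makes the system's behavior predictable under load.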
8.5 Observability for Networking: Example Tracing and Correlation IDs
Networking bugs are often "invisible" until you look at the right signals. Observability for DePIN networking should answer three questions quickly: Where did this request go? What did it do? Why did it fail? The practical tool for the first question is distributed tracing plus a consistent correlation ID that survives hops across clients, nodes, and verifiers.
What to instrument (and what not to)
Instrument at boundaries where behavior changes:
- Client → Node API call (request accepted, queued, processed)
- Node → Verifier call (proof submitted, verification started, verification result)
- Node → Storage call (artifact stored/retrieved, integrity check passed/failed)
- Node → Chain/Settlement call (event emitted, transaction submitted, receipt confirmed)
Avoid instrumenting every internal function. If you trace too granularly, you'll drown in spans and lose the ability to answer "where did it go?"
Correlation IDs: the rule of one ID per logical request
Use a correlation ID that:
- is generated once at the start of a logical workflow (e.g., "submit proof for job X")
- is propagated in headers and in logs
- is included in metrics labels only when cardinality is controlled (usually not per request)
A simple convention:
- X-Correlation-Id: UUIDv4 string
- X-Request-Id: optional per-hop ID for debugging retries
- traceparent: tracing header used by your tracing system
Even if you use OpenTelemetry (or similar), keep X-Correlation-Id because it's easy to grep in logs and works across systems that don't share tracing context.
Mind map: observability signals and how they connect
Example: end-to-end tracing with correlation IDs
Assume a client submits a job to a node. The node forwards verification to a verifier service, then stores the result and returns a response.
1) Client side: generate and propagate IDs
- Generate X-Correlation-Id once.
- Start a trace span for the client request.
- Send both correlation ID and tracing context.
Example request headers:
- X-Correlation-Id: 3f2c1b7a-8b1e-4b8f-9b2a-1c0d2a6f4a11
- traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
2) Node side: create child spans and keep the correlation ID
When the node receives the request:
- Extract X-Correlation-Id.
- Create a span for "node processing".
- Create child spans for "call verifier" and "store artifact".
- Log every outcome with the same correlation ID.
Span attributes that help later:
- job.id
- node.id
- verifier.id
- proof.type
- network.endpoint
- attempt (retry count)
3) Verifier side: respond with the same correlation ID
The verifier service should:
- extract X-Correlation-Id
- create spans for "verification"
- return a response that includes the correlation ID (or at least logs it)
This matters because failures often occur in the verifier, but the client only sees a timeout. With correlation IDs, you can prove whether the verifier was never called, called but slow, or called and rejected.
Correlation ID propagation checklist
Use this checklist during implementation:
- Client generates X-Correlation-Id exactly once per logical workflow
- Every outgoing request includes X-Correlation-Id
- Every log line includes correlationId
- Every span includes correlationId as an attribute (or via context)
- Retry logic preserves the original X-Correlation-Id
- Error responses include X-Correlation-Id so the caller can report it
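The propagation rules in this checklist can be sketched as two small helpers. In practice a tracing library such as OpenTelemetry would manage the traceparent header; it is built by hand here only to show which parts stay fixed and which change per hop. The function names are hypothetical.

```python
import secrets
import uuid


def new_workflow_headers() -> dict:
    """Headers for the first request of a logical workflow."""
    trace_id = secrets.token_hex(16)   # 32 hex characters
    span_id = secrets.token_hex(8)     # 16 hex characters
    return {
        "X-Correlation-Id": str(uuid.uuid4()),
        "X-Request-Id": str(uuid.uuid4()),
        "traceparent": f"00-{trace_id}-{span_id}-01",
    }


def next_hop_headers(incoming: dict) -> dict:
    """Headers for a retry or a downstream call made while handling `incoming`."""
    # Keep the correlation ID and trace ID; mint a new request ID and span ID.
    version, trace_id, _parent, flags = incoming["traceparent"].split("-")
    return {
        "X-Correlation-Id": incoming["X-Correlation-Id"],
        "X-Request-Id": str(uuid.uuid4()),
        "traceparent": f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}",
    }
```

Routing retries through next_hop_headers rather than new_workflow_headers is what keeps a retry path from accidentally generating a new correlation ID.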
Logs: make them searchable and consistent
A good log line includes:
- correlation ID
- job ID
- node ID
- endpoint
- outcome (success/failure)
- error category (timeout, invalid input, upstream rejected, etc.)
Example structured log fields (conceptual):
- correlationId: 3f2c1b7a-8b1e-4b8f-9b2a-1c0d2a6f4a11
- jobId: job_1842
- nodeId: node_7
- stage: verifier_call
- attempt: 2
- errorType: upstream_timeout
- durationMs: 7420
If you only log "request failed", you'll spend time guessing. If you log "verifier_call failed with upstream_timeout after 7420ms", you can move on.
Metrics: connect tracing to operational signals
Tracing shows one request's story. Metrics show how often it happens.
Recommended metrics for networking observability:
- request_duration_seconds{endpoint, outcome}: histogram
- upstream_call_duration_seconds{upstream, outcome}: histogram
- request_errors_total{endpoint, errorType}: counter
- inflight_requests{endpoint}: gauge
- queue_depth{queueName}: gauge (if you have queues)
Keep metric labels bounded. Do not label metrics with correlationId or jobId because that creates unbounded cardinality.
Debug workflow: a concrete incident walkthrough
Suppose clients report intermittent timeouts when submitting proofs.
1. Start with correlation ID
   - Ask a client for X-Correlation-Id from the error response.
   - Search logs for that ID.
2. Follow the trace timeline
   - Look at the span for "node processing".
   - Check child spans: "call verifier" and "store artifact".
3. Classify the failure stage
   - If "call verifier" span ends with upstream_timeout, the verifier path is slow or unreachable.
   - If "store artifact" fails with integrity_mismatch, the issue is data integrity, not networking.
   - If the node never creates the "call verifier" span, the request likely failed earlier (validation, admission control, or queue rejection).
4. Confirm with metrics
   - Check upstream_call_duration_seconds for the verifier upstream.
   - Check request_errors_total for timeout on the client-facing endpoint.
5. Use retry attempt data
   - If attempt 1 fails quickly and attempt 2 succeeds, the system may be experiencing transient network issues.
   - If all attempts fail similarly, you likely have a persistent upstream problem or a misconfiguration.
Example: minimal tracing span structure
Below is a compact example of how spans might be named and attributed. (Names should match your actual endpoints and stages.)
Span: client.submitProof
  attrs: job.id, node.id, proof.type
  child: node.processRequest
    attrs: node.id
    child: verifier.verifyProof
      attrs: verifier.id, proof.type
    child: storage.putArtifact
      attrs: artifact.hash
Practical rules for correlation IDs and tracing together
- Correlation ID answers: "Which logical request is this?"
- Trace spans answer: "What happened inside it, step by step?"
- Logs answer: "What did the system decide at each step?"
- Metrics answer: "How widespread is this behavior?"
When all four agree, debugging becomes straightforward. When they don't, you've found the missing instrumentation or the broken propagation path, usually a header not forwarded on one hop or a retry path that accidentally generates a new correlation ID.
9. Smart Contract and Protocol Module Design
9.1 Contract Boundaries: Example Splitting Registry, Rewards, and Disputes
A DePIN network usually needs three kinds of on-chain behavior: admitting and tracking nodes (registry), paying for work (rewards), and resolving disagreements (disputes). If you mix these into one contract, you get tangled permissions, hard-to-audit state transitions, and "one bug breaks everything" risk. Splitting boundaries keeps each module small enough to reason about and test.
Design goal: separate responsibilities by state and authority
A good boundary is not just "different files." It is a clear separation of:
- State ownership: which contract is the source of truth for a given piece of data.
- Authority: which contract is allowed to change that state.
- Lifecycle: what transitions are valid and who triggers them.
Below is a practical split into Registry, Rewards, and Disputes.
Mind map: module boundaries and flows
Registry contract: membership and eligibility
The Registry contract should answer questions like: "Is this node allowed to earn rewards right now?" and "Who is the operator for this nodeId?" It should not compute rewards or run dispute logic.
Minimal state (example):
- mapping(bytes32 => address) operatorOf;
- mapping(bytes32 => uint64) lastHeartbeat;
- uint64 heartbeatTimeout;
- mapping(bytes32 => bool) active;
Key functions:
- register(nodeId, operator, metadataHash)
- heartbeat(nodeId)
- deactivate(nodeId)
- isActive(nodeId)
Example eligibility rule: a node is active if block.timestamp - lastHeartbeat[nodeId] <= heartbeatTimeout and active[nodeId] == true.
Why this boundary matters: Rewards should not need to know how heartbeats work. It only needs a boolean eligibility check.
Rewards contract: accounting and settlement
The Rewards contract should focus on turning verified work into balances, and then paying those balances. It should not decide whether a proof is valid; that decision belongs to verification and dispute resolution.
Minimal state (example):
- mapping(address => uint256) accruedToOperator; (keyed by operator address, since accrue() receives the operator as an address)
- mapping(bytes32 => bool) measurementSettled;
- uint64 disputeWindow;
- address registry;
- address disputeManager;
Key functions:
- accrue(measurementId, nodeId, operator, amount, quality)
- claim()
- markSettled(measurementId)
Important boundary rule: accrue() should only accept inputs that are already "verification-ready." In practice, that means the caller is a verifier contract, or the function requires a verifier signature.
Dispute-aware settlement: rewards should not be final immediately. A common pattern is:
- accrue balances immediately (or store pending amounts),
- allow disputes during disputeWindow,
- only then mark measurement as settled and finalize the accounting.
Example flow:
- Verifier submits accrue(measurementId, ...).
- Rewards stores pending credit for the operator.
- After disputeWindow ends, Rewards allows markSettled(measurementId).
- Operator can claim() once settled.
Disputes contract: challenge and final outcome
The Disputes contract should own the "who wins" decision for a measurement. It should not manage membership or payment balances directly.
Minimal state (example):
- struct Dispute { bytes32 measurementId; address challenger; uint64 openedAt; bool resolved; bool accepted; }
- mapping(bytes32 => Dispute) disputes;
- address rewards;
Key functions:
- challenge(measurementId, evidenceHash)
- submitEvidence(measurementId, evidence) (optional)
- resolve(measurementId, accepted)
Effect boundary: when a dispute resolves, Disputes should call back into Rewards with a narrow instruction like:
onDisputeResolved(measurementId, accepted)
This keeps Rewards in control of balances while Disputes controls the decision.
Cross-contract interfaces: keep them narrow
A boundary is only real if the interface is small. Use explicit functions that encode intent.
Example interface set:
- Registry exposes: isActive(nodeId) -> bool
- Rewards exposes: onDisputeResolved(measurementId, accepted)
- Disputes exposes: isResolved(measurementId) -> bool (optional)
Concrete example: measurement lifecycle with boundaries
Assume a measurement is identified by measurementId = keccak256(epoch, nodeId, taskId, proofHash).
- Registry: operator registers and stays active via heartbeats.
- Verification: a verifier contract checks the proof format and measurement rules.
- Rewards: accrue() stores pending credit for the operator tied to measurementId.
- Dispute window: anyone can call challenge(measurementId, evidenceHash).
- Disputes: a resolution rule (committee, verifier re-check, or evidence-based logic) sets accepted.
- Rewards: onDisputeResolved() either finalizes the pending credit or cancels it.
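Before writing contract code, it can help to model this lifecycle off-chain. The sketch below is a plain Python model, not contract code. It assumes that accepted=True means the measurement stands and the pending credit is finalized, that time is an integer supplied by the caller, and that the class and method names are hypothetical.

```python
class RewardsModel:
    """Off-chain model of pending credit, dispute outcome, and settlement."""

    def __init__(self, dispute_window: int):
        self.dispute_window = dispute_window
        self.pending = {}      # measurement_id -> (operator, amount, accrued_at)
        self.claimable = {}    # operator -> finalized amount
        self.disputed = set()  # measurement IDs with an open dispute
        self.seen = set()      # every measurement ID ever accrued

    def accrue(self, measurement_id, operator, amount, now):
        if measurement_id in self.seen:
            raise ValueError("already accrued")
        self.seen.add(measurement_id)
        self.pending[measurement_id] = (operator, amount, now)

    def challenge(self, measurement_id, now):
        _, _, accrued_at = self.pending[measurement_id]
        if now > accrued_at + self.dispute_window:
            raise ValueError("dispute window closed")
        self.disputed.add(measurement_id)

    def on_dispute_resolved(self, measurement_id, accepted: bool):
        self.disputed.discard(measurement_id)
        operator, amount, _ = self.pending.pop(measurement_id)
        if accepted:  # measurement stands: finalize; otherwise the credit is cancelled
            self._credit(operator, amount)

    def mark_settled(self, measurement_id, now):
        operator, amount, accrued_at = self.pending[measurement_id]
        if measurement_id in self.disputed or now <= accrued_at + self.dispute_window:
            raise ValueError("not settleable yet")
        del self.pending[measurement_id]
        self._credit(operator, amount)

    def _credit(self, operator, amount):
        self.claimable[operator] = self.claimable.get(operator, 0) + amount
```

The model makes the boundary visible: the dispute path only reports a boolean outcome, and all balance changes happen inside the rewards object.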
Example contract skeletons (illustrative)
// Registry: membership + eligibility
contract Registry {
    mapping(bytes32 => address) public operatorOf;
    mapping(bytes32 => uint64) public lastHeartbeat;
    mapping(bytes32 => bool) public active;
    uint64 public heartbeatTimeout;
    function register(bytes32 nodeId, address operator, bytes32 metadataHash) external {}
    function heartbeat(bytes32 nodeId) external {}
    function deactivate(bytes32 nodeId) external {}
    function isActive(bytes32 nodeId) public view returns (bool) {
        return active[nodeId] && (block.timestamp - lastHeartbeat[nodeId] <= heartbeatTimeout);
    }
}
// Rewards: accounting + settlement
contract Rewards {
    address public registry;
    address public disputeManager;
    uint64 public disputeWindow;
    mapping(bytes32 => uint256) public pendingByMeasurement;
    // Keyed by operator address, matching the operator argument of accrue()
    mapping(address => uint256) public accruedToOperator;
    mapping(bytes32 => bool) public settled;
    function accrue(bytes32 measurementId, bytes32 nodeId, address operator, uint256 amount) external {}
    function markSettled(bytes32 measurementId) external {}
    function claim() external {}
    function onDisputeResolved(bytes32 measurementId, bool accepted) external {
        require(msg.sender == disputeManager, "only dispute manager");
        if (!accepted) { /* cancel pending */ }
    }
}
// Disputes: challenge + resolution
contract Disputes {
    address public rewards;
    struct Dispute { address challenger; uint64 openedAt; bool resolved; bool accepted; }
    mapping(bytes32 => Dispute) public disputes;
    function challenge(bytes32 measurementId, bytes32 evidenceHash) external {}
    function resolve(bytes32 measurementId, bool accepted) external {}
    function _notifyRewards(bytes32 measurementId, bool accepted) internal {
        Rewards(rewards).onDisputeResolved(measurementId, accepted);
    }
}
Boundary checks that prevent common mistakes
- Rewards must not trust Disputes blindly: onDisputeResolved should verify that the dispute exists and is resolved.
- Disputes must not touch balances directly: it should only instruct Rewards, not compute operator amounts.
- Registry must not depend on reward timing: membership eligibility should be independent of dispute windows.
- Single source of truth: if active[nodeId] lives in Registry, Rewards should call registry.isActive(nodeId) rather than duplicating logic.
Testing strategy aligned with boundaries
Write tests per module with cross-contract mocks:
- Registry tests: heartbeat timeout, deactivate behavior, and isActive correctness.
- Rewards tests: accrual, dispute window behavior, and claim eligibility.
- Dispute tests: challenge opening, resolution state transitions, and the callback into Rewards.
This approach makes failures easier to interpret. If a claim fails, you know whether the issue is membership eligibility (Registry), settlement timing (Rewards), or dispute outcome (Disputes).
9.2 Upgrade and Versioning Strategy Example Immutable Logic With Config Updates
A DePIN network usually has two kinds of change requests: changes to rules (logic) and changes to parameters (configuration). A practical upgrade strategy keeps the rules stable and moves most change into configuration. That separation reduces the number of times you must touch consensus-critical code.
Core principle: immutable logic, mutable configuration
Immutable logic means the on-chain contract code that enforces accounting, eligibility, and settlement rules is deployed once (or only changes through a tightly controlled migration). Mutable configuration means the contract reads versioned parameters from storage that can be updated by governance.
A good mental model: logic answers "how to compute," while configuration answers "what values to use." If you keep "how" stable, you can upgrade "what" without rewriting the rules.
Versioning model: protocol version + config version
Use two version numbers:
- Protocol version: identifies the set of invariants and rule semantics the contract code enforces.
- Config version: identifies the active parameter set (thresholds, time windows, reward weights, allowed measurement formats).
When governance updates parameters, it increments config version while keeping protocol version unchanged. When you must change invariants, you deploy a new logic contract (new protocol version) and migrate configuration to the new contract.
On-chain data layout pattern
Keep configuration in a dedicated structure keyed by config version. The contract stores:
- activeConfigVersion
- configs[version] containing parameter values
- protocolVersion constant (or stored once at deployment)
Example parameters that belong in config:
- minimum quality score
- challenge window duration
- reward multipliers per service tier
- allowed proof schema IDs
- maximum accepted measurement age
Example parameters that should not be in config (because they change invariants):
- switching from "pay per verified measurement" to "pay per operator uptime"
- changing the meaning of a settlement event
- altering the accounting formula in a way that breaks existing invariants
Upgrade flow: config update without logic change
- Governance proposes a new config bundle.
- The contract validates structural constraints (types, bounds, and compatibility checks).
- The contract stores the bundle under configs[newVersion].
- The contract updates activeConfigVersion.
- New requests use the new config; old requests keep their config version recorded at creation.
That last step is the quiet hero: freeze the config version per request so settlement remains consistent even if governance changes parameters mid-flight.
Compatibility checks: make invalid configs fail early
A config update should be rejected if it would break existing request semantics. Typical checks:
- Schema compatibility: the allowed proof schema IDs must include the schema used by in-flight requests.
- Time window sanity: challenge window must be > 0 and less than a maximum bound.
- Reward bounds: multipliers must keep rewards within safe numeric ranges.
- Eligibility invariants: if eligibility depends on thresholds, ensure thresholds do not invert ordering assumptions.
Even if you can't prove full correctness on-chain, you can enforce "no obvious foot-guns."
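The checks above can be sketched as one validation function that returns reason codes. This is a minimal Python model rather than contract code. The field names follow the bundle fields used in this section, while validate_config, the reason codes, and the two bounds are hypothetical values chosen for illustration. The eligibility-ordering check is omitted because it depends on how thresholds are defined in a given network.

```python
from dataclasses import dataclass, field

# Hypothetical bounds, chosen only for illustration.
MAX_CHALLENGE_WINDOW_SECONDS = 7 * 24 * 3600
SCALE = 10**6                    # fixed-point scale for multipliers
MAX_MULTIPLIER_FP = 10 * SCALE   # cap multipliers at 10x


@dataclass
class ConfigBundle:
    version: int
    protocol_version: int
    challenge_window_seconds: int
    reward_multipliers_fp: dict = field(default_factory=dict)
    proof_schema_ids: set = field(default_factory=set)


def validate_config(bundle, protocol_version, in_flight_schema_ids):
    """Return a list of reason codes; an empty list means the bundle passes."""
    errors = []
    if bundle.protocol_version != protocol_version:
        errors.append("PROTOCOL_VERSION_MISMATCH")
    if not 0 < bundle.challenge_window_seconds <= MAX_CHALLENGE_WINDOW_SECONDS:
        errors.append("CHALLENGE_WINDOW_OUT_OF_BOUNDS")
    multipliers = bundle.reward_multipliers_fp.values()
    if any(not 0 <= m <= MAX_MULTIPLIER_FP for m in multipliers):
        errors.append("MULTIPLIER_OUT_OF_BOUNDS")
    # Schema compatibility: every schema used by in-flight jobs must remain allowed.
    if not set(in_flight_schema_ids) <= set(bundle.proof_schema_ids):
        errors.append("SCHEMA_REMOVED_WHILE_IN_USE")
    return errors


good_bundle = ConfigBundle(8, 1, 3600, {"basic": SCALE}, {"s1", "s2"})
bad_bundle = ConfigBundle(9, 2, 0, {"basic": 11 * SCALE}, {"s2"})
print(validate_config(good_bundle, 1, {"s1"}))
print(validate_config(bad_bundle, 1, {"s1"}))
```

Returning every failed check, rather than stopping at the first, gives governance a complete list to fix in one proposal round.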
Example: request-scoped config version
When a client submits a job, store configVersionAtCreation in the job record. Settlement reads that value, not activeConfigVersion.
This prevents a common failure mode: governance updates parameters, and then an older job settles under new rules it was never evaluated against.
Mind map: upgrade strategy components
Example: config bundle structure
A config bundle should be explicit and self-describing. Include:
- version
- protocolVersion it is compatible with
- proofSchemaIds[]
- qualityThreshold
- challengeWindowSeconds
- rewardWeights (with units)
- maxMeasurementAgeSeconds
The contract can reject bundles whose protocolVersion does not match.
Example: compatibility rule for proof schemas
Suppose jobs reference a proofSchemaId chosen at submission time. If governance removes a schema from proofSchemaIds, then jobs created earlier with that schema must still be able to settle.
Two safe patterns:
- (1) Request-scoped schema acceptance: store proofSchemaId in the job record and allow settlement if it matches the job's stored schema, regardless of current config.
- (2) Config compatibility check: require that the new config's proofSchemaIds be a superset of all schema IDs currently used by in-flight jobs.
Pattern (1) is simpler for correctness; pattern (2) can reduce storage complexity depending on your design.
Example: numeric stability and bounds
Reward calculations often involve multipliers and time-based factors. Put the numeric constraints in config validation:
- multipliers must fit within a fixed-point range
- denominators must be non-zero
- time windows must not overflow when converted to seconds
This prevents âvalid-lookingâ configs that cause arithmetic edge cases.
When you truly need logic upgrades
If a change modifies invariants or settlement semantics, treat it as a new protocol version.
A common approach:
- Deploy ProtocolV2 contract.
- Freeze ProtocolV1 for settlement of existing jobs.
- Migrate or re-create configuration under ProtocolV2.
- Route new job submissions to ProtocolV2.
This keeps old jobs consistent without forcing the new logic to handle legacy semantics in every code path.
Mind map: decision points for "config vs logic"
Config vs Logic: How to Decide
- If change affects only values
  - thresholds, weights, time windows
  - allowed schema lists
  - fee rates
  - => Put in Config (new config version)
- If change affects semantics/invariants
  - accounting formula meaning
  - eligibility definition
  - dispute procedure logic
  - => Put in Logic (new protocol version)
- If change affects both
  - deploy new logic
  - still use config for parameterization
Minimal pseudo-interface for clarity
Below is a conceptual interface showing how config versioning can be wired. (Names are illustrative.)
// Conceptual interface (not production code)
struct Config {
    uint64 version;
    uint64 protocolVersion;
    uint256 qualityThreshold;
    uint256 challengeWindowSeconds;
    uint256 maxMeasurementAgeSeconds;
    bytes32[] proofSchemaIds;
}

// Interface name is illustrative.
interface IVersionedConfig {
    // Governance: store a validated bundle, then switch the active version.
    function proposeConfig(Config calldata c) external;
    function activateConfig(uint64 newVersion) external;

    // Jobs record the config version they were created under.
    function createJob(bytes32 proofSchemaId, uint64 configVersion) external;
    function settleJob(uint256 jobId) external; // uses job.configVersionAtCreation
}
Practical example: one config update across two jobs
- Job #42 is created when activeConfigVersion = 7.
- Governance updates parameters to activeConfigVersion = 8.
- Job #42 settles using config version 7, so its quality threshold and challenge window match what was expected at creation.
- Job #43 created after the update uses config version 8.
This behavior is easy to explain to operators and easy to test: you can assert that settlement uses the stored config version, not the current active one.
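The Job #42 scenario can be replayed with a small in-memory model. This is a sketch under stated assumptions: ConfigRegistry and its method names are hypothetical stand-ins for contract storage, and the threshold values are made up. Only the rule that settlement reads configVersionAtCreation comes from the text.

```python
class ConfigRegistry:
    """In-memory stand-in for on-chain config storage (illustrative only)."""

    def __init__(self):
        self.configs = {}                  # version -> parameter dict
        self.active_config_version = None
        self.jobs = {}                     # job_id -> job record

    def activate(self, version, params):
        # Config versions are write-once so old jobs keep stable parameters.
        if version in self.configs:
            raise ValueError("config versions are write-once")
        self.configs[version] = dict(params)
        self.active_config_version = version

    def create_job(self, job_id):
        # Freeze the config version at creation time.
        self.jobs[job_id] = {"configVersionAtCreation": self.active_config_version}

    def settle_job(self, job_id, quality_score_fp):
        # Settlement reads the stored version, never the active one.
        version = self.jobs[job_id]["configVersionAtCreation"]
        threshold = self.configs[version]["qualityThreshold"]
        return quality_score_fp >= threshold


registry = ConfigRegistry()
registry.activate(7, {"qualityThreshold": 600_000})
registry.create_job(42)
registry.activate(8, {"qualityThreshold": 900_000})   # governance update mid-flight
registry.create_job(43)

# Same evidence quality, different outcome, because each job uses its own version.
print(registry.settle_job(42, 700_000))
print(registry.settle_job(43, 700_000))
```

The two calls at the end are exactly the assertion described above: the stored version decides the outcome, not the active one.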
Testing strategy tied to versioning
Write tests that explicitly cover:
- rejecting configs with wrong protocolVersion
- rejecting configs with out-of-bounds time windows
- ensuring in-flight jobs settle with their stored config version
- ensuring new jobs pick up the latest active config
- ensuring proof schema acceptance behaves as designed (request-scoped or compatibility-checked)
When versioning is correct, the tests read like a checklist of user-visible guarantees, not like a pile of edge-case hunting.
9.3 Deterministic Accounting Example Avoiding Rounding Errors and Drift
Deterministic accounting means every honest node computes the same balances from the same inputs, down to the last unit. In DePIN, that matters because rewards, fees, and slashing outcomes are usually settled on-chain, while measurement and proof generation happen off-chain. If the on-chain math depends on floating-point behavior, inconsistent rounding, or time-based drift, you get disputes that are hard to resolve and easy to prevent.
The core problem: "same idea, different numbers"
A common failure mode is mixing units or rounding at different stages. For example, an operator might compute a quality score as a percentage with two decimals off-chain, while the contract recomputes it from raw evidence and rounds differently. Even if both are "close," the contract's final reward can differ by a few smallest units. Over many jobs, those small differences accumulate into drift.
To avoid this, design accounting so that:
- All monetary values are represented as integers in the smallest unit (e.g., wei-like "micro-tokens").
- All fractional quantities are represented as fixed-point integers with an explicit scale.
- Rounding happens in exactly one place, with a documented direction (floor, ceil, or nearest) and consistent tie-breaking.
- The contract never re-derives a value that the client already rounded differently.
A deterministic accounting pattern: fixed-point, integer-only
Assume the network pays rewards based on a base rate and a quality multiplier.
- Base reward per task: base = 0.25 tokens
- Quality multiplier: m derived from proof quality, between 0.0 and 1.5
- Final reward: reward = base * m
On-chain, represent:
- base_u = base in smallest units (integer)
- m_fp = multiplier in fixed-point with scale S (integer)
- reward_u = reward in smallest units (integer)
Let S = 10^6. Then:
\[ \text{reward}_u = \left\lfloor \frac{\text{base}_u \cdot \text{m\_fp}}{S} \right\rfloor \]
This single formula is the whole accounting story. Everything else should feed m_fp as an integer.
Mind map: deterministic accounting checklist
Example: quality multiplier from evidence without rounding drift
Suppose the proof provides two integers:
- good = number of good samples
- total = number of total samples
Define quality as good/total, then map it into a multiplier range. A simple mapping:
- quality ratio r = good / total
- multiplier m = 0.5 + r (so it ranges from 0.5 to 1.5)
If you compute r as a decimal off-chain and round it, you risk mismatch. Instead, compute everything in fixed-point.
Let S = 10^6. Then:
\[ \text{r\_fp} = \left\lfloor \frac{\text{good} \cdot S}{\text{total}} \right\rfloor \]
\[ \text{m\_fp} = 500000 + \text{r\_fp} \]
Now the contract uses the same m_fp formula every time.
Concrete numbers
- base = 0.25 tokens
- smallest unit: 1 token = 1,000,000 micro
- so base_u = 250,000
- S = 1,000,000
Evidence:
- good = 333
- total = 1000
Compute:
\[ \text{r\_fp} = \left\lfloor \frac{333 \cdot 10^6}{1000} \right\rfloor = 333{,}000 \]
\[ \text{m\_fp} = 500{,}000 + 333{,}000 = 833{,}000 \]
Reward:
\[ \text{reward}_u = \left\lfloor \frac{250{,}000 \cdot 833{,}000}{10^6} \right\rfloor = \left\lfloor \frac{208{,}250{,}000{,}000}{10^6} \right\rfloor = 208{,}250 \]
So the contract pays 208,250 micro-tokens.
If someone instead computed r = 0.333 off-chain and used m = 0.833, then reward = 0.25 * 0.833 = 0.20825 tokens looks consistent here, but it won't be consistent across all inputs once rounding differs. The fixed-point approach keeps the contract's math authoritative.
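The worked numbers can be reproduced with integer-only Python, which also makes it easy to find an input where off-chain rounding changes the payout. This is a sketch that mirrors the formulas in this section. The function names are illustrative, and the second case (good = 1, total = 3, ratio rounded to 0.33) is an added example, not one from the text.

```python
S = 10**6          # fixed-point scale
BASE_U = 250_000   # 0.25 tokens expressed in micro-tokens


def multiplier_fp(good: int, total: int) -> int:
    """m_fp = 0.5*S + floor(good*S/total), using integers only."""
    if total <= 0 or not 0 <= good <= total:
        raise ValueError("need total > 0 and 0 <= good <= total")
    return S // 2 + (good * S) // total


def reward_u(base_u: int, m_fp: int) -> int:
    """reward_u = floor(base_u * m_fp / S); the only rounding step."""
    return (base_u * m_fp) // S


# The example from the text: good = 333, total = 1000.
m = multiplier_fp(333, 1000)
print(m, reward_u(BASE_U, m))  # 833000 208250

# A ratio that does not terminate in decimal: good = 1, total = 3.
exact = reward_u(BASE_U, multiplier_fp(1, 3))   # r_fp = 333333
rounded = reward_u(BASE_U, S // 2 + 330_000)    # someone rounded r to 0.33 first
print(exact, rounded)  # 208333 207500
```

The two-decimal shortcut underpays by 833 micro-tokens on a single job, which is the kind of gap that turns into a dispute once it repeats across many jobs.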
Example: avoid "average quality over time" drift
A subtle drift source is aggregating at different granularities. Consider two ways to compute rewards for a day:
- Job-level: compute reward per job and sum.
- Daily-level: compute average quality for the day, then compute one reward.
Even with perfect arithmetic, these can differ because multiplication and division don't commute under integer rounding.
Job-level uses:
\[ \sum_i \left\lfloor \frac{\text{base}_u \cdot \text{m\_fp}_i}{S} \right\rfloor \]
Daily-level uses:
\[ \left\lfloor \frac{\text{base}_u \cdot \overline{\text{m\_fp}}}{S} \right\rfloor \]
where m_fp is itself derived from integer evidence. The floor operations happen at different points, so results can diverge.
Best practice: settle per job (or per smallest accounting unit) and sum integer outputs. If you need daily reporting, compute it from the already-settled job events rather than recomputing rewards.
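A short numeric check shows the divergence. This sketch assumes four jobs with the same multiplier, and it scales the daily-level figure by the job count so the two totals are comparable. Both choices are assumptions made for illustration.

```python
S = 10**6
BASE_U = 250_000


def reward_u(base_u: int, m_fp: int) -> int:
    return (base_u * m_fp) // S


# Four jobs, each leaving a fractional 0.25 micro-token before flooring.
m_fps = [833_333, 833_333, 833_333, 833_333]

# Job-level: floor once per job, then sum the integer outputs.
job_level = sum(reward_u(BASE_U, m) for m in m_fps)

# Daily-level: average the multipliers, then floor once for the whole day.
avg_m_fp = sum(m_fps) // len(m_fps)
daily_level = (len(m_fps) * BASE_U * avg_m_fp) // S

print(job_level, daily_level)  # 833332 833333
```

Neither number is wrong on its own. The problem only appears when one component pays by the first rule and another reports by the second.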
Minimal on-chain pseudocode (integer-only)
// All values are integers in smallest units.
// S is the fixed-point scale for multipliers.
function computeReward(uint256 base_u, uint256 m_fp, uint256 S)
    internal pure returns (uint256 reward_u)
{
    // Single rounding rule: floor (integer division truncates).
    // reward_u = floor(base_u * m_fp / S)
    return (base_u * m_fp) / S;
}

function computeMultiplier(uint256 good, uint256 total, uint256 S)
    internal pure returns (uint256 m_fp)
{
    // Reject empty or inconsistent evidence before dividing.
    require(total > 0, "total must be non-zero");
    require(good <= total, "good exceeds total");
    // r_fp = floor(good * S / total)
    uint256 r_fp = (good * S) / total;
    // m_fp = 0.5*S + r_fp
    return (S / 2) + r_fp;
}
Event design for reconciliation
To ensure operators and clients can verify outcomes without guessing, emit events that include raw inputs and computed fixed-point values.
JobSettled(jobId, good, total, r_fp, m_fp, base_u, reward_u)
This lets an off-chain auditor recompute reward_u using the same integer formulas and confirm the contract did not apply hidden rounding.
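An auditor's recomputation might look like the sketch below. The event is modeled as a plain dictionary using the JobSettled field names. The function name, the returned list of mismatched fields, and the assumption that S = 10^6 is known to the auditor are all illustrative.

```python
S = 10**6  # assumed to match the scale the contract uses


def audit_job_settled(event: dict) -> list:
    """Recompute fixed-point values from raw inputs; return mismatched field names."""
    good, total, base_u = event["good"], event["total"], event["base_u"]
    r_fp = (good * S) // total
    m_fp = S // 2 + r_fp
    reward_u = (base_u * m_fp) // S
    expected = {"r_fp": r_fp, "m_fp": m_fp, "reward_u": reward_u}
    return [name for name, value in expected.items() if event[name] != value]


event = {
    "jobId": 42, "good": 333, "total": 1000,
    "r_fp": 333_000, "m_fp": 833_000,
    "base_u": 250_000, "reward_u": 208_250,
}
print(audit_job_settled(event))                          # []
print(audit_job_settled({**event, "reward_u": 208_251}))  # ['reward_u']
```

Because the event carries good and total, the auditor never has to trust the emitted r_fp or m_fp. They are checked along with the reward.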
Practical rules that prevent drift in real systems
- Pick one scale and stick to it. If you change S, you must migrate or version the accounting logic.
- Never accept rounded decimals as inputs. Accept raw evidence (good, total) or fixed-point integers (m_fp).
- Use one rounding direction. If you floor in one place, floor everywhere for that quantity.
- Account at the smallest unit. Sum settled job rewards rather than recomputing aggregates.
- Keep arithmetic overflow in mind. Use wider integer types or reorder operations so base_u * m_fp doesn't overflow.
Deterministic accounting isn't about being clever; it's about making the contract the only place where rounding decisions are made, and making every other component provide inputs that won't force the contract to guess how a human rounded something earlier.
9.4 Dispute and Challenge Mechanisms: Example Evidence Submission Flow
A dispute system exists to answer one question: "Was the submitted measurement (or service result) eligible and correct under the rules?" The trick is to make disputes cheap to start, bounded in cost to run, and deterministic in outcome.
Goals and constraints
- Fast resolution for honest cases. Most submissions should finalize without human review.
- Bounded work for disputes. The protocol should cap how much evidence can be submitted and how long verification can run.
- Deterministic outcomes. Given the same evidence and state, the result should be the same.
- Clear roles. A client challenges, operators respond, and verifiers (or the chain) decide.
Core design: what can be disputed
Disputes should target specific, rule-based claims. For example:
- Eligibility: Was the node active and allowed to submit at that time?
- Freshness: Did the evidence correspond to the requested time window?
- Correctness: Does the proof match the measurement format and constraints?
- Completeness: Was required metadata included (e.g., location bounds, calibration version)?
A good practice is to define a small set of "challengeable fields" and reject challenges that aim at anything else.
Evidence submission flow (worked example)
Assume a network where a client requests a measurement, an operator submits a result with proof artifacts, and the protocol settles rewards after a challenge window.
Actors
- Client (C): requests work and can challenge.
- Operator (O): submits result and evidence.
- Verifier/Arbiter (V): either an on-chain verifier or a committee that checks evidence.
- Chain (S): stores commitments, enforces timing, and records outcomes.
Data objects
- Request: requestId, taskSpec, timeWindow, qualityThresholds.
- Submission: submissionId, requestId, nodeId, result, proofCommitment, evidenceHash.
- Evidence bundle: raw artifacts plus metadata needed to verify.
Step-by-step timeline
- Client creates request
  - C posts a request with timeWindow and taskSpec.
  - S records the request and assigns a requestId.
- Operator submits result
  - O submits result plus proofCommitment.
  - O also includes evidenceHash computed over the evidence bundle that will be revealed if challenged.
  - S checks basic eligibility (node is active, request exists, submission format matches).
  - S starts a challenge window for this submissionId.
- Submission enters "pending-finality"
  - Rewards are not fully settled yet.
  - The protocol may allow partial accounting, but final settlement waits for the window.
- Client challenges (if needed)
  - If C believes the submission violates a rule, it submits a challenge transaction before the window ends.
  - The challenge includes:
    - submissionId
    - challengeType (e.g., FRESHNESS, ELIGIBILITY, CORRECTNESS)
    - claim (a concise statement of what is wrong, encoded as parameters)
    - evidenceHash of any counter-evidence C wants considered
  - S validates that the challenge is well-formed and that the challengeType is allowed.
- Operator responds with evidence
  - After a challenge is opened, O must reveal the evidence bundle that matches the original evidenceHash.
  - O submits:
    - evidenceBundle
    - evidenceHash (recomputed by S or checked by V)
    - any additional response metadata required by the challenge type
  - If O fails to reveal within the response window, the protocol marks the submission as failed for settlement.
- Verifier checks the targeted claim only
  - V (or an on-chain verifier) runs checks based on challengeType.
  - Examples:
    - FRESHNESS: verify timestamps and that the evidence corresponds to timeWindow.
    - ELIGIBILITY: verify node membership status and that the submission was within allowed intervals.
    - CORRECTNESS: verify proof structure and that the proof commits to the claimed result.
  - V should not re-run unrelated checks; it should focus on the challenged claim.
- Decision and settlement
  - V outputs pass or fail for the submission under the challenge.
  - S records the decision and finalizes rewards accordingly.
  - If the operator fails, a predefined outcome applies (e.g., operator slashing or a client fee refund), depending on the incentive model.
- Post-mortem record for auditability
  - S stores:
    - challenge parameters
    - evidence hashes
    - verifier decision
    - a short reason code (not a full essay)
Mind maps: dispute mechanism overview; evidence submission flow
Concrete example: "Freshness" challenge
Suppose the task requires a measurement taken between 10:00:00 and 10:05:00 UTC.
- O submits at 10:06:30 with evidence committed.
- C challenges with challengeType = FRESHNESS and provides parameters:
  - expectedTimeWindow = [10:00:00, 10:05:00]
  - observedEvidenceTimestamp = 10:05:40 (from C's own inspection of public metadata or prior knowledge)
V then checks:
- The evidence bundle includes a timestamp field.
- The timestamp is within the requested window.
- The proof commits to the same timestamp (to prevent âtimestamp swappingâ).
If the timestamp is outside the window, the verifier returns fail, and the chain finalizes the submission as invalid.
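The three verifier checks can be expressed as one function that returns a pass/fail flag and a reason code. This is a sketch: the calendar date, the inclusive window bounds, and every reason code other than TIMESTAMP_OUT_OF_WINDOW are assumptions made for illustration.

```python
from datetime import datetime, timezone


def check_freshness(window_start, window_end, evidence_ts, committed_ts):
    """Return (passed, reason_code) for a FRESHNESS challenge."""
    # 1. The evidence bundle must include a timestamp field.
    if evidence_ts is None:
        return False, "TIMESTAMP_MISSING"
    # 2. The proof must commit to the same timestamp (no timestamp swapping).
    if evidence_ts != committed_ts:
        return False, "TIMESTAMP_COMMITMENT_MISMATCH"
    # 3. The timestamp must fall inside the requested window (bounds inclusive here).
    if not window_start <= evidence_ts <= window_end:
        return False, "TIMESTAMP_OUT_OF_WINDOW"
    return True, "OK"


def utc(hour, minute, second):
    # The date is arbitrary; only the time of day matters for this example.
    return datetime(2026, 3, 24, hour, minute, second, tzinfo=timezone.utc)


start, end = utc(10, 0, 0), utc(10, 5, 0)
print(check_freshness(start, end, utc(10, 5, 40), utc(10, 5, 40)))
print(check_freshness(start, end, utc(10, 3, 12), utc(10, 3, 12)))
```

Note that the submission time (10:06:30) plays no role here. The check is about when the evidence was captured, not when it reached the chain.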
Concrete example: "Correctness" challenge with bounded evidence
For correctness, evidence bundles can get large. A practical approach is to split evidence into:
- Proof artifacts (small, required)
- Auxiliary data (optional, revealed only if needed)
When C challenges correctness, it specifies which sub-claim it disputes, such as:
- resultMatchesCommitment
- proofFormatValid
- calibrationVersionAllowed
V then requests only the auxiliary parts relevant to that sub-claim. If the protocol is simpler, you can still keep it bounded by enforcing a maximum bundle size and rejecting oversize reveals.
Implementation notes that prevent common failure modes
- Hash-first, reveal-later: evidence must be committed at submission time so the operator canât change it after seeing the challenge.
- Strict evidence matching: the revealed bundle must hash to the original evidenceHash.
- Challenge type whitelist: only allow challenge types with defined verification logic.
- Window enforcement: missing evidence or late challenges should have deterministic outcomes.
- Reason codes, not narratives: store compact codes like EVIDENCE_HASH_MISMATCH or TIMESTAMP_OUT_OF_WINDOW.
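Hash-first, reveal-later and strict evidence matching reduce to a few lines. This sketch assumes SHA-256 as the commitment hash and a 64 KiB size cap. Both are illustrative choices, and EVIDENCE_TOO_LARGE is a hypothetical reason code.

```python
import hashlib

MAX_BUNDLE_BYTES = 64 * 1024  # hypothetical cap that keeps reveals bounded


def commit_evidence(bundle: bytes) -> str:
    """Computed by the operator at submission time and stored as evidenceHash."""
    return hashlib.sha256(bundle).hexdigest()


def check_reveal(committed_hash: str, revealed: bytes) -> str:
    """Return a compact reason code for a revealed evidence bundle."""
    if len(revealed) > MAX_BUNDLE_BYTES:
        return "EVIDENCE_TOO_LARGE"
    if hashlib.sha256(revealed).hexdigest() != committed_hash:
        return "EVIDENCE_HASH_MISMATCH"
    return "OK"


bundle = b'{"timestamp":"10:03:12Z","samples":[23.5,23.6,23.7]}'
evidence_hash = commit_evidence(bundle)

print(check_reveal(evidence_hash, bundle))
print(check_reveal(evidence_hash, bundle.replace(b"10:03:12Z", b"10:04:59Z")))
```

Checking the size before hashing keeps the cost of a hostile oversize reveal bounded as well.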
Minimal state machine (conceptual)
stateDiagram-v2
    [*] --> Submitted: operator submits evidenceHash
    Submitted --> Pending: start challenge window
    Pending --> Challenged: client opens challenge
    Challenged --> Responded: operator reveals evidence
    Responded --> Verified: verifier runs targeted checks
    Verified --> Finalized: chain finalizes settlement
    Pending --> Finalized: window ends without challenge
    Challenged --> Finalized: operator fails to respond in time
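The same state machine can be written as a transition table, which makes illegal moves fail loudly. The state names follow the diagram. The event names are hypothetical labels for its edges.

```python
# (current_state, event) -> next_state; event names are illustrative.
TRANSITIONS = {
    ("Submitted", "start_window"): "Pending",
    ("Pending", "open_challenge"): "Challenged",
    ("Pending", "window_expired"): "Finalized",
    ("Challenged", "reveal_evidence"): "Responded",
    ("Challenged", "response_timeout"): "Finalized",
    ("Responded", "verify"): "Verified",
    ("Verified", "finalize"): "Finalized",
}


def step(state: str, event: str) -> str:
    """Apply one event; any transition not in the table is rejected."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} in state {state!r}") from None


# Challenged path: every step is explicit.
state = "Submitted"
for event in ["start_window", "open_challenge", "reveal_evidence", "verify", "finalize"]:
    state = step(state, event)
print(state)  # Finalized
```

A table like this also doubles as a test oracle: any pair missing from it is a transition the contract must reject.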
Summary of the example flow
The client challenges with a specific claim, the operator reveals evidence that must match the original commitment, and the verifier checks only what was challenged. This keeps disputes bounded, makes outcomes deterministic, and ensures that honest submissions finalize quickly.
9.5 Testing Strategy Example Property-Based Tests for Accounting Rules
Accounting rules in DePIN are where "it seems right" becomes "it is right." The goal of this section is to show a practical property-based testing approach for reward accounting, eligibility, and settlement correctness, using examples that map directly to what you'd implement.
What to test (and what not to test)
Property-based tests are best for invariants: statements that must hold for all valid inputs. They are not a replacement for scenario tests that check a specific end-to-end flow.
Focus on invariants like:
- Conservation: total distributed rewards never exceed the reward budget.
- Monotonicity: increasing quality (within bounds) does not decrease eligible rewards.
- No phantom eligibility: nodes that fail eligibility checks cannot receive rewards.
- Determinism: the same inputs always produce the same accounting outputs.
Avoid testing "implementation details" like internal rounding steps. Instead, test observable outcomes: computed reward amounts, eligibility flags, and emitted accounting events.
A minimal accounting model (for testing)
Assume a round has:
- budget: total tokens available for rewards.
- nodes: a list of node submissions. Each submission has:
  - qualityScore in \([0, 1]\)
  - isEligible boolean derived from policy checks
  - weight (e.g., based on capacity class) as a non-negative integer
A simple accounting rule might be:
- Filter eligible nodes.
- Compute a per-node score: \(s_i = \text{qualityScore}_i \times \text{weight}_i\).
- Compute \(\text{totalScore} = \sum s_i\).
- If \(\text{totalScore} = 0\), distribute nothing.
- Otherwise, allocate \(r_i = \text{budget} \times s_i / \text{totalScore}\).
- Apply rounding in a deterministic way (e.g., floor to integer tokens) and allocate any remainder to a deterministic tie-breaker.
Even if your real system is more complex, the testing strategy stays the same: define invariants around the outputs.
Mind map: property-based testing for accounting
Property-based tests: concrete examples
Below are example properties you can implement in any property-based framework. The code is intentionally small and focuses on the accounting logic and invariants.
Example 1: Conservation of rewards
Property: For any round inputs, the sum of integer rewards distributed is never greater than the budget.
from fractions import Fraction

def accounting(budget, submissions):
    # Exact rational scores keep the floor step free of float rounding.
    # Ineligible nodes score 0, so they can never receive a share.
    scores = [
        Fraction(s['qualityScore']) * s['weight'] if s['isEligible'] else Fraction(0)
        for s in submissions
    ]
    total = sum(scores)
    if total == 0:
        return [0 for _ in submissions]
    # floor allocation
    raw = [budget * score / total for score in scores]
    rewards = [int(x) for x in raw]
    remainder = budget - sum(rewards)
    # deterministic remainder: give 1 token to highest raw values
    # (sorted() is stable, so ties fall back to submission order)
    order = sorted(range(len(submissions)), key=lambda i: raw[i], reverse=True)
    for i in order[:remainder]:
        rewards[i] += 1
    return rewards

def prop_conservation(budget, submissions):
    rewards = accounting(budget, submissions)
    assert sum(rewards) <= budget
Why this matters: conservation catches rounding bugs and remainder distribution mistakes. It also catches cases where ineligible nodes accidentally receive non-zero rewards.
Example 2: No phantom eligibility
Property: If isEligible is false, the reward for that node must be exactly 0.
def prop_no_phantom_rewards(budget, submissions):
    rewards = accounting(budget, submissions)
    for s, r in zip(submissions, rewards):
        if not s['isEligible']:
            assert r == 0
This property is simple, but it's a great early warning system. Many accounting bugs start with "eligibility affects scoring" but not "eligibility affects payout."
Example 3: Zero-score behavior
Define totalScore as the sum of \(qualityScore \times weight\) over eligible nodes. If that is 0, all rewards must be 0.
def prop_zero_score_distributes_nothing(budget, submissions):
    eligible = [s for s in submissions if s['isEligible']]
    total_score = sum(s['qualityScore'] * s['weight'] for s in eligible)
    rewards = accounting(budget, submissions)
    if total_score == 0:
        assert all(r == 0 for r in rewards)
This property prevents divide-by-zero workarounds from turning into "everyone gets 1 token" surprises.
Example 4: Determinism
Property: Accounting must be a pure function of inputs. Running it twice should produce identical rewards.
def prop_determinism(budget, submissions):
    r1 = accounting(budget, submissions)
    r2 = accounting(budget, submissions)
    assert r1 == r2
Determinism is especially important when remainder allocation uses ordering. If the tie-breaker is unstable (e.g., depends on iteration order from a map), this property will catch it.
Example 5: Monotonicity under quality changes
Property: For two rounds identical except one eligible node's qualityScore increases, that node's reward should not decrease.
To avoid edge cases with rounding and remainder, constrain the test:
- Keep budget and all other nodes fixed.
- Increase quality by a small amount.
- Optionally skip cases where totalScore is 0.
def prop_monotonic_quality(budget, submissions, i):
    if not submissions[i]['isEligible']:
        return
    if submissions[i]['qualityScore'] == 1:
        return
    base = accounting(budget, submissions)
    # Copy each submission so the original round is left unchanged.
    submissions2 = [dict(s) for s in submissions]
    submissions2[i]['qualityScore'] = min(1.0, submissions2[i]['qualityScore'] + 0.01)
    changed = accounting(budget, submissions2)
    assert changed[i] >= base[i]
This property is useful because it tests the shape of the reward function. If a bug accidentally applies an inverse multiplier or mixes up numerator/denominator, monotonicity fails quickly.
Generators: making invalid inputs impossible
Good property tests start with generators that respect domain constraints.
Recommended generator constraints:
- budget: non-negative integer.
- weight: non-negative integer (allow 0 to test edge cases).
- qualityScore: real in \([0,1]\) (or fixed-point integers to avoid float artifacts).
- isEligible: boolean.
If you use floats for qualityScore, consider generating fixed-point values (e.g., quality as an integer in \([0, 10000]\)) and converting to \(qualityScore = q/10000\) inside the accounting function. This reduces "almost equal" rounding surprises.
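Following the fixed-point suggestion, a generator can draw quality as an integer and the accounting function can stay integer-only. The sketch below uses the standard-library random module as a simple stand-in for a property-based framework, so it does no shrinking. The allocate function, the q field, and the generator bounds are hypothetical, and the remainder goes to the largest fractional parts, a different tie-breaker from the "highest raw values" rule used earlier.

```python
import random

Q_SCALE = 10_000  # quality is an integer in [0, Q_SCALE]


def allocate(budget, submissions):
    """Integer-only proportional allocation with a deterministic remainder rule."""
    scores = [s["q"] * s["weight"] if s["isEligible"] else 0 for s in submissions]
    total = sum(scores)
    if total == 0:
        return [0] * len(submissions)
    rewards = [budget * score // total for score in scores]
    fractions = [budget * score % total for score in scores]
    leftover = budget - sum(rewards)
    # Largest fractional part first; ties broken by lower index.
    order = sorted(range(len(scores)), key=lambda i: (-fractions[i], i))
    for i in order[:leftover]:
        rewards[i] += 1
    return rewards


def random_round(rng):
    """Generate one valid round: (budget, submissions)."""
    count = rng.randint(0, 6)
    submissions = [
        {"q": rng.randint(0, Q_SCALE), "weight": rng.randint(0, 5),
         "isEligible": rng.random() < 0.7}
        for _ in range(count)
    ]
    return rng.randint(0, 1000), submissions


rng = random.Random(1234)  # fixed seed keeps any failure reproducible
for _ in range(2000):
    budget, subs = random_round(rng)
    rewards = allocate(budget, subs)
    assert sum(rewards) <= budget                                   # conservation
    assert all(r == 0 for s, r in zip(subs, rewards) if not s["isEligible"])
```

In a real test suite, a framework with shrinking would replace the loop, but the generator constraints stay the same.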
Interpreting failures (shrinking and minimal counterexamples)
When a property fails, the framework should shrink to a minimal counterexample. Use that to pinpoint the exact invariant breach.
Common minimal failing cases to look for:
- budget small (e.g., 0 or 1) causing remainder logic to behave oddly.
- All eligible nodes having qualityScore=0 or weight=0.
- Multiple nodes with identical raw scores, stressing tie-breakers.
- One eligible node with weight=0 and others eligible, testing whether "eligible" is treated as "eligible and contributes."
Summary checklist for accounting properties
- Conservation: \(\sum r_i \le \text{budget}\).
- Eligibility: ineligible nodes always get 0.
- Zero-score: if \(\text{totalScore}=0\), all rewards are 0.
- Determinism: same inputs, same outputs.
- Monotonicity: increasing quality for an eligible node does not reduce its reward.
These properties give you coverage across the most failure-prone parts of accounting: filtering, scoring, division, rounding, and remainder distribution.
10. Client Workflows and Application Integration
10.1 Client Request Lifecycle Example From Quote to Proof to Settlement
This section walks through one complete request lifecycle in a DePIN network: a client asks for a quote, submits a task, receives a proof, and ends with settlement. The example uses simple, explicit rules so you can see where correctness checks happen.
Scenario and assumptions
- Task: measure ambient temperature at a site for a fixed interval.
- Quality requirement: proof must include a sensor reading plus metadata showing the reading was taken within the requested time window.
- Actors:
- Client: wants a measurement and pays for it.
- Operator: performs the measurement and produces proof.
- Verifier: checks proof validity and quality.
- Chain: stores minimal state for eligibility, challenge windows, and settlement.
Mind map: lifecycle overview
Step 1: Quote
The client starts by describing the task in a way that can be checked later. A good task spec is specific enough that a verifier can reject incorrect proofs without guessing.
Client inputs
- siteId: where the measurement should be taken.
- timeWindow: e.g., start=2026-03-24T10:00:00Z, end=2026-03-24T10:05:00Z.
- measurementType: temperature_c.
- maxPrice: e.g., 0.50 tokens.
- qualityThresholds: e.g., sensor must report calibration status valid.
Quote output
The network responds with:
- quoteId: unique identifier.
- price: exact amount to pay if the proof is accepted.
- quoteExpiry: a short time limit.
- operatorSet: optional list or criteria for eligible operators.
Concrete example
- Client requests: temperature at siteId=HARBOR-12 from 10:00 to 10:05.
- Network returns: quoteId=Q-9912, price=0.42, quoteExpiry=10:02.
Client checks
- The quote is signed by the network authority.
- The quote is not expired.
- The price is within maxPrice.
If any check fails, the client stops early. This prevents paying for tasks that cannot be verified under the same rules.
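The three client checks can be sketched as a single function. Signature verification is passed in as a callable because the signature scheme is not specified here. The function name, the reason codes, the use of decimal strings for prices, and the calendar date are assumptions for illustration.

```python
from datetime import datetime, timezone
from decimal import Decimal


def check_quote(quote: dict, max_price: str, now: datetime, verify_signature) -> str:
    """Return "OK" or the first failing reason code."""
    # verify_signature is a placeholder for the network's real scheme.
    if not verify_signature(quote):
        return "BAD_SIGNATURE"
    if now >= quote["quoteExpiry"]:
        return "QUOTE_EXPIRED"
    # Decimal avoids float comparison surprises on token amounts.
    if Decimal(quote["price"]) > Decimal(max_price):
        return "PRICE_ABOVE_MAX"
    return "OK"


quote = {
    "quoteId": "Q-9912",
    "price": "0.42",
    "quoteExpiry": datetime(2026, 3, 24, 10, 2, tzinfo=timezone.utc),
}
now = datetime(2026, 3, 24, 10, 1, tzinfo=timezone.utc)

def always_valid(q):
    # Stand-in verifier for the example only.
    return True

print(check_quote(quote, "0.50", now, always_valid))  # OK
```

Checking the signature first means an unsigned quote never gets as far as influencing a pricing decision.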
Step 2: Task dispatch
Now the client turns the quote into a task request that an operator can execute.
Task request fields
- quoteId
- taskSpecHash: hash of the task spec (prevents parameter drift).
- nonce: random value to prevent replay.
- clientCallback: where the client expects status updates.
Operator selection
There are two common patterns:
- Client-selected: the client chooses an operator from operatorSet.
- Network-selected: the client submits to a dispatcher that assigns an operator.
Either way, the key is that the chosen operator becomes eligible for settlement only if it matches the task request.
Concrete example
- Client sends to operator OP-44:
  - quoteId=Q-9912
  - taskSpecHash=H(temperature, HARBOR-12, 10:00-10:05, thresholds)
  - nonce=N-77
Anti-replay rule
The operator must include the nonce (or a derived commitment) inside the proof. If someone reuses an old proof for a new request, the verifier can detect the mismatch.
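One way to compute taskSpecHash and a nonce commitment is shown below. The canonical-JSON encoding, SHA-256, the 16-byte nonce, and the "hash:nonce" commitment layout are all assumptions. A real network would fix these in its specification so every party hashes identical bytes.

```python
import hashlib
import json
import secrets


def task_spec_hash(task_spec: dict) -> str:
    """Hash a canonical encoding so key order cannot change the result."""
    canonical = json.dumps(task_spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def new_nonce() -> str:
    """Random per-request value from a cryptographically secure source."""
    return secrets.token_hex(16)


def nonce_commitment(nonce: str, spec_hash: str) -> str:
    """Bind the nonce to the task spec; the operator embeds this in the proof."""
    return hashlib.sha256(f"{spec_hash}:{nonce}".encode("utf-8")).hexdigest()


spec = {
    "siteId": "HARBOR-12",
    "measurementType": "temperature_c",
    "timeWindow": {"start": "2026-03-24T10:00:00Z", "end": "2026-03-24T10:05:00Z"},
    "qualityThresholds": {"calibration": "valid"},
}
spec_hash = task_spec_hash(spec)
old_commitment = nonce_commitment("N-77", spec_hash)

# A proof replayed against a new request carries the old commitment and is rejected.
print(old_commitment == nonce_commitment("N-78", spec_hash))  # False
```

Binding the nonce to the spec hash means a replayed proof fails even if the attacker reuses it against an identical task spec.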
Step 3: Measurement and proof creation
The operator performs the measurement and produces a proof bundle that is verifiable without trusting the operator's honesty.
Proof bundle contents
- taskSpecHash (or equivalent commitment)
- nonce commitment
- measurementValue: e.g., 23.6°C
- measurementTimestamp: when the reading was taken
- sensorMetadata: calibration status, sensor model, and any attestation evidence
- dataCommitment: hash/Merkle root of raw samples used to compute the reported value
- operatorSignature: signs the proof bundle
Concrete example
- Operator reads sensor at 10:03:12Z.
- It reports 23.6°C.
- It includes sensorMetadata.calibration=valid.
- It commits to raw samples via dataCommitment=R-abc....
Why the timestamp matters
The verifier's job is to check that the measurement falls inside the requested window. If the operator reports a value taken outside the window, the proof fails even if the number looks plausible.
Step 4: Proof submission and verification
The client submits the proof to the verifier. In many designs, the client submits to the chain, or submits to an off-chain verifier that then anchors the result.
Submission options
- On-chain verification: proof is checked by smart contracts.
- Off-chain verification + on-chain attestation: verifier signs an acceptance receipt that the chain uses for settlement.
Verifier checks (minimum set)
- Format: proof fields exist and match expected schema.
- Binding: taskSpecHash equals the one from the task request.
- Freshness: measurement timestamp is within timeWindow.
- Anti-replay: proof includes the correct nonce commitment.
- Signature: operator signature is valid.
- Quality: sensor metadata meets thresholds.
Concrete example outcome
- Verifier checks:
  - timestamp 10:03:12Z is within 10:00-10:05 ✓
  - calibration valid ✓
  - taskSpecHash matches ✓
  - operator signature valid ✓
- Result: accepted with proofId=P-501.
If rejected, the client receives a failure reason and applies refund rules (often full refund if no accepted proof was produced).
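The minimum check set can be run in a fixed order so that a rejection always carries one reason. This is a sketch: the required-field list, the reason codes, comparing the raw nonce rather than a derived commitment, and the injected verify_signature callable are assumptions, not a defined verifier API.

```python
from datetime import datetime, timezone

REQUIRED_FIELDS = ("taskSpecHash", "nonce", "measurementValue",
                   "measurementTimestamp", "sensorMetadata", "operatorSignature")


def verify_proof(proof: dict, request: dict, verify_signature):
    """Return ("accepted", None) or ("rejected", reason)."""
    if any(name not in proof for name in REQUIRED_FIELDS):
        return "rejected", "FORMAT"
    if proof["taskSpecHash"] != request["taskSpecHash"]:
        return "rejected", "BINDING"
    start, end = request["timeWindow"]
    if not start <= proof["measurementTimestamp"] <= end:
        return "rejected", "FRESHNESS"
    if proof["nonce"] != request["nonce"]:
        return "rejected", "ANTI_REPLAY"
    if not verify_signature(proof):
        return "rejected", "SIGNATURE"
    if proof["sensorMetadata"].get("calibration") != "valid":
        return "rejected", "QUALITY"
    return "accepted", None


def utc(hour, minute, second=0):
    return datetime(2026, 3, 24, hour, minute, second, tzinfo=timezone.utc)


request = {"taskSpecHash": "H-1", "nonce": "N-77", "timeWindow": (utc(10, 0), utc(10, 5))}
proof = {
    "taskSpecHash": "H-1", "nonce": "N-77", "measurementValue": "23.6",
    "measurementTimestamp": utc(10, 3, 12),
    "sensorMetadata": {"calibration": "valid"},
    "operatorSignature": "sig-placeholder",
}
print(verify_proof(proof, request, lambda p: True))  # ('accepted', None)
```

A fixed order matters for determinism: two verifiers that see the same bad proof should report the same reason.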
Step 5: Challenge window
Even after acceptance, the system allows disputes for a fixed time window.
What can be challenged
- Incorrect binding to taskSpecHash.
- Measurement timestamp outside the window.
- Mismatched commitments (e.g., reported value doesn't match committed samples).
Evidence model
The operator (or challenger) provides evidence that links back to the proof's commitments. The verifier compares evidence to the commitment rather than trusting raw uploads.
Concrete example
- Acceptance at t=10:06:00Z.
- Challenge window lasts until t=10:10:00Z.
- No challenge arrives, so the acceptance becomes final.
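The finality rule in this example can be sketched as a small pure function. The names (`Acceptance`, `Challenge`, `finalityStatus`) are hypothetical, and treating a challenge filed exactly at the window end as timely is an assumption the text does not settle.

```typescript
type Acceptance = {
  proofId: string;
  acceptedAt: number;             // seconds
  challengeWindowSeconds: number; // fixed dispute window
};

type Challenge = { proofId: string; filedAt: number };

type Finality = "pending" | "disputed" | "final";

function finalityStatus(acc: Acceptance, challenges: Challenge[], now: number): Finality {
  const windowEnd = acc.acceptedAt + acc.challengeWindowSeconds;
  // Only challenges against this proof that arrive inside the window count.
  const timely = challenges.filter(
    (c) => c.proofId === acc.proofId && c.filedAt <= windowEnd
  );
  if (timely.length > 0) return "disputed";
  // No timely challenge: final once the window has elapsed, pending before that.
  return now >= windowEnd ? "final" : "pending";
}
```

With acceptance at 10:06:00Z and a 240-second window, the function reports "pending" until 10:10:00Z and "final" from then on, as long as no timely challenge exists. Settlement should only run on "final".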
Step 6: Settlement
Settlement moves funds based on final acceptance.
Settlement inputs
- taskId (derived from quoteId + nonce)
- proofId (accepted proof)
- finalityStatus (accepted and unchallenged)
Settlement outputs
- Operator reward credited.
- Client receives a receipt containing:
  - measurementValue
  - measurementTimestamp
  - proofId
  - settlementTxHash (or equivalent)
Concrete example
- Price was 0.42 tokens.
- After the challenge window ends with no disputes, the operator gets 0.42.
- Client receipt records 23.6°C at 10:03:12Z with proofId=P-501.
Mind map: key invariants to implement
Minimal end-to-end checklist (client perspective)
- Verify quote signature and expiry.
- Compute and store taskSpecHash and nonce.
- Dispatch task with quoteId, taskSpecHash, and nonce.
- Submit proof and verify acceptance result.
- Wait for challenge window finality.
- Record settlement receipt and measurement details.
This lifecycle keeps the client's role clear: it provides a checkable task spec, ensures uniqueness via a nonce, and only treats results as final after the system's acceptance becomes unchallengeable.
10.2 API Design Example Endpoints for Discovery, Submission, and Status
A DePIN client usually needs three things from the network: (1) find suitable nodes and terms, (2) submit a request and receive a receipt, and (3) check progress until settlement. The API should make those steps explicit, keep payloads small, and return identifiers that let the client resume after interruptions.
Design goals (practical, not theoretical)
- Deterministic identifiers: Every request should yield a stable requestId and a submissionId so the client can poll without guessing.
- Separation of concerns: Discovery returns options; submission returns commitments; status returns state.
- Idempotency by default: Retries should not create duplicate work. Use Idempotency-Key headers and echo them in responses.
- Clear failure modes: Distinguish "not found / not eligible" from "accepted but pending" from "rejected with reason."
Mind map: endpoint responsibilities

Mind map: common data objects
Discovery endpoints
Discovery answers: "Who can do this, under what terms?" It should not start work.
1) POST /v1/discovery/quotes
Returns quotes for eligible nodes and the proof plan the client must follow.
Request body
- taskSpec: what you want measured or provided
- constraints: location, hardware requirements, allowed regions
- clientContext: optional metadata used for routing
Example request
{
  "taskSpec": {
    "measurementType": "air_quality_pm25",
    "expectedUnits": "ug/m3",
    "qualityThreshold": 0.85,
    "timeWindow": {"start": 1710000000, "end": 1710003600}
  },
  "constraints": {
    "region": "EU-WEST",
    "minUptimePercent": 99.5
  },
  "clientContext": {
    "preferredProofFormat": "signed_measurement_v1"
  }
}
Example response
{
  "quotes": [
    {
      "nodeId": "node-9f2a",
      "quoteId": "q-31b7",
      "price": {"currency": "USD", "amount": "2.50"},
      "proofFormat": "signed_measurement_v1",
      "qualityThreshold": 0.85,
      "availabilityScore": 0.92,
      "terms": {
        "challengeWindowSeconds": 300,
        "maxRetries": 2
      }
    }
  ],
  "requestId": "req-7c1a"
}
Why include requestId here? It lets the client correlate discovery attempts with later submissions, especially when the client caches quotes.
2) GET /v1/nodes/{nodeId}/capabilities
Useful for clients that already selected a node and want to confirm supported measurement types and proof formats.
Example response
{
  "nodeId": "node-9f2a",
  "supportedMeasurements": ["air_quality_pm25"],
  "supportedProofFormats": ["signed_measurement_v1"],
  "hardware": {"sensorModel": "AQ-200"}
}
Submission endpoints
Submission answers: "Start this request and lock in terms." This is where idempotency matters.
3) POST /v1/submissions
Creates a submission that the network will execute and later verify.
Headers
- Idempotency-Key: <uuid>
- Authorization: <client auth>
Request body
- quoteId or explicit terms
- taskPayload (inputs needed by the node)
- clientProofPlan (what the client expects to receive and how it will validate)
Example request
{
  "quoteId": "q-31b7",
  "taskPayload": {
    "locationHint": {"lat": 48.86, "lon": 2.35},
    "sampling": {"intervalSeconds": 60, "durationSeconds": 1800}
  },
  "clientProofPlan": {
    "expectedProofFormat": "signed_measurement_v1",
    "verifyFreshness": true
  }
}
Example response
{
  "requestId": "req-7c1a",
  "submissionId": "sub-4a10",
  "idempotencyKey": "b3c2f0d1-1a2b-4c3d-9e0f-7a1b2c3d4e5f",
  "receipt": {
    "status": "accepted",
    "createdAt": 1710000123,
    "challengeWindowEndsAt": 1710000423
  },
  "payment": {
    "escrowRef": "esc-88aa",
    "clientFee": {"currency": "USD", "amount": "0.10"}
  }
}
Idempotency behavior (what clients should rely on):
- If the same Idempotency-Key is reused, the server returns the same submissionId and receipt fields.
- If the first attempt is still processing, the server returns the current receipt state rather than creating a new submission.
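The two idempotency rules can be illustrated with a small in-memory store. This is a sketch of server-side behavior, not an actual API: the `InMemorySubmissions` class, its method names, and the generated `sub-N` identifiers are all hypothetical.

```typescript
type ReceiptState = "accepted" | "executing" | "verifying" | "settled";

type SubmissionReceipt = {
  submissionId: string;
  idempotencyKey: string;
  status: ReceiptState;
  createdAt: number;
};

class InMemorySubmissions {
  private byKey = new Map<string, SubmissionReceipt>();
  private nextId = 1;

  // Creates a submission, or returns the existing one when the key was seen before.
  submit(idempotencyKey: string, now: number): SubmissionReceipt {
    const existing = this.byKey.get(idempotencyKey);
    if (existing !== undefined) {
      // Retry: same submissionId, current receipt state, no new work created.
      return { ...existing };
    }
    const receipt: SubmissionReceipt = {
      submissionId: `sub-${this.nextId++}`,
      idempotencyKey,
      status: "accepted",
      createdAt: now,
    };
    this.byKey.set(idempotencyKey, receipt);
    return { ...receipt };
  }

  // Simulates server-side progress so a later retry observes the new state.
  advance(idempotencyKey: string, status: ReceiptState): void {
    const existing = this.byKey.get(idempotencyKey);
    if (existing !== undefined) existing.status = status;
  }
}
```

A retry with the same key returns the original submissionId and createdAt, while the status field reflects whatever stage the first attempt has reached. Returning copies keeps callers from mutating the stored receipt.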
4) POST /v1/submissions/{submissionId}/cancel
Cancels before verification starts. If verification already began, return a clear error.
Example response
{
  "submissionId": "sub-4a10",
  "result": "canceled",
  "canceledAt": 1710000200
}
Status endpoints
Status answers: "Where is this submission in the lifecycle?" It should be easy to poll and easy to interpret.
5) GET /v1/submissions/{submissionId}/status
Example response
{
  "submissionId": "sub-4a10",
  "state": "verifying",
  "progress": {
    "stage": "proof_received",
    "percent": 62
  },
  "timestamps": {
    "createdAt": 1710000123,
    "proofReceivedAt": 1710000190,
    "verificationStartedAt": 1710000210
  },
  "proofRefs": {
    "proofHash": "0x9c3a...",
    "proofLocation": "ipfs://..."
  },
  "errors": null
}
State machine suggestion (keep it small):
- accepted
- executing
- proof_received
- verifying
- challenge_period
- verified
- settled
- rejected
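One way to keep such a state machine small is to encode the allowed transitions in a single table. The state names below come from the list above; the transitions themselves are an assumption (a linear happy path in the listed order, with rejected reachable from any state before verified), and `canTransition` and `isTerminal` are hypothetical helper names.

```typescript
type SubmissionState =
  | "accepted"
  | "executing"
  | "proof_received"
  | "verifying"
  | "challenge_period"
  | "verified"
  | "settled"
  | "rejected";

// Assumed transition table: the source lists the states but not the edges.
const NEXT_STATES: Record<SubmissionState, SubmissionState[]> = {
  accepted: ["executing", "rejected"],
  executing: ["proof_received", "rejected"],
  proof_received: ["verifying", "rejected"],
  verifying: ["challenge_period", "rejected"],
  challenge_period: ["verified", "rejected"],
  verified: ["settled"],
  settled: [],
  rejected: [],
};

function canTransition(from: SubmissionState, to: SubmissionState): boolean {
  return NEXT_STATES[from].indexOf(to) !== -1;
}

// Terminal states have no outgoing edges; a polling client can stop on them.
function isTerminal(state: SubmissionState): boolean {
  return NEXT_STATES[state].length === 0;
}
```

A server can reject any update where `canTransition` is false, and a client can stop polling as soon as `isTerminal` returns true, which in this table means settled or rejected.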
6) GET /v1/requests/{requestId}/status
Aggregates across submissions if the client used multiple nodes for redundancy.
Example response
{
  "requestId": "req-7c1a",
  "overallState": "settled",
  "bestSubmissionId": "sub-4a10",
  "submissions": [
    {"submissionId": "sub-4a10", "state": "settled"},
    {"submissionId": "sub-4a11", "state": "rejected", "failureReason": "quality_below_threshold"}
  ],
  "settlement": {
    "settlementRef": "set-12ff",
    "operatorReward": {"currency": "USD", "amount": "1.80"}
  }
}
Concrete client polling pattern
A client typically does: discovery -> choose quote -> submit with idempotency key -> poll until terminal.
Example polling loop (pseudo-JSON responses)
- Poll returns state: verifying with proofRefs once available.
- When state becomes challenge_period, the client can optionally validate the proof hash locally.
- When state becomes settled, the client records settlementRef and stops polling.
This structure avoids "mystery states" and keeps the client logic aligned with what the network actually does.
10.3 Handling Partial Failures Example Timeouts and Fallback Verification
A DePIN client rarely gets a perfectly clean run: a verifier might be slow, a node might be unreachable, or a proof might arrive but fail validation. Good design treats these as normal outcomes and keeps the workflow moving without paying twice for the same work.
The failure model: what can go wrong
In the client's request lifecycle, each stage has distinct failure modes:
- Discovery/quoting fails: you can't find eligible nodes or you can't get a quote.
- Submission fails: a node doesn't accept the task, or the result upload times out.
- Proof verification fails: the proof is malformed, doesnât match the task, or fails cryptographic checks.
- Partial quorum: you get some valid results but not enough to finalize.
- Late arrivals: a result arrives after you already moved on.
The client should define which failures are retryable, which are fatal for this request, and which trigger fallback verification.
Mind map: timeout and fallback strategy
Mind map: Partial failures in client workflows
Core principles that keep the workflow sane
- Use a single request ID across the whole lifecycle. Every message (task, result, proof, receipts) carries the same requestId, plus a nonce for freshness. This prevents mixing results from different attempts.
- Separate "waiting" from "deciding." Timeouts should trigger a decision: retry, add more nodes, or fallback. If you just keep waiting, you'll accumulate late arrivals and complicate accounting.
- Verify early, verify cheaply. Run lightweight checks first (task ID match, freshness window, signature format) before expensive verification. If a proof fails cheap checks, you can request a replacement immediately.
- Treat fallback as a defined alternative, not a last-minute improvisation. Fallback verification should have explicit rules: what evidence is acceptable, how it affects scoring, and whether it changes settlement.
A concrete workflow with timeouts
Assume the client wants a measurement proof from multiple nodes to reach a verification threshold.
- k = minimum number of valid proofs required to finalize.
- n = number of nodes you initially dispatch to.
- T1 = time to wait for results from the initial set.
- T2 = additional time for replacement nodes.
Example parameters:
- k = 3
- n = 5
- T1 = 20s
- T2 = 15s
Timeline
- Client dispatches tasks to 5 nodes.
- Client collects results until T1 expires.
- Client verifies each received proof immediately.
- If valid proofs count >= k, finalize.
- If valid proofs count < k, dispatch to additional nodes and wait until T2 expires.
- If still < k, trigger fallback verification.
This avoids the common failure where you wait for everything, then discover you can't finalize anyway.
Fallback verification: two practical patterns
Fallback should be deterministic and auditable. Here are two patterns that work well in practice.
Pattern 1: Lower the verification bar with explicit evidence
If full verification requires k proofs, fallback might accept:
- fewer proofs, but with stronger evidence per proof (e.g., an additional attestation signature or a stricter freshness requirement), or
- proofs from a smaller set of higher-trust nodes.
Example rule:
- Normal mode: accept any node proofs that pass standard verification; require k=3.
- Fallback mode: require k_f=2 proofs, but each proof must include an extra attestation field operatorSig and must be within a tighter freshness window.
The client records which mode was used in the receipt so the settlement logic can apply the correct payout multiplier or mark the result as "reduced confidence."
Pattern 2: Switch from "measurement proof" to "availability proof"
Sometimes the measurement itself can't be verified, but you can still prove that the network participated correctly.
Example rule:
- Normal mode: verify measurement value m with proof P(m).
- Fallback mode: if measurement verification fails, accept a proof that the node produced a commitment C for the requested measurement task, without claiming the numeric value.
This is useful when the client can proceed with a downstream step that only needs confirmation of participation (for example, triggering a retry later or recording an audit trail).
The key is that fallback must not silently pretend it verified the numeric measurement.
Idempotency and late arrivals
Late arrivals are inevitable. The client should handle them without double-counting.
- Each task dispatch includes a taskAttemptId.
- Each result includes taskAttemptId and requestId.
- The client keeps a set acceptedAttemptIds for proofs that were used to finalize.
If a result arrives after finalization:
- If its taskAttemptId is already in acceptedAttemptIds, ignore it.
- If it belongs to an earlier attempt that was superseded, ignore it.
- If it belongs to the current attempt but arrived late, you can still verify it for logging, but do not change settlement.
This prevents "proof roulette," where a late valid proof changes the outcome after you already paid.
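The late-arrival rules can be sketched as a classifier that never touches settlement. The sketch assumes one taskAttemptId per dispatched node task, so the finalizing round can contain attempts that were dispatched but not used; the type names, the `classifyLateResult` function, and the action labels are hypothetical.

```typescript
type LateResult = { requestId: string; taskAttemptId: string };

type FinalizedRequest = {
  requestId: string;
  acceptedAttemptIds: Set<string>;   // attempts whose proofs were used to finalize
  supersededAttemptIds: Set<string>; // attempts from earlier, replaced rounds
  currentAttemptIds: Set<string>;    // attempts dispatched in the finalizing round
};

type LateAction =
  | "ignore_already_accepted"
  | "ignore_superseded"
  | "log_only"
  | "ignore_unknown";

function classifyLateResult(result: LateResult, req: FinalizedRequest): LateAction {
  // Results for a different request never belong here.
  if (result.requestId !== req.requestId) return "ignore_unknown";
  // Duplicate of a proof that already counted toward finalization.
  if (req.acceptedAttemptIds.has(result.taskAttemptId)) return "ignore_already_accepted";
  // Attempt from an earlier round that was replaced.
  if (req.supersededAttemptIds.has(result.taskAttemptId)) return "ignore_superseded";
  // Current round, but too late: verify for logging, never change settlement.
  if (req.currentAttemptIds.has(result.taskAttemptId)) return "log_only";
  return "ignore_unknown";
}
```

None of the returned actions feeds back into settlement, which is the point: once a request is finalized, late messages can add to the audit log but cannot alter the payout.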
Example: decision logic in pseudocode
function handleRequest(requestId, nonce):
    # Stage A: dispatch to the initial n nodes and wait up to T1.
    plan = selectNodes(n)
    attemptId = newAttemptId()
    dispatch(plan, requestId, nonce, attemptId)
    valid = []
    deadline1 = now() + T1
    while now() < deadline1:
        msg = waitForResultOrTimeout(deadline1 - now())
        if msg is null: break                        # timed out with no message
        if msg.requestId != requestId or msg.nonce != nonce: continue
        if msg.attemptId != attemptId: continue
        if verifyLight(msg) == false: continue       # cheap checks first
        if verifyFull(msg) == true: valid.append(msg)
        if len(valid) >= k: return finalize(valid, mode="normal")

    # Stage B: dispatch replacements for the shortfall and wait up to T2.
    additionalNeeded = k - len(valid)
    replacements = selectNodes(additionalNeeded)
    attemptId2 = newAttemptId()
    dispatch(replacements, requestId, nonce, attemptId2)
    deadline2 = now() + T2
    while now() < deadline2:
        msg = waitForResultOrTimeout(deadline2 - now())
        if msg is null: break
        if msg.requestId != requestId or msg.nonce != nonce: continue
        if msg.attemptId != attemptId2: continue     # stage A results are superseded
        if verifyLight(msg) == false: continue
        if verifyFull(msg) == true: valid.append(msg)
        if len(valid) >= k: return finalize(valid, mode="normal")

    # Still short of quorum: hand over to the explicit fallback rules.
    return fallbackVerify(requestId, nonce, valid)
Fallback verification example with explicit modes
Suppose fallback uses Pattern 1 (lower bar with stronger evidence).
- Normal: require k=3 standard proofs.
- Fallback: require k_f=2 proofs where each proof includes operatorSig and passes verifyFreshnessStrict.
Client behavior:
- If len(valid) >= k_f and all proofs in valid satisfy fallback evidence requirements, finalize in mode="fallback".
- Otherwise, return a receipt stating status="unverified" and include which checks failed (e.g., quorumShortfall, evidenceMissing, freshnessExpired).
This gives downstream systems a clear, machine-readable reason for the outcome.
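The client behavior above can be sketched as one decision function. The failed-check names come from the text; the `CandidateProof` shape, the `passesStrictFreshness` flag (standing in for a verifyFreshnessStrict result), and the `decideOutcome` name are assumptions made for illustration.

```typescript
type CandidateProof = {
  proofId: string;
  operatorSig?: string;           // extra attestation required in fallback mode
  passesStrictFreshness: boolean; // assumed to be precomputed by a strict freshness check
};

type Outcome =
  | { mode: "normal" | "fallback"; acceptedProofIds: string[] }
  | { mode: "unverified"; failedChecks: string[] };

// `valid` holds proofs that already passed standard verification.
function decideOutcome(valid: CandidateProof[], k: number, kF: number): Outcome {
  const ids = valid.map((p) => p.proofId);
  if (valid.length >= k) return { mode: "normal", acceptedProofIds: ids };

  // Below the normal quorum: evaluate the fallback requirements explicitly.
  const failedChecks: string[] = [];
  if (valid.length < kF) failedChecks.push("quorumShortfall");
  if (valid.some((p) => !p.operatorSig)) failedChecks.push("evidenceMissing");
  if (valid.some((p) => !p.passesStrictFreshness)) failedChecks.push("freshnessExpired");

  if (failedChecks.length === 0) return { mode: "fallback", acceptedProofIds: ids };
  return { mode: "unverified", failedChecks };
}
```

Because the mode is part of the return value, settlement logic can branch on it directly instead of inferring confidence from the number of proofs received.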
Accounting and settlement alignment
To keep payments consistent with verification:
- Only proofs that were actually used for finalization are eligible for settlement.
- If fallback finalizes with reduced confidence, settlement should reflect that mode explicitly (e.g., different payout multiplier or different fee handling).
- If the client returns unverified, it should avoid triggering operator reward for measurement claims.
A simple rule prevents many bugs: settlement is driven by the finalization mode and the set of accepted proofs, not by the number of messages received.
What the client returns: a readable receipt
A good receipt makes debugging boring, in a good way. Include:
- requestId, nonce, attemptIds used
- mode: normal, fallback, or unverified
- acceptedProofIds
- rejectionReasons summary (counts by reason)
- timing: T1 and T2 outcomes (e.g., quorumReachedAt=stageB)
With this, you can explain outcomes without guessing which node was slow or which proof was malformed.
10.4 User Experience Requirements Example Transparent Proof and Receipt Display
A DePIN client should make it obvious what happened, what was proven, and what you can do next. "Transparent" here means the user can inspect the proof inputs and the resulting receipt without needing to understand the entire protocol stack.
UX goals that drive the design
- Clarity of the request: The UI should show the exact job parameters the client submitted (e.g., location, time window, measurement type, and expected quality threshold). If the user can't see these, they can't tell whether the proof matches their intent.
- Traceability of evidence: The receipt should link each payout-relevant outcome to the evidence that supports it (hashes, signatures, and proof artifacts). The user should be able to verify "what was claimed" even if they don't verify cryptography themselves.
- Actionable status: The UI should distinguish between "submitted," "verifying," "disputed," "finalized," and "paid." A single generic "processing" label turns troubleshooting into guesswork.
- Graceful failure: When verification fails or times out, the UI should explain what failed (e.g., freshness check, signature validation, measurement mismatch) and what the user can retry.
Mind map: receipt and proof display
A concrete UI layout: three panels
Panel A: Request Summary
- Job ID (copy button)
- Measurement type (e.g., "coverage check")
- Constraints: time window, region, and any quality threshold
- Client-side quote: expected reward range or fee estimate
- Submission timestamp and the node identity used (or selection rule)
Panel B: Proof Summary
- Proof status: Pending, Verified, Rejected
- Verification result: Pass/Fail plus a short reason code
- Evidence digest list: hashes of measurement artifacts and proof objects
- Freshness indicators: "measurement timestamp within allowed window" (with the actual bounds)
Panel C: Receipt Summary
- Receipt ID and settlement state: Finalized or Reverted
- Payout breakdown: operator reward, client fee, and any quality multiplier applied
- Dispute fields (if any): challenge ID, evidence submitted, final ruling
- Links to raw artifacts (optional) and always-present digests
This structure keeps the user oriented: inputs first, then the proof, then the settlement.
Example: a transparent receipt screen
Below is an example of what a user might see after submitting a coverage request.
Request Summary
- Job ID: job_7f2a...9c
- Measurement: coverage: region=R3, time=2026-03-20T10:00Z..10:15Z
- Quality threshold: >= 0.80
- Submitted by: client_app_v1.4
- Node selection: top-3 eligible by stake (rule shown, not hidden)
Proof Summary
- Proof status: Verified
- Result: Pass
- Reason code (for audit): freshness_ok + signature_ok + measurement_match
- Measurement timestamp: 10:07:42Z
- Allowed window: 10:00Z..10:15Z
- Evidence digests:
  - Measurement artifact hash: 0x9a31...c0
  - Proof object hash: 0x4b02...11
  - Node attestation signature: 0x77dd...aa
Receipt Summary
- Receipt ID: rcpt_1b8e...44
- Settlement state: Finalized
- Payout:
  - Operator reward: 0.42 ETH
  - Client fee: 0.03 ETH
  - Quality multiplier: 1.05x (applied because score=0.84)
- Dispute: None
Notice what's missing: the UI doesn't require the user to understand the proof system to trust the outcome. It provides the exact inputs, the verification outcome, and the digests that tie the receipt to evidence.
Evidence panels: show digests, not mystery blobs
Users often want to confirm "this receipt corresponds to that proof." The UI should present evidence as a list of labeled digests.
A good evidence row includes:
- Label: Measurement artifact hash
- Digest value: 0x...
- Evidence type: artifact / attestation / proof
- Optional: a "copy digest" button
If the UI also provides a "download artifact" action, it should still keep the digest visible so the user can compare what they downloaded to what the receipt claims.
Verification status: reason codes that map to UI text
Instead of a single "failed" message, use a small set of reason codes that the UI can render into human-readable explanations.
Example mapping:
- freshness_ok -> "Measurement time is within the allowed window."
- signature_ok -> "Node attestation signature validated."
- measurement_match -> "Measured value matches the requested constraint."
- proof_format_invalid -> "Proof object did not match the expected format."
- challenge_lost -> "Dispute was resolved against the submitted evidence."
This approach prevents the UI from repeating the same paragraph while still giving concrete troubleshooting signals.
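The mapping above translates directly into a lookup table. In this sketch the five codes and their texts are taken from the example mapping; the `renderReasons` helper and the wording used for unrecognized codes are assumptions.

```typescript
// Reason codes and UI texts from the example mapping.
const REASON_TEXT = new Map<string, string>([
  ["freshness_ok", "Measurement time is within the allowed window."],
  ["signature_ok", "Node attestation signature validated."],
  ["measurement_match", "Measured value matches the requested constraint."],
  ["proof_format_invalid", "Proof object did not match the expected format."],
  ["challenge_lost", "Dispute was resolved against the submitted evidence."],
]);

// Renders codes into UI strings; unknown codes stay visible instead of being dropped.
function renderReasons(codes: string[]): string[] {
  return codes.map((code) => REASON_TEXT.get(code) ?? `Unrecognized reason code: ${code}`);
}
```

Keeping the raw code visible for unknown entries means a newer verifier can introduce a code before the UI learns its text, without the failure turning into a blank line.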
Timeline view: make state transitions explicit
A timeline reduces confusion when verification takes time or when disputes occur.
Example timeline entries:
- Submitted at 10:01:03Z
- Proof received at 10:02:10Z
- Verification started at 10:02:12Z
- Verification passed at 10:02:18Z
- Receipt finalized at 10:02:45Z
If something stalls, the timeline should show where it stalled (e.g., "awaiting verifier quorum" vs "awaiting client confirmation").
Exportable receipt: keep it machine-checkable
Provide an export button that outputs a structured receipt object containing:
- job parameters (as submitted)
- proof status and reason codes
- evidence digests
- settlement fields and payout breakdown
- timestamps and identifiers
This makes receipts useful for internal audits and for users who want to store them.
{
  "jobId": "job_7f2a...9c",
  "request": {"measurement": "coverage", "region": "R3"},
  "proof": {"status": "Verified", "reasons": ["freshness_ok","signature_ok"]},
  "evidenceDigests": {
    "measurementArtifact": "0x9a31...c0",
    "proofObject": "0x4b02...11",
    "attestation": "0x77dd...aa"
  },
  "receipt": {
    "receiptId": "rcpt_1b8e...44",
    "state": "Finalized",
    "payout": {"operatorReward": "0.42 ETH", "clientFee": "0.03 ETH"}
  }
}
Failure states: what the UI should say
When verification fails, the UI should:
- Show the failing reason code(s)
- Preserve the evidence digests that were checked
- Offer a safe retry path (e.g., re-submit with a new time window or request a different node selection)
Example failure rendering:
- Proof status: Rejected
- Result: Fail
- Reason: freshness_window_missed
- Details: "Measurement timestamp 10:16:02Z is outside allowed window 10:00Z..10:15Z."
This is not just user-friendly; it prevents repeated submissions that are doomed by the same constraint.
Mind map: UX details that prevent confusion
Summary requirement checklist
- The UI displays request inputs, proof verification result, and receipt settlement in separate, labeled sections.
- Evidence is shown as digest-labeled items so the receipt is auditable.
- Status is state-specific (not one generic spinner).
- Failures include reason codes mapped to concrete explanations.
- Receipts can be exported as structured data.
When these requirements are met, users can understand what happened without needing to trust the interface blindly or reverse-engineer the protocol.
10.5 Integration Example: Building a Minimal Client for a Single Use Case
This section shows a minimal client that completes one end-to-end workflow: request a quote, submit a task to operators, collect proofs, and finalize settlement. The goal is not to cover every feature; it's to demonstrate the smallest set of moving parts that still behaves correctly.
Use case definition (what the client must do)
Assume a network where a client wants a physical measurement (e.g., "temperature reading at location X during time window T"). The client must:
- Discover the current network parameters needed to form requests.
- Create a task request with a unique id.
- Submit the task to the network.
- Receive operator submissions (proofs + metadata).
- Verify basic proof structure locally.
- Submit the finalized result to the chain.
- Wait for confirmation and record the receipt.
A minimal client can skip advanced features like multi-round renegotiation, operator reputation scoring, or complex dispute UX. It should still handle timeouts and idempotency.
Mind map: minimal client components
Data model: the smallest set of types
Use explicit types so you can reason about what is safe to retry.
- Quote: includes price, currency, maxTaskDuration, and any required request fields.
- TaskRequest: includes requestId, taskId, measurementSpec, timeWindow, and paymentPlan.
- OperatorSubmission: includes taskId, operatorId, proof, evidenceRefs, and submissionId.
- FinalResult: includes taskId, aggregatedValue (or selected value), proofBundle, and qualityScore.
- Receipt: includes taskId, chainTxHash, status, and settlementAmount.
Step-by-step workflow with concrete examples
1) Configuration and idempotency
Pick a stable clientId and generate a requestId per user action.
Example values:
- clientId: client-7f2a
- requestId: req-20260324-0012
- idempotencyKey: idem-req-20260324-0012
Idempotency matters because retries can happen after network hiccups. The client should reuse the same idempotencyKey for the same requestId.
2) Quote
Call quote() to learn what the network expects.
Example request:
- measurementSpec: { type: "temperature", unit: "C" }
- location: { lat: 40.741, lon: -73.989 }
- timeWindow: { start: 1710000000, end: 1710000600 }
Example quote response:
- price: 0.02
- currency: ETH
- maxTaskDuration: 300s
- requiredFields: ["location", "timeWindow", "measurementSpec"]
The client should validate that the quote's required fields exist in the task request before submitting.
3) Submit task
Create a TaskRequest and submit it.
Example TaskRequest (conceptual):
- requestId: req-20260324-0012
- taskId: task-9c1b (generated by client or returned by network)
- measurementSpec: { type: "temperature", unit: "C" }
- timeWindow: { start: 1710000000, end: 1710000600 }
- paymentPlan: { escrow: true, maxFee: 0.02 ETH }
The client should include:
- idempotencyKey: idem-req-20260324-0012
- clientSignature: signature over the task request fields
If the network returns "already exists" for the idempotency key, the client should switch to polling rather than failing.
4) Poll submissions
Poll pollSubmissions(taskId) until either:
- enough submissions arrive to meet the verification threshold, or
- maxTaskDuration expires.
Example threshold logic (minimal):
- require at least k=3 operator submissions
- accept submissions until now > start + maxTaskDuration
Store each OperatorSubmission by submissionId to avoid double-processing.
5) Local verification (basic, not heroic)
Before touching the chain, do cheap checks:
- Proof schema matches expected structure.
- Evidence references are present (e.g., commitment hashes).
- If proofs include operator signatures, verify them.
- If proofs include commitments, recompute hashes from evidence refs.
Example local checks:
- proof.type == "temperature_measurement_v1"
- proof.commitment == hash(evidenceBytes)
- operatorSignature verifies against operatorId's public key (from registry data fetched earlier)
If a submission fails local checks, mark it as invalid and continue polling until you have enough valid submissions.
6) Aggregate or select a result
A minimal client can use a simple selection rule. For instance:
- choose the median of submitted values
- compute qualityScore as the fraction of valid submissions
Example submissions (values in °C):
- operator A: 22.1
- operator B: 22.3
- operator C: 22.0
Median is 22.1. If only two valid submissions arrive, the client can either fail (strict mode) or proceed with a smaller set (lenient mode). For a minimal client, strict mode is easier to reason about.
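A strict-mode version of this selection rule might look like the sketch below. The `ScoredSubmission` shape and the `aggregateStrict` name are hypothetical, and qualityScore is computed here as valid submissions divided by all submissions received, which is one reading of "fraction of valid submissions".

```typescript
type ScoredSubmission = { operatorId: string; value: number; valid: boolean };

type Aggregate = { aggregatedValue: number; qualityScore: number };

// Strict mode: fail when fewer than k valid submissions are available.
function aggregateStrict(subs: ScoredSubmission[], k: number): Aggregate {
  // filter/map produce a new array, so sorting does not reorder the caller's data.
  const values = subs
    .filter((s) => s.valid)
    .map((s) => s.value)
    .sort((a, b) => a - b);
  if (values.length < k) throw new Error("threshold_not_met");

  const mid = Math.floor(values.length / 2);
  // Odd count: middle element. Even count: mean of the two middle elements.
  const median =
    values.length % 2 === 1 ? values[mid] : (values[mid - 1] + values[mid]) / 2;

  return { aggregatedValue: median, qualityScore: values.length / subs.length };
}
```

For the three submissions above (22.1, 22.3, 22.0) the sorted values are 22.0, 22.1, 22.3, so the function selects 22.1. Sorting before selection also makes the result independent of arrival order, which matters if the aggregate is later checked against an on-chain value.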
7) Submit result to chain
Call submitResult(finalResult).
The client should include:
- taskId
- aggregatedValue
- proofBundle (the subset of operator proofs used)
- qualityScore
- a signature from the client (if required by the contract)
If the chain call fails due to "result already submitted," fetch the receipt and stop.
8) Wait for receipt
Call getReceipt(taskId) until status is Finalized (or equivalent). Record:
- chainTxHash
- settlementAmount
- finalValue
Minimal client mind map: control flow
Control Flow
Start
-> loadConfig
-> requestId = new
-> idempotencyKey = f(requestId)
-> quote = quote(measurementSpec, location, timeWindow)
-> task = buildTask(quote, requestId)
-> submitTask(task, idempotencyKey)
   -> if alreadyExists: taskId = existing
-> submissions = []
-> loop until threshold or timeout
   -> newSubs = pollSubmissions(taskId)
   -> for each sub: if not seen -> localVerify
      -> if valid: add
-> if threshold not met: fail with reason
-> final = aggregate(validSubs)
-> submitResult(final)
-> receipt = waitReceipt(taskId)
-> return receipt
Example pseudo-implementation (kept intentionally small)
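type Receipt = { taskId: string; txHash: string; status: string; amount: string };

// Assumed to exist elsewhere: api (network client), buildTask, localVerify, aggregateMedian.
const REQUIRED_VALID = 3; // verification threshold k
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runMinimalClient(input: {
  measurementSpec: any; location: any; timeWindow: { start: number; end: number };
}): Promise<Receipt> {
  const requestId = `req-${Date.now()}`;
  const idempotencyKey = `idem-${requestId}`; // reuse this key on every retry of submitTask

  const quote = await api.quote(input, { requestId });
  const task = buildTask({ ...input, quote, requestId });
  // On "already exists", the network is expected to return the existing taskId.
  const submit = await api.submitTask(task, { idempotencyKey });
  const taskId = submit.taskId;

  const validSubs: any[] = [];
  const seen = new Set<string>(); // submissionIds already processed
  // Assumes maxTaskDuration is a number of seconds (the example quote shows 300s).
  const deadline = Date.now() + quote.maxTaskDuration * 1000;

  while (Date.now() < deadline && validSubs.length < REQUIRED_VALID) {
    const subs = await api.pollSubmissions(taskId);
    for (const s of subs) {
      if (seen.has(s.submissionId)) continue; // never process a submission twice
      seen.add(s.submissionId);
      if (localVerify(s)) validSubs.push(s);
    }
    if (validSubs.length < REQUIRED_VALID) await sleep(1000);
  }

  // Strict mode: fail rather than finalize with too few valid submissions.
  if (validSubs.length < REQUIRED_VALID) throw new Error("threshold_not_met");

  const finalResult = aggregateMedian(validSubs);
  await api.submitResult(finalResult);
  return await api.waitReceipt(taskId);
}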
Practical notes that prevent common integration bugs
- Retry boundaries: retry quote() and pollSubmissions() freely; retry submitTask() only with the same idempotencyKey.
- Proof handling: treat proofs as immutable blobs; store them by submissionId so you can reproduce what you aggregated.
- Local verification scope: keep it to checks that are deterministic and cheap; if you need heavy computation, do it only on already-valid submissions.
- Aggregation determinism: ensure the aggregation rule is deterministic given the same set of valid submissions, so the on-chain result matches what you expect.
Minimal client output contract
Return a Receipt object to the caller with enough information to display progress and reconcile outcomes:
- taskId
- txHash
- status
- amount
- optionally finalValue and usedOperators (useful for debugging and audit trails)
This is the smallest client that still respects the network's lifecycle: request, collect, verify, finalize, and record.
11. Security Architecture and Threat-Driven Controls
11.1 Threat Modeling Scope Example Assets, Actors, and Attack Surfaces
Threat modeling starts by drawing a clean boundary around what you care about. In a DePIN network, "care about" usually means: (1) money, (2) measurements and proofs, (3) who is allowed to participate, and (4) availability of the service. The scope section should name these explicitly, then map who can touch them and how.
Scope: define the assets (what must stay correct)
Use a short list of assets with a one-line definition and a concrete failure mode.
- On-chain settlement state: eligibility, reward amounts, and finality markers. Failure mode: incorrect payouts or payouts that can't be reconciled.
- Proof artifacts (off-chain): signed measurements, proof bundles, and metadata linking them to a task. Failure mode: tampered artifacts that still verify, or valid artifacts attached to the wrong task.
- Measurement integrity: the relationship between a real-world observation and the submitted proof. Failure mode: fake measurements that pass verification due to weak freshness or weak challenge design.
- Node identity and membership: keys, registration records, and revocation status. Failure mode: unauthorized nodes join, or revoked nodes continue operating.
- Task assignment and result routing: mapping from a client request to a specific node and expected output. Failure mode: results are swapped between tasks or replayed.
- Client request data: parameters that define what was requested and how it should be validated. Failure mode: clients accept incorrect results because validation rules are ambiguous.
- Operational availability: the ability to submit proofs, verify them, and settle. Failure mode: queues back up, verification stalls, or nodes are effectively excluded.
A helpful rule: if you can't describe the failure mode in one sentence, the asset is probably too vague.
Scope: define the actors (who can act)
Actors should be concrete roles, not job titles. For each actor, note their capabilities.
- Client: requests work, verifies responses, and triggers settlement. Capabilities: can be honest or misconfigured; can be malicious if it controls the request.
- Node operator: runs measurement hardware/software and submits results. Capabilities: can be honest, negligent, or adversarial.
- Verifier / coordinator: checks proofs, enforces eligibility, and prepares on-chain updates. Capabilities: may be centralized or distributed; can be compromised.
- Smart contract / protocol logic: enforces rules deterministically. Capabilities: cannot be "hacked" directly, but can be exploited through design flaws.
- Network adversary: can intercept, delay, reorder, or drop messages. Capabilities: can replay old messages if freshness is weak.
- Indexer / off-chain services: build read models and assist operators. Capabilities: can be wrong or incomplete, but should not affect settlement correctness.
- Governance participants: update parameters and policies. Capabilities: can be honest or malicious within the rules.
If your system has multiple verifiers, treat them as separate actors because compromise impact differs.
Scope: define attack surfaces (where things can go wrong)
Attack surfaces are the interfaces where an adversary can influence inputs, outputs, or timing.
- Identity and admission
  - Registration endpoints and key submission.
  - Proof of control (e.g., signing challenge, certificate validation).
  - Revocation and key rotation flows.
- Task lifecycle
  - Task creation and assignment.
  - Result submission endpoints.
  - Status updates and retries.
- Proof submission and verification
  - Proof bundle format parsing.
  - Signature verification and domain separation.
  - Verification pipeline stages and thresholds.
- On-chain interaction
  - Contract methods for registering, reporting, and settling.
  - Event emission and parameter updates.
  - Challenge/dispute windows.
- Off-chain storage and retrieval
  - Storage upload/download APIs.
  - Content addressing and integrity checks.
  - Metadata that links proofs to tasks.
- Networking and transport
  - Message schemas, authentication, and replay protection.
  - Retry logic and idempotency keys.
  - Peer-to-peer or coordinator-to-node communication.
- Operational tooling
  - Admin actions, operator dashboards, and runbooks.
  - Secrets management and key storage.
  - Monitoring/alerting pipelines that trigger automated actions.
Mind map: scope overview
Concrete example: one end-to-end flow and its threats
Consider a simplified flow: a client requests a measurement, a node submits a proof, a verifier checks it, and the contract settles rewards.
Assets touched: client request data, task routing, proof artifacts, settlement state.
Threats by surface:
- Task routing swap: An adversary replays a previously valid proof bundle but changes only the task identifier in transit. If the proof verification does not bind the proof to the exact task parameters (including a domain-separated task hash), the verifier might accept it.
- Freshness failure: A network adversary delays a valid submission until after a challenge window closes. If the contract or verifier doesn't enforce time bounds consistently, the system may settle stale results.
- Identity confusion: A node rotates keys but the verifier still trusts the old key for eligibility. If revocation and key rotation are not synchronized with admission checks, revoked nodes can keep reporting.
- Parser ambiguity: A malicious node crafts a proof bundle that parses differently across components (e.g., different canonicalization rules). If the verifier and the contract disagree on what was "actually submitted," reconciliation breaks.
- Availability degradation: A compromised verifier accepts tasks but never finalizes verification, causing clients to time out and operators to waste resources. Even if settlement is safe, the service becomes unusable.
Each threat should map to a control you expect to exist. For example, binding proofs to task hashes addresses routing swaps; enforcing consistent time windows addresses freshness; synchronizing identity state addresses identity confusion.
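The first of these controls, binding a proof to a domain-separated task hash, can be sketched as follows. This is a minimal illustration under stated assumptions, not the protocol's actual encoding: the domain tag, the field order, and the helper names (task_hash, binding_digest) are hypothetical.

```python
import hashlib

# Hypothetical domain-separation label; a real protocol would fix its own.
DOMAIN_TAG = b"depin/task-binding/v1"


def _length_prefixed(data: bytes) -> bytes:
    """Prefix a field with its length so adjacent fields cannot be re-split."""
    return len(data).to_bytes(4, "big") + data


def task_hash(task_id: bytes, device_id: bytes,
              window: tuple[int, int], nonce: bytes) -> bytes:
    """Hash the exact task parameters under a domain tag."""
    start, end = window
    h = hashlib.sha256()
    for part in (DOMAIN_TAG, task_id, device_id,
                 start.to_bytes(8, "big"), end.to_bytes(8, "big"), nonce):
        h.update(_length_prefixed(part))
    return h.digest()


def binding_digest(task_h: bytes, proof_bytes: bytes) -> bytes:
    """Digest a node would sign: covers both the task hash and the proof content."""
    proof_h = hashlib.sha256(proof_bytes).digest()
    return hashlib.sha256(_length_prefixed(task_h) + _length_prefixed(proof_h)).digest()
```

Because the signed digest depends on the task hash, a proof bundle replayed under a different task identifier produces a different digest and fails signature verification.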
Practical scoping checklist (what to write down)
- Trust assumptions: what you assume about each actor (e.g., "verifier is honest-but-bounded," or "contract is correct by construction").
- Invariants: statements that must always hold, such as "a settled reward corresponds to a proof that verifies against the exact task parameters."
- Out of scope: explicitly list what you are not modeling (e.g., hardware side-channel attacks) so the team doesn't waste time.
- Severity criteria: define what counts as high impact (e.g., direct theft of rewards vs. temporary delays).
A good scope section ends with a one-paragraph summary: which assets are protected, which actors are considered adversarial, and which interfaces are in scope for manipulation.
11.2 Authentication and Authorization Example Role-Based Access for Admin Actions
A DePIN network typically has multiple "admin-like" capabilities: admitting nodes, changing verification parameters, pausing settlement, and reviewing disputes. Authentication answers "who are you?" Authorization answers "what are you allowed to do?" In practice, RBAC (role-based access control) keeps these decisions explicit and testable.
The goal: separate identity from permissions
Use a clear split:
- Authentication: establish a principal (user or service) using signed credentials.
- Authorization: map that principal to roles, and roles to permissions.
A common mistake is to treat "being logged in" as permission. Instead, treat login as identity proof, then apply RBAC rules for each admin action.
Mind map: RBAC for admin actions
Example roles and permissions
Define roles that match real workflows, not org charts. Here's a practical set for a DePIN admin surface:
- NetworkAdmin: admits and revokes nodes across the whole network.
- OperatorAdmin: manages operator-specific settings (like operator metadata) but cannot change global verification parameters.
- ParameterMaintainer: updates verification parameters and quality thresholds.
- DisputeReviewer: can submit dispute decisions but not pause settlement.
- EmergencyPauser: can pause settlement only under strict conditions (for example, after a signed incident ticket).
Permissions are the atomic actions your system checks:
- node_admit
- node_revoke
- parameter_update
- settlement_pause
- dispute_decision
Then map roles to permissions.
Concrete example: admin API request flow
Assume an admin endpoint:
POST /admin/nodes/{nodeId}/admit
The request includes:
- An Authorization header with a signed token.
- A JSON body with the admission reason and any required evidence.
- An X-Request-Id header for traceability.
Server-side enforcement:
1. Authenticate the token.
2. Extract principal_id and key_id.
3. Load the principal's roles from a trusted store.
4. Check whether the principal has node_admit permission for the target network_id.
5. Validate the request body schema and evidence requirements.
6. Record an audit log entry.
7. Execute the action.
The key nuance: step 4 must include resource scoping. If the network has multiple operators, a role might allow actions only for a specific operator_id.
Resource scoping: avoid "global admin by accident"
RBAC becomes safer when permissions are scoped. For example:
- OperatorAdmin may have parameter_update only for its own operator.
- ParameterMaintainer may have parameter_update only for network_id = mainnet.
A simple policy model:
- Permission check uses (permission, resource).
- Resource is derived from the URL path and request body.
Example:
- Request: POST /admin/operators/op-42/parameters
- Resource: operator_id = op-42
- Role: OperatorAdmin
- Permission: parameter_update
If the token belongs to an admin for op-7, the check fails even though the role name matches.
Example policy table (human-readable)
| Role | Permission | Resource scope | Example action |
|---|---|---|---|
| NetworkAdmin | node_admit | network_id | Admit node to network |
| NetworkAdmin | node_revoke | network_id | Revoke misbehaving node |
| ParameterMaintainer | parameter_update | network_id | Update quality threshold |
| OperatorAdmin | parameter_update | operator_id | Update operator metadata |
| DisputeReviewer | dispute_decision | network_id + dispute_id | Approve dispute outcome |
| EmergencyPauser | settlement_pause | network_id + incident_ticket | Pause settlement |
Enforcement with "deny by default"
Authorization should default to deny when any required data is missing:
- Missing token → deny.
- Token valid but roles not found → deny.
- Permission not present for the scoped resource → deny.
- Request lacks required evidence fields → deny.
This is boring, which is good. It prevents accidental access when configuration is incomplete.
Example: role check pseudocode
function authorize(principal, action, resource):
    roles = loadRoles(principal.id)
    for role in roles:
        if role.allows(action, resource):
            return true
    return false  // no matching grant: deny by default

function handleAdmitNode(request):
    principal = authenticate(request.token)
    if principal is null:
        return 401  // missing or invalid token
    resource = { network_id: request.networkId }
    if not authorize(principal, 'node_admit', resource):
        return 403
    if not validateEvidence(request.body):
        return 400  // required evidence fields missing or malformed
    auditLog(principal, 'node_admit', resource, request.body)
    admitNode(request.nodeId)
    return 200
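The same check can be written as a small runnable sketch. It assumes an in-memory grant store and illustrative principals ("alice", "bob"); the Grant type and the ROLE_GRANTS table are hypothetical stand-ins for whatever trusted store the admin service actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    permission: str   # e.g. "node_admit"
    scope_key: str    # e.g. "network_id" or "operator_id"
    scope_value: str  # e.g. "mainnet" or "op-42"


# Hypothetical grant store keyed by principal; roles are already expanded to grants.
ROLE_GRANTS = {
    "alice": [Grant("node_admit", "network_id", "mainnet")],
    "bob": [Grant("parameter_update", "operator_id", "op-7")],
}


def authorize(principal_id: str, permission: str, resource: dict) -> bool:
    """Return True only when a grant matches both the permission and the scoped resource."""
    if not principal_id or not permission or not resource:
        return False  # missing inputs: deny by default
    for grant in ROLE_GRANTS.get(principal_id, []):
        if (grant.permission == permission
                and resource.get(grant.scope_key) == grant.scope_value):
            return True
    return False  # unknown principal or no matching grant
```

With these grants, a parameter_update request from "bob" against operator_id = op-42 is denied, because the grant is scoped to op-7 even though the permission name matches.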
Multi-step constraints for sensitive admin actions
Some actions should require more than a single role check. For example, settlement_pause can require:
- Role: EmergencyPauser
- Evidence: an incident_ticket_id
- Constraint: only one pause per hour per network
- Optional: quorum approval if your governance model supports it
Even if you don't implement quorum, the evidence requirement is already a big improvement because it forces the caller to provide context that can be audited.
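A minimal sketch of the role, evidence, and rate constraints for settlement_pause follows. The PauseGuard class, its reason strings, and the in-memory state are assumptions for illustration; quorum approval is left out.

```python
PAUSE_COOLDOWN_SECONDS = 3600  # "only one pause per hour per network"


class PauseGuard:
    """Tracks the last accepted pause per network (in memory, for illustration)."""

    def __init__(self):
        self._last_pause = {}  # network_id -> time of last accepted pause

    def try_pause(self, network_id, roles, incident_ticket_id, now):
        """Return (allowed, reason). State changes only when the pause is allowed."""
        if "EmergencyPauser" not in roles:
            return False, "missing_role"
        if not incident_ticket_id:
            return False, "missing_evidence"
        last = self._last_pause.get(network_id)
        if last is not None and now - last < PAUSE_COOLDOWN_SECONDS:
            return False, "rate_limited"
        self._last_pause[network_id] = now
        return True, "ok"
```

Returning a reason alongside the decision gives the audit log something concrete to record for denied attempts.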
Audit logging: what to record and why
An audit log entry should include:
- timestamp
- principal_id and key_id
- action (permission)
- resource identifiers (network/operator/dispute)
- request_id for correlation
- result (success/failure)
- diff summary for changes (for example, old vs new parameter values)
A practical detail: store the effective authorization decision inputs (roles used, resource scope) so you can explain "why" later without re-running ambiguous logic.
Service accounts and mTLS: keep admin actions from being "just another client"
Admin actions often come from:
- Human operators (web console)
- Backend services (automation jobs)
Use different authentication mechanisms:
- Human console: signed tokens with short expiry.
- Internal admin service-to-service calls: mTLS with service identity.
Then apply the same RBAC checks in the admin service. That way, even if a service account is compromised, it still canât do actions outside its scoped roles.
Common pitfalls and how RBAC helps
- Over-broad roles: "Admin" that can do everything. Fix by splitting roles by workflow.
- No resource scoping: Fix by including network_id or operator_id in permission checks.
- No evidence for sensitive actions: Fix by requiring fields like incident_ticket_id and validating them.
- Authorization scattered across code paths: Fix by centralizing checks in the admin service layer.
RBAC is not a magic shield, but it makes the system's access rules explicit. When you can point to a permission check and see exactly which roles grant it, you can test it, review it, and keep it from quietly turning into "whoever can call the endpoint can do everything."
11.3 Data Tampering Defenses: Example Signed Artifacts and Hash Anchoring
In a DePIN pipeline, "data tampering" usually means someone changes measurement inputs, proof artifacts, or the metadata that ties them together. The defense goal is simple: if the bytes change, the system must notice, and the on-chain record must make the mismatch provable.
Threats to defend against
- Measurement tampering: a node alters raw sensor readings before producing a proof.
- Proof substitution: a node swaps a valid proof for one measurement with a proof for a different measurement.
- Metadata drift: a node changes timestamps, task IDs, or device identifiers so the proof appears to match a different request.
- Replay: an attacker reuses an old proof or artifact for a new task.
These attacks are defeated by combining signed artifacts (authenticity) with hash anchoring (immutability and linkage).
Mind map: tampering defenses
Signed artifacts: what to sign and why
A common mistake is signing only a few metadata fields like taskId and timestamp. That can still allow an attacker to replace the measurement bytes while keeping the signed fields unchanged. The fix is to sign a payload that includes hashes of the actual content, plus the metadata that binds it to a specific request.
Example payload structure
Assume the client requests a measurement for a specific task:
- taskId: 0x9a...31
- deviceId: node-17
- timeWindow: [1700000000, 1700000060]
- challengeNonce: 0x4c...aa (provided by the verifier to prevent replay)
The node produces:
- measurementBytes: raw sensor data (or a canonical serialization)
- proofBytes: proof artifact (e.g., zk proof, attestation bundle, or signed statement)
The node computes:
- measurementHash = SHA256(measurementBytes)
- proofHash = SHA256(proofBytes)
Then it signs the following canonical payload:
payload = { taskId, deviceId, timeWindow, challengeNonce, measurementHash, proofHash }
The signature is created with the node's long-term signing key (or an active rotation key registered in the membership contract).
Why hash-first signing works
Signing the full measurementBytes can be large and slow. Hash-first signing keeps the signed payload small while still binding the signature to the exact content. If any byte changes, the corresponding hash changes, and signature verification fails.
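As a rough sketch of hash-first signing, the function below builds the payload from content hashes and returns the digest that would be handed to the signer. The canonical JSON encoding and the function name signing_digest are assumptions made for this example; the signature step itself is omitted because it depends on the key scheme in use.

```python
import hashlib
import json


def signing_digest(task_id, device_id, time_window, challenge_nonce,
                   measurement_bytes, proof_bytes):
    """Build the signed payload from content hashes and return (payload, digest)."""
    payload = {
        "taskId": task_id,
        "deviceId": device_id,
        "timeWindow": list(time_window),
        "challengeNonce": challenge_nonce,
        "measurementHash": hashlib.sha256(measurement_bytes).hexdigest(),
        "proofHash": hashlib.sha256(proof_bytes).hexdigest(),
    }
    # Assumed canonical form: sorted keys, no insignificant whitespace.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return payload, hashlib.sha256(canonical).digest()
```

The digest has a fixed 32-byte size regardless of how large the measurement is, and changing a single measurement byte changes measurementHash and therefore the digest.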
Hash anchoring: making the linkage public
Signatures prove "who produced this," but they don't automatically prove "this exact content was the one accepted for this task" unless the system records a commitment. Hash anchoring provides that record.
Example anchoring flow
- Client or verifier creates a task with taskId and a challengeNonce.
- Node submits:
  - measurementBytes (or a content-addressed reference)
  - proofBytes
  - signedPayload containing measurementHash and proofHash
- On-chain contract stores commitments:
  - commitment[taskId].measurementHash = measurementHash
  - commitment[taskId].proofHash = proofHash
- Later, during verification, the contract (or verifier logic) checks that the submitted bytes hash to the stored values.
This prevents proof substitution. If someone tries to swap proofBytes, the recomputed proofHash won't match the anchored one.
Concrete example: end-to-end verification checks
Below is a minimal verification checklist for a single task submission.
Inputs:
- taskId, deviceId, timeWindow, challengeNonce
- measurementBytes, proofBytes
- signedPayload (signature + payload fields)
- anchored hashes from chain: anchoredMeasurementHash, anchoredProofHash
Steps:
- measurementHash' = SHA256(measurementBytes)
- proofHash' = SHA256(proofBytes)
- Verify signature over canonical(payload): payload includes taskId, deviceId, timeWindow, challengeNonce, measurementHash, proofHash
- Check payload.taskId == taskId (and similarly for deviceId/timeWindow/nonce)
- Check payload.measurementHash == measurementHash'
- Check payload.proofHash == proofHash'
- Check anchoredMeasurementHash == measurementHash'
- Check anchoredProofHash == proofHash'
- If any check fails: reject and emit mismatch details
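The checklist can be sketched as a single function that collects every mismatch instead of stopping at the first one. Signature verification is passed in as a callable because it depends on the key scheme; that parameter, the dictionary shapes, and the message format are assumptions of this sketch.

```python
import hashlib

BOUND_FIELDS = ("taskId", "deviceId", "timeWindow", "challengeNonce")


def verify_submission(expected, payload, signature, measurement_bytes,
                      proof_bytes, anchored, verify_signature):
    """Run the checklist; return a list of mismatches (an empty list means accept)."""
    measurement_hash = hashlib.sha256(measurement_bytes).hexdigest()
    proof_hash = hashlib.sha256(proof_bytes).hexdigest()
    mismatches = []

    # Signature check is delegated to a caller-supplied function (assumption).
    if not verify_signature(payload, signature):
        mismatches.append("signature: invalid")

    # The signed metadata must match the task the verifier is processing.
    for name in BOUND_FIELDS:
        if payload.get(name) != expected.get(name):
            mismatches.append(
                f"{name}: expected {expected.get(name)!r}, received {payload.get(name)!r}")

    # Both the signed hashes and the anchored hashes must match the recomputed ones.
    hash_checks = (
        ("payload.measurementHash", payload.get("measurementHash"), measurement_hash),
        ("payload.proofHash", payload.get("proofHash"), proof_hash),
        ("anchoredMeasurementHash", anchored.get("measurementHash"), measurement_hash),
        ("anchoredProofHash", anchored.get("proofHash"), proof_hash),
    )
    for label, want, got in hash_checks:
        if want != got:
            mismatches.append(f"{label}: expected {want}, received {got}")
    return mismatches
```

Collecting all mismatches costs little and gives a dispute the full picture, for example both the signed and the anchored proof hash diverging after a swap.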
Example mismatch event
When rejecting, include enough data to diagnose without leaking secrets:
- expected anchoredProofHash
- received proofHash'
- expected challengeNonce
- received payload.challengeNonce
This makes disputes practical because you can point to the exact field that diverged.
Replay defense: freshness inside the signed payload
Replay attacks succeed when an old proof can be reused for a new task. The defense is to include a freshness value that is unique per task, such as challengeNonce or a verifier-provided challengeSeed.
Key rule: the freshness value must be included in both:
- the signed payload (so it canât be altered without breaking the signature)
- the on-chain anchoring (so the contract can enforce it matches the task)
If a node tries to submit an old signed payload, the signature verification might still pass, but the payload's taskId or challengeNonce won't match the current task, so the submission is rejected.
Canonicalization: avoid "same meaning, different bytes"
Hash anchoring assumes that the bytes being hashed are canonical. If the measurement serialization is ambiguous, two parties can compute different hashes for the same logical data.
Practical rule: define a canonical encoding for measurement and proof metadata.
Example canonicalization choices:
- JSON with sorted keys and fixed number formatting (or avoid JSON entirely)
- fixed-width binary encoding for numeric fields
- explicit units in the encoded representation
If you skip canonicalization, you can end up with false rejects that look like tampering.
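One way to pin down a canonical encoding is sketched below: JSON with sorted keys, no insignificant whitespace, integer-only numeric fields, and units carried in the field names. These specific rules, and the field name surface_temp_milli_c, are illustrative assumptions rather than a prescribed format.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Encode a flat record deterministically: sorted keys, compact separators."""
    for key, value in record.items():
        if isinstance(value, float):
            # Float formatting differs across languages; this sketch rejects
            # top-level floats and expects scaled integers instead.
            raise TypeError(f"{key}: use a scaled integer instead of a float")
    return json.dumps(record, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=True).encode("ascii")


def content_hash(record: dict) -> str:
    return hashlib.sha256(canonical_bytes(record)).hexdigest()
```

Two records with the same fields inserted in a different order hash identically under this encoding, which is the property hash anchoring relies on.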
Dispute-ready evidence: what to store and what to recompute
A dispute mechanism should allow a challenger to prove that the accepted commitment doesnât match the submitted bytes.
A robust pattern is:
- store only hashes on-chain (small and immutable)
- store signed payloads off-chain (or submit them during dispute)
- during dispute, recompute hashes from the provided bytes and compare to anchored hashes
This keeps the chain lean while still making tampering provable.
Summary of the defense design
- Signed artifacts ensure authenticity and bind content to task-specific metadata.
- Hash anchoring ensures immutability and prevents substitution after acceptance.
- Freshness values inside the signed payload stop replay.
- Canonicalization prevents accidental hash mismatches.
Together, these checks turn "maybe someone changed something" into "the system can point to the exact mismatch and reject it deterministically."
11.4 Replay, Ordering, and Freshness Example Nonces and Timestamp Windows
A DePIN network usually has a simple problem hiding under the hood: messages arrive late, arrive twice, and sometimes arrive in the wrong order. If you treat every incoming proof submission as equally valid, you invite replay attacks (old proofs paid again) and ordering bugs (a later state update overwrites an earlier one). The fix is not magic; it's disciplined freshness checks, explicit nonces, and deterministic ordering rules.
Core goals
- Replay resistance: A proof submission should be accepted only once for a given task.
- Ordering correctness: If two submissions relate to the same task, the protocol must define which one wins.
- Freshness enforcement: Submissions must be "recent enough" relative to the task's expected time window.
Mind map: where replay and ordering bugs come from
Nonces: the simplest replay stopper that actually works
A nonce is a unique value included in what gets signed. If an attacker replays an old signed message, the nonce check fails because the network remembers it has already processed that nonce for that task.
Example: task-scoped nonce
Assume a client creates a task request for measurement:
- taskId: 0xabc...
- operatorId: 0x123...
- nonce: random 128-bit value generated by the client for this task
The operator signs a payload that includes the nonce:
- payload = hash(taskId || operatorId || nonce || measurementHash || proofType)
- signature = Sign(operatorKey, payload)
On-chain (or in a verification service that mirrors on-chain rules), the contract stores a record:
- usedNonce[taskId][operatorId][nonce] = true
Acceptance rule:
- If usedNonce[...] is already true, reject.
- Otherwise, mark it used and proceed.
This design has two nice properties:
- The nonce is task-scoped, so you don't need a global nonce registry.
- The nonce is part of the signed payload, so an attacker can't swap it without invalidating the signature.
Practical example: idempotent submission
Operator submits proof twice due to a network retry.
- First submission: nonce N1 accepted, nonce marked used.
- Second submission: same nonce N1 arrives again.
Result: the second submission is rejected as "already processed," and the client can safely treat it as a duplicate without double-paying.
Ordering: define "winner" rules instead of hoping for the best
Ordering issues show up when multiple submissions exist for the same task. You need deterministic tie-breaking.
Example: sequence numbers per task
Let each operator maintain a monotonic sequence number for each taskId.
- Operator sends seq = 1 with an initial proof.
- If they later produce a better proof, they send seq = 2.
The signed payload includes seq:
payload = hash(taskId || operatorId || nonce || seq || measurementHash || proofType)
The contract stores:
lastSeq[taskId][operatorId]
Acceptance rule:
- Accept only if seq > lastSeq[taskId][operatorId].
- Update lastSeq to the new seq.
This prevents "late arrival overwrites newer state" because older submissions with smaller seq get rejected.
Tie-breaking when multiple operators compete
If multiple operators can submit proofs for the same task, ordering is not about sequence numbers alone. The protocol should define how to select the accepted proof, for example:
- Prefer proofs that pass verification.
- If multiple pass, apply one rule fixed in advance by the protocol: either the earliest valid submission timestamp (or lowest submissionIndex), or the highest quality score.
A clean approach is to separate acceptance (valid and fresh) from selection (which valid proof becomes the one that earns rewards).
Freshness: timestamp windows with tolerance
Freshness checks prevent old proofs from being accepted long after the task deadline. Timestamp windows work well when you treat timestamps as inputs with uncertainty, not absolute truth.
Example: client-issued timestamp and window
When the client creates the task, it includes:
- taskCreatedAt (client time)
- validFrom = taskCreatedAt
- validUntil = taskCreatedAt + windowSeconds
The operator includes taskCreatedAt (or the derived validUntil) in the signed payload.
On verification, the contract checks the current chain time now:
- Accept only if now <= validUntil
To handle clock skew and network delay, you choose a window that covers expected delays.
Example: timestamp window with tolerance
If you also want to reject proofs that arrive too early (rare, but useful for some workflows), you can use:
- now >= validFrom - tolerance
- now <= validUntil + tolerance
Where tolerance is a small constant that accounts for minor timing differences.
Why the timestamp must be signed
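If the operator can submit a proof with an arbitrary timestamp, they can extend freshness indefinitely. The timestamp (or its derived window bounds) must be included in the signed payload so it cannot be altered after signing, and the verifier should compare it against the window recorded for the task so the operator cannot choose it freely.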
Putting it together: a concrete acceptance algorithm
Below is a compact, deterministic rule set you can implement in a verifier service and mirror on-chain.
Inputs: taskId, operatorId, nonce, seq, proof, signedWindow
State: usedNonce[taskId][operatorId][nonce], lastSeq[taskId][operatorId]
Now: chain time
1) Verify signature over hash(taskId, operatorId, nonce, seq, proofHash, signedWindow)
2) Check usedNonce[taskId][operatorId][nonce] == false
3) Check seq > lastSeq[taskId][operatorId]
4) Check freshness: validFrom - tolerance <= now <= validUntil + tolerance (bounds taken from signedWindow)
5) Verify proof against measurement target for taskId
6) If all checks pass:
- usedNonce[...] = true
- lastSeq[...] = seq
- mark proof as valid for selection/reward
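The rule set above can be sketched in Python as follows. Signature and proof verification are represented by boolean stand-ins (signature_ok, proof_ok), and the sketch assumes every submission, including a resubmission with a higher seq, carries a nonce that has not been used before; both are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    task_id: str
    operator_id: str
    nonce: str
    seq: int
    valid_from: int
    valid_until: int
    signature_ok: bool  # stand-in for step 1 (signature verification)
    proof_ok: bool      # stand-in for step 5 (proof verification)


@dataclass
class Verifier:
    tolerance: int = 5
    used_nonce: set = field(default_factory=set)
    last_seq: dict = field(default_factory=dict)

    def accept(self, s: Submission, now: int) -> str:
        """Apply steps 1-6 in order; state changes only when every check passes."""
        key = (s.task_id, s.operator_id)
        if not s.signature_ok:
            return "bad_signature"
        if key + (s.nonce,) in self.used_nonce:
            return "replayed_nonce"
        if s.seq <= self.last_seq.get(key, 0):
            return "stale_seq"
        if not (s.valid_from - self.tolerance <= now <= s.valid_until + self.tolerance):
            return "not_fresh"
        if not s.proof_ok:
            return "bad_proof"
        self.used_nonce.add(key + (s.nonce,))
        self.last_seq[key] = s.seq
        return "accepted"
```

Deferring all state updates to the final step means a rejected submission never consumes a nonce or advances the sequence number.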
Example scenarios
Scenario A: replay attack
- Attacker replays an old signed proof for taskId = T.
- The old message contains nonce Nold.
- The contract already recorded usedNonce[T][operatorId][Nold] = true.
Result: rejected at step 2, regardless of proof validity.
Scenario B: out-of-order arrival
- Operator submits seq=1 then seq=2.
- Network delivers seq=2 first.
- Contract sets lastSeq[T][operatorId] = 2.
- Later, seq=1 arrives.
Result: rejected at step 3 because seq > lastSeq fails.
Scenario C: stale proof
- Task window ends at validUntil.
- Proof arrives after deadline.
Result: rejected at step 4 even if signature and proof are correct.
Design notes that prevent subtle bugs
- Nonce uniqueness scope: Use task-scoped nonces to avoid global coordination.
- Nonce storage size: Store only what you need (e.g., per task and operator) and expire entries when the task is finalized.
- Sequence numbers vs. nonces: Nonces stop replays; sequence numbers stop "older overwrites newer." You often want both. When you use both, a resubmission with a higher seq must carry a nonce that has not been used, because the nonce check rejects reuse before the sequence check runs.
- Timestamp windows: Use chain time for now, and include window bounds in the signed payload.
When these three controls are combined (nonce for replay, sequence for ordering, and signed timestamp windows for freshness), the verifier can be strict without being fragile. The protocol becomes predictable, and retries stop being a source of accidental double acceptance.
11.5 Operational Security Example Key Storage, Rotation, and Audit Logs
Operational security is where "the cryptography is correct" meets "the system still behaves correctly on a Tuesday." This subsection focuses on three practical areas: key storage, key rotation, and audit logs. The goal is to make key handling boring in the best possible way.
Key storage: keep keys where they can't be copied casually
A good key-storage design answers four questions:
- Where does the key live at rest? (disk, database, HSM, KMS, or memory-only)
- Where does the key live at runtime? (process memory, secure enclave, HSM session)
- Who can access it? (service account, operator, CI pipeline)
- How is access proven? (identity, policy checks, and logged operations)
A common pattern for DePIN nodes is to separate keys by purpose:
- Node identity key: used to sign measurements, proofs, and requests.
- Transport keys: used for TLS/mTLS.
- Signing keys for payloads: used to sign proof artifacts or receipts.
- Admin keys: used only for governance actions or emergency operations.
Example: node identity key with a KMS/HSM-backed signing API
- The private key never leaves the signing service.
- The node process requests signatures by sending a digest (or structured signing request) to the signing service.
- The signing service enforces policy: only the node's service identity can request signatures, and only for allowed key IDs.
This design reduces the risk of "someone copied a key file from a container image." It also makes rotation easier because you rotate the key in the signing service, not across every node container.
Operational checks that matter
- No plaintext keys in environment variables: environment variables often end up in logs, crash dumps, and monitoring.
- No keys in build artifacts: CI logs and caches are frequent accidental leak sources.
- Least privilege IAM: the node identity should have permission to sign, not to export.
- Locked-down admin access: admin actions require separate credentials and are logged with extra detail.
Mind map: key storage and access boundaries
Key rotation: rotate without breaking verification
Rotation has two goals that often conflict:
- Reduce exposure time: limit how long a compromised key can be used.
- Maintain continuity: existing verifiers must still accept signatures produced during the valid window.
A practical rotation strategy uses versioned key IDs and overlapping validity windows.
Example: versioned node signing keys
- Each node has a key ID like node-signing-v3.
- The node includes the key ID in every signed payload.
- Verifiers maintain a mapping of node_id -> allowed key IDs -> validity windows.
Rotation steps
- Generate a new key in the signing service.
- Publish the new public key and its validity window to the on-chain registry or a signed off-chain directory.
- Start signing with the new key while keeping the old key valid for a short overlap.
- Stop using the old key after the overlap window ends.
- Optionally revoke early if compromise is suspected.
Example: overlap window choice
If proof submissions can be delayed by network issues, choose an overlap window that covers the maximum expected delay plus a buffer. The key point is not the number; it's that the overlap is tied to system behavior, not guesswork.
Rotation mechanics that prevent foot-guns
- Atomic cutover: update the node's "current key ID" in one operation so you don't mix key IDs and signatures.
- Idempotent publishing: publishing the new key should be safe to retry without creating duplicates.
- Verifier tolerance: verifiers should accept signatures only when the key ID is known and the timestamp falls within the key's window.
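The verifier-side rule can be sketched as a lookup from node_id to key IDs to validity windows. The KeyRecord fields (not_before, not_after, revoked_at) and the nested-dictionary registry are assumptions for this example; signed_at is the timestamp carried in the signed payload.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeyRecord:
    key_id: str
    not_before: int                   # start of validity window (inclusive)
    not_after: int                    # end of validity window (inclusive)
    revoked_at: Optional[int] = None  # early revocation time, if any


def key_accepts(registry: dict, node_id: str, key_id: str, signed_at: int) -> bool:
    """Accept only known key IDs, inside their window, and before any revocation."""
    record = registry.get(node_id, {}).get(key_id)
    if record is None:
        return False  # unknown node or key ID
    if record.revoked_at is not None and signed_at >= record.revoked_at:
        return False
    return record.not_before <= signed_at <= record.not_after
```

During the overlap both the old and the new key ID pass this check; once the old key's window ends, signatures made after that time are rejected while earlier ones remain verifiable.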
Audit logs: record what happened, not just that it happened
Audit logs should answer: who did what, to which key, when, and from where. For key operations, "what" should include the action type and the target key ID.
Log categories for key operations
- Key creation: key ID, algorithm, key generation request ID.
- Key activation: when a key becomes the "current signing key."
- Key signing requests: node identity, key ID, request ID, and whether the request was allowed or denied.
- Key rotation publication: the new public key fingerprint and the validity window.
- Key revocation: reason code (e.g., operator action vs automated policy), and the effective time.
- Admin actions: any permission changes, policy updates, or export attempts.
Example audit event schema (conceptual)
- timestamp
- actor_id (service identity or admin user)
- actor_type (node service, admin console, automation)
- action (SIGN_REQUEST, KEY_ACTIVATE, KEY_REVOKE)
- key_id
- target_node_id (if applicable)
- request_id (for correlation)
- result (ALLOWED/DENIED)
- reason_code (for DENIED)
- source (host, region, or network identifier)
Example: signing request log
A node service requests a signature for a proof digest.
- If allowed: the log records SIGN_REQUEST, key_id=node-signing-v3, result=ALLOWED, and the request_id.
- If denied: the log records result=DENIED and a reason_code such as POLICY_NO_SIGN_PERMISSION.
This makes it possible to distinguish "the node is broken" from "the key policy is wrong."
Mind map: rotation and audit logging
Concrete example: end-to-end key handling workflow
Scenario: rotate a node identity signing key from v3 to v4.
- The node operator triggers rotation in the signing service.
- The signing service generates node-signing-v4 and logs KEY_CREATE with a request_id.
- The node publishes the v4 public key and validity window to the registry.
- The node updates its local "current key ID" from v3 to v4 in a single config write.
- For a defined overlap window, verifiers accept both v3 and v4 based on signature timestamps.
- After overlap, the node stops requesting signatures from v3.
- The signing service logs KEY_REVOKE or KEY_DEACTIVATE depending on your policy.
What you should be able to prove from logs
- The exact time v4 became active.
- Whether any signing requests were denied during the cutover.
- Whether verifiers rejected signatures due to key ID mismatch or timestamp outside the window.
Practical audit log retention and integrity
Audit logs are only useful if they survive tampering and operational mistakes.
- Write-once storage or append-only logs: prevent silent edits.
- Separate access controls: the component that signs should not be able to delete its own audit logs.
- Time synchronization: audit timestamps should come from a trusted time source.
- Integrity checks: store a hash chain or signed log batches so you can detect missing or altered entries.
A simple integrity approach is to batch logs every minute and sign the batch digest with an audit signing key held in the same signing service. This keeps the audit trail consistent with the rest of the system's key-handling discipline.
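The hash-chain option mentioned in the list above can be sketched in a few lines. The entry layout (event, prev_hash, entry_hash) and the all-zero genesis value are assumptions for illustration; signing the batch digest is left out because it depends on the signing service.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry's prev_hash


def _entry_hash(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, event: dict) -> dict:
    """Append an event whose hash commits to the previous entry's hash."""
    prev = chain[-1]["entry_hash"] if chain else GENESIS
    entry = {"event": event, "prev_hash": prev, "entry_hash": _entry_hash(prev, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every link; an edited or removed middle entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["entry_hash"] != _entry_hash(prev, entry["event"]):
            return False
        prev = entry["entry_hash"]
    return True
```

A chain alone does not reveal truncation of the most recent entries, which is one reason to also sign the latest entry hash or batch digest and store it separately.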
Summary checklist
- Store private keys in a signing service (KMS/HSM) and never export them.
- Use versioned key IDs and overlapping validity windows for rotation.
- Include key IDs and timestamps in signed payloads so verifiers can enforce windows.
- Log every key operation with actor, action, key ID, request ID, result, and reason codes.
- Make audit logs append-only and integrity-protected.
When these pieces are in place, key management becomes a controlled workflow rather than a recurring emergency.
12. Reliability Engineering and Operational Readiness
12.1 SLOs and Error Budgets Example Translating Metrics Into Actions
SLOs (Service Level Objectives) turn "we should be reliable" into measurable targets. Error budgets turn those targets into a decision system: when you spend too much reliability debt, you stop adding new work and fix what's breaking.
Step 1: Pick SLOs that match user impact
Start with a user-facing action and define what "good" means for that action.
Example DePIN workflow: a client submits a task, an operator produces a proof, and the network verifies and settles.
Choose SLOs that map to each stage:
- Proof submission success rate: fraction of tasks that reach the network with a valid submission within a time window.
- Proof verification latency: time from "proof accepted by the network" to "verification result finalized."
- Settlement finality timeliness: fraction of eligible tasks that reach settlement within a deadline.
A common mistake is measuring internal throughput only. Throughput can look great while users experience timeouts and missing receipts.
Step 2: Define the measurement window and the unit of account
SLOs need a consistent window and a clear denominator.
Example definitions (weekly window):
- Denominator: number of tasks created by clients during the week.
- Numerator: tasks that meet the success criteria.
- Success deadline: "within 30 minutes" for submission, "within 2 minutes" for verification.
Keep the measurement window (here, one week) the same across SLOs. If each SLO uses a different window, you'll spend time reconciling dashboards instead of fixing issues.
Step 3: Convert SLOs into error budgets
The error budget is the allowed fraction of "bad" outcomes.
For an SLO target of \(99.5\%\) over a period, the error budget is: \[ \text{Error Budget Fraction} = 1 - 0.995 = 0.005 \]
If you track bad events by count, you can compute the budget in events.
Example:
- Total tasks this week: \(T = 200{,}000\)
- SLO target: \(99.5\%\)
- Allowed failures: \(T \times 0.005 = 1{,}000\) bad tasks
Once you exceed 1,000 bad tasks (by your definition), the week's error budget is exhausted; the rate at which you approach that limit is the "budget burn."
Step 4: Add burn-rate alerts that trigger actions early
Waiting until the end of the week defeats the purpose. Burn-rate alerts detect fast spending.
Use two burn-rate windows:
- Short window: catches acute incidents (e.g., 1 hour)
- Long window: catches sustained problems (e.g., 1 day)
Example policy:
- Alert if error budget is burning at 10× the allowed rate over 1 hour.
- Alert if error budget is burning at 2× the allowed rate over 1 day.
This gives you both "something is on fire" and "something is slowly wrong."
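A burn rate is the observed bad fraction in a window divided by the allowed bad fraction. The sketch below shows one way to compute it and apply the two thresholds from the example policy. The function names and the sample counts are illustrative assumptions.

```python
def error_budget_fraction(slo_target):
    """Allowed fraction of bad outcomes, e.g. 0.005 for a 99.5% SLO."""
    return 1.0 - slo_target

def burn_rate(bad, total, slo_target):
    """Observed bad fraction divided by the allowed fraction; 1.0 means exactly on budget."""
    if total == 0:
        return 0.0
    return (bad / total) / error_budget_fraction(slo_target)

def should_alert(bad, total, slo_target, threshold):
    """True when the window is burning budget at or above the threshold multiple."""
    return burn_rate(bad, total, slo_target) >= threshold

# Hypothetical window counts for a 99.5% SLO.
short_window_fires = should_alert(bad=72, total=1_200, slo_target=0.995, threshold=10.0)
long_window_fires = should_alert(bad=210, total=28_000, slo_target=0.995, threshold=2.0)
```

In this made-up sample the 1-hour window burns at about 12× and fires, while the 1-day window burns at about 1.5× and stays quiet.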
Step 5: Translate alerts into concrete actions (the part teams skip)
Define an action ladder. Each rung has a trigger, an owner, and a stopping rule.
Action ladder example for DePIN
- Warning (budget burn detected)
  - Trigger: short-window burn-rate alert fires.
  - Owner: on-call verifier/operator coordinator.
  - Actions:
    - Pause non-essential deployments.
    - Increase retry aggressiveness for idempotent steps (e.g., proof submission retries with the same idempotency key).
    - Check for a single failing dependency (RPC timeouts, storage retrieval errors).
- Mitigation (budget burn continues)
  - Trigger: both short-window and long-window alerts fire.
  - Owner: incident commander.
  - Actions:
    - Roll back the last change that touched verification or settlement.
    - Temporarily reduce concurrency to protect downstream systems and avoid cascading failures.
    - Enforce stricter input validation to prevent malformed proofs from consuming verification capacity.
- Stabilization (budget exhausted or near-exhausted)
  - Trigger: projected budget exhaustion within the current period.
  - Owner: release manager + protocol maintainer.
  - Actions:
    - Freeze feature work.
    - Route new tasks to a "degraded but safe" path (e.g., accept proofs but delay settlement until verification backlog clears).
    - Communicate internally with a single status note: what's broken, what's being changed, and what SLO is at risk.
- Post-incident learning (after recovery)
  - Trigger: SLO violation occurred or mitigation required rollback.
  - Owner: reliability lead.
  - Actions:
    - Write a short incident report focused on the failure mode and the exact metric that moved.
    - Add or adjust one measurement definition if the SLO didn't reflect user impact.
Mind map: SLOs and error budgets to actions
Concrete example with numbers
Assume a weekly period with:
- SLO target for verification latency: \(99.0\%\) within 2 minutes
- Total tasks this week: \(50{,}000\)
Allowed failures: \[ 50{,}000 \times (1 - 0.99) = 500 \]
At 3 days in, you observe:
- Bad tasks so far: 420
- Remaining time: 4 days
- Recent burn rate (about 50 bad tasks per day) suggests you'll add ~200 more bad tasks
That projection puts the week at roughly 620 bad tasks, which exceeds the budget of 500. The action ladder should move you into stabilization mode now, not after the week ends.
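The projection is simple arithmetic, sketched here so it can run on a schedule rather than by hand. The function names are illustrative, and the daily rate of 50 is an assumption that reproduces the "~200 more over 4 days" estimate.

```python
def projected_bad(bad_so_far, recent_bad_per_day, days_remaining):
    """Linear projection of total bad tasks at the end of the period."""
    return bad_so_far + recent_bad_per_day * days_remaining

def budget_will_be_exceeded(total_tasks, slo_target, bad_so_far,
                            recent_bad_per_day, days_remaining):
    """True if the projected bad count is above the allowed failures."""
    allowed = total_tasks * (1.0 - slo_target)
    return projected_bad(bad_so_far, recent_bad_per_day, days_remaining) > allowed

# Numbers from the example: 50,000 tasks, 99.0% target, 420 bad after 3 days.
enter_stabilization = budget_will_be_exceeded(50_000, 0.99, 420, 50, 4)
```

A linear projection is a deliberate simplification; if traffic is uneven across the week, weight the projection by expected task volume.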
Practical guidance for defining "bad"
Bad outcomes must be crisp. If "verification failed" can mean five different things, you'll argue during incidents.
Example bad definitions:
- Verification latency bad: verification result not finalized within 2 minutes.
- Verification invalid bad: proof rejected due to signature mismatch or malformed proof structure.
- Verification timeout bad: verifier service timed out while fetching required artifacts.
Then map each bad type to likely causes and the first mitigation step.
Avoiding two common traps
- Trap 1: SLOs that nobody can influence. If the SLO depends on an external system you can't control, you still need an SLO, but you must also define an internal "control SLO" (e.g., internal verification queue latency) and tie actions to it.
- Trap 2: Too many SLOs. Three well-chosen SLOs for the critical path are easier to manage than twelve that overlap.
Summary
Good SLOs measure user-visible outcomes, error budgets quantify allowed unreliability, and burn-rate alerts provide early warning. The final step is the most important: predefine what you do when the budget is being spent, so reliability becomes a sequence of decisions rather than a postmortem ritual.
12.2 Backpressure and Rate Limiting Example Protecting Verification Pipelines
A DePIN verification pipeline can be thought of as a conveyor belt with a strict rule: if the belt gets overloaded, you don't speed it up; you slow down the upstream work so the system stays correct and responsive. Backpressure and rate limiting are the two main knobs.
Why verification pipelines need protection
Verification often includes expensive steps: signature checks, proof parsing, data retrieval, and sometimes multi-stage validation. When requests arrive faster than verification can complete, queues grow, latency spikes, and timeouts start causing retries. Retries can multiply load, turning a temporary burst into a sustained overload.
Backpressure prevents this by signaling "stop or slow down" to upstream components. Rate limiting prevents "too many at once" by enforcing caps per identity, per client, or per resource.
Mind map: where to apply backpressure and rate limits
A concrete pipeline and the pressure points
Assume a typical flow:
- Client submits a proof request: SubmitProof(clientId, jobId, payloadRef).
- The network assigns a verification job to a worker.
- The worker fetches payload data from storage.
- The worker verifies signatures and proof structure.
- The worker runs verification stages and emits a result.
- The client later queries status and settlement eligibility.
Pressure points:
- Intake endpoint: too many submissions at once.
- Job queue: unbounded growth.
- Worker pool: too many concurrent verifications.
- Storage/RPC: slow dependencies cause worker threads to block.
- Result submission: if result publishing is slow, workers pile up.
Backpressure strategy: bounded queues plus explicit signals
Backpressure works best when it is visible to upstream. A common pattern is bounded admission:
- Maintain a queue with a fixed maximum length.
- If the queue is full, reject new work quickly with a clear retry instruction.
- If the queue is not full, accept and enqueue.
This prevents memory blowups and keeps latency predictable.
Example: bounded admission with retry hints
- Queue capacity: 5,000 jobs.
- Per-client rate limit: 20 submissions/minute.
- Global worker concurrency: 200.
- If the queue is full, return 429 Too Many Requests with Retry-After: 2 (seconds).
Clients should treat 429 as a "slow down" signal rather than an error that triggers immediate retries.
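Bounded admission can be sketched with the standard library's `queue.Queue`, which refuses new items once `maxsize` is reached. The `admit` function and its return shape (status code plus headers) are assumptions for illustration; a real service would map them onto its HTTP or gRPC framework.

```python
import queue

# Capacity taken from the example above.
job_queue = queue.Queue(maxsize=5000)

def admit(job, q=job_queue, retry_after_s=2):
    """Enqueue the job if there is room; otherwise reject quickly with a retry hint."""
    try:
        q.put_nowait(job)  # never blocks: raises queue.Full when at capacity
    except queue.Full:
        return 429, {"Retry-After": str(retry_after_s)}
    return 202, {}
```

Using `put_nowait` rather than a blocking `put` keeps rejection fast, so a full queue never ties up intake threads.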
Rate limiting strategy: token buckets per identity and per resource
Rate limiting should be layered. A single global limit is rarely enough because one noisy client can still dominate the queue.
Use token buckets:
- Per client: cap submission rate.
- Per node operator (if operators submit tasks): cap operator-driven load.
- Per dependency: cap calls to storage or RPC.
Token buckets are simple: tokens refill at a steady rate up to a maximum burst size, and each request spends one token. If the bucket is empty, requests wait (or are rejected, depending on your policy).
Example policy
- Client submissions: 20/minute with burst 5.
- Storage fetches: 1,000/minute globally with burst 100.
- Proof parsing CPU-heavy stage: enforced by worker concurrency (see below).
This combination ensures that even if clients behave badly, the system's expensive parts remain bounded.
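A token bucket fits in a few lines. The sketch below uses the reject-when-empty policy; the class name and the optional `now` parameter (injected so the behavior can be tested without sleeping) are assumptions for illustration.

```python
import time

class TokenBucket:
    """Token bucket: refills at rate_per_s, holds at most burst tokens."""

    def __init__(self, rate_per_s, burst, now=None):
        self.rate = rate_per_s
        self.capacity = burst
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Spend one token if available; return False when the bucket is empty."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Client policy from the example: 20 per minute with burst 5.
client_bucket = TokenBucket(rate_per_s=20 / 60, burst=5)
```

In practice you would keep one bucket per client ID (for example in a dictionary) and a separate shared bucket for storage fetches.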
Concurrency caps: the most important backpressure knob
Even with rate limiting, concurrency can still spike due to bursts or retries. Concurrency caps directly limit the number of in-flight verifications.
Use semaphores:
- Global semaphore for verification workers (e.g., 200 permits).
- Optional per-stage semaphores if stages have different costs (e.g., proof parsing vs data retrieval).
If a worker tries to start a verification stage but cannot acquire a permit, it should either:
- wait briefly (bounded wait), or
- fail fast and return a "try later" status.
Waiting inside workers still ties up threads, so a short, bounded wait is usually safer than an unbounded one.
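A bounded wait on a semaphore might look like the sketch below, using `threading.BoundedSemaphore` and its `acquire(timeout=...)` form. The `run_with_permit` helper, the status strings, and the 0.5-second wait are assumptions for illustration.

```python
import threading

# Global cap from the example: at most 200 verifications in flight.
verification_permits = threading.BoundedSemaphore(200)

def run_with_permit(verify, job, permits=verification_permits, max_wait_s=0.5):
    """Run verify(job) under a permit, waiting at most max_wait_s to obtain one."""
    if not permits.acquire(timeout=max_wait_s):
        return {"status": "TRY_LATER"}  # no permit in time: fail fast
    try:
        return {"status": "DONE", "result": verify(job)}
    finally:
        permits.release()  # always return the permit, even if verify raises
```

Releasing in a `finally` block matters: a permit leaked by an exception permanently reduces capacity.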
Queue management: choosing what to drop when full
When the queue is full, you must choose a drop policy. Two common options:
- Drop newest: reject the latest submissions.
- Drop oldest: evict older jobs.
For verification pipelines, drop newest is often safer because older jobs may already be close to completion, while newest jobs are more likely to be part of a burst. However, if jobs are time-sensitive (e.g., must be verified before a deadline), drop oldest can be correct.
Example decision
- If jobs are tied to a fixed jobId and clients can resubmit later, drop newest.
- If jobs expire quickly and older ones are likely to become invalid, drop oldest.
Idempotency: preventing retry storms from multiplying work
Backpressure and rate limiting reduce overload, but retries still happen. Idempotency ensures retries donât create duplicate verification work.
Use an idempotency key derived from (clientId, jobId) or (clientId, payloadHash).
- If the same key is submitted again while a job is already in progress, return the existing job status.
- If the job completed, return the stored result.
This turns retries into "status checks," not "new work."
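The idempotency rule can be sketched with an in-memory map keyed by (clientId, jobId). The class name, status strings, and `enqueue` callback are assumptions for illustration; a production system would typically back this with a shared store and an expiry policy.

```python
import threading

class IdempotentSubmitter:
    """Deduplicates submissions by (client_id, job_id)."""

    def __init__(self, enqueue):
        self._enqueue = enqueue      # called once per new key
        self._jobs = {}              # key -> {"status": ..., "result": ...}
        self._lock = threading.Lock()

    def submit(self, client_id, job_id, payload_ref):
        key = (client_id, job_id)
        with self._lock:
            existing = self._jobs.get(key)
            if existing is not None:
                return dict(existing)  # retry: report status, create no new work
            record = {"status": "IN_PROGRESS", "result": None}
            self._jobs[key] = record
        self._enqueue(key, payload_ref)
        return dict(record)

    def complete(self, client_id, job_id, result):
        """Store the final result so later retries return it directly."""
        with self._lock:
            self._jobs[(client_id, job_id)] = {"status": "COMPLETED", "result": result}
```

The check and the insert happen under one lock so that two concurrent retries cannot both enqueue the same job.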
Dependency backpressure: circuit breakers and timeouts
If storage is slow, worker threads can block and exhaust concurrency permits. Add:
- strict timeouts for storage fetches,
- a circuit breaker that temporarily stops fetching when error rate is high,
- a fallback path that marks the job as "pending retry" rather than "failed."
Example
- Storage fetch timeout: 2 seconds.
- Circuit breaker opens after 50% failures over a 30-second window.
- While open, jobs that require storage fetches are marked DEFERRED with a retry-after of 5 seconds.
This keeps the system from spending all its capacity on a failing dependency.
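A simplified circuit breaker for the example values is sketched below. It is an illustration under stated assumptions: the `min_calls` guard, the fixed 5-second open period, and resetting the window when the breaker trips are choices made here, and there is no half-open probing state. Timestamps are passed in explicitly to keep the logic testable.

```python
from collections import deque

class CircuitBreaker:
    """Opens when the failure ratio in a sliding time window crosses a threshold."""

    def __init__(self, window_s=30.0, failure_ratio=0.5, min_calls=10, open_for_s=5.0):
        self.window_s = window_s
        self.failure_ratio = failure_ratio
        self.min_calls = min_calls    # assumed guard against tripping on tiny samples
        self.open_for_s = open_for_s
        self.events = deque()         # (timestamp, succeeded)
        self.open_until = 0.0

    def record(self, succeeded, now):
        self.events.append((now, succeeded))
        while self.events and self.events[0][0] < now - self.window_s:
            self.events.popleft()     # drop events older than the window
        failures = sum(1 for _, ok in self.events if not ok)
        if len(self.events) >= self.min_calls and failures / len(self.events) >= self.failure_ratio:
            self.open_until = now + self.open_for_s
            self.events.clear()       # start a fresh window after tripping

    def is_open(self, now):
        return now < self.open_until

def fetch_or_defer(breaker, fetch, now, retry_after_s=5):
    """Fetch via the breaker; defer the job instead of failing it."""
    if breaker.is_open(now):
        return {"status": "DEFERRED", "retry_after_s": retry_after_s}
    try:
        data = fetch()
    except Exception:                 # includes timeouts raised by the fetch
        breaker.record(False, now)
        return {"status": "DEFERRED", "retry_after_s": retry_after_s}
    breaker.record(True, now)
    return {"status": "FETCHED", "data": data}
```

The `fetch` callable is expected to enforce its own timeout (2 seconds in the example), so a slow dependency shows up as a recorded failure rather than a blocked worker.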
Putting it together: an end-to-end example
Consider a worker service with:
- bounded job queue capacity 5,000,
- global verification concurrency 200,
- per-client token bucket 20/minute,
- idempotency cache for in-flight jobs.
Flow:
- Client submits 300 jobs in 10 seconds.
- The per-client limiter allows only 20/minute plus burst 5, so most submissions receive 429 with Retry-After.
- Accepted jobs fill the queue up to capacity.
- If the queue becomes full, additional submissions are rejected immediately with 429.
- Workers process jobs up to 200 concurrent verifications.
- Storage fetches have timeouts; if storage is degraded, circuit breaker defers jobs instead of failing them.
- Client retries use the same (clientId, jobId) idempotency key, so retries return status rather than creating duplicates.
Result: the system stays stable, latency remains bounded, and clients get clear instructions on when to try again.
Practical metrics to watch (and what they mean)
- Queue length: should hover below capacity; sustained growth indicates insufficient admission control.
- Verification latency (p50/p95/p99): rising p99 often precedes timeouts.
- 429 rate: if it's high, clients are overshooting limits; if it's low during overload, limits may be too permissive.
- In-flight verifications: should not exceed concurrency caps.
- Dependency error rate and timeout rate: spikes justify circuit breaker behavior.
- Retry rate per client: high retry rates suggest idempotency or client backoff is not working.
Minimal implementation sketch (conceptual)
Admission (HTTP/gRPC)
- Check idempotency key
- if in-flight: return existing jobId/status
- if completed: return stored result
- Apply per-client token bucket
- If queue length >= capacity: return 429 + Retry-After
- Enqueue job and return accepted status
Worker
- Acquire global verification semaphore
- Fetch dependency with timeout
- If dependency circuit breaker open: mark DEFERRED
- Run verification stages
- Persist result and release semaphore
Backpressure and rate limiting are not separate features you bolt on at the end. They are the system's way of negotiating capacity: intake says "not now," workers say "I can only do so much at once," and retries become controlled status checks instead of accidental amplification.
12.3 Runbooks and Incident Response Example Handling Node Outages
A node outage is the kind of failure that looks simple from the outside ("the node is down") but becomes expensive when you consider how many other components depend on timely proofs, liveness, and settlement eligibility. A good runbook turns that complexity into a sequence of checks with clear decision points.
What "node outage" means in this context
Treat an outage as any condition that prevents a node from completing its expected workflow within defined time bounds. Typical symptoms:
- Missing heartbeats beyond the liveness window.
- Proof submissions timing out or arriving late.
- Node failing health checks (bad signatures, invalid measurement format, repeated rejected proofs).
- Node reachable on the network but not progressing (stuck job queue, repeated internal errors).
Roles during an incident
Keep responsibilities explicit so people don't "help" by doing the same thing twice.
- On-call responder: Executes the runbook steps, collects evidence, triggers mitigations.
- Protocol/contract owner (or delegate): Confirms whether on-chain actions are required (e.g., slashing, eligibility changes).
- Operations/infra: Checks host-level issues (disk, CPU, network, container health).
- Client/partner liaison: If clients are impacted, confirms whether to pause requests or adjust routing.
Evidence checklist (before changing anything)
Start with observations that can be repeated and compared.
- Time window: Record incident start time and the liveness/proof deadlines that were missed.
- Scope: Identify affected nodes (single node vs. a cluster) and affected tasks (all jobs vs. specific measurement types).
- Logs: Capture node logs around the last successful heartbeat and last successful proof submission.
- Network view: Confirm whether the node is reachable and whether requests are timing out.
- Chain view (if applicable): Check whether the node is still eligible or has already been marked inactive.
Mind map: incident flow for node outages
Step-by-step runbook
Step 1: Confirm the node is actually failing
A common mistake is to treat "no proofs received" as proof of outage. Sometimes the node is fine but the pipeline is blocked elsewhere.
- Verify the node's last heartbeat timestamp.
- Check whether the coordinator still has the node in its active set.
- Look for coordinator-side errors when dispatching tasks to that node.
Example: Node A stops sending heartbeats at 12:05 UTC. The coordinator still assigns tasks to Node A until 12:10 UTC. In this case, the outage is real, but mitigation should include both node-side recovery and coordinator-side routing changes.
Step 2: Classify the failure mode
Use a small set of categories so the next steps are predictable.
- Offline: No connectivity and no heartbeats.
- Unhealthy: Heartbeats may continue, but proofs fail validation or time out.
- Stuck: Heartbeats continue, but job progress stops (queue length grows, no new proof artifacts).
- Identity mismatch: Node can connect but submissions are rejected due to signature/key/nonce issues.
Example: Node B sends heartbeats every 30 seconds, but every proof submission is rejected with "measurement hash mismatch." That points to a local bug in measurement serialization or a config mismatch, not a network outage.
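The four categories can be encoded as a small decision function so that triage gives the same answer regardless of who is on call. This is a sketch: the `NodeSignals` fields and the order of the checks are assumptions, and a real coordinator would derive these signals from its own metrics.

```python
from dataclasses import dataclass

@dataclass
class NodeSignals:
    heartbeat_age_s: float         # seconds since the last heartbeat
    liveness_window_s: float       # configured liveness window
    jobs_progressing: bool         # new proof artifacts are being produced
    proofs_rejected_or_late: bool  # validation failures or timeouts
    identity_rejections: bool      # signature/key/nonce rejections

def classify(s):
    """Map observed signals to one failure-mode category."""
    if s.heartbeat_age_s > s.liveness_window_s:
        return "offline"
    if s.identity_rejections:
        return "identity_mismatch"
    if not s.jobs_progressing:
        return "stuck"
    if s.proofs_rejected_or_late:
        return "unhealthy"
    return "healthy"

# Node B from the example: heartbeats arrive, but every proof is rejected.
node_b = NodeSignals(20.0, 180.0, True, True, False)
```

Checking heartbeats first keeps the cheapest and least ambiguous signal at the top of the decision.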
Step 3: Mitigate quickly (reduce impact)
Mitigation should be reversible and should not require chain operations unless the protocol defines it.
- Stop assigning new work: Update the coordinator scheduler to exclude the node.
- Mark inactive locally: If your system has a local eligibility cache, set it to inactive.
- Adjust routing: Ensure clients or verifiers can select alternative nodes.
- Rate-limit retries: If the coordinator keeps retrying the same node, you can create a self-inflicted load spike.
Example: If the coordinator retries failed dispatches every 2 seconds, switch to exponential backoff and exclude the node after the first liveness breach.
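An exponential backoff schedule starting from the 2-second interval in the example could be generated as below. The doubling factor, the 60-second cap, and the retry count are assumptions; adding random jitter to each delay is a common refinement that is omitted here to keep the output deterministic.

```python
def backoff_delays(base_s=2.0, factor=2.0, cap_s=60.0, max_retries=5):
    """Delay before each retry: base, base*factor, ... capped at cap_s."""
    return [min(cap_s, base_s * factor ** i) for i in range(max_retries)]

def should_exclude(heartbeat_age_s, liveness_window_s):
    """Exclude the node from scheduling after the first liveness breach."""
    return heartbeat_age_s > liveness_window_s

delays = backoff_delays()
```

With the defaults the coordinator waits 2, 4, 8, 16, and 32 seconds between attempts instead of retrying every 2 seconds.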
Step 4: Diagnose with targeted checks
Do not run a full "everything is broken" checklist. Pick checks based on the failure mode.
Offline checks
- Host/container health: process running, container restarts, OOM events.
- Disk: free space and filesystem errors (proof artifacts often write to disk).
- Network: outbound connectivity to coordinator/verifier endpoints.
Unhealthy checks
- Proof validation errors: compare expected schema/version with node's config.
- Clock skew: if signatures or freshness windows are enforced, verify time sync.
Stuck checks
- Queue metrics: pending tasks vs. in-progress tasks.
- Worker thread health: deadlocks, blocked IO, or stuck external calls.
Identity mismatch checks
- Key rotation status: confirm node uses the currently registered public key.
- Nonce/freshness handling: ensure the node is not reusing stale nonces.
Example: Node C is stuck with a growing "pending proofs" queue. Logs show it is failing to write proof artifacts due to "no space left on device." The fix is operational (disk cleanup or resizing), not protocol changes.
Step 5: Decide whether protocol actions are needed
Only take on-chain or protocol-level actions when the protocol defines a deterministic trigger.
- Eligibility updates: If the protocol requires explicit inactivity marking, do it.
- Slashing/dispute: Trigger only when evidence matches the defined misbehavior conditions.
- Settlement pause: Pause only if settlement depends on missing proofs and your design requires a minimum quorum.
Example: Your protocol slashes only for provable invalid proofs, not for missed heartbeats. In that case, you should mark the node inactive and avoid slashing based solely on downtime.
Step 6: Restore service and validate before re-enabling
Re-enabling a node should be gated by correctness checks, not just "it's back."
- Confirm heartbeats resume within the liveness window.
- Run a short warm-up: request a small number of tasks and verify proof acceptance.
- Confirm identity and config match the current protocol version.
Example: Node D returns after a restart, but its config is still on the previous measurement schema version. Warm-up tasks fail validation, so you keep it disabled until config is corrected.
Step 7: Communicate internally and document closure
Close the incident with concrete outcomes.
- What was the root cause category (offline/unhealthy/stuck/identity mismatch).
- What mitigations were applied (routing changes, retries, eligibility updates).
- Whether any protocol actions occurred.
- Any runbook gaps discovered (e.g., missing log fields, unclear thresholds).
Practical example runbook: one node outage
Scenario: Node E misses heartbeats for 8 minutes. Coordinator liveness window is 3 minutes.
- Evidence: last heartbeat at 12:00 UTC; incident start at 12:03 UTC.
- Classification: offline (no connectivity, no heartbeats).
- Mitigation: exclude Node E from scheduler; route tasks to other nodes; apply backoff to dispatch retries.
- Diagnosis: infra checks show container repeatedly restarting due to disk full.
- Protocol actions: none (missed heartbeats only).
- Recovery: free disk, restart container, confirm heartbeats resume.
- Validation: warm-up tasks produce accepted proofs.
- Re-enable: add Node E back to active set.
- Close: update runbook to include a "disk free space" check as the first offline diagnostic.
Decision thresholds to define in advance
To keep the runbook executable under pressure, define these values in configuration:
- Liveness window (heartbeat miss threshold).
- Proof timeout (dispatch-to-acceptance deadline).
- Retry policy (max retries before exclusion).
- Warm-up size (number of tasks before re-enabling).
- Quorum behavior (what happens when too many nodes are inactive).
When these thresholds are explicit, responders spend less time arguing and more time fixing.
12.4 Monitoring and Alerting Example Dashboards for Proof Latency and Failure Rates
A DePIN network lives or dies by what happens between "a client asked for a proof" and "the chain accepted the settlement." Monitoring should therefore focus on proof latency, proof failure modes, and the operational signals that explain why things are slow or failing. The goal is simple: when an alert fires, you should be able to answer three questions quickly: Is it widespread? Is it getting worse? What component is responsible?
What to measure (and why)
Proof latency is usually the sum of multiple stages, and each stage has different failure causes.
- Queue time: time from request submission to worker pickup. High queue time often means capacity or scheduling issues.
- Acquisition time: time to collect measurements (sensor fetch, operator task execution, or data retrieval). High acquisition time often correlates with network or node health.
- Proof generation time: time to format and sign proof artifacts. High generation time can indicate CPU pressure or library-level bottlenecks.
- Verification time: time for verifiers to validate proofs (on-chain simulation, off-chain checks, or both). High verification time often correlates with verifier load or expensive checks.
- Finality/settlement lag: time from "proof accepted" to "settlement finalized." This is chain- and finality-dependent.
Failure rates should be broken down by where and how the failure occurred.
- Submission failures: client request rejected before work starts (bad parameters, missing eligibility, signature issues).
- Worker failures: operator/node couldn't complete measurement or couldn't produce required artifacts.
- Proof format failures: proof rejected due to schema mismatch, missing fields, or invalid signatures.
- Verification failures: proof fails cryptographic checks or consistency checks.
- Settlement failures: on-chain transaction reverted, or dispute/challenge flow ended without acceptance.
Mind map: dashboard layout
Dashboard 1: Proof latency overview (stage breakdown)
Use a single page that answers: "How slow is it, and where is the time going?"
Recommended panels
- Latency percentiles by stage
  - Chart: stacked or grouped bars for p50/p90/p99 of each stage.
  - Example interpretation: if queue time is flat but acquisition time jumps, the bottleneck is likely measurement collection, not worker capacity.
- Latency trend over time
  - Chart: line for p90 total latency.
  - Add a second line for throughput to avoid misreading "slower because fewer requests."
- Stage contribution breakdown
  - Chart: average stage durations as a percentage of total.
  - Example: if verification time grows from 10% to 35% of total, you likely have verifier overload or a heavier proof path.
- Top operators by median acquisition time
  - Table: operator/node id, median acquisition time, proof success rate.
  - This helps you spot "a few slow nodes" versus "the whole fleet is slow."
Concrete example thresholds
Assume you define a target for total proof latency: p90 <= 30s.
- Alert A (latency regression): trigger if p90 > 30s for 10 minutes.
- Alert B (tail risk): trigger if p99 > 60s for 10 minutes.
- Alert C (stage-specific): trigger if acquisition p90 > 20s for 10 minutes.
These are starting points; the key is that each alert maps to a stage so the on-call engineer doesn't have to guess.
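The "for 10 minutes" condition is what separates a sustained regression from a single slow minute. A minimal sketch, assuming one p90 value is computed per minute, is shown below. The nearest-rank percentile method and the function names are choices made for illustration; monitoring systems often use interpolated or histogram-based percentiles instead.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile of a non-empty list; p is in (0, 100]."""
    if not values:
        raise ValueError("percentile of empty data")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def sustained_breach(per_minute_values, threshold_s, minutes=10):
    """True only if each of the last `minutes` values exceeds the threshold."""
    recent = per_minute_values[-minutes:]
    return len(recent) == minutes and all(v > threshold_s for v in recent)

# Alert A from the example: p90 > 30s for 10 consecutive minutes.
p90_series = [31.0] * 10
alert_a = sustained_breach(p90_series, threshold_s=30.0)
```

Requiring a full window of data also prevents the alert from firing right after a restart, when only a few minutes of values exist.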
Dashboard 2: Failure rates by category (with examples)
A failure dashboard should show both rate and shape. Rate tells you how often; shape tells you whether failures are concentrated in a specific mode.
Recommended panels
- Error rate by failure category
  - Chart: stacked area or grouped bars for categories listed earlier.
  - Example: if "proof format failures" spikes while "worker failures" stays flat, the issue is likely a schema or signing change.
- Failure rate by stage
  - Chart: error rate for queue/acquisition/proof generation/verification/settlement.
  - Example: if verification failures rise but acquisition failures do not, you focus on verifier logic or proof construction.
- Top failure reasons (cardinality-limited)
  - Table: reason code, count, percentage of failures.
  - Keep reason codes stable and bounded; free-text reasons create unhelpful dashboards.
- Success rate by operator and task type
  - Heatmap: operator vs task type, colored by success rate.
  - Example: a single task type failing across many operators suggests a client-side request or proof schema issue.
Concrete example: interpreting a spike
Suppose you see:
- Total p90 latency increases from 25s to 40s.
- Worker failures increase from 1% to 8%.
- Proof format failures remain at 0.2%.
That combination points to measurement execution problems (node health, external dependencies, or timeouts) rather than serialization/signing.
Dashboard 3: Capacity and queue health (to explain latency)
Latency alerts are more actionable when you can see whether the system is overloaded.
Recommended panels
- Queue depth over time
  - Chart: requests waiting for worker pickup.
  - If queue depth rises while throughput stays flat, you have a capacity mismatch.
- Worker utilization
  - Chart: active workers / available workers.
  - If utilization is low but queue depth rises, tasks may be stuck due to scheduling constraints or eligibility filters.
- Task timeout counts
  - Chart: acquisition timeout, verification timeout, settlement timeout.
  - Example: acquisition timeouts rising with node heartbeat drops suggests node instability.
- Event ingestion health ("no data" checks)
  - Chart: last event timestamp per stream (proof requests, worker results, verification outcomes).
  - Alerts should fire if streams go silent; otherwise you'll chase phantom issues.
Alerting strategy: fewer, sharper alerts
Avoid alerting on every metric. Instead, define alerts that correspond to operational decisions.
Alert set (example)
- Latency regression: p90 total latency > target for 10 minutes.
- Tail risk: p99 total latency > 2× target for 10 minutes.
- Acquisition degradation: acquisition p90 > stage target for 10 minutes.
- Failure spike (category): worker failures > 3× baseline for 10 minutes.
- Verification failure spike: verification failures > 2× baseline for 10 minutes.
- Settlement failure spike: settlement failures > 2× baseline for 10 minutes.
- No data: missing event stream updates for 5 minutes.
Triage workflow (what the dashboard should enable)
- Confirm scope: is it all regions/operators or a subset?
- Identify stage: compare queue vs acquisition vs verification.
- Identify failure mode: category spike and top reason codes.
- Correlate with capacity: queue depth and worker utilization.
- Correlate with node health: heartbeat drops or liveness failures.
Mermaid: end-to-end proof timeline for correlation
flowchart LR
A[Client submits proof request] --> B[Queue]
B --> C[Worker picks task]
C --> D[Acquire measurement/data]
D --> E[Generate proof artifact]
E --> F[Submit for verification]
F --> G[Verification outcome]
G --> H[Settlement transaction]
H --> I[Finality reached]
subgraph Metrics
B --> M1[Queue time]
D --> M2[Acquisition time]
E --> M3[Proof generation time]
F --> M4[Verification time]
H --> M5[Settlement lag]
end
Example dashboard layout (what it looks like in practice)
Use a consistent top-to-bottom order: latency first, failures second, capacity third, then drill-down tables.
Panel order
- Total latency p50/p90/p99 (line + stage breakdown)
- Latency trend vs throughput (p90 total)
- Failure rate by category (stacked)
- Failure rate by stage (grouped)
- Top failure reasons (table)
- Queue depth + worker utilization (two lines)
- Node health summary (heartbeat success rate)
- Drill-down tables: operator/task type
When implemented this way, alerts become explanations rather than mysteries. You'll still need judgment, but the dashboard will already have done the heavy lifting: it separates "slow" into the stage that caused it and separates "failed" into the category that explains it.
12.5 Backup and Recovery Example Restoring Off-Chain Proof Metadata
Off-chain proof metadata is the "paper trail" that makes on-chain settlement understandable: it links a proof to the measurement, the evidence files, the verifier's inputs, and the exact parameters used. Backups matter because the chain can confirm that something was settled, but it usually can't reconstruct why it was considered valid.
What to back up (and what not to)
Back up metadata that is required to reproduce verification inputs and to reconcile payouts. A practical rule: if you would need it to answer "Which evidence produced this proof, under which rules, and with what version?" then it belongs in backups.
Back up
- Proof index records: mapping from proofId (or on-chain event id) to evidence locations, hashes, and parameter versions.
- Evidence manifests: file list + content hashes + sizes + canonical ordering.
- Verifier inputs: normalized measurement inputs, challenge parameters, and any derived intermediate values needed to re-run checks.
- Parameter snapshots: the exact policy/config version used when the proof was accepted.
- Audit logs: operator actions (submission, retries, challenge outcomes) with timestamps and correlation ids.
Do not back up
- Large raw media that can be re-fetched from a content-addressed store using hashes.
- Ephemeral caches that can be rebuilt from manifests.
- Secrets (keys, tokens). Those require separate key management and rotation procedures.
A concrete metadata model
Assume your system stores proof artifacts off-chain and anchors only hashes on-chain. Your off-chain database holds a record like:
- proofId
- onChainSettlementTxHash
- evidenceManifestHash
- evidenceManifestURI (content-addressed)
- verifierInputHash
- parameterVersion
- policyHash
- acceptedAt and acceptedByVerifierId
The backup goal is to restore these records so you can:
- locate the evidence manifest,
- confirm it matches the anchored hash,
- re-run verification (if needed), and
- reconcile rewards.
Mind map: backup and recovery responsibilities
Backup strategy that survives real mistakes
A common failure is restoring the database snapshot but not the evidence manifests (or restoring manifests from a different time). To avoid this, treat evidence manifests as content-addressed objects and treat database snapshots as indexes.
Recommended approach
- Evidence manifests and verifier inputs are stored by hash (content-addressed).
- Database backups store only the index and policy snapshots needed to find those objects.
- Audit logs are append-only and backed up in order.
This separation reduces the blast radius: if a database snapshot is corrupted, you can still fetch evidence manifests by hash once you have the anchored values.
Example: restoring after a metadata outage
Scenario: your off-chain metadata database is lost for a subset of proofs. The on-chain settlement events remain intact.
Step 1: identify what must be restored
From on-chain events, collect the set of proofIds that were settled during the outage window. For each, extract the anchored hashes:
- evidenceManifestHash
- verifierInputHash
- policyHash (or parameterVersion + policy hash)
You now have the minimum truth needed to validate restored metadata.
Step 2: restore the index from backups
Restore the most recent consistent backup set that covers the outage window. A consistent set includes:
- the proof index table,
- parameter snapshots table,
- audit log segment(s) up to the same cutoff.
If you use point-in-time recovery, restore to a timestamp just after the last successful backup write.
Step 3: validate restored records against chain anchors
For each proofId:
- Load the restored record.
- Verify restored.evidenceManifestHash == chain.evidenceManifestHash.
- Verify restored.verifierInputHash == chain.verifierInputHash.
- Verify restored.policyHash == chain.policyHash.
If any check fails, treat the record as invalid and do not reconcile payouts from it.
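The three comparisons in Step 3 can be sketched as a single helper. This is a minimal illustration, assuming both the restored record and the chain anchor are available as plain dictionaries keyed by the field names used above; the function name `anchor_mismatches` and the sample hash values are hypothetical.

```python
ANCHORED_FIELDS = ("evidenceManifestHash", "verifierInputHash", "policyHash")


def anchor_mismatches(restored, chain):
    """Return the anchored field names where the restored record disagrees with the chain."""
    return [
        field
        for field in ANCHORED_FIELDS
        if field not in chain or restored.get(field) != chain[field]
    ]


# Hypothetical values for illustration only.
chain_anchor = {
    "evidenceManifestHash": "aa11",
    "verifierInputHash": "bb22",
    "policyHash": "cc33",
}
restored_ok = dict(chain_anchor, proofId="proof-1")
restored_bad = dict(restored_ok, policyHash="ffff")

print(anchor_mismatches(restored_ok, chain_anchor))   # []
print(anchor_mismatches(restored_bad, chain_anchor))  # ['policyHash']
```

Returning the list of mismatched fields, rather than a bare boolean, gives the recovery report a concrete reason for each rejected record.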
Step 4: re-fetch evidence manifests by hash
If the index is correct but evidence objects are missing, fetch them using the stored evidenceManifestURI or directly by hash from your content-addressed store.
Then validate:
- compute the manifest hash from the fetched manifest bytes,
- compare to chain.evidenceManifestHash.
This ensures you didn't restore the "wrong" manifest that happens to share a filename.
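The manifest check in Step 4 might look like the sketch below. It assumes the anchor is a hex-encoded SHA-256 digest of the raw manifest bytes; the text does not name the hash function, so substitute whatever your protocol actually anchors. The helper name `manifest_matches_anchor` is hypothetical.

```python
import hashlib


def manifest_matches_anchor(manifest_bytes, anchored_hash_hex):
    """Hash the fetched bytes and compare with the anchored digest (SHA-256 assumed)."""
    expected = anchored_hash_hex.lower()
    if expected.startswith("0x"):
        expected = expected[2:]
    return hashlib.sha256(manifest_bytes).hexdigest() == expected


# Hypothetical manifest content for illustration only.
manifest = b'{"files": []}'
anchor = hashlib.sha256(manifest).hexdigest()

print(manifest_matches_anchor(manifest, anchor))          # True
print(manifest_matches_anchor(manifest + b" ", anchor))   # False
```

The comparison is over bytes, not parsed JSON, so a manifest that was re-serialized with different whitespace fails the check, which is the intended behavior for content-addressed objects.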
Step 5: re-run verification inputs checks (optional but practical)
If your verifier can be run deterministically from stored verifier inputs, re-run the verification pipeline using the restored verifier inputs and parameter snapshot.
Even if you don't re-run full verification, you can still validate structural integrity:
- input schema version matches,
- challenge parameters match,
- derived input hash matches verifierInputHash.
Step 6: reconcile payouts safely
Once metadata is validated, reconcile rewards by linking:
- on-chain settlement event id,
- proofId,
- operator/client attribution stored in the metadata record.
If attribution fields are missing, fall back to audit logs for that proofId. If audit logs are also missing, mark the payout as "needs manual review" rather than guessing.
Example: backup record layout
Use explicit versioning so restores don't depend on implicit schema assumptions.
A small but important detail: store a recordHash computed over canonical JSON. During restore, you can detect silent corruption even if the record's internal fields "look" plausible.
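One possible record layout is sketched below. The `schemaVersion` field name, the sample values, and the helpers `seal_record` and `is_intact` are assumptions for illustration. Sorted-key JSON with SHA-256 is used as a simple stand-in for canonical JSON; a production system might adopt a full canonicalization scheme such as RFC 8785 instead.

```python
import hashlib
import json


def canonical_json(obj):
    """Serialize with sorted keys and no extra whitespace (a simplified canonical form)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal_record(fields, schema_version=1):
    """Attach an explicit schema version and a recordHash covering every other field."""
    body = {"schemaVersion": schema_version, **fields}
    return {**body, "recordHash": hashlib.sha256(canonical_json(body)).hexdigest()}


def is_intact(record):
    """Recompute the hash over everything except recordHash and compare."""
    body = {k: v for k, v in record.items() if k != "recordHash"}
    return hashlib.sha256(canonical_json(body)).hexdigest() == record.get("recordHash")


# Hypothetical field values for illustration only.
record = seal_record({
    "proofId": "proof-1",
    "onChainSettlementTxHash": "0xabc",
    "evidenceManifestHash": "aa11",
    "verifierInputHash": "bb22",
    "parameterVersion": 3,
    "policyHash": "cc33",
})
corrupted = dict(record, policyHash="cc34")  # plausible-looking but wrong

print(is_intact(record))     # True
print(is_intact(corrupted))  # False
```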
Example: restore checklist (fast and strict)
Restore Checklist
- Restore proof index + parameter snapshots to a consistent cutoff
- For each proofId, compare anchored hashes from chain
- Reject records with any hash mismatch
- Fetch evidence manifests by hash if missing
- Validate fetched manifest hash matches anchor
- Validate verifier input hash matches anchor
- Reconcile payouts only for fully validated proofs
- Produce a report: restored, validated, rejected, missing
Handling partial restores without creating inconsistencies
If you restore only the proof index but not the parameter snapshots, you may end up with "unknown policy" errors. The fix is procedural: treat parameter snapshots as required dependencies.
If you restore parameter snapshots but not the index, you can still fetch evidence manifests by hash, but you won't know which proofs map to which settlements. In that case, rebuild the index by scanning audit logs and settlement events, then re-validate hashes.
What the final restore output should look like
A good recovery run produces a deterministic report:
- Restored proofIds: present in backups.
- Validated proofIds: hashes match chain anchors.
- Rejected proofIds: mismatches or missing dependencies.
- Missing proofIds: no backup coverage.
This report is not just for humans; it becomes the input to your reconciliation job so it never mixes validated and unvalidated data.
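The four report buckets can be produced by a small classification pass. The sketch below assumes settled proofIds and their anchors come from on-chain events and that backups are available as a dictionary keyed by proofId; `build_restore_report` and the sample data are hypothetical. Iterating in sorted order keeps the output deterministic across runs.

```python
ANCHORED_FIELDS = ("evidenceManifestHash", "verifierInputHash", "policyHash")


def build_restore_report(settled_ids, backup_records, chain_anchors):
    """Classify each settled proofId as restored, validated, rejected, or missing."""
    report = {"restored": [], "validated": [], "rejected": [], "missing": []}
    for proof_id in sorted(settled_ids):
        record = backup_records.get(proof_id)
        if record is None:
            report["missing"].append(proof_id)
            continue
        report["restored"].append(proof_id)
        anchor = chain_anchors[proof_id]
        matches = all(record.get(f) == anchor[f] for f in ANCHORED_FIELDS)
        report["validated" if matches else "rejected"].append(proof_id)
    return report


# Hypothetical data for illustration only.
anchors = {
    pid: {"evidenceManifestHash": pid + "-m", "verifierInputHash": pid + "-v", "policyHash": "pol"}
    for pid in ("p1", "p2", "p3")
}
backups = {
    "p1": dict(anchors["p1"]),
    "p2": dict(anchors["p2"], policyHash="other"),
}

report = build_restore_report({"p1", "p2", "p3"}, backups, anchors)
print(report)
```

A reconciliation job would then read only `report["validated"]`, which keeps unvalidated records out of payout logic by construction.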
13. Governance, Parameter Management, and Policy Enforcement
13.1 Governance Objects Example Proposals, Votes, and Execution Rules
A DePIN governance system needs three things to work reliably: (1) a way to describe changes precisely, (2) a way to decide with clear voting rules, and (3) a way to execute changes safely without surprising side effects. In practice, you model governance as a set of on-chain objects with explicit fields, then enforce execution rules that map those fields to protocol behavior.
Governance objects: what you store
Think of governance as a small database. Each object has a purpose and a minimal set of fields.
- Proposal: the human-readable intent plus machine-readable parameters.
- Vote: a record of a participant's stance at a specific time.
- Execution: the final "this change is now active" action, tied to the proposal outcome.
A proposal should include enough structure that execution can be deterministic. For example, if you want to change a quality threshold, you should store the new threshold value, the unit, and the target module.
Proposal fields (example)
- id: unique identifier.
- type: one of a small set (e.g., PARAM_UPDATE, ELIGIBILITY_RULE, TREASURY_ACTION).
- target: which module or contract is affected (e.g., RewardsEngine, VerifierPolicy).
- payload: structured data describing the change.
- proposer: identity of the submitter.
- startTime, endTime: voting window.
- quorumRuleId, thresholdRuleId: which voting rules apply.
- executionRuleId: which execution constraints apply.
- status: PENDING, VOTING, SUCCEEDED, FAILED, EXECUTED.
Vote fields (example)
- proposalId.
- voter.
- choice: YES, NO, or ABSTAIN.
- weight: computed from stake, reputation, or operator status at startTime.
- castTime.
Execution fields (example)
- proposalId.
- executedBy.
- executionTime.
- resultHash: hash of the proposal payload to bind execution to the approved content.
Mind map: governance flow and objects
Execution rules: how decisions become changes
Execution rules prevent "approved but broken" outcomes. They also make the system easier to reason about because the same approval logic always leads to the same execution behavior.
A good execution rule set typically includes:
- Outcome gating: only execute if the proposal succeeded.
- Payload binding: execute only the exact payload that was voted on.
- Bounds and invariants: reject changes that violate constraints.
- Compatibility checks: ensure the new configuration matches expected formats.
- Activation timing: optionally delay activation to allow monitoring and operational preparation.
Example: parameter update with bounds
Suppose you manage a verifier policy parameter minConfidence used to accept measurements. You want governance to change it, but only within safe bounds.
Payload example:
- target: VerifierPolicy
- payload: { "minConfidence": 0.70 }
Execution rule example:
- Preconditions:
  - Proposal succeeded.
  - payloadHash matches the stored hash.
- Safety checks:
  - 0.50 <= minConfidence <= 0.95.
  - If minConfidence increases, require a staged activation delay of 7 days.
- Effects:
  - Update VerifierPolicy.minConfidence.
  - Emit VerifierPolicyUpdated(minConfidence, proposalId).
This is not just "validation." It's a contract between governance and the rest of the system: the protocol knows exactly what will happen after execution.
Voting rules: quorum, thresholds, and weighting
Voting rules should be explicit and modular so you can reuse them across proposal types.
Example voting rule: quorum + supermajority
- Quorum: total voting weight of YES + NO must be at least 20% of eligible weight.
- Threshold: YES weight must be at least 60% of total cast weight.
- ABSTAIN: does not count toward quorum but is tracked for transparency.
A typical computation uses weights captured at startTime to avoid last-minute stake changes.
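The quorum and supermajority rule above can be sketched as follows. This is an illustration under assumptions: weights are integers taken from the startTime snapshot, `Fraction` is used so the 20% and 60% comparisons are exact, and the function name `tally` and its return shape are hypothetical.

```python
from fractions import Fraction

QUORUM = Fraction(20, 100)     # share of eligible weight that must vote YES or NO
THRESHOLD = Fraction(60, 100)  # share of cast weight that must be YES


def tally(yes_weight, no_weight, abstain_weight, eligible_weight):
    """Apply quorum then threshold; ABSTAIN is reported but never counted."""
    cast_weight = yes_weight + no_weight
    result = {"castWeight": cast_weight, "abstainWeight": abstain_weight}
    if cast_weight < QUORUM * eligible_weight:
        return {**result, "status": "FAILED", "reason": "QUORUM_NOT_MET"}
    if yes_weight < THRESHOLD * cast_weight:
        return {**result, "status": "FAILED", "reason": "THRESHOLD_NOT_MET"}
    return {**result, "status": "SUCCEEDED", "reason": None}


# Hypothetical snapshot weights, eligible weight = 1000.
print(tally(150, 60, 500, 1000)["status"])   # SUCCEEDED
print(tally(100, 90, 500, 1000)["reason"])   # QUORUM_NOT_MET
print(tally(110, 100, 0, 1000)["reason"])    # THRESHOLD_NOT_MET
```

In the second case, 500 units of ABSTAIN weight do not rescue the proposal because only YES and NO count toward the 200-unit quorum.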
Concrete example: a proposal from start to finish
Scenario: Operators and clients rely on a reward multiplier qualityMultiplier that maps a measurement quality score to rewards. You want to adjust it.
- Proposal submission
  - type: PARAM_UPDATE
  - target: RewardsEngine
  - payload: { "qualityMultiplier": 1.15 }
  - quorumRuleId: QUORUM_20PCT
  - thresholdRuleId: SUPERMAJ_60PCT
  - executionRuleId: BOUNDED_PARAM_UPDATE
  - startTime: block timestamp + 1 hour
  - endTime: startTime + 3 days
- Voting
  - Each eligible voter casts YES or NO during the window.
  - Vote weight is computed from a snapshot at startTime.
  - The system rejects a second vote from the same voter.
- Tally and outcome
  - After endTime, the contract computes:
    - castWeight = yesWeight + noWeight
    - quorumMet = castWeight >= 0.20 * eligibleWeight
    - thresholdMet = yesWeight >= 0.60 * castWeight
  - If both are true, status becomes SUCCEEDED; otherwise FAILED.
- Execution
  - Anyone can call execute(proposalId) after success.
  - Execution rule checks:
    - payloadHash matches the stored hash.
    - qualityMultiplier is within bounds, e.g., 0.90 <= qualityMultiplier <= 1.30.
    - If the multiplier increases, enforce a timelock of 2 days.
  - On success, the contract updates the parameter and emits an event.
Practical design notes that prevent governance headaches
- Use small, typed payloads: a structured payload reduces ambiguity and makes execution deterministic.
- Bind execution to payload hash: this prevents "approved intent, different data executed."
- Separate rule selection from rule logic: store ruleIds in the proposal so the execution engine can apply consistent logic.
- Make failure states explicit: FAILED should include a reason code (e.g., QUORUM_NOT_MET, THRESHOLD_NOT_MET, BOUNDS_VIOLATION) so operators can interpret outcomes without reading contract code.
Example: execution rule as a checklist
Execution Rule: BOUNDED_PARAM_UPDATE
- Require proposal.status == SUCCEEDED
- Require payloadHash == storedPayloadHash
- Switch on payload.target
- Validate payload fields
  - numeric bounds
  - type checks
  - required keys present
- Compatibility checks
  - ensure module version supports this parameter
- Activation policy
  - immediate or timelocked based on direction of change
- Apply update
- Emit event with proposalId and new values
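The checklist above can be sketched as an off-chain reference check. This is a simplified illustration, not contract code: the bounds and delays come from the examples in this section, while the `RULES` table layout, the payload-hash encoding (sorted-key JSON with SHA-256), and every reason code except BOUNDS_VIOLATION are assumptions. The module-version compatibility check is omitted for brevity.

```python
import hashlib
import json

# Bounds and delays mirror the examples in this section; the table itself is an assumption.
RULES = {
    ("VerifierPolicy", "minConfidence"): {"lo": 0.50, "hi": 0.95, "increase_delay_days": 7},
    ("RewardsEngine", "qualityMultiplier"): {"lo": 0.90, "hi": 1.30, "increase_delay_days": 2},
}


def payload_hash(target, payload):
    """Hash target and payload together (assumed encoding)."""
    blob = json.dumps({"target": target, "payload": payload}, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def check_execution(proposal, current_values):
    """Return ("APPLY", delay_days) or ("REJECT", reason_code)."""
    target, payload = proposal["target"], proposal["payload"]
    if proposal["status"] != "SUCCEEDED":
        return ("REJECT", "NOT_SUCCEEDED")
    if payload_hash(target, payload) != proposal["storedPayloadHash"]:
        return ("REJECT", "PAYLOAD_HASH_MISMATCH")
    delay_days = 0
    for name, value in payload.items():
        rule = RULES.get((target, name))
        if rule is None:
            return ("REJECT", "UNKNOWN_PARAMETER")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return ("REJECT", "TYPE_ERROR")
        if not rule["lo"] <= value <= rule["hi"]:
            return ("REJECT", "BOUNDS_VIOLATION")
        if value > current_values[(target, name)]:
            # Increases are timelocked; decreases apply immediately.
            delay_days = max(delay_days, rule["increase_delay_days"])
    return ("APPLY", delay_days)


current = {("RewardsEngine", "qualityMultiplier"): 1.10}
proposal = {
    "status": "SUCCEEDED",
    "target": "RewardsEngine",
    "payload": {"qualityMultiplier": 1.15},
}
proposal["storedPayloadHash"] = payload_hash(proposal["target"], proposal["payload"])
print(check_execution(proposal, current))  # ('APPLY', 2)

tampered = dict(proposal, payload={"qualityMultiplier": 1.30})
print(check_execution(tampered, current))  # ('REJECT', 'PAYLOAD_HASH_MISMATCH')
```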
When these objects and rules are designed together, governance becomes less like a debate and more like a controlled change pipeline: proposals describe changes, votes decide outcomes under known math, and execution rules ensure the protocol only accepts changes that match the approved content and the system's constraints.
13.2 Parameter Versioning Example Safe Rollouts With Compatibility Checks
Parameter versioning is the boring part that keeps the exciting parts from breaking. In a DePIN network, parameters affect eligibility, measurement interpretation, reward math, and dispute rules. If you change them without a compatibility plan, you can end up with nodes submitting proofs that the protocol later refuses, or with clients expecting one settlement behavior while the chain enforces another.
What "versioning" should mean
A parameter version should capture three things:
- Semantics: what the parameter means (e.g., "quality score is computed as weighted average").
- Constraints: what ranges and formats are allowed (e.g., score must be in \([0,1]\)).
- Compatibility rules: how old data is treated when new parameters are active.
A practical approach is to store a versioned parameter set and require every proof and settlement request to reference the version it was produced against.
A concrete parameter set example
Assume the network has a quality parameter set with:
- quality_version: integer
- min_quality: minimum acceptable quality
- quality_weights: weights for sub-metrics
- score_formula_id: selects the scoring function
- challenge_window_blocks: how long disputes remain open
When you update these values, you create a new parameter set with a new quality_version.
Compatibility checks: the three gates
Safe rollouts work best when you enforce compatibility at three points.
Gate 1: Admission control for nodes
When a node registers, it declares which parameter versions it supports. The network can then accept tasks only for compatible versions.
Example: If quality_version=3 requires a new measurement normalization step, nodes that only support versions 1-2 should not be assigned tasks that will be verified under version 3.
Gate 2: Proof submission validation
Each proof submission includes quality_version and any required metadata (like score_formula_id). The verifier checks:
- The submitted version is known.
- The proof format matches that version's expected schema.
- The proof's computed score satisfies min_quality for that version.
Example: A proof created under version 2 might compute quality as an unweighted average. If it is labeled as version 2, the verifier uses the version-2 formula. If it is mislabeled as version 3, the verifier rejects it because the formula ID and schema don't match.
Gate 3: Settlement and dispute consistency
Settlement logic must also reference the parameter version used for eligibility and scoring. Disputes must use the same version to avoid "moving the goalposts."
Example: If a client requests settlement for a task completed under quality_version=2, the contract uses version 2's challenge_window_blocks and scoring rules, even if version 3 is already active.
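A minimal sketch of Gate 3 follows. The window lengths, the `ACTIVE_VERSION` constant, and the choice to measure the dispute deadline from the settlement block are all assumptions for illustration; the point is only that the lookup key is the task's stored version, never the currently active one.

```python
# Hypothetical parameter sets; only challenge_window_blocks is shown.
PARAMETER_SETS = {
    2: {"challenge_window_blocks": 7_200},
    3: {"challenge_window_blocks": 14_400},
}
ACTIVE_VERSION = 3  # network default for new tasks, deliberately unused below


def challenge_deadline_block(task, settled_at_block):
    """Compute the dispute deadline from the version recorded on the task."""
    params = PARAMETER_SETS[task["parameter_version"]]
    return settled_at_block + params["challenge_window_blocks"]


task = {"task_id": "t-17", "parameter_version": 2}
print(challenge_deadline_block(task, 1_100_500))  # 1107700
```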
Rollout strategy: staged activation with explicit windows
A safe rollout usually has three phases.
- Publish: create the new parameter set and mark it as available.
- Pre-activate: allow clients and nodes to start using it, but do not finalize tasks under it yet.
- Activate: switch the default parameter version for new tasks.
To keep this deterministic, use block heights (or epochs) for phase boundaries.
Example timeline:
- Block 1,000,000: publish quality_version=3.
- Block 1,050,000: pre-activate (new tasks may specify version 3 explicitly).
- Block 1,100,000: activate (default becomes version 3 for tasks that don't specify a version).
This prevents a common failure mode: a client starts producing proofs under the new rules at the same time the chain flips defaults.
Data model pattern: versioned references everywhere
A robust design stores version IDs in every record that later affects verification.
- Task record: task_id, parameter_version, assigned_node_id, created_at_block
- Proof record: task_id, parameter_version, proof_hash, score_formula_id
- Settlement record: task_id, parameter_version, payout_amount, settled_at_block
- Dispute record: task_id, parameter_version, challenge_deadline_block
Reasoning: if you only store the latest parameter values, you lose the ability to verify historical tasks consistently.
Compatibility matrix: define what "compatible" means
Not every change is compatible. Create a compatibility matrix that classifies updates.
- Type A (backward compatible): old proofs remain valid under the new version.
- Type B (forward compatible): new proofs can be verified, but old proofs must be handled separately.
- Type C (breaking): old proofs are invalid or require different schema.
Example:
- Changing min_quality from 0.70 to 0.75 is Type B: proofs under version 2 can still be verified, but they may no longer qualify for new tasks.
- Changing the scoring formula (different normalization) is Type C: proofs must be tied to the correct score_formula_id.
Mind map: parameter versioning and safe rollout
Example: contract-side checks (conceptual)
Below is a compact pseudocode sketch showing how version references prevent mismatches.
function verifyProof(taskId, proof, submittedVersion):
    task = tasks[taskId]
    expectedVersion = task.parameter_version
    require(submittedVersion == expectedVersion)
    params = parameterSets[expectedVersion]
    require(params.exists)
    require(proof.score_formula_id == params.score_formula_id)
    require(proof.schema_version == params.schema_version)
    score = computeScore(proof, params)
    require(score >= params.min_quality)
    return true
This pattern ensures that even if the network default moves forward, verification remains anchored to the task's chosen parameter set.
Example: staged rollout with defaults
A common usability feature is letting clients omit the version and rely on defaults. Defaults must be time-bound.
function resolveDefaultVersion(currentBlock):
    if currentBlock < preActivateBlock:
        return defaultVersionOld
    else if currentBlock < activateBlock:
        return defaultVersionOld  # explicit version required
    else:
        return defaultVersionNew
Reasoning: during pre-activation, you avoid a silent behavior change for clients that forget to specify the version.
Operational checklist for a safe rollout
- Create new parameter set with a new version ID.
- Assign a schema/formula identity so proofs can be validated deterministically.
- Define compatibility type (A/B/C) and document it in the governance proposal.
- Set publish, pre-activate, and activate block heights.
- Update node admission rules so unsupported nodes aren't assigned incompatible tasks.
- Ensure all task/proof/settlement/dispute records store the parameter_version.
- Run a dry-run simulation using historical tasks to confirm old proofs still verify under their stored versions.
When these steps are followed, parameter updates become a controlled change in rules rather than a surprise change in behavior. The network can move forward without invalidating the past.
13.3 Policy Enforcement Example Quality Thresholds and Eligibility Rules
A DePIN network usually has two separate questions:
- Is this node allowed to participate right now? (eligibility)
- Did this node produce work that meets the required quality? (quality thresholds)
Keeping these separate makes enforcement simpler and reduces "mystery failures," where a node is rejected for a reason that looks like a quality issue but is actually an eligibility issue.
Policy objects: what you store and where you enforce it
A practical design uses three policy layers:
- Eligibility policy (static-ish): who can submit, under what conditions, and what "good standing" means.
- Quality policy (dynamic-ish): what score or proof properties qualify for rewards.
- Enforcement policy (deterministic): the exact rules that convert inputs into accept/reject and reward eligibility.
Enforcement should be deterministic for on-chain settlement. Off-chain components can compute intermediate values (like scores), but the final decision must be reproducible.
Mind map: policy enforcement flow
Eligibility rules with concrete examples
Assume a network where nodes submit physical measurements for a task type called AIR_TEMP. Each task has a time window and a target location.
Eligibility rule set (example):
- Admitted membership: the node must be registered in the registry for the current policy version.
- Liveness: the node must have a heartbeat within the last T_heartbeat seconds.
- Stake floor: the node must have at least S_min tokens locked.
- Rate limit: no more than N_max submissions per epoch to prevent spam.
- Role constraint: only nodes of type SENSOR_PROVIDER can submit AIR_TEMP tasks.
Example scenario:
- Node A submits a proof that is perfectly formatted and scores high.
- However, Node A missed heartbeats for 2 hours, while T_heartbeat is 10 minutes.
- Result: the submission is rejected as ineligible, even though quality would have passed.
This is intentional. If you let "quality" override "eligibility," you end up rewarding nodes that are not actually participating reliably.
Implementation detail:
- Eligibility checks should run first and produce a clear rejection reason code, such as REJECT_INELIGIBLE_LIVENESS.
- That reason code should be emitted in an event so operators can fix the right problem.
Quality thresholds: turning evidence into a decision
Quality thresholds should be expressed in terms of values you can compute from the submission and task context.
A common pattern is to compute a quality score from multiple components, then apply thresholds.
Example scoring model:
- coverage_score (0 to 1): how much of the required time window is supported by evidence
- plausibility_score (0 to 1): how consistent the measurement is with expected bounds
- confidence_score (0 to 1): derived from proof structure (e.g., number of samples, sensor calibration attestation)
Compute a final score: \[ \text{quality} = 0.45\cdot \text{coverage} + 0.35\cdot \text{plausibility} + 0.20\cdot \text{confidence} \]
Then enforce thresholds:
- quality >= Q_min
- confidence >= C_min
- coverage >= K_min
Example thresholds:
- Q_min = 0.72
- C_min = 0.60
- K_min = 0.70
Example scenario 1 (fails confidence):
- coverage = 0.90
- plausibility = 0.80
- confidence = 0.40
- quality = 0.45(0.90) + 0.35(0.80) + 0.20(0.40) = 0.405 + 0.28 + 0.08 = 0.765
- quality passes Q_min, but confidence fails C_min.
- Result: rejected with REJECT_LOW_CONFIDENCE.
This prevents "high-looking" scores from compensating for weak evidence.
Example scenario 2 (fails coverage):
- coverage = 0.60
- plausibility = 0.95
- confidence = 0.85
- quality = 0.45(0.60) + 0.35(0.95) + 0.20(0.85) = 0.27 + 0.3325 + 0.17 = 0.7725
- quality passes, but coverage fails K_min.
- Result: rejected with REJECT_LOW_COVERAGE.
Coverage is often a proxy for "did you actually measure the whole requested window," which matters for physical infrastructure tasks.
Eligibility vs quality: how to avoid confusing outcomes
A clean decision order prevents operators from arguing about "quality" when the real issue is eligibility.
Recommended decision order:
- Verify proof format and freshness.
- Check eligibility (membership, liveness, stake, role, rate).
- Compute quality components.
- Apply threshold gates.
- Decide reward eligibility and dispute window.
Example of clear outcomes:
- If proof freshness fails, return REJECT_INVALID_FRESHNESS.
- If liveness fails, return REJECT_INELIGIBLE_LIVENESS.
- If thresholds fail, return REJECT_LOW_QUALITY plus the specific gate that failed.
Policy versioning and compatibility
Quality thresholds change over time, so policy enforcement should be tied to a policy version.
Example:
- Task AIR_TEMP at epoch 120 uses policyVersion = 3.
- Node submissions include the policyVersion they were evaluated against (or the network derives it from task metadata).
- If a node submits after a policy update, the network evaluates using the policy version associated with the task's epoch.
This avoids edge cases where the same proof would be accepted under one policy and rejected under another without a clear reason.
Minimal enforcement pseudocode (deterministic)
function evaluateSubmission(task, node, proof, policy):
    if not verifyProof(proof, task, policy):
        return Reject("REJECT_INVALID_PROOF")
    if not isEligible(node, task, policy):
        return Reject(reasonForEligibility(node, policy))
    components = computeQualityComponents(task, proof)
    quality = 0.45 * components.coverage + 0.35 * components.plausibility + 0.20 * components.confidence
    if components.confidence < policy.C_min:
        return Reject("REJECT_LOW_CONFIDENCE")
    if components.coverage < policy.K_min:
        return Reject("REJECT_LOW_COVERAGE")
    if quality < policy.Q_min:
        return Reject("REJECT_LOW_QUALITY")
    return AcceptWithReward(quality, components)
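The quality-gate portion of the pseudocode can be exercised with the two worked scenarios from earlier in this section. The sketch below covers only the scoring and threshold steps; proof verification and eligibility are assumed to have passed already. Floats are used for readability, whereas an on-chain implementation would more likely use fixed-point integers to keep results reproducible. The function names are hypothetical.

```python
WEIGHTS = {"coverage": 0.45, "plausibility": 0.35, "confidence": 0.20}
THRESHOLDS = {"Q_min": 0.72, "C_min": 0.60, "K_min": 0.70}


def quality_score(components):
    """Weighted sum of the three components, using the weights from the formula above."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def apply_quality_gates(components, thresholds=THRESHOLDS):
    """Return (decision, quality); per-component gates run before the overall gate."""
    quality = quality_score(components)
    if components["confidence"] < thresholds["C_min"]:
        return ("REJECT_LOW_CONFIDENCE", quality)
    if components["coverage"] < thresholds["K_min"]:
        return ("REJECT_LOW_COVERAGE", quality)
    if quality < thresholds["Q_min"]:
        return ("REJECT_LOW_QUALITY", quality)
    return ("ACCEPT", quality)


scenario_1 = {"coverage": 0.90, "plausibility": 0.80, "confidence": 0.40}
scenario_2 = {"coverage": 0.60, "plausibility": 0.95, "confidence": 0.85}

for scenario in (scenario_1, scenario_2):
    decision, quality = apply_quality_gates(scenario)
    print(decision, round(quality, 4))
# REJECT_LOW_CONFIDENCE 0.765
# REJECT_LOW_COVERAGE 0.7725
```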
Accounting outputs: what to record for auditability
When a submission is accepted or rejected, record enough detail to explain the decision without re-running everything.
Event fields (example):
- taskId, nodeId, policyVersion
- eligibilityResult (pass/fail + reason)
- qualityComponents (coverage, plausibility, confidence)
- qualityScore
- thresholdsUsed (Q_min, C_min, K_min)
- finalDecision (accept/reject)
This makes policy enforcement operationally useful: operators can see whether they need better evidence quality, more complete coverage, or simply to fix liveness.
13.4 Auditing Governance Changes Example Immutable Logs and Human-Readable Summaries
Governance changes are the rare kind of event that can quietly reshape incentives, verification rules, and eligibility. Auditing is how you make those changes legible after the fact: what changed, why it changed, who approved it, and how it affected the system.
What to audit (and what not to)
Audit scope should be narrow enough to stay useful.
- On-chain governance actions: proposal creation, voting, execution, parameter updates, and any role changes.
- Off-chain governance artifacts: the human-readable rationale, risk assessment notes, and the exact configuration payload that was executed.
- Derived effects: which parameter versions became active at which block/time, and which rulesets were used for subsequent settlements.
Avoid auditing every internal discussion message. Instead, store a stable summary and the executed payload. If someone needs the full thread, they can find it elsewhere; the audit log should remain compact and authoritative.
Immutable logs: event design that survives time
An immutable log is only as good as its ability to answer questions without guessing. Use a consistent event schema and include identifiers that let you correlate on-chain and off-chain records.
Core fields
- proposalId: stable identifier for the proposal.
- actionType: e.g., CREATE, VOTE, EXECUTE, ROLE_GRANT, ROLE_REVOKE.
- actor: the address or identity that performed the action.
- blockNumber / timestamp: when the action occurred.
- payloadHash: hash of the exact configuration or call data that was executed.
- humanSummaryHash: hash of the human-readable summary document (or its canonical JSON form).
- result: success/failure and any error code.
Why include hashes? Hashes let you prove that the human-readable summary corresponds to the executed payload. Without them, summaries can drift from reality.
Human-readable summaries: make them deterministic
A good summary is not a story; it's a structured explanation. Keep it short, but include the details that auditors actually check.
Recommended summary sections
- Change overview: one paragraph describing what the governance action changes.
- Affected parameters: list each parameter name, old value, new value, and scope (global vs per-network).
- Reasoning: the problem being addressed and the constraints considered.
- Safety checks: what invariants or validations were expected to pass.
- Operational notes: any migration steps, required operator actions, or client compatibility notes.
- Payload reference: the payloadHash and the canonical summary hash.
To keep summaries auditable, define a canonical format (for example, a fixed JSON schema) and compute a hash over that canonical form.
Mind map: auditing governance changes
Example: parameter update with linked hashes
Assume a governance proposal updates a verification threshold.
Executed payload (conceptual)
- Parameter: verification.minConfidence
- Old: 0.70
- New: 0.80
- Scope: global
- Activation: immediately at execution block
On-chain events (conceptual)
- PROPOSAL_CREATE with proposalId = 42
- VOTE events from multiple actors
- EXECUTE event containing:
  - payloadHash = H(payload)
  - humanSummaryHash = H(summaryCanonicalJson)
Human-readable summary (canonical JSON form, conceptually)
- overview: "Increase minimum confidence required for verifier acceptance."
- affectedParameters: includes name, oldValue, newValue, scope.
- reasoning: "Reduce acceptance of borderline proofs observed in recent audits."
- safetyChecks: "Invariant: minConfidence must be in [0,1]. Client verification logic must accept new threshold."
- operationalNotes: "Operators must update their local verifier config before next epoch."
- payloadReference: includes both hashes.
Audit verification steps
- Fetch all events with proposalId = 42.
- Extract the EXECUTE event's payloadHash and humanSummaryHash.
- Retrieve the canonical summary document and recompute H(summaryCanonicalJson).
- Retrieve the executed payload bytes and recompute H(payload).
- Confirm both recomputed hashes match the values from the EXECUTE event.
- Determine the activation point and record that subsequent settlements used the new threshold.
This workflow prevents a common failure mode: a summary that says "we only changed X" while the executed payload actually changed X and Y.
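The hash checks in the audit steps, together with a completeness comparison, might be sketched as follows. Several things here are assumptions: H is taken to be SHA-256, the canonical form is sorted-key JSON, the payload is a JSON object mapping parameter names to new values, and the finding codes and the second parameter name `verification.maxProofAge` are hypothetical.

```python
import hashlib
import json


def canonical_json(obj):
    """Sorted-key, whitespace-free JSON as a simplified canonical form."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


def audit_execute_event(event, payload_bytes, summary):
    """Return a list of findings; an empty list means the event passed all checks."""
    findings = []
    if sha256_hex(payload_bytes) != event["payloadHash"]:
        findings.append("PAYLOAD_HASH_MISMATCH")
    if sha256_hex(canonical_json(summary)) != event["humanSummaryHash"]:
        findings.append("SUMMARY_HASH_MISMATCH")
    # Completeness: every parameter in the payload must be listed in the summary.
    changed = set(json.loads(payload_bytes))
    listed = {p["name"] for p in summary["affectedParameters"]}
    if changed != listed:
        findings.append("SUMMARY_INCOMPLETE")
    return findings


summary = {
    "overview": "Increase minimum confidence required for verifier acceptance.",
    "affectedParameters": [
        {"name": "verification.minConfidence", "oldValue": 0.70, "newValue": 0.80, "scope": "global"}
    ],
}
payload = canonical_json({"verification.minConfidence": 0.80})
event = {"proposalId": 42, "payloadHash": sha256_hex(payload),
         "humanSummaryHash": sha256_hex(canonical_json(summary))}
print(audit_execute_event(event, payload, summary))  # []

# The payload changed X and Y, but the summary only mentions X.
payload_xy = canonical_json({"verification.minConfidence": 0.80, "verification.maxProofAge": 300})
event_xy = dict(event, payloadHash=sha256_hex(payload_xy))
print(audit_execute_event(event_xy, payload_xy, summary))  # ['SUMMARY_INCOMPLETE']
```

Note that matching hashes alone only prove which documents were committed to; it is the completeness comparison that catches a summary omitting a changed parameter.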
Example: role change with explicit impact notes
Role changes are often under-audited because they don't look like "parameter updates." Treat them as first-class governance effects.
Suppose a proposal grants PARAMETER_ADMIN to a new operator key.
Immutable log expectations
- ROLE_GRANT event includes:
  - actor (the governance executor)
  - targetRole and targetIdentity
  - payloadHash of the role change call
  - humanSummaryHash
Human-readable summary expectations
- List the role being granted.
- State the operational reason (e.g., key rotation due to compromised old key).
- Include any revocation that happened in the same proposal.
- Note the effective time and whether any clients need to update allowlists.
Auditors should be able to answer: "Who could change parameters after this block?" without reading governance chat logs.
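That question can be answered mechanically by replaying role events. The sketch below assumes each event carries `blockNumber`, `actionType`, `targetRole`, and `targetIdentity` as described above, plus a hypothetical `logIndex` field to order events within the same block; the key names are illustrative.

```python
def role_holders_at(events, role, block_number):
    """Replay ROLE_GRANT / ROLE_REVOKE events up to and including block_number."""
    holders = set()
    for event in sorted(events, key=lambda e: (e["blockNumber"], e["logIndex"])):
        if event["blockNumber"] > block_number or event["targetRole"] != role:
            continue
        if event["actionType"] == "ROLE_GRANT":
            holders.add(event["targetIdentity"])
        elif event["actionType"] == "ROLE_REVOKE":
            holders.discard(event["targetIdentity"])
    return holders


# Hypothetical key rotation: grant the new key and revoke the old one in the same block.
events = [
    {"blockNumber": 100, "logIndex": 0, "actionType": "ROLE_GRANT",
     "targetRole": "PARAMETER_ADMIN", "targetIdentity": "old-key"},
    {"blockNumber": 200, "logIndex": 0, "actionType": "ROLE_GRANT",
     "targetRole": "PARAMETER_ADMIN", "targetIdentity": "new-key"},
    {"blockNumber": 200, "logIndex": 1, "actionType": "ROLE_REVOKE",
     "targetRole": "PARAMETER_ADMIN", "targetIdentity": "old-key"},
]

print(role_holders_at(events, "PARAMETER_ADMIN", 150))  # {'old-key'}
print(role_holders_at(events, "PARAMETER_ADMIN", 200))  # {'new-key'}
```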
Practical checklist for auditors
- Hash match: executed payload bytes hash equals payloadHash in the log.
- Summary match: canonical summary hash equals humanSummaryHash.
- Completeness: all changed parameters and roles are listed in the summary.
- Activation clarity: the summary states when the new rules take effect.
- Derived effects recorded: the system records which ruleset version was active for later settlements.
When these checks pass, governance auditing becomes a mechanical process rather than a scavenger hunt. It also makes governance safer for everyone: fewer surprises, fewer "wait, that's not what we approved," and fewer arguments that start with "I thought it was..."
13.5 Community and Operator Coordination: Publishing Operational Guidelines
Operational guidelines are the boring part that keeps the network from becoming a group project where everyone forgets their homework. In a DePIN, coordination matters because operators, verifiers, and clients all depend on the same assumptions: what counts as "good data," how quickly proofs must arrive, and what happens when something goes wrong.
This section shows how to publish guidelines that are clear enough to follow during an outage, specific enough to prevent disputes, and structured enough to update without breaking expectations.
What to publish (and what to avoid)
Publish guidelines as a small set of documents with stable ownership and a predictable update process.
Include:
- Operational roles and responsibilities (who does what, with concrete examples)
- Eligibility and onboarding steps (what operators must do to start)
- Measurement and proof submission workflow (timing, formats, and failure handling)
- Quality expectations (how quality is measured and how it affects rewards)
- Incident response and escalation (what to do when things break)
- Communication channels and response times (where issues are reported, and how fast)
- Change management (how updates are announced and when they take effect)
Avoid:
- Vague phrases like "promptly" without a time window.
- Rules that conflict with on-chain logic (guidelines should explain behavior, not contradict it).
- "One-off" procedures that never get written down.
A practical structure for the guidelines
Use a consistent outline so operators can find answers quickly.
- Scope: Which operator types and which network components the document covers.
- Definitions: Terms like "proof," "challenge window," "liveness," and "quality score."
- Daily operations: Routine tasks and checks.
- Event-driven operations: What changes during failures, restarts, or disputes.
- Reporting and escalation: How to file issues and who responds.
- Compliance checklist: A short list that can be used during onboarding and audits.
- Versioning and effective dates: What changes, and when.
Mind map: operational guideline components
Concrete examples to make rules usable
Guidelines should include small scenarios that map directly to real operator behavior.
Example 1: Proof arrives late
- Rule: Proofs must be submitted within a defined window after measurement completion.
- Scenario: An operator's device finishes measurement at 12:00:10 UTC, but the operator submits at 12:05:40 UTC.
- Expected behavior:
- Operator files a "late proof" report including measurement timestamp, device identifier, and submission attempt time.
- Operator does not retry blindly if the system marks the proof as expired.
- Operator checks whether the late proof can still be used for quality scoring even if rewards are reduced.
Example 2: Device clock drift
- Rule: Proof freshness depends on timestamps within an allowed skew.
- Scenario: After a power outage, a device clock drifts by 8 minutes.
- Expected behavior:
- Operator detects drift via monitoring alerts.
- Operator pauses submissions for that device until time sync is corrected.
- Operator documents the drift window and the corrective action.
Example 3: Dispute escalation
- Rule: Challenges require evidence in a specific format and within a challenge window.
- Scenario: A verifier challenges a proof due to suspected measurement mismatch.
- Expected behavior:
- Operator acknowledges within the response-time target.
- Operator provides the raw measurement artifact hash, the proof artifact, and any relevant logs.
- Operator confirms whether the device was in a degraded mode during measurement.
Communication and escalation: make it operational, not social
Coordination fails when people don't know where to report issues or what "good" looks like.
Define channels by purpose
- Announcements: protocol parameter changes, guideline updates.
- Operational issues: liveness failures, proof format errors, device-side problems.
- Disputes: evidence submission and resolution tracking.
Define response-time targets
- For example: "Acknowledge within 4 hours" and "Provide an initial assessment within 24 hours."
- Targets should match the network's timing constraints. If proofs expire in 30 minutes, waiting 24 hours is not a plan.
Provide report templates
Templates reduce back-and-forth and make disputes easier to resolve.

Onboarding checklist: the fastest way to prevent future confusion
A short checklist helps new operators avoid common mistakes.
Operator Onboarding Checklist
- Identity keys registered and rotation policy understood
- Proof submission endpoint tested with a dry-run
- Device liveness monitoring configured
- Timestamp freshness checks enabled
- Idempotency behavior verified (retries won't double-submit)
- Evidence packaging format confirmed for disputes
- Incident report template reviewed
Change management: updates without breaking expectations
Guidelines will change. The goal is to change behavior predictably.
Use three dates
- Announcement date: when the change is published.
- Effective date: when the new rules apply.
- Grace period end: when old behavior is no longer accepted.
Include compatibility notes
- If a proof format changes, specify whether old proofs remain valid.
- If a timing window changes, specify whether it affects already-submitted work.
Example: proof format update
- Announcement: 2026-04-01
- Effective: 2026-04-15
- Grace end: 2026-04-22
- Notes: "Proofs using format v1 are accepted for rewards until grace end; challenges accept both v1 and v2 during the grace period."
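The three-date rule can be turned into a small lookup that operators and verifiers share. The sketch below is illustrative only: the function name accepted_formats is hypothetical, the dates come from the example above, and treating the grace end date as inclusive is an assumption the guideline should state explicitly.

```python
from datetime import date

# Dates from the example; whether GRACE_END itself is inclusive is an assumption.
EFFECTIVE = date(2026, 4, 15)
GRACE_END = date(2026, 4, 22)


def accepted_formats(on: date) -> set[str]:
    """Return the proof formats accepted for rewards on a given day."""
    if on < EFFECTIVE:
        return {"v1"}          # new rules are announced but not yet in force
    if on <= GRACE_END:
        return {"v1", "v2"}    # overlap window: both formats are accepted
    return {"v2"}              # old behavior is no longer accepted
```

Publishing the boundary behavior (inclusive or exclusive) alongside the dates avoids disputes about proofs submitted on the last day of the grace period.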
Governance alignment: guidelines should map to enforceable rules
Operational guidelines must reflect the protocol's enforceable logic.
- If the protocol slashes for specific misbehavior, guidelines should list the misbehavior triggers and the evidence operators should retain.
- If eligibility depends on uptime, guidelines should define how liveness is measured and what operators can do to recover.
- If disputes depend on evidence packaging, guidelines should specify the exact artifacts and how to compute their hashes.
Publishing workflow: who writes, who reviews, who approves
A simple workflow prevents guideline churn.
- Draft owner: typically a protocol coordinator or a designated working group.
- Reviewers: at least one operator representative and one verifier/maintainer representative.
- Approval: a governance mechanism or a clearly defined maintainer role.
- Publication: guidelines are published with a version number and effective date.
Example workflow
- Draft posted for review with a summary of changes.
- Reviewers comment using a structured checklist: clarity, timing alignment, evidence requirements.
- Final version published with effective date and grace period.
Closing principle
Good operational guidelines are not a handbook of opinions. They are a set of instructions that match the protocol's rules, include concrete scenarios, and specify timing and evidence requirements so operators can act consistently when the network is under stress.
14. End-to-End Build Guides With Integrated Examples
14.1 Build Guide for a Proof-of-Measurement Network: Example Steps and Artifacts
This guide walks through building a small proof-of-measurement DePIN network end-to-end. The goal is simple: a client requests a measurement, an operator submits a proof, and the protocol verifies it and settles rewards. The design choices below are meant to be implementable without magic.
Scope and the "one measurement" example
Pick one measurement type so the system stays concrete.
Example measurement: "Report the temperature at a location at time T."
Assumptions for the example:
- Operators run a device that reads a sensor.
- The device can produce a signed measurement with a timestamp.
- Verification checks freshness and basic plausibility.
You can later generalize the same pattern to other measurements (GPS distance, signal strength, meter readings), but start with one.
Mind map: core build blocks
Step 1: Define the measurement request schema
A measurement request must contain everything needed to verify without guessing.
Request fields (example):
- requestId: unique identifier
- client: address
- location: geohash or coordinates
- targetTime: Unix timestamp (or time window)
- maxSkewSec: allowed difference between device time and protocol time
- measurementType: e.g., temperature_c
- expectedRange: [minC, maxC] for plausibility checks
- nonce: prevents replay across requests
Why these fields matter:
- nonce makes "old proofs" useless.
- maxSkewSec defines freshness in a way contracts can enforce.
- expectedRange enables cheap plausibility checks before any heavier verification.
Step 2: Define the proof format (what operators submit)
Operators should submit a proof that is verifiable with deterministic rules.
Proof fields (example):
- requestId
- operatorId (or derived from signature)
- devicePubKey (or reference to operator's device key)
- measuredValueC: numeric value
- measuredAt: device timestamp
- locationCommitment: hash of location data used by the device
- nonceEcho: must match request nonce
- signature: signature over the proof payload
Proof payload to sign (example):
hash(requestId || measuredValueC || measuredAt || locationCommitment || nonceEcho)
Reasoning:
- Signing prevents tampering.
- nonceEcho binds the proof to a specific request.
- locationCommitment allows the device to commit to location inputs without forcing the protocol to trust raw GPS strings.
Step 3: Build the on-chain verification contract interface
Keep the contract minimal: accept proof, verify rules, emit events, and update accounting.
Contract functions (example)
- submitProof(requestId, proof)
- verifyProof(requestId, proof) -> (bool, reasonCode) (can be internal)
- claimReward(requestId) (after verification)
Events (example)
- ProofSubmitted(requestId, operator, measuredValueC)
- ProofVerified(requestId, operator, pass, reasonCode)
- RewardSettled(requestId, operator, amount)
Step 4: Implement verification rules (deterministic checks)
Start with checks that are cheap and unambiguous.
Verification checklist (example):
- Membership: operator is registered and active.
- Nonce match: proof.nonceEcho == request.nonce.
- Freshness: abs(proof.measuredAt - blockTime) <= request.maxSkewSec.
- Signature validity: signature matches the operator/device key over the proof payload hash.
- Plausibility bounds: measuredValueC within expectedRange.
- Single-use: request can be satisfied once (or track best-of-N if you allow multiple).
Reasoning:
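- If you skip freshness, an operator can sign a stale reading against a new request; the nonce only blocks reuse of an old proof, not reuse of an old measurement.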
- If you skip plausibility, a valid signature can still report nonsense.
- If you skip single-use, operators can spam submissions and force extra accounting logic.
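The checklist maps naturally onto a function that runs the checks in a fixed order and returns the first failure. This is an off-chain sketch, not contract code: the type names, reason codes, and the is_active and signature_ok callbacks are all hypothetical stand-ins for the membership registry and signature verification.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Request:
    request_id: str
    nonce: str
    max_skew_sec: int
    expected_range: Tuple[float, float]


@dataclass(frozen=True)
class Proof:
    request_id: str
    operator_id: str
    measured_value_c: float
    measured_at: int
    nonce_echo: str


def verify_proof(req: Request, proof: Proof, block_time: int,
                 is_active: Callable[[str], bool],
                 signature_ok: Callable[[Proof], bool],
                 already_satisfied: bool) -> Tuple[bool, str]:
    """Run the deterministic checks in order; return (passed, reasonCode)."""
    if already_satisfied:
        return False, "ALREADY_SATISFIED"      # single-use
    if not is_active(proof.operator_id):
        return False, "NOT_MEMBER"             # membership
    if proof.nonce_echo != req.nonce:
        return False, "NONCE_MISMATCH"         # nonce match
    if abs(proof.measured_at - block_time) > req.max_skew_sec:
        return False, "STALE"                  # freshness
    if not signature_ok(proof):
        return False, "BAD_SIGNATURE"          # signature validity
    low, high = req.expected_range
    if not (low <= proof.measured_value_c <= high):
        return False, "OUT_OF_RANGE"           # plausibility bounds
    return True, "OK"
```

Returning a single reason code per failure keeps the ProofVerified event easy to assert on in tests, because each test case maps to exactly one code.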
Step 5: Define reward accounting for the example
Use a straightforward rule so you can test it.
Example reward model:
- Base reward R for a passing proof.
- Optional quality multiplier based on how close the value is to a reference (only if you have a reference source).
For a first build, omit multipliers.
Settlement artifacts:
- requestId -> operator -> status (pass/fail)
- requestId -> rewardAmount
Step 6: Off-chain operator workflow and artifacts
Operators need a repeatable process.
Operator workflow (example)
- Receive assignment for requestId.
- Read sensor and capture measuredValueC.
- Compute locationCommitment from the device's location inputs.
- Set measuredAt from device clock.
- Build proof payload and sign it.
- Call submitProof with the proof.
Operator artifacts
- deviceConfig.json: device key references and measurement calibration parameters
- proof.json: the exact proof payload and signature
- submissionReceipt.json: transaction hash and parsed events
Mind map: data flow
Data Flow (Request -> Proof -> Verification)
Client
- creates Measurement Request
- posts requestId and parameters
Protocol
- stores request
- waits for submitProof
Operator
- reads sensor
- builds proof payload
- signs proof
- submits proof
Contract
- checks membership
- checks nonce + freshness
- verifies signature
- checks plausibility
- emits events
- updates settlement state
Step 7: Example end-to-end run (with concrete values)
Given request:
- requestId = 0xREQ1
- nonce = 0xN1
- maxSkewSec = 30
- expectedRange = [0, 50] (°C)
- targetTime = 1710000000
At time of submission:
blockTime = 1710000030
Operator proof:
- measuredValueC = 22.4
- measuredAt = 1710000030
- locationCommitment = H(locationInputs)
- nonceEcho = 0xN1
- signature = Sign(deviceKey, proofPayloadHash)
Verification outcome:
- Freshness: abs(1710000030 - 1710000030) = 0 <= 30 ✓
- Plausibility: 22.4 in [0, 50] ✓
- Signature: valid ✓
- Result: pass, reward settled.
Failure example (same request):
- measuredValueC = 120.0 fails plausibility.
- Even with a valid signature, the contract rejects it.
Step 8: Testing plan and "artifact-driven" validation
Write tests that assert on artifacts and events, not just return values.
Test cases (minimum set):
- Valid proof passes and emits ProofVerified(pass=true).
- Wrong nonce fails.
- Stale timestamp fails.
- Invalid signature fails.
- Out-of-range value fails.
- Second submission for the same request is rejected (single-use).
Artifact checks:
- Parse the ProofVerified event and confirm reasonCode matches the failing rule.
- Confirm RewardSettled only occurs after a passing verification.
Step 9: Minimal "spec sheet" for implementation
Create a one-page spec that your code can mirror.
Step 10: Build checklist for launch readiness
- Request schema is fixed and versioned.
- Proof payload hash is defined and used consistently in signing and verification.
- Freshness uses a clear time source and explicit skew window.
- Plausibility bounds exist and are configurable per request.
- Contract emits events that tests can assert.
- Reward settlement is gated on verification success.
- Operators can produce proof.json deterministically from sensor inputs.
When these pieces line up, the network behaves predictably: clients get verifiable measurements, operators know exactly what to submit, and the protocol has rules it can enforce without guessing.
14.2 Build Guide for a Coverage and Quality Network: Example Scoring and Rewards
This guide shows one concrete scoring design for a DePIN network where clients request work, operators submit results, and the protocol pays based on coverage (did you contribute useful measurements?) and quality (were they accurate and consistent?). The goal is to make scoring explainable, auditable, and resistant to obvious gaming.
Network roles and the scoring contract
- Client: requests coverage for a region/time window and defines acceptance criteria.
- Operator: submits one or more measurement/proof bundles.
- Verifier: checks proofs and computes per-bundle scores.
- Contract: stores eligibility, aggregates scores, and settles rewards.
A practical rule: the contract should not "guess" quality. It should only accept verifier outputs (scores and reasons) that are deterministic given the submitted evidence.
Mind map: scoring and rewards
Step 1: Define the request and what "coverage" means
Coverage should be measurable without subjective judgment.
Example request fields:
- region: a geohash prefix (e.g., u4pruyd)
- time_window: [T0, T1]
- metric: temperature_c
- coverage_target: at least k distinct operators
- acceptance: |value - reference| <= 2.0°C (reference may come from consensus or a trusted source)
Coverage score should reward useful participation rather than raw volume. A simple approach:
- Each operator submission is assigned a coverage unit if it is relevant, valid, and not a duplicate.
- Coverage score is then a capped function of how many coverage units the operator contributed.
Concrete example:
- k = 3 is the coverage target for the request; in this example it also serves as the number of unique valid bundles at which a single operator's coverage score saturates.
- For operator i, let c_i be the number of unique valid bundles for that request.
- Define:
- \(\text{coverageScore}_i = \min(1, c_i / k)\)
This makes it hard to spam duplicates: once you hit k, extra submissions don't increase coverage.
Step 2: Define quality in a way that can be checked
Quality should combine accuracy and uncertainty. If you score only accuracy, operators can report arbitrarily wide uncertainty at no cost, which makes the reported value uninformative. If you score only uncertainty, operators can submit precise nonsense. Combine both.
Assume each bundle includes:
- value: measured temperature_c
- uncertainty: reported standard deviation or bound
- reference: computed by verifier from consensus (e.g., median of valid submissions) or a known anchor
For each bundle j from operator i, compute:
- Accuracy error: \(e_{ij} = |value_{ij} - reference|\)
- Uncertainty penalty: \(p_{ij} = \max(0, uncertainty_{ij} - u_0)\), where \(u_0\) is the uncertainty tolerated without penalty
Then define a per-bundle quality score:
- \(\text{qualityBundle}_{ij} = \exp\left(-\frac{e_{ij}}{\alpha}\right) \cdot \exp\left(-\frac{p_{ij}}{\beta}\right)\)
Where α and β are scale parameters chosen to match your acceptance thresholds. If you want to avoid exponentials, use a piecewise linear function; the key is that it must be deterministic.
Example with piecewise linear (easier to reason about):
- Let acceptError = 2.0°C.
- Let uncertaintyCap = 1.0°C.
- Define:
- \(\text{accuracyScore} = \max(0, 1 - e_{ij}/acceptError)\)
- \(\text{uncertaintyScore} = \max(0, 1 - p_{ij}/uncertaintyCap)\)
- \(\text{qualityBundle}_{ij} = \text{accuracyScore} \cdot \text{uncertaintyScore}\)
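The piecewise linear definition is short enough to check by hand, and a few lines of code make the edge cases explicit. In this sketch the function name bundle_quality is hypothetical, and the default u0 = 1.0 is an assumption chosen to match the max(0, u-1) term used in the walkthrough later in this guide.

```python
def bundle_quality(error_c: float, uncertainty_c: float,
                   accept_error: float = 2.0,
                   uncertainty_cap: float = 1.0,
                   u0: float = 1.0) -> float:
    """Piecewise linear quality score in [0, 1] for one bundle."""
    # Only uncertainty above the tolerated level u0 is penalized.
    penalty = max(0.0, uncertainty_c - u0)
    accuracy_score = max(0.0, 1.0 - error_c / accept_error)
    uncertainty_score = max(0.0, 1.0 - penalty / uncertainty_cap)
    return accuracy_score * uncertainty_score
```

Because both factors are clamped at zero, a bundle that fails either criterion completely scores zero regardless of how well it does on the other.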
Step 3: Aggregate bundle scores into operator scores
Operators may submit multiple bundles. Aggregation should reward consistent quality, not just one lucky submission.
Let B_i be the set of valid, unique bundles for operator i in the request.
- Compute average quality: \(\overline{q}_i = \frac{1}{|B_i|}\sum_{j\in B_i} \text{qualityBundle}_{ij}\)
- Apply a consistency factor using variance (optional but useful), where \(s_0\) is the variance at which the factor reaches zero:
- \(\sigma_i^2 = \text{Var}(\text{qualityBundle}_{ij})\)
- \(\text{consistencyFactor}_i = \max(0, 1 - \sigma_i^2 / s_0)\)
- Final quality score:
- \(\text{qualityScore}_i = \overline{q}_i \cdot \text{consistencyFactor}_i\)
If you want to keep it minimal, skip variance and use the average. The rest of the system (coverage cap, duplicate detection, validity checks) already prevents most obvious abuse.
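Aggregation with and without the variance term can share one function. This is a sketch under assumptions: the name operator_quality is hypothetical, Var is taken to be the population variance (the text does not say which), and the s0 value used in the usage example is illustrative only.

```python
from statistics import fmean, pvariance
from typing import List, Optional


def operator_quality(bundle_scores: List[float],
                     s0: Optional[float] = None) -> float:
    """Average bundle quality, optionally scaled by a consistency factor."""
    if not bundle_scores:
        return 0.0
    average = fmean(bundle_scores)
    if s0 is None:
        return average                      # minimal variant: plain average
    # Assumption: population variance; the factor hits zero at variance s0.
    variance = pvariance(bundle_scores)
    consistency = max(0.0, 1.0 - variance / s0)
    return average * consistency
```

For example, operator_quality([0.75, 0.5, 0.6]) returns the plain average, while passing s0=0.05 lowers the score because the three bundles differ noticeably from one another.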
Step 4: Combine coverage and quality into a reward weight
Let baseReward be the total budget for the request (from escrow). Define a weight per operator:
- \(w_i = \text{coverageScore}_i \cdot \text{qualityScore}_i\)
Then normalize:
- \(\text{reward}_i = baseReward \cdot \frac{w_i}{\sum_{m} w_m}\)
Apply a floor before normalizing to avoid paying tiny amounts that cost more to process than they're worth:
- If \(w_i < w_{min}\), set \(w_i = 0\).
Example numbers:
- baseReward = 1000
- Operator A: coverageScore=1.0, qualityScore=0.8 → w=0.8
- Operator B: coverageScore=0.67, qualityScore=0.9 → w=0.603
- Operator C: coverageScore=0.33, qualityScore=0.95 → w=0.314
- Sum w = 1.717
- Rewards:
- A: 1000*(0.8/1.717)=466
- B: 1000*(0.603/1.717)=351
- C: 1000*(0.314/1.717)=183
This produces intuitive outcomes: A contributes enough coverage and has good quality, so it earns the most.
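The normalization and floor can be reproduced in a few lines to confirm the example numbers. This is an off-chain sketch with a hypothetical function name, split_rewards; it rounds to whole tokens, whereas a contract would typically use integer arithmetic and decide explicitly where any rounding remainder goes.

```python
from typing import Dict


def split_rewards(base_reward: int, weights: Dict[str, float],
                  w_min: float = 0.0) -> Dict[str, int]:
    """Zero out weights below w_min, then split base_reward proportionally."""
    kept = {op: (w if w >= w_min else 0.0) for op, w in weights.items()}
    total = sum(kept.values())
    if total == 0:
        return {op: 0 for op in kept}       # nothing to distribute
    return {op: round(base_reward * w / total) for op, w in kept.items()}


weights = {"A": 1.0 * 0.8, "B": 0.67 * 0.9, "C": 0.33 * 0.95}
rewards = split_rewards(1000, weights)      # A=466, B=351, C=183
```

Raising w_min above Operator C's weight removes C from the split and redistributes the budget between A and B, which is the intended effect of the floor.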
Step 5: Make duplicate detection explicit
Coverage depends on uniqueness. Define a deterministic duplicate key per bundle:
duplicateKey = hash(nodeId, requestId, evidenceHash)
If an operator submits the same evidence hash for the same request, count it once. If they submit different evidence, count each bundle up to the coverage cap.
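Duplicate detection reduces to hashing a fixed tuple and counting first occurrences. The sketch below assumes SHA-256 for the unspecified hash and length-prefixes each component so different tuples cannot serialize to the same bytes; the function names are hypothetical, and nodeId is used as written in the key definition above.

```python
import hashlib
from typing import Dict, List, Set, Tuple


def duplicate_key(node_id: str, request_id: str, evidence_hash: str) -> str:
    """Deterministic key for one bundle."""
    encoded = b""
    for part in (node_id, request_id, evidence_hash):
        raw = part.encode("utf-8")
        encoded += len(raw).to_bytes(4, "big") + raw   # length-prefixed field
    return hashlib.sha256(encoded).hexdigest()


def count_unique_bundles(bundles: List[Tuple[str, str, str]]) -> Dict[str, int]:
    """Count unique bundles per node; each bundle is (nodeId, requestId, evidenceHash)."""
    seen: Set[str] = set()
    counts: Dict[str, int] = {}
    for node_id, request_id, evidence_hash in bundles:
        key = duplicate_key(node_id, request_id, evidence_hash)
        if key in seen:
            continue                                   # repeated evidence counts once
        seen.add(key)
        counts[node_id] = counts.get(node_id, 0) + 1
    return counts
```

The resulting per-node counts are the c_i values that feed the capped coverage score.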
Step 6: Verification output schema (what the contract needs)
Verifier should output a compact, deterministic structure per operator per request.
Example fields:
- requestId
- operatorId
- validBundlesCount
- coverageScore
- qualityScore
- weight
- reasons[] (human-readable codes, not long text)
The contract uses only numeric fields for settlement and stores reasons for audit.
Step 7: Example end-to-end scoring walkthrough
Assume one request with k=3, acceptError=2.0°C, uncertaintyCap=1.0°C, u_0=1.0°C (the penalty-free uncertainty, which gives the max(0,u-1) term used in the scores), and baseReward=1000.
Valid unique bundles:
- Operator A submits 3 bundles with errors [0.5, 1.0, 0.8] and uncertainties [0.6, 0.7, 0.5].
- accuracyScores: [1-0.25, 1-0.5, 1-0.4] = [0.75, 0.5, 0.6]
- uncertaintyScores: all 1 - (max(0,u-1)/1) = 1.0
- qualityBundle: [0.75, 0.5, 0.6]
- qualityScore (avg): (0.75+0.5+0.6)/3=0.617
- coverageScore: min(1, 3/3)=1
- w=0.617
- Operator B submits 2 bundles with errors [1.2, 1.8] and uncertainties [0.4, 1.2].
- accuracyScores: [1-0.6, 1-0.9]=[0.4, 0.1]
- uncertaintyScores: [1.0, 1-(0.2/1)=0.8]
- qualityBundle: [0.4, 0.08]
- qualityScore avg: 0.24
- coverageScore: min(1, 2/3)=0.667
- w=0.160
- Operator C submits 1 bundle with error [0.2] and uncertainty [2.0].
- accuracyScore: 1-0.1=0.9
- uncertaintyScore: 1-(1.0/1)=0
- qualityBundle: 0
- qualityScore: 0
- coverageScore: min(1, 1/3)=0.333
- w=0
Normalize weights: sum w = 0.777
- A: 1000*(0.617/0.777)=794
- B: 1000*(0.160/0.777)=206
- C: 0
This outcome is consistent with the design: C covered a bit but reported uncertainty so large that its quality collapses to zero.
Step 8: Contract settlement logic (minimal and safe)
The contract should:
- Check operator eligibility (registered, stake locked, not slashed).
- Accept verifier-submitted numeric scores for the request.
- Compute weights, apply w_min, normalize, and pay from escrow.
- Store settlement events for reconciliation.
A simple pseudocode sketch:
for each operator i in request:
    w[i] = 0
    if not eligible(i): continue
    if coverageScore[i] == 0 or qualityScore[i] == 0: continue
    w[i] = coverageScore[i] * qualityScore[i]
    if w[i] < w_min: w[i] = 0

sumW = sum of w[i] over all operators i in request

for each operator i in request:
    if w[i] == 0 or sumW == 0: payout = 0
    else: payout = baseReward * w[i] / sumW
    if payout > 0: transferEscrow(i, payout)
    emit Settlement(requestId, i, payout, w[i])
Practical best practices embedded in the design
- Cap coverage so operators can't earn more by submitting redundant bundles.
- Combine accuracy and uncertainty so "precise-looking" nonsense doesn't win.
- Use deterministic aggregation so the same evidence yields the same scores.
- Normalize weights so the request budget is fully distributed among contributors.
- Store reasons as codes so audits are fast and on-chain storage stays small.
With these pieces, you get a scoring system that is easy to explain: coverage answers "did you contribute relevant, unique work?" and quality answers "was it believable and consistent?" Rewards follow directly from those two numbers.
14.3 Build Guide for a Data Availability Network: Commitments and Retrieval
A Data Availability (DA) network's job is simple to state and picky to implement: clients must be able to (1) obtain a commitment that represents a piece of data and (2) retrieve enough data to reconstruct or verify what was committed. The trick is making commitments compact, retrieval verifiable, and failure modes predictable.
Define the DA unit and the commitment target
Start by choosing the smallest unit you will commit to. Common choices are "a blob," "a batch," or "a segment of a larger message." Your commitment should target exactly that unit.
Example decision:
- Unit: Batch = 128 data chunks, each chunk is 4 KiB.
- Commitment: one digest for the entire batch.
Why this matters: If you later change chunking, you either break compatibility or add translation layers that complicate verification.
Choose a commitment scheme that supports partial verification
You need a commitment that can be checked against retrieved pieces. A practical pattern is:
- Split data into fixed-size chunks.
- Compute a per-chunk hash.
- Build a Merkle tree over chunk hashes.
- Commit to the Merkle root.
Example:
- Chunk hashes: h_i = H(chunk_i).
- Merkle root: R = MerkleRoot(h_0..h_127).
- Commitment published on-chain (or in a consensus layer): commitment = R.
Retrieval verification: when a client fetches chunk_i, the server also provides a Merkle proof π_i so the client can verify H(chunk_i) is consistent with R.
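The commit, prove, and verify steps fit in one small module. This is a minimal sketch, not a production scheme: it assumes SHA-256 for H, duplicates the last node on odd-sized levels (one of several conventions), and omits leaf/internal domain separation, which hardened designs such as RFC 6962 add. All function names are hypothetical.

```python
import hashlib
from typing import List


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _pad(level: List[bytes]) -> List[bytes]:
    # Assumption: duplicate the last node when a level has an odd length.
    return level + [level[-1]] if len(level) % 2 else level


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every tree level, leaf hashes first and the root level last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = _pad(levels[-1])
        levels.append([H(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def merkle_root(chunks: List[bytes]) -> bytes:
    return build_levels([H(c) for c in chunks])[-1][0]


def merkle_proof(chunks: List[bytes], index: int) -> List[bytes]:
    """Sibling hashes from the leaf level up to (not including) the root."""
    proof = []
    for level in build_levels([H(c) for c in chunks])[:-1]:
        proof.append(_pad(level)[index ^ 1])   # sibling is the paired neighbor
        index //= 2
    return proof


def verify_chunk(root: bytes, index: int, chunk: bytes, proof: List[bytes]) -> bool:
    """Recompute the candidate root from a chunk and its proof path."""
    node = H(chunk)
    for sibling in proof:
        node = H(node + sibling) if index % 2 == 0 else H(sibling + node)
        index //= 2
    return node == root
```

For a 128-chunk batch the proof contains 7 sibling hashes, so a client verifies a 4 KiB chunk with 224 bytes of proof data under these assumptions.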
Data layout and indexing rules
Make indexing boring and deterministic.
Rules to write down and enforce:
- Chunk size is fixed (e.g., 4096 bytes).
- Chunk index is zero-based and stable across all nodes.
- The batch includes metadata that affects chunking (e.g., total length) or you pad deterministically.
- Hashing uses a single canonical encoding.
Concrete example:
- If the last chunk is short, pad with zero bytes up to 4096.
- Hash the padded chunk bytes exactly as stored.
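Deterministic chunking and padding can be expressed directly. The sketch below uses hypothetical names (split_into_chunks, join_chunks) and assumes the original byte length is carried in batch metadata, since zero padding alone cannot be distinguished from data that genuinely ends in zero bytes.

```python
from typing import List

CHUNK_SIZE = 4096  # bytes, as in the example rule above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split data into fixed-size chunks, zero-padding the final chunk."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if chunks and len(chunks[-1]) < chunk_size:
        chunks[-1] = chunks[-1].ljust(chunk_size, b"\x00")
    return chunks


def join_chunks(chunks: List[bytes], total_length: int) -> bytes:
    """Reassemble the original bytes using the length stored in metadata."""
    return b"".join(chunks)[:total_length]
```

Hashing the padded chunk exactly as returned by split_into_chunks keeps every node's chunk hashes identical for the same input bytes.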
Retrieval API design: what the client asks for
A DA retrieval API should let a client request either:
- the whole batch (for reconstruction), or
- a set of chunks with proofs (for verification).
Example endpoints (conceptual):
- GET /batch/{batchId} returns batch metadata and commitment.
- GET /batch/{batchId}/chunk/{i} returns {chunk_i, proof_i}.
- GET /batch/{batchId}/chunks?indices=... returns multiple chunks and proofs.
Client workflow example:
- Client learns batchId and commitment root R.
- Client requests chunks at indices [3, 17, 88, 101].
- For each returned chunk, client verifies MerkleProofVerify(R, i, chunk_i, proof_i).
- If enough chunks are retrieved for your application's reconstruction rules, the client reconstructs or accepts the data.
Mind map: commitments and retrieval flow
Server responsibilities: serving data without lying
A retrieval server must be able to answer chunk requests consistently with the published commitment.
Minimum server checklist:
- Store the batch (or be able to reconstruct it from storage).
- Compute and store the Merkle root R for the batch.
- For each chunk request, return:
- the exact chunk bytes used in the commitment
- the correct Merkle proof path for that chunk index
Example proof generation:
- If you store the full Merkle tree, proof generation is straightforward.
- If you store only chunks, you must rebuild the tree for proofs, which increases latency.
Client verification: treat proofs as the source of truth
The client should never "trust" the server's claim that a chunk belongs to a batch. The proof is what ties the chunk to the commitment.
Verification steps for one chunk:
- Compute h = H(chunk_i).
- Use proof_i and index i to compute the candidate root.
- Compare candidate root to R.
Concrete example:
- Commitment root R is known.
- Server returns chunk_17 and proof_17.
- Client computes H(chunk_17) and verifies the proof path yields R.
- If it doesn't, the client discards the chunk and marks the server as unreliable for this batch.
Handling partial retrieval: selecting indices and acceptance rules
If your application only needs a subset of chunks, define acceptance rules precisely.
Example acceptance rule (simple):
- Client requests k chunks uniformly at random from 0..N-1.
- Client accepts the batch if all k proofs verify against R.
Example acceptance rule (reconstruction):
- Client requests all chunks and reconstructs the original batch.
Important design note: Your acceptance rule must align with how the rest of the system uses DA. If you only verify proofs for a subset, your application must be comfortable with that level of certainty.
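The "level of certainty" from sampling can be quantified. As a sketch, assume the client samples k distinct indices uniformly at random and that some number of chunks cannot be served; the probability that every sample misses the unavailable chunks is then a ratio of binomial coefficients. The function name miss_probability is hypothetical.

```python
from math import comb


def miss_probability(n_chunks: int, n_missing: int, k: int) -> float:
    """Probability that k distinct random samples all avoid the missing chunks.

    Assumes sampling without replacement, so the result is
    C(n_chunks - n_missing, k) / C(n_chunks, k).
    """
    if k > n_chunks - n_missing:
        return 0.0          # more samples than available chunks: a miss is certain to be caught
    return comb(n_chunks - n_missing, k) / comb(n_chunks, k)
```

With N = 128 and half the chunks withheld, four samples leave roughly a 6% chance of accepting the batch anyway; with only one chunk withheld, four samples almost never notice. Whether that is acceptable depends on how the application uses the data.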
Batch identifiers and commitment publication
You need a stable way to map from "what the client heard" to "what the client should verify."
Example mapping:
- batchId = H(creatorAddress || batchSequenceNumber).
- Commitment published: R associated with batchId.
Why not just use the root as the ID? You can, but then you lose the ability to attach metadata like creator, sequence, or retrieval policies without extra structure.
Minimal end-to-end example (from commit to retrieval)
Setup:
- Batch has N=8 chunks of 4 KiB.
- Commitment root R is computed and published.
Commit step:
- Server computes h_0..h_7.
- Builds Merkle tree and publishes R.
Retrieval step:
- Client requests chunk indices [1, 4, 6].
- Server returns (chunk_1, proof_1), (chunk_4, proof_4), (chunk_6, proof_6).
Verification step:
- Client verifies each proof against R.
- If all three verify, the client records that these chunks are consistent with the committed batch.
Implementation notes that prevent common bugs
- Canonical hashing: ensure every component hashes the same byte representation.
- Index consistency: proofs must be generated with the same chunk index ordering the client uses.
- Padding determinism: padding must be identical across all nodes.
- Proof serialization: define a stable encoding for proof nodes (e.g., list of sibling hashes in order).
- Timeout behavior: if a chunk request times out, retry with another server or another index set according to your acceptance rule.
Practical build checklist
- Specify DA unit (batch) and chunking rules.
- Implement chunk hashing and Merkle tree root computation.
- Implement proof generation for arbitrary chunk indices.
- Implement retrieval endpoints for single and multiple chunks.
- Implement client proof verification against published commitment R.
- Define acceptance rules for partial retrieval.
- Add logging for proof verification failures (include batchId, index, and computed vs expected root).
When these pieces are aligned, the network becomes predictable: commitments are compact, retrieval is verifiable, and "bad data" fails loudly at the proof check instead of quietly in downstream logic.
14.4 Build Guide for a Service Provision Network: Task Dispatch and Settlement
A service provision DePIN network coordinates three things: (1) a client's request, (2) an operator's execution of a physical task, and (3) a settlement outcome that depends on verifiable evidence. This section focuses on the "task dispatch → proof submission → settlement" loop, with concrete design choices that keep failure modes understandable.
1) Define the service contract (what "done" means)
Start by writing a service contract that is strict enough to be testable, but not so strict that it becomes impossible to satisfy.
Include these fields:
- Task type: e.g., "air-quality sampling," "meter inspection," "delivery verification."
- Scope: location, time window, and any constraints (temperature range, access rules, required equipment).
- Success criteria: measurable outcomes (e.g., "at least 3 samples," "photo evidence includes meter serial," "GPS trace covers route segment").
- Evidence requirements: what the operator must submit (signed measurements, media hashes, logs, receipts).
- Dispute hooks: what evidence is acceptable during a challenge window.
Easy example:
- Task type: "Inspect and photograph a water meter."
- Success criteria: "One clear photo of the meter face, one photo of the serial label, both with visible timestamps."
- Evidence: "Two images + metadata + operator signature."
2) Design the dispatch flow (how tasks get assigned)
Dispatch is where you decide whether the network is "auction-like," "first-available," or "committee-based." For a service provision network, a practical default is eligibility filtering + weighted selection.
Core steps:
- Client posts a request with scope, budget, and evidence requirements.
- Network selects eligible operators based on membership and capability.
- Operator accepts the task by signing an acceptance message.
- Operator executes and produces evidence.
- Operator submits proof within a deadline.
- Settlement finalizes based on verification and dispute rules.
Concrete example (eligibility filtering):
- Only operators with a "meter-inspection" capability can accept the task.
- Operators must also be "live" (recent heartbeat) to reduce the chance of timeouts.
3) Mind map: task dispatch and settlement
4) Define the task state machine (keep it boring and correct)
A state machine prevents "who decided what" confusion. Use explicit transitions and record them as events.
Recommended states:
- Requested
- Assigned
- Accepted
- InProgress
- ProofSubmitted
- Verified
- Disputed
- Finalized
- Failed
Transition rules (example):
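- Requested → Assigned: network picks an operator.
- Assigned → Accepted: operator signs acceptance before acceptDeadline.
- Assigned → Failed: acceptDeadline passes without a signed acceptance.
- Accepted → InProgress: optional; can be implicit.
- InProgress → ProofSubmitted: operator submits proof before proofDeadline.
- InProgress → Failed: proofDeadline passes without a proof.
- ProofSubmitted → Verified: verifier passes.
- ProofSubmitted → Failed: verifier rejects the proof.
- Verified → Disputed: client or verifier opens a challenge within challengeWindow.
- Verified → Finalized: challengeWindow closes without a challenge.
- Disputed → Finalized: dispute resolution completes.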
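A transition table makes the state machine enforceable in code. The sketch below is one possible reading of the rules, not the definitive set: it treats a missed acceptance deadline as Assigned → Failed, and it assumes Verified → Finalized when the challenge window closes without a dispute, plus failure transitions for a missed proof deadline and a rejected proof. The names TaskState, ALLOWED, and Task are hypothetical.

```python
from enum import Enum, auto


class TaskState(Enum):
    REQUESTED = auto()
    ASSIGNED = auto()
    ACCEPTED = auto()
    IN_PROGRESS = auto()
    PROOF_SUBMITTED = auto()
    VERIFIED = auto()
    DISPUTED = auto()
    FINALIZED = auto()
    FAILED = auto()


# Assumed transition table; terminal states map to an empty set.
ALLOWED = {
    TaskState.REQUESTED: {TaskState.ASSIGNED},
    TaskState.ASSIGNED: {TaskState.ACCEPTED, TaskState.FAILED},
    TaskState.ACCEPTED: {TaskState.IN_PROGRESS},
    TaskState.IN_PROGRESS: {TaskState.PROOF_SUBMITTED, TaskState.FAILED},
    TaskState.PROOF_SUBMITTED: {TaskState.VERIFIED, TaskState.FAILED},
    TaskState.VERIFIED: {TaskState.DISPUTED, TaskState.FINALIZED},
    TaskState.DISPUTED: {TaskState.FINALIZED},
    TaskState.FINALIZED: set(),
    TaskState.FAILED: set(),
}


class Task:
    def __init__(self, task_id: int) -> None:
        self.task_id = task_id
        self.state = TaskState.REQUESTED
        self.events = []   # recorded (from_state, to_state) pairs for auditing

    def transition(self, new_state: TaskState) -> None:
        """Move to new_state or raise if the transition is not allowed."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} not allowed")
        self.events.append((self.state, new_state))
        self.state = new_state
```

Rejecting undefined transitions outright, rather than silently ignoring them, is what makes the recorded events a reliable audit trail.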
5) Build the dispatch mechanism (practical selection)
Implement selection with two layers: eligibility and policy.
Eligibility checks (simple):
- Operator has capability tag matching task type.
- Operator is currently live.
- Operator is not already overloaded (optional, but useful).
Selection policy example:
- Choose one operator with weighted random where weight = operator reliability score.
- If they don't accept in time, re-run selection once.
Easy example:
- Task budget: 10 tokens.
- Operator A reliability 0.9, B reliability 0.6.
- Weighted selection picks A more often.
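Weighted random selection is a one-liner with the standard library. This sketch uses a hypothetical function name, select_operator, and a seeded pseudo-random generator for reproducibility; a deployed network would need a randomness source that participants cannot bias, which is outside the scope of this example.

```python
import random
from typing import Dict


def select_operator(reliability: Dict[str, float], rng: random.Random) -> str:
    """Pick one eligible operator with probability proportional to reliability."""
    operators = list(reliability)
    weights = [reliability[op] for op in operators]
    return rng.choices(operators, weights=weights, k=1)[0]


# Usage: with reliabilities 0.9 and 0.6, A is expected to win about
# 0.9 / (0.9 + 0.6) = 60% of selections over many draws.
rng = random.Random(42)
picks = [select_operator({"A": 0.9, "B": 0.6}, rng) for _ in range(10000)]
share_a = picks.count("A") / len(picks)
```

Because B still wins a substantial share, lower-reliability operators keep getting work and a chance to improve their score, which a strict "highest score wins" policy would not allow.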
6) Proof package format (what gets submitted)
A proof package should be self-contained enough to verify without guessing.
Include:
- taskId
- operatorId
- acceptanceSignature
- evidenceArtifacts (or references)
- artifactHashes (hashes of each artifact)
- evidenceTimestamp
- proofSignature
- verificationHints (optional: e.g., which fields correspond to which criteria)
Concrete example (photo evidence):
- Operator submits photo1, photo2.
- The proof includes hash(photo1), hash(photo2).
- Verifier checks that the hashes match the submitted artifacts and that metadata meets freshness rules.
7) Verification pipeline (staged to reduce cost)
Verification should be staged so you can reject bad proofs early.
Stage 1: structural checks
- Correct taskId and operatorId.
- Signatures valid.
- evidenceTimestamp within allowed window.
Stage 2: evidence checks
- Hashes match artifacts.
- Required number/type of artifacts present.
- Basic content constraints (e.g., image contains required region).
Stage 3: scoring and thresholding
- Compute a quality score.
- Compare against success threshold.
Example thresholding:
- Photo clarity score must be ≥ 0.8.
- Serial label must be detected with confidence ≥ 0.7.
8) Settlement logic (pay only when it makes sense)
Settlement is the mapping from verification outcome to token movements.
Success path example:
- Client escrow is held at request creation.
- Once the task is Verified and challengeWindow has closed without a dispute, pay:
- Operator reward (service fee)
- Optional verifier reward (if verification is incentivized)
- Network fee (protocol fee)
Failure path example:
- If operator fails to submit proof by proofDeadline, refund client minus a small dispatch cost (or refund fully if you prefer simplicity).
- If operator submits malformed proof, treat as failure and apply penalties only if you have strong evidence of misbehavior.
Dispute path example:
- On Disputed, pause final settlement until dispute resolution completes.
- Dispute resolution re-runs verification with the dispute evidence rules.
9) Idempotency and retries (so you don't pay twice)
Every external action should be safe to retry.
Rules:
- Use an idempotency key for acceptance and proof submission.
- If a proof submission is repeated with the same taskId and operatorId, treat it as the same submission.
- Settlement should be triggered only once per taskId and outcome.
Easy example:
- Operator's network connection drops after submitting proof.
- Operator retries the same proof package.
- The contract recognizes the same proof hash and does not double-pay.
10) Mermaid diagram: end-to-end loop
flowchart TD
A[Client creates Request + escrow] --> B[Network filters eligible operators]
B --> C[Select operator by policy]
C --> D["Operator accepts (signed)"]
D --> E[Operator executes task]
E --> F[Operator submits Proof package]
F --> G[Verifier stages: structure -> evidence -> score]
G --> H{Verified?}
H -- Yes --> I[Settlement: pay operator + fees]
H -- No --> J[Failure: refund/penalty rules]
I --> K[Finalized]
J --> K
F --> L{Client opens dispute within window?}
L -- Yes --> M[Dispute resolution re-check]
M --> I
L -- No --> I
11) Minimal implementation checklist (what to code first)
- Task state machine + events (so you can observe behavior).
- Dispatch selection (eligibility + one retry).
- Acceptance signature + idempotency.
- Proof package schema + hash commitments.
- Verification stages with clear pass/fail outputs.
- Settlement transitions that are single-shot and auditable.
- Dispute window enforcement and dispute re-verification rules.
Concrete example of "single-shot" settlement:
- Settlement function checks `task.state == Verified` (or `Finalized` not already set).
- It writes `finalizedOutcome` once, then emits `TaskFinalized`.
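A single-shot settlement guard might look like the sketch below. The `Task` class, the `settle` function, and the event tuple layout are hypothetical stand-ins for contract storage and emitted events; only the state check and the write-once outcome come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class State(Enum):
    VERIFIED = "Verified"
    FINALIZED = "Finalized"

@dataclass
class Task:
    task_id: int
    state: State
    finalized_outcome: Optional[str] = None
    events: list = field(default_factory=list)

def settle(task: Task, outcome: str) -> bool:
    """Finalize at most once; a repeat call changes nothing and returns False."""
    if task.state is not State.VERIFIED or task.finalized_outcome is not None:
        return False
    task.finalized_outcome = outcome      # written exactly once
    task.state = State.FINALIZED
    task.events.append(("TaskFinalized", task.task_id, outcome))
    return True
```

Because the guard and the write happen in the same call, a retried settlement message cannot produce a second `TaskFinalized` event.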
12) Example: one complete task from start to finish
- Client requests `taskId=77` for "water meter inspection" with budget 10 tokens.
- Network selects Operator A as eligible and assigns it.
- Operator A accepts before `acceptDeadline` and starts execution.
- Operator A submits proof before `proofDeadline` with two photos and their hashes.
- Verifier passes structural checks, validates hashes, and scores clarity.
- Score meets threshold, so state becomes `Verified`.
- Client does not dispute within `challengeWindow`.
- Settlement finalizes: Operator A receives 8 tokens, protocol fee 1 token, verifier reward 1 token.
- Events recorded: `Requested`, `Assigned`, `Accepted`, `ProofSubmitted`, `Verified`, `TaskFinalized`.
The result is a service provision loop where dispatch, evidence, and settlement each have explicit inputs and outputs. That clarity makes debugging feasible and keeps the network's behavior consistent even when operators or clients behave imperfectly.
14.5 Integrated Reference Implementation Checklist Example (From Specs to Launch)
This checklist is written for a small but complete DePIN network: a client requests work, operators submit measurements, verifiers validate proofs, and the protocol settles rewards. Each item includes a concrete "what to build" example so you can turn requirements into code and tests.
Mind map: end-to-end build flow
1) Specs to artifacts (turn words into enforceable rules)
- Write invariants before writing code. Example invariant: "A reward for task `T` can be paid only if a valid proof for `T` is finalized." Put it in a short list and reference it in contract comments.
- Define the minimal on-chain state. Example: store only `taskId → status`, `taskId → aggregated proof hash`, and `operatorId → stake`. Keep measurement payloads off-chain.
- Create an event schema that matches your accounting. Example events: `TaskCreated`, `ProofSubmitted`, `ProofFinalized`, `RewardClaimed`, `OperatorSlashed`. Each event should carry the fields needed to reconstruct balances.
- Map each API endpoint to a state transition. Example: `POST /tasks/{id}/proofs` must correspond to `ProofSubmitted` and must reject if `task.status != SUBMISSION_OPEN`.
- Define failure modes as first-class outputs. Example: if proof verification fails, return `verification_error_code` and record it in an off-chain log keyed by `taskId`.
2) On-chain core (small contracts, clear boundaries)
- Registry contract: admission and identity. Example: `registerNode(pubkey, metadataHash)` and `rotateKey(oldKey, newKey)` with checks that the node is active and not revoked.
- Task lifecycle contract: deterministic state machine. Example states: `CREATED → SUBMISSION_OPEN → VERIFICATION_PENDING → FINALIZED/REJECTED`. Implement transitions as explicit functions.
- Proof submission hook. Example: `submitProof(taskId, operatorId, proofHash, freshnessNonce, signature)` verifies signature and freshness nonce, then emits `ProofSubmitted`.
- Verification result handling. Example: a verifier worker posts `verificationOutcome(taskId, operatorId, verdict, evidenceHash)`. The contract checks that the outcome matches the previously committed `proofHash`.
- Rewards accounting with integer math. Example: compute `reward = baseReward * qualityMultiplier / 1e6` using integer division rules you test. Avoid floats entirely.
- Slashing rules tied to explicit triggers. Example triggers: "proof hash mismatch" or "stale freshness nonce." Slashing should be a single function with clear preconditions.
- Dispute window and evidence commitment. Example: after `ProofFinalized`, open `challengeUntil = blockTime + window`. Store only `evidenceHash` on-chain; keep evidence blobs off-chain.
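The integer reward rule can be pinned down with a few golden vectors. In this sketch the function name `compute_reward` is hypothetical, `qualityMultiplier` is assumed to be expressed in millionths (matching the `1e6` divisor), and rounding down is an assumed choice that your spec would need to state explicitly.

```python
SCALE = 1_000_000  # qualityMultiplier is assumed to be in millionths

def compute_reward(base_reward: int, quality_multiplier: int) -> int:
    """reward = baseReward * qualityMultiplier / 1e6, rounded down, integers only."""
    if base_reward < 0 or quality_multiplier < 0:
        raise ValueError("inputs must be non-negative integers")
    # Multiply before dividing so precision is lost only once, at the end.
    return (base_reward * quality_multiplier) // SCALE

# Golden vectors: (baseReward, qualityMultiplier, expected reward)
GOLDEN = [
    (10, 1_000_000, 10),  # multiplier 1.0
    (10, 850_000, 8),     # 8.5 rounds down to 8
    (3, 333_333, 0),      # 0.999999 rounds down to 0
]
for base, mult, expected in GOLDEN:
    assert compute_reward(base, mult) == expected
```

The same vectors can be run against the on-chain implementation and any off-chain preview so both agree on rounding.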
3) Off-chain services (agents that do the boring work reliably)
- Operator agent: measurement → proof → submission. Example flow: collect sensor reading, compute `measurementHash`, sign it, assemble proof artifact, upload artifact, then submit `proofHash` and `evidenceHash`.
- Verifier worker: validate → verdict → publish. Example: verify signature, check freshness nonce, validate proof structure, then compute `verdict` and publish outcome.
- Client SDK: request → quote → proof receipt. Example: `client.requestTask(params)` returns `taskId` and `expectedProofFormat`. After submission, `client.getStatus(taskId)` shows `SUBMISSION_OPEN`, `VERIFICATION_PENDING`, or `FINALIZED`.
- Storage layer: content addressing and integrity checks. Example: store artifacts under `CID` or `sha256` keys. Every upload returns a hash; every submission references that hash.
- Idempotency and retries. Example: operator uses `submissionId = hash(taskId, operatorId, freshnessNonce)` so retries don't create duplicate submissions.
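One way to derive the `submissionId` from the idempotency item is sketched below. The choice of SHA-256, the length-prefixed field encoding, and the in-memory `SubmissionStore` are assumptions made for illustration; the text only specifies that the ID is a hash of the three fields.

```python
import hashlib

def submission_id(task_id: str, operator_id: str, freshness_nonce: bytes) -> str:
    """Deterministic ID. Each field is length-prefixed so that
    ("ab", "c") and ("a", "bc") cannot collide after concatenation."""
    h = hashlib.sha256()
    for part in (task_id.encode(), operator_id.encode(), freshness_nonce):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()

class SubmissionStore:
    """Hypothetical store: the first record for an ID wins, retries are no-ops."""
    def __init__(self):
        self._seen = {}

    def submit(self, task_id, operator_id, nonce, proof_hash):
        sid = submission_id(task_id, operator_id, nonce)
        # setdefault keeps the first record; a retry returns it unchanged
        return sid, self._seen.setdefault(sid, proof_hash)
```

A retry with the same three fields maps to the same key, so the store size does not grow and the caller gets back the original record.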
4) Security and correctness (make attacks boring by design)
- Replay protection. Example: freshness nonce is unique per `(operatorId, taskId)` and contract rejects reused nonces.
- Authorization boundaries. Example: only the verifier role can publish `verificationOutcome`. Operator role can submit proofs but cannot finalize rewards.
- Accounting invariants as tests. Example invariants: total rewards paid ≤ budget; operator balance never negative; slashing reduces stake before reward payout.
- Dispute correctness. Example: if a challenge is filed, contract should freeze claimability until dispute resolution updates the task status.
- Signature verification tests. Example: test wrong key, wrong message domain separator, and altered `proofHash` all fail.
5) Reliability and operations (so it keeps working after launch day)
- SLOs tied to metrics you can measure. Example: proof submission success rate ≥ 99%, proof verification latency p95 ≤ 30s, and task finalization within a bounded time.
- Backpressure in workers. Example: verifier queue limits concurrent verification jobs; when overloaded, it returns `429` to operators or delays pulls.
- Runbooks for the top three incidents. Example runbooks: (a) verifier outage, (b) storage upload failures, (c) contract upgrade rollback.
- Upgrade procedure with versioned interfaces. Example: contracts expose `protocolVersion`. Off-chain workers refuse to run if their expected version doesn't match.
6) Launch readiness (a test matrix that matches the real flow)
- End-to-end integration test: happy path. Example: create task, operator submits proof, verifier finalizes, operator claims reward, and balances update correctly.
- End-to-end integration test: partial failures. Example: operator uploads artifact but submission fails; retry should reuse the same `proofHash` and not double-pay.
- Adversarial test: stale nonce. Example: submit proof with an old nonce; contract rejects and verifier never publishes an outcome.
- Adversarial test: proof hash mismatch. Example: upload artifact A but submit hash of artifact B; verifier rejects and operator is eligible for slashing only if your rules say so.
- Dispute test: challenge and resolution. Example: challenge within window, evidence hash matches, contract updates status and prevents reward claim until resolved.
- Migration test: contract upgrade or parameter change. Example: change `qualityMultiplier` config; ensure old tasks still settle under old rules.
Concrete "spec-to-launch" checklist (printable)
| Stage | Done when | Example acceptance criteria |
|---|---|---|
| Specs | invariants + state machine defined | "FINALIZED implies reward eligibility" is enforced |
| On-chain | contracts compile and unit tests pass | reward math matches golden vectors |
| Off-chain | workers run in staging | operator retries are idempotent |
| Security | replay and signature tests pass | stale nonce always rejected |
| Ops | dashboards and alerts exist | proof latency p95 alert triggers |
| Launch | end-to-end suite green | happy path + dispute + partial failure all pass |
Minimal sign-off rubric
- Correctness: all invariants have automated tests.
- Traceability: every critical action emits an event or a keyed off-chain log entry.
- Recoverability: retries and restarts do not corrupt state or double-pay.
- Safety: dispute and slashing paths are tested, not just implemented.
When these boxes are checked, you can launch with confidence that the system behaves like the spec, not like a collection of loosely connected components.
15. Design Review, Documentation, and Verification of Correctness
15.1 Architecture Review Checklist Example Interfaces, Trust, and Failure Modes
Use this checklist when you review a DePIN architecture before implementation or major refactors. The goal is not to "approve" the design, but to force crisp answers about interfaces, trust assumptions, and what happens when things go wrong.
Mind map: what you must pin down
1) Interface review (contracts, not vibes)
For each interface, confirm: inputs, outputs, ordering, idempotency, and error semantics.
1.1 Client → Protocol API
- Request identity: Every request should carry a `request_id` used for idempotency.
- Response completeness: The protocol response must include enough data for the client to either proceed or stop (e.g., proof status, required fields, and next action).
- Error taxonomy: Separate "temporary" failures (retryable) from "permanent" failures (e.g., invalid parameters).
Example: A client asks for a measurement quote.
- Input: `{request_id, target, constraints}`
- Output: `{quote_id, max_price, expected_proof_type, expires_at}`
- Failure: `EXPIRED_QUOTE` is permanent; `PROVER_TIMEOUT` is retryable.
1.2 Protocol → Node Operator task dispatch
- Task determinism: The task payload should be deterministic so the operator can reproduce the expected measurement procedure.
- Freshness: Include a `task_epoch` or `challenge_window` so old tasks can't be replayed.
- Result schema: Define a strict result format with required fields and canonical encoding.
Example: A task includes {task_id, measurement_spec_hash, start_window, end_window}. The operator returns {task_id, measurement_value, uncertainty, proof_blob_hash}.
1.3 Protocol → Verifier workflow
- Verifier role clarity: Decide whether verifiers are checking cryptographic validity only, or also checking plausibility against constraints.
- Threshold behavior: If multiple verifiers contribute, specify the aggregation rule (e.g., `k-of-n` signatures).
- Staleness handling: Verifier results must include the `task_epoch` they were based on.
Example: Three verifiers sign the same proof_blob_hash. The protocol accepts when at least two signatures match the same hash.
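The 2-of-3 rule in the example can be expressed as a small aggregation function. This sketch assumes each attestation's signature has already been verified, so it only counts distinct verifier IDs per hash; the function name `accepted_hash` and the tuple layout are hypothetical.

```python
from collections import defaultdict

def accepted_hash(attestations, k):
    """attestations: iterable of (verifier_id, proof_blob_hash) pairs whose
    signatures are assumed valid. Returns the single hash backed by at least
    k distinct verifiers, or None if no hash (or more than one) qualifies."""
    voters = defaultdict(set)
    for verifier_id, proof_hash in attestations:
        voters[proof_hash].add(verifier_id)  # a set ignores duplicate votes
    winners = [h for h, ids in voters.items() if len(ids) >= k]
    return winners[0] if len(winners) == 1 else None
```

Counting distinct IDs means one verifier signing the same hash twice still counts once. A verifier that signs two different hashes is not penalized here; whether that should be a slashing trigger is a separate policy decision.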
1.4 Protocol → Storage (off-chain data)
- Content addressing: Store artifacts by hash so the protocol can verify integrity without trusting storage.
- Retrieval contract: Define what happens if an artifact is missing: does the client re-request, or does the protocol mark the job failed?
Example: The protocol stores proof_blob at hash=H. If retrieval returns bytes not matching H, the job fails with ARTIFACT_INTEGRITY_MISMATCH.
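The integrity check in this example amounts to re-hashing whatever storage returns. The sketch below assumes SHA-256 hex digests as content addresses and uses a plain dictionary as a stand-in for the storage backend; `fetch_verified` and the exception class are hypothetical names.

```python
import hashlib

class ArtifactIntegrityMismatch(Exception):
    """Corresponds to the ARTIFACT_INTEGRITY_MISMATCH failure in the example."""

def fetch_verified(storage: dict, expected_hash: str) -> bytes:
    """Return the artifact only if its bytes hash to the address requested."""
    data = storage[expected_hash]  # KeyError here means the artifact is missing
    if hashlib.sha256(data).hexdigest() != expected_hash:
        raise ArtifactIntegrityMismatch(expected_hash)
    return data
```

A missing artifact and a corrupted artifact surface as different exceptions, which lets the caller apply the retrieval contract (re-request versus fail the job) separately for each case.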
1.5 Protocol → Chain (on-chain state and events)
- Minimal on-chain state: Confirm what must be on-chain for security and what can remain off-chain.
- Event determinism: Events should be derived from canonical data so off-chain indexers can reconstruct state.
Example: Store only {job_id, status, accepted_proof_hash, reward_amount} on-chain; keep verbose evidence off-chain.
2) Trust boundary review (who can lie, and what stops them)
Write down each trust assumption explicitly. If you can't state it, you probably haven't designed around it.
2.1 Node operator trust
- Assumption: Operators may be honest-but-buggy or actively malicious.
- Controls: Require signed task acceptance, proof-of-measurement format, and challenge windows.
Example: If an operator submits a proof, the protocol should be able to verify it without trusting the operator's narrative.
2.2 Verifier trust
- Assumption: Verifiers can be wrong or collude.
- Controls: Use threshold rules, independent verification steps, and bind verifier outputs to the same canonical proof hash.
Example: Verifier signatures must cover {job_id, proof_hash, verifier_id, epoch}.
2.3 Client trust
- Assumption: Clients can submit arbitrary parameters.
- Controls: Validate constraints, enforce pricing caps, and ensure accounting uses protocol-approved values.
Example: Even if a client claims "quality score 0.9," the protocol should compute or verify the score from evidence.
2.4 Storage trust
- Assumption: Storage can be unavailable or return wrong bytes.
- Controls: Hash anchoring and integrity checks.
3) Failure mode review (what breaks, and how you respond)
Treat failure modes as first-class design inputs.
3.1 Network and timing failures
Checklist:
- Retries are safe due to idempotency keys.
- Timeouts map to explicit statuses.
- Ordering assumptions are documented.
Example: If a result arrives after the end_window, the protocol should reject with WINDOW_EXPIRED rather than silently accept.
3.2 Node misbehavior
Checklist:
- Invalid proofs are rejected deterministically.
- Suspicious patterns trigger eligibility changes or slashing conditions.
- Disputes have a defined evidence format.
Example: If an operator reuses the same proof_blob_hash across different task_ids, the protocol flags REPLAY_SUSPECTED.
3.3 Proof invalidity and partial validity
Checklist:
- Cryptographic invalidity vs semantic invalidity are separated.
- The system records why verification failed.
Example: PROOF_SIGNATURE_INVALID (cryptographic) differs from MEASUREMENT_OUT_OF_BOUNDS (semantic).
3.4 Accounting drift and rounding
Checklist:
- Reward computation uses integer math or fixed-point rules.
- All multipliers are applied in a documented order.
- On-chain settlement matches off-chain previews.
Example: If you preview rewards off-chain, the preview must use the same rounding mode as settlement.
3.5 Governance and parameter mistakes
Checklist:
- Parameter updates are versioned.
- Jobs reference the parameter version used.
- Rollbacks are defined for operational errors.
Example: A job stores policy_version=7. Even if version 8 is activated later, settlement uses version 7.
4) Evidence and auditability review (make failures explainable)
Confirm that every important decision has an evidence trail.
- Decision inputs logged: Store hashes of inputs used for verification.
- Decision outputs recorded: Record accepted/rejected proof hashes and computed reward components.
- Challenge readiness: Ensure the protocol can reconstruct the evidence needed for disputes.
Example: For a rejected job, the protocol records {job_id, proof_hash, failure_code, verifier_ids} so operators can fix the right thing.
5) Quick scoring rubric (useful during reviews)
For each checklist item, assign one of:
- Pass: Clear contract and defined behavior.
- Partial: Contract exists but failure handling is vague.
- Fail: Missing or contradictory assumptions.
Example: If the interface says "retry on timeout" but does not define idempotency, mark it Fail.
6) Mini review scenario (end-to-end sanity test)
Run this scenario through your architecture:
- Client submits a job request with `request_id`.
- Protocol dispatches a task with `task_epoch`.
- Operator returns a result with `proof_blob_hash`.
- Verifiers sign the canonical hash.
- Protocol accepts or rejects, records the decision, and settles rewards.
- If rejected, the client can submit evidence during the challenge window.
At each step, verify that the interface contract and failure response are explicit. If you can't answer "what status is returned and why," the design still needs work.
15.2 Documentation Standards Example Specs, Runbooks, and Data Dictionaries
Good documentation is a system component: it reduces ambiguity, speeds up debugging, and makes audits less painful. This section defines a practical standard you can apply to any DePIN module: registry, measurement, verification, rewards, or client APIs.
Documentation set (what you write)
- Module Spec (normative): Defines interfaces, state transitions, invariants, and failure handling. Treat it like code comments that actually matter.
- Runbook (operational): Explains what to do when things go wrong, including alerts, triage steps, and rollback procedures.
- Data Dictionary (precise data model): Lists every field, type, unit, encoding, and validation rule. This is where "it's a number" stops being acceptable.
- Example Pack (executable clarity): Includes one or two end-to-end flows with concrete payloads and expected outcomes.
Mind map: documentation coverage
Module Spec template (with an example)
Use a consistent structure so reviewers can find answers quickly.
Module Spec: Verification Service (example)
- Purpose: Verify signed measurements and produce a verification result used for reward eligibility.
- Inputs:
  - `ProofSubmission` containing `nodeId`, `requestId`, `measurement`, `signature`, `freshness`.
- Outputs:
  - `VerificationResult` containing `requestId`, `status`, `qualityScore`, `evidenceHash`.
- State transitions:
  - `Pending -> Verified` when signature and freshness pass.
  - `Pending -> Rejected` when signature fails or evidence is stale.
- Invariants:
  - A `requestId` can be verified at most once per `nodeId`.
  - `evidenceHash` must equal `hash(measurement || measurementMeta)`.
- Failure modes:
  - Signature verification failure is non-retryable.
  - Storage retrieval failure is retryable up to `N=3` with exponential backoff.
- Idempotency:
  - If the same `proofSubmissionId` is received again, return the existing `VerificationResult`.
A small but important detail: specify what "qualityScore" means. For example, define it as an integer in [0, 100] derived from measurement bounds, not an arbitrary float that nobody can reproduce.
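The failure-mode rules in the spec above (non-retryable signature failures, storage retrieval retried up to `N=3` with exponential backoff) can be sketched as a small wrapper. The exception classes, the `with_retries` helper, and the 0.5 s base delay are hypothetical; "up to `N=3`" is read here as three retries after the first attempt, which the spec itself should state explicitly.

```python
import time

class RetryableError(Exception):
    """E.g. a storage retrieval failure."""

class NonRetryableError(Exception):
    """E.g. a signature verification failure."""

def with_retries(op, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call op(). Retry RetryableError up to max_retries times, waiting
    base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    Any other exception, including NonRetryableError, propagates immediately."""
    for attempt in range(max_retries + 1):
        try:
            return op()
        except RetryableError:
            if attempt == max_retries:
                raise  # retries exhausted
            sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter lets tests record the delays instead of waiting for them.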
Runbook template (with concrete triage steps)
Runbooks should be written for someone who is competent but not currently familiar with your system.
Runbook: Proof verification latency spike
- Trigger: Alert when `p95(verification_duration_ms) > 5000` for 10 minutes.
- Immediate actions (first 5 minutes):
  - Check whether the spike is global or limited to one stage by comparing:
    - `signature_verify_duration_ms`
    - `evidence_fetch_duration_ms`
    - `quality_scoring_duration_ms`
  - Look up a sample of failing requests using `correlationId` from the alert payload.
  - Confirm whether queue depth increased for `verification_tasks`.
- Triage decision:
  - If `evidence_fetch_duration_ms` dominates, verify object storage health and credentials.
  - If `signature_verify_duration_ms` dominates, check CPU saturation and key cache hit rate.
  - If `quality_scoring_duration_ms` dominates, validate that scoring parameters match the current protocol version.
- Mitigation:
  - Temporarily reduce verification concurrency from `C` to `C/2`.
  - If backlog grows, pause new submissions while keeping in-flight verifications running.
- Recovery:
  - Resume submissions after `p95` returns below `3000 ms` for 15 minutes.
  - Reprocess any tasks that were marked `timeout` but later became available.
- Post-incident notes:
  - Record the exact change set (config version, deployment hash) and the measured before/after metrics.
This runbook avoids "check everything" instructions. It tells you what to measure first, what to assume second, and what to change last.
Data dictionary standards (what "precise" means)
Every field in every payload should have:
- Type: e.g., `uint64`, `bytes32`, `string`.
- Encoding: e.g., hex with `0x` prefix, base64, UTF-8.
- Unit: e.g., milliseconds, meters, seconds.
- Constraints: ranges, allowed values, max length.
- Validation rules: exact checks, including canonicalization.
- Example value: one realistic instance.
Data Dictionary: VerificationResult (example)
| Field | Type | Encoding | Constraints | Validation | Example |
|---|---|---|---|---|---|
| requestId | bytes32 | hex 0x... | exactly 32 bytes | must match submitted request | 0x9f3a...c2 |
| nodeId | bytes32 | hex 0x... | exactly 32 bytes | must belong to active membership | 0x01ab...77 |
| status | enum | string | VERIFIED \| REJECTED | must be one of allowed values | VERIFIED |
| qualityScore | uint8 | decimal | 0..100 | computed from bounds | 87 |
| evidenceHash | bytes32 | hex 0x... | exactly 32 bytes | hash(measurement \|\| meta) | 0x3c10...aa |
| verifiedAt | uint64 | ms since epoch | >0 | must be monotonic per request | 1710000000123 |
Two practical rules:
- Canonicalize before hashing: specify the exact byte layout used for `evidenceHash`.
- Define enums as closed sets: never allow "other" unless you also define how it affects accounting.
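To make the canonicalization rule concrete, here is one possible byte layout, offered as a sketch rather than the layout this document's protocol uses: JSON with sorted keys, no insignificant whitespace, UTF-8 encoding, hashed with SHA-256. The wrapper object with `measurement` and `meta` keys is an assumption standing in for `measurement || meta`, and the example values use integers because float formatting is a common source of canonicalization drift.

```python
import hashlib
import json

def canonical_bytes(obj) -> bytes:
    """Sorted keys, compact separators, UTF-8: one byte layout per value."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")

def evidence_hash(measurement: dict, meta: dict) -> str:
    """Hash both parts under fixed keys so their boundary is unambiguous."""
    payload = {"measurement": measurement, "meta": meta}
    return "0x" + hashlib.sha256(canonical_bytes(payload)).hexdigest()

# Key order in the input must not change the hash.
a = evidence_hash({"value": 42000, "unit": "Wh"}, {"schemaVersion": 1})
b = evidence_hash({"unit": "Wh", "value": 42000}, {"schemaVersion": 1})
assert a == b
```

Whatever layout you choose, the data dictionary should name it precisely enough that two independent implementations produce the same 32 bytes.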
Example specs: payloads and expected outcomes
Example 1: ProofSubmission (happy path)
- `proofSubmissionId`: `0xabc...01`
- `requestId`: `0x9f3a...c2`
- `nodeId`: `0x01ab...77`
- `freshness`:
  - `nonce`: `0x55aa...10`
  - `timestampMs`: `1710000000123`
- `measurement`:
  - `value`: `42.0`
  - `unit`: `kWh`
  - `bounds`: `[41.8, 42.2]`
- `signature`: signature over `hash(canonicalProofSubmission)`
Expected outcome:
- `status = VERIFIED`
- `qualityScore = 87`
- `evidenceHash` equals the hash of canonical measurement bytes.
Example 2: ProofSubmission (rejected due to stale freshness)
- Same fields as above, but `timestampMs` is older than the allowed window.
Expected outcome:
- `status = REJECTED`
- `qualityScore` omitted or set to `0` (pick one and document it)
- `evidenceHash` still computed and returned for auditability
Mind map: data dictionary depth
Versioning and compatibility notes (keep it boring)
Document how schemas evolve:
- Additive changes: new optional fields with default behavior.
- Breaking changes: new `schemaVersion` and explicit migration steps.
- Hash changes: if canonicalization changes, specify whether old proofs remain verifiable and how `evidenceHash` is interpreted.
A good standard is: if a reviewer can't tell whether a field is safe to ignore, you haven't documented it yet.
15.3 Correctness Verification Example Invariants for Accounting and Eligibility
Correctness in DePIN accounting usually fails in boring ways: off-by-one eligibility windows, double-counted proofs, inconsistent rounding, or "valid" proofs that don't match the task they were meant for. The goal of this section is to define invariants (statements that must always be true), then show how to test them with concrete examples.
Invariants for Accounting
Assume a simple model:
- A task is created by a client request.
- Operators submit proofs for that task.
- A verifier accepts or rejects proofs.
- Accepted proofs earn rewards paid from an escrow.
Invariant A1: Conservation of escrow
Let E0 be the escrow amount deposited for a task, Espent the total paid out, and Erefund the amount refunded after settlement.
\[ E0 = Espent + Erefund \]
Why it matters: If you can't account for every unit of escrow, you can't guarantee payouts match eligibility.
Example:
- Escrow `E0 = 100` tokens.
- Two accepted proofs earn `30` and `40` tokens.
- Dispute window ends with no further payouts.
- Refund should be `100 - (30+40) = 30`.
A test should fail if the contract pays 70 but refunds 20 (sum 90), or pays 75 and refunds 30 (sum 105).
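Invariant A1 translates directly into an assertion helper. The function names below are hypothetical, and amounts are assumed to be integers in the smallest token unit.

```python
def expected_refund(e0: int, payouts: list) -> int:
    """Refund owed after settlement: E0 minus everything paid out."""
    spent = sum(payouts)
    if spent > e0:
        raise ValueError("payouts exceed escrow")
    return e0 - spent

def check_escrow_conservation(e0: int, payouts: list, refund: int) -> None:
    """Raise AssertionError unless E0 == Espent + Erefund."""
    spent = sum(payouts)
    assert spent + refund == e0, f"escrow drift: {e0} != {spent} + {refund}"

# The worked example: 100 deposited, 30 and 40 paid, 30 refunded.
assert expected_refund(100, [30, 40]) == 30
check_escrow_conservation(100, [30, 40], 30)
```

Running `check_escrow_conservation` after every settlement in a test suite catches both the under-refund (sum 90) and over-spend (sum 105) cases described above.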
Invariant A2: No double credit per proof
Each accepted proof has a unique identifier proofId. Let credited(proofId) be a boolean.
Invariant:
- For any `proofId`, if `credited(proofId) = true`, then it can never be credited again.
Example:
- Operator submits proof `P42`.
- Verifier accepts it.
- A retry message arrives with the same `P42`.
- The second submission must not increase rewards.
A practical implementation uses a mapping credited[proofId] checked before crediting.
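A minimal sketch of that mapping follows, using an in-memory ledger in place of contract storage. The `RewardLedger` class and its `balances` field are hypothetical; only the `credited[proofId]` check comes from the text.

```python
class RewardLedger:
    def __init__(self):
        self.credited = {}   # proofId -> True once credited
        self.balances = {}   # operatorId -> total reward credited

    def credit(self, proof_id: str, operator_id: str, amount: int) -> bool:
        """Credit at most once per proofId; a repeat is a no-op returning False."""
        if self.credited.get(proof_id):
            return False
        self.credited[proof_id] = True
        self.balances[operator_id] = self.balances.get(operator_id, 0) + amount
        return True

ledger = RewardLedger()
assert ledger.credit("P42", "opA", 40) is True
assert ledger.credit("P42", "opA", 40) is False   # retry does not pay again
assert ledger.balances == {"opA": 40}
```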
Invariant A3: Reward equals deterministic function of accepted inputs
Define a reward function R that depends only on accepted data:
- task parameters (e.g., `unitPrice`, `maxReward`)
- proof quality score `q`
- measurement bounds `b`
\[ reward = \min(\text{maxReward}, \text{unitPrice} \cdot f(q,b)) \]
Invariant: Given the same accepted proof and task parameters, reward must be identical across all nodes and contract calls.
Example:
- `unitPrice = 10`.
- `f(q,b)` returns a rational value that must be computed with integer math.
- If you use floating-point in off-chain code and then re-compute on-chain differently, you'll violate this invariant.
A test should compare off-chain computed reward to on-chain computed reward for the same accepted proof.
Invariant A4: Monotonic settlement state
Let settlement state be an enum: Open, Dispute, Finalized.
Invariant:
- State transitions are monotonic: `Open -> Dispute -> Finalized`.
- No transition can move backward.
Example:
- If a late proof arrives after `Finalized`, it must be ignored or rejected without changing balances.
Invariants for Eligibility
Eligibility defines who can be credited and under what conditions.
Invariant E1: Proof-task binding
A proof must be bound to the task it claims to satisfy.
Invariant:
- For accepted proof `p` on task `t`, the proof's `taskHash` (or equivalent binding) must equal the task's canonical hash.
Example:
- Operator reuses a proof from a previous task with similar parameters.
- Even if the measurement looks good, the binding must fail.
This prevents "proof reuse" attacks that look plausible at the data level.
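The binding check can be sketched as a hash comparison. The canonical encoding used here (sorted-key compact JSON hashed with SHA-256) and the dictionary field names are assumptions for illustration; the invariant only requires that the proof carries a hash equal to the task's canonical hash.

```python
import hashlib
import json

def task_hash(task: dict) -> str:
    """Canonical task hash under an assumed layout: sorted keys, compact JSON."""
    encoded = json.dumps(task, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()

def is_bound(proof: dict, task: dict) -> bool:
    """E1: the proof's taskHash must equal the task's canonical hash."""
    return proof.get("taskHash") == task_hash(task)

# Two tasks with the same spec still hash differently because taskId differs.
t1 = {"taskId": 1, "spec": "water-meter"}
t2 = {"taskId": 2, "spec": "water-meter"}
proof = {"taskHash": task_hash(t1), "measurement": 42}
assert is_bound(proof, t1)
assert not is_bound(proof, t2)
```

Including a unique field such as the task ID in the hashed payload is what makes "similar parameters" insufficient for reuse.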
Invariant E2: Eligibility window correctness
Let task t have an earliestSubmission and latestSubmission.
Invariant:
- A proof is eligible only if `timestamp(proof) ∈ [earliestSubmission, latestSubmission]`.
Example:
- If `latestSubmission` is inclusive, then a proof at exactly the boundary should count.
- If it's exclusive, it must not.
Pick one rule and test boundary timestamps explicitly.
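A boundary test for either choice might look like the following sketch. The `in_window` function and its `inclusive_end` flag are hypothetical; a real implementation would fix one rule rather than expose a switch, and the flag exists here only to show both behaviors side by side.

```python
def in_window(ts: int, earliest: int, latest: int, inclusive_end: bool = True) -> bool:
    """Eligibility window check; the start is always inclusive in this sketch."""
    if ts < earliest:
        return False
    return ts <= latest if inclusive_end else ts < latest

# Boundary cases for the window [1000, 2000]
assert in_window(1000, 1000, 2000)                           # start boundary counts
assert in_window(2000, 1000, 2000)                           # inclusive end: eligible
assert not in_window(2000, 1000, 2000, inclusive_end=False)  # exclusive end: not eligible
assert not in_window(999, 1000, 2000)
assert not in_window(2001, 1000, 2000)
```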
Invariant E3: Node status gating
Let node status be Active, Suspended, Removed.
Invariant:
- Only nodes with status `Active` at the time of proof submission can be eligible.
Example:
- Node is suspended after sending a proof.
- The proof should still be eligible if suspension happened later than submission.
This requires storing or querying status at submission time, not just current status.
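One way to answer "what was the status at submission time" is an append-only history of status changes with a binary search lookup. The `StatusHistory` class below is a hypothetical sketch; it assumes a change recorded at time `t` takes effect at `t` itself, which is another boundary your spec should state.

```python
from bisect import bisect_right

class StatusHistory:
    """Append-only list of (timestamp, status) changes for one node."""
    def __init__(self):
        self._times = []
        self._statuses = []

    def record(self, ts: int, status: str) -> None:
        if self._times and ts < self._times[-1]:
            raise ValueError("status changes must be recorded in time order")
        self._times.append(ts)
        self._statuses.append(status)

    def status_at(self, ts: int):
        i = bisect_right(self._times, ts)  # count of changes at or before ts
        return self._statuses[i - 1] if i else None

def eligible(history: StatusHistory, submitted_at: int) -> bool:
    """E3: gate on the status in effect when the proof was submitted."""
    return history.status_at(submitted_at) == "Active"

h = StatusHistory()
h.record(0, "Active")
h.record(1600, "Suspended")
assert eligible(h, 1500)       # submitted while Active, evaluated later
assert not eligible(h, 1700)   # submitted after suspension
```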
Invariant E4: Quality threshold enforcement
Let minQuality be a task parameter.
Invariant:
- Accepted proofs must satisfy `quality >= minQuality`.
Example:
- If `minQuality = 80` and a proof scores `79.999` due to rounding, define whether it fails or passes.
- Use integer-scaled quality (e.g., basis points) to avoid ambiguity.
Mind Map: Accounting + Eligibility Invariants

Example Test Scenarios (Concrete)
Scenario 1: Double submission of the same accepted proof
- Task escrow `E0 = 100`.
- Proof `P42` is accepted for task `T` with reward `40`.
- The same proof `P42` is submitted again.
Expected invariants:
- A2 holds: total credited remains `40`.
- A1 holds: escrow accounting still balances.
- A4 holds: state remains consistent (no re-opening).
Scenario 2: Proof reuse across tasks
- Tasks `T1` and `T2` have different canonical hashes.
- Operator submits proof `P` created for `T1` but claims it for `T2`.
Expected invariants:
- E1 is enforced: the binding check fails because `proof.taskHash != task.hash`, so the proof is rejected.
- No reward is credited, so A1 remains intact.
Scenario 3: Boundary timestamp eligibility
- Task window is `[1000, 2000]`.
- Proof `P` arrives at timestamp `2000`.
Expected invariants:
- E2 holds according to your chosen inclusivity.
- If you define inclusive end, it must be eligible; otherwise it must not.
Scenario 4: Node suspended after submission
- Node is `Active` at time `1500`.
- Node is suspended at time `1600`.
- Proof submitted at `1500` is evaluated at `1700`.
Expected invariants:
- E3 holds: eligibility is based on status at submission time.
- A3 holds: reward is computed from accepted inputs only.
Practical Verification Approach
To verify these invariants, treat them as properties of transitions:
- "When a proof is accepted, then A2 and E1 and E2 and E3 and E4 must all hold."
- "When settlement finalizes, then A1 and A4 must hold."
In tests, you don't just check outputs; you check that the system cannot reach a state that violates an invariant. That mindset turns accounting from "it seems right" into "it cannot be wrong without breaking a rule."
15.4 Security Review Checklist Example Threat Coverage and Control Mapping
A security review is most useful when it ties each threat to a specific control, then checks that the control is actually enforceable. This checklist is written to help you do that mapping without hand-waving.
Threat-to-Control Mind Map (Coverage Map)
Checklist: Map Each Threat to a Control (and a Test)
Use the table below as a working template. Each row should end with a concrete verification step.
| Threat | What goes wrong (example) | Control(s) you should have | How to verify it works |
|---|---|---|---|
| Node impersonation | An attacker submits proofs as a legitimate operator | Signed node identity; mutual authentication; admission control | Attempt submission with a non-member key; expect rejection at the boundary |
| Proof forgery | A client accepts a fabricated measurement | Proofs are signed by the measurement source; verifier checks signatures and format | Provide a proof with a valid signature over altered content; expect failure |
| Proof tampering in transit | Proof bytes change between operator and verifier | Hash anchoring; integrity checks on receipt | Flip one byte in transit; ensure the verifier rejects due to hash mismatch |
| Replay of old proofs | Old proofs are reused to claim rewards | Nonce/freshness binding to request; replay cache | Resubmit the same proof for a new request; expect rejection |
| Reordering of events | Settlement uses an outdated state | Deterministic event ordering; finality assumptions | Submit events out of order; confirm state machine rejects or waits |
| Double submission / double spend | Same work triggers multiple payouts | Idempotency keys; single-use request IDs | Submit the same request twice; verify only one settlement occurs |
| Challenge evasion | Disputes cannot be raised in time | Challenge windows; verifiable evidence submission | Try to challenge after expiry; confirm it is blocked and logged |
| Denial of service on verification | Verifier is overwhelmed by expensive checks | Rate limiting; staged verification; early rejection | Flood with invalid proofs; confirm CPU stays bounded |
| Storage poisoning | Retrieved proof artifacts are replaced | Content addressing; signature verification after retrieval | Point retrieval to wrong content hash; ensure mismatch triggers failure |
| Privacy leakage | Metadata reveals which client requested what | Minimize identifiers in off-chain messages; access control | Inspect logs and payloads; confirm sensitive fields are not emitted |
| Economic manipulation | Operator submits low-quality data but passes checks | Quality thresholds; multipliers tied to measurable signals | Use borderline-quality inputs; verify reward scaling matches rules |
| Governance parameter abuse | Malicious parameter change breaks safety | Role-based governance; timelocks; versioned compatibility | Propose an invalid parameter; ensure it fails validation and cannot execute |
| Key compromise blast radius | Stolen keys allow broad misuse | Key rotation; scoped keys; revocation propagation | Rotate keys and revoke old ones; confirm old signatures stop working |
Example Control Mapping: One End-to-End Scenario
Consider a simple flow: a client requests a measurement, an operator submits a proof, a verifier validates it, and settlement pays from escrow.
- Spoofing threat: An attacker tries to submit a proof pretending to be an operator.
  - Control: The operator identity is established during admission, and every proof includes an operator signature over the request ID and measurement payload.
  - Control mapping check: The verifier must reject proofs where the operator signature does not match the admitted operator key.
  - Test: Use a valid signature from a different admitted operator; confirm rejection.
- Replay threat: The attacker replays a previously valid proof for a new request.
  - Control: The proof is bound to a unique request ID and includes a freshness nonce issued by the verifier.
  - Control mapping check: The verifier must track used nonces or request IDs and reject duplicates.
  - Test: Submit the same proof bytes for a different request; confirm mismatch due to nonce binding.
- Tampering threat: Proof bytes are modified in transit.
  - Control: The verifier checks a content hash anchored in the signed proof envelope.
  - Control mapping check: If any byte changes, the hash check fails before expensive verification.
  - Test: Flip a single bit in the proof payload; confirm early rejection.
- Economic threat: The operator tries to get paid twice for the same request.
  - Control: Settlement uses a request ID as a unique key and enforces single-use payout.
  - Control mapping check: The contract or settlement module must be idempotent with respect to request ID.
  - Test: Submit two identical "ready to settle" messages; confirm only one payout event is emitted.
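The four controls in this scenario can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not a reference implementation: it uses an HMAC from the Python standard library as a stand-in for a real operator signature scheme, and the names Proof, Verifier, Settlement, sign_proof, and every reason code other than INVALID_SIGNATURE are hypothetical. Checks run cheapest first, so tampering is rejected before any key lookup.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Proof:
    operator_id: str
    request_id: str
    nonce: str
    payload: bytes
    payload_hash: str  # hex SHA-256 of payload, covered by the signature
    signature: str     # hex HMAC tag; a stand-in for a real operator signature


def _envelope(operator_id: str, request_id: str, nonce: str, payload_hash: str) -> bytes:
    # Simplification: assumes no field contains the "|" separator.
    return "|".join((operator_id, request_id, nonce, payload_hash)).encode()


def sign_proof(key: bytes, operator_id: str, request_id: str,
               nonce: str, payload: bytes) -> Proof:
    payload_hash = hashlib.sha256(payload).hexdigest()
    msg = _envelope(operator_id, request_id, nonce, payload_hash)
    tag = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return Proof(operator_id, request_id, nonce, payload, payload_hash, tag)


class Verifier:
    def __init__(self, admitted_keys: dict[str, bytes]) -> None:
        self.admitted_keys = admitted_keys            # set during admission
        self.issued_nonces: dict[str, str] = {}       # request ID -> issued nonce
        self.used_request_ids: set[str] = set()       # replay cache

    def issue_nonce(self, request_id: str, nonce: str) -> None:
        self.issued_nonces[request_id] = nonce

    def verify(self, proof: Proof) -> str:
        # 1. Tampering: cheap hash check before anything else.
        if hashlib.sha256(proof.payload).hexdigest() != proof.payload_hash:
            return "HASH_MISMATCH"
        # 2. Spoofing: the tag must verify under the admitted operator's key.
        key = self.admitted_keys.get(proof.operator_id)
        if key is None:
            return "UNKNOWN_OPERATOR"
        msg = _envelope(proof.operator_id, proof.request_id,
                        proof.nonce, proof.payload_hash)
        expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, proof.signature):
            return "INVALID_SIGNATURE"
        # 3. Replay: the nonce must be the one issued for this request,
        #    and each request ID is accepted at most once.
        if self.issued_nonces.get(proof.request_id) != proof.nonce:
            return "NONCE_MISMATCH"
        if proof.request_id in self.used_request_ids:
            return "REPLAY"
        self.used_request_ids.add(proof.request_id)
        return "ACCEPTED"


class Settlement:
    def __init__(self) -> None:
        self.payouts: dict[str, int] = {}

    def settle(self, request_id: str, amount: int) -> bool:
        # Idempotent on request ID: a second call is a no-op.
        if request_id in self.payouts:
            return False
        self.payouts[request_id] = amount
        return True
```

Because the request ID and nonce are inside the signed envelope, moving a proof to a different request invalidates the signature before the replay cache is even consulted. The replay cache then covers the remaining case of resubmitting the identical proof.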
Security Review Prompts (What to Ask During the Review)
- Boundary clarity: "Where exactly does trust begin and end?" Identify the first component that rejects unauthenticated inputs.
- Failure mode behavior: "What happens when verification is partially successful?" Ensure the system fails closed for integrity and freshness checks.
- Cost control: "Which checks are cheapest and should run first?" Put hash and signature checks before heavy computation.
- Auditability: "Can we reconstruct why a proof was accepted or rejected?" Require structured logs that include request ID, operator ID, and failure reason codes.
- Key handling: "How are keys stored, rotated, and revoked?" Confirm that revocation is enforced at verification time, not only at admission.
- Governance safety: "What prevents a parameter update from breaking invariants?" Validate parameter ranges and compatibility with existing request formats.
Minimal "Control Evidence" Checklist (So Reviewers Can Confirm Reality)
For each control you claim, collect evidence that it is implemented and enforced:
- A specification statement (what the control guarantees).
- A concrete enforcement point (which component rejects bad inputs).
- A test case (what input triggers the rejection).
- A log or event (how the system records the outcome).
If a threat has no control evidence, it is not covered yet. If a control exists but has no enforcement point, it is a policy, not security.
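The checklist and the two rules above can be turned into a small coverage report. The sketch below is illustrative only; the ControlEvidence record and the status strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlEvidence:
    threat: str
    specification: str = ""      # what the control guarantees
    enforcement_point: str = ""  # which component rejects bad inputs
    test_case: str = ""          # what input triggers the rejection
    log_event: str = ""          # how the outcome is recorded


def coverage_status(evidence: ControlEvidence) -> str:
    items = (evidence.specification, evidence.enforcement_point,
             evidence.test_case, evidence.log_event)
    if not any(items):
        return "not covered"   # no control evidence at all
    if not evidence.enforcement_point:
        return "policy only"   # a control without an enforcement point
    if not all(items):
        return "incomplete"    # enforced, but some evidence is missing
    return "covered"
```

Running coverage_status over every row of the threat table gives reviewers a list of threats that still need work before the review can close.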
15.5 Launch Readiness Checklist Example: Migration, Monitoring, and Rollback Plans
A launch plan is mostly about boring details: what changes, how you measure success, and what you do when reality disagrees with the spec. This checklist is written for a DePIN-style system with on-chain settlement, off-chain proof generation, and operator nodes.
Migration plan (what moves, when, and how you prove it)
A. Define the migration scope
- On-chain: contract addresses, registry formats, reward accounting rules, dispute/challenge parameters.
- Off-chain: proof schema versions, storage layout, indexing models, client request/response formats.
- Operators: node software version, signing keys, measurement/verification pipeline behavior.
B. Use versioned compatibility gates
- Introduce a protocol version field in every proof submission and every settlement-relevant event.
- Require the verifier to accept only supported versions and reject unknown ones with a clear error code.
Example: If you change a proof format from v1 to v2, keep v1 verification live until the last operator migrates. Clients can submit v2 proofs immediately, but settlement only credits v2 after the on-chain parameter update is activated.
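A version gate along these lines might look like the following sketch. It assumes submissions arrive as dictionaries carrying a protocol_version field; the VersionGate class, the field name, and all result codes except PROOF_VERSION_UNSUPPORTED are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VersionGate:
    # Versions the verifier can still check (v1 stays live during migration).
    supported: set = field(default_factory=lambda: {1, 2})
    # Versions that settlement currently credits.
    credited: set = field(default_factory=lambda: {1})

    def check(self, submission: dict) -> str:
        version = submission.get("protocol_version")
        if not isinstance(version, int):
            return "PROOF_VERSION_MISSING"
        if version not in self.supported:
            return "PROOF_VERSION_UNSUPPORTED"
        if version not in self.credited:
            # Verifiable, but not yet eligible for settlement.
            return "ACCEPTED_NOT_CREDITED"
        return "ACCEPTED_CREDITED"

    def activate(self, version: int) -> None:
        # Models the on-chain parameter update that starts crediting a version.
        if version not in self.supported:
            raise ValueError(f"cannot credit unsupported version {version}")
        self.credited.add(version)
```

Keeping "supported" and "credited" as separate sets is what lets v2 proofs flow through verification before the activation step, and lets v1 be retired later by removing it from both sets.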
C. Plan the activation order
- Deploy new verifier logic (off-chain and/or on-chain) in a way that can verify both old and new proof versions.
- Update client and operator software to produce the new proof version.
- Activate on-chain parameters that depend on the new proof semantics (e.g., quality multipliers, eligibility thresholds).
- Decommission old proof version only after you observe stable proof acceptance rates.
D. Migration rehearsal with a shadow run
- Run the new pipeline in shadow mode: generate v2 proofs while still submitting v1 for settlement.
- Compare acceptance outcomes and computed reward components between versions.
Example: If v2 introduces a stricter freshness check, you should see a predictable drop in accepted proofs during rehearsal. If the drop is sudden and large, you likely have a clock skew or timestamp parsing issue.
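One way to compare a shadow run is sketched below. It assumes each request has been evaluated under both versions and that rewards are integers in the smallest token unit; ShadowResult and compare_shadow_run are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowResult:
    request_id: str
    v1_accepted: bool
    v2_accepted: bool
    v1_reward: int  # smallest token unit
    v2_reward: int


def compare_shadow_run(results: list) -> dict:
    if not results:
        raise ValueError("no shadow results to compare")
    total = len(results)
    return {
        "v1_acceptance_rate": sum(r.v1_accepted for r in results) / total,
        "v2_acceptance_rate": sum(r.v2_accepted for r in results) / total,
        # Requests where the two versions disagree on acceptance.
        "diverged": [r.request_id for r in results
                     if r.v1_accepted != r.v2_accepted],
        # Reward differences for proofs both versions accepted.
        "reward_deltas": {r.request_id: r.v2_reward - r.v1_reward
                          for r in results
                          if r.v1_accepted and r.v2_accepted
                          and r.v1_reward != r.v2_reward},
    }
```

The "diverged" list is the place to start when the acceptance drop is larger than expected: sampling a few of those request IDs usually shows quickly whether the cause is the stricter rule or a parsing problem.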
E. Data migration checklist
- Proof storage: ensure content addressing (hash-based keys) still resolves correctly after schema changes.
- Indexing: rebuild read models from canonical event logs; do not rely on cached transformations.
- Key material: confirm operator signing keys are valid for the new signing domain and that rotation rules are enforced.
Monitoring plan (what you watch so you can act quickly)
Monitoring should map directly to failure modes: submission failures, proof invalidity, verification delays, and settlement mismatches.
Mind map: Launch monitoring signals
A. Define SLO-style thresholds (with concrete actions)
- Proof acceptance rate: alert if it drops below a chosen floor for 10 minutes.
- Verification latency: alert if p95 exceeds a threshold for 5 minutes.
- Reason-code spikes: alert when a single failure reason jumps sharply (e.g., INVALID_SIGNATURE or PROOF_VERSION_UNSUPPORTED).
Example: If PROOF_VERSION_UNSUPPORTED spikes right after deployment, you likely updated clients but not verifiers (or vice versa). The fastest fix is often a rollback of the activation step, not a full redeploy.
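The two alert shapes above (a sustained breach and a reason-code spike) can be expressed as small pure functions. This sketch assumes one acceptance-rate sample per minute and per-window counts of reason codes; the function names and the default factor and min_count values are illustrative, not recommended settings.

```python
from collections import Counter


def sustained_breach(per_minute_rates: list, floor: float, minutes: int) -> bool:
    """True if the last `minutes` samples are all below `floor`."""
    if len(per_minute_rates) < minutes:
        return False  # not enough data to call it sustained
    return all(rate < floor for rate in per_minute_rates[-minutes:])


def reason_code_spikes(baseline: Counter, current: Counter,
                       factor: float = 5.0, min_count: int = 20) -> list:
    """Reason codes whose count jumped by more than `factor` over baseline."""
    spikes = []
    for code, count in current.items():
        # Treat a zero baseline as 1 so brand-new codes can still spike.
        if count >= min_count and count > factor * max(baseline.get(code, 0), 1):
            spikes.append(code)
    return sorted(spikes)
```

The min_count guard keeps rare codes from paging anyone when they move from one occurrence to six.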
B. Track reconciliation explicitly
- Maintain a job that compares:
  - expected reward components computed off-chain, vs.
  - actual credited amounts from on-chain events.
- Store the delta breakdown by operator, client, and proof version.
Example: A rounding rule change might cause small deltas that are harmless for a single event but unacceptable in aggregate. Reconciliation turns that into a measurable, stoppable condition.
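A reconciliation job of this kind can be sketched as follows. It assumes both sides can be reduced to one entry per request ID with an integer amount in the smallest token unit; the Entry record and the reconcile function are hypothetical names.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    request_id: str
    operator: str
    client: str
    proof_version: int
    amount: int  # smallest token unit; integers avoid float rounding


def reconcile(expected: list, credited: list) -> tuple:
    credited_by_request = {e.request_id: e.amount for e in credited}
    expected_ids = {e.request_id for e in expected}
    breakdown = defaultdict(list)
    for e in expected:
        # A request with no on-chain credit counts as credited 0.
        delta = credited_by_request.get(e.request_id, 0) - e.amount
        if delta != 0:
            key = (e.operator, e.client, e.proof_version)
            breakdown[key].append((e.request_id, delta))
    # Credits that the off-chain computation never expected.
    unexpected = sorted(rid for rid in credited_by_request
                        if rid not in expected_ids)
    return dict(breakdown), unexpected
```

Per-request deltas are kept as a list, not summed, so that a +1 and a -1 for the same operator do not cancel out and hide two real mismatches.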
C. Instrument correlation IDs end-to-end
- Include a correlation ID in:
  - client request,
  - operator task assignment,
  - proof submission,
  - verifier processing,
  - settlement event emission.
This makes it possible to answer: "Which proof caused which settlement event?" without guessing.
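As a rough sketch, the five stages can share one structured log format keyed by correlation ID. The stage names, the in-memory list used as a log sink, and the helper functions below are assumptions made for illustration.

```python
import json

STAGES = (
    "client_request",
    "task_assignment",
    "proof_submission",
    "verifier_processing",
    "settlement_event",
)


def log_event(sink: list, stage: str, correlation_id: str, **fields) -> None:
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    record = {"stage": stage, "correlation_id": correlation_id, **fields}
    sink.append(json.dumps(record, sort_keys=True))  # one JSON object per line


def trace(sink: list, correlation_id: str) -> list:
    """All events for one request, in the order they were logged."""
    events = (json.loads(line) for line in sink)
    return [e for e in events if e["correlation_id"] == correlation_id]


def missing_stages(sink: list, correlation_id: str) -> list:
    """Stages that never logged this correlation ID."""
    seen = {e["stage"] for e in trace(sink, correlation_id)}
    return [stage for stage in STAGES if stage not in seen]
```

A missing_stages result that ends at verifier_processing tells you where a request stalled without reading any component's internal logs.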
Rollback plan (how you stop the bleeding without breaking everything)
Rollback should be designed as a sequence, not a single button.
A. Classify rollback types
- Soft rollback: revert activation flags/parameters so new proofs stop being credited.
- Medium rollback: revert verifier logic to the previous version while keeping the new client/operator running.
- Hard rollback: revert contract logic or redeploy critical components (rare; requires careful state handling).
B. Prefer feature flags and version gates
- Keep both proof versions verifiable during the rollout window.
- Use an on-chain (or config) switch like activeProofVersion to control settlement crediting.
Example: If v2 proofs are being accepted but settlement deltas are wrong, you can set activeProofVersion = v1 immediately. Operators can keep producing v2 proofs, but they won't affect settlement until you fix the accounting.
C. Rollback decision triggers
- Settlement mismatch exceeds tolerance (e.g., any non-zero mismatch for a category that must match exactly).
- Proof acceptance rate collapse with a reason-code that indicates a systemic parsing or signature issue.
- On-chain transaction failures rising above a threshold (e.g., due to gas estimation changes or invalid calldata).
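These triggers are easier to test in staging when they are written as a pure function over a monitoring snapshot. In the sketch below, the Snapshot fields, the default thresholds, the trigger names, and the PROOF_PARSE_ERROR code are all hypothetical; only INVALID_SIGNATURE and PROOF_VERSION_UNSUPPORTED appear elsewhere in this section.

```python
from dataclasses import dataclass

# Reason codes that suggest a systemic parsing or signature problem.
SYSTEMIC_CODES = {"INVALID_SIGNATURE", "PROOF_VERSION_UNSUPPORTED",
                  "PROOF_PARSE_ERROR"}


@dataclass(frozen=True)
class Snapshot:
    exact_match_mismatch: int  # mismatch in a category that must match exactly
    acceptance_rate: float     # fraction of proofs accepted in the window
    top_reason_code: str       # most frequent rejection reason in the window
    tx_failure_rate: float     # fraction of on-chain transactions failing


def rollback_triggers(snapshot: Snapshot,
                      acceptance_floor: float = 0.9,
                      tx_failure_ceiling: float = 0.05) -> list:
    fired = []
    if snapshot.exact_match_mismatch != 0:
        fired.append("SETTLEMENT_MISMATCH")
    if (snapshot.acceptance_rate < acceptance_floor
            and snapshot.top_reason_code in SYSTEMIC_CODES):
        fired.append("ACCEPTANCE_COLLAPSE")
    if snapshot.tx_failure_rate > tx_failure_ceiling:
        fired.append("TX_FAILURES")
    return fired
```

A low acceptance rate alone does not fire here: if the top reason code is a quality rejection, the system may be working as designed and a rollback would not help.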
D. Rollback execution steps (example sequence)
- Freeze crediting: set activeProofVersion to the last known-good version.
- Stop new activations: disable any scheduled parameter updates.
- Quarantine new proofs: mark v2 submissions as "received but not eligible" in off-chain indexing.
- Revert verifier config: switch verifier to the previous configuration that matches the active proof version.
- Post-rollback validation: confirm that acceptance and settlement reconciliation return to baseline.
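The sequence above can be rehearsed against a simple in-memory model before it is wired to real systems. This is a sketch only: SystemState and its fields are hypothetical stand-ins for the on-chain switch, the scheduler, the indexer, and the verifier configuration, and the validate callback represents whatever baseline check the team defines.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SystemState:
    active_proof_version: int = 2
    scheduled_updates_enabled: bool = True
    quarantined_versions: set = field(default_factory=set)
    verifier_config: str = "v2"
    log: list = field(default_factory=list)


def rollback(state: SystemState, last_good: int, bad: int,
             validate: Callable[[SystemState], bool]) -> bool:
    # 1. Freeze crediting on the last known-good version.
    state.active_proof_version = last_good
    state.log.append("freeze_crediting")
    # 2. Stop any scheduled parameter updates from firing mid-rollback.
    state.scheduled_updates_enabled = False
    state.log.append("stop_activations")
    # 3. Quarantine the bad version: received, but not eligible.
    state.quarantined_versions.add(bad)
    state.log.append("quarantine_new_proofs")
    # 4. Match the verifier config to the active proof version.
    state.verifier_config = f"v{last_good}"
    state.log.append("revert_verifier_config")
    # 5. Confirm acceptance and reconciliation are back to baseline.
    ok = validate(state)
    state.log.append("validated" if ok else "validation_failed")
    return ok
```

Logging each step in order gives the post-incident review a record of exactly how far the rollback got if validation fails.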
E. Communication inside the system
- Ensure clients receive a stable error message when they submit a proof version that is not currently eligible.
- Operators should get a clear status so they don't keep retrying blindly.
Mind map: Rollback playbook
Example launch checklist (copy/paste friendly)
- Migration rehearsal completed in shadow mode; acceptance and reconciliation deltas understood.
- Version gates implemented for proof submission and settlement eligibility.
- Activation order followed: verifier compatibility first, then client/operator updates, then on-chain parameter activation.
- Monitoring dashboards include: acceptance rate, reason codes, proof latency, tx success rate, reconciliation delta.
- Alert thresholds defined with clear runbook steps.
- Correlation IDs wired through client → operator → verifier → settlement.
- Rollback triggers documented and tested in staging.
- Rollback sequence prepared: freeze crediting → disable updates → revert verifier config → validate.
- Runbook includes "what to check first" based on the top 3 alert reason codes.
This is the part of the project where you earn the right to sleep: you make the system observable enough to diagnose quickly, and controllable enough to stop impact without inventing new failure modes.