From 2e23c644487dc65ac3310afaf4828a0ad9d088ab Mon Sep 17 00:00:00 2001
From: Ignotus Peverell
Date: Tue, 8 Nov 2016 09:50:13 -0800
Subject: [PATCH] Intro to pruning doc, just some facts and size data.

---
 doc/pruning.md | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)
 create mode 100644 doc/pruning.md

diff --git a/doc/pruning.md b/doc/pruning.md
new file mode 100644
index 000000000..90c3893c1
--- /dev/null
+++ b/doc/pruning.md
@@ -0,0 +1,43 @@
+# Pruning Blockchain Data
+
+One of the principal attractions of MimbleWimble is its theoretical space
+efficiency. Indeed, a trusted or pre-validated full blockchain state only
+requires unspent transaction outputs, which could be tiny.
+
+The grin blockchain includes the following types of data (we assume prior
+understanding of the MimbleWimble protocol):
+
+1. Transaction outputs, which include for each output:
+    1. A Pedersen commitment (33 bytes).
+    2. A range proof (over 5KB at this time).
+2. Transaction inputs which are just output references (32 bytes).
+3. Transaction "proofs", which include for each transaction:
+    1. The excess commitment sum for the transaction (33 bytes).
+    2. A signature generated with the excess (71 bytes average).
+4. A block header includes Merkle trees and proof of work (about 250 bytes).
+
+Assuming a blockchain of a million blocks, 10 million transactions (2 inputs, 2.5
+outputs average) and 100,000 unspent outputs, we get the following approximate
+sizes with a full chain (no pruning, no cut-through):
+
+* 128GB of transaction data (inputs and outputs).
+* 1 GB of transaction proof data.
+* 250MB of block headers.
+* Total chain size around 130GB.
+* Total chain size, after cut-through (but incl. headers) of 1.8GB.
+* UTXO size of 520MB.
+* Total chain size, without range proofs of 4GB.
+* UTXO size, without range proofs of 3.3MB.
+
+We note that out of all that data, once the chain has been fully validated, only
+the set of UTXO commitments is strictly required for a node to function.
+
+There may be several contexts in which data can be pruned:
+
+* A fully validating node may get rid of some data it has already validated to
+free space.
+* A partially validating node (similar to SPV) may not do full validation and
+hence not be interested in either receiving or keeping all the data.
+* When a new node joins the network, it may temporarily behave as a partially
+validating node to make it available for use faster, even if it ultimately becomes
+a fully validating node.
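
The size estimates in the patch can be roughly reproduced from the per-item byte counts it lists. The sketch below is not part of the patch or of grin. The function name `chain_sizes` and the dictionary keys are hypothetical, the range proof is assumed to be exactly 5 KiB (the doc only says "over 5KB"), and the post-cut-through chain is assumed to consist of the UTXO set plus all transaction proofs and headers.

```python
# Per-item sizes in bytes, taken from the list in the patch.
COMMITMENT = 33          # Pedersen commitment
RANGE_PROOF = 5 * 1024   # ASSUMPTION: "over 5KB" taken as exactly 5 KiB
INPUT_REF = 32           # output reference used as a transaction input
EXCESS = 33              # excess commitment sum per transaction
SIGNATURE = 71           # average signature size per transaction
HEADER = 250             # approximate block header size


def chain_sizes(blocks=1_000_000, txs=10_000_000, inputs_per_tx=2,
                outputs_per_tx=2.5, utxos=100_000):
    """Return approximate sizes in bytes for the scenario in the doc."""
    outputs = int(txs * outputs_per_tx)
    inputs = txs * inputs_per_tx

    tx_data = outputs * (COMMITMENT + RANGE_PROOF) + inputs * INPUT_REF
    proofs = txs * (EXCESS + SIGNATURE)
    headers = blocks * HEADER
    utxo = utxos * (COMMITMENT + RANGE_PROOF)

    return {
        "tx_data": tx_data,
        "proofs": proofs,
        "headers": headers,
        "full_chain": tx_data + proofs + headers,
        # ASSUMPTION: cut-through keeps unspent outputs, proofs and headers.
        "after_cut_through": utxo + proofs + headers,
        "utxo": utxo,
        # Same full chain, with every range proof dropped from the outputs.
        "chain_no_range_proofs": (outputs * COMMITMENT + inputs * INPUT_REF
                                  + proofs + headers),
        "utxo_no_range_proofs": utxos * COMMITMENT,
    }


sizes = chain_sizes()
for name, size in sizes.items():
    # Decimal units (1 GB = 10**9 bytes), matching the rounded figures in the doc.
    print(f"{name:>22}: {size / 1e9:8.3f} GB")
```

Under these assumptions most results land close to the quoted figures: about 129.5GB of transaction data, 1.04GB of proofs, 250MB of headers, 1.8GB after cut-through and 3.3MB for the UTXO set without range proofs. The one exception is the full chain without range proofs, which comes out near 2.8GB in this sketch rather than the quoted 4GB, so that figure may rest on inputs not listed in the doc.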