# Pruning Blockchain Data

*Read this in other languages: [Korean](pruning_KR.md).*

One of the principal attractions of MimbleWimble is its theoretical space
efficiency. Indeed, a trusted or pre-validated full blockchain state only
requires unspent transaction outputs, which could be tiny.

The grin blockchain includes the following types of data (we assume prior
understanding of the MimbleWimble protocol):

1. Transaction outputs, which include for each output:
    1. A Pedersen commitment (33 bytes).
    2. A range proof (over 5KB at this time).
2. Transaction inputs, which are just output references (32 bytes).
3. Transaction "proofs", which include for each transaction:
    1. The excess commitment sum for the transaction (33 bytes).
    2. A signature generated with the excess (71 bytes average).
4. Block headers, which include Merkle trees and proof of work (about 250 bytes).

Assuming a blockchain of a million blocks, 10 million transactions (2 inputs, 2.5
outputs average) and 100,000 unspent outputs, we get the following approximate
sizes with a full chain (no pruning, no cut-through):

* 128GB of transaction data (inputs and outputs).
* 1GB of transaction proof data.
* 250MB of block headers.
* Total chain size around 130GB.
* Total chain size after cut-through (but including headers) of 1.8GB.
* UTXO size of 520MB.
* Total chain size without range proofs of 4GB.
* UTXO size without range proofs of 3.3MB.

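The arithmetic behind most of these figures can be reproduced with a short sketch. It assumes a range proof of exactly 5 * 1024 bytes (the text only says "over 5KB") and uses decimal gigabytes, so the results land close to, but not exactly on, the rounded numbers above. The function and constant names are illustrative only.

```python
# Rough storage estimate from the per-item sizes listed in this document.
# Assumption: one range proof is taken as exactly 5 * 1024 bytes.

COMMITMENT = 33          # Pedersen commitment, bytes
RANGE_PROOF = 5 * 1024   # assumed size of one range proof, bytes
INPUT_REF = 32           # output reference, bytes
EXCESS = 33              # excess commitment, bytes
SIGNATURE = 71           # average signature size, bytes
HEADER = 250             # approximate block header size, bytes

BLOCKS = 1_000_000
TXS = 10_000_000
INPUTS_PER_TX = 2
OUTPUTS_PER_TX = 2.5
UNSPENT = 100_000


def estimate_sizes():
    """Return approximate sizes in bytes for each category."""
    output_size = COMMITMENT + RANGE_PROOF
    tx_data = TXS * (INPUTS_PER_TX * INPUT_REF + OUTPUTS_PER_TX * output_size)
    proofs = TXS * (EXCESS + SIGNATURE)
    headers = BLOCKS * HEADER
    utxo = UNSPENT * output_size
    return {
        "transaction data": tx_data,
        "transaction proofs": proofs,
        "block headers": headers,
        "full chain": tx_data + proofs + headers,
        # After cut-through only kernels, headers and unspent outputs remain.
        "after cut-through": proofs + headers + utxo,
        "UTXO set": utxo,
        "UTXO set without range proofs": UNSPENT * COMMITMENT,
    }


if __name__ == "__main__":
    for name, size in estimate_sizes().items():
        print(f"{name:32s} {size / 1e9:8.3f} GB")
```
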
We note that out of all that data, once the chain has been fully validated, only
the set of UTXO commitments is strictly required for a node to function.

There may be several contexts in which data can be pruned:

* A fully validating node may get rid of some data it has already validated to
  free space.
* A partially validating node (similar to SPV) may not be interested in either
  receiving or keeping all the data.
* When a new node joins the network, it may temporarily behave as a partially
  validating node to make it available for use faster, even if it ultimately becomes
  a fully validating node.

## Validation of Fully Pruned State

Pruning needs to remove as much data as possible while keeping all the
guarantees of a full MimbleWimble-style validation. This is necessary to keep
a pruning node's state sane, and also during the initial fast sync, where only
the minimum amount of data is sent to a new node.

The full validation of the chain state requires that:

* All kernel signatures verify against their public keys.
* The sum of all UTXO commitments, minus the supply, is a valid public key (can
  be used to sign the empty string).
* The sum of all kernel pubkeys equals the sum of all UTXO commitments, minus
  the supply.
* The root hashes of the UTXO PMMR, the range proofs PMMR and the kernels MMR
  match a block header with a valid Proof of Work chain.
* All range proofs are valid.

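The sum condition can be illustrated with a toy model. This is a sketch only, not real cryptography: it replaces elliptic-curve points with integers modulo a prime (so "adding" commitments becomes modular multiplication), and it ignores fees and any other details of real transactions. The modulus, the bases `G` and `H`, and all function names are assumptions made for the example.

```python
# Toy model of the kernel-sum check. NOT real cryptography: the group here is
# the integers modulo a prime, chosen purely for illustration.

P = 2**127 - 1   # a Mersenne prime, used as the toy group modulus
G = 3            # arbitrary base standing in for the blinding generator
H = 5            # arbitrary base standing in for the value generator


def commit(blind, value):
    """Toy Pedersen commitment: G^blind * H^value mod P."""
    return pow(G, blind % (P - 1), P) * pow(H, value % (P - 1), P) % P


def add(*points):
    """Group operation, standing in for elliptic-curve point addition."""
    result = 1
    for point in points:
        result = result * point % P
    return result


def negate(point):
    """Group inverse, computed with Fermat's little theorem."""
    return pow(point, P - 2, P)


def sums_balance(utxo_commitments, kernel_excesses, supply):
    """Check that sum(UTXOs) - supply*H equals sum(kernel excesses)."""
    lhs = add(*utxo_commitments, negate(commit(0, supply)))
    rhs = add(*kernel_excesses)
    return lhs == rhs


# A coinbase creates 60 coins, which are later split into 40 + 20.
r1, r2, r3 = 1111, 2222, 3333               # arbitrary blinding factors
coinbase_kernel = commit(r1, 0)             # excess of the coinbase
spend_kernel = commit(r2 + r3 - r1, 0)      # excess of the spending transaction
utxos = [commit(r2, 40), commit(r3, 20)]    # the spent coinbase output is gone

print(sums_balance(utxos, [coinbase_kernel, spend_kernel], supply=60))
```

The spent coinbase output never appears in the check: only the unspent outputs, the kernels, and the total supply are needed, which is why the rest can be pruned.
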
In addition, while not necessary to validate the full chain state, additional
data is required to accept and validate new blocks:

* The output features, making the full output data necessary for all UTXOs.

At minimum, this requires the following data:

* The block headers chain.
* All kernels, in order of inclusion in the chain. This also allows the
  reconstruction of the kernel MMR.
* All unspent outputs.
* The UTXO MMR and the range proof MMR (to learn the hashes of pruned data).

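Why the hashes of pruned data are enough can be shown with a simplified sketch. It uses a plain binary Merkle tree with SHA-256 rather than grin's actual MMR layout and hashing, and the tuple-based tree representation is an assumption made for the example. The point is only that a fully spent subtree can be replaced by its hash without changing the root.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash function for the toy tree (SHA-256, an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


# A node is one of:
#   ("leaf", data)         full leaf data is stored
#   ("hash", digest)       only the hash is kept (pruned)
#   ("node", left, right)  inner node with two children


def root(node) -> bytes:
    """Compute the hash of a (possibly pruned) tree."""
    kind = node[0]
    if kind == "leaf":
        return h(node[1])
    if kind == "hash":
        return node[1]
    _, left, right = node
    return h(root(left) + root(right))


def prune(node, spent):
    """Replace spent leaves, and subtrees that are entirely spent, by hashes."""
    kind = node[0]
    if kind == "leaf":
        return ("hash", h(node[1])) if node[1] in spent else node
    if kind == "hash":
        return node
    left, right = prune(node[1], spent), prune(node[2], spent)
    if left[0] == "hash" and right[0] == "hash":
        # Both children are pruned, so one parent hash replaces them.
        return ("hash", h(left[1] + right[1]))
    return ("node", left, right)


tree = ("node",
        ("node", ("leaf", b"out0"), ("leaf", b"out1")),
        ("node", ("leaf", b"out2"), ("leaf", b"out3")))

pruned = prune(tree, spent={b"out0", b"out1", b"out2"})
assert root(pruned) == root(tree)  # the root is unchanged by pruning
```
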
Note that further pruning could be obtained by requiring the validation of
only a subset of the range proofs, chosen randomly by the validating node.

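A minimal sketch of such random selection follows. The text does not say how large the subset should be or how it would be drawn, so the function names, the use of the operating system's random source, and the detection formula (standard sampling without replacement) are all assumptions added for illustration.

```python
import math
import random


def choose_proofs_to_check(total, sample_size, rng=None):
    """Pick which range proofs (by index) this node will verify."""
    # SystemRandom draws from the OS entropy source, so a peer cannot
    # predict which proofs will be checked.
    rng = rng or random.SystemRandom()
    return sorted(rng.sample(range(total), sample_size))


def detection_probability(total, invalid, sample_size):
    """Chance that a uniform sample contains at least one invalid proof."""
    missed = math.comb(total - invalid, sample_size) / math.comb(total, sample_size)
    return 1 - missed


if __name__ == "__main__":
    print(choose_proofs_to_check(total=100_000, sample_size=10))
    print(detection_probability(total=100_000, invalid=100, sample_size=1_000))
```
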