grin/doc/chain/chain_sync.md
Yeastplume 5cb8025ddd
[1.1.0] Merge master into 1.1.0 (#2720)
* cleanup legacy "3 dot" check (#2625)

* Allow to peers behind NAT to get up to preferred_max connections (#2543)

Allow to peers behind NAT to get up to preffered_max connections

If peer has only outbound connections it's mot likely behind NAT and we should not stop it from getting more outbound connections

* Reduce usage of unwrap in p2p crate (#2627)

Also change store crate a bit

* Simplify (and fix) output_pos cleanup during chain compaction (#2609)

* expose leaf pos iterator
use it for various things in txhashset when iterating over outputs

* fix

* cleanup

* rebuild output_pos index (and clear it out first) when compacting the chain

* fixup tests

* refactor to match on (output, proof) tuple

* add comments to compact() to explain what is going on.

* get rid of some boxing around the leaf_set iterator

* cleanup

* [docs] Add switch commitment documentation (#2526)

* remove references to no-longer existing switch commitment hash

(as switch commitments were removed in ca8447f3bd
and moved into the blinding factor of the Pedersen Commitment)

* some rewording (points vs curves) and fix of small formatting issues

* Add switch commitment documentation

* [docs] Documents in grin repo had translated in Korean.  (#2604)

*  Start to M/W intro translate in Korean
*  translate in Korean
*  add korean translation  on intro
* table_of_content.md translate in Korean.
*  table_of_content_KR.md finish translate in Korean, start to translate State_KR.md
*  add state_KR.md & commit some translation in State_KR.md
*  WIP stat_KR.md translation
*  add build_KR.md && stratum_KR.md
*  finish translate stratum_KR.md & table_of_content_KR.md
*  rename intro.KR.md to intro_KR.md
*  add intro_KR.md file path each language's  intro.md
*  add Korean translation file path to stratum.md & table_of_contents.md
*  fix difference with grin/master

* Fix TxHashSet file filter for Windows. (#2641)

* Fix TxHashSet file filter for Windows.

* rustfmt

* Updating regexp

* Adding in test case

* Display the current download rate rather than the average when syncing the chain (#2633)

* When syncing the chain, calculate the displayed download speed using the current rate from the most recent iteration, rather than the average download speed from the entire syncing process.

* Replace the explicitly ignored variables in the pattern with an implicit ignore

* remove root = true from editorconfig (#2655)

* Add Medium post to intro (#2654)

Spoke to @yeastplume who agreed it makes sense to add the "Grin Transactions Explained, Step-by-Step" Medium post to intro.md

Open for suggestions on a better location.

* add a new configure item for log_max_files (#2601)

* add a new configure item for log_max_files

* rustfmt

* use a constant instead of multiple 32

* rustfmt

* Fix the build warning of deprecated trim_right_matches (#2662)

* [DOC] state.md, build.md and chain directory documents translate in Korean.  (#2649)

*  add md files for translation.

*  start to translation fast-sync, code_structure. add file build_KR.md, states_KR.md

* add dandelion_KR.md && simulation_KR.md for Korean translation.

*  add md files for translation.

*  start to translation fast-sync, code_structure. add file build_KR.md, states_KR.md

* add dandelion_KR.md && simulation_KR.md for Korean translation.

* remove some useless md files for translation. this is rearrange set up translation order.

*  add dot end of sentence & translate build.md in korean

*  remove fast-sync_KR.md

*  finish build_KR.md translation

*  finish build_KR.md translation

*  finish translation state_KR.md & add phrase in state.md to move other language md file

* translate blocks_and_headers.md && chain_sync.md in Korean

*  add . in chain_sync.md , translation finished in doc/chain dir.

* fix some miss typos

* Api documentation fixes (#2646)

* Fix the API documentation for Chain Validate (v1/chain/validate).  It was documented as a POST, but it is actually a GET request, which can be seen in its handler ChainValidationHandler
* Update the API V1 route list response to include the headers and merkleproof routes.  Also clarify that for the chain/outputs route you must specify either byids or byheight to select outputs.

* refactor(ci): reorganize CI related code (#2658)

Break-down the CI related code into smaller more maintainable pieces.

* Specify grin or nanogrins in API docs where applicable (#2642)

* Set Content-Type in API client (#2680)

* Reduce number of unwraps in chain crate (#2679)

* fix: the restart of state sync doesn't work sometimes (#2687)

* let check_txhashset_needed return true on abnormal case (#2684)

*  Reduce number of unwwaps in api crate  (#2681)

* Reduce number of unwwaps in api crate

* Format use section

* Small QoL improvements for wallet developers (#2651)

* Small changes for wallet devs

* Move create_nonce into Keychain trait

* Replace match by map_err

* Add flag to Slate to skip fee check

* Fix secp dependency

* Remove check_fee flag in Slate

* Add Japanese edition of build.md (#2697)

* catch the panic to avoid peer thread quit early (#2686)

* catch the panic to avoid peer thread quit before taking the chance to ban
* move catch wrapper logic down into the util crate
* log the panic info
* keep txhashset.rs untouched
* remove a warning

* [DOC] dandelion.md, simulation.md ,fast-sync.md and pruning.md documents translate in Korean. (#2678)

* Show response code in API client error message (#2683)

It's hard to investigate what happens when an API client error is
printed out

* Add some better logging for get_outputs_by_id failure states (#2705)

* Switch commitment doc fixes (#2645)

Fix some typos and remove the use of parentheses in a
couple of places to make the reading flow a bit better.

* docs: update/add new README.md badges (#2708)

Replace existing badges with SVG counterparts and add a bunch of new ones.

* Update intro.md (#2702)

Add mention of censoring attack prevented by range proofs

* use sandbox folder for txhashset validation on state sync (#2685)

* use sandbox folder for txhashset validation on state sync

* rustfmt

* use temp directory as the sandbox instead actual db_root txhashset dir

* rustfmt

* move txhashset overwrite to the end of full validation

* fix travis-ci test

* rustfmt

* fix: hashset have 2 folders including txhashset and header

* rustfmt

* 
(1)switch to rebuild_header_mmr instead of copy the sandbox header mmr 
(2)lock txhashset when overwriting and opening and rebuild

* minor improve on sandbox_dir

* add Japanese edition of state.md (#2703)

* Attempt to fix broken TUI locale (#2713)

Can confirm that on the same machine 1.0.2 TUI looks great and is broken on
the current master. Bump of `cursive` version fixed it for me.
Fixes #2676

* clean the header folder in sandbox (#2716)

* forgot to clean the header folder in sandbox in #2685

* Reduce number of unwraps in servers crate (#2707)

It doesn't include stratum server which is sufficiently changed in 1.1
branch and adapters, which is big enough for a separate PR.

* rustfmt

* change version to beta
2019-04-01 11:47:48 +01:00

6.3 KiB

Blockchain Syncing

Read this in other languages: Korean.

We describe here the different methods used by a new node when joining the network to catch up with the latest chain state. We start with reminding the reader of the following assumptions, which are all characteristics of Grin or MimbleWimble:

  • All block headers include the root hash of all unspent outputs in the chain at the time of that block.
  • Inputs or outputs cannot be tampered with or forged without invalidating the whole block state.

We're purposefully only focusing on major node types and high level algorithms that may impact the security model. Detailed heuristics that can provide some additional improvements (like header first), while useful, will not be mentioned in this section.

Full History Syncing

Description

This model is the one used by "full nodes" on most major public blockchains. The new node has prior knowledge of the genesis block. It connects to other peers in the network and starts asking for blocks until it reaches the latest block known to its peers.

The security model here is similar to bitcoin. We're able to verify the whole chain, the total work, the validity of each block, their full content, etc. In addition, with MimbleWimble and full UTXO set commitments, even more integrity validation can be performed.

We do not try to do any space or bandwidth optimization in this mode (for example, once validated the range proofs could possibly be deleted). The point here is to provide history archival and allow later checks and verifications to be made.

What could go wrong?

Identical to other blockchains:

  • If all nodes we're connected to are dishonest (sybil attack or similar), we can be lied to about the whole chain state.
  • Someone with enormous mining power could rewrite the whole history.
  • Etc.

Partial History Syncing

In this model we try to optimize for very fast syncing while sacrificing as little security assumptions as possible. As a matter of fact, the security model is almost identical as a full node, despite requiring orders of magnitude less data to download.

A new node is pre-configured with a horizon Z, which is a distance in number of blocks from the head. For example, if horizon Z=5000 and the head is at height H=23000, the block at horizon is the block at height h=18000 on the most worked chain.

The new node also has prior knowledge of the genesis block. It connects to other peers and learns about the head of the most worked chain. It asks for the block header at the horizon block, requiring peer agreement. If consensus is not reached at h = H - Z, the node gradually increases the horizon Z, moving h backward until consensus is reached. Then it gets the full UTXO set at the horizon block. With this information it can verify:

  • the total difficulty on that chain (present in all block headers)
  • the sum of all UTXO commitments equals the expected money supply
  • the root hash of all UTXOs match the root hash in the block header

Once the validation is done, the peer can download and validate the blocks content from the horizon up to the head.

While this algorithm still works for very low values of Z (or in the extreme case where Z=1), low values may be problematic due to the normal forking activity that can occur on any blockchain. To prevent those problems and to increase the amount of locally validated work, we recommend values of Z of at least a few days worth of blocks, up to a few weeks.

What could go wrong?

While this sync mode is simple to describe, it may seem non-obvious how it still can be secure. We describe here some possible attacks, how they're defeated and other possible failure scenarios.

An attacker tries to forge the state at horizon

This range of attacks attempt to have a node believe it is properly synchronized with the network when it's actually is in a forged state. Multiple strategies can be attempted:

  • Completely fake but valid horizon state (including header and proof of work). Assuming at least one honest peer, neither the UTXO set root hash nor the block hash will match other peers' horizon states.
  • Valid block header but faked UTXO set. The UTXO set root hash from the header will not match what the node calculates from the received UTXO set itself.
  • Completely valid block with fake total difficulty, which could lead the node down a fake fork. The block hash changes if the total difficulty is changed, no honest peer will produce a valid head for that hash.

A fork occurs that's older than the local UTXO history

Our node downloaded the full UTXO set at horizon height. If a fork occurs on a block at an older horizon H+delta, the UTXO set can't be validated. In this situation the node has no choice but to put itself back in sync mode with a new horizon of Z'=Z+delta.

Note that an alternate fork at Z+delta that has less work than our current head can safely be ignored, only a winning fork of total work greater than our head would. To do this resolution, every block header includes the total chain difficulty up to that block.

The chain is permanently forked

If a hard fork occurs, the network may become split, forcing new nodes to always push their horizon back to when the hard fork occurred. While this is not a problem for short-term hard forks, it may become an issue for long-term or permanent forks. To prevent this situation, peers should always be checked for hard fork related capabilities (a bitmask of features a peer exposes) on connection.

Several nodes continuously give fake horizon blocks

If a peer can't reach consensus on the header at h, it gradually moves back. In the degenerate case, rogue peers could force all new peers to always become full nodes (move back until genesis) by systematically preventing consensus and feeding fake headers.

While this is a valid issue, several mitigation strategies exist:

  • Peers must still provide valid block headers at horizon Z. This includes the proof of work.
  • A group of block headers around the horizon could be asked to increase the cost of the attack.
  • Differing block headers providing a proof of work significantly lower could be rejected.
  • The user or node operator may be asked to confirm a block hash.
  • In last resort, if none of the above strategies are effective, checkpoints could be used.