Downloader
Seed/download historical data
Downloader is the service that seeds and downloads historical data using the BitTorrent protocol. This data is stored as snapshots, which are immutable .seg files.
The ETH core instructs the Downloader component to download (and then seed) specific files from the BitTorrent network. The files are identified by their "info hashes", which are a form of content addressing. The files that the ETH core requests are block headers and block bodies. Downloader then interacts with the BitTorrent network to retrieve the files the ETH core needs.
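To make the idea of content addressing more concrete, here is a minimal sketch of how a BitTorrent v1 info hash is derived: it is the SHA-1 digest of the bencoded info dictionary of a torrent. The small bencode helper and all field values (file name, sizes, the placeholder pieces string) are assumptions made for illustration and are not taken from Erigon's code.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// bencode encodes the small subset of bencoding needed for this sketch:
// strings, ints, and dictionaries with string keys.
func bencode(v any) string {
	switch x := v.(type) {
	case string:
		return strconv.Itoa(len(x)) + ":" + x
	case int:
		return "i" + strconv.Itoa(x) + "e"
	case map[string]any:
		keys := make([]string, 0, len(x))
		for k := range x {
			keys = append(keys, k)
		}
		sort.Strings(keys) // bencoded dictionaries list their keys in sorted order
		var b strings.Builder
		b.WriteString("d")
		for _, k := range keys {
			b.WriteString(bencode(k))
			b.WriteString(bencode(x[k]))
		}
		b.WriteString("e")
		return b.String()
	default:
		panic(fmt.Sprintf("unsupported type %T", v))
	}
}

// infoHash returns the hex-encoded SHA-1 of the bencoded info dictionary.
func infoHash(info map[string]any) string {
	sum := sha1.Sum([]byte(bencode(info)))
	return hex.EncodeToString(sum[:])
}

func main() {
	info := map[string]any{
		"name":         "example.seg", // hypothetical file name
		"length":       3,
		"piece length": 262144,
		"pieces":       "....................", // placeholder for one 20-byte piece hash
	}
	fmt.Println(infoHash(info))
}
```

Because the hash covers the piece hashes of the file, any change to the file contents produces a different info hash, which is why a list of hashes is enough to pin down exactly which files to fetch.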
Start Erigon with snapshots support
Like many other Erigon components (txpool, sentry, rpc daemon), Downloader can be integrated into Erigon or run as a separate process.
By default, Downloader runs inside Erigon when the --snapshots flag is set:
./build/bin/erigon --snapshots --datadir=<your_datadir>
Info: the --snapshots flag is compatible with the --prune flag (more info here).
Running downloader as a separate process
Downloader can also run as an independent process, in which case Erigon is started with the --snapshots --downloader.api.addr=127.0.0.1:9093 flags.
Before using a separate Downloader process, build the executable:
cd erigon
make downloader
You can then start the downloader:
./build/bin/downloader --downloader.api.addr=127.0.0.1:9093 --torrent.port=42068 --datadir=<your_datadir>
--downloader.api.addr is for internal communication with Erigon
--torrent.port=42068 is the public listening port for the BitTorrent protocol
On startup, Erigon sends the list of .torrent files to Downloader and waits for 100% download completion:
./build/bin/erigon --snapshots --downloader.api.addr=127.0.0.1:9093 --datadir=<your_datadir>
Use --snap.keepblocks=true to keep retired blocks in the DB instead of deleting them.
Any network/chain can start with snapshot sync:
The node only downloads snapshots registered in the https://github.com/erigontech/erigon-snapshot repository
The node moves old blocks from the DB into snapshots of 1K blocks, then merges them into progressively larger ranges up to snapshots of 500K blocks, and then automatically starts seeding the new snapshots
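The retire-and-merge behaviour can be sketched as a range-planning problem: given the blocks available for retirement, which snapshot ranges end up on disk? The sketch below is an assumption-laden illustration, not Erigon's algorithm. Only the 1K and 500K bounds come from the text; the intermediate step sizes, the Range type and the planSegments function are hypothetical.

```go
package main

import "fmt"

// stepSizes are snapshot range sizes in blocks, largest first. Only the 1K and
// 500K bounds come from the text; the intermediate sizes are assumptions.
var stepSizes = []uint64{500_000, 100_000, 10_000, 1_000}

// Range is a half-open block range [From, To).
type Range struct{ From, To uint64 }

// planSegments splits [from, to) into aligned ranges, always taking the largest
// step that starts on its own boundary and still fits. Leftover blocks that do
// not fill a 1K range are not returned (they would stay in the DB).
func planSegments(from, to uint64) []Range {
	var out []Range
	for from < to {
		picked := false
		for _, step := range stepSizes {
			if from%step == 0 && from+step <= to {
				out = append(out, Range{from, from + step})
				from += step
				picked = true
				break
			}
		}
		if !picked {
			break // fewer than 1K blocks left, or from is not 1K-aligned
		}
	}
	return out
}

func main() {
	for _, r := range planSegments(0, 512_345) {
		fmt.Printf("%d-%d\n", r.From, r.To)
	}
}
```

For the range 0 to 512,345 this sketch yields 0-500000, 500000-510000, 510000-511000 and 511000-512000; the remaining 345 blocks do not fill a 1K range and are left out.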
Creation of a new network or bootnode
You may need to create new snapshots and start seeding them.
Creating new snapshots dumps blocks from the database to .seg files:
erigon snapshots retire --datadir=<your_datadir>
Create the .torrent files that Downloader will automatically seed. The output format is compatible with https://github.com/erigontech/erigon-snapshot:
./build/bin/downloader torrent_hashes --rebuild --datadir=<your_datadir>
Start Downloader (it seeds automatically):
./build/bin/downloader --downloader.api.addr=127.0.0.1:9093 --datadir=<your_datadir>
Additional info
Snapshot creation does not require a fully synced Erigon; the first few stages are enough. For example:
STOP_AFTER_STAGE=Senders ./build/bin/erigon --snapshots=false --datadir=<your_datadir>
But for security it is better to have a fully-synced Erigon.
Erigon can use snapshots only after indexing them. Erigon indexes them automatically, but you can also run the indexing manually (this step is not required for seeding):
./build/bin/erigon snapshots index --datadir=<your_datadir>
Architecture
Downloader works from the <your_datadir>/snapshots/*.torrent files. Such files can be created in 4 ways:
Erigon can make a gRPC call downloader.Download(list_of_hashes), which triggers creation of the .torrent files
Erigon can create a new .seg file; Downloader will scan the .seg file and create the .torrent file
An operator can manually copy .torrent files (rsync from another server or restore from backup)
An operator can manually copy a .seg file; Downloader will scan the .seg file and create the .torrent file
Erigon does:
connect to Downloader
share the list of hashes (see https://github.com/erigontech/erigon-snapshot)
wait for download of all snapshots
when a .seg file is available, automatically create the .idx files (secondary indices, for example to find a block by hash)
then switch to normal staged sync (which doesn't require a connection to Downloader)
ensure that snapshot downloading happens only once: even if a new Erigon version includes new pre-verified snapshot hashes, Erigon will not download them (to avoid unpredictable downtime), but Erigon may produce them itself
Downloader does:
Read .torrent files, download everything described by .torrent files
Use https://github.com/ngosang/trackerslist, see ./trackers/embed.go
Automatically seed
Technical details
To prevent attacks, .idx creation uses a random seed: all nodes will have different .idx files (and the same .seg files)
If you add or remove any .seg file manually, you also need to remove the <your_datadir>/snapshots/db folder
How to verify that .seg files have the same checksum as current .torrent files
Use it if you see strange behavior, bugs, bans, hardware problems, etc.
./build/bin/downloader --verify --datadir=<your_datadir>
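Conceptually, this kind of verification recomputes the piece hashes of a data file and compares them with the hashes recorded in its .torrent file. In BitTorrent v1 a file is split into fixed-size pieces and each piece is hashed with SHA-1. The sketch below shows that check on in-memory data; the function names and the tiny piece length are assumptions for illustration, and it does not reflect how the downloader implements --verify.

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"fmt"
)

// pieceHashes splits data into pieceLen-sized pieces and returns the
// concatenated SHA-1 digests, as stored in the "pieces" field of a v1 torrent.
// pieceLen must be greater than zero.
func pieceHashes(data []byte, pieceLen int) []byte {
	var out []byte
	for off := 0; off < len(data); off += pieceLen {
		end := off + pieceLen
		if end > len(data) {
			end = len(data) // the last piece may be shorter
		}
		sum := sha1.Sum(data[off:end])
		out = append(out, sum[:]...)
	}
	return out
}

// verify reports whether data still matches the recorded piece hashes.
func verify(data []byte, pieceLen int, pieces []byte) bool {
	return bytes.Equal(pieceHashes(data, pieceLen), pieces)
}

func main() {
	data := []byte("example segment contents") // stands in for a .seg file
	pieces := pieceHashes(data, 8)
	fmt.Println(verify(data, 8, pieces)) // true
	data[0] ^= 0xff                      // simulate on-disk corruption
	fmt.Println(verify(data, 8, pieces)) // false
}
```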
Faster rsync
rsync -aP --delete -e "ssh -T -o Compression=no -x"
Release details
Start automatic commit of new hashes to branch master
crontab -e
@hourly cd <erigon_source_dir> && ./cmd/downloader/torrent_hashes_update.sh <your_datadir> <network_name> 1>&2 2>> ~/erigon_cron.log
It pushes to the branch auto; before a release, merge auto into main manually.
Command line options
To display the available options for downloader, type:
./build/bin/downloader --help
The --help flag listing is reproduced below for your convenience.
Commands
snapshot downloader
Usage:
[flags]
[command]
Examples:
go run ./cmd/downloader --datadir <your_datadir> --downloader.api.addr 127.0.0.1:9093
Available Commands:
completion Generate the autocompletion script for the specified shell
help Help about any command
torrent_hashes
Flags:
--datadir string Data directory for the databases (default "/home/admin/.local/share/erigon")
--downloader.api.addr string external downloader api network address, for example: 127.0.0.1:9093 serves remote downloader interface (default "127.0.0.1:9093")
--downloader.disable.ipv4 Turns off ipv4 for the downloader
--downloader.disable.ipv6 Turns off ipv6 for the downloader
-h, --help help for this command
--log.console.json Format console logs with JSON
--log.console.verbosity string Set the log level for console logs (default "info")
--log.dir.json Format file logs with JSON
--log.dir.path string Path to store user and error logs to disk
--log.dir.prefix string The file name prefix for logs stored to disk
--log.dir.verbosity string Set the log verbosity for logs stored to disk (default "info")
--log.json Format console logs with JSON
--metrics Enable metrics collection and reporting
--metrics.addr string Enable stand-alone metrics HTTP server listening interface (default "127.0.0.1")
--metrics.port int Metrics HTTP server listening port (default 6060)
--nat string NAT port mapping mechanism (any|none|upnp|pmp|stun|extip:<IP>)
"" or "none" default - do not nat
"extip:77.12.33.4" will assume the local machine is reachable on the given IP
"any" uses the first auto-detected mechanism
"upnp" uses the Universal Plug and Play protocol
"pmp" uses NAT-PMP with an auto-detected gateway address
"pmp:192.168.0.1" uses NAT-PMP with the given gateway address
"stun" uses STUN to detect an external IP using a default server
"stun:<server>" uses STUN to detect an external IP using the given server (host:port)
--pprof Enable the pprof HTTP server
--pprof.addr string pprof HTTP server listening interface (default "127.0.0.1")
--pprof.cpuprofile string Write CPU profile to the given file
--pprof.port int pprof HTTP server listening port (default 6060)
--torrent.conns.perfile int connections per file (default 10)
--torrent.download.rate string bytes per second, example: 32mb (default "16mb")
--torrent.download.slots int amount of files to download in parallel. If network has enough seeders 1-3 slot enough, if network has lack of seeders increase to 5-7 (too big value will slow down everything). (default 3)
--torrent.maxpeers int unused parameter (reserved for future use) (default 100)
--torrent.port int port to listen and serve BitTorrent protocol (default 42069)
--torrent.staticpeers string Comma separated enode URLs to connect to
--torrent.upload.rate string bytes per second, example: 32mb (default "4mb")
--torrent.verbosity int 0=silent, 1=error, 2=warn, 3=info, 4=debug, 5=detail (must set --verbosity to equal or higher level and has default: 3) (default 2)
--trace string Write execution trace to the given file
--verbosity string Set the log level for console logs (default "info")
--verify Force verify data files if have .torrent files
Use " [command] --help" for more information about a command.