Tendermint Source Code Analysis: The Startup Flow

Preparing the arguments

CLI arguments:

node --proxy_app=dummy --home "C:\Users\Administrator\datadir\tendermint"

Tendermint parses its command line with the cobra library.

flags vs args

Let's pause the program at the following line of (c *Command) ExecuteC (/vendor/github.com/spf13/cobra/command.go):

err=cmd.execute(flags)

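(Figure: debugger variables view)
We can see that every token passed on the command line counts as an argument (arg), and that a flag is an argument starting with "--". In other words, the flags are what is left once the arguments without a leading "--" are removed.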
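To make the flag/argument split concrete, here is a minimal sketch that uses Go's standard flag package instead of cobra. The helper parseNodeArgs is hypothetical, and the standard package is simpler than cobra's parser (it stops at the first non-flag token), so treat this as an illustration of the idea rather than cobra's actual algorithm.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseNodeArgs is a hypothetical helper, not cobra code. It registers the two
// flags used in this article and returns their values together with the
// positional arguments that remain once the flags have been consumed.
func parseNodeArgs(args []string) (proxyApp, home string, rest []string, err error) {
	fs := flag.NewFlagSet("node", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	fs.StringVar(&proxyApp, "proxy_app", "", "ABCI application name or address")
	fs.StringVar(&home, "home", "", "root directory for config and data")
	if err = fs.Parse(args); err != nil {
		return "", "", nil, err
	}
	// fs.Args() holds whatever is left after flag parsing: the args without flags.
	return proxyApp, home, fs.Args(), nil
}

func main() {
	app, home, rest, err := parseNodeArgs(
		[]string{"--proxy_app=dummy", "--home", "/tmp/tendermint", "extra"})
	fmt.Println(app, home, rest, err)
}
```

Note that the value following --home does not start with "--" yet still belongs to the flag, so "starts with --" is a rule of thumb rather than the full parsing rule.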

Where the dream begins

Code location: /vendor/github.com/spf13/cobra/command.go#(c *Command) execute

if c.RunE != nil {
    if err := c.RunE(c, argWoFlags); err != nil {
        return err
    }
} else {
    c.Run(c, argWoFlags) 
}

c.RunE(c, argWoFlags) is the root that triggers node startup; everything interesting begins here. The call invokes the anonymous function stored in RunE, which is assigned in the NewRunNodeCmd function (/cmd/tendermint/commands/run_node.go):

// NewRunNodeCmd returns the command that allows the CLI to start a
// node. It can be used with a custom PrivValidator and in-process ABCI application.
func NewRunNodeCmd(nodeProvider nm.NodeProvider) *cobra.Command {
    cmd := &cobra.Command{
        Use:   "node",
        Short: "Run the tendermint node",
        RunE: func(cmd *cobra.Command, args []string) error { // anonymous function
            // Create & start node
            n, err := nodeProvider(config, logger)
            if err != nil {
                return fmt.Errorf("Failed to create node: %v", err)
            }

            if err := n.Start(); err != nil {
                return fmt.Errorf("Failed to start node: %v", err)
            } else {
                logger.Info("Started node", "nodeInfo", n.Switch().NodeInfo()) // logs "Started node" once startup has finished
            }

            // Trap signal, run forever.
            n.RunForever()

            return nil
        },
    }

    AddNodeFlags(cmd)
    return cmd
}

This anonymous function does three things:

1. Creates the node (nodeProvider(config, logger))
2. Starts the node (n.Start())
3. Waits for a signal that stops the node (n.RunForever())

Let's walk through them one at a time, starting with node creation.

Creating the node

nodeProvider(config, logger) resolves to the DefaultNewNode function (/node/node.go#DefaultNewNode), which returns a *Node.

// DefaultNewNode returns a Tendermint node with default settings for the
// PrivValidator, ClientCreator, GenesisDoc, and DBProvider.
// It implements NodeProvider.
func DefaultNewNode(config *cfg.Config, logger log.Logger) (*Node, error) {
    return NewNode(config,
        types.LoadOrGenPrivValidatorFS(config.PrivValidatorFile()),
        proxy.DefaultClientCreator(config.ProxyApp, config.ABCI, config.DBDir()),
        DefaultGenesisDocProviderFunc(config),
        DefaultDBProvider,
        logger)
}

The first argument DefaultNewNode passes to NewNode is config, which is defined in /cmd/tendermint/commands/root.go:

var (
    config = cfg.DefaultConfig()
)

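Note: the arguments the user passes on the command line are only a small fraction of what a node needs in order to start. Most settings are loaded from the default configuration. The config global holds all of the node's defaults (mainly the base, RPC, P2P, mempool, consensus, and transaction-index configuration).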
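The "defaults first, command line second" idea can be sketched as follows. The types, field names, and default addresses are illustrative assumptions made for this example, not Tendermint's real configuration; only ABCI = "socket" and the "data" subdirectory come from the values observed in this article.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// All names and default values below are illustrative assumptions, not
// Tendermint's real configuration types.
type rpcConfig struct{ ListenAddress string }
type p2pConfig struct{ ListenAddress string }

type config struct {
	RootDir  string // set from --home
	ProxyApp string // set from --proxy_app
	ABCI     string
	DBPath   string // relative to RootDir
	RPC      rpcConfig
	P2P      p2pConfig
}

// defaultConfig fills every field so the CLI only has to override a few.
func defaultConfig() *config {
	return &config{
		ProxyApp: "tcp://127.0.0.1:46658", // placeholder default
		ABCI:     "socket",
		DBPath:   "data",
		RPC:      rpcConfig{ListenAddress: "tcp://0.0.0.0:46657"}, // placeholder default
		P2P:      p2pConfig{ListenAddress: "tcp://0.0.0.0:46656"}, // placeholder default
	}
}

// DBDir resolves the database directory against the root directory.
func (c *config) DBDir() string { return filepath.Join(c.RootDir, c.DBPath) }

func main() {
	cfg := defaultConfig()
	// Only these two values come from the command line in this article.
	cfg.ProxyApp = "dummy"
	cfg.RootDir = filepath.Join("datadir", "tendermint")
	fmt.Println(cfg.ProxyApp, cfg.ABCI, cfg.DBDir(), cfg.RPC.ListenAddress, cfg.P2P.ListenAddress)
}
```

Deriving DBDir from RootDir at call time, rather than storing an absolute path, is why changing --home is enough to relocate every database.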

The second argument is types.LoadOrGenPrivValidatorFS(config.PrivValidatorFile()), which returns a PrivValidatorFS instance.

The third argument is proxy.DefaultClientCreator(config.ProxyApp, config.ABCI, config.DBDir()). For DefaultClientCreator (/proxy/client.go#DefaultClientCreator), the first argument is config.ProxyApp = "dummy", the second is config.ABCI = "socket", and the third is config.DBDir() = "C:\Users\Administrator\datadir\tendermint\data".

The function that actually builds the Node, however, is NewNode. Its code is at /node/node.go#NewNode.

// NewNode returns a new, ready to go, Tendermint Node.
func NewNode(config *cfg.Config,
    privValidator types.PrivValidator,
    clientCreator proxy.ClientCreator,
    genesisDocProvider GenesisDocProvider,
    dbProvider DBProvider,
    logger log.Logger) (*Node, error) {

    // Get BlockStore
    // initialize the blockstore database
    blockStoreDB, err := dbProvider(&DBContext{"blockstore", config})
    if err != nil {
        return nil, err
    }
    blockStore := bc.NewBlockStore(blockStoreDB)

    // Get State
    // initialize the state database
    stateDB, err := dbProvider(&DBContext{"state", config})
    if err != nil {
        return nil, err
    }

    // Get genesis doc
    // TODO: move to state package?
    // load the genesis doc from the state DB, falling back to the genesis file on disk
    genDoc, err := loadGenesisDoc(stateDB)
    if err != nil {
        genDoc, err = genesisDocProvider()
        if err != nil {
            return nil, err
        }
        // save genesis doc to prevent a certain class of user errors (e.g. when it
        // was changed, accidentally or not). Also good for audit trail.
        saveGenesisDoc(stateDB, genDoc)
    }

    state, err := sm.LoadStateFromDBOrGenesisDoc(stateDB, genDoc)
    if err != nil {
        return nil, err
    }

    // Create the proxyApp, which manages connections (consensus, mempool, query)
    // and sync tendermint and the app by performing a handshake
    // and replaying any necessary blocks
    consensusLogger := logger.With("module", "consensus")
    handshaker := cs.NewHandshaker(stateDB, state, blockStore)
    handshaker.SetLogger(consensusLogger)
    proxyApp := proxy.NewAppConns(clientCreator, handshaker)
    proxyApp.SetLogger(logger.With("module", "proxy"))
    //Start()
    if err := proxyApp.Start(); err != nil {
        return nil, fmt.Errorf("Error starting proxy app connections: %v", err)
    }

    // reload the state (it may have been updated by the handshake)
    state = sm.LoadState(stateDB)

    // Decide whether to fast-sync or not
    // We don't fast-sync when the only validator is us.
    fastSync := config.FastSync // fast sync is enabled by default
    if state.Validators.Size() == 1 {
        addr, _ := state.Validators.GetByIndex(0) // address of the sole validator
        if bytes.Equal(privValidator.GetAddress(), addr) {
            fastSync = false // this node is the only validator, so disable fast sync
        }
    }

    // Log whether this node is a validator or an observer
    if state.Validators.HasAddress(privValidator.GetAddress()) {
        consensusLogger.Info("This node is a validator", "addr", privValidator.GetAddress(), "pubKey", privValidator.GetPubKey())
    } else {
        consensusLogger.Info("This node is not a validator", "addr", privValidator.GetAddress(), "pubKey", privValidator.GetPubKey())
    }

    // Make MempoolReactor
    mempoolLogger := logger.With("module", "mempool")
    // create the mempool
    mempool := mempl.NewMempool(config.Mempool, proxyApp.Mempool(), state.LastBlockHeight)
    mempool.InitWAL() // no need to have the mempool wal during tests
    mempool.SetLogger(mempoolLogger)
    mempoolReactor := mempl.NewMempoolReactor(config.Mempool, mempool)
    mempoolReactor.SetLogger(mempoolLogger)

    if config.Consensus.WaitForTxs() {
        mempool.EnableTxsAvailable()
    }

    // Make Evidence Reactor
    evidenceDB, err := dbProvider(&DBContext{"evidence", config})
    if err != nil {
        return nil, err
    }
    evidenceLogger := logger.With("module", "evidence")
    evidenceStore := evidence.NewEvidenceStore(evidenceDB)
    evidencePool := evidence.NewEvidencePool(stateDB, evidenceStore)
    evidencePool.SetLogger(evidenceLogger)
    evidenceReactor := evidence.NewEvidenceReactor(evidencePool)
    evidenceReactor.SetLogger(evidenceLogger)

    blockExecLogger := logger.With("module", "state")
    // make block executor for consensus and blockchain reactors to execute blocks
    blockExec := sm.NewBlockExecutor(stateDB, blockExecLogger, proxyApp.Consensus(), mempool, evidencePool)

    // Make BlockchainReactor
    bcReactor := bc.NewBlockchainReactor(state.Copy(), blockExec, blockStore, fastSync)
    bcReactor.SetLogger(logger.With("module", "blockchain"))

    // Make ConsensusReactor
    consensusState := cs.NewConsensusState(config.Consensus, state.Copy(),
        blockExec, blockStore, mempool, evidencePool)
    consensusState.SetLogger(consensusLogger)
    if privValidator != nil {
        consensusState.SetPrivValidator(privValidator)
    }
    consensusReactor := cs.NewConsensusReactor(consensusState, fastSync)
    consensusReactor.SetLogger(consensusLogger)

    p2pLogger := logger.With("module", "p2p")

    sw := p2p.NewSwitch(config.P2P)
    sw.SetLogger(p2pLogger)
    sw.AddReactor("MEMPOOL", mempoolReactor)
    sw.AddReactor("BLOCKCHAIN", bcReactor)
    sw.AddReactor("CONSENSUS", consensusReactor)
    sw.AddReactor("EVIDENCE", evidenceReactor)

    // Optionally, start the pex reactor
    var addrBook pex.AddrBook
    var trustMetricStore *trust.TrustMetricStore
    if config.P2P.PexReactor {
        addrBook = pex.NewAddrBook(config.P2P.AddrBookFile(), config.P2P.AddrBookStrict)
        addrBook.SetLogger(p2pLogger.With("book", config.P2P.AddrBookFile()))

        // Get the trust metric history data
        trustHistoryDB, err := dbProvider(&DBContext{"trusthistory", config})
        if err != nil {
            return nil, err
        }
        trustMetricStore = trust.NewTrustMetricStore(trustHistoryDB, trust.DefaultConfig())
        trustMetricStore.SetLogger(p2pLogger)

        var seeds []string
        if config.P2P.Seeds != "" {
            seeds = strings.Split(config.P2P.Seeds, ",")
        }
        pexReactor := pex.NewPEXReactor(addrBook,
            &pex.PEXReactorConfig{Seeds: seeds, SeedMode: config.P2P.SeedMode})
        pexReactor.SetLogger(p2pLogger)
        sw.AddReactor("PEX", pexReactor)
    }

    // Filter peers by addr or pubkey with an ABCI query.
    // If the query return code is OK, add peer.
    // XXX: Query format subject to change
    if config.FilterPeers {
        // NOTE: addr is ip:port
        sw.SetAddrFilter(func(addr net.Addr) error {
            resQuery, err := proxyApp.Query().QuerySync(abci.RequestQuery{Path: cmn.Fmt("/p2p/filter/addr/%s", addr.String())})
            if err != nil {
                return err
            }
            if resQuery.IsErr() {
                return fmt.Errorf("Error querying abci app: %v", resQuery)
            }
            return nil
        })
        sw.SetPubKeyFilter(func(pubkey crypto.PubKey) error {
            resQuery, err := proxyApp.Query().QuerySync(abci.RequestQuery{Path: cmn.Fmt("/p2p/filter/pubkey/%X", pubkey.Bytes())})
            if err != nil {
                return err
            }
            if resQuery.IsErr() {
                return fmt.Errorf("Error querying abci app: %v", resQuery)
            }
            return nil
        })
    }

    eventBus := types.NewEventBus()
    eventBus.SetLogger(logger.With("module", "events"))

    // services which will be publishing and/or subscribing for messages (events)
    // consensusReactor will set it on consensusState and blockExecutor
    consensusReactor.SetEventBus(eventBus)

    // Transaction indexing
    var txIndexer txindex.TxIndexer
    switch config.TxIndex.Indexer {
    case "kv":
        store, err := dbProvider(&DBContext{"tx_index", config})
        if err != nil {
            return nil, err
        }
        if config.TxIndex.IndexTags != "" {
            txIndexer = kv.NewTxIndex(store, kv.IndexTags(strings.Split(config.TxIndex.IndexTags, ",")))
        } else if config.TxIndex.IndexAllTags {
            txIndexer = kv.NewTxIndex(store, kv.IndexAllTags())
        } else {
            txIndexer = kv.NewTxIndex(store)
        }
    default:
        txIndexer = &null.TxIndex{}
    }

    indexerService := txindex.NewIndexerService(txIndexer, eventBus)

    // run the profile server
    profileHost := config.ProfListenAddress
    if profileHost != "" {
        go func() {
            logger.Error("Profile server", "err", http.ListenAndServe(profileHost, nil))
        }()
    }
    // create the node and populate its fields
    node := &Node{
        config:        config,
        genesisDoc:    genDoc,
        privValidator: privValidator,

        sw:               sw,
        addrBook:         addrBook,
        trustMetricStore: trustMetricStore,

        stateDB:          stateDB,
        blockStore:       blockStore,
        bcReactor:        bcReactor,
        mempoolReactor:   mempoolReactor,
        consensusState:   consensusState,
        consensusReactor: consensusReactor,
        evidencePool:     evidencePool,
        proxyApp:         proxyApp,
        txIndexer:        txIndexer,
        indexerService:   indexerService,
        eventBus:         eventBus,
    }
    node.BaseService = *cmn.NewBaseService(logger, "Node", node)
    return node, nil
}
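
One decision inside NewNode that is easy to miss is when fast sync gets switched off. The sketch below isolates that rule in a hypothetical helper, decideFastSync, that works on plain byte slices instead of the real validator-set type.

```go
package main

import (
	"bytes"
	"fmt"
)

// decideFastSync is a hypothetical helper that isolates the rule used in
// NewNode: keep the configured value unless the validator set has exactly one
// member and that member is this node.
func decideFastSync(configured bool, validatorAddrs [][]byte, selfAddr []byte) bool {
	if len(validatorAddrs) == 1 && bytes.Equal(validatorAddrs[0], selfAddr) {
		return false // nobody to sync from: we are the only validator
	}
	return configured
}

func main() {
	self := []byte{0x01}
	other := []byte{0x02}
	fmt.Println(decideFastSync(true, [][]byte{self}, self))        // false
	fmt.Println(decideFastSync(true, [][]byte{other}, self))       // true
	fmt.Println(decideFastSync(true, [][]byte{self, other}, self)) // true
}
```

A single validator that is someone else does not disable fast sync; only the case where this node is the sole validator does.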

At this point it is worth looking at the definition of Node:

// Node is the highest level interface to a full Tendermint node.
// It includes all configuration information and running services.
type Node struct {
    cmn.BaseService // embedded type

    // config
    config        *cfg.Config
    genesisDoc    *types.GenesisDoc   // initial validator set
    privValidator types.PrivValidator // local node's validator key

    // network
    sw               *p2p.Switch             // p2p connections
    addrBook         pex.AddrBook            // known peers
    trustMetricStore *trust.TrustMetricStore // trust metrics for all peers

    // services
    eventBus         *types.EventBus // pub/sub for services
    stateDB          dbm.DB
    blockStore       *bc.BlockStore         // store the blockchain to disk
    bcReactor        *bc.BlockchainReactor  // for fast-syncing
    mempoolReactor   *mempl.MempoolReactor  // for gossipping transactions
    consensusState   *cs.ConsensusState     // latest consensus state
    consensusReactor *cs.ConsensusReactor   // for participating in the consensus
    evidencePool     *evidence.EvidencePool // tracking evidence
    proxyApp         proxy.AppConns         // connection to the application
    rpcListeners     []net.Listener         // rpc servers
    txIndexer        txindex.TxIndexer
    indexerService   *txindex.IndexerService
}

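This reminds me of the Ethereum struct in geth (/eth/backend.go#Ethereum). Node embeds cmn.BaseService, and under Go's embedding rules the outer type (Node) can reuse the methods and fields of the embedded type (cmn.BaseService). Node itself implements the OnStart and OnStop methods of the Service interface. The embedded cmn.BaseService implements every method of Service (in practice only eight of them do real work; OnStart and OnStop have empty bodies, and those are exactly the two methods Node implements itself). A Node value therefore satisfies Service.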
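
The embedding relationship can be reduced to a few lines. This is a sketch with hypothetical lowercase types (service, baseService, node), trimmed down from the real ones, showing how the promoted Start method ends up calling the outer type's OnStart through the impl field.

```go
package main

import "fmt"

// service lists the hooks an implementation provides (hypothetical, reduced
// from the real Service interface).
type service interface {
	OnStart() error
	OnStop()
}

// baseService holds the shared Start/Stop logic and calls back into impl.
type baseService struct {
	name string
	impl service
}

func (bs *baseService) Start() error {
	fmt.Println("Starting", bs.name)
	return bs.impl.OnStart()
}

func (bs *baseService) Stop() { bs.impl.OnStop() }

// Default hooks with empty bodies, shadowed by any outer type that defines its own.
func (bs *baseService) OnStart() error { return nil }
func (bs *baseService) OnStop()        {}

// node embeds baseService, so Start and Stop are promoted to node, while
// node's own OnStart/OnStop shadow the empty defaults.
type node struct {
	baseService
	log []string
}

func (n *node) OnStart() error { n.log = append(n.log, "node.OnStart"); return nil }
func (n *node) OnStop()        { n.log = append(n.log, "node.OnStop") }

// newNode wires impl back to the outer value, as NewBaseService(logger, "Node", node) does.
func newNode() *node {
	n := &node{}
	n.baseService = baseService{name: "Node", impl: n}
	return n
}

func main() {
	n := newNode()
	_ = n.Start()
	n.Stop()
	fmt.Println(n.log)
}
```

The impl field is the important part: Go embedding has no virtual dispatch, so without it baseService.Start would only ever reach its own empty OnStart.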

Entry point for starting the node

Code location: /vendor/github.com/tendermint/tmlibs/common/service.go#(bs *BaseService) Start()

// Start implements Service by calling OnStart (if defined). An error will be
// returned if the service is already running or stopped. Not to start the
// stopped service, you need to call Reset.
func (bs *BaseService) Start() error {
    if atomic.CompareAndSwapUint32(&bs.started, 0, 1) {
        if atomic.LoadUint32(&bs.stopped) == 1 {
            bs.Logger.Error(Fmt("Not starting %v -- already stopped", bs.name), "impl", bs.impl)
            return ErrAlreadyStopped
        } else {
            bs.Logger.Info(Fmt("Starting %v", bs.name), "impl", bs.impl) // logs "Starting Node", with module=main impl=Node
        }
        err := bs.impl.OnStart() // runs OnStart from node/node.go
        if err != nil {
            // revert flag
            atomic.StoreUint32(&bs.started, 0)
            return err
        }
        return nil
    } else {
        bs.Logger.Debug(Fmt("Not starting %v -- already started", bs.name), "impl", bs.impl)
        return ErrAlreadyStarted
    }
}
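
The compare-and-swap on the started flag is what makes Start safe to call more than once. Below is a minimal sketch of the same state machine using a hypothetical startOnce type and a plain callback in place of the impl interface.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var (
	errAlreadyStarted = errors.New("already started")
	errAlreadyStopped = errors.New("already stopped")
)

// startOnce is a hypothetical reduction of BaseService's state flags.
type startOnce struct {
	started uint32
	stopped uint32
}

// Start runs onStart at most once. The compare-and-swap lets exactly one
// caller win; a failed onStart reverts the flag so Start can be retried.
func (s *startOnce) Start(onStart func() error) error {
	if !atomic.CompareAndSwapUint32(&s.started, 0, 1) {
		return errAlreadyStarted
	}
	if atomic.LoadUint32(&s.stopped) == 1 {
		return errAlreadyStopped
	}
	if err := onStart(); err != nil {
		atomic.StoreUint32(&s.started, 0) // revert flag
		return err
	}
	return nil
}

func main() {
	var s startOnce
	fmt.Println(s.Start(func() error { return errors.New("boom") })) // boom
	fmt.Println(s.Start(func() error { return nil }))                // <nil>
	fmt.Println(s.Start(func() error { return nil }))                // already started
}
```

Using atomics rather than a mutex keeps Start cheap and lets concurrent callers race safely: only the goroutine that flips started from 0 to 1 runs the hook.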

The function that actually performs the startup is OnStart, defined at /node/node.go#OnStart():

// OnStart starts the Node. It implements cmn.Service.
func (n *Node) OnStart() error {
    err := n.eventBus.Start() // reuses BaseService's (bs *BaseService) Start() method
    if err != nil {
        return err
    }

    // Run the RPC server first
    // so we can eg. receive txs for the first block
    if n.config.RPC.ListenAddress != "" {
        listeners, err := n.startRPC() // start the RPC server
        if err != nil {
            return err
        }
        n.rpcListeners = listeners
    }

    // Create & add listener
    protocol, address := cmn.ProtocolAndAddress(n.config.P2P.ListenAddress)
    l := p2p.NewDefaultListener(protocol, address, n.config.P2P.SkipUPNP, n.Logger.With("module", "p2p"))
    n.sw.AddListener(l)

    // Generate node PrivKey
    // TODO: pass in like priv_val
    nodeKey, err := p2p.LoadOrGenNodeKey(n.config.NodeKeyFile())
    if err != nil {
        return err
    }
    n.Logger.Info("P2P Node ID", "ID", nodeKey.ID(), "file", n.config.NodeKeyFile())

    // Start the switch
    n.sw.SetNodeInfo(n.makeNodeInfo(nodeKey.PubKey()))
    n.sw.SetNodeKey(nodeKey)
    err = n.sw.Start()
    if err != nil {
        return err
    }

    // Always connect to persistent peers
    if n.config.P2P.PersistentPeers != "" {
        err = n.sw.DialPeersAsync(n.addrBook, strings.Split(n.config.P2P.PersistentPeers, ","), true)
        if err != nil {
            return err
        }
    }

    // start tx indexer
    return n.indexerService.Start()
}

Stopping the node

When the node starts, Tendermint calls n.RunForever(), which waits for an interrupt signal and then stops the node.

// RunForever waits for an interrupt signal and stops the node.
func (n *Node) RunForever() {
    // Sleep forever and then...
    cmn.TrapSignal(func() {
        n.Stop() // calls BaseService's (bs *BaseService) Stop method
    })
}

The signal handling itself is done by the TrapSignal function (vendor/github.com/tendermint/tmlibs/common/os.go):

// TrapSignal catches the SIGTERM and executes cb function. After that it exits
// with code 1.
func TrapSignal(cb func()) {
   c := make(chan os.Signal, 1)
   signal.Notify(c, os.Interrupt, syscall.SIGTERM)
   go func() {
      for sig := range c {
         fmt.Printf("captured %v, exiting...\n", sig)
         if cb != nil {
            cb()
         }
         os.Exit(1)
      }
   }()
   select {}
}

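TrapSignal listens for both os.Interrupt (SIGINT, which is what Ctrl+C sends) and SIGTERM. When either one arrives, it runs the callback, which stops the node, and then exits the process with code 1.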

The actual shutdown work is done by the OnStop function (/node/node.go#OnStop()):

// OnStop stops the Node. It implements cmn.Service.
func (n *Node) OnStop() {
    n.BaseService.OnStop()

    n.Logger.Info("Stopping Node")
    // TODO: gracefully disconnect from peers.
    n.sw.Stop()

    for _, l := range n.rpcListeners {
        n.Logger.Info("Closing rpc listener", "listener", l)
        if err := l.Close(); err != nil {
            n.Logger.Error("Error closing listener", "listener", l, "err", err)
        }
    }

    n.eventBus.Stop()

    n.indexerService.Stop()
}

Summary

This post is a rough walkthrough of the node startup flow in Tendermint. I will refine the analysis in later posts.


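Copyright notice: this is an original article by KeenCryp, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.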