Consul: Troubleshooting a Startup Failure

Symptom

supervisorctl status

alertmanager RUNNING pid 2910405, uptime 1 day, 22:39:07
consul FATAL Exited too quickly (process log may have details)
prometheus RUNNING pid 1842, uptime 1299 days, 16:51:21
prometheus-webhook-dingtalk STOPPED Jul 04 03:47 PM
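
When supervisor manages many programs, it can help to filter the status listing down to the ones that are not healthy. The following is a minimal sketch, not part of the original write-up: `not_running` is a hypothetical helper name, and it assumes the column layout shown above (program name first, state second).

```shell
# Read `supervisorctl status` output on stdin and print the name and state
# of every program whose state is not RUNNING.
# Assumes column 1 is the program name and column 2 is the state.
not_running() {
  awk 'NF >= 2 && $2 != "RUNNING" { print $1, $2 }'
}

# Typical use:
#   supervisorctl status | not_running
```

Fed the listing above, this would report `consul FATAL` and `prometheus-webhook-dingtalk STOPPED`.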

Check the logs

tail -f supervisord.log-20220704 
2022-07-04 15:50:43,730 INFO gave up: consul entered FATAL state, too many start retries too quickly
2022-07-06 14:27:46,651 INFO spawned: 'consul' with pid 1248136
2022-07-06 14:27:46,854 INFO exited: consul (exit status 1; not expected)
2022-07-06 14:27:47,855 INFO spawned: 'consul' with pid 1248446
2022-07-06 14:27:48,043 INFO exited: consul (exit status 1; not expected)
2022-07-06 14:27:50,046 INFO spawned: 'consul' with pid 1248797
2022-07-06 14:27:50,599 INFO exited: consul (exit status 1; not expected)
2022-07-06 14:27:53,602 INFO spawned: 'consul' with pid 1249389
2022-07-06 14:27:53,847 INFO exited: consul (exit status 1; not expected)
2022-07-06 14:27:54,847 INFO gave up: consul entered FATAL state, too many start retries too quickly

Dig deeper

supervisorctl tail consul stdout
.245:8301: bind: address already in use
bootstrap_expect = 2: A cluster with 2 servers will provide no failure tolerance. See https://www.consul.io/docs/internals/consensus.html#deployment-table
bootstrap_expect > 0: expecting 2 servers
==> Starting Consul agent...
==> Error starting agent: Failed to start Consul server: Failed to start LAN Serf: Failed to create memberlist: Could not set up network transport: failed to obtain an address: Failed to start TCP listener on "10.9.127.245" port 8301: listen tcp 10.9.127.245:8301: bind: address already in use
bootstrap_expect = 2: A cluster with 2 servers will provide no failure tolerance. See https://www.consul.io/docs/internals/consensus.html#deployment-table
bootstrap_expect > 0: expecting 2 servers
==> Starting Consul agent...
==> Error starting agent: Failed to start Consul server: Failed to start LAN Serf: Failed to create memberlist: Could not set up network transport: failed to obtain an address: Failed to start TCP listener on "10.9.127.245" port 8301: listen tcp 10.9.127.245:8301: bind: address already in use
bootstrap_expect = 2: A cluster with 2 servers will provide no failure tolerance. See https://www.consul.io/docs/internals/consensus.html#deployment-table
bootstrap_expect > 0: expecting 2 servers
==> Starting Consul agent...
==> Error starting agent: Failed to start Consul server: Failed to start LAN Serf: Failed to create memberlist: Could not set up network transport: failed to obtain an address: Failed to start TCP listener on "10.9.127.245" port 8301: listen tcp 10.9.127.245:8301: bind: address already in use

Summary

## address already in use --> the port is already taken (here TCP 8301 on 10.9.127.245); free the port, then start consul again
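
The post does not show how the port was freed. One possible approach is sketched below under these assumptions: a Linux host with `ss` from iproute2, and root privileges so the owning process is visible. The helper names `show_listeners` and `pids_from_ss` are hypothetical. Port 8301 is taken from the error message above.

```shell
# Port from the error message: Consul's LAN Serf listener.
PORT=8301

# Print the TCP listeners bound to the given port. Run as root so that ss
# can show the owning process in its users:(("name",pid=N,fd=M)) column.
show_listeners() {
  ss -ltnp "sport = :$1"
}

# Read ss output on stdin and print the unique PIDs it mentions.
pids_from_ss() {
  grep -o 'pid=[0-9]*' | cut -d= -f2 | sort -u
}

# Typical use (illustrative session, not from the original post):
#   show_listeners "$PORT"                  # see which process holds the port
#   show_listeners "$PORT" | pids_from_ss   # just the PIDs
#   supervisorctl start consul              # once the stale process is stopped
```

Identify the process before stopping it. If it turns out to be an older consul instance that supervisor no longer tracks, stopping that instance is usually what "freeing the port" means here.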