[Help] host pod abnormal: Host instance init error: Setup OVN Chassis: normalize db host: dns lookup #23585

@chenjacken

Version: v3.11.11
OS: CentOS 7.9

Host error: Host instance init error: Setup OVN Chassis: normalize db host: dns lookup
The logs are as follows:

[root@master1 ~]# kubectl get pods -n onecloud -owide |grep ser-a4-1
default-host-deployer-c9vj4                 1/1     Running             0                32d     172.16.0.17     ser-a4-1   <none>           <none>
default-host-health-gccnj                   1/1     Running             0                32d     172.16.0.17     ser-a4-1   <none>           <none>
default-host-image-5bbfs                    1/1     Running             0                57d     172.16.0.17     ser-a4-1   <none>           <none>
default-host-w2fq4                          1/3     CrashLoopBackOff    280 (30s ago)    12h     172.16.0.17     ser-a4-1   <none>           <none>
default-telegraf-q6p9n                      1/1     Running             0                32d     172.16.0.17     ser-a4-1   <none>           <none>


[root@master1 ~]# kubectl logs  default-host-w2fq4 -n onecloud -c host --tail 100 -f 
[info 251022 04:44:25 procutils.WaitZombieLoop(zombie_others.go:36)] My pid is not 1 and no need to wait zombies
[info 251022 04:44:25 options.parseOptions(options.go:344)] Use configuration file: /etc/yunion/host.conf
[warning 251022 04:44:25 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-qmp-monitor
[warning 251022 04:44:25 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disk-is-ssd
[warning 251022 04:44:25 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument health-driver
[warning 251022 04:44:25 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument start-host-ignore-sys-error
[warning 251022 04:44:25 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-health-checker
[warning 251022 04:44:25 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-rbac
[info 251022 04:44:25 options.parseOptions(options.go:367)] Set log level to "info"
[info 2025-10-22 04:44:25 options.parseOptions(options.go:344)] Use configuration file: /etc/yunion/common/common.conf
[info 2025-10-22 04:44:25 options.parseOptions(options.go:367)] Set log level to "info"
[info 2025-10-22 04:44:25 hostman.(*SHostService).InitService(host_services.go:65)] exec socket path: /var/run/onecloud/exec.sock
[info 2025-10-22 04:44:25 app.InitApp(app.go:32)] RequestWorkerCount: 8
[info 2025-10-22 04:44:25 appsrv.NewApplication(appsrv.go:122)] App hostId: EbaPQUpUa4YpbpM3fzSsfFT8NQw= (host,ser-a4-1,172.16.0.17)
2025/10/22 04:44:25 Allow hosts []
[info 2025-10-22 04:44:25 appsrv.(*Application).SetDefaultTimeout(appsrv.go:138)] adjust application default timeout to 60.000000 seconds
[info 2025-10-22 04:44:25 hostinfo.DetectCpuInfo(hostinfohelper.go:78)] cpuinfo freq 1001
[info 2025-10-22 04:44:25 hostinfo.NewHostInfo(hostinfo.go:2528)] CPU Model Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz Microcode 0x2006906
[error 2025-10-22 04:44:28 fileutils2.GetAllBlkdevsIoSchedulers(fileutils.go:175)] no block device avaiable
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).prepareEnv(hostinfo.go:421)] supported io schedulers []
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).prepareEnv(hostinfo.go:425)] HDD I/O Scheduler switch to none
[info 2025-10-22 04:44:28 fileutils2.ChangeBlkdevParameter(fileutils.go:260)] Set queue/scheduler of sdc to none
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).prepareEnv(hostinfo.go:432)] SSD I/O Scheduler switch to none
[info 2025-10-22 04:44:28 fileutils2.ChangeBlkdevParameter(fileutils.go:260)] Set queue/scheduler of sda to none
[info 2025-10-22 04:44:28 fileutils2.ChangeBlkdevParameter(fileutils.go:260)] Set queue/scheduler of sdb to none
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).Init(hostinfo.go:197)] Start detectHostInfo
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).detectKVMMaxCpus(hostinfo.go:937)] KVM API VERSION 12
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).detectKVMMaxCpus(hostinfo.go:942)] KVM CAP MAX VCPUS: 288
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).detectKVMMaxCpus(hostinfo.go:950)] KVM CAP NR VCPUS: 240
[info 2025-10-22 04:44:28 sysutils.detectNestSupport(kvm.go:146)] Host is support kvm nest ...
[info 2025-10-22 04:44:28 sysutils.detectNestSupport(kvm.go:151)] Host kvm nest is enabled ...
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).detectOsDist(hostinfo.go:830)] DetectOsDist CentOS Linux 7.9.2009
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).detectQemuVersion(hostinfo.go:904)] Detect qemu version is 4.2.0
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).detectOvsVersion(hostinfo.go:1045)] Detect OVS version is 2.12.4
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).detectOvsKOVersion(hostinfo.go:1062)] kernel module openvswitch vermagic:       5.4.130-1.yn20230805.el7.x86_64 SMP mod_unload modversions 
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).Init(hostinfo.go:206)] Start parseConfig
[info 2025-10-22 04:44:28 hostinfo.NewNIC(hostinfohelper.go:241)] IP 172.16.0.17/br0/bond1
[info 2025-10-22 04:44:28 hostbridge.(*SBaseBridgeDriver).ConfirmToConfig(hostbridge.go:336)] bridge br0 already has ip 172.16.0.17
[info 2025-10-22 04:44:28 hostinfo.NewNIC(hostinfohelper.go:291)] Confirm to configuration!! To migrate physical interface configs
[info 2025-10-22 04:44:28 hostbridge.(*SBaseBridgeDriver).MigrateSlaveConfigs(hostbridge.go:234)] to migrate routes: [] slaveAddress: []
[info 2025-10-22 04:44:28 hostbridge.(*SBaseBridgeDriver).MigrateSlaveConfigs(hostbridge.go:285)] to migrate routes: [] slaveAddress: [] delRoutes: []
[info 2025-10-22 04:44:28 hostinfo.NewNIC(hostinfohelper.go:241)] IP 10.10.0.17/br1/bond2
[info 2025-10-22 04:44:28 hostbridge.(*SBaseBridgeDriver).ConfirmToConfig(hostbridge.go:336)] bridge br1 already has ip 10.10.0.17
[info 2025-10-22 04:44:28 hostinfo.NewNIC(hostinfohelper.go:291)] Confirm to configuration!! To migrate physical interface configs
[info 2025-10-22 04:44:28 hostbridge.(*SBaseBridgeDriver).MigrateSlaveConfigs(hostbridge.go:234)] to migrate routes: [] slaveAddress: []
[info 2025-10-22 04:44:28 hostbridge.(*SBaseBridgeDriver).MigrateSlaveConfigs(hostbridge.go:285)] to migrate routes: [] slaveAddress: [] delRoutes: []
[info 2025-10-22 04:44:28 hostinfo.(*SNIC).SetupDhcpRelay(hostinfohelper.go:203)] Not enable dhcp relay on nic: &hostinfo.SNIC{Inter:"bond1", Bridge:"br0", Ip:"172.16.0.17", Wire:"", WireId:"", Mask:18, Bandwidth:1000, BridgeDev:(*hostbridge.SOVSBridgeDriver)(0xc00107d710), dhcpServer:(*hostdhcp.SGuestDHCPServer)(0xc00130e840)}
[info 2025-10-22 04:44:28 hostinfo.(*SNIC).SetupDhcpRelay(hostinfohelper.go:203)] Not enable dhcp relay on nic: &hostinfo.SNIC{Inter:"bond2", Bridge:"br1", Ip:"10.10.0.17", Wire:"", WireId:"", Mask:18, Bandwidth:1000, BridgeDev:(*hostbridge.SOVSBridgeDriver)(0xc001478d80), dhcpServer:(*hostdhcp.SGuestDHCPServer)(0xc001650c30)}
[info 2025-10-22 04:44:28 hostinfo.(*SHostInfo).setupOvnChassis(hostinfo.go:224)] Start setting up ovn chassis
goroutine 1 [running]:
runtime/debug.Stack()
        /usr/lib/go/src/runtime/debug/stack.go:24 +0x5e
runtime/debug.PrintStack()
        /usr/lib/go/src/runtime/debug/stack.go:16 +0x13
yunion.io/x/onecloud/pkg/util/ovnutils.InitOvn.func1()
        /root/go/src/yunion.io/x/onecloud/pkg/util/ovnutils/ovnutils.go:125 +0x39
panic({0x3160400?, 0xc0008b5410?})
        /usr/lib/go/src/runtime/panic.go:914 +0x21f
yunion.io/x/onecloud/pkg/util/ovnutils.mustPrepOvsdbConfig({{0xc001177940, 0x1b}, {0xc001165ef8, 0x5}, {0x0, 0x0}, {0xc001165ed0, 0xa}, 0x5dc, {0xc001165f20, ...}, ...})
        /root/go/src/yunion.io/x/onecloud/pkg/util/ovnutils/ovnutils.go:93 +0x5bb
yunion.io/x/onecloud/pkg/util/ovnutils.InitOvn({{0xc001177940, 0x1b}, {0xc001165ef8, 0x5}, {0x0, 0x0}, {0xc001165ed0, 0xa}, 0x5dc, {0xc001165f20, ...}, ...})
        /root/go/src/yunion.io/x/onecloud/pkg/util/ovnutils/ovnutils.go:130 +0x9f
yunion.io/x/onecloud/pkg/hostman/hostinfo.(*OvnHelper).Init(...)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/hostinfo/hostovn.go:41
yunion.io/x/onecloud/pkg/hostman/hostinfo.(*SHostInfo).setupOvnChassis(0xc000e478c0?)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/hostinfo/hostinfo.go:226 +0xa5
yunion.io/x/onecloud/pkg/hostman/hostinfo.(*SHostInfo).Init(0xc000c3ba20?)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/hostinfo/hostinfo.go:211 +0xdc
yunion.io/x/onecloud/pkg/hostman.(*SHostService).RunService(0xc000a351e8?)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/host_services.go:85 +0x132
yunion.io/x/onecloud/pkg/cloudcommon/service.(*SServiceBase).StartService(0xc0009512d8)
        /root/go/src/yunion.io/x/onecloud/pkg/cloudcommon/service/services.go:58 +0xdf
yunion.io/x/onecloud/pkg/hostman.StartService(...)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/host_services.go:168
main.main()
        /root/go/src/yunion.io/x/onecloud/cmd/host/main.go:30 +0xec
goroutine 1 [running]:
runtime/debug.Stack()
        /usr/lib/go/src/runtime/debug/stack.go:24 +0x5e
runtime/debug.PrintStack()
        /usr/lib/go/src/runtime/debug/stack.go:16 +0x13
yunion.io/x/log.Fatalf({0x3678c0c, 0x1c}, {0xc0012d7e50, 0x1, 0x1})
        /root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/log/log.go:138 +0x2c
yunion.io/x/onecloud/pkg/hostman.(*SHostService).RunService(0xc000a351e8?)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/host_services.go:86 +0x177
yunion.io/x/onecloud/pkg/cloudcommon/service.(*SServiceBase).StartService(0xc0009512d8)
        /root/go/src/yunion.io/x/onecloud/pkg/cloudcommon/service/services.go:58 +0xdf
yunion.io/x/onecloud/pkg/hostman.StartService(...)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/host_services.go:168
main.main()
        /root/go/src/yunion.io/x/onecloud/cmd/host/main.go:30 +0xec
[fatal 2025-10-22 04:44:48 hostman.(*SHostService).RunService(host_services.go:86)] Host instance init error: Setup OVN Chassis: normalize db host: dns lookup (default-ovn-north) failed: lookup default-ovn-north on 10.96.0.10:53: read udp 10.96.0.10:50335->10.96.0.10:53: read: connection refused
[root@master1 ~]# 

Logs from default-ovn-north:

[root@master1 ~]# kubectl logs default-ovn-north-76469f7946-jd6vc  -n onecloud --tail 100 -f 
/etc/openvswitch/ovnnb_db.db does not exist ... (warning).
Creating empty database /etc/openvswitch/ovnnb_db.db.
/etc/openvswitch/ovnsb_db.db does not exist ... (warning).
Creating empty database /etc/openvswitch/ovnsb_db.db.
Starting ovn-northd.
ovn-northd: 2025-10-22T04:18:34.305Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovn-northd.log
ovsdb-server-nb: 2025-10-22T04:18:34.276Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovsdb-server-nb.log
ovsdb-server-sb: 2025-10-22T04:18:34.296Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovsdb-server-sb.log
ovsdb-server-nb: 2025-10-22T04:18:34.280Z|00002|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.12.4
ovsdb-server-nb: 2025-10-22T04:18:44.289Z|00003|memory|INFO|3164 kB peak resident set size after 10.0 seconds
ovsdb-server-nb: 2025-10-22T04:18:44.289Z|00004|memory|INFO|cells:30 monitors:2 sessions:1
ovsdb-server-nb: 2025-10-22T04:19:00.209Z|00005|jsonrpc|WARN|tcp:10.109.74.25:41550: receive error: Connection reset by peer
ovsdb-server-nb: 2025-10-22T04:19:00.209Z|00006|reconnect|WARN|tcp:10.109.74.25:41550: connection dropped (Connection reset by peer)
ovsdb-server-nb: 2025-10-22T04:19:00.611Z|00007|jsonrpc|WARN|tcp:10.109.74.25:41640: receive error: Connection reset by peer
ovsdb-server-nb: 2025-10-22T04:19:00.611Z|00008|reconnect|WARN|tcp:10.109.74.25:41640: connection dropped (Connection reset by peer)
ovsdb-server-nb: 2025-10-22T04:19:00.800Z|00009|jsonrpc|WARN|tcp:10.109.74.25:41674: receive error: Connection reset by peer
ovsdb-server-nb: 2025-10-22T04:19:00.800Z|00010|reconnect|WARN|tcp:10.109.74.25:41674: connection dropped (Connection reset by peer)
ovsdb-server-sb: 2025-10-22T04:18:34.299Z|00002|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.12.4
ovsdb-server-sb: 2025-10-22T04:18:44.303Z|00003|memory|INFO|3252 kB peak resident set size after 10.0 seconds
ovsdb-server-sb: 2025-10-22T04:18:44.303Z|00004|memory|INFO|cells:401 monitors:3 sessions:13
ovsdb-server-sb: 2025-10-22T04:18:49.260Z|00005|jsonrpc|WARN|tcp:10.40.230.128:43648: receive error: Connection reset by peer
ovsdb-server-sb: 2025-10-22T04:18:49.260Z|00006|reconnect|WARN|tcp:10.40.230.128:43648: connection dropped (Connection reset by peer)
ovsdb-server-sb: 2025-10-22T04:19:04.305Z|00007|memory|INFO|peak resident set size grew 300% in last 20.0 seconds, from 3252 kB to 13024 kB
ovsdb-server-sb: 2025-10-22T04:19:04.305Z|00008|memory|INFO|cells:81208 monitors:3 sessions:13
ovsdb-server-sb: 2025-10-22T04:20:07.283Z|00009|jsonrpc|WARN|tcp:10.40.35.128:57990: receive error: Connection reset by peer
ovsdb-server-sb: 2025-10-22T04:20:07.283Z|00010|reconnect|WARN|tcp:10.40.35.128:57990: connection dropped (Connection reset by peer)
ovsdb-server-sb: 2025-10-22T04:22:54.699Z|00011|jsonrpc|WARN|tcp:10.40.230.128:44186: receive error: Connection reset by peer
ovsdb-server-sb: 2025-10-22T04:22:54.699Z|00012|reconnect|WARN|tcp:10.40.230.128:44186: connection dropped (Connection reset by peer)
ovsdb-server-sb: 2025-10-22T04:23:51.707Z|00013|jsonrpc|WARN|tcp:10.40.35.128:58574: receive error: Connection reset by peer
ovsdb-server-sb: 2025-10-22T04:23:51.707Z|00014|reconnect|WARN|tcp:10.40.35.128:58574: connection dropped (Connection reset by peer)
ovsdb-server-sb: 2025-10-22T04:23:58.721Z|00015|jsonrpc|WARN|tcp:10.40.35.128:58592: receive error: Connection reset by peer
ovsdb-server-sb: 2025-10-22T04:23:58.722Z|00016|reconnect|WARN|tcp:10.40.35.128:58592: connection dropped (Connection reset by peer)
ovsdb-server-sb: 2025-10-22T04:27:15.078Z|00017|jsonrpc|WARN|tcp:10.40.35.128:59088: receive error: Connection reset by peer
ovsdb-server-sb: 2025-10-22T04:27:15.078Z|00018|reconnect|WARN|tcp:10.40.35.128:59088: connection dropped (Connection reset by peer)
ovsdb-server-sb: 2025-10-22T04:29:06.372Z|00019|jsonrpc|WARN|tcp:10.40.230.128:45068: receive error: Connection reset by peer
ovsdb-server-sb: 2025-10-22T04:29:06.372Z|00020|reconnect|WARN|tcp:10.40.230.128:45068: connection dropped (Connection reset by peer)
ovsdb-server-sb: 2025-10-22T04:30:17.419Z|00021|jsonrpc|WARN|tcp:10.40.35.128:59534: receive error: Connection reset by peer
ovsdb-server-sb: 2025-10-22T04:30:17.419Z|00022|reconnect|WARN|tcp:10.40.35.128:59534: connection dropped (Connection reset by peer)

How can I pinpoint the cause of this problem and resolve it? Thanks!
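
For reference, the fatal line in the host log says that the lookup of default-ovn-north was sent to 10.96.0.10:53 and got "connection refused", so a first step might be to confirm what is being queried and whether the name resolves at all from the affected node. The Python sketch below is only an illustration under assumptions: `parse_lookup_error` and `can_resolve` are hypothetical helper names (not part of onecloud), the regular expression assumes the error text has the same shape as the fatal line above, and 10.96.0.10 is presumed, not confirmed, to be the cluster DNS Service address.

```python
import re
import socket

# Assumed shape of the Go DNS error text, taken from the fatal line in the host log:
#   lookup <name> on <server>:<port>: <detail>
LOOKUP_RE = re.compile(
    r"lookup (?P<name>\S+) on (?P<server>[\d.]+):(?P<port>\d+): (?P<detail>.+)$"
)


def parse_lookup_error(line):
    """Return the queried name, DNS server, port and error detail, or None if no match."""
    match = LOOKUP_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["port"] = int(info["port"])
    return info


def can_resolve(name):
    """Return True if the resolver configured where this script runs can resolve `name`."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    fatal = (
        "Host instance init error: Setup OVN Chassis: normalize db host: "
        "dns lookup (default-ovn-north) failed: lookup default-ovn-north on "
        "10.96.0.10:53: read udp 10.96.0.10:50335->10.96.0.10:53: "
        "read: connection refused"
    )
    # Show which name was queried, against which server, and how it failed.
    print(parse_lookup_error(fatal))
```

Running `can_resolve("default-ovn-north")` on ser-a4-1 and on a node whose host pod is healthy would show whether name resolution is broken only on the affected node or cluster-wide.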
