Under these circumstances:
I'm getting the following panic:
panic: close of closed channel

goroutine 2849 [running]:
github.com/samuel/go-zookeeper/zk.(*Conn).Close(0xc420795180)
    github.com/samuel/go-zookeeper/zk/conn.go:253 +0x47
github.com/curator-go/curator.(*handleHolder).internalClose(0xc4203058f0, 0xc420302470, 0x0)
    github.com/curator-go/curator/state.go:136 +0x8d
github.com/curator-go/curator.(*handleHolder).closeAndReset(0xc4203058f0, 0xc42587cd00, 0x1e)
    github.com/curator-go/curator/state.go:122 +0x2f
github.com/curator-go/curator.(*connectionState).reset(0xc420302420, 0x1b71d87, 0xf)
    github.com/curator-go/curator/state.go:234 +0x55
github.com/curator-go/curator.(*connectionState).handleExpiredSession(0xc420302420)
    github.com/curator-go/curator/state.go:351 +0xd9
github.com/curator-go/curator.(*connectionState).checkState(0xc420302420, 0xffffff90, 0x0, 0x0, 0xc425ed2600, 0xed0e5250a)
    github.com/curator-go/curator/state.go:318 +0x9c
github.com/curator-go/curator.(*connectionState).process(0xc420302420, 0xc425ed2680)
    github.com/curator-go/curator/state.go:299 +0x16d
created by github.com/curator-go/curator.(*Watchers).Fire
    github.com/curator-go/curator/watcher.go:64 +0x96
This is the detailed sequence of events:
Goroutine A: s.ReregisterAll() -> Conn() -> checkTimeout() -> reset (because 1 minute has elapsed) -> closeAndReset() -> conn.Close(), which can block for a second.

Goroutine B: zk.StateExpired (the zk cluster sends this because it considers this client dead, since it didn't ping during 2.) -> reset -> closeAndReset() -> conn.Close(), which causes a panic because the earlier conn.Close() already closed the connection's c.shouldQuit channel AND s.zooKeeper.getZookeeperConnection was never called by goroutine A, because it was blocking for the second, so there's no new connection.

A naive solution I tried is to just use a mutex on reset, but now I'm getting helper.GetConnectionString() equal to an empty string. What's the best way to avoid this crash and get back into a good state when the client loses and then regains network connectivity? Should the fix be in github.com/samuel/go-zookeeper, so that it does not let you close an already closed connection?
(I've filed this issue here, but the project seems to be lacking in terms of discussion so I'm asking on SO.)
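To make the last question concrete: one way a Close method can tolerate a second call is to guard the channel close with sync.Once. This is only a sketch under assumptions. The quitter type, newQuitter, and closeConcurrently are hypothetical names invented for illustration; only the shouldQuit field name is taken from the question, and nothing here is the library's actual code.

```go
package main

import "sync"

// quitter is a hypothetical stand-in for a connection that signals
// shutdown by closing a channel. It is not the go-zookeeper type.
type quitter struct {
	shouldQuit chan struct{}
	closeOnce  sync.Once
}

func newQuitter() *quitter {
	return &quitter{shouldQuit: make(chan struct{})}
}

// Close closes shouldQuit at most once, so a second caller returns
// normally instead of panicking with "close of closed channel".
func (q *quitter) Close() {
	q.closeOnce.Do(func() { close(q.shouldQuit) })
}

// closeConcurrently calls Close from n goroutines and reports whether
// any of those calls panicked.
func closeConcurrently(q *quitter, n int) (panicked bool) {
	var wg sync.WaitGroup
	var mu sync.Mutex
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() {
				if r := recover(); r != nil {
					mu.Lock()
					panicked = true
					mu.Unlock()
				}
			}()
			q.Close()
		}()
	}
	wg.Wait()
	return panicked
}
```

With this shape, goroutine A and goroutine B could both reach Close without crashing. It would not by itself address the other half of the problem described above, namely that no new connection exists yet when the second caller arrives.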
zk.Conn has a State() method that returns an enum "State", which is one of the following:
type State int32

const (
    StateUnknown           State = -1
    StateDisconnected      State = 0
    StateConnecting        State = 1
    StateAuthFailed        State = 4
    StateConnectedReadOnly State = 5
    StateSaslAuthenticated State = 6
    StateExpired           State = -112

    StateConnected  = State(100)
    StateHasSession = State(101)
)
What state is "conn" in when goroutine B calls conn.Close()?
A possible solution would be to add a switch to goroutine B so that it does not call conn.Close() when conn.State() returns zk.StateConnecting.
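A rough sketch of that idea, written on the caller's side. Everything here is an assumption made for illustration: State and StateConnecting are local copies of the listing above, and conn, holder, closeIfSafe, and fakeConn are hypothetical names, not curator's or go-zookeeper's API. Whether the connection really reports StateConnecting at that moment is not established in this thread. Because a bare state check is check-then-act and can still race, the sketch also detaches the connection under a mutex so that only one caller can ever close it.

```go
package main

import "sync"

// State is a local copy of the zk.State shape, for this sketch only.
type State int32

// StateConnecting uses the same value as in the listing above.
const StateConnecting State = 1

// conn is a hypothetical interface with the two methods the sketch needs.
type conn interface {
	State() State
	Close()
}

// holder is a hypothetical wrapper that owns the current connection.
type holder struct {
	mu sync.Mutex
	c  conn
}

// closeIfSafe closes the held connection unless there is none or it is
// still connecting. It reports whether Close was actually called.
func (h *holder) closeIfSafe() bool {
	h.mu.Lock()
	c := h.c
	if c == nil || c.State() == StateConnecting {
		h.mu.Unlock()
		return false
	}
	h.c = nil // detach so a second caller cannot close the same connection
	h.mu.Unlock()
	c.Close() // may block, so it runs outside the lock
	return true
}

// fakeConn is a test double that counts Close calls.
type fakeConn struct {
	state  State
	closes int
}

func (f *fakeConn) State() State { return f.state }
func (f *fakeConn) Close()       { f.closes++ }
```

Calling closeIfSafe twice on a holder with an established connection closes it once and returns false the second time; a holder whose connection is still connecting is left untouched. The caller would still need to install a fresh connection afterwards, which this sketch leaves out.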