Page MenuHomePhabricator

[chronik-client] Implement keepAlive to preserve longlived connections
ClosedPublic

Authored by bytesofman on Nov 21 2023, 19:45.

Details

Summary

Left to its own devices, websockets will disconnect in ~45s if no information is being exchanged between client and server.

Standard practice is to send ping msgs from the client, which the server responds to with pong. These methods are usually standardized.

Such methods are already implemented in chronik-client both in nng and in-node. It is possible to manually implement a ping-pong keepAlive at the app level, instead of in this library. However, it makes sense to do it in chronik-client, which already has a websocket manager that performs other typical websocket mgmt tasks like autoReconnect. It's also easiest to clearInterval on the ping method inside the existing websocket manager, which is already handling onopen and onclose methods.

Traditional TCP keepalive is not practical for websocket connections:
https://stackoverflow.com/questions/23238319/websockets-ping-pong-why-not-tcp-keepalive
https://websockets.readthedocs.io/en/stable/topics/timeouts.html

A ping interval of 30s was selected based on testing connections with NNG chronik websocket. Running for 5 min, a ping interval of 30s showed no disconnects (vs no ping interval would show 3-5). Disconnects were observed with a ping interval of 45s.

This diff does not prevent disconnects which may still happen for a variety of network reasons. App developers are expected to handle the disconnect event and recovery methods to catch missed events.

Test Plan

npm test, tests pass, terminal does not hang (confirming the interval is cleared by ws.close())

Diff Detail

Repository
rABC Bitcoin ABC
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Fabien added inline comments.
modules/chronik-client/README.md
74 ↗(On Diff #43204)

What do you think about making this an integer that holds the interval, and use 0 to disable the feature ? This makes it possible for various use cases to adjust the interval. You can always set a 30s default if unset, if you find this useful.

modules/chronik-client/README.md
74 ↗(On Diff #43204)

This would probably be a better feature, esp for power users who are (probably) most users of chronik-client, but would take some validation.

I don't think we can expect the avg chronik-client user to know what the interval value should be. Once we have to validate for "must be at least 20 seconds" ... might as well just fix it at a value that is working well empirically.

If we see that the number needs to be changed frequently or 30s is not effective in all use cases, can add this later.

I don't think we can expect the avg chronik-client user to know what the interval value should be. Once we have to validate for "must be at least 20 seconds" ... might as well just fix it at a value that is working well empirically.

Having a configurable value doesn't prevent you from using a default that works out of the box.

If we see that the number needs to be changed frequently or 30s is not effective in all use cases, can add this later.

Fair enough.

This revision is now accepted and ready to land.Nov 22 2023, 08:52

gate clearinterval logic by keepalive param, reset interval to undefined after it is cleared