It’s currently trivial to scrape any reachable node’s AddrMan (a database of all nodes known to them along with the timestamps).
We do have a limit of one GETADDR per connection, but a spy can disconnect and reconnect even from the same IP, and send GETADDR again and again.
Since we respond with 1,000 random records at most, depending on the AddrMan size it takes probably up to 100 requests for an spy to make sure they scraped (almost) everything.
I even have a script for that. It is totally doable within couple minutes.
Then, with some extra protocol knowledge a spy can infer the direct peers of the victim, and other topological stuff.
I suggest to cache responses to GETADDR on a daily basis, so that an attacker gets at most 1,000 records per day, and can’t track the changes in real time. I will be following up with more improvements to addr relay privacy, but this one alone is a very effective. And simple!
I doubt any of the real software does reconnect to get new addrs from a given peer, so we shouldn’t be cutting anyone.
I also believe it doesn’t have any negative implications on the overall topology quality. And the records being “outdated” for at most a day doesn’t break any honest assumptions either.
This is a backport of core#19335 [1/5]