Skip to content

Conversation

@yadij
Copy link
Contributor

@yadij yadij commented Sep 8, 2025

No description provided.

@yadij yadij requested a review from kinkie September 8, 2025 02:22
Copy link
Contributor

@kinkie kinkie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that an API to emit a SBuf from an IP Address would belong to Ip::Address, or at least a free function.
Not necessarily a blocker for this PR, but an XXX would make sense if there is consensus it would be a good idea

Comment on lines +93 to +105
SBuf key;
auto *buf = key.rawAppendStart(MAX_IPSTRLEN);
const auto len = ip.toHostStr(buf, MAX_IPSTRLEN);
key.rawAppendFinish(buf, len);
return key;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I feel like this would belong better to the Ip::Address class.

Copy link
Contributor

@rousskov rousskov Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I recommend addressing my change request about ipList key type before resuming work on this function.

Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needless to say, I support replacing a dlink_list member.

/** what ip addresses has this user been seen at?, plus a list length cache */
dlink_list ip_list;
/// what IP addresses has this user been seen at?
ClpMap<SBuf, Ip::Address> ipList;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand what this map key (represented by SBuf type) is. In other words, I do not understand what ipList maps to an IP address. Please clarify.

PR BuildIpKey() code suggests that the key is an IP address (represented by Ip::Address), but that does not help answering the above question and raises another: Why convert Ip::Address to SBuf instead of using Ip::Address itself?

Copy link
Contributor Author

@yadij yadij Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key is how the map is searched. In this cache it is a representation of the IP (only). The stored value includes details like address family which are not relevant to the map lookup.

In theory we could use struct in6_addr as a cache key (I tried). However the memory management logic of ClpMap need to use some methods of STL container API (SBuf provides those).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key is ... a representation of the IP (only). The stored value includes details like address family which are not relevant to the map lookup.

If this map key is an IP address, we should use Ip::Address instead of SBuf for its type (more on that below).

However, that raises a second question: If the key is an IP address and the value is the same IP address, then why map X to X?! AFAICT, this PR proposes to use a map for what is really a set of unique IP addresses (with LRU-driven expiration and other management features implemented by ClpMap). If so, then we should refactor to clarify this ClpMap abuse and reduce its overheads (unless you want to implement a ClpSet in a prerequisite PR). I can help with that refactoring, but I want to make sure we are on the same page here before we discuss specific code adjustments.

In theory we could use struct in6_addr as a cache key (I tried).

I do not think we should use in6_addr directly because high-level Squid code uses Ip::Address to represent IP addresses. This authentication cache is high-level code (for this discussion context). It is regrettable that Ip::Address contains port information as well, but we can (and should) ignore that extra information in this PR context.

Can Ip::Address be uses as a ClpMap key type already? If not, what APIs do we need to add or adjust?

Copy link
Contributor Author

@yadij yadij Dec 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can Ip::Address be uses as a ClpMap key type already? If not, what APIs do we need to add or adjust?

Per ClpMap.h documentation:

/// Key must come with std::hash<Key> and std::equal_to<Key> instantiations.
/// Key::length() must return the number of memory bytes in use by the key.
  • essentially any STL container type, or custom types implementing the STL container API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that the minor questions discussed below are secondary to the primary concern raised in this change request. Again, we need to "make sure that we are on the same page here before we discuss specific code adjustments". Please see "If the key is an IP address and the value is the same IP address, then why map X to X?!" paragraph for more details.


Alex: Can Ip::Address be uses as a ClpMap key type already? If not, what APIs do we need to add or adjust?

/// Key must come with std::hash<Key> and std::equal_to<Key> instantiations.

std::hash<Key> should be easy to provide.

I hope that std::equal_to<Key> can be provided without significant problems as well, despite Ip::Address idiosyncrasies related to address equality, but we will need to tread carefully there. Some work may be required.

/// Key::length() must return the number of memory bytes in use by the key.

I suggest investigating replacing the existing k.length() call with MemoryUsedBy(k) to improve/fix this overly restrictive interface.

for (auto i = other.entries_.rbegin(); i != other.entries_.rend(); ++i) {
if (!i->expired()) {
const auto ours = find(i->key);
const auto newTtl = (ours != index_.cend() ? max(i->expires, ours->second->expires) : i->expires) - squid_curtime;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this merge() method survives, we should try to refactor so that its code does not convert expiration values to TTL values (here) and then back to the original expiration value (in Entry constructor). In other words, there should be no squid_curtime involvement; just expires manipulation.

/// * Entries which have expired will be ignored.
/// * When an key collision occurs the existing entry will be replaced
/// and the new entry given the largest of the two TTL values.
void merge(const ClpMap<Key, Value, MemoryUsedBy> &other);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this method to importFresh(). There are many reasonable ways to merge two maps, and we should give this specific merging algorithm a more specific name. While renaming, please avoid the term "list" when mentioning this new method argument -- ClpMap is an associative container (i.e. a map) rather than a list.

Copy link
Contributor Author

@yadij yadij Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that name is any better.

  • import has just as many "reasonable ways to import" variations as merge does, and
  • fresh has implications about the timing information which may be false (already expired/stale, or earlier-than-current expiry time) due to helper delays and parallel transactions.

If you want to be explicit about what operation the merge is doing. How about addOrUpdate() ?

FWIW, I have tried to implement std::list::merge() behaviour with only small changes appropriate for ClpMap LRU sorting and uniqueness.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import has just as many "reasonable ways to import" variations as merge does, and

Yes, and that is why nobody is proposing to name this method import(). mergeFresh() or addFresh() would work.

fresh has implications about the timing information which may be false (already expired/stale, or earlier-than-current expiry time) due to helper delays and parallel transactions.

This code does not add expired entries (from this code point of view), so "fresh" is very appropriate IMO.

I still think importFresh() is better than addOrUpdate() [because it more clearly indicates which map dominates], but addOrUpdate() is better than merge(). Using either of those two names will address this change request.

"Update" clashes with the method description that promises to replace (rather than update) existing same-key entries. Please rephrase that bullet if you decide to go with addOrUpdate().

I have tried to implement std::list::merge() behaviour with only small changes appropriate for ClpMap LRU sorting and uniqueness.

Those "small changes" are actually what differentiates the proposed method from other possible merging methods. That "sorting and uniqueness" is why we use ClpMap instead of std::map (and its merge() method) here. The name should inform the caller which of the two maps dominates as far as LRU replacements are concerned. If we were to call this merge(), then the method would have to truly merge (i.e. interleave as needed to preserve relative use order) LRU indexes instead (and our ClpMap does not have enough information to do that IIRC).

debugs(29, 5, "auth_user '" << from << "' into auth_user '" << this << "'.");

// combine the helper response annotations. Ensuring no duplicates are copied.
notes.appendNewOnly(&from->notes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we ignore any same-key annotations found from. In merge() below, we use same-key annotations in from to replace the ones in this. If this difference is intentional, please add a C++ comment to highlight and explain the difference in merging two containers (notes and ipList).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you are equating annotations with IP addresses. They are different, their handling is different.

}
}
}
ipList.merge(from->ipList);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you have tested this new code with both found and !found cases (using old code terminology). Please apply the same test to the old code to explain why the found case avoided the infinite while (from->ip_list.head != nullptr) loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am relying on the fact that we exhaustively audited and tested ClpMap before it merged not to have that type of fundamental error. All this Auth::User code does now is call add(IP, TTL) whenever a login occurs.

Comment on lines +93 to +105
SBuf key;
auto *buf = key.rawAppendStart(MAX_IPSTRLEN);
const auto len = ip.toHostStr(buf, MAX_IPSTRLEN);
key.rawAppendFinish(buf, len);
return key;
Copy link
Contributor

@rousskov rousskov Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I recommend addressing my change request about ipList key type before resuming work on this function.

src/auth/User.cc Outdated
username_(nullptr),
requestRealm_(aRequestRealm)
requestRealm_(aRequestRealm),
ipList(256*1024) // XXX: figure a more reasonable value than ~256KB cache per-username
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was no limit before this PR, right? In PR description, please disclose this functionality change and discuss its expected effects.

Are there any other Squid functionality changes in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct on no limit. I believe this will help reduce memory consumption under some DoS conditions.

Though as per the code note I am unsure if this is reasonable - it seems like a lot of memory per-user but it turns out entries are 211 bytes, so this is only ~1200 unique IPs per user. We also have to factor in IPv6 address "anonymization" which rotates the public-IPv6 every 5 minutes - amounting to ~300 IP/User/device/day.

@rousskov rousskov added the S-waiting-for-author author action is expected (and usually required) label Sep 8, 2025
@yadij yadij force-pushed the cleanup-dlink-authuser-1 branch from 71d5ba6 to 5dad117 Compare September 12, 2025 09:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-for-author author action is expected (and usually required)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants