Skip to content

Fix access point cleanup on instance removal#10179

Open
openroad-ci wants to merge 5 commits intoThe-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:secure-odb-access-point-cleanup
Open

Fix access point cleanup on instance removal#10179
openroad-ci wants to merge 5 commits intoThe-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:secure-odb-access-point-cleanup

Conversation

@openroad-ci
Copy link
Copy Markdown
Collaborator

Summary

  • Found odb crash during new Resizer architecture testing.
  • Clear ITerm preferred access point back-references when destroying an instance.
  • Clear and de-duplicate MPin pin-access entries before destroying access points.
  • Add ODB tests that cover stale AP/ITerm references and repeated pin-access cleanup.

Problem

  • ORFS nangate45/gcd crashed in 5_1_grt while repair_timing triggered parasitic updates, incremental global routing, and DRT dirty pin-access database updates.
  • The crash was reproduced with FLOW_VARIANT=legacy_mt and RSZ_POLICY=legacy_mt. The same 4_cts.odb/.sdc input passed with RSZ_POLICY=legacy, which ruled out an already-corrupt input database.
  • The minimal failing move sequence for legacy_mt was sizeup,buffer, so the multi-threaded legacy repair flow exposed stale OpenDB access point bookkeeping during repeated ECO pin-access updates.

Symptom:

repair_timing -setup_margin 0 -hold_margin 0 -repair_tns 100 -verbose
[INFO RSZ-0100] Repair move sequence: UnbufferMove SizeUpMove SwapPinsMove BufferMove CloneMove SplitLoadMove
...
openroad: .../src/odb/src/db/dbTable.inc:47:
T* odb::dbTable<T, page_size>::getPtr(odb::dbId<T>) const [with T = odb::_dbITerm; ...]:
Assertion `p->offset_in_bytes_ & kAllocBit' failed.
Signal 6 received

Call stack:

odb::dbMPin::clearPinAccess(int)
odb::dbMaster::clearPinAccess(int)
drt::io::Writer::updateDbAccessPoints(odb::dbBlock*, odb::dbTech*)
drt::io::Writer::updateDb(odb::dbDatabase*, drt::RouterConfiguration*, bool, bool)
drt::TritonRoute::updateDirtyPAData()
utl::CallBackHandler::triggerOnPinAccessUpdateRequired()
grt::GlobalRouter::updateDirtyRoutes(bool)
grt::IncrementalGRoute::updateRoutes(bool)
est::EstimateParasitics::updateParasitics()
...
rsz::Resizer::repairSetup(...)
rsz::repair_setup(...)

Root-cause

OpenDB access point cleanup did not fully maintain the bidirectional references among MPin pin-access slots, ITerm preferred access points, and AccessPoint ITerm back-references.

  • dbMPin::clearPinAccess() destroyed the dbAccessPoint objects stored in pin->aps_[pin_access_idx], but it did not clear the MPin AP id vector for that pin-access slot. A later update of the same pin-access index could revisit stale AP ids and try to destroy already-destroyed access points.
  • dbInst::destroy() tried to remove AP back-references before ITerm deletion, but it used iterm->getAccessPoints(). That API derives APs from the instance's current pin-access index, so it can miss preferred APs stored in the ITerm aps_ map for other pin-access indices.
  • When such an ITerm was destroyed, stale ITerm ids could remain in _dbAccessPoint::iterms_. Later, dbAccessPoint::destroy() iterated _ap->iterms_ and dereferenced an already-destroyed ITerm through block->iterm_tbl_->getPtr(), hitting the OpenDB allocation assert.

Solution

  • In dbMPin::clearPinAccess(), move the AP ids out of pin->aps_[pin_access_idx] before destroying the AP objects. This leaves the MPin slot empty immediately, prevents stale id reuse on repeated cleanup, and avoids iterator invalidation while dbAccessPoint::destroy() updates MPin bookkeeping.
  • Sort and de-duplicate the copied AP id list before destruction so duplicated AP bookkeeping cannot cause a double destroy.
  • In dbInst::destroy(), replace the current-pin-access-index cleanup with iterm->clearPrefAccessPoints(). That method walks the ITerm's preferred AP map directly and removes this ITerm id from every referenced AP back-reference list before the ITerm object is destroyed.
  • Add targeted TestAccessPoint regression coverage for idempotent MPin pin-access cleanup, duplicate AP ids in an MPin pin-access slot, and stale AP back-reference removal when an instance is destroyed.

Testing

  • cmake --build build --target TestAccessPoint -j4
  • ./build/src/odb/test/cpp/TestAccessPoint --gtest_filter='*clear_mpin_access_points_is_idempotent:*clear_mpin_access_points_removes_duplicate_references:*destroy_inst_removes_access_point_back_references:*test_default'
  • bazel test //src/odb/test/cpp:TestAccessPoint

The new stale back-reference test fails on origin/master HEAD a00a5750166b before the fix and passes with this branch.

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@jhkim-pii
Copy link
Copy Markdown
Contributor

jhkim-pii commented Apr 18, 2026

I found this issue during new Resizer architecture testing.

@eder-matheus I added you as a reviewer because I found that you edited the related codes a week ago.
Please add an appropriate review if you are not the right person. Thank you in advance.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors access point management in dbInst and dbMPin to ensure idempotency and correctly handle duplicate references. Key changes include updating dbMPin::clearPinAccess to sort and deduplicate access points before destruction and replacing manual cleanup in dbInst with a centralized method call. New unit tests were added to verify these improvements. The review feedback suggests hardening the index check in dbMPin::clearPinAccess against negative values and using structured bindings with std::ranges::unique to enhance code readability.

Comment thread src/odb/src/db/dbMPin.cpp Outdated
@@ -153,8 +155,11 @@ void dbMPin::clearPinAccess(const int pin_access_idx)
if (pin->aps_.size() <= pin_access_idx) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The comparison between pin->aps_.size() (which is unsigned) and pin_access_idx (which is a signed int) can lead to unexpected behavior if pin_access_idx is negative. While it currently works "by accident" for -1 (as it promotes to a large unsigned value), it is safer and clearer to explicitly check for negative values or change the parameter type to uint32_t to match dbInst::getPinAccessIdx().

Suggested change
if (pin->aps_.size() <= pin_access_idx) {
if (pin_access_idx < 0 || pin->aps_.size() <= static_cast<uint32_t>(pin_access_idx)) {

Comment thread src/odb/src/db/dbMPin.cpp Outdated
dbVector<dbId<_dbAccessPoint>> aps;
aps.swap(pin->aps_[pin_access_idx]);
std::ranges::sort(aps);
aps.erase(std::ranges::unique(aps).begin(), aps.end());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using structured bindings with std::ranges::unique would improve readability by avoiding the explicit call to .begin() on the returned subrange.

  auto [unique_end, original_end] = std::ranges::unique(aps);
  aps.erase(unique_end, original_end);

@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Comment thread src/odb/test/cpp/BUILD
srcs = [
"TestAccessPoint.cpp",
],
features = ["-layering_check"],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

@maliberty
Copy link
Copy Markdown
Member

@codex review

@maliberty
Copy link
Copy Markdown
Member

Why do we have duplicate APs in the first place?

@maliberty maliberty requested a review from osamahammad21 April 18, 2026 14:22
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Swish!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown
Member

@osamahammad21 osamahammad21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only effective change of this PR is clearing the MPin's access points at the end of dbMPin::clearPinAccess
However, the other changes and the test case seem to be defensive, which I generally don't prefer.

dbInst::destroy() tried to remove AP back-references before ITerm deletion, but it used iterm->getAccessPoints(). That API derives APs from the instance's current pin-access index, so it can miss preferred APs stored in the ITerm aps_ map for other pin-access indices.

What do you mean by other pin-access indices? Each instance has one and only one pin access index, so its iterms.

Sort and de-duplicate the copied AP id list before destruction so duplicated AP bookkeeping cannot cause a double destroy.

There should never be duplicate APs. If there is, then this needs to be addressed.

Comment thread src/odb/src/db/dbMPin.cpp
Comment on lines +156 to 159
if (pin_access_idx < 0
|| pin->aps_.size() <= static_cast<std::size_t>(pin_access_idx)) {
return;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what made you do this change? It is correct, but I wonder if there is any call that uses a negative pin_access_idx which would be more concerning.
Also, I think the more correct approach to make pin_access_idx argument a size_t or unsigned int instead.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm, just found it's ai review triggered.

Comment thread src/odb/src/db/dbInst.cpp
}
}
}
iterm->clearPrefAccessPoints();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this is a more efficient approach, the outcome theoretically should be exactly the same. The preferred access point of an iterm is always a subset of its total access points which is returned by getAccessPoints

Comment thread src/odb/src/db/dbMPin.cpp
Comment on lines +160 to 168
dbVector<dbId<_dbAccessPoint>> aps;
aps.swap(pin->aps_[pin_access_idx]);
std::ranges::sort(aps);
const auto duplicate_aps = std::ranges::unique(aps);
aps.erase(duplicate_aps.begin(), duplicate_aps.end());
for (const dbId<_dbAccessPoint>& ap : aps) {
odb::dbAccessPoint::destroy(
(odb::dbAccessPoint*) block->ap_tbl_->getPtr(ap));
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically, there shouldn't be duplicate aps. This looks a bit defensive. I would prefer if such a case exists a crash rather than passing unnoticed.

Suggested change
dbVector<dbId<_dbAccessPoint>> aps;
aps.swap(pin->aps_[pin_access_idx]);
std::ranges::sort(aps);
const auto duplicate_aps = std::ranges::unique(aps);
aps.erase(duplicate_aps.begin(), duplicate_aps.end());
for (const dbId<_dbAccessPoint>& ap : aps) {
odb::dbAccessPoint::destroy(
(odb::dbAccessPoint*) block->ap_tbl_->getPtr(ap));
}
auto& aps = pin->aps_[pin_access_idx];
for (const auto& ap : aps) {
odb::dbAccessPoint::destroy(
(odb::dbAccessPoint*) block->ap_tbl_->getPtr(ap));
}
aps.clear();

@osamahammad21
Copy link
Copy Markdown
Member

The only effective change of this PR is clearing the MPin's access points at the end of dbMPin::clearPinAccess However, the other changes and the test case seem to be defensive, which I generally don't prefer.

dbInst::destroy() tried to remove AP back-references before ITerm deletion, but it used iterm->getAccessPoints(). That API derives APs from the instance's current pin-access index, so it can miss preferred APs stored in the ITerm aps_ map for other pin-access indices.

What do you mean by other pin-access indices? Each instance has one and only one pin access index, so its iterms.

Sort and de-duplicate the copied AP id list before destruction so duplicated AP bookkeeping cannot cause a double destroy.

There should never be duplicate APs. If there is, then this needs to be addressed.

@jhkim-pii Actually after checking the code, I found that erasing the mpin's aps list should be done by dbAccessPoint::destroy. I'll try to reproduce and investigate further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants