Skip to content

Add column mux ratio for ratio = 1, 2, 4#10018

Open
tnguy19 wants to merge 20 commits intoThe-OpenROAD-Project:masterfrom
tnguy19:add-column-mux
Open

Add column mux ratio for ratio = 1, 2, 4#10018
tnguy19 wants to merge 20 commits intoThe-OpenROAD-Project:masterfrom
tnguy19:add-column-mux

Conversation

@tnguy19
Copy link
Copy Markdown
Contributor

@tnguy19 tnguy19 commented Apr 1, 2026

Summary

Add -column_mux_ratio parameter to generate_ram and generate_ram_netlist with default value = 1.

The column_mux_ratio value indicate the number of words that will share the same physical row, and currently this implementation will work correctly for values = 1, 2, and 4*.

Muxes are implemented using AOI22s for ratio =2 (one AOI22+inverter) and ratio =4 (2 stage-1 AOI22s feeding a stage-2 AOI22).

Some changes to the code to accommodate this feature additions are also added.

Configs tested and passed: RAM8x8, 16x8, 32x6, 64x8, 128x8 with column_mux_ratio = 1, 2, 4

Type of Change

  • New feature

Impact

  • generate_ram and generate_ram_netlist can be use with option -column_mux_ratio with default value = 1. Scripts that do not have column_mux_ratio are not affected
  • Change from using num_words to num_rows to indicate the number of physical row that will be created. For column_mux_ratio >1, num_rows = num_words / column_mux_ratio physical rows are generated. Column mux are placed in the buffer row
  • Change naming instances with format storage_{row}_{bit} to storage_{row}_{word}_{bit} to reflect the placement of multiple words in one row when column_mux_ratio > 1

Verification

  • I have verified that the local build succeeds (./etc/Build.sh).
  • I have run the relevant tests and they pass.
  • My code follows the repository's formatting guidelines.
  • I have signed my commits (DCO).

Related Issues

Fixes #9843

@rovinski

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements support for column muxing (ratios 1, 2, and 4) in the RAM generator. It introduces logic to partition address bits into row and word selection, utilizes AOI22 cells for read-path muxing, and updates the grid placement logic to accommodate the additional mux and selection components. Feedback highlights a potential row misalignment in the decoder column due to the placement order of word selection logic and a need for more robust filler cell calculation in the inverter column to ensure consistent grid height across different address bit configurations.

Comment thread src/ram/src/ram.cpp Outdated
Comment thread src/ram/src/ram.cpp
@tnguy19 tnguy19 force-pushed the add-column-mux branch 3 times, most recently from a024df6 to b4bb84d Compare April 1, 2026 02:20
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Comment thread src/ram/src/ram.cpp Outdated
@tnguy19 tnguy19 force-pushed the add-column-mux branch 2 times, most recently from 60c1025 to f628127 Compare April 1, 2026 12:25
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

clang-tidy review says "All clean, LGTM! 👍"

4 similar comments
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

clang-tidy review says "All clean, LGTM! 👍"

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

clang-tidy review says "All clean, LGTM! 👍"

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

clang-tidy review says "All clean, LGTM! 👍"

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

clang-tidy review says "All clean, LGTM! 👍"

@tnguy19
Copy link
Copy Markdown
Contributor Author

tnguy19 commented Apr 4, 2026

Ready for review @rovinski

Copy link
Copy Markdown
Collaborator

@rovinski rovinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in meeting, please increase the power grid pitch to be at least 4 bits wide (whatever that translates to in um)

Comment thread src/ram/test/make_8x8.tcl Outdated
@@ -13,6 +13,7 @@ generate_ram \
-mask_size 8 \
-word_size 8 \
-num_words 8 \
-column_mux_ratio 1 \
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be needed, the default col/mux ratio should be 1.

Comment thread src/ram/test/make_8x8_mux2_sky130.ok
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of the 8x8 vok files are identical. Just keep the one that's already there and compare against that one.

Comment thread src/ram/src/ram.cpp
Comment on lines +788 to +794
if (num_words % column_mux_ratio != 0) {
logger_->error(RAM,
34,
"num_words ({}) must be divisible by column_mux_ratio ({}).",
num_words,
column_mux_ratio);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO comment about adding support for non-divisible word counts.
It would mean that the last row has some empty spaces instead of bits, but it should be possible and won't force the user to use any more words than they actually want.

Comment thread src/ram/src/ram.cpp Outdated
Comment thread src/ram/src/ram.cpp
Comment thread src/ram/src/ram.cpp
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Comment thread src/ram/src/ram.cpp Outdated
Comment thread src/ram/src/ram.cpp Outdated
Comment thread src/ram/src/ram.cpp Outdated
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Comment thread src/ram/src/ram.cpp Outdated
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 8, 2026

clang-tidy review says "All clean, LGTM! 👍"

@rovinski
Copy link
Copy Markdown
Collaborator

rovinski commented Apr 9, 2026

@tnguy19 CI still failing

@rovinski
Copy link
Copy Markdown
Collaborator

rovinski commented Apr 9, 2026

@gemini-code-assist /review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for column multiplexing (with ratios 1, 2, and 4) in the RAM generator. Key changes include updating the generate, makeSlice, and makeWord functions to incorporate the column_mux_ratio and word_idx parameters, adding logic for shared select nets and AOI muxes for word selection, and integrating a new aoi22_cell_ type. Address inputs have been renamed to addr_rw, and pin placement and grid calculations are adjusted accordingly. A high-severity issue was noted regarding a potential null pointer dereference when deleting port_iter in RamGen::findMasters.

Comment thread src/ram/src/ram.cpp Outdated
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 9, 2026

clang-tidy review says "All clean, LGTM! 👍"

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Comment thread src/ram/src/ram.cpp Outdated
@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Signed-off-by: Thinh Nguyen <nguyenthinh19011@gmail.com>
@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Signed-off-by: Thinh Nguyen <nguyenthinh19011@gmail.com>
@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

…e approach

Signed-off-by: Thinh Nguyen <nguyenthinh19011@gmail.com>
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Comment thread src/ram/src/ram.cpp
Signed-off-by: Thinh Nguyen <nguyenthinh19011@gmail.com>
@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@tnguy19
Copy link
Copy Markdown
Contributor Author

tnguy19 commented Apr 14, 2026

There was an update that introduce PDK-independent pin mapping, some part of the code has been updated to follow this particularly the logic to determine aoi gates used in mux is now also PDK-independent

@tnguy19 tnguy19 requested a review from rovinski April 14, 2026 22:21
Copy link
Copy Markdown
Collaborator

@rovinski rovinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass, don't have time to fully re-review today

Comment thread src/ram/test/regression_tests.tcl Outdated
record_tests {
make_8x8_sky130
make_7x7_nangate45
make_8x8
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no make_8x8 anymore, that got moved to make_8x8_sky130

Comment thread src/ram/src/ram.cpp Outdated
inputs.push_back(p->libertyPort());
}
}
delete port_iter;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use smart pointer and no delete
std::unique_ptr port_iter = liberty->portIterator();

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 38c9cb2

… finder

Signed-off-by: Thinh Nguyen <nguyenthinh19011@gmail.com>
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Comment thread src/ram/src/ram.cpp Outdated
Signed-off-by: Thinh Nguyen <nguyenthinh19011@gmail.com>
@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

…en files

Signed-off-by: Thinh Nguyen <nguyenthinh19011@gmail.com>
@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Comment thread src/ram/src/ram.cpp Outdated
Comment on lines +509 to +511
bool is_inverted
= !port_name.empty()
&& (port_name.back() == 'N' || port_name.back() == 'B');
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to be robust enough. Is there no way to determine an inverted output from STA?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found 2 functions in the sta's Sequential.hh file ( /OpenROAD/src/sta/include/sta/Sequential.hh) that returns the output and inverted output port (output() and outputInv()) . Do you want me to use that to check here?

Copy link
Copy Markdown
Contributor

@braydenlouie braydenlouie Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested Sequential::output() and outputInv() to distinguish Q and Qn ports. Found that these return internal state variable names (IQ/IQN) rather than the actual Liberty port names (Q/QN), so they can't be used directly for port name matching. Also tested using function()->op() to check if the output is a not_ expression to identify QN, but both Q and QN returned the same function op, so that approach is also unreliable. The port iterator gives Q/QN, but neither method can distinguish them without relying on the port name directly.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maliberty @povik do you have any tips on how to identify an inverted vs. non-inverted output of a flip-flop?

Copy link
Copy Markdown
Member

@maliberty maliberty Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to ask the output port for its function and see if it matches up with the internal pos/neg variable. For example in ng45

  cell (DFFRS_X1) {
	ff ("IQ" , "IQN") ...
	pin (Q) {
		direction		: output;
		function		: "IQ";
	pin (QN) {
		direction		: output;
		function		: "IQN";

Comment thread src/ram/src/ram.cpp
// collect all input ports for this cell
std::vector<sta::LibertyPort*> inputs;
std::unique_ptr<sta::LibertyCellPortIterator> port_iter(
std::unique_ptr<sta::ConcreteCellPortIterator> port_iter(
Copy link
Copy Markdown
Collaborator

@rovinski rovinski Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for the change to ConcereteCell?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because it triggers a clang-tidy suggestion from last commit warning: no matching constructor for initialization of 'std::unique_ptrsta::LibertyCellPortIterator' [clang-diagnostic-error]

@rovinski
Copy link
Copy Markdown
Collaborator

@gemini-code-assist /review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements column muxing support (ratios 1, 2, and 4) in the RAM generator, utilizing AOI22 cells for read-path selection and shared select signals to minimize routing congestion. The review identifies critical bugs where multiple read ports are incorrectly shorted together and muxing logic is hardcoded for only the first port. Key feedback includes adding an error check to limit column muxing to single-port configurations, moving redundant gate instantiations out of per-slice loops to optimize the design, and restoring global pin placement settings to avoid unintended side effects.

Comment thread src/ram/src/ram.cpp
vector<vector<dbNet*>> word_output_nets(read_ports);
for (int port = 0; port < read_ports; ++port) {
for (int bit = 0; bit < word_size; ++bit) {
word_output_nets[port].push_back(word_q_nets[word_idx][bit]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This line incorrectly assigns the same intermediate net to all read ports for a given word and bit. This will result in a short circuit between different read ports in the generated netlist when read_ports > 1 and column_mux_ratio > 1.

Comment thread src/ram/src/ram.cpp
Comment on lines +1234 to +1246
vector<vector<dbNet*>> word_q_nets(column_mux_ratio,
vector<dbNet*>(word_size));
if (column_mux_ratio == 1) {
for (int bit = 0; bit < word_size; ++bit) {
word_q_nets[0][bit] = q_outputs_[0][bit]->getNet();
}
} else {
for (int w = 0; w < column_mux_ratio; ++w) {
for (int bit = 0; bit < word_size; ++bit) {
word_q_nets[w][bit] = makeNet("word_q", fmt::format("w{}_b{}", w, bit));
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The word_q_nets structure and initialization do not account for multiple read ports when column_mux_ratio > 1. Currently, it only creates a 2D vector indexed by word and bit, which means all read ports will be shorted to the same intermediate net. This should be a 3D structure indexed by [port][word_idx][bit] to support multi-port RAMs.

Comment thread src/ram/src/ram.cpp
Comment on lines +1306 to +1381
if (column_mux_ratio == 2) {
// mux placement
// col base_col+0: buffer
// col base_col+1: AOI22 + inverter side by side in same cell/column
auto aoi_out = makeNet(prefix, "aoi_out");
auto mux_cell = std::make_unique<Cell>();
makeInst(mux_cell.get(),
prefix,
"aoi",
aoi22_cell_,
{{aoi22_in_a1_, inv_addr[0]},
{aoi22_in_a2_, bit_word_q_nets[0]},
{aoi22_in_b1_, addr_inputs_[0]->getNet()},
{aoi22_in_b2_, bit_word_q_nets[1]},
{aoi22_out_, aoi_out}});
makeInst(mux_cell.get(),
prefix,
"inv",
inv_cell_,
{{inv_ports_[{PortRoleType::DataIn, 0}], aoi_out},
{inv_ports_[{PortRoleType::DataOut, 0}],
q_outputs_[0][global_bit]->getNet()}});
ram_grid_.addCell(std::move(mux_cell), base_col + 1);

} else if (column_mux_ratio == 4) {
// mux placement:
// col base_col+0: buffer
// col base_col+1: s1_AOI_0 (w0+w1)
// col base_col+2: s2_AOI final (even stages so no inverter needed)
// col base_col+3: s1_AOI_1 (w2+w3)
auto s1_out_0 = makeNet(prefix, "s1_out_0");
auto s1_out_1 = makeNet(prefix, "s1_out_1");

// col base_col+1: stage1 AOI for word0+word1
auto s1_cell_0 = std::make_unique<Cell>();
makeInst(s1_cell_0.get(),
prefix,
"s1_aoi_0",
aoi22_cell_,
{{aoi22_in_a1_, inv_addr[0]},
{aoi22_in_a2_, bit_word_q_nets[0]},
{aoi22_in_b1_, addr_inputs_[0]->getNet()},
{aoi22_in_b2_, bit_word_q_nets[1]},
{aoi22_out_, s1_out_0}});
ram_grid_.addCell(std::move(s1_cell_0), base_col + 1);

// col base_col+3: stage1 AOI for word2 + word3
// NOTE: must be placed before s2 so s1_out_1 net exists for s2 input
auto s1_cell_1 = std::make_unique<Cell>();
makeInst(s1_cell_1.get(),
prefix,
"s1_aoi_1",
aoi22_cell_,
{{aoi22_in_a1_, inv_addr[0]},
{aoi22_in_a2_, bit_word_q_nets[2]},
{aoi22_in_b1_, addr_inputs_[0]->getNet()},
{aoi22_in_b2_, bit_word_q_nets[3]},
{aoi22_out_, s1_out_1}});
ram_grid_.addCell(std::move(s1_cell_1), base_col + 3);

// col base_col+2: stage2 AOI combining stage 1 outputs and drives Q
// directly
auto s2_cell = std::make_unique<Cell>();
makeInst(s2_cell.get(),
prefix,
"s2_aoi",
aoi22_cell_,
{{aoi22_in_a1_, inv_addr[1]},
{aoi22_in_a2_, s1_out_0},
{aoi22_in_b1_, addr_inputs_[1]->getNet()},
{aoi22_in_b2_, s1_out_1},
{aoi22_out_, q_outputs_[0][global_bit]->getNet()}});
ram_grid_.addCell(std::move(s2_cell), base_col + 2);
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The column mux implementation using AOI22 cells is hardcoded to only handle the first read port (q_outputs_[0]). If read_ports > 1, the additional ports will not have the necessary muxing logic instantiated, leading to an incorrect or incomplete netlist for those ports.

Comment thread src/ram/src/ram.cpp
Comment on lines +1008 to +1013
if (column_mux_ratio != 1 && column_mux_ratio != 2 && column_mux_ratio != 4) {
logger_->error(RAM,
33,
"The ram generator currently only supports column_mux_ratio "
"values of 1, 2, or 4.");
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since the current implementation of column muxing only supports single-port RAMs correctly, an explicit error check should be added to prevent generating invalid netlists for multi-port configurations with column_mux_ratio > 1. This ensures critical failure conditions are logged as errors rather than proceeding with an invalid design.

  // error checking for column_mux_ratio
  if (column_mux_ratio != 1 && column_mux_ratio != 2 && column_mux_ratio != 4) {
    logger_->error(RAM,
                   33,
                   "The ram generator currently only supports column_mux_ratio "
                   "values of 1, 2, or 4.");
  }

  if (column_mux_ratio > 1 && read_ports > 1) {
    logger_->error(RAM,
                   36,
                   "Column muxing is currently only supported for single-port RAMs.");
  }
References
  1. Unexpected failures during the flow should be logged as an error, not a warning, to ensure critical failure messages are not missed by the user.

Comment thread src/ram/src/ram.cpp
Comment on lines +216 to +224
write_sel = makeNet(prefix, "write_sel");
makeInst(sel_cell.get(),
prefix,
"word_and",
and2_cell_,
{{and2_ports_[{PortRoleType::DataIn, 0}], selects[0]},
{and2_ports_[{PortRoleType::DataIn, 1}], word_select},
{and2_ports_[{PortRoleType::DataOut, 0}], write_sel}});
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The word_and instance, which combines row_select and word_select, is being created inside makeSlice. Since word_select and selects[0] are constant for all slices within a word, this gate is redundantly instantiated for every slice. Following the principle of analyzing the entire loop structure, this operation should be moved to the beginning of the word generation (makeWord) and the resulting net passed to makeSlice to reduce area and power.

References
  1. When analyzing code within a loop, consider the entire loop structure to identify and move redundant operations from the loop body to the beginning of the loop.

Comment thread src/ram/src/ram.cpp
auto pin_tech = block_->getDb()->getTech();
io_placer_->addHorLayer(pin_tech->findLayer(hor_name));
io_placer_->addVerLayer(pin_tech->findLayer(ver_name));
io_placer_->getParameters()->setCornerAvoidance(0);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Setting cornerAvoidance to 0 on the io_placer_ object may have side effects on subsequent pin placement tasks if the object is shared or reused. It is safer to save the original value and restore it after runHungarianMatching.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be ignored because the OR session will not be reused.

@rovinski
Copy link
Copy Markdown
Collaborator

@tnguy19 I will give you the option of either:

  1. Fixing the code to handle when read_ports > 1 && col_mux_ratio > 1
  2. Throwing an error that it's unsupported to get this merged, then addressing it in a future PR.

Signed-off-by: Thinh Nguyen <nguyenthinh19011@gmail.com>
@github-actions github-actions bot added size/L and removed size/XL labels Apr 17, 2026
@tnguy19
Copy link
Copy Markdown
Contributor Author

tnguy19 commented Apr 17, 2026

An error check for when read_ports > 1 && col_mux_ratio > 1 will be added for now since multiple r/w port supports are also being worked on at the moment. I'll add support for col/mux ratio with multiple read ports when that feature is completed

@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@rovinski
Copy link
Copy Markdown
Collaborator

Waiting to find out if there's a better way to distinguish register outputs before approving, but that's the last thing.

Signed-off-by: Thinh Nguyen <nguyenthinh19011@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ram: column/mux support

5 participants