Pyomo. DoE: Sensitivity initialization#3639
Conversation
…pected. Now, I need to add cases when some of the design variables are not changed.
…mo into sensitivity-initialization
… use compute_FIM_metrics() method. Added both the nested for loop and change_one_design_at_a_time argument to compute_FIM_factorial() in doe.py. Everything is working fine.
…gn_values` does not need to be in the same order as the `experiment_inputs` as long as the key matches, we are good. Also, if we don't want to change a design variable, just not passing it as a key in the `design_values` dictionary will do the trick. Tested with the `reactor_example.py`
|
@adowling2 @djlaky, this draft PR is ready for initial review. |
|
Does this PR depend on changes in #3575? |
adowling2
left a comment
There was a problem hiding this comment.
@smondal13 I left some initial comments.
| @@ -0,0 +1,267 @@ | |||
| # ___________________________________________________________________________ | |||
There was a problem hiding this comment.
@smondal13 Why are your creating a new utils file instead of updating the current one? There might be a point of confusion regarding branches in git.
There was a problem hiding this comment.
This issue is resolved now. merged the utils file from my previous PR # 3525.
|
|
||
| return self.fim_factorial_results | ||
|
|
||
| def compute_FIM_factorial( |
There was a problem hiding this comment.
Was this function already in Pyomo.DoE somewhere else?
There was a problem hiding this comment.
No, I created this method. I do not see the same name anywhere else.
There was a problem hiding this comment.
@adowling2, we have compute_FIM_full_factorial(). I have not changed that method, rather I have added a new method. Maybe we can show a deprecation warning for compute_FIM_full_factorial()
There was a problem hiding this comment.
Yes, a depreciation warning sounds reasonable.
There was a problem hiding this comment.
Actually, I am really confused: why did you create a new method in the same file that did effectively the same thing? Does this replace the other method? Is it a superset of the other method? Do we need both?
Also, you might consider renaming this function: I don't know what a FIM factorial is (nor how to compute one). Maybe something like compute_FIM_from_full_factorial_sampling?
There was a problem hiding this comment.
@jsiirola This function is a superset of the already existing compute_FIM_full_factorial() method. So, I am not sure if I should keep it under the same name since the "full" in the method name can be misleading. It can use a full factorial (changing all the decision variables) and a fractional factorial (changing some of the decision variables). This method basically analyzes the sensitivity.
@adowling2, @slilonfe5, @snarasi2, @sscini Can we name it something like compute_sensitivity_at_design_points()? Or similar to the one @jsiirola suggested, compute_FIM_from_factorial_sampling?
There was a problem hiding this comment.
@smondal13 If sensitivity is what you are doing, I think compute_sensitivity_at_design_points is better.
| # design_map_keys so that design_point can be constructed correctly in the loop. | ||
| des_ranges = [design_values[k.name] for k in design_map_keys] | ||
| if change_one_design_at_a_time: | ||
| factorial_points = generate_snake_zigzag_pattern(*des_ranges) |
There was a problem hiding this comment.
Let's brainstorm different sensitivity analysis sequences we might use. We can then define an enum.
Also added test for patter generator
…mo into sensitivity-initialization
…ensitivity-initialization
…f. The code is run by using a simple model.
|
@adowling2 , @djlaky, @blnicho This PR is ready for design review. |
adowling2
left a comment
There was a problem hiding this comment.
Please see my requested changes.
| traversal_scheme="snake_traversal", | ||
| file_name: str = None, | ||
| ): | ||
| """Will run a simulation-based factorial exploration of the experimental input |
There was a problem hiding this comment.
Change "will run" to "performs"
| ---------- | ||
| model : DoE model, optional | ||
| The model to perform the full factorial exploration on. Default: None | ||
| design_vals : dict, |
There was a problem hiding this comment.
I'll let @blnicho also chime in but my initial thoughts are:
The best general practice is to accept component objects / references, not names. It is convenient (for users) to optionally accept names and resolve them internally. So if it's possible, support both, with a clear canonical form.
|
|
||
| return self.factorial_results | ||
|
|
||
| # TODO: Overhaul plotting functions to not use strings |
There was a problem hiding this comment.
Do we still want to overhaul the plotting functions to not use strings? Or do we just want this function to be consistent with our user interface above? In other words, if we use string-based names above, I think it it okay to use string-based names here.
| def draw_factorial_figure( | ||
| self, | ||
| results=None, | ||
| sensitivity_design_variables=None, |
There was a problem hiding this comment.
Let's update this user interface to match the function above. It appears that design variables as specified differently in these two functions. Likewise, this function should do design variable name error checking similar to the function above.
There was a problem hiding this comment.
Made the FIM metrics consistent in both compute_FIM_full_factorial() and compute_FIM_factorial(). Added design variable name error checking.
|
|
||
| Parameters | ||
| ---------- | ||
| covariance_matrix : numpy.ndarray |
There was a problem hiding this comment.
Can this be a square DataFrame? If yes, what type of error checking should we do on the column and row names?
There was a problem hiding this comment.
Good point. We now support square DataFrame input and have added explicit label checks: row labels must be unique, column labels must be unique, and row/column labels must match in the same order. If any of those fail, we raise a ValueError with a specific message.
| # Set the index to the variable names if provided, | ||
| corr_df = ( | ||
| pd.DataFrame(correlation_matrix, index=var_name, columns=var_name) | ||
| if var_name |
There was a problem hiding this comment.
What is this logic doing here?
There was a problem hiding this comment.
This ternary block returns a pd DataFrame if var_name is passed; otherwise, it returns a numpy array.
|
@smondal13 Were you able to address the comments from my review in early February? If yes, I can re-review. Reply here. |
Excluding the |
|
|
||
| return self.fim_factorial_results | ||
|
|
||
| def compute_FIM_factorial( |
There was a problem hiding this comment.
Actually, I am really confused: why did you create a new method in the same file that did effectively the same thing? Does this replace the other method? Is it a superset of the other method? Do we need both?
Also, you might consider renaming this function: I don't know what a FIM factorial is (nor how to compute one). Maybe something like compute_FIM_from_full_factorial_sampling?
| 2, | ||
| ), | ||
| ) | ||
| except: |
There was a problem hiding this comment.
Bare except: is potentially problematic, as it catching everything including both solver errors and generic programming errors. At a minimum, you should include the exception in the log so that a user can get an idea as to why things failed.
There was a problem hiding this comment.
Added the error message inside the logger warning.
| print("\n\n=========Factorial results===========") | ||
| print("Total points:", total_points) | ||
| print("Success count:", success_count) | ||
| print("Failure count:", failure_count) | ||
| print("\n") | ||
| print(res_df) |
There was a problem hiding this comment.
We generally strongly discourage print() within Pyomo. Instead, output should be channeled via loggers or global streams (like sys.stdout) that are easily replaceable.
There was a problem hiding this comment.
I have repaced the print with sys.stdout
| if not isinstance(fixed_design_variables, dict): | ||
| raise ValueError("``fixed_design_variables`` must be a dictionary.") | ||
| if any(not isinstance(name, str) for name in fixed_design_variables.keys()): | ||
| raise ValueError("Fixed design variable names must be strings.") |
There was a problem hiding this comment.
This kind of type checking is very "non-Pythonic" and something I would generally discourage. Python encourages duck-typing: as long as the variable behaves like the desired type, then it should be OK.
There was a problem hiding this comment.
I have removed the strict type checks,
Co-authored-by: John Siirola <jsiirola@users.noreply.github.com>
…gn_vals input; add tests for component key normalization and error handling
… to abs_step/rel_step and update related logic; add test for factorial step formula using design span
…change to abs_step/rel_step in docstring and test case
…p_change and update related comments for clarity
Co-authored-by: John Siirola <jsiirola@users.noreply.github.com>
…mo into sensitivity-initialization
…ccess and update related functions. Run black
…ance error handling. Update related tests for consistency.
…d for improved design space handling
…prove output formatting and add validation for design mode arguments in compute_FIM_factorial.
…e unique row and column labels, and match between them. Add corresponding unit tests for validation.
| rel_change: list = None, | ||
| n_designs: int = 5, | ||
| method="sequential", | ||
| df_settings=(True, None, None, 500), |
There was a problem hiding this comment.
Should I use this tuple? I was reluctant to use 4 different arguments for the pd df setting / configuration
| rel_change: list = None, | ||
| n_design_points: int = None, | ||
| method="sequential", | ||
| return_df=True, |
There was a problem hiding this comment.
Since the method is printing the df depending on this flag, not returning it, does it make more sense to use show_df instead of return_df?
| # Set the index to the variable names if provided, | ||
| corr_df = ( | ||
| pd.DataFrame(correlation_matrix, index=var_name, columns=var_name) | ||
| if var_name |
There was a problem hiding this comment.
This ternary block returns a pd DataFrame if var_name is passed; otherwise, it returns a numpy array.
| factorial_points = snake_traversal_grid_sampling(*design_values) | ||
| elif scheme_enum == SensitivityInitialization.nested_for_loop: | ||
| factorial_points = product(*design_values) | ||
|
|
There was a problem hiding this comment.
Great suggestion, and I implemented that split: traversal generation now lives in a dedicated helper (_materialize_factorial_points).
Current tests already cover traversal behavior through the public API (including invalid scheme errors and explicit nested-for-loop ordering on a 2x2 grid).
|
|
||
| Parameters | ||
| ---------- | ||
| covariance_matrix : numpy.ndarray |
There was a problem hiding this comment.
Good point. We now support square DataFrame input and have added explicit label checks: row labels must be unique, column labels must be unique, and row/column labels must match in the same order. If any of those fail, we raise a ValueError with a specific message.
|
|
||
| return self.fim_factorial_results | ||
|
|
||
| def compute_FIM_factorial( |
There was a problem hiding this comment.
@jsiirola Thanks for the suggestion. I agree and have refactored this in the latest update.
Fixes # .
Summary/Motivation:
Changes proposed in this PR:
compute_FIM_factorial()in Pyomo.DoE for sensitivity initializationsnake_traversal_grid_sampling()andcompute_correlation_matrix()in utils.pyLegal Acknowledgement
By contributing to this software project, I have read the contribution guide and agree to the following terms and conditions for my contribution: