From 74c4567eed54e7f5c2ccfb576c610ca38ac8e962 Mon Sep 17 00:00:00 2001
From: cassmasschelein <cassandramasschelein@gmail.com>
Date: Tue, 7 Apr 2026 00:16:42 +0200
Subject: [PATCH 01/14] [update] move from linear classifier to polynomial
 after new benchmarks

---
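A minimal sketch of the feature ordering that `compute_poly_features` in `include/permanent/opt.h` is meant to reproduce, assuming the two inputs are `N` and `M/N` as in `tools/generate_tuning_header.py`. The names `poly_features` and `feature_count` are hypothetical and exist only for illustration; they are not part of the library.

```python
def poly_features(n: float, ratio: float, degree: int) -> list[float]:
    """Monomials N^i * (M/N)^j with i + j <= degree.

    Terms are emitted by increasing total degree; within one total degree,
    higher powers of N come first (the order the C++ loop walks).
    """
    features = []
    for total in range(degree + 1):
        for n_power in range(total, -1, -1):
            features.append(n ** n_power * ratio ** (total - n_power))
    return features


def feature_count(degree: int) -> int:
    """Number of monomials in two variables up to the given total degree."""
    return (degree + 1) * (degree + 2) // 2


if __name__ == "__main__":
    # Degree 2 with N = 3 and M/N = 2: [1, N, r, N^2, N*r, r^2]
    print(poly_features(3.0, 2.0, 2))
    # Degree 4 should agree with N_FEATURES in tuning.default.h
    print(feature_count(4))
```

For degree 4 the count works out to 15, which agrees with `N_FEATURES = 15` in the default header.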
 .gitignore                         |   2 +
 README.md                          | 211 +++++++++++++++--------
 include/permanent/opt.h            | 123 +++++++++----
 include/permanent/tuning.default.h |  58 +++----
 src/tuning.cc                      |   6 +-
 tools/generate_tuning_header.py    | 268 ++++++++++++++++++-----------
 6 files changed, 421 insertions(+), 247 deletions(-)
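
A rough sketch of the prediction path in `predict_algorithm` (standardize, score each class, take the argmax), together with the two-row expansion that `tools/generate_tuning_header.py` applies to a binary model. The parameters below are toy values chosen for illustration, not the shipped coefficients, and `expand_binary` and `predict` are hypothetical helper names.

```python
def expand_binary(coef, intercept):
    """Expand one binary coefficient row into explicit rows for class 0 and class 1."""
    rows = [[-c for c in coef], list(coef)]
    biases = [-intercept, intercept]
    return rows, biases


def predict(features, mean, scale, rows, biases):
    """Standardize the features, score each class, and return the first argmax."""
    scaled = [(f - m) / s for f, m, s in zip(features, mean, scale)]
    scores = [b + sum(w * x for w, x in zip(row, scaled))
              for row, b in zip(rows, biases)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy parameters (assumption): degree-1 features [1, N, M/N].
MEAN = [1.0, 3.0, 1.0]
SCALE = [1.0, 2.0, 0.5]
ROWS, BIASES = expand_binary([0.0, 1.0, -1.0], 0.5)

if __name__ == "__main__":
    # 0 = Ryser, 1 = Glynn, following the label mapping in the patch
    print(predict([1.0, 4.0, 2.0], MEAN, SCALE, ROWS, BIASES))
    print(predict([1.0, 7.0, 1.0], MEAN, SCALE, ROWS, BIASES))
```

Because the two rows are exact negatives of each other, the argmax gives the same answer as checking the sign of the single binary decision value; the two-row form only keeps the C++ loop generic over `N_CLASSES`.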

diff --git a/.gitignore b/.gitignore
index 7bd2e46..7ddcd45 100644
--- a/.gitignore
+++ b/.gitignore
@@ -85,7 +85,9 @@ tags
 
 # Project files
 /include/permanent/tuning.h
+*.pkl
 
 # Language server files
 compile_commands.json
 .ccls-cache
+
diff --git a/README.md b/README.md
index c7ccc28..3570d6b 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ matrix.
 
 `permanent.opt()`
 
-Compute the permanent of a matrix using the best algorithm for the shape of the given matrix.
+Compute the permanent of a matrix using an automatically selected algorithm. The library uses a polynomial logistic regression model (degree 4) trained on benchmarks to predict whether Ryser's or Glynn's algorithm will be faster for the given matrix dimensions.
 
 **Parameters:**
 
@@ -120,120 +120,193 @@ This can be neatly fit into the original formula by extending the inner sums ove
 The `permanent`  package allows you to solve the permanent of a given matrix using the
 **optimal algorithm** for your matrix dimensions.
 
-The easiest way to install the `permanent` package is via PyPI:
+## Quick Start (Recommended)
 
+The easiest way to install and use the `permanent` package:
+
+### From PyPI
 ```bash
-   pip install qc-permanent
+pip install qc-permanent
 ```
 
-This will install the package with pre-set parameters with a good performance for most cases.
-Advanced users can also **compile the code locally** and fine tune it for their specific
-architecture. They can either use the pre-defined parameters or fine tune them to their machine.
+### For Development
+```bash
+# Clone the repository and install with test dependencies
+# Note: this project uses pyproject.toml; editable mode requires pip >= 21.3
+pip install ".[test]"
 
+# For editable mode (requires pip >= 21.3 with PEP 660 support):
+# pip install --editable ".[test]"
 
-## Setting up your environment
+# Run tests
+pytest tests/
+```
 
-1. Install Python on your machine. Depending on your operating system, the instructions may vary.
+This will install the package with pre-set parameters that work well for most cases.
 
-2. Install gcc on your machine. Depending on your operating system, the instructions may vary.
+## Advanced Installation
 
-3. Create and activate a virtual environment for this project named `permanents`. One way to do this
-is with pip.
+For users who want to compile from source with custom optimizations or machine-specific tuning.
 
-   ```bash
-   pip install virtualenv
-   virtualenv permanents
-   ```
+### Prerequisites
 
-4. Activate the virtual environment.
+- **Python** ≥ 3.9
+- **CMake** ≥ 3.23 (required for building from source)
+- **C++ Compiler**: gcc ≥ 11.4 or equivalent
+- **make**: System build tool (not a Python package)
 
-   ```bash
-   source permanents/bin/activate
-   ```
+### Installing Prerequisites
 
-5. Install Sphinx and other dependencies.
+#### Ubuntu/Debian
+```bash
+# Install build tools and newer CMake
+sudo apt-get update
+sudo apt-get install build-essential
 
-   ```bash
-   pip install sphinx sphinx-rtd-theme sphinx-copybutton
-   ```
+# Install CMake 3.23+ (Ubuntu's default may be older)
+# Option 1: Via Kitware's APT repository
+sudo apt install ca-certificates gpg wget
 
-6. Install Python dependencies.
+# If the kitware-archive-keyring package has not been installed previously, manually obtain a copy of Kitware's signing key:
+test -f /usr/share/doc/kitware-archive-keyring/copyright ||
+wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null
 
-   ```bash
-   pip install numpy pandas scikit-learn
-   ```
+# Add Kitware's repository to your sources list and update.
+# For Ubuntu Focal Fossa (20.04):
+echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ focal main' | sudo tee /etc/apt/sources.list.d/kitware.list >/dev/null && sudo apt update
 
-7. (Optional) Install Pytest if you wish to run tests.
+# If the kitware-archive-keyring package has not been installed previously, remove the manually obtained signing key to make room for the package:
+test -f /usr/share/doc/kitware-archive-keyring/copyright ||
+sudo rm /usr/share/keyrings/kitware-archive-keyring.gpg
 
-   ```bash
-   pip install pytest
-   ```
+# Install the kitware-archive-keyring package to ensure that your keyring stays up to date as Kitware rotates its keys:
+sudo apt install kitware-archive-keyring
 
-Now that you have your environment set up and activated you are ready to compile the source code
-into an executable. Here you have two options - compile the code as is with the pre-defined
-parameters for algorithm swapping, **or** compile the code with machine specific tuning for
-algorithm swapping. _Note that machine specific tuning will run a series of tests. This will take
-anywhere from 10 minutes to 1 hour depending on your system._
+# Finally we can install the cmake package
+sudo apt install cmake
+```
 
-## Option 1: Use given parameters
+**Option 2: Via the official CMake website**
+```bash
+# Navigate to https://cmake.org/download/ and download version 3.23+ for Linux x86_64
+# For example, for CMake 3.29.0:
+wget https://cmake.org/files/v3.29/cmake-3.29.0-linux-x86_64.tar.gz
 
-1. Compile the permanent code (natively for your CPU architecture).
+# Extract the archive
+tar -xzf cmake-3.29.0-linux-x86_64.tar.gz
 
-   ```bash
-   make BUILD_NATIVE=1
-   ```
+# Move to /opt or preferred location
+sudo mv cmake-3.29.0-linux-x86_64 /opt/cmake
 
-   **Note: if using M1 architecture, or want a portable build, simply run the following.**
+# Add to PATH
+export PATH=/opt/cmake/bin:$PATH
+# To make this permanent, add the line above to ~/.bashrc
+```
 
-   ```bash
-   make
-   ```
+#### macOS
+```bash
+# Install Xcode Command Line Tools (includes make)
+xcode-select --install
+
+# Install CMake via Homebrew
+brew install cmake
+```
+
+#### Conda (All Platforms)
+```bash
+# Install within your conda environment
+conda install -c conda-forge make cmake compilers
+```
+
+### Setting up your environment
 
-2. (Optional) Run tests on the algorithms.
+1. Create and activate a virtual environment:
 
    ```bash
-   make test
+   python -m venv permanents
+   source permanents/bin/activate
    ```
 
-3. Compile the website.
+2. Install Python dependencies:
 
    ```bash
-   cd docs && make html
+   pip install numpy pandas scikit-learn pytest
    ```
 
-4. Load the website.
+3. (Optional) For documentation:
 
    ```bash
-   open build/html/index.html
+   pip install sphinx sphinx-rtd-theme sphinx-copybutton
    ```
 
-## Option 2: Tune parameters
+### Build Options
 
-1. Compile the permanent code with the `tuning` flag.
+#### Option 1: Standard Build
+```bash
+# Basic build with default optimizations
+make
+```
 
-   ```bash
-   make RUN_TUNING=1
-   ```
+#### Option 2: Native CPU Optimizations
+```bash
+# Optimize for your specific CPU architecture
+make BUILD_NATIVE=1
+```
 
-   **Note: it will take some time to run the tuning tests on your machine.**
+**Note:** Use the standard build on M1 Macs or when you need a portable build.
 
-2. (Optional) Run tests on the algorithms.
+#### Option 3: Machine-Specific Tuning
+```bash
+# Run extensive benchmarks to fit the algorithm-selection model for your machine
+make PERMANENT_TUNE=1
+```
 
-   ```bash
-   make test
-   ```
+**Important Notes:**
+- Tuning will take 10-60 minutes depending on your system
+- Creates `include/permanent/tuning.h` with custom parameters
+- Creates `build/tuning.csv` with benchmark data
 
-3. Compile the website.
+#### Verify Tuning Success
+```bash
+# Check if tuning generated custom parameters
+diff include/permanent/tuning.h include/permanent/tuning.default.h
 
-   ```bash
-   cd docs && make html
-   ```
+# If files differ, tuning succeeded!
+# View generated parameters
+cat include/permanent/tuning.h
+```
 
-4. Load the website using your web browser.
+### Running Tests
+```bash
+# After building
+pytest tests/
 
-   ```bash
-   <browser> build/html/index.html
-   ```
+# Or using make
+make test
+```
+
+### Building Documentation
+```bash
+cd docs && make html
+# Open build/html/index.html in your browser
+```
+
+### Troubleshooting
+
+#### CMake version too old
+If you get "CMake 3.23 or higher is required", see the Prerequisites section above for installation instructions.
+
+#### make: command not found
+`make` is a system tool, not a Python package. Install it using your system's package manager (see Prerequisites).
+
+#### Build cache issues
+```bash
+# Clean and rebuild
+make clean
+# or
+rm -rf build/
+make
+```
 
 ## Notes about the `Makefile`
 
diff --git a/include/permanent/opt.h b/include/permanent/opt.h
index d9fc2fc..f81e65c 100644
--- a/include/permanent/opt.h
+++ b/include/permanent/opt.h
@@ -18,52 +18,105 @@
 
 namespace permanent {
 
-template <typename T, typename I = void>
-result_t<T, I> opt_square(const size_t m, const size_t n, const T *ptr)
+// Polynomial feature computation helper
+// Generates features in sklearn PolynomialFeatures order:
+// For degree d: N^d, N^(d-1)*(M/N), N^(d-2)*(M/N)^2, ..., (M/N)^d
+template <typename Type>
+inline void compute_poly_features(const size_t n, const double ratio, double* features)
 {
-  if (n <= PARAM_8<T, I>) {  // N ≤ 13
-    if (n <= PARAM_4<T, I>) {
-      return combinatoric_square<T, I>(m, n, ptr);
-    } else {
-      // Check first hyperplane: -6.67(m/n) + (-0.35)(n) + 6.49 = 0
-      // Using stored parameters for flexibility
-      const double ratio = static_cast<double>(m) / n;
-      if (PARAM_1<T, I> * ratio + PARAM_2<T, I> * n + PARAM_3<T, I> > 0) {
-        return combinatoric_square<T, I>(m, n, ptr);
-      } else {
-        return glynn_square<T, I>(m, n, ptr);
+  int idx = 0;
+
+  // Generate all combinations: N^i * (M/N)^j where i+j <= POLY_DEGREE
+  // sklearn uses graded lexicographic order: lower total degree first, then higher powers of N first
+  for (int total_degree = 0; total_degree <= POLY_DEGREE; ++total_degree) {
+    for (int n_power = total_degree; n_power >= 0; --n_power) {
+      int mn_power = total_degree - n_power;
+
+      double term = 1.0;
+
+      // Compute N^n_power
+      for (int i = 0; i < n_power; ++i) {
+        term *= static_cast<double>(n);
       }
+
+      // Compute (M/N)^mn_power
+      for (int i = 0; i < mn_power; ++i) {
+        term *= ratio;
+      }
+
+      features[idx++] = term;
     }
-  } else {  // N > 13
-    // Check second hyperplane: -10.76(m/n) + 0.09(n) + 1.96 = 0
-    const double ratio = static_cast<double>(m) / n;
-    if (PARAM_5<T, I> * ratio + PARAM_6<T, I> * n + PARAM_7<T, I> > 0) {
-      return glynn_square<T, I>(m, n, ptr);
-    } else {
-      return ryser_square<T, I>(m, n, ptr);
+  }
+}
+
+// Predict optimal algorithm using polynomial logistic regression
+// Returns: 0 = Ryser, 1 = Glynn
+template <typename Type>
+inline int predict_algorithm(const size_t m, const size_t n)
+{
+  const double ratio = static_cast<double>(m) / static_cast<double>(n);
+
+  // 1. Compute polynomial features
+  double features[N_FEATURES];
+  compute_poly_features<Type>(n, ratio, features);
+
+  // 2. Scale features
+  double scaled_features[N_FEATURES];
+  for (int i = 0; i < N_FEATURES; ++i) {
+    scaled_features[i] = (features[i] - SCALER_MEAN[i]) / SCALER_SCALE[i];
+  }
+
+  // 3. Compute decision values for each class
+  double decision_values[N_CLASSES];
+  for (int c = 0; c < N_CLASSES; ++c) {
+    decision_values[c] = INTERCEPTS[c];
+    for (int f = 0; f < N_FEATURES; ++f) {
+      decision_values[c] += COEFFICIENTS[c * N_FEATURES + f] * scaled_features[f];
     }
   }
+
+  // 4. Return class with maximum decision value
+  int max_idx = 0;
+  double max_val = decision_values[0];
+  for (int i = 1; i < N_CLASSES; ++i) {
+    if (decision_values[i] > max_val) {
+      max_val = decision_values[i];
+      max_idx = i;
+    }
+  }
+
+  return max_idx;
+}
+
+template <typename T, typename I = void>
+result_t<T, I> opt_square(const size_t m, const size_t n, const T *ptr)
+{
+  const int algo = predict_algorithm<T>(m, n);
+
+  switch (algo) {
+    case 0:  // Ryser
+      return ryser_square<T, I>(m, n, ptr);
+    case 1:  // Glynn
+      return glynn_square<T, I>(m, n, ptr);
+    default:
+      // Fallback to Glynn for safety
+      return glynn_square<T, I>(m, n, ptr);
+  }
 }
 
 template <typename T, typename I = void>
 result_t<T, I> opt_rectangular(const size_t m, const size_t n, const T *ptr)
 {
-  if (n <= PARAM_8<T, I>) {  // N ≤ 13
-    // First hyperplane for small matrices
-    const double ratio = static_cast<double>(m) / n;
-    if (PARAM_1<T, I> * ratio + PARAM_2<T, I> * n + PARAM_3<T, I> > 0) {
-      return combinatoric_rectangular<T, I>(m, n, ptr);
-    } else {
+  const int algo = predict_algorithm<T>(m, n);
+
+  switch (algo) {
+    case 0:  // Ryser
+      return ryser_rectangular<T, I>(m, n, ptr);
+    case 1:  // Glynn
       return glynn_rectangular<T, I>(m, n, ptr);
-    }
-  } else {  // N > 13
-    // Second hyperplane for large matrices
-    const double ratio = static_cast<double>(m) / n;
-    if (PARAM_5<T, I> * ratio + PARAM_6<T, I> * n + PARAM_7<T, I> > 0) {
+    default:
+      // Fallback to Glynn for safety
       return glynn_rectangular<T, I>(m, n, ptr);
-    } else {
-      return ryser_rectangular<T, I>(m, n, ptr);
-    }
   }
 }
 
@@ -75,4 +128,4 @@ result_t<T, I> opt(const size_t m, const size_t n, const T *ptr)
 
 }  // namespace permanent
 
-#endif  // permanent_opt_h_
\ No newline at end of file
+#endif  // permanent_opt_h_
diff --git a/include/permanent/tuning.default.h b/include/permanent/tuning.default.h
index ac8e9a1..b4744d0 100644
--- a/include/permanent/tuning.default.h
+++ b/include/permanent/tuning.default.h
@@ -1,50 +1,34 @@
 /* Copyright 2024 QC-Devs (GPLv3) */
+/* Auto-generated tuning parameters for polynomial logistic regression
+ *
+ * Model: Polynomial Logistic Regression (degree 4)
+ * Mean CV accuracy: 0.9847
+ * Test accuracy: 1.0000
+ *
+ * To regenerate: python tools/generate_tuning_header.py <benchmark.csv> <output.h>
+ */
 
 #if !defined(permanent_tuning_h_)
 #define permanent_tuning_h_
 
 namespace permanent {
 
-template <typename Type, typename IntType = void>
-struct _tuning_params_t
-{
-  // First hyperplane parameters (N ≤ 13): -6.67(m/n) + (-0.35)(n) + 6.49 = 0
-  static constexpr double PARAM_1 = -6.67000e+00;  // coefficient of m/n
-  static constexpr double PARAM_2 = -3.50000e-01;  // coefficient of n
-  static constexpr double PARAM_3 = +6.49000e+00;  // constant term
-  static constexpr double PARAM_4 = +4.00000e+00;  // Combinatorial crossover limit
+// Model hyperparameters
+constexpr int POLY_DEGREE = 4;
+constexpr int N_FEATURES = 15;
+constexpr int N_CLASSES = 2;
 
-  // Second hyperplane parameters (N > 13): -10.76(m/n) + 0.09(n) + 1.96 = 0
-  static constexpr double PARAM_5 = -1.07600e+01;  // coefficient of m/n
-  static constexpr double PARAM_6 = +9.00000e-02;  // coefficient of n
-  static constexpr double PARAM_7 = +1.96000e+00;  // constant term
-  static constexpr double PARAM_8 = +1.30000e+01;  // Combinatorial limit (N=13)
-};
+// Feature scaling parameters (StandardScaler)
+constexpr double SCALER_MEAN[N_FEATURES] = { 1.000000000000000e+00, 1.824361158432709e+01, 1.129990362851972e+00, 4.103151618398637e+02, 1.747018739352641e+01, 1.797670059373797e+00, 1.040173935264054e+04, 3.553134582623509e+02, 2.225283115820182e+01, 3.764157056856755e+00, 2.838747649063033e+05, 8.462959114139694e+03, 3.871226575809199e+02, 3.686420301846051e+01, 9.651467559254822e+00 };
+constexpr double SCALER_SCALE[N_FEATURES] = { 1.000000000000000e+00, 8.802601785839817e+00, 7.216590879601438e-01, 3.398767907207399e+02, 9.050702183587225e+00, 2.533742275150697e+00, 1.153458993518837e+04, 2.798498146858290e+02, 1.975975498128697e+01, 8.783581441596224e+00, 3.831817371278007e+05, 9.045567492744174e+03, 3.386025927099062e+02, 5.730099957760284e+01, 3.182500900147520e+01 };
 
-template <typename Type, typename IntType = void>
-static constexpr double PARAM_1 = _tuning_params_t<Type, IntType>::PARAM_1;
+// Logistic regression coefficients [N_CLASSES * N_FEATURES]
+// Ordered as: [class0_feature0, class0_feature1, ..., class1_feature0, ...]
+constexpr double COEFFICIENTS[N_CLASSES * N_FEATURES] = { -0.000000000000000e+00, 4.589675095776374e+00, -6.012594985495101e+00, 9.588545838949250e+00, -2.077955485469633e+01, 5.824485713660757e+00, 3.272173581264640e+00, -4.277440104070251e+00, 4.070883997590151e+00, 7.030952123247757e+00, -3.376033076736038e+00, -5.116407997170819e+00, 6.921103120663089e+00, 1.606233546979973e+01, 5.695112278898278e+00, 0.000000000000000e+00, -4.589675095776374e+00, 6.012594985495101e+00, -9.588545838949250e+00, 2.077955485469633e+01, -5.824485713660757e+00, -3.272173581264640e+00, 4.277440104070251e+00, -4.070883997590151e+00, -7.030952123247757e+00, 3.376033076736038e+00, 5.116407997170819e+00, -6.921103120663089e+00, -1.606233546979973e+01, -5.695112278898278e+00 };
 
-template <typename Type, typename IntType = void>
-static constexpr double PARAM_2 = _tuning_params_t<Type, IntType>::PARAM_2;
-
-template <typename Type, typename IntType = void>
-static constexpr double PARAM_3 = _tuning_params_t<Type, IntType>::PARAM_3;
-
-template <typename Type, typename IntType = void>
-static constexpr double PARAM_4 = _tuning_params_t<Type, IntType>::PARAM_4;
-
-template <typename Type, typename IntType = void>
-static constexpr double PARAM_5 = _tuning_params_t<Type, IntType>::PARAM_5;
-
-template <typename Type, typename IntType = void>
-static constexpr double PARAM_6 = _tuning_params_t<Type, IntType>::PARAM_6;
-
-template <typename Type, typename IntType = void>
-static constexpr double PARAM_7 = _tuning_params_t<Type, IntType>::PARAM_7;
-
-template <typename Type, typename IntType = void>
-static constexpr double PARAM_8 = _tuning_params_t<Type, IntType>::PARAM_8;
+// Logistic regression intercepts [N_CLASSES]
+constexpr double INTERCEPTS[N_CLASSES] = { 4.947950494439473e-01, -4.947950494439473e-01 };
 
 }  // namespace permanent
 
-#endif  // permanent_tuning_h_
\ No newline at end of file
+#endif  // permanent_tuning_h_
diff --git a/src/tuning.cc b/src/tuning.cc
index 5617924..a1bfb38 100644
--- a/src/tuning.cc
+++ b/src/tuning.cc
@@ -28,15 +28,15 @@ constexpr char CSV_HEADER[] = "M/N,N,Combn,Glynn,Ryser,Fastest";
 
 constexpr size_t NUM_TRIALS = 5;
 
-constexpr size_t MAX_ROWS = 26;
+constexpr size_t MAX_ROWS = 35;
 
-constexpr size_t MAX_COLS = 26;
+constexpr size_t MAX_COLS = 35;
 
 constexpr size_t DATA_POINTS = MAX_ROWS * MAX_COLS;
 
 constexpr double TOLERANCE = 0.0001;
 
-constexpr size_t RUN_COMBINATORIAL_UNTIL = 13;
+constexpr size_t RUN_COMBINATORIAL_UNTIL = 10;
 
 namespace {
 
diff --git a/tools/generate_tuning_header.py b/tools/generate_tuning_header.py
index fdfd7b5..9beb864 100644
--- a/tools/generate_tuning_header.py
+++ b/tools/generate_tuning_header.py
@@ -1,132 +1,194 @@
 import sys
 import numpy as np
 import pandas as pd
-from sklearn import svm
-from sklearn.metrics import accuracy_score
+from sklearn.linear_model import LogisticRegression
+from sklearn.preprocessing import PolynomialFeatures, StandardScaler
+from sklearn.pipeline import Pipeline
+from sklearn.model_selection import train_test_split, GridSearchCV
+from sklearn.metrics import accuracy_score, classification_report
+import pickle
 
 CSV_FILE = sys.argv[1]
 HEADER_FILE = sys.argv[2]
+MODEL_FILE = HEADER_FILE.removesuffix('.h') + '_poly.pkl'
 
 # Read output from tuning program
-tuning = pd.read_csv(CSV_FILE, usecols=["M/N", "N", "Fastest"])
-
-# Split data into two parts based on N=13
-tuning_small = tuning[tuning["N"] <= 13].copy()
-tuning_large = tuning[tuning["N"] > 13].copy()
-
-# Update label columns for ML
-tuning_small.rename(columns={"N": "x", "M/N": "y", "Fastest": "target"}, inplace=True)
-tuning_large.rename(columns={"N": "x", "M/N": "y", "Fastest": "target"}, inplace=True)
-
-# Find Combinatoric Square limit
-matching_row = tuning["Fastest"] == 1
-combn_limit = tuning.loc[matching_row, "N"].iloc[-1] if matching_row.any() else 0
-
-# Update classes for first SVM (N <= 13, all three algorithms)
-update_ryser = tuning_small["target"] == 0
-tuning_small.loc[update_ryser, "target"] = -1
-update_glynn = tuning_small["target"] == 2
-tuning_small.loc[update_glynn, "target"] = -1
-
-# Update classes for second SVM (N > 13, Glynn vs Ryser)
-tuning_large["target"] = np.where(tuning_large["target"] == 0, -1, 1)
-
-# Train first SVM (N <= 13)
-features_small = tuning_small[["x", "y"]]
-label_small = tuning_small["target"]
-size_small = tuning_small.shape[0]
-test_size_small = int(np.round(size_small * 0.1, 0))
-
-x_train_small = features_small[:-test_size_small].values
-y_train_small = label_small[:-test_size_small].values
-x_test_small = features_small[-test_size_small:].values
-y_test_small = label_small[-test_size_small:].values
-
-linear_model_small = svm.SVC(kernel="linear", C=100.0)
-linear_model_small.fit(x_train_small, y_train_small)
-
-# Train second SVM (N > 13)
-features_large = tuning_large[["x", "y"]]
-label_large = tuning_large["target"]
-size_large = tuning_large.shape[0]
-test_size_large = int(np.round(size_large * 0.1, 0))
-
-x_train_large = features_large[:-test_size_large].values
-y_train_large = label_large[:-test_size_large].values
-x_test_large = features_large[-test_size_large:].values
-y_test_large = label_large[-test_size_large:].values
-
-linear_model_large = svm.SVC(kernel="linear", C=100.0)
-linear_model_large.fit(x_train_large, y_train_large)
-
-# Get coefficients and bias for both models
-coefficients_small = linear_model_small.coef_[0]
-bias_small = linear_model_small.intercept_[0]
-coefficients_large = linear_model_large.coef_[0]
-bias_large = linear_model_large.intercept_[0]
-
-# Write header file with all parameters
-param_1 = coefficients_small[0]  # First hyperplane x coefficient
-param_2 = coefficients_small[1]  # First hyperplane y coefficient
-param_3 = bias_small            # First hyperplane bias
-param_4 = combn_limit          # Combinatoric square limit
-param_5 = coefficients_large[0] # Second hyperplane x coefficient
-param_6 = coefficients_large[1] # Second hyperplane y coefficient
-param_7 = bias_large           # Second hyperplane bias
-param_8 = 13.0                 # Combinatorial limit
-
+df = pd.read_csv(CSV_FILE)
+
+# Group by M, N and find fastest algorithm for each configuration
+grouped = df.groupby(['M', 'N'])
+tuning_data = []
+
+for (m, n), group in grouped:
+    fastest_idx = group['mean_time'].idxmin()
+    fastest_algo = group.loc[fastest_idx, 'algorithm']
+    tuning_data.append({
+        'N': n,
+        'M/N': m / n,
+        'Fastest': fastest_algo
+    })
+
+tuning = pd.DataFrame(tuning_data)
+
+# Convert algorithm names to integer labels
+# Only use Ryser (0) and Glynn (1) - exclude Combinatoric
+algo_to_label = {'ryser': 0, 'glynn': 1}
+tuning['Fastest'] = tuning['Fastest'].map(algo_to_label)
+
+# Filter out any combinatoric entries
+tuning = tuning.dropna()
+
+print(f"Total configurations: {len(tuning)}")
+print("Algorithm distribution:")
+print(tuning['Fastest'].value_counts().sort_index())
+
+features = tuning[["N", "M/N"]].values
+labels = tuning["Fastest"].values.astype(int)
+
+# Split into train and test sets
+label_counts = pd.Series(labels).value_counts()
+can_stratify = all(label_counts >= 2)
+
+if can_stratify:
+    x_train, x_test, y_train, y_test = train_test_split(
+        features, labels, test_size=0.1, random_state=42, stratify=labels
+    )
+    print("Using stratified split")
+else:
+    print(f"Warning: Cannot stratify - some classes have < 2 samples: {label_counts.to_dict()}")
+    x_train, x_test, y_train, y_test = train_test_split(
+        features, labels, test_size=0.1, random_state=42
+    )
+
+# Train polynomial logistic regression
+print("\nTraining polynomial logistic regression with grid search...")
+
+pipeline = Pipeline([
+    ('poly_features', PolynomialFeatures()),
+    ('scaler', StandardScaler()),
+    ('logistic', LogisticRegression(max_iter=5000, solver='lbfgs'))
+])
+
+param_grid = {
+    'poly_features__degree': [1, 2, 3, 4, 5],
+    'poly_features__include_bias': [True],
+    'logistic__C': [0.01, 0.1, 1, 10, 100]
+}
+
+grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='accuracy', n_jobs=-1, verbose=1)
+grid_search.fit(x_train, y_train)
+
+best_model = grid_search.best_estimator_
+best_degree = grid_search.best_params_['poly_features__degree']
+best_C = grid_search.best_params_['logistic__C']
+
+print("\nBest parameters:")
+print(f"  Polynomial degree: {best_degree}")
+print(f"  C: {best_C}")
+print(f"  Best CV score: {grid_search.best_score_:.4f}")
+
+y_pred = best_model.predict(x_test)
+test_accuracy = accuracy_score(y_test, y_pred)
+print(f"\nTest accuracy: {test_accuracy:.4f}")
+
+unique_test_classes = np.unique(y_test)
+class_names = ['Ryser', 'Glynn']
+present_class_names = [class_names[i] for i in unique_test_classes]
+
+print("\nClassification report:")
+print(classification_report(y_test, y_pred, labels=unique_test_classes, target_names=present_class_names, zero_division=0))
+
+# Save the model
+with open(MODEL_FILE, 'wb') as f:
+    pickle.dump(best_model, f)
+print(f"\nModel saved to: {MODEL_FILE}")
+
+# Extract model parameters
+poly_transformer = best_model.named_steps['poly_features']
+scaler = best_model.named_steps['scaler']
+log_reg = best_model.named_steps['logistic']
+
+scaler_mean = scaler.mean_
+scaler_scale = scaler.scale_
+coefficients = log_reg.coef_
+intercepts = log_reg.intercept_
+
+n_features = len(scaler_mean)
+n_classes = len(log_reg.classes_)
+
+# For binary classification, sklearn stores one coefficient row (for class 1),
+# but the C++ code takes an argmax over per-class decision values.
+# sklearn: decision_function > 0 means class 1, otherwise class 0.
+# So: decision_class0 = -decision_function, decision_class1 = +decision_function.
+# Labels follow algo_to_label above: class 0 = Ryser, class 1 = Glynn.
+# Expand the single row into two rows when sklearn returns the binary form.
+if n_classes == 2 and coefficients.shape[0] == 1:
+    # sklearn stores coefficients for the positive class (higher label)
+    # For classes [0, 1], positive class is 1
+    # decision_function > 0 → class 1, < 0 → class 0
+    # To use argmax: need decision[0] for class 0, decision[1] for class 1
+    # decision[0] = -decision_function, decision[1] = +decision_function
+    coefficients = np.vstack([-coefficients[0], coefficients[0]])
+    intercepts = np.array([-intercepts[0], intercepts[0]])
+
+print("\nModel details:")
+print(f"  Polynomial degree: {best_degree}")
+print(f"  Number of features: {n_features}")
+print(f"  Number of classes: {n_classes}")
+
+# Generate C++ header with just the learned parameters
 try:
     with open(HEADER_FILE, "w") as file_ptr:
+        means_str = ", ".join([f"{m:.15e}" for m in scaler_mean])
+
+        scales_str = ", ".join([f"{s:.15e}" for s in scaler_scale])
+
+        coefs_list = []
+        for c in range(n_classes):
+            for f in range(n_features):
+                coefs_list.append(f"{coefficients[c, f]:.15e}")
+        coefs_str = ", ".join(coefs_list)
+
+        intercepts_str = ", ".join([f"{i:.15e}" for i in intercepts])
+
         file_ptr.write(
             f"""/* Copyright 2024 QC-Devs (GPLv3) */
+/* Auto-generated tuning parameters for polynomial logistic regression
+ *
+ * Model: Polynomial Logistic Regression (degree {best_degree})
+ * Mean CV accuracy: {grid_search.best_score_:.4f}
+ * Test accuracy: {test_accuracy:.4f}
+ *
+ * To regenerate: python tools/generate_tuning_header.py <benchmark.csv> <output.h>
+ */
 
 #if !defined(permanent_tuning_h_)
 #define permanent_tuning_h_
 
 namespace permanent {{
 
-template<typename Type, typename IntType = void>
-struct _tuning_params_t
-{{
-  static constexpr double PARAM_1 = {param_1:+16.9e};
-  static constexpr double PARAM_2 = {param_2:+16.9e};
-  static constexpr double PARAM_3 = {param_3:+16.9e};
-  static constexpr double PARAM_4 = {param_4:+16.9e};
-  static constexpr double PARAM_5 = {param_5:+16.9e};
-  static constexpr double PARAM_6 = {param_6:+16.9e};
-  static constexpr double PARAM_7 = {param_7:+16.9e};
-  static constexpr double PARAM_8 = {param_8:+16.9e};
-}};
-
-template<typename Type, typename IntType = void>
-static constexpr double PARAM_1 = _tuning_params_t<Type, IntType>::PARAM_1;
+// Model hyperparameters
+constexpr int POLY_DEGREE = {best_degree};
+constexpr int N_FEATURES = {n_features};
+constexpr int N_CLASSES = {n_classes};
 
-template<typename Type, typename IntType = void>
-static constexpr double PARAM_2 = _tuning_params_t<Type, IntType>::PARAM_2;
+// Feature scaling parameters (StandardScaler)
+constexpr double SCALER_MEAN[N_FEATURES] = {{ {means_str} }};
+constexpr double SCALER_SCALE[N_FEATURES] = {{ {scales_str} }};
 
-template<typename Type, typename IntType = void>
-static constexpr double PARAM_3 = _tuning_params_t<Type, IntType>::PARAM_3;
+// Logistic regression coefficients [N_CLASSES * N_FEATURES]
+// Ordered as: [class0_feature0, class0_feature1, ..., class1_feature0, ...]
+constexpr double COEFFICIENTS[N_CLASSES * N_FEATURES] = {{ {coefs_str} }};
 
-template<typename Type, typename IntType = void>
-static constexpr double PARAM_4 = _tuning_params_t<Type, IntType>::PARAM_4;
-
-template<typename Type, typename IntType = void>
-static constexpr double PARAM_5 = _tuning_params_t<Type, IntType>::PARAM_5;
-
-template<typename Type, typename IntType = void>
-static constexpr double PARAM_6 = _tuning_params_t<Type, IntType>::PARAM_6;
-
-template<typename Type, typename IntType = void>
-static constexpr double PARAM_7 = _tuning_params_t<Type, IntType>::PARAM_7;
-
-template<typename Type, typename IntType = void>
-static constexpr double PARAM_8 = _tuning_params_t<Type, IntType>::PARAM_8;
+// Logistic regression intercepts [N_CLASSES]
+constexpr double INTERCEPTS[N_CLASSES] = {{ {intercepts_str} }};
 
 }}  // namespace permanent
 
 #endif  // permanent_tuning_h_
 """
         )
+    print(f"\nHeader file written to: {HEADER_FILE}")
 
 except IOError:
     print("Cannot open file!")
@@ -134,4 +196,4 @@
 
 except Exception as e:
     print("Error occurred:", e)
-    exit(1)
\ No newline at end of file
+    exit(1)

From 42c0336266251a688903abb1b90241d1dd918f8b Mon Sep 17 00:00:00 2001
From: cassmasschelein <cassandramasschelein@gmail.com>
Date: Fri, 17 Apr 2026 16:40:55 +0200
Subject: [PATCH 02/14] [update] clean up model pkl file

---
 Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Makefile b/Makefile
index ee5296b..b9bdba7 100644
--- a/Makefile
+++ b/Makefile
@@ -12,6 +12,7 @@ endif
 
 CLEAN_TARGETS :=
 CLEAN_TARGETS += include/permanent/tuning.h
+CLEAN_TARGETS += include/permanent/*.pkl
 CLEAN_TARGETS += build dist _build _generate
 CLEAN_TARGETS += permanent.*egg-info permanent.*so
 CLEAN_TARGETS += compile_commands.json

From 482e6f37e42d1d875c3f880409eef3333f1ee259 Mon Sep 17 00:00:00 2001
From: cassmasschelein <cassandramasschelein@gmail.com>
Date: Thu, 23 Apr 2026 16:07:44 +0200
Subject: [PATCH 03/14] [update] README to reflect tuning protocol, fix bug in
 tuning header

---
 README.md                       |  7 +++---
 src/main.cc                     | 40 ---------------------------------
 src/tuning.cc                   |  4 ++--
 tools/generate_tuning_header.py | 30 +++++++------------------
 4 files changed, 14 insertions(+), 67 deletions(-)
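
Note on the label remapping in tools/generate_tuning_header.py: the snippet below is a minimal, standard-library sketch of the same filter-and-remap step, for illustration only. The script itself uses pandas. The sample rows and the helper name `remap_fastest` are hypothetical and are not part of the repository.

```python
import csv
import io

# Hypothetical sample in the tuning CSV layout (timings invented for illustration).
SAMPLE_CSV = """M/N,N,Combn,Glynn,Ryser,Fastest
1.0,4,0.9,0.5,0.3,0
0.5,4,0.1,0.4,0.6,1
1.0,8,5.0,1.2,2.5,2
"""

# Label encoding of the Fastest column, as described in this patch.
RYSER, COMBINATORIC, GLYNN = 0, 1, 2


def remap_fastest(rows):
    """Keep Ryser/Glynn rows and relabel them as 0 (Ryser) and 1 (Glynn)."""
    kept = []
    for row in rows:
        label = int(row["Fastest"])
        if label == COMBINATORIC:
            continue  # the binary classifier is not trained on Combinatoric rows
        new_row = dict(row)
        new_row["Fastest"] = 1 if label == GLYNN else 0
        kept.append(new_row)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
tuning = remap_fastest(rows)
print([(r["N"], r["Fastest"]) for r in tuning])  # [('4', 0), ('8', 1)]
```

Dropping the Combinatoric rows before relabelling keeps the training labels contiguous (0 and 1), which is what a two-class logistic regression expects.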

diff --git a/README.md b/README.md
index 38133ad..7b59222 100644
--- a/README.md
+++ b/README.md
@@ -161,10 +161,11 @@ is with pip.
    pip install '.[doc,tune,test]'
    ```
 
-  If you want to generate a machine-specific tuning header, preface the `pip` command with the
-  corresponding environment variable like so:
+  If you want to generate a machine-specific tuning header for building the library, you must first install with tuning dependencies, and then build with tuning enabled:
    ```bash
-  PERMANENT_TUNE=ON pip install '.[tune]'
+   pip install '.[tune]'
+   make clean
+   PERMANENT_TUNE=1 PERMANENT_PYTHON=1 make
    ```
 
   This compiles the code with machine specific tuning for algorithm swapping.
diff --git a/src/main.cc b/src/main.cc
index 7af8d42..5b1e78a 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -153,74 +153,34 @@ permanent : (np.double|np.complex)
   {                                                                                        \
     int type = PyArray_TYPE(MATRIX);                                                       \
     if (type == NPY_INT8) {                                                                \
-      if (std::abs<ptrdiff_t>(static_cast<ptrdiff_t>(m) - n) > 20) {                       \
-        PyErr_SetString(PyExc_ValueError,                                                  \
-                        "Difference between # cols and # rows cannot exceed 20");          \
-        return nullptr;                                                                    \
-      }                                                                                    \
       const std::int8_t *ptr =                                                             \
           reinterpret_cast<const std::int8_t *>(PyArray_GETPTR2(MATRIX, 0, 0));            \
       return PyLong_FromSsize_t(FN<std::int8_t, Py_ssize_t>(m, n, ptr));                   \
     } else if (type == NPY_INT16) {                                                        \
-      if (std::abs<ptrdiff_t>(static_cast<ptrdiff_t>(m) - n) > 20) {                       \
-        PyErr_SetString(PyExc_ValueError,                                                  \
-                        "Difference between # cols and # rows cannot exceed 20");          \
-        return nullptr;                                                                    \
-      }                                                                                    \
       const std::int16_t *ptr =                                                            \
           reinterpret_cast<const std::int16_t *>(PyArray_GETPTR2(MATRIX, 0, 0));           \
       return PyLong_FromSsize_t(FN<std::int16_t, Py_ssize_t>(m, n, ptr));                  \
     } else if (type == NPY_INT32) {                                                        \
-      if (std::abs<ptrdiff_t>(static_cast<ptrdiff_t>(m) - n) > 20) {                       \
-        PyErr_SetString(PyExc_ValueError,                                                  \
-                        "Difference between # cols and # rows cannot exceed 20");          \
-        return nullptr;                                                                    \
-      }                                                                                    \
       const std::int32_t *ptr =                                                            \
           reinterpret_cast<const std::int32_t *>(PyArray_GETPTR2(MATRIX, 0, 0));           \
       return PyLong_FromSsize_t(FN<std::int32_t, Py_ssize_t>(m, n, ptr));                  \
     } else if (type == NPY_INT64) {                                                        \
-      if (std::abs<ptrdiff_t>(static_cast<ptrdiff_t>(m) - n) > 20) {                       \
-        PyErr_SetString(PyExc_ValueError,                                                  \
-                        "Difference between # cols and # rows cannot exceed 20");          \
-        return nullptr;                                                                    \
-      }                                                                                    \
       const std::int64_t *ptr =                                                            \
           reinterpret_cast<const std::int64_t *>(PyArray_GETPTR2(MATRIX, 0, 0));           \
       return PyLong_FromSsize_t(FN<std::int64_t, Py_ssize_t>(m, n, ptr));                  \
     } else if (type == NPY_UINT8) {                                                        \
-      if (std::abs<ptrdiff_t>(static_cast<ptrdiff_t>(m) - n) > 20) {                       \
-        PyErr_SetString(PyExc_ValueError,                                                  \
-                        "Difference between # cols and # rows cannot exceed 20");          \
-        return nullptr;                                                                    \
-      }                                                                                    \
       const std::uint8_t *ptr =                                                            \
           reinterpret_cast<const std::uint8_t *>(PyArray_GETPTR2(MATRIX, 0, 0));           \
       return PyLong_FromSsize_t(FN<std::uint8_t, Py_ssize_t>(m, n, ptr));                  \
     } else if (type == NPY_UINT16) {                                                       \
-      if (std::abs<ptrdiff_t>(static_cast<ptrdiff_t>(m) - n) > 20) {                       \
-        PyErr_SetString(PyExc_ValueError,                                                  \
-                        "Difference between # cols and # rows cannot exceed 20");          \
-        return nullptr;                                                                    \
-      }                                                                                    \
       const std::uint16_t *ptr =                                                           \
           reinterpret_cast<const std::uint16_t *>(PyArray_GETPTR2(MATRIX, 0, 0));          \
       return PyLong_FromSsize_t(FN<std::uint16_t, Py_ssize_t>(m, n, ptr));                 \
     } else if (type == NPY_UINT32) {                                                       \
-      if (std::abs<ptrdiff_t>(static_cast<ptrdiff_t>(m) - n) > 20) {                       \
-        PyErr_SetString(PyExc_ValueError,                                                  \
-                        "Difference between # cols and # rows cannot exceed 20");          \
-        return nullptr;                                                                    \
-      }                                                                                    \
       const std::uint32_t *ptr =                                                           \
           reinterpret_cast<const std::uint32_t *>(PyArray_GETPTR2(MATRIX, 0, 0));          \
       return PyLong_FromSsize_t(FN<std::uint32_t, Py_ssize_t>(m, n, ptr));                 \
     } else if (type == NPY_UINT64) {                                                       \
-      if (std::abs<ptrdiff_t>(static_cast<ptrdiff_t>(m) - n) > 20) {                       \
-        PyErr_SetString(PyExc_ValueError,                                                  \
-                        "Difference between # cols and # rows cannot exceed 20");          \
-        return nullptr;                                                                    \
-      }                                                                                    \
       const std::uint64_t *ptr =                                                           \
           reinterpret_cast<const std::uint64_t *>(PyArray_GETPTR2(MATRIX, 0, 0));          \
       return PyLong_FromSsize_t(FN<std::uint64_t, Py_ssize_t>(m, n, ptr));                 \
diff --git a/src/tuning.cc b/src/tuning.cc
index a1bfb38..a85382a 100644
--- a/src/tuning.cc
+++ b/src/tuning.cc
@@ -28,9 +28,9 @@ constexpr char CSV_HEADER[] = "M/N,N,Combn,Glynn,Ryser,Fastest";
 
 constexpr size_t NUM_TRIALS = 5;
 
-constexpr size_t MAX_ROWS = 35;
+constexpr size_t MAX_ROWS = 25;
 
-constexpr size_t MAX_COLS = 35;
+constexpr size_t MAX_COLS = 25;
 
 constexpr size_t DATA_POINTS = MAX_ROWS * MAX_COLS;
 
diff --git a/tools/generate_tuning_header.py b/tools/generate_tuning_header.py
index 9beb864..674ed87 100644
--- a/tools/generate_tuning_header.py
+++ b/tools/generate_tuning_header.py
@@ -15,28 +15,14 @@
 # Read output from tuning program
 df = pd.read_csv(CSV_FILE)
 
-# Group by M, N and find fastest algorithm for each configuration
-grouped = df.groupby(['M', 'N'])
-tuning_data = []
-
-for (m, n), group in grouped:
-    fastest_idx = group['mean_time'].idxmin()
-    fastest_algo = group.loc[fastest_idx, 'algorithm']
-    tuning_data.append({
-        'N': n,
-        'M/N': m / n,
-        'Fastest': fastest_algo
-    })
-
-tuning = pd.DataFrame(tuning_data)
-
-# Convert algorithm names to integer labels
-# Only use Ryser (0) and Glynn (1) - exclude Combinatoric
-algo_to_label = {'ryser': 0, 'glynn': 1}
-tuning['Fastest'] = tuning['Fastest'].map(algo_to_label)
-
-# Filter out any combinatoric entries
-tuning = tuning.dropna()
+# The tuning CSV has columns: M/N, N, Combn, Glynn, Ryser, Fastest.
+# The Fastest column encodes the winner: 0 = Ryser, 1 = Combinatoric, 2 = Glynn.
+# The classifier is binary (Ryser vs. Glynn), so keep only rows labelled 0 or 2
+# and drop rows where Combinatoric (1) was fastest.
+tuning = df[df['Fastest'].isin([0, 2])].copy()
+
+# Remap Fastest: 0 (Ryser) stays 0, 2 (Glynn) becomes 1
+tuning['Fastest'] = tuning['Fastest'].replace({2: 1})
 
 print(f"Total configurations: {len(tuning)}")
 print(f"Algorithm distribution:")

From 580d1a33036c5cf00e1305e6c15af5a98083d562 Mon Sep 17 00:00:00 2001
From: Fanwang Meng <fanwang.meng@queensu.ca>
Date: Thu, 23 Apr 2026 11:13:05 +0800
Subject: [PATCH 04/14] Fix the release versioning

---
 pyproject.toml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/pyproject.toml b/pyproject.toml
index 4e48fd3..a250240 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -44,6 +44,7 @@ tune = ["pandas", "scikit-learn"]
 [tool.scikit-build]
 wheel.expand-macos-universal-tags = true
 cmake.args = ["-DPERMANENT_TUNE=OFF"]
+metadata.version.provider = "scikit_build_core.metadata.setuptools_scm"
 
 [tool.pytest.ini_options]
 minversion = "6.0"
@@ -77,4 +78,4 @@ extend-select = [
   "PGH",  # pygrep-hooks
   "RUF",  # Ruff-specific
   "UP",   # pyupgrade
-]
\ No newline at end of file
+]

From f357576ee2560231422625b48f427aae4252a1b0 Mon Sep 17 00:00:00 2001
From: Fanwang Meng <fanwang.meng@queensu.ca>
Date: Thu, 23 Apr 2026 11:28:41 +0800
Subject: [PATCH 05/14] Remove Python 3.9 as it's already end of life

---
 .github/workflows/publish_pypi.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/publish_pypi.yml b/.github/workflows/publish_pypi.yml
index 48846b0..a207ff6 100644
--- a/.github/workflows/publish_pypi.yml
+++ b/.github/workflows/publish_pypi.yml
@@ -24,7 +24,7 @@ jobs:
       fail-fast: false
       matrix:
         os: [ubuntu-22.04, macos-14, windows-2022]
-        python-version: ["3.9", "3.10", "3.11", "3.12"]
+        python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
 
     steps:
       - name: Checkout repository

From 82cc572f6bc76c71c5a24e2cf4a2ea288c9b905c Mon Sep 17 00:00:00 2001
From: Fanwang Meng <fanwang.meng@queensu.ca>
Date: Thu, 23 Apr 2026 11:39:40 +0800
Subject: [PATCH 06/14] Update Python version

---
 pyproject.toml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pyproject.toml b/pyproject.toml
index a250240..65b40b0 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -6,7 +6,7 @@ build-backend = "scikit_build_core.build"
 name = "qc-permanent"
 description = "Extension module for computing permanents of square and rectangular matrices."
 readme = "README.md"
-requires-python = ">=3.9"
+requires-python = ">=3.10"
 authors = [{name = "QC-Devs", email = "qcdevs@gmail.com"}]
 keywords = ["math", "linear algebra", "combinatorics", "permanent"]
 classifiers = [

From 0c74552f9c24b87cd1118ab86dfb4bef51afc158 Mon Sep 17 00:00:00 2001
From: Fanwang Meng <fanwang.meng@queensu.ca>
Date: Thu, 23 Apr 2026 11:44:13 +0800
Subject: [PATCH 07/14] Update GitHub Actions versions

---
 .github/workflows/publish_pypi.yml | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/publish_pypi.yml b/.github/workflows/publish_pypi.yml
index a207ff6..07f9a9a 100644
--- a/.github/workflows/publish_pypi.yml
+++ b/.github/workflows/publish_pypi.yml
@@ -28,10 +28,10 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
 
       - name: Setup Python
-        uses: actions/setup-python@v5
+        uses: actions/setup-python@v6
         with:
           python-version: "3.13"  # Use latest Python to install cibuildwheel
 
@@ -65,7 +65,7 @@ jobs:
         run: ls -lh wheelhouse/
 
       - name: Upload built wheels
-        uses: actions/upload-artifact@v4
+        uses: actions/upload-artifact@v7
         with:
           name: qc-permanent-${{ matrix.os }}-py${{ matrix.python-version }}
           path: wheelhouse/*.whl
@@ -87,7 +87,7 @@ jobs:
 
     steps:
       - name: Download all the dists
-        uses: actions/download-artifact@v4
+        uses: actions/download-artifact@v8
         with:
           path: dist/
           merge-multiple: true  # Merge all artifacts into the dist/ directory

From 50ad2238765bdb82c7f5388c740cab268f4febac Mon Sep 17 00:00:00 2001
From: Fanwang Meng <fanwang.meng@queensu.ca>
Date: Thu, 23 Apr 2026 11:49:01 +0800
Subject: [PATCH 08/14] Update the GitHub Actions versions

---
 .github/workflows/pull_request_build.yml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/pull_request_build.yml b/.github/workflows/pull_request_build.yml
index 9129eaa..e46cd66 100644
--- a/.github/workflows/pull_request_build.yml
+++ b/.github/workflows/pull_request_build.yml
@@ -17,14 +17,14 @@ jobs:
       fail-fast: false
       matrix:
         os: [ubuntu-22.04, macos-14, windows-2022]
-        python-version: ["3.9", "3.10", "3.11", "3.12"]
+        python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
 
     steps:
       - name: Checkout repository
         uses: actions/checkout@v4
 
       - name: Setup Python
-        uses: actions/setup-python@v5
+        uses: actions/setup-python@v6
         with:
           python-version: "3.13"  # Use latest Python to install cibuildwheel
 
@@ -58,7 +58,7 @@ jobs:
         run: ls -lh wheelhouse/
 
       - name: Upload built wheels
-        uses: actions/upload-artifact@v4
+        uses: actions/upload-artifact@v7
         with:
           name: qc-permanent-${{ github.event.pull_request.number }}-${{ matrix.os }}-py${{ matrix.python-version }}
           path: wheelhouse/*.whl

From d08c5c17d0f0cae06cfcd0877d4efc6bdf022c8c Mon Sep 17 00:00:00 2001
From: Fanwang Meng <fanwang.meng@queensu.ca>
Date: Thu, 23 Apr 2026 12:08:41 +0800
Subject: [PATCH 09/14] Fix equation rendering

---
 README.md | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 7b59222..f28c658 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-[![Python 3](http://img.shields.io/badge/python-3-blue.svg)](https://docs.python.org/3/)
+[![This project supports Python 3.10+](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://python.org/downloads)
 [![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/theochem/matrix-permanent/actions/workflows/pull_request.yml)
 [![GNU GPLv3](https://img.shields.io/badge/license-%20%20GNU%20GPLv3%20-green?style=plastic)](https://www.gnu.org/licenses/gpl-3.0.en.html)
 
@@ -37,9 +37,10 @@ Compute the permanent of a matrix using an automatically selected algorithm. The
 Compute the permanent of a matrix combinatorically.
 
 **Formula:**
-```math
+
+$$
 \text{per}(A) = \sum_{\sigma \in P(N,M)}{\prod_{i=1}^M{a_{i,{\sigma(i)}}}}
-```
+$$
 
 **Parameters:**
 
@@ -55,29 +56,29 @@ Compute the permanent of a matrix combinatorically.
 
 **Formula:**
 
-```math
+$$
 \text{per}(A) = \frac{1}{2^{N-1}} \cdot \sum_{\delta \in \left[\delta_1 = 1,~ \delta_2 \dots \delta_N=\pm1\right]}{
     \left(\sum_{k=1}^N{\delta_k}\right){\prod_{j=1}^N{\sum_{i=1}^N{\delta_i a_{i,j}}}}}
-```
+$$
 
 **Additional Information:**
 The original formula has been generalized here to work with $M$-by-$N$ rectangular permanents with
 $M \leq N$ by use of the following identity (shown here for $M \geq N$):
 
-```math
+$$
 \text{per}\left(\begin{matrix}a_{1,1} & \cdots & a_{1,N} \\ \vdots & \ddots & \vdots \\ a_{M,1} & \cdots & a_{M,N}\end{matrix}\right) = \frac{1}{(M - N + 1)!} \cdot \text{per}\left(\begin{matrix}a_{1,1} & \cdots & a_{1,N} & 1_{1,N+1} & \cdots & 1_{1,M} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ a_{M,1} & \cdots & a_{M,N} & 1_{M,N+1} & \cdots & 1_{M,M}\end{matrix}\right)
-```
+$$
 
 This can be neatly fit into the original formula by extending the inner sums over $\delta$ from $[1,M]$ to $[1,N]$:
 
-```math
+$$
 \text{per}(A) = \frac{1}{2^{N-1}} \cdot \frac{1}{(N - M + 1)!}\cdot \sum_{\delta \in \left[\delta_1 = 1,~ \delta_2 \dots \delta_N=\pm1\right]}{
         \left(\sum_{k=1}^N{\delta_k}\right)
         \prod_{j=1}^N{\left(
             \sum_{i=1}^M{\delta_i a_{i,j}} + \sum_{i=M+1}^N{\delta_i}
         \right)}
     }
-```
+$$
 
 **Parameters:**
 
@@ -93,7 +94,7 @@ This can be neatly fit into the original formula by extending the inner sums ove
 
 **Formula:**
 
-```math
+$$
 \text{per}(A) = \sum_{k=0}^{M-1}{
         {(-1)}^k
         \binom{N - M + k}{k}
@@ -103,7 +104,7 @@ This can be neatly fit into the original formula by extending the inner sums ove
             }
         }
     }
-```
+$$
 
 **Parameters:**
 
@@ -182,4 +183,4 @@ is with pip.
 # License
 
 This code is distributed under the GNU General Public License version 3 (GPLv3).
-See <http://www.gnu.org/licenses/> for more information.
\ No newline at end of file
+See <http://www.gnu.org/licenses/> for more information.

From b87547ae9678e91737fdbec98621e6d7590f089f Mon Sep 17 00:00:00 2001
From: Fanwang Meng <fanwang.meng@queensu.ca>
Date: Thu, 23 Apr 2026 12:17:11 +0800
Subject: [PATCH 10/14] Use https instead of http

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index f28c658..344fbf4 100644
--- a/README.md
+++ b/README.md
@@ -183,4 +183,4 @@ is with pip.
 # License
 
 This code is distributed under the GNU General Public License version 3 (GPLv3).
-See <http://www.gnu.org/licenses/> for more information.
+See <https://www.gnu.org/licenses/> for more information.

From 4ba969a4dc703d784af3d44711cae001f15bb95a Mon Sep 17 00:00:00 2001
From: Fanwang Meng <fanwang.meng@queensu.ca>
Date: Thu, 23 Apr 2026 12:20:38 +0800
Subject: [PATCH 11/14] Fix the misaligned equations

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 344fbf4..3d3cb6c 100644
--- a/README.md
+++ b/README.md
@@ -66,7 +66,9 @@ The original formula has been generalized here to work with $M$-by-$N$ rectangul
 $M \leq N$ by use of the following identity (shown here for $M \geq N$):
 
 $$
+\begin{aligned}
 \text{per}\left(\begin{matrix}a_{1,1} & \cdots & a_{1,N} \\ \vdots & \ddots & \vdots \\ a_{M,1} & \cdots & a_{M,N}\end{matrix}\right) = \frac{1}{(M - N + 1)!} \cdot \text{per}\left(\begin{matrix}a_{1,1} & \cdots & a_{1,N} & 1_{1,N+1} & \cdots & 1_{1,M} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ a_{M,1} & \cdots & a_{M,N} & 1_{M,N+1} & \cdots & 1_{M,M}\end{matrix}\right)
+\end{aligned}
 $$
 
 This can be neatly fit into the original formula by extending the inner sums over $\delta$ from $[1,M]$ to $[1,N]$:

From 7248a97304692190742ddc2688ac981540908a89 Mon Sep 17 00:00:00 2001
From: Fanwang Meng <fanwang.meng@queensu.ca>
Date: Thu, 23 Apr 2026 12:38:25 +0800
Subject: [PATCH 12/14] Fix rendering2 (#39)

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3d3cb6c..f94be1d 100644
--- a/README.md
+++ b/README.md
@@ -62,7 +62,7 @@ $$
 $$
 
 **Additional Information:**
-The original formula has been generalized here to work with $M$-by-$N$ rectangular permanents with
+The original formula has been generalized here to work with $M$ by $N$ rectangular permanents with
 $M \leq N$ by use of the following identity (shown here for $M \geq N$):
 
 $$

From a41fcf829620b421409013c16cf33c49e2f71ce2 Mon Sep 17 00:00:00 2001
From: Michelle Richer <michellericher93@gmail.com>
Date: Thu, 23 Apr 2026 11:00:52 -0400
Subject: [PATCH 13/14] Fix instructions in readme

---
 README.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index f94be1d..eb809d7 100644
--- a/README.md
+++ b/README.md
@@ -143,7 +143,7 @@ is with pip.
 
    ```bash
    python -m venv permanents
-   source permanents/bin/activate  
+   source permanents/bin/activate
    ```
 
 2. Install Python dependencies:
@@ -166,9 +166,7 @@ is with pip.
 
   If you want to generate a machine-specific tuning header for building the library, you must first install with tuning dependencies, and then build with tuning enabled:
    ```bash
-   pip install '.[tune]'
-   make clean
-   PERMANENT_TUNE=1 PERMANENT_PYTHON=1 make
+   PERMANENT_TUNE=ON pip install '.[tune]'
    ```
 
   This compiles the code with machine specific tuning for algorithm swapping.
@@ -185,4 +183,4 @@ is with pip.
 # License
 
 This code is distributed under the GNU General Public License version 3 (GPLv3).
-See <https://www.gnu.org/licenses/> for more information.
+See <https://www.gnu.org/licenses/> for more information.
\ No newline at end of file

From 7cf1942fc5fff73f0f8804a17d91aae4750f2bdc Mon Sep 17 00:00:00 2001
From: Michelle Richer <michellericher93@gmail.com>
Date: Thu, 23 Apr 2026 11:07:03 -0400
Subject: [PATCH 14/14] Revert "Merge branch 'main' into benchmarks"

This reverts commit d52df4bb9df0c45bc364e3df3358ce54150ca7fc, reversing
changes made to a41fcf829620b421409013c16cf33c49e2f71ce2.
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index b4674c2..eb809d7 100644
--- a/README.md
+++ b/README.md
@@ -183,4 +183,4 @@ is with pip.
 # License
 
 This code is distributed under the GNU General Public License version 3 (GPLv3).
-See <https://www.gnu.org/licenses/> for more information.
+See <https://www.gnu.org/licenses/> for more information.
\ No newline at end of file