This repository contains the code from my summer research project on distributionally robust rare-event simulation for financial risk.
The research addresses robust risk management problems of the form:
sup_{ν: d(ν,μ) ≤ δ} P_ν(Loss ≥ threshold)
where we seek the worst-case probability over all distributions ν within distance δ of the nominal distribution μ.
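As a point of reference, the non-robust version of this problem (δ = 0, i.e. the nominal distribution itself) can be estimated by plain Monte Carlo. This is only a sketch with hypothetical parameter values; the actual mu, Sigma, and portfolio weights are defined in the experiment files:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical nominal model: multivariate normal returns, linear portfolio loss.
mu = np.zeros(3)                 # placeholder mean returns
Sigma = np.eye(3) * 0.01         # placeholder covariance
w = np.array([0.5, 0.3, 0.2])    # placeholder portfolio weights
threshold = 0.2

# Plain MC estimate of the nominal probability P_mu(Loss >= threshold).
X = rng.multivariate_normal(mu, Sigma, size=100_000)
loss = -X @ w                    # loss = negative portfolio return
p_nominal = np.mean(loss >= threshold)
```

The robust problem replaces this single expectation with a supremum over the ambiguity ball, which is what the two-stage estimators below target.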
- `Linear/` - Linear constraint problems (portfolio loss functions)
- `Quadratic/` - Quadratic constraint problems
- `piecewise/` - Piecewise linear constraints (options portfolios)
- `Rebalance Delta Hedging/` - Dynamic hedging with rebalancing costs
- `American Options/` - American option boundary analysis
- `American option boundary/` - Additional American option studies
Each problem type contains subdirectories for different estimation approaches:
- `CIS-CIS/` - Two-stage Conditional Importance Sampling
- `MC-MC/` - Two-stage Monte Carlo
- `CIS-MC/` - Hybrid: CIS for radius estimation, MC for probability
- `MC-CIS/` - Hybrid: MC for radius estimation, CIS for probability
- Stage 1: Estimate the optimal ambiguity radius `u*` that satisfies the constraint
- Stage 2: Estimate the worst-case probability at radius `u*`
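The sample allocation between the two stages can be sketched as follows. The estimators here are deliberately simplified stand-ins (a quantile and an exceedance frequency), not the repo's actual `get_radius_*`/`robust_*` logic; the point is only the stage structure and the `N_alpha`/`N_tot` split:

```python
import numpy as np

def estimate_radius(samples, delta):
    # Stage 1 stand-in: treat the (1 - delta)-quantile of a per-sample
    # statistic as the radius (hypothetical; the repo's get_radius_*
    # scripts implement the real estimator).
    return float(np.quantile(samples, 1.0 - delta))

def estimate_worst_p(samples, u_star):
    # Stage 2 stand-in: fraction of fresh samples exceeding u* (hypothetical).
    return float(np.mean(samples >= u_star))

rng = np.random.default_rng(1)
N_tot, N_alpha = 100_000, 20_000
stat = rng.standard_normal(N_tot)          # placeholder per-sample statistic

u_star = estimate_radius(stat[:N_alpha], delta=0.02)   # Stage 1 samples
p_hat = estimate_worst_p(stat[N_alpha:], u_star)       # Stage 2 samples
```

Using disjoint samples for the two stages keeps the stage-2 estimate conditionally unbiased given the stage-1 radius.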
- Monte Carlo (MC): Standard sampling-based estimation
- Conditional Importance Sampling (CIS): Variance reduction through optimal importance sampling
- Hybrid Methods: Combine MC and CIS for different stages
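To illustrate why importance sampling pays off for rare events, here is a textbook one-dimensional example (not the repo's CIS estimator): estimating P(X > 4) for a standard normal by shifting the sampling mean to the threshold and reweighting each draw by the likelihood ratio:

```python
import numpy as np

rng = np.random.default_rng(2)
t = 4.0        # rare threshold: P(X > 4) ~ 3.17e-5 for a standard normal
N = 100_000

# Plain MC: almost no samples land in the rare region, so the
# estimate is dominated by a handful of hits (or none at all).
x = rng.standard_normal(N)
p_mc = np.mean(x > t)

# Importance sampling: draw from N(t, 1) instead, and reweight by the
# likelihood ratio phi(y) / phi(y - t) = exp(-t*y + t^2/2).
y = rng.standard_normal(N) + t
weights = np.exp(-t * y + t**2 / 2)
p_is = np.mean((y > t) * weights)
```

With the mean shifted to the threshold, roughly half the samples contribute to the estimate, which is where the variance reduction comes from.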
- `get_worst_p_[type]_[method1]_[method2].py` - Main estimation functions
- `experiments_[method]_[type].py` - Experimental frameworks and parameter sweeps
- `get_radius_[method]_[type].py` - Stage 1 radius estimation
- `robust_[method]_[type].py` - Stage 2 probability estimation
- `find_xstar.py` - Optimization problem solver
- `shift_samples.py` - Importance sampling transformations
- Various helper functions for each constraint type
The experiments use S&P 500 stock data (MSFT, NVDA, AAPL, AMZN, META, GOOGL, BRK.B, TSLA, JPM, UNH):
- Returns from June 2021 to June 2024
- Mean returns μ and correlation matrix Rho defined in experiment files
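A sketch of how `mu` and `Rho` can be computed from a returns table. The data below is synthetic stand-in noise, since the actual 2021-2024 returns and the resulting parameters are defined in the experiment files:

```python
import numpy as np
import pandas as pd

# Tickers as used in the experiments; the returns here are synthetic.
tickers = ["MSFT", "NVDA", "AAPL", "AMZN", "META",
           "GOOGL", "BRK.B", "TSLA", "JPM", "UNH"]
rng = np.random.default_rng(3)
returns = pd.DataFrame(rng.normal(0.0005, 0.02, size=(756, 10)),
                       columns=tickers)   # ~3 years of trading days

mu = returns.mean().to_numpy()    # mean daily returns per ticker
Rho = returns.corr().to_numpy()   # correlation matrix
```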
- `*_results.csv` - Experimental results with performance metrics
- `ratios_vs_CIS_linear_results_with_MC.csv` - Method comparisons
- `plot.ipynb` - Visualization and analysis
```python
# Example: Linear CIS-CIS experiment
from Linear.CIS_CIS.experiments_cis_linear import run_threshold_sweep

# Set parameters
thresholds = [1, 2, 4, 6, 10]  # Loss thresholds to test
delta = 0.02                   # Ambiguity radius
N_tot = int(1e5)               # Total samples
runs = 5                       # Independent runs

# Run experiment (mu, Sigma, w, N_alpha, and rng are defined in the
# experiment files)
run_threshold_sweep(thresholds, delta, mu, Sigma, w, N_alpha, N_tot, runs, rng)
```

Parameters:

- `delta`: Ambiguity set radius (Wasserstein distance)
- `loss_threshold`: Loss level for probability estimation
- `mu`, `Sigma`: Mean and covariance of nominal distribution
- `w`: Portfolio weights (linear case) or constraint matrices (other cases)
- `N_alpha`, `N_tot`: Sample allocation between stages
- `runs`: Number of independent replications
The experiments track several key metrics:
- `mean_p`: Average worst-case probability estimate
- `rel_error`: Relative error (confidence interval / estimate)
- `lsre_mean`: Log-scale relative efficiency
- `mean_time_sec`: Computational time
- `var_H`, `var_p`: Variance estimates for each stage
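For instance, a relative error of the "confidence interval / estimate" kind can be computed from per-run estimates as a normal-approximation half-width over the mean. The numbers below are hypothetical, and whether this matches the repo's exact definition is an assumption:

```python
import numpy as np

# Hypothetical worst-case probability estimates from 5 independent runs.
p_runs = np.array([1.1e-3, 0.9e-3, 1.0e-3, 1.2e-3, 0.8e-3])

mean_p = p_runs.mean()
# 95% normal CI half-width from the run-to-run standard error.
half_width = 1.96 * p_runs.std(ddof=1) / np.sqrt(len(p_runs))
rel_error = half_width / mean_p
```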
The `ratios_vs_CIS_linear_results_with_MC.csv` file contains comparative analysis showing:
- Efficiency ratios between methods (CIS, MC, hybrid approaches)
- Time complexity comparisons
- Accuracy trade-offs
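One common convention for such comparisons is the work-normalized relative efficiency, the ratio of (variance × compute time) between two estimators; whether the CSV uses exactly this definition is an assumption, and the numbers below are hypothetical:

```python
# Hypothetical variance and runtime figures for two methods.
var_mc, time_mc = 4.0e-9, 10.0     # plain MC
var_cis, time_cis = 1.0e-10, 25.0  # CIS with importance sampling

# Work-normalized efficiency ratio: a value > 1 means CIS achieves
# lower variance per unit of compute despite being slower per sample.
efficiency_ratio = (var_mc * time_mc) / (var_cis * time_cis)
```

This captures the typical trade-off: CIS costs more per sample but reduces variance by a larger factor.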
- [`american_put_boundary.py`](<American option boundary/CIS-CIS ( no exercise) /american_put_boundary.py>) - Boundary computation and analysis of early exercise decisions under ambiguity
- `VaR_cis_cis.py` - Value-at-Risk under distributional ambiguity, with bootstrap confidence intervals for risk measures
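As a sketch of the bootstrap idea (synthetic losses, not the repo's `VaR_cis_cis.py` implementation), a percentile-bootstrap confidence interval for an empirical VaR looks like:

```python
import numpy as np

rng = np.random.default_rng(4)
losses = rng.normal(0.0, 1.0, size=5_000)  # hypothetical portfolio losses
alpha = 0.95

def var_at(sample, alpha):
    # Empirical Value-at-Risk: the alpha-quantile of the loss distribution.
    return np.quantile(sample, alpha)

# Percentile bootstrap: resample losses with replacement and recompute
# the VaR on each resample; the CI comes from the bootstrap quantiles.
boot = np.array([var_at(rng.choice(losses, size=losses.size, replace=True), alpha)
                 for _ in range(1_000)])
ci_low, ci_high = np.quantile(boot, [0.025, 0.975])
```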
- Dynamic rebalancing with transaction costs
- Robust hedging under parameter uncertainty
numpy
scipy
pandas
matplotlib
numba  # For performance optimization

This work addresses fundamental challenges in financial risk management:
- Model Uncertainty: Traditional risk models assume known distributions
- Computational Efficiency: Robust optimization problems are computationally intensive
- Practical Implementation: Methods must scale to realistic portfolio sizes
The hybrid CIS-MC approaches often provide the best balance of accuracy and computational efficiency, as demonstrated in the comparative results.