StateSpaceLearning

StateSpaceLearning.jl is a package for modeling and forecasting time series in a high-dimensional regression framework.

Quickstart

using StateSpaceLearning

y = randn(100)

# Instantiate Model
model = StructuralModel(y)

# Fit Model
fit!(model)

# Point Forecast
prediction = forecast(model, 12) # 12-step-ahead point forecast

# Scenarios Path Simulation
simulation = simulate(model, 12, 1000) # 1000 scenario paths, each 12 steps ahead

StructuralModel Arguments

y::Union{Vector,Matrix}: Time series data
level::String: Level component type: "stochastic", "deterministic", or "none" (default: "stochastic")
slope::String: Slope component type: "stochastic", "deterministic", or "none" (default: "stochastic")
seasonal::String: Seasonal component type: "stochastic", "deterministic", or "none" (default: "stochastic")
cycle::String: Cycle component type: "stochastic", "deterministic", or "none" (default: "none")
freq_seasonal::Union{Int,Vector{Int}}: Seasonal frequency or vector of frequencies (default: 12)
cycle_period::Union{Union{Int,<:AbstractFloat},Vector{Int},Vector{<:AbstractFloat}}: Cycle period or vector of periods (default: 0)
outlier::Bool: Include outlier component (default: true)
ξ_threshold::Int: Threshold for level innovations (default: 1)
ζ_threshold::Int: Threshold for slope innovations (default: 12)
ω_threshold::Int: Threshold for seasonal innovations (default: 12)
ϕ_threshold::Int: Threshold for cycle innovations (default: 12)
stochastic_start::Int: Starting point for stochastic components (default: 1)
exog::Matrix: Matrix of exogenous variables (default: zeros(length(y), 0))
dynamic_exog_coefs::Union{Vector{<:Tuple}, Nothing}: Dynamic exogenous coefficients (default: nothing)

Features

Current features include:

Model estimation using elastic net based regularization
Automatic component decomposition (trend, seasonal, cycle)
Point forecasting and scenario simulation
Missing value imputation
Outlier detection and robust modeling
Multiple seasonal frequencies support
Deterministic and stochastic component options
Dynamic exogenous variable handling
Best subset selection for exogenous variables

Quick Examples

Fitting, forecasting and simulating

Quick example of fitting and forecasting the air passengers time series.

using StateSpaceLearning
using CSV
using DataFrames
using Plots

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)
steps_ahead = 30

model = StructuralModel(log_air_passengers)
StateSpaceLearning.fit!(model)
prediction_log = StateSpaceLearning.forecast(model, steps_ahead) # arguments: the fitted model and the number of steps ahead to forecast
prediction = exp.(prediction_log)

plot_point_forecast(airp.passengers, prediction)

N_scenarios = 1000
simulation = simulate(model, steps_ahead, N_scenarios) # arguments: the fitted model, the number of steps ahead to forecast, and the number of scenario paths

plot_scenarios(airp.passengers, exp.(simulation))
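
The scenario paths can also be summarized numerically, for example as pointwise quantile bands. The following is a minimal sketch, not part of the package: it assumes the simulation output is a matrix with one row per forecast step and one column per scenario, and it uses a random stand-in matrix so that it runs without fitting a model. The names `bands`, `lower`, `median_path` and `upper` are illustrative.

```julia
using Statistics
using Random

Random.seed!(1)

steps_ahead = 12
n_scenarios = 1000

# Stand-in for the output of `simulate(model, steps_ahead, n_scenarios)`.
# Assumption: one row per forecast step, one column per scenario path.
simulation = cumsum(0.1 .* randn(steps_ahead, n_scenarios); dims=1)

# Pointwise 5%, 50% and 95% quantiles across scenarios for each step.
probs = [0.05, 0.5, 0.95]
bands = [quantile(view(simulation, t, :), p) for t in 1:steps_ahead, p in probs]

lower, median_path, upper = bands[:, 1], bands[:, 2], bands[:, 3]
```

If the model was fitted on log data, as in the example above, apply `exp.` to the bands to return to the original scale.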

Component Extraction

Quick example of extracting components from a time series with StateSpaceLearning.

using StateSpaceLearning
using CSV
using DataFrames
using Plots

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

model = StructuralModel(log_air_passengers)
fit!(model)

# Access decomposed components directly
trend = model.output.decomposition["trend"]
seasonal = model.output.decomposition["seasonal_12"]

plot(trend, w=2, color = "Black", lab = "Trend Component", legend = :outerbottom)
plot(seasonal, w=2, color = "Black", lab = "Seasonal Component", legend = :outerbottom)

Best Subset Selection and Dynamic Coefficients

Example of performing best subset selection and using dynamic coefficients:

using StateSpaceLearning
using CSV
using DataFrames
using Random

Random.seed!(2024)

# Load data
airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

# Create exogenous features
X = rand(length(log_air_passengers), 10)
β = rand(3)
y = log_air_passengers + X[:, 1:3]*β

# Create model with exogenous variables
model = StructuralModel(y; 
    exog = X
)

# Fit model with elastic net regularization
fit!(model; 
    α = 1.0, # 1.0 for Lasso, 0.0 for Ridge
    information_criteria = "bic",
    ϵ = 0.05,
    penalize_exogenous = true,
    penalize_initial_states = true
)

# Get selected features
selected_exog = model.output.components["exog"]["Selected"]
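
To make the role of `α` concrete, here is a minimal sketch of the elastic net penalty in the widely used glmnet parameterization. That the package applies exactly this form is an assumption; it is consistent with the comment above (1.0 for Lasso, 0.0 for Ridge). The function name `elastic_net_penalty` and the argument `λ` are illustrative and not part of the package API.

```julia
# Elastic net penalty in the glmnet parameterization (assumed here):
#   λ * (α * ‖β‖₁ + (1 - α) / 2 * ‖β‖₂²)
elastic_net_penalty(β, λ, α) = λ * (α * sum(abs, β) + (1 - α) / 2 * sum(abs2, β))

β = [0.5, -2.0, 0.0]

lasso_penalty = elastic_net_penalty(β, 1.0, 1.0)  # pure L1: 0.5 + 2.0 + 0.0 = 2.5
ridge_penalty = elastic_net_penalty(β, 1.0, 0.0)  # pure L2: (0.25 + 4.0) / 2 = 2.125
```

The L1 term is what can shrink coefficients exactly to zero, which is why `α = 1.0` is the natural setting when the goal is to select a subset of exogenous variables.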

Missing values imputation

Quick example of imputing missing values in the air passengers time series (artificial NaN values are added to the original series).

using StateSpaceLearning
using CSV
using DataFrames
using Plots

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

airpassengers = AbstractFloat.(airp.passengers)
log_air_passengers[60:72] .= NaN

model = StructuralModel(log_air_passengers)
fit!(model)

# Series holding only the imputed values, back-transformed to the original scale
fitted_completed_missing_values = fill(NaN, 144)
fitted_completed_missing_values[60:72] = exp.(model.output.fitted[60:72])

# Series holding only the true values that were removed
real_removed_values = fill(NaN, 144)
real_removed_values[60:72] = airp.passengers[60:72]
airpassengers[60:72] .= NaN

plot(airpassengers, w=2, color = "Black", lab = "Historical", legend = :outerbottom)
plot!(real_removed_values, lab = "Real Removed Values", w=2, color = "red")
plot!(fitted_completed_missing_values, lab = "In-sample fit for missing values", w=2, color = "blue")

Outlier Detection

Quick example of outlier detection for an altered air passengers time series (artificial outliers are added to the original series).

using StateSpaceLearning
using CSV
using DataFrames
using Plots

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

log_air_passengers[60] = 10
log_air_passengers[30] = 1
log_air_passengers[100] = 2

model = StructuralModel(log_air_passengers)
fit!(model)

detected_outliers = findall(i -> i != 0, model.output.components["o"]["Coefs"])

plot(log_air_passengers, w=2 , color = "Black", lab = "Historical", legend = :outerbottom)
scatter!(detected_outliers, log_air_passengers[detected_outliers], lab = "Detected Outliers")

StateSpaceModels initialization

Quick example of using StateSpaceLearning to initialize the hyperparameters of a StateSpaceModels model.

using StateSpaceLearning
using CSV
using DataFrames
using StateSpaceModels

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

model = StructuralModel(log_air_passengers)
fit!(model)

residuals_variances = model.output.residuals_variances

ss_model = BasicStructural(log_air_passengers, 12)
StateSpaceModels.set_initial_hyperparameters!(ss_model, Dict("sigma2_ε" => residuals_variances["ε"],
                                         "sigma2_ξ" => residuals_variances["ξ"],
                                         "sigma2_ζ" => residuals_variances["ζ"],
                                         "sigma2_ω" => residuals_variances["ω_12"]))
StateSpaceModels.fit!(ss_model)

Paper Results Reproducibility

To reproduce the paper results, run the following experiments:

M4 Competition Test

Evaluates SSL and SS (StateSpaceModels) benchmark models on the M4 competition dataset across all granularities (Monthly, Quarterly, Daily, Hourly, Weekly, Yearly).

Before running: Add PyCall to your Julia environment:

using Pkg
Pkg.add("PyCall")

The script also requires Python packages (statsmodels, numpy) for the SS benchmark evaluation.

julia paper_tests/m4_test/m4_test.jl

The script:

Downloads M4 competition datasets directly from GitHub
Runs SSL models with various configurations (with/without outliers, different selection methods, and alpha values)
Runs SS (StateSpaceModels) benchmark models using PyCall
Calculates metrics (MASE, sMAPE, OWA, CRPS) for all models
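
For reference, two of these metrics can be sketched as follows. This is a minimal illustration using the standard M4 definitions of sMAPE and MASE, not the code from the script; the function names `smape` and `mase` and the toy data are illustrative.

```julia
using Statistics

# sMAPE (in percent), as defined for the M4 competition.
smape(y, yhat) = 200 * mean(abs.(y .- yhat) ./ (abs.(y) .+ abs.(yhat)))

# MASE: out-of-sample mean absolute error, scaled by the in-sample mean
# absolute error of the seasonal naive forecast with period m.
function mase(y_train, y_test, yhat, m)
    scale = mean(abs.(y_train[(m + 1):end] .- y_train[1:(end - m)]))
    return mean(abs.(y_test .- yhat)) / scale
end

# Toy data for illustration only.
y_train = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
y_test = [13.0, 15.0]
yhat = [12.0, 16.0]

smape_value = smape(y_test, yhat)
mase_value = mase(y_train, y_test, yhat, 2)
```

A MASE below 1 means the forecast beats the in-sample seasonal naive benchmark on average; in this toy example the two errors coincide, so the MASE is exactly 1.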

Results are saved in:

paper_tests/m4_test/results_SSL/ - SSL model results by granularity
paper_tests/m4_test/metrics_results/ - Summary metrics and benchmark results

Simulation Parameter Study

Compares SSL vs Kalman filter on simulated data.

julia paper_tests/simulation_param/simulation.jl [repetitions] [sample_sizes]

Example:

julia paper_tests/simulation_param/simulation.jl 100 60,120,240

Results are saved in paper_tests/simulation_param/ssl_vs_kalman_paired_tests.csv.

Simulation Test

Large-scale simulation comparing SSL, SS, and other methods.

julia paper_tests/simulation_test/simulation.jl [num_workers]

Example (with 4 parallel workers):

julia paper_tests/simulation_test/simulation.jl 4

Results are saved in paper_tests/simulation_test/results_simulation/ and paper_tests/simulation_test/results_metrics/.

Contributing

PRs such as adding new models and fixing bugs are very welcome!
For nontrivial changes, you'll probably want to first discuss the changes via an issue.
