StateSpaceLearning.jl is a Julia package for time-series analysis using the state space learning framework.

LAMPSPUC/StateSpaceLearning.jl

StateSpaceLearning

StateSpaceLearning.jl is a package for modeling and forecasting time series in a high-dimensional regression framework.

Quickstart

using StateSpaceLearning

y = randn(100)

# Instantiate Model
model = StructuralModel(y)

# Fit Model
fit!(model)

# Point Forecast
prediction = forecast(model, 12) # 12-step-ahead point forecast

# Scenarios Path Simulation
simulation = simulate(model, 12, 1000) # 1000 simulated scenario paths, each 12 steps ahead

StructuralModel Arguments

  • y::Union{Vector,Matrix}: Time series data
  • level::String: Level component type: "stochastic", "deterministic", or "none" (default: "stochastic")
  • slope::String: Slope component type: "stochastic", "deterministic", or "none" (default: "stochastic")
  • seasonal::String: Seasonal component type: "stochastic", "deterministic", or "none" (default: "stochastic")
  • cycle::String: Cycle component type: "stochastic", "deterministic", or "none" (default: "none")
  • freq_seasonal::Union{Int,Vector{Int}}: Seasonal frequency or vector of frequencies (default: 12)
  • cycle_period::Union{Union{Int,<:AbstractFloat},Vector{Int},Vector{<:AbstractFloat}}: Cycle period or vector of periods (default: 0)
  • outlier::Bool: Include outlier component (default: true)
  • ξ_threshold::Int: Threshold for level innovations (default: 1)
  • ζ_threshold::Int: Threshold for slope innovations (default: 12)
  • ω_threshold::Int: Threshold for seasonal innovations (default: 12)
  • ϕ_threshold::Int: Threshold for cycle innovations (default: 12)
  • stochastic_start::Int: Starting point for stochastic components (default: 1)
  • exog::Matrix: Matrix of exogenous variables (default: zeros(length(y), 0))
  • dynamic_exog_coefs::Union{Vector{<:Tuple}, Nothing}: Dynamic exogenous coefficients (default: nothing)

Features

Current features include:

  • Model estimation using elastic net based regularization
  • Automatic component decomposition (trend, seasonal, cycle)
  • Point forecasting and scenario simulation
  • Missing value imputation
  • Outlier detection and robust modeling
  • Multiple seasonal frequencies support
  • Deterministic and stochastic component options
  • Dynamic exogenous variable handling
  • Best subset selection for exogenous variables

Quick Examples

Fitting, forecasting and simulating

Quick example of fitting and forecasting the air passengers time series.

using StateSpaceLearning
using CSV
using DataFrames
using Plots

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)
steps_ahead = 30

model = StructuralModel(log_air_passengers)
StateSpaceLearning.fit!(model)
prediction_log = StateSpaceLearning.forecast(model, steps_ahead) # arguments: the fitted model and the number of steps ahead to forecast
prediction = exp.(prediction_log)

plot_point_forecast(airp.passengers, prediction)

N_scenarios = 1000
simulation = simulate(model, steps_ahead, N_scenarios) # arguments: the fitted model, the number of steps ahead, and the number of scenario paths

plot_scenarios(airp.passengers, exp.(simulation))
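
Simulated scenarios are often summarized as pointwise quantile bands. The sketch below shows one way to do that with the Statistics standard library. It assumes the output of `simulate` is a `steps_ahead × N_scenarios` matrix, which this README does not state, and it uses a random stand-in matrix rather than a fitted model.

```julia
using Statistics
using Random

Random.seed!(1)

# Stand-in for `simulate(model, 12, 1000)`. The steps_ahead × N_scenarios
# layout is an assumption, not something confirmed by the package docs here.
steps_ahead, n_scenarios = 12, 1000
simulation = cumsum(randn(steps_ahead, n_scenarios); dims=1)

# Pointwise 5%, 50% and 95% quantiles across scenarios, one row per step.
probs = [0.05, 0.5, 0.95]
bands = [quantile(simulation[t, :], p) for t in 1:steps_ahead, p in probs]

lower, median_path, upper = bands[:, 1], bands[:, 2], bands[:, 3]
```

If the model was fitted on log data, as in the example above, apply `exp.` to the bands to return to the original scale.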

Component Extraction

Quick example of extracting the trend and seasonal components of a time series with StateSpaceLearning.

using StateSpaceLearning
using CSV
using DataFrames
using Plots

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

model = StructuralModel(log_air_passengers)
fit!(model)

# Access decomposed components directly
trend = model.output.decomposition["trend"]
seasonal = model.output.decomposition["seasonal_12"]

plot(trend, w=2, color = "Black", lab = "Trend Component", legend = :outerbottom)
plot(seasonal, w=2, color = "Black", lab = "Seasonal Component", legend = :outerbottom)

Best Subset Selection and Dynamic Coefficients

Example of performing best subset selection and using dynamic coefficients:

using StateSpaceLearning
using CSV
using DataFrames
using Random

Random.seed!(2024)

# Load data
airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

# Create exogenous features
X = rand(length(log_air_passengers), 10)
β = rand(3)
y = log_air_passengers + X[:, 1:3]*β

# Create model with exogenous variables
model = StructuralModel(y; 
    exog = X
)

# Fit model with elastic net regularization
fit!(model; 
    α = 1.0, # 1.0 for Lasso, 0.0 for Ridge
    information_criteria = "bic",
    ϵ = 0.05,
    penalize_exogenous = true,
    penalize_initial_states = true
)

# Get selected features
selected_exog = model.output.components["exog"]["Selected"]
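
The `α` keyword above mixes the Lasso and Ridge penalties. As a rough illustration, the sketch below evaluates an elastic net penalty in the glmnet convention. This convention is an assumption: the exact objective and scaling used inside StateSpaceLearning may differ, and `elastic_net_penalty` and `λ` are names introduced here for illustration only.

```julia
# Elastic net penalty in the glmnet convention (assumed here):
#   λ * (α * ‖β‖₁ + (1 - α) / 2 * ‖β‖₂²)
elastic_net_penalty(β, λ, α) = λ * (α * sum(abs, β) + (1 - α) / 2 * sum(abs2, β))

β = [0.5, 0.0, -2.0]
λ = 0.1

lasso_penalty = elastic_net_penalty(β, λ, 1.0)  # α = 1.0: pure L1 (Lasso)
ridge_penalty = elastic_net_penalty(β, λ, 0.0)  # α = 0.0: pure L2 (Ridge)
mixed_penalty = elastic_net_penalty(β, λ, 0.5)  # equal mix of both terms
```

The L1 term is what can drive coefficients exactly to zero, which is why `α = 1.0` is the natural choice when the goal is to select a subset of exogenous variables.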

Missing values imputation

Quick example of completion of missing values for the air passengers time-series (artificial NaN values are added to the original time-series).

using StateSpaceLearning
using CSV
using DataFrames
using Plots

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

airpassengers = AbstractFloat.(airp.passengers)
log_air_passengers[60:72] .= NaN

model = StructuralModel(log_air_passengers)
fit!(model)

# Keep only the imputed window; everything else stays NaN so it is not plotted
fitted_completed_missing_values = fill(NaN, 144)
fitted_completed_missing_values[60:72] = exp.(model.output.fitted[60:72])
real_removed_valued = fill(NaN, 144)
real_removed_valued[60:72] = airp.passengers[60:72]
airpassengers[60:72] .= NaN

plot(airpassengers, w=2 , color = "Black", lab = "Historical", legend = :outerbottom)
plot!(real_removed_valued, lab = "Real Removed Values", w=2, color = "red")
plot!(fitted_completed_missing_values, lab = "Fit in Sample completed values", w=2, color = "blue")

Outlier Detection

Quick example of outlier detection for an altered air passengers time series (artificial outliers are injected into the original time series).

using StateSpaceLearning
using CSV
using DataFrames
using Plots

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

log_air_passengers[60] = 10
log_air_passengers[30] = 1
log_air_passengers[100] = 2

model = StructuralModel(log_air_passengers)
fit!(model)

detected_outliers = findall(i -> i != 0, model.output.components["o"]["Coefs"])

plot(log_air_passengers, w=2 , color = "Black", lab = "Historical", legend = :outerbottom)
scatter!(detected_outliers, log_air_passengers[detected_outliers], lab = "Detected Outliers")

StateSpaceModels initialization

Quick example of using the residual variances estimated by StateSpaceLearning as initial hyperparameters for a StateSpaceModels model.

using StateSpaceLearning
using CSV
using DataFrames
using StateSpaceModels

airp = CSV.File(StateSpaceLearning.AIR_PASSENGERS) |> DataFrame
log_air_passengers = log.(airp.passengers)

model = StructuralModel(log_air_passengers)
fit!(model)

residuals_variances = model.output.residuals_variances

ss_model = BasicStructural(log_air_passengers, 12)
StateSpaceModels.set_initial_hyperparameters!(ss_model, Dict("sigma2_ε" => residuals_variances["ε"],
                                         "sigma2_ξ" => residuals_variances["ξ"],
                                         "sigma2_ζ" => residuals_variances["ζ"],
                                         "sigma2_ω" => residuals_variances["ω_12"]))
StateSpaceModels.fit!(ss_model)

Paper Results Reproducibility

To reproduce the paper results, run the following experiments:

M4 Competition Test

Evaluates SSL and SS (StateSpaceModels) benchmark models on the M4 competition dataset across all granularities (Monthly, Quarterly, Daily, Hourly, Weekly, Yearly).

Before running: Add PyCall to your Julia environment:

using Pkg
Pkg.add("PyCall")

The script also requires Python packages (statsmodels, numpy) for the SS benchmark evaluation.

julia paper_tests/m4_test/m4_test.jl

The script:

  • Downloads M4 competition datasets directly from GitHub
  • Runs SSL models with various configurations (with/without outliers, different selection methods, and alpha values)
  • Runs SS (StateSpaceModels) benchmark models using PyCall
  • Calculates metrics (MASE, sMAPE, OWA, CRPS) for all models
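
For reference, the point-forecast metrics listed above can be sketched as follows. These are the standard M4 competition definitions written from scratch; they are not taken from `m4_test.jl`, and the function names are illustrative. CRPS is left out because it needs the full set of simulated scenarios.

```julia
using Statistics

# sMAPE in percent, as defined in the M4 competition.
smape(actual, forecast) =
    200 * mean(abs.(actual .- forecast) ./ (abs.(actual) .+ abs.(forecast)))

# MASE: out-of-sample MAE scaled by the in-sample MAE of the seasonal naive
# forecast with seasonal period `m`.
function mase(insample, actual, forecast, m)
    scale = mean(abs.(insample[(m + 1):end] .- insample[1:(end - m)]))
    return mean(abs.(actual .- forecast)) / scale
end

# OWA: average of sMAPE and MASE, each relative to a benchmark (Naive2 in M4).
owa(smape_model, mase_model, smape_bench, mase_bench) =
    (smape_model / smape_bench + mase_model / mase_bench) / 2

# Toy data, for illustration only.
insample = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
actual = [13.0, 15.0]
forecast = [12.0, 16.0]

smape_value = smape(actual, forecast)
mase_value = mase(insample, actual, forecast, 2)
```

An OWA below 1 means the model beats the benchmark on the average of the two relative errors.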

Results are saved in:

  • paper_tests/m4_test/results_SSL/ - SSL model results by granularity
  • paper_tests/m4_test/metrics_results/ - Summary metrics and benchmark results

Simulation Parameter Study

Compares SSL against the Kalman filter on simulated data.

julia paper_tests/simulation_param/simulation.jl [repetitions] [sample_sizes]

Example:

julia paper_tests/simulation_param/simulation.jl 100 60,120,240

Results are saved in paper_tests/simulation_param/ssl_vs_kalman_paired_tests.csv.
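
The two positional arguments above are a repetition count and a comma-separated list of sample sizes. The sketch below shows how arguments of that shape might be parsed in Julia. It is a hypothetical helper, not the code in `simulation.jl`, and the default values are placeholders that mirror the example command rather than the script's real defaults.

```julia
# Hypothetical parser for `[repetitions] [sample_sizes]`.
# Not the script's actual code; the defaults are placeholders.
function parse_cli(args; default_reps=100, default_sizes=[60, 120, 240])
    repetitions = length(args) >= 1 ? parse(Int, args[1]) : default_reps
    sample_sizes = length(args) >= 2 ? parse.(Int, split(args[2], ",")) : default_sizes
    return repetitions, sample_sizes
end

# Equivalent to the arguments in `simulation.jl 100 60,120,240`.
repetitions, sample_sizes = parse_cli(["100", "60,120,240"])
```

In a script, the same function would be called as `parse_cli(ARGS)`.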

Simulation Test

Large-scale simulation comparing SSL, SS, and other methods.

julia paper_tests/simulation_test/simulation.jl [num_workers]

Example (with 4 parallel workers):

julia paper_tests/simulation_test/simulation.jl 4

Results are saved in paper_tests/simulation_test/results_simulation/ and paper_tests/simulation_test/results_metrics/.

Contributing

  • PRs such as adding new models and fixing bugs are very welcome!
  • For nontrivial changes, please open an issue first to discuss them.
