Granular (Upper and lower) Sensitivity #35
Thanks @jonasViehweger, this is definitely a nice feature.
diff --git a/nrt/monitor/__init__.py b/nrt/monitor/__init__.py
index c906558..3c69b03 100644
--- a/nrt/monitor/__init__.py
+++ b/nrt/monitor/__init__.py
@@ -419,6 +419,7 @@ class BaseNrt(metaclass=abc.ABCMeta):
nc_var = src.variables[k]
# bool are stored as int in netcdf and need to be coerced back to bool
is_bool = 'dtype' in nc_var.ncattrs() and nc_var.getncattr('dtype') == 'bool'
+ is_tuple = 'python_type' in nc_var.ncattrs() and nc_var.getncattr('python_type') == 'tuple'
try:
v = nc_var.value
if is_bool:
@@ -427,6 +428,8 @@ class BaseNrt(metaclass=abc.ABCMeta):
v = nc_var[:]
if is_bool:
v = v.astype(np.bool)
+ if is_tuple:
+ v = tuple(v)
if k == 'x':
k = 'x_coords'
if k == 'y':
@@ -434,7 +437,7 @@ class BaseNrt(metaclass=abc.ABCMeta):
# TODO A different way to name the third dimensions would be
# good. Right now the names might also clash with other
# attribute names (unlikely, but e.g. n, h in MOSUM)
- if k in src.dimensions.keys():
+ if k in src.dimensions.keys() and not is_tuple:
continue
d.update({k:v})
return cls(**d)
@@ -489,6 +492,12 @@ class BaseNrt(metaclass=abc.ABCMeta):
elif isinstance(v, int):
new_var = dst.createVariable(k, 'i4')
new_var.value = v
+ elif isinstance(v, tuple):
+ tup_dim = dst.createDimension(k, len(v))
+ new_var = dst.createVariable(k, 'f4', (k,))
+ new_var[:] = list(v)
+ # Mark it as a tuple for reconstruction in from_netcdf
+ new_var.setncattr('python_type', 'tuple')
def set_xy(self, dataarray):
self.x = dataarray.x.values
Two positive floats sounds good to me
yes
Sorry, just reading this now. Is it possible/does it make sense for all monitoring methods?
For some inputs (TCW, NDMI) and some types of seasonal trajectories, the residuals that signal a true break, as opposed to time-series artifacts like seasonality or missed clouds/snow, point in only one direction.
For example, with NDMI, true disturbances will almost always show up as sustained lower NDMI values.
Another case is residual distributions. Even though residual distributions will be centered around a mean of 0, the distribution of residuals might be quite lopsided. My guess is that this is one of the reasons why IQR performs so well out of the box; it has some amount of granular upper and lower sensitivity built in through its statistical limits.
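To make the idea concrete, here is a minimal sketch of asymmetric IQR limits. It assumes an outlier rule of the form `q25 - s_lower * IQR` / `q75 + s_upper * IQR`; the function names and the `(lower, upper)` ordering of the tuple are illustrative assumptions and not nrt's actual API.

```python
from statistics import quantiles


def iqr_limits(residuals, sensitivity=(1.5, 1.5)):
    """Return (lower_limit, upper_limit) for a sequence of residuals.

    `sensitivity` is assumed to be a (lower, upper) pair of positive floats.
    This is a hypothetical helper, not part of nrt.
    """
    lower_sens, upper_sens = sensitivity
    # Quartiles of the residuals from the stable history period
    q25, _, q75 = quantiles(residuals, n=4, method="inclusive")
    iqr = q75 - q25
    return q25 - lower_sens * iqr, q75 + upper_sens * iqr


def is_outlier(residual, limits):
    """True if the residual falls outside the (lower, upper) limits."""
    lower, upper = limits
    return residual < lower or residual > upper


history = [-2, -1, 0, 1, 2]
# Strict on negative deviations, lenient on positive ones
limits = iqr_limits(history, sensitivity=(1.0, 3.0))
print(limits, is_outlier(-4, limits), is_outlier(4, limits))
```

With a lower sensitivity that is smaller than the upper one, a sustained drop is flagged sooner than a rise of the same size, which matches the NDMI case described above.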
I carried out a hyperparameter tuning test for IQR using the jr-digital/DISFOR dataset. With NDMI as input and F1 score as the optimization target, the optimal parameters for this test set have lopsided sensitivities:
compared to the optimal parameters with just a single sensitivity value for both positive and negative deviations
Being able to set sensitivities individually improves F1 by around 0.03, i.e. about 3 percentage points (0.762455 with separate sensitivities vs. 0.731636 with a single one).
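One way to accept both forms of the parameter is sketched below, assuming a single number should keep meaning "same sensitivity in both directions" and a pair means two positive floats. `normalize_sensitivity` is a hypothetical helper and the `(lower, upper)` ordering is an assumption, not necessarily what this pull request does.

```python
from numbers import Real


def normalize_sensitivity(sensitivity):
    """Return a (lower, upper) tuple of positive floats.

    Hypothetical helper, not part of nrt. Accepts either a single number
    (applied to both directions) or an iterable of exactly two numbers.
    """
    # bool is a subclass of int, so exclude it explicitly
    if isinstance(sensitivity, Real) and not isinstance(sensitivity, bool):
        pair = (float(sensitivity), float(sensitivity))
    else:
        pair = tuple(float(s) for s in sensitivity)
        if len(pair) != 2:
            raise ValueError("sensitivity must be a number or a pair of numbers")
    if any(s <= 0 for s in pair):
        raise ValueError("sensitivities must be positive")
    return pair


print(normalize_sensitivity(1.5))
print(normalize_sensitivity((1.0, 3.0)))
```

Normalizing once in the constructor would keep existing calls with a single float working while the monitoring code only ever deals with a pair.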
I've implemented this additional behaviour for IQR in this pull request. There are a few things left to clear up before giving every class the same treatment:
_update_process().
Any thoughts @loicdtx?