I was using the notebook download_merra2.ipynb. In the section "Setting up the DataFrame", the following call raised an error:
xr.open_mfdataset(file_path, concat_dim='date', preprocess=extract_date)
raised:
xarray ValueError: Could not find any dimension coordinates to use to order the datasets for concatenation
I solved it (for Germany) by changing the code in the following way:
def extract_date(data_set):
    """
    Extracts the date from the filename before merging the datasets.
    """
    try:
        # The attribute name changed during the development of this script
        # from HDF5_Global.Filename to Filename.
        if 'HDF5_GLOBAL.Filename' in data_set.attrs:
            f_name = data_set.attrs['HDF5_GLOBAL.Filename']
        elif 'Filename' in data_set.attrs:
            f_name = data_set.attrs['Filename']
        else:
            raise AttributeError('The attribute name has changed again!')
        # find a match between "." and ".nc4" that does not have "." .
        exp = r'(?<=\.)[^\.]*(?=\.nc4)'
        res = re.search(exp, f_name).group(0)
        # Extract the date.
        y, m, d = res[0:4], res[4:6], res[6:8]
        date_str = ('%s-%s-%s' % (y, m, d))
        data_set = data_set.assign(date=date_str)
        data_set = data_set.expand_dims("date")
        data_set.coords["lat"] = [47.5, 48.0, 48.5, 49.0, 49.5, 50.0, 50.5, 51.0, 51.5, 52.0, 52.5, 53.0, 53.5, 54.0, 54.5, 55.0]
        data_set.coords["lon"] = [5.625, 6.25, 6.875, 7.5, 8.125, 8.75, 9.375, 10.0, 10.625, 11.25, 11.875, 12.5, 13.125, 13.75, 14.375, 15.0]
        data_set.coords["time"] = list(range(24))
        return data_set
    except KeyError:
        # The last dataset is the one all the other sets will be merged into.
        # Therefore, no date can be extracted.
        data_set.coords["lat"] = [47.5, 48.0, 48.5, 49.0, 49.5, 50.0, 50.5, 51.0, 51.5, 52.0, 52.5, 53.0, 53.5, 54.0, 54.5, 55.0]
        data_set.coords["lon"] = [5.625, 6.25, 6.875, 7.5, 8.125, 8.75, 9.375, 10.0, 10.625, 11.25, 11.875, 12.5, 13.125, 13.75, 14.375, 15.0]
        data_set.coords["time"] = list(range(24))
        return data_set
and by commenting out the last two lines of the following cell:
df.drop('DISPH', axis=1, inplace=True)
df.drop(['time', 'date'], axis=1, inplace=True)
df.drop(['U2M', 'U10M', 'U50M', 'V2M', 'V10M', 'V50M'], axis=1, inplace=True)
# df['lat'] = df['lat'].apply(lambda x: lat_array[int(x)])
# df['lon'] = df['lon'].apply(lambda x: lon_array[int(x)])
I could not check whether the same error occurred on another machine.
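The hard-coded coordinate lists in extract_date are evenly spaced, so they could probably be generated instead of typed out twice, and the date parsing can be checked without opening any files. Below is a minimal sketch of both steps using only the standard library. The helper names regular_grid and date_from_filename are made up for illustration and are not part of the notebook, and the filename is a hypothetical example of the naming pattern the regular expression expects.

```python
import re

# Grid bounds taken from the hard-coded lists in extract_date (Germany only).
LAT_START, LAT_STEP = 47.5, 0.5
LON_START, LON_STEP = 5.625, 0.625
N_POINTS = 16


def regular_grid(start, step, count):
    """Return `count` evenly spaced values beginning at `start`."""
    return [start + i * step for i in range(count)]


def date_from_filename(f_name):
    """Return 'YYYY-MM-DD' built from the token that precedes '.nc4'."""
    # Same pattern as in extract_date: the dot-free token before ".nc4".
    match = re.search(r'(?<=\.)[^\.]*(?=\.nc4)', f_name)
    if match is None:
        raise ValueError('No date token found in %r' % f_name)
    res = match.group(0)
    return '%s-%s-%s' % (res[0:4], res[4:6], res[6:8])


lat = regular_grid(LAT_START, LAT_STEP, N_POINTS)
lon = regular_grid(LON_START, LON_STEP, N_POINTS)

# Hypothetical filename, used only to exercise the pattern.
example_date = date_from_filename('MERRA2_400.tavg1_2d_slv_Nx.20160101.nc4')
```

Inside extract_date, lat and lon could then be assigned to data_set.coords in both branches. For a region other than Germany the start values and the number of points would have to be adjusted.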
Apart from that, thank you for writing this awesome notebook!