-
Notifications
You must be signed in to change notification settings - Fork 2
set time bounds #193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: prep-release
Are you sure you want to change the base?
set time bounds #193
Conversation
code still needs refinement. This is first approximation.
|
It appears the time method (like mean, climatology, instantaneous) does influence in setting time bounds. For instance, when time method is climatology, time bounds are not to be set as it does not make sense in this case. Also the time averaging step sets new time stamps which is either first or middle or last in the time interval. setting time bounds before this step or after this step may produce different results. |
|
@siligam this is still marked as a draft, is there anything to be done here, or can I review this and merge? Is it still required for prep release? |
it is ready for review, although the linting thing needs to be fixed. |
- Remove unused loop variable 'i' in time_bounds.py - Apply black formatting to time_bounds.py and test_time_bounds.py - All linting checks (flake8, isort, black) now pass
pgierz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some comments, I think we need to iterate on this PR one more time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file belongs in the documentation folder, together with the time_bounds.rst and should be merged into one single document
| The output dataset with the time bounds information. | ||
| """ | ||
| # Get dataset name for logging | ||
| dataset_name = ds.attrs.get("name", "unnamed_dataset") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have this in a couple of places. I think we should be consistent and think of a rule for it. I am ok with "unnamed_dataset", but we should keep it in mind for the future.
|
|
||
| # Log header with markdown style | ||
| logger.info(f"[Time bounds] {dataset_name}") | ||
| logger.info(f" → is dataset: {'✅' if isinstance(ds, xr.Dataset) else '❌'}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have this in a few places now, and I am not sure I like the emojis so much. What about @mandresm, any opinions?
| # Only create bounds for mean and instantaneous methods | ||
| if time_method == "climatology": | ||
| logger.info(" → skipping bounds creation for climatology data") | ||
| return ds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we double check this? Do climatologically averaged datasets really not include time bounds? That seems inconsistent.
| if time_method not in ["mean", "instantaneous", "climatology"]: | ||
| logger.warning( | ||
| f" ⚠️ Unknown time method '{time_method}', defaulting to 'mean'" | ||
| ) | ||
| time_method = "mean" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a way to bail out here and raise a failure would be nice.
| # For instantaneous data, bounds are the same as time values | ||
| bounds_data = np.column_stack([time_values, time_values]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a reminder of what np.column_stack does:
| # For instantaneous data, bounds are the same as time values | |
| bounds_data = np.column_stack([time_values, time_values]) | |
| # For instantaneous data, bounds are the same as time values | |
| # | |
| # .. note:: np.column_stack reminder | |
| # | |
| # >>> a = np.array([1, 2, 3]) | |
| # >>> b = np.array([4, 5, 6]) | |
| # >>> np.column_stack((a, b)) | |
| # [[1, 4], [2, 5], [6, 3]] | |
| # | |
| bounds_data = np.column_stack([time_values, time_values]) |
src/pymor/std_lib/time_bounds.py
Outdated
| raise ValueError(error_msg) | ||
|
|
||
| # Calculate data frequency in days | ||
| time_diff_seconds = np.median( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why median?
| coords={time_label: time_values, bounds_dim_label: [0, 1]}, | ||
| attrs={ | ||
| "long_name": f"time bounds for {time_label}", | ||
| "comment": f"Generated by pymorize: {time_method} time bounds", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong name!
Time Bounds Feature Implementation
Overview
This PR introduces a robust time bounds handling system for xarray Datasets, which is particularly useful for climate and weather data where variables are often associated with time intervals rather than single time points.
Key Features
Changes
Core Functionality:
Testing:
Documentation:
Testing
All tests are passing:
Logging
Next Steps
Related Issues
Closes #176