Enable parallel netCDF dataset reads #923

Open
davidhassell wants to merge 21 commits into NCAS-CMS:main from davidhassell:parallel-read

@davidhassell
Collaborator

Fixes #912

Requires NCAS-CMS/cfdm#384 to be resolved first.

@davidhassell davidhassell added this to the NEXTVERSION milestone Feb 18, 2026
@davidhassell davidhassell added labels Feb 18, 2026: enhancement (New feature or request), performance (Relating to speed and memory performance), dataset read (Relating to reading datasets), active storage (Relating to active storage operations)
Member

@sadielbartholomew sadielbartholomew left a comment

I'm using the appropriate cfdm branch (NCAS-CMS/cfdm#384) and have suitable dependencies etc. but there are immediate issues - should be easily solvable, though - see in-line comment.

Member

@sadielbartholomew sadielbartholomew left a comment

Tested as working, and suitably tested downstream here in cf-python. A few minor comments in-line. Ideally, the linting/isort changes to the recipes should also be reverted. I guess those only happened because this PR was started in early 2025; more recently I added the full recipes-docs/source/recipes directory to the isort skip glob (https://github.com/NCAS-CMS/cf-python/blob/main/pyproject.toml#L25), so they won't auto-change any more.

Otherwise ready to merge, except that we will update the class name for Netcdf_fileArray as discussed. I can do a final sanity check on both PRs once that change is made in both.

* New optional backend for netCDF-3 in `cf.read` that allows parallel
reading: ``netcdf_file``
(https://github.com/NCAS-CMS/cf-python/issues/912)
* Changed dependency: ``cfdm>=1.13.1.0, <1.13.2.0``
Member

This hasn't yet been updated in requirements.txt in this PR, but that is fine as long as it is done before release time.
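
For anyone checking an environment against the pin quoted above, here is a minimal sketch of what `cfdm>=1.13.1.0, <1.13.2.0` means as a half-open version range. The helper names `parse_version` and `satisfies_pin` are hypothetical and written only for this illustration; they handle plain dotted numeric versions only, and a real check would normally use a proper version-specifier library.

```python
def parse_version(text, width=4):
    """Turn a dotted numeric version such as '1.13.1.0' into a tuple of ints.

    Shorter versions are padded with zeros, so '1.13.1' compares equal
    to '1.13.1.0'. Pre-release suffixes (e.g. 'rc1') are not handled.
    """
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


# Bounds taken from the changelog line: cfdm>=1.13.1.0, <1.13.2.0
LOWER = parse_version("1.13.1.0")  # inclusive lower bound
UPPER = parse_version("1.13.2.0")  # exclusive upper bound


def satisfies_pin(version):
    """Return True if `version` lies in the range [LOWER, UPPER)."""
    return LOWER <= parse_version(version) < UPPER


if __name__ == "__main__":
    for candidate in ("1.13.0.9", "1.13.1.0", "1.13.1.5", "1.13.2.0"):
        print(candidate, satisfies_pin(candidate))
```

Whatever ends up in requirements.txt should describe the same range, so that the changelog and the installed dependency agree at release time.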

**2026-??-??**

* New default backend for netCDF-4 in `cf.read` that allows parallel
reading: (https://github.com/NCAS-CMS/cf-python/issues/912)
Member

Suggested change
- reading: (https://github.com/NCAS-CMS/cf-python/issues/912)
+ reading: ``h5netcdf-pyfive``
+ (https://github.com/NCAS-CMS/cf-python/issues/912)
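
For readers of the changelog entries, opting in to the optional netCDF-3 backend might look like the sketch below. It assumes the backend is selected through the existing `netcdf_backend` keyword of `cf.read`, using the name `netcdf_file` from the changelog; that pairing is an assumption and is not confirmed in this conversation. The file name is hypothetical. The netCDF-4 backend is described as the new default, so presumably no keyword is needed for it.

```python
import os

# Hypothetical netCDF-3 file name, for illustration only.
path = "example_netcdf3.nc"

# Assumption: the backend name from the changelog ("netcdf_file") is
# passed through the `netcdf_backend` keyword of `cf.read`.
read_kwargs = {"netcdf_backend": "netcdf_file"}

try:
    import cf  # cf-python; needs the matching cfdm branch installed
except ImportError:
    cf = None

# Only attempt the read when cf-python is available and the file exists.
if cf is not None and os.path.exists(path):
    fields = cf.read(path, **read_kwargs)
    print(fields)
```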

from ...mixin_container import Container


class Netcdf_fileArray(
Member

As discussed in the cfdm sibling PR, we shall rename this class and do so after my review so that there's no need for me to wait on the update here too. (A reminder.)

Development

Successfully merging this pull request may close these issues.

Enable parallel netCDF dataset reads

2 participants