Skip to content

refactor(datafile): ignore "text" parameter, add attributes from file#2231

Open
mwtoews wants to merge 1 commit intomodflowpy:developfrom
mwtoews:headfile-ignore-text
Open

refactor(datafile): ignore "text" parameter, add attributes from file#2231
mwtoews wants to merge 1 commit intomodflowpy:developfrom
mwtoews:headfile-ignore-text

Conversation

@mwtoews
Copy link
Contributor

@mwtoews mwtoews commented Jun 14, 2024

This PR ignores the text= parameter from HeadFile, HeadUFile and UcnFile, since this should be read from the file and is constant for all records. This PR adds these attributes:

  • obj.text_bytes, the 16-byte char bytes obtained from the file
  • obj.text, a str version of this field, .e.g., used by flopy.export for NetCDF variable names

There are instances where other binary files can be attempted to be read (e.g. "mt3d_test/mf96mt3d/P01/case1b/MT3D001.UCN" with HeadFile), but these are now caught with exceptions raised. Warnings are raised if the text records are inconsistent, which shouldn't happen (unless I've misunderstood something!)

@mwtoews mwtoews force-pushed the headfile-ignore-text branch from 6d2abab to 2f1d78e Compare June 14, 2024 02:17
@mwtoews
Copy link
Contributor Author

mwtoews commented Jun 14, 2024

This does have a breaking change, where previously the obj.text attribute was bytes, and now changed to str. For instance:

Before:

>>> pth = example_data_path / "unstructured/headu.githds"
>>> obj = flopy.utils.binaryfile.HeadUFile(pth)
>>> obj.text
b'headu'

After:

>>> obj = flopy.utils.binaryfile.HeadUFile(pth)
>>> obj.text
'headu'
>>> obj.text_bytes
b'           HEADU'

if this is a deal-breaker, then obj.text can be kept as the modified bytes (possibly raising a deprecation message if accessed), and the str version of the attribute instead stored at obj.text_str.

Also, I should note that this PR does not change FormattedHeadFile, since further refactoring may be needed.

@mwtoews mwtoews force-pushed the headfile-ignore-text branch from 2f1d78e to fb14357 Compare June 14, 2024 03:16
@codecov
Copy link

codecov bot commented Jun 14, 2024

Codecov Report

Attention: Patch coverage is 63.15789% with 7 lines in your changes missing coverage. Please review.

Project coverage is 70.8%. Comparing base (e2a85a3) to head (fb14357).

Additional details and impacted files
@@            Coverage Diff            @@
##           develop   #2231     +/-   ##
=========================================
- Coverage     70.8%   70.8%   -0.1%     
=========================================
  Files          294     294             
  Lines        59026   59029      +3     
=========================================
- Hits         41832   41804     -28     
- Misses       17194   17225     +31     
Files Coverage Δ
flopy/export/vtk.py 78.0% <100.0%> (-0.1%) ⬇️
flopy/utils/binaryfile.py 81.7% <100.0%> (+0.1%) ⬆️
flopy/utils/datafile.py 73.6% <100.0%> (+0.3%) ⬆️
flopy/mf6/utils/binaryfile_utils.py 10.6% <0.0%> (ø)
flopy/export/utils.py 59.0% <16.6%> (-2.4%) ⬇️

... and 6 files with indirect coverage changes

Copy link
Member

@wpbonelli wpbonelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is a deal-breaker, then obj.text can be kept as the modified bytes (possibly raising a deprecation message if accessed), and the str version of the attribute instead stored at obj.text_str.

I think changing .text to str could be considered a fix rather than a breaking change

Copy link

@christianlangevin christianlangevin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mwtoews, there is one potential issue that I see here, and I'm not sure of the best way to handle it. With older MODFLOW versions (2000/2005/NWT), it's possible to write different types of records to the same binary file (for example both head and drawdown can be written as alternating records in a binary file). I think we included that "text" argument as a way to deal with this, though maybe it wasn't handled properly. The code indicates that we do not store the header entries for text strings that do not match. One option might be to preserve this behavior in HeadFile/UcnFile/HeaduFile and begin user transition to a more generic version of BinaryLayerFile (or better named alternative), which could just work on any variation. We now use HeadFile to read the dependent variable information for the advanced packages in MODFLOW 6, including SFR, LAK, MAW, UZF, and their transport counterparts. It's not ideal to use a class named HeadFile to read these, and it's also not ideal that a user needs to specify the text string, because for MODFLOW 6 only one type of data will be found in these files. For mf6 flavors of "HeadFile" eliminating the effect of the text arg is a welcome addition.

@mwtoews
Copy link
Contributor Author

mwtoews commented Jun 14, 2024

Thanks @langevin-usgs for the info, I'll create and add a mixed text example to test, e.g. head/drawdown. The text parameter can be changed to optional, which would default to read the first text header, and warn if more than one found.

@mwtoews
Copy link
Contributor Author

mwtoews commented Jun 14, 2024

Confirmed I can create the mixed-text file with older MF versions, and testing current and ancient versions of flopy I can see this feature has never worked. I'll superseded this PR with a fix after a few other related refactors that use the .headers data frame property.

@wpbonelli
Copy link
Member

@mwtoews did you still plan to work on this? I might have a go if not

I figure the first step is to fix headers to store all headers, not just those for the selected record type. Then move internal logic from recordarray to headers.

Then as you say using text from the first header, warning if multiple. Keeping these classes focused on only a single record type seems best, otherwise the kstpkper and times methods need to change to accommodate, etc.

It's clunky in the mixed-text case to open an instance, see the warning, and have to open another one to get at the data, but no reason to support mixed-text when MF6 doesn't use it IMO. And, not to say there will be no place for these custom file handle classes in 4.x, but I think we will have alternative patterns available, like lazy loading binary files into xarray, so may not make sense to put a lot of effort here at the moment.

In 4.x I think DFNs could define dependent variables, and we could generate code to subclass BinaryLayerFile with the appropriate text, to avoid the awkwardness @christianlangevin mentioned above.

@wpbonelli
Copy link
Member

BinaryLayerFile is marked internal right now, but I wonder if we could just make it part of the public API, and suggest using that instead of HeadFile for advanced package dependent variable files

@mwtoews
Copy link
Contributor Author

mwtoews commented Feb 18, 2026

@wpbonelli it seems that I've abandoned these efforts over a year ago, so please have a go and I can take a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments