test: make pre-release tests work #1071

Open

flying-sheep wants to merge 16 commits into main from fix-tests

Conversation

@flying-sheep
Member

flying-sheep commented Feb 12, 2026

Fixes #1039

I changed the test setup to use uv add, which actually makes sense and results in a single solve per env, as I’ve been preaching since the dawn of time.

Don’t use [uv] pip install in CI, people. Ever. Why does nobody trust me on that?

@codecov

codecov bot commented Feb 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.93%. Comparing base (9db3004) to head (92c20cb).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1071   +/-   ##
=======================================
  Coverage   91.93%   91.93%           
=======================================
  Files          51       51           
  Lines        7664     7664           
=======================================
  Hits         7046     7046           
  Misses        618      618           

flying-sheep marked this pull request as ready for review on February 12, 2026, 09:11
@LucaMarconato
Member

LucaMarconato commented Feb 16, 2026

@melonora I summon you as a "Windows x Dask" expert. Do you have any insight into why this specific Windows job fails (the dask computational graph is double the expected size)? https://github.com/scverse/spatialdata/actions/runs/21940345294/job/63754609282?pr=1071

The failing test means there may be some performance penalty in that environment. Anyway, since the result is correct and the test doesn't fail for other OS versions or for more recent Dask, this won't block the merge.

@melonora
Collaborator

I will check when I'm back home, but yes, I have my suspicions.

@LucaMarconato
Member

Thanks, I'll fix the docs in the meantime.

@LucaMarconato
Member

Docs are green!

@melonora
Collaborator

Reporting back: I think the failing test is faulty, depending on its actual purpose. We are really testing dask internals here rather than ensuring that our codebase doesn't have more entry points into dask than expected.

This means that Array.__getitem__ calls can also simply happen multiple times inside the dask codebase, which seems to be the case here. By adding a side effect to the mock object, I saw that all calls with only ({"width": 32, "chunk_size": 16, "scale_factors": (2,)},) happened in _write_raster_datatree, specifically in da.compute(*dask_delayed, optimize_graph=False) on line 341, and that line was only passed once. Checking downstream in ome-zarr-py did not explain the 8 calls made, and switching to dask version 2025.11.0 (prior to sharding) makes the test pass. This means the test only fails for dask version 2025.2.0 and maybe earlier.

My suggestion would be to either ensure that there are no more calls than expected from the spatialdata side (e.g. when writing, there should be only one entry point), or to test that the number of chunk accesses falls within a range of 2 * the number of chunks and, if that test fails, report it upstream to dask. @LucaMarconato WDYT?
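For illustration, a minimal sketch (not the actual spatialdata test) of the range-based check suggested above, using unittest.mock and dask.array; the array shape, the stand-in slicing loop, and the 2x threshold are assumptions for the example only:

```python
# Sketch only: record Array.__getitem__ calls via a wrapper (playing the role
# of the mock's side effect) and assert the count stays within a tolerant
# bound instead of pinning an exact number that depends on dask internals.
from unittest.mock import patch

import dask.array as da


def test_chunk_access_count_within_bounds():
    # Illustrative array: 32x32 with 16x16 chunks, i.e. 4 chunks.
    arr = da.zeros((32, 32), chunks=(16, 16))

    original_getitem = da.Array.__getitem__
    recorded_keys = []

    def recording_getitem(self, key):
        recorded_keys.append(key)            # side effect: log every slice key
        return original_getitem(self, key)   # then defer to the real method

    with patch.object(da.Array, "__getitem__", recording_getitem):
        # Stand-in for the write path under test: slice each chunk once,
        # the way a chunked writer might.
        for i in range(0, 32, 16):
            for j in range(0, 32, 16):
                arr[i : i + 16, j : j + 16].compute()

    # Tolerant assertion: allow up to 2 * n_chunks accesses; anything beyond
    # that would be a candidate for an upstream dask report.
    assert len(recorded_keys) <= 2 * arr.npartitions
```

In the real test, the wrapped call would be the actual write path rather than the toy slicing loop shown here.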

Development

Successfully merging this pull request may close these issues:

Better pre-release CI workflow
