60 changes: 49 additions & 11 deletions benchmarks/README.md
@@ -14,25 +14,63 @@ pip install -e '.[docs,test,benchmark]'

## Usage

Running all the benchmarks is usually not needed. You run the benchmark using `asv run`. See the [asv documentation](https://asv.readthedocs.io/en/stable/commands.html#asv-run) for interesting arguments, like selecting the benchmarks you're interested in by providing a regex pattern `-b` or `--bench` that links to a function or class method e.g. the option `-b timeraw_import_inspect` selects the function `timeraw_import_inspect` in `benchmarks/spatialdata_benchmark.py`. You can run the benchmark in your current environment with `--python=same`. Some example benchmarks:
Running all the benchmarks is usually not needed. Run the benchmarks with `asv run`. See the [asv documentation](https://asv.readthedocs.io/en/stable/commands.html#asv-run) for useful arguments, such as selecting the benchmarks you're interested in by passing a regex pattern to `-b`/`--bench` that matches a function or class method name. You can run the benchmarks in your current environment with `--python=same`. Some example benchmarks:

Importing the SpatialData library can take around 4 seconds:
### Import time benchmarks

Import benchmarks live in `benchmarks/benchmark_imports.py`. Each `timeraw_*` function returns a Python code snippet that asv runs in a fresh interpreter, so every measurement is a cold import with an empty module cache.
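
For orientation, a minimal sketch of the pattern used in that file (the real functions also set asv's `repeat`/`number` attributes through a small helper):

```python
def timeraw_import_spatialdata() -> str:
    # asv runs the returned snippet in a new subprocess and times it,
    # so the import is measured against a completely cold module cache.
    return "import spatialdata"
```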

Run all import benchmarks in your current environment:

```
PYTHONWARNINGS="ignore" asv run --python=same --show-stderr -b timeraw_import_inspect
Couldn't load asv.plugins._mamba_helpers because
No module named 'conda'
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[ 0.00%] ·· Benchmarking existing-py_opt_homebrew_Caskroom_mambaforge_base_envs_spatialdata2_bin_python3.12
[50.00%] ··· Running (spatialdata_benchmark.timeraw_import_inspect--).
[100.00%] ··· spatialdata_benchmark.timeraw_import_inspect 3.65±0.2s
asv run --python=same --show-stderr -b timeraw
```

Or a single one:

```
asv run --python=same --show-stderr -b timeraw_import_spatialdata
```
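
Under the hood, asv executes each returned snippet in a separate interpreter so the module cache starts empty. A rough, hand-rolled approximation of that idea (not asv's actual implementation; it assumes `spatialdata` is installed in the current environment):

```python
import subprocess
import sys

# Time the snippet inside a fresh interpreter so the module cache starts empty.
snippet = "import spatialdata"
child_code = (
    "import time\n"
    "t0 = time.perf_counter()\n"
    f"{snippet}\n"
    "print(f'cold import: {time.perf_counter() - t0:.2f}s')\n"
)
subprocess.run([sys.executable, "-c", child_code], check=True)
```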

### Comparing the current branch against `main`

The simplest way is `asv continuous`, which builds both commits, runs the benchmarks, and prints the comparison in one shot:

```bash
asv continuous --show-stderr -v -b timeraw main faster-import
```

Replace `faster-import` with any branch name or commit hash. The `-v` flag prints per-sample timings; drop it for a shorter summary.

Alternatively, collect results separately and compare afterwards:

```bash
# 1. Collect results for the tip of main and the tip of your branch
asv run --show-stderr -b timeraw main
asv run --show-stderr -b timeraw HEAD

# 2. Print a side-by-side comparison
asv compare main HEAD
```

Both approaches build isolated environments from scratch. If you prefer to skip the rebuild and reuse your current environment (faster, less accurate):

```bash
asv run --python=same --show-stderr -b timeraw HEAD

git stash && git checkout main
asv run --python=same --show-stderr -b timeraw HEAD
git checkout - && git stash pop

asv compare main HEAD
```

### Querying benchmarks

Without a spatial index, the cost of a bounding-box query grows mainly with the number of points (transcripts), much more than with the number of table rows (cells).
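
For reference, the operation being benchmarked corresponds roughly to a call like the one below (a sketch: `sdata` stands for an in-memory `SpatialData` object and `"global"` is assumed to be the target coordinate system):

```python
from spatialdata import bounding_box_query

# Select everything inside a 100 x 100 window of the "global" coordinate system.
subset = bounding_box_query(
    sdata,
    axes=("x", "y"),
    min_coordinate=[0, 0],
    max_coordinate=[100, 100],
    target_coordinate_system="global",
    filter_table=True,  # also subset the table to the selected cells
)
```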

```
$ PYTHONWARNINGS="ignore" asv run --python=same --show-stderr -b time_query_bounding_box
$ asv run --python=same --show-stderr -b time_query_bounding_box

[100.00%] ··· ======== ============ ============= ============= ==============
-- filter_table / n_transcripts_per_cell
56 changes: 56 additions & 0 deletions benchmarks/benchmark_imports.py
@@ -0,0 +1,56 @@
"""Benchmarks for import times of the spatialdata package and its submodules.

Each ``timeraw_*`` function returns a snippet of Python code that asv runs in
a fresh interpreter, so the measured time reflects a cold import with an empty
module cache.
"""

from typing import Any


def _timeraw(func: Any) -> Any:
"""Set asv benchmark attributes for a cold-import timeraw function."""
func.repeat = 5 # number of independent subprocess measurements
func.number = 1 # must be 1: second import in same process hits module cache
return func


@_timeraw
def timeraw_import_spatialdata() -> str:
"""Time a bare ``import spatialdata``."""
return """
import spatialdata
"""


@_timeraw
def timeraw_import_SpatialData() -> str:
"""Time importing the top-level ``SpatialData`` class."""
return """
from spatialdata import SpatialData
"""


@_timeraw
def timeraw_import_read_zarr() -> str:
"""Time importing ``read_zarr`` from the top-level namespace."""
return """
from spatialdata import read_zarr
"""


@_timeraw
def timeraw_import_models_elements() -> str:
"""Time importing the main element model classes."""
return """
from spatialdata.models import Image2DModel, Labels2DModel, PointsModel, ShapesModel, TableModel
"""


@_timeraw
def timeraw_import_transformations() -> str:
"""Time importing the ``spatialdata.transformations`` submodule."""
return """
from spatialdata.transformations import Affine, Scale, Translation, Sequence
"""
7 changes: 0 additions & 7 deletions benchmarks/spatialdata_benchmark.py
@@ -20,13 +20,6 @@ def peakmem_list2(self):
return sdata


def timeraw_import_inspect():
"""Time the import of the spatialdata module."""
return """
import spatialdata
"""


class TimeMapRaster:
"""Time the."""

2 changes: 2 additions & 0 deletions docs/conf.py
@@ -4,6 +4,8 @@
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

from __future__ import annotations

# -- Path setup --------------------------------------------------------------
import sys
from datetime import datetime
2 changes: 2 additions & 0 deletions docs/extensions/typed_returns.py
@@ -1,5 +1,7 @@
# code from https://github.com/theislab/scanpy/blob/master/docs/extensions/typed_returns.py
# with some minor adjustment
from __future__ import annotations

import re

from sphinx.application import Sphinx
5 changes: 4 additions & 1 deletion pyproject.toml
@@ -55,7 +55,6 @@ dependencies = [
[project.optional-dependencies]
dev = [
"bump2version",
"sentry-prevent-cli",
]
test = [
"pytest",
@@ -79,6 +78,7 @@ docs = [
benchmark = [
"asv",
"memray",
"profimp",
]
torch = [
"torch"
@@ -183,6 +183,9 @@ select = [
]
unfixable = ["B", "C4", "UP", "BLE", "T20", "RET"]

[tool.ruff.lint.isort]
required-imports = ["from __future__ import annotations"]

Comment on lines +186 to +187 (Member):
I added this. It makes the UP lints always flag cases where an import is only used in an annotation, so such imports get moved automatically into `if TYPE_CHECKING:` blocks and skipped at runtime.
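
A hypothetical illustration of that pattern (not code from this PR): with `from __future__ import annotations`, an import that is only needed for a type annotation can live under `if TYPE_CHECKING:` and is never executed at runtime.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed by the type checker; skipped at runtime.
    from anndata import AnnData


def obs_columns(adata: AnnData) -> list[str]:
    """Return the column names of the table's ``obs`` dataframe."""
    return list(adata.obs.columns)
```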


[tool.ruff.lint.pydocstyle]
convention = "numpy"
