hoxbro
Member

@hoxbro hoxbro commented Aug 25, 2025

List of changes; a missing checkmark means a missing unit test:

  • Handle gridded datasets.
  • Improve the error message shown when the linkage cannot be solved.
  • Fix sorting of non-selected columns.
  • Fix using non-plotting vdims as main_dim.
  • Improve the error message when using Layout with dendrograms.


codecov bot commented Aug 25, 2025

Codecov Report

❌ Patch coverage is 92.59259% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.02%. Comparing base (365b4a4) to head (434d07e).

Files with missing lines                      Patch %   Lines
holoviews/operation/element.py                85.71%    4 Missing ⚠️
holoviews/tests/operation/test_operation.py   95.23%    2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6669   +/-   ##
=======================================
  Coverage   89.02%   89.02%           
=======================================
  Files         329      329           
  Lines       70422    70489   +67     
=======================================
+ Hits        62693    62754   +61     
- Misses       7729     7735    +6     

@flying-sheep
Contributor

flying-sheep commented Aug 26, 2025

> Not entirely sure why A.obs["n_counts"] does not have the same size even though A.obs.index does.

because of this line; only kdims currently get expanded (and expand_grid_coords only supports kdims):

https://github.com/holoviz-topics/hv-anndata/blob/b399744dfc8f01b7be159a738cd0007481d1e338/src/hv_anndata/interface.py#L477

@flying-sheep
Contributor

best wait until holoviz-topics/hv-anndata#82 is fixed

@flying-sheep
Contributor

flying-sheep commented Aug 28, 2025

OK, so values should now work as expected in holoviz-topics/hv-anndata#66 and broadcast things correctly.

But when using that PR and this one together, things are still broken (in fact, this PR does nothing to change what happens here):

import holoviews as hv
import numpy as np
import scanpy as sc
import hv_anndata
from hv_anndata import ACCESSOR as A
from hv_anndata import register

register()
hv.extension("bokeh")


adata = sc.datasets.pbmc68k_reduced()  # assumed: the dataset used in the later snippets
dendrogram = True  # set to False for the plain heatmap

markers = ["C1QA", "PSAP", "CD79A", "CD79B", "CST3", "LYZ"]
hm = hv.HeatMap(
    adata[np.argsort(adata.obs["bulk_labels"], stable=True), markers],
    [A.obs.index, A.var.index],
    [A[:, :], A.obs["n_counts"]],
).opts(xticks=0, colorbar=True, width=500, height=200)
if dendrogram:
    hm = hv.operation.dendrogram(
        hm,
        adjoint_dims=[A.obs.index],
        main_dim=A.obs["n_counts"],
        linkage_metric="euclidean",
    )
hm

running with dendrogram = False

[image]

running with dendrogram = True

[image]
[old image]

@flying-sheep
Contributor

flying-sheep commented Aug 28, 2025

I updated the image. Looks better, but still changed. Looks reordered, but dendrogram’s optimal_ordering is False by default.

(and the dendrogram is still a little messed up. Very messed up when passing responsive=True)

@hoxbro
Member Author

hoxbro commented Aug 28, 2025

> I updated the image. Looks better, but still changed. Looks reordered, but dendrogram’s optimal_ordering is False by default.

I think it should be reordered, it is finding an order, just not the optimal order. However, I'm not an expert in dendrograms by any means.
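
To make the point about ordering concrete, here is a minimal sketch using SciPy's hierarchical clustering directly as a stand-in (an assumption: it does not go through the HoloViews operation, and the data is made up). It shows that the leaf order is a reordering of the input rows whether or not optimal_ordering is set; the flag only changes which valid leaf order is picked.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage

rng = np.random.default_rng(0)
points = rng.normal(size=(8, 3))  # 8 made-up observations with 3 features

# Same clustering, without and with the optimal leaf-ordering pass.
z_default = linkage(points, method="ward", metric="euclidean", optimal_ordering=False)
z_optimal = linkage(points, method="ward", metric="euclidean", optimal_ordering=True)

# Order in which the original rows appear along the dendrogram axis.
leaves_default = leaves_list(z_default)
leaves_optimal = leaves_list(z_optimal)

print("default order:", leaves_default)
print("optimal order:", leaves_optimal)
```

Either result is a permutation of range(8), so a heatmap aligned with the dendrogram would generally not keep its input row order in either case.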

> (and the dendrogram is still a little messed up. Very messed up when passing responsive=True)

Can you share an image? The responsive=True issue is tracked here: #6527

@flying-sheep
Contributor

flying-sheep commented Aug 29, 2025

OK, two issues:

  1. If you put the dendrogram on the other axis, the ordering is still not preserved:

    markers = ["C1QA", "PSAP", "CD79A", "CD79B", "CST3", "LYZ"]
    hm = hv.HeatMap(
        adata[np.argsort(adata.obs["bulk_labels"], stable=True), markers],
        [A.obs.index, A.var.index],
        [A[:, :], A.var["n_counts"]],
    ).opts(xticks=0, width=500, height=400)
    
    hv.operation.dendrogram(
        hm,
        adjoint_dims=[A.var.index],
        main_dim=A.var["n_counts"],
        linkage_metric="euclidean",
    )
    [image]
  2. adjoint_dims=[A.obs["bulk_labels"]] doesn’t work, as what comes out of its groupby can’t be np.vstacked. I’m trying to reproduce scanpy’s heatmap:

    [image: scanpy heatmap]
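
One way a groupby result can fail to stack, sketched under the assumption that the groups have different sizes (this is a guess at the cause, not taken from the HoloViews source; the column names and values are made up):

```python
import numpy as np
import pandas as pd

# Made-up data: label "a" has three rows, label "b" only two.
df = pd.DataFrame(
    {
        "label": ["a", "a", "a", "b", "b"],
        "value": [1.0, 2.0, 3.0, 4.0, 5.0],
    }
)

# One 1-D array per group; the arrays have different lengths.
groups = [g["value"].to_numpy() for _, g in df.groupby("label")]

try:
    np.vstack(groups)
    stacked = True
except ValueError as err:
    # np.vstack needs every row to have the same length.
    stacked = False
    print("vstack failed:", err)
```

Grouping by a unique index avoids this because every group then has the same number of entries.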

> Can you share an image? The responsive=True issue is tracked here: #6527

with the code from my first comment, but responsive=True or frame_width=...:

[image]

@hoxbro
Member Author

hoxbro commented Aug 29, 2025

> 1. If you put the dendrogram on the other axis, the ordering is still not preserved:

I think this occurs because of the difference between np.unique (sorting) and pd.unique (order of occurrence)

[image]
import holoviews as hv
import numpy as np
import pandas as pd
import scanpy as sc

import hv_anndata
from hv_anndata import ACCESSOR as A
from hv_anndata import register

register()

hv.extension("bokeh")

adata = sc.datasets.pbmc68k_reduced()

kdims = [A.obs.index, A.var.index]
vdims = [A[:, :], A.obs["n_counts"]]


ds = hv.Dataset(adata[:10, :10], kdims=kdims, vdims=vdims)
hm = hv.HeatMap(ds)
de = hv.operation.dendrogram(
    hm,
    adjoint_dims=[A.obs.index],
    main_dim=A.obs["n_counts"],
    linkage_metric="euclidean",
)
(hm + de).opts(shared_axes=False)

df = ds.dframe()
pd.unique(df["A.var.index"])
np.unique(df["A.var.index"])
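
The difference between the two functions can be seen without anndata. A minimal sketch with made-up gene names in a deliberately unsorted order:

```python
import numpy as np
import pandas as pd

genes = np.array(["LYZ", "C1QA", "PSAP", "C1QA", "LYZ"])  # made-up order

# np.unique returns the distinct values sorted.
sorted_unique = np.unique(genes)

# pd.unique returns the distinct values in order of first appearance.
occurrence_unique = pd.unique(genes)

print(list(sorted_unique))      # ['C1QA', 'LYZ', 'PSAP']
print(list(occurrence_unique))  # ['LYZ', 'C1QA', 'PSAP']
```

Any code path that mixes the two will see the same categories in different positions, which would look like a reordered axis.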

> with the code from my first comment, but responsive=True or frame_width=...:

Did you have a problem with sizing outside responsive=True / frame_width?

@flying-sheep
Contributor

> Did you have a problem with sizing outside responsive=True / frame_width?

nope!

@flying-sheep
Contributor

Now there’s no exception, but also no dendrogram. Are you testing this with Basic.ipynb in hv-anndata?

[image]

@hoxbro
Member Author

hoxbro commented Sep 8, 2025

> Now there’s no exception, but also no dendrogram. Are you testing this with Basic.ipynb in hv-anndata?

I'm currently just looking at "pure" pandas and trying to tackle point 1 you raised. Point 2, I'm not sure if it is currently feasible, and will therefore likely not be part of this fix PR. I think this is what you are seeing by using bulk_labels.

@flying-sheep
Contributor

flying-sheep commented Sep 8, 2025

OK, cool! I filed #6683 for that

Otherwise, things seem to work when using adjoint_dims=[A.obs.index] with this PR

@hoxbro
Member Author

hoxbro commented Sep 8, 2025

I was about to write that I would file it, but then I saw something weird.

There appears to be a transpose issue when using anndata with HeatMap. Do you have an idea why? Hovering over the data, it matches up with the DataFrame.

Code

import holoviews as hv
import numpy as np
import scanpy as sc

import hv_anndata
from hv_anndata import ACCESSOR as A
from hv_anndata import register

register()

hv.extension("bokeh")

adata = sc.datasets.pbmc68k_reduced()
adata.obs.index = map(str, range(len(adata.obs.index)))  # Just for my own sake...

kdims = [A.obs.index, A.var.index]
vdims = [A[:, :]]

ds = hv.Dataset(adata[:10, :10], kdims=kdims, vdims=vdims)
hm_pd = hv.HeatMap(ds.clone(data=ds.dframe())).opts(tools=["hover"], title="pandas")
hm_adata = hv.HeatMap(ds).opts(tools=["hover"], title="anndata")

(hm_adata + hm_pd).opts(shared_axes=False) 

[image]

@hoxbro hoxbro force-pushed the fix_edge_cases_dendrogram branch from fbd86b1 to 66cdf44 Compare September 8, 2025 14:15
@droumis droumis added this to NIH-NCI Sep 8, 2025
@hoxbro hoxbro force-pushed the fix_edge_cases_dendrogram branch from 6ff8e1f to 26d566d Compare September 8, 2025 15:22
@hoxbro hoxbro marked this pull request as ready for review September 9, 2025 11:57
@flying-sheep
Contributor

flying-sheep commented Sep 9, 2025

Seems like a strange expectation by the Heatmap code; if I change the hv-anndata interface to basically

return values.flatten() if flat else values.T

it starts to work, I just don’t understand why the values API is expected to work like that:

zvals = aggregate.dimension_values(2, flat=False)
zvals = zvals.T.flatten()

When run for the anndata version, values is called 3 times for A[:, :], and the values come out in the exact same order. It just seems like the heatmap plotting code expects the flat=False version to be transposed for some reason:

dim=A[:, :], expanded=True, flat=True
  File "…/holoviews/plotting/plot.py", line 958, in update
    return self.initialize_plot()
  File "…/holoviews/plotting/bokeh/element.py", line 2172, in initialize_plot
    ranges = self.compute_ranges(self.hmap, key, ranges)
  File "…/holoviews/plotting/plot.py", line 617, in compute_ranges
    self._compute_group_range(group, elements, ranges, framewise,
  File "…/holoviews/plotting/plot.py", line 727, in _compute_group_range
    data_range = el.range(el_dim, dimension_range=False)
  File "…/holoviews/core/data/__init__.py", line 201, in pipelined_fn
    result = method_fn(*args, **kwargs)
  File "…/holoviews/element/raster.py", line 964, in range
    return super().range(dim, data_range, dimension_range)
  File "…/holoviews/core/data/__init__.py", line 201, in pipelined_fn
    result = method_fn(*args, **kwargs)
  File "…/holoviews/core/data/__init__.py", line 529, in range
    lower, upper = self.interface.range(self, dim)
  File "…/holoviews/core/data/interface.py", line 414, in range
    column = dataset.dimension_values(dimension)
  File "…/holoviews/core/data/__init__.py", line 201, in pipelined_fn
    result = method_fn(*args, **kwargs)
  File "…/holoviews/core/data/__init__.py", line 1178, in dimension_values
    values = self.interface.values(self, dim, expanded, flat)
dim=A[:, :], expanded=True, flat=False
  File "…/holoviews/plotting/plot.py", line 958, in update
    return self.initialize_plot()
  File "…/holoviews/plotting/bokeh/element.py", line 2201, in initialize_plot
    self._init_glyphs(plot, element, ranges, source)
  File "…/holoviews/plotting/bokeh/heatmap.py", line 154, in _init_glyphs
    super()._init_glyphs(plot, element, ranges, source)
  File "…/holoviews/plotting/bokeh/element.py", line 2101, in _init_glyphs
    data, mapping, style = self.get_data(element, ranges, style)
  File "…/holoviews/plotting/bokeh/heatmap.py", line 123, in get_data
    zvals = aggregate.dimension_values(2, flat=False)
  File "…/holoviews/core/data/__init__.py", line 201, in pipelined_fn
    result = method_fn(*args, **kwargs)
  File "…/holoviews/core/data/__init__.py", line 1178, in dimension_values
    values = self.interface.values(self, dim, expanded, flat)
dim=A[:, :], expanded=True, flat=True
  File "…/holoviews/plotting/plot.py", line 958, in update
    return self.initialize_plot()
  File "…/holoviews/plotting/bokeh/element.py", line 2201, in initialize_plot
    self._init_glyphs(plot, element, ranges, source)
  File "…/holoviews/plotting/bokeh/heatmap.py", line 154, in _init_glyphs
    super()._init_glyphs(plot, element, ranges, source)
  File "…/holoviews/plotting/bokeh/element.py", line 2101, in _init_glyphs
    data, mapping, style = self.get_data(element, ranges, style)
  File "…/holoviews/plotting/bokeh/heatmap.py", line 139, in get_data
    for v in aggregate.dimension_values(vdim)]
  File "…/holoviews/core/data/__init__.py", line 201, in pipelined_fn
    result = method_fn(*args, **kwargs)
  File "…/holoviews/core/data/__init__.py", line 1178, in dimension_values
    values = self.interface.values(self, dim, expanded, flat)

Comment on lines +1351 to +1352
code_map = defaultdict(lambda: len(code_map)) # noqa: B023
order = list(map(code_map.__getitem__, ddata))
Member Author

Performance seems good here:

import pandas as pd
import numpy as np
from collections import defaultdict
from string import ascii_lowercase

var = [*'mtropqnkslmtropqnkslmtropqnkslmtropqnkslmtropqnkslmtropqnkslmtropqnkslmtropqnkslmtropqnkslmtropqnksl'] * 100 + [*ascii_lowercase]
var_np = np.asarray(var)

print(len(var), len(set(var)))

code_map = defaultdict(lambda: len(code_map))  # noqa: B023
order1 = list(map(code_map.__getitem__, var))
order2 = pd.Categorical(var_np, pd.unique(var_np)).codes

np.testing.assert_array_equal(order1, order2)
[screenshots: 2025-09-11 10-56-10, 10-56-54, 10-58-02]
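
For readers unfamiliar with the idiom in the two reviewed lines, here is a small standard-library sketch of what it computes on a made-up list: each value is mapped to an integer code in order of first appearance, because an unseen key is assigned the current size of the mapping.

```python
from collections import defaultdict

labels = ["m", "t", "r", "m", "o", "t"]  # made-up input

# A missing key triggers the factory, which returns the number of keys
# stored so far; that number becomes the code for the new key.
code_map = defaultdict(lambda: len(code_map))
codes = [code_map[label] for label in labels]

print(codes)           # [0, 1, 2, 0, 3, 1]
print(dict(code_map))  # {'m': 0, 't': 1, 'r': 2, 'o': 3}
```

This gives the same codes as the pd.Categorical call in the comparison above, without sorting the categories.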

@philippjfr
Member

Just looked into it and it seems that the AnnDataGridInterface is simply missing code that transposes the arrays to the expected orientations. Specifically, the values method is meant to transpose the arrays to match the order of the key dimensions, i.e. if the kdims declare [obs, var] as the dimensions then the array should be returned as the exact opposite, i.e. as var x obs. That is also the case for the expanded key dimensions.
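
A NumPy-only sketch of the orientation issue, under stated assumptions: `values` and `heatmap_flatten` below are hypothetical stand-ins (not the real interface or plotting code), and the data is a made-up 2 x 3 matrix. `heatmap_flatten` mirrors the two lines quoted from heatmap.py earlier in this thread.

```python
import numpy as np

n_obs, n_var = 2, 3
X = np.arange(n_obs * n_var).reshape(n_obs, n_var)  # stored as obs x var


def values(flat: bool, transpose: bool) -> np.ndarray:
    """Hypothetical stand-in for an interface's values method."""
    if flat:
        return X.flatten()
    # With kdims [obs, var], the expected 2-D result is var x obs.
    return X.T if transpose else X


def heatmap_flatten(zvals: np.ndarray) -> np.ndarray:
    """Mirror of the quoted plotting code: transpose, then flatten."""
    return zvals.T.flatten()


with_transpose = heatmap_flatten(values(flat=False, transpose=True))
without_transpose = heatmap_flatten(values(flat=False, transpose=False))

print(with_transpose)     # [0 1 2 3 4 5]
print(without_transpose)  # [0 3 1 4 2 5]
```

Only the transposed variant matches the flat=True ordering, which is consistent with the one-line change that made the plot work.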

@flying-sheep
Contributor

@philippjfr
Member

Yeah, this almost drove me insane. I don't think the conventions of the ordering and orientations expected of the flattened arrays make much sense but there was also some weird handling in the gridded interface. I've tried to resolve this in holoviz-topics/hv-anndata#89 and tried the various conditions, which now seem to work.

@flying-sheep
Contributor

I’ll comment there!

@hoxbro hoxbro requested a review from maximlt September 11, 2025 14:51