Description
Edit: note that to_values is deprecated; use to_coo instead.
We have found that we sometimes call to_values() followed by from_values() to perform operations such as creating a diagonal matrix or using the values as indices.
We may want to give this common(-ish) pattern a name. For example:
Matrix:

    @classmethod
    def from_components(
        cls,
        row_obj,
        col_obj,
        val_obj,
        dtype=None,
        *,
        row_how="row",
        col_how="col",
        val_how="val",
        dup_op=None,
        nrows=None,
        ncols=None,
        name=None,
    ):
        ...

Vector:

    @classmethod
    def from_components(
        cls,
        index_obj,
        val_obj,
        dtype=None,
        *,
        index_how="index",
        dup_op=None,
        size=None,
        name=None,
    ):
        ...
For example, in generalized_degrees (python-graphblas/graphblas-algorithms#11), we compute a number for each edge and then build a histogram of those numbers, using each number as the column index:
    rows, cols, vals = Tri.to_values()
    # The column index indicates the number of triangles an edge participates in.
    # The largest this can be is `A.ncols - 1`. Values is count of edges.
    return Matrix.from_values(
        rows,
        vals,
        np.ones(vals.size, dtype=int),
        dup_op=binary.plus,
        nrows=A.nrows,
        ncols=A.ncols - 1,
        name="generalized_degree",
    )
With from_components, this would be written as:
    return Matrix.from_components(
        Tri,
        Tri,
        1,
        col_how="val",
        dup_op=binary.plus,
        ncols=Tri.ncols - 1,
        name="generalized_degree",
    )
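To make the intended equivalence concrete, here is a minimal pure-Python sketch of the two versions above. It does not use python-graphblas: a plain dict keyed by (row, col) stands in for a Matrix, and `from_coo_sketch` and `from_components_sketch` are hypothetical helpers written only for this illustration (they handle a scalar value and ignore dtype, nrows, and ncols).

```python
import operator


def from_coo_sketch(rows, cols, vals, dup_op=None):
    """Build a {(row, col): value} dict from COO triples (hypothetical helper)."""
    out = {}
    for r, c, v in zip(rows, cols, vals):
        key = (r, c)
        if key in out:
            if dup_op is None:
                raise ValueError(f"duplicate index {key} and no dup_op given")
            out[key] = dup_op(out[key], v)  # combine duplicates
        else:
            out[key] = v
    return out


def from_components_sketch(row_obj, col_obj, val, *, row_how="row",
                           col_how="col", dup_op=None):
    """Model of the proposed semantics for dict-based matrices and a scalar value."""
    def pick(obj, how):
        if how == "row":
            return [r for (r, _c) in obj]
        if how == "col":
            return [c for (_r, c) in obj]
        if how == "val":
            return list(obj.values())
        raise ValueError(f"unsupported how: {how!r}")

    rows = pick(row_obj, row_how)
    cols = pick(col_obj, col_how)
    return from_coo_sketch(rows, cols, [val] * len(rows), dup_op=dup_op)


# tri[i, j] = number of triangles that edge (i, j) participates in (made-up data).
tri = {(0, 1): 2, (0, 2): 2, (0, 3): 1, (1, 0): 2, (1, 2): 1}

# Existing recipe: extract the triples, then rebuild with values as column indices.
rows = [r for (r, _c) in tri]
vals = list(tri.values())
manual = from_coo_sketch(rows, vals, [1] * len(vals), dup_op=operator.add)

# Proposed spelling: say where each component comes from.
proposed = from_components_sketch(tri, tri, 1, col_how="val", dup_op=operator.add)
```

Both results map (row, triangle count) to the number of edges with that count, which is the histogram the recipe builds.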
row_how, col_how, and val_how should be one of {"index", "row", "col", "val"}. For Vectors, we treat "row" and "col" the same as "index". It is an error to use "index" for Matrix objects.
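The rules for the `*_how` arguments could be checked along the following lines. This is only a sketch: `resolve_how` and its `obj_is_vector` flag are hypothetical names, not part of any existing API, and the choice of ValueError is an assumption.

```python
VALID_HOW = {"index", "row", "col", "val"}


def resolve_how(how, obj_is_vector):
    """Normalize a `*_how` argument for the object it applies to (hypothetical)."""
    if how not in VALID_HOW:
        raise ValueError(f"how must be one of {sorted(VALID_HOW)}; got {how!r}")
    if obj_is_vector:
        # For Vectors, "row" and "col" are treated the same as "index".
        return "index" if how in {"row", "col"} else how
    if how == "index":
        # "index" has no meaning for a Matrix source.
        raise ValueError('"index" may not be used with Matrix objects')
    return how
```

For instance, `resolve_how("col", obj_is_vector=True)` yields `"index"`, while `resolve_how("index", obj_is_vector=False)` raises.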
Default nrows and ncols are inferred from the objects they come from. For example, for nrows (where max(row_obj) means the largest stored value in row_obj):
- row_how == "row": nrows = row_obj.nrows
- row_how == "col": nrows = row_obj.ncols
- row_how == "val": nrows = max(row_obj) + 1
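The inference rules above can be sketched in plain Python. `MatrixStandIn` is a hypothetical stand-in holding only the attributes the rules need, and `infer_nrows` is an illustrative name; the empty-values case is not addressed in this proposal, so the sketch simply lets `max` raise.

```python
from dataclasses import dataclass, field


@dataclass
class MatrixStandIn:
    """Hypothetical stand-in for a Matrix: shape plus its stored values."""
    nrows: int
    ncols: int
    values: list = field(default_factory=list)


def infer_nrows(row_obj, row_how):
    """Default nrows for the result, following the three rules listed above."""
    if row_how == "row":
        return row_obj.nrows
    if row_how == "col":
        return row_obj.ncols
    if row_how == "val":
        # Values become row indices, so the result needs max + 1 rows.
        # Note: max() raises ValueError if there are no stored values.
        return max(row_obj.values) + 1
    raise ValueError(f"unsupported row_how: {row_how!r}")


m = MatrixStandIn(nrows=3, ncols=5, values=[0, 4, 2])
```

The same logic would apply to ncols with col_obj and col_how.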
row_obj, col_obj, and val_obj must match type and structure, but val_obj may also be a scalar. We will verify types and nvals, but not structure (behavior is undefined if structures don't match). Note that a Matrix can be created from Vector components, and vice versa.
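A rough sketch of the cheap checks described here (types and nvals, but not structure) might look like this. `FakeMatrix`, `FakeVector`, and `check_components` are hypothetical names for illustration, and the exception types are assumptions.

```python
from dataclasses import dataclass
from numbers import Number


@dataclass
class FakeMatrix:
    nvals: int  # number of stored values


@dataclass
class FakeVector:
    nvals: int


def check_components(row_obj, col_obj, val_obj):
    """Verify matching types and nvals; structure is deliberately not checked."""
    if type(row_obj) is not type(col_obj):
        raise TypeError("row_obj and col_obj must be the same type")
    if row_obj.nvals != col_obj.nvals:
        raise ValueError("row_obj and col_obj must have the same nvals")
    if isinstance(val_obj, Number):
        return  # a scalar is used for every element
    if type(val_obj) is not type(row_obj):
        raise TypeError("val_obj must be a scalar or match the other components")
    if val_obj.nvals != row_obj.nvals:
        raise ValueError("val_obj must have the same nvals as the other components")
```

Two objects with equal nvals but different sparsity patterns pass these checks, which is exactly the undefined-behavior case mentioned above.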
Values must have an integral dtype when used as indices. For signed integers, we should take the min to check for negative indices. If any are negative, should we raise (the easiest to start with), or treat them as negative offsets from the end?
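As a starting point, the "raise on negative" option could be sketched as follows. `check_index_values` is a hypothetical helper operating on plain Python integers rather than GraphBLAS objects, and rejecting bools is an extra assumption of this sketch.

```python
from numbers import Integral


def check_index_values(values):
    """Validate values that will be used as indices (hypothetical helper)."""
    values = list(values)
    for v in values:
        # bool is a subclass of int in Python; reject it explicitly (assumption).
        if isinstance(v, bool) or not isinstance(v, Integral):
            raise TypeError(f"index values must be integers; got {v!r}")
    # A single min() is enough to detect any negative index.
    if values and min(values) < 0:
        raise ValueError("negative values cannot be used as indices")
    return values
```

Supporting negative offsets from the end instead would replace the ValueError branch with a shift by the relevant dimension, which requires knowing nrows or ncols up front.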
Until this is implemented, let's try to gather examples here where we would use this function or similar functions. Maybe we'll find different variations, better APIs, better names, or better recipes.
A potential goal is to get this functionality implemented in C or added to the C spec (if it is sufficiently justified), which may need to be less flexible than our Python API. But we won't know whether such functionality is useful until we try to encapsulate it in a Python function!
Alright, let's gather examples (from LAGraph, grblas-recipes, etc.) to see if this function works...