@maheshrajamani maheshrajamani commented Feb 11, 2025

What this PR does:
Vectorize usage header

Checklist

  • Changes manually tested
  • Automated Tests added/updated
  • Documentation added/updated
  • CLA Signed: DataStax CLA

@maheshrajamani maheshrajamani requested a review from a team as a code owner February 11, 2025 17:25
*
* @return
*/
public RestResponse<CommandResult> toRestResponse(String vectorizeHeader) {
Contributor Author

Added a method that accepts the vectorize header to be returned.

Contributor

-1 do not want to code how the API returns this header in this PR, we are working on changes for how we do vectorize for the tables etc.

so let's make this PR just about what the embedding provider is doing

EmbeddingProvider.EmbeddingRequestType.INDEX)
.map(res -> res.embeddings());
.map(
res -> {
Contributor Author

Setting value from the aggregated vectorize usage info.

Contributor

for now just want it on the response object from calling the embedding provider

Contributor Author

Will remove the CDI related code.

int batchId, List<float[]> embeddings, VectorizeUsageInfo vectorizeUsageInfo) {
public static Response of(int batchId, List<float[]> embeddings) {
return new Response(batchId, embeddings);
return new Response(batchId, embeddings, new VectorizeUsageInfo(0, 0, 0, "", ""));
Contributor Author

Added a default VectorizeUsageInfo until all embedding provider classes get refactored.

Contributor

this is a record, so it has a constructor that will not enforce that VectorizeUsageInfo is non-null.

either: validate this in the canonical constructor, or turn this into a class and validate in a ctor that takes all of the params

we should remove all the of() functions from the record. records should be simple structures with properties; if they need a static factory, a record is probably not the right fit
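
A minimal sketch of the first option above (validating in the canonical constructor), assuming a simplified `Response` record. The `VectorizeUsageInfo` record here is a hypothetical stand-in for the class in this PR, and the field names are taken from the hunks in this thread, not from the final code.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical, simplified stand-in for the usage type discussed in this PR.
record VectorizeUsageInfo(
    int requestSize, int responseSize, int totalTokens, String provider, String model) {}

// Simplified stand-in for the embedding provider's Response record.
record Response(int batchId, List<float[]> embeddings, VectorizeUsageInfo vectorizeUsageInfo) {

  // Compact canonical constructor: runs on every construction path,
  // so a null usage object is rejected without needing a static factory.
  Response {
    Objects.requireNonNull(embeddings, "embeddings must not be null");
    Objects.requireNonNull(vectorizeUsageInfo, "vectorizeUsageInfo must not be null");
  }
}
```

With this shape, callers use `new Response(...)` directly and the `of()` factories become unnecessary.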

Contributor Author

Adapted the existing code to avoid compilation errors in all the embedding providers. Will remove the of() method from this record.

int totalToken = 0;
String provider = "";
String modelName = "";
for (Response vectorizedBatch : vectorizedBatches) {
Contributor Author

Aggregating the vectorize size from different micro batches.
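
A rough sketch of that aggregation, assuming a simplified record in place of the `VectorizeUsageInfo` class. `UsageAggregator` and `aggregate` are hypothetical names used only for illustration. As in the loop shown in the hunk, the provider and model of the last batch win, which only makes sense while every batch uses the same provider and model.

```java
import java.util.List;

// Hypothetical, simplified stand-in for the usage type discussed in this PR.
record VectorizeUsageInfo(
    int requestSize, int responseSize, int totalTokens, String provider, String model) {}

final class UsageAggregator {
  private UsageAggregator() {}

  // Sums byte and token counts over all micro batches.
  static VectorizeUsageInfo aggregate(List<VectorizeUsageInfo> batches) {
    int sentBytes = 0;
    int receivedBytes = 0;
    int totalTokens = 0;
    String provider = "";
    String model = "";
    for (VectorizeUsageInfo usage : batches) {
      sentBytes += usage.requestSize();
      receivedBytes += usage.responseSize();
      totalTokens += usage.totalTokens();
      // Assumes a single provider/model per call; the last batch wins.
      provider = usage.provider();
      model = usage.model();
    }
    return new VectorizeUsageInfo(sentBytes, receivedBytes, totalTokens, provider, model);
  }
}
```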

import java.io.InputStream;
import java.util.logging.Logger;

public class NetworkUsageInterceptor implements ClientRequestFilter, ClientResponseFilter {
Contributor Author

Interceptor to capture the byte size.


Uni<EmbeddingResponse> response =
// ✅ Create an instance of NetworkUsageInfo and pass it to request properties
Uni<jakarta.ws.rs.core.Response> response =
Contributor Author

Accept the raw (not yet deserialized) response from the service.

if (resp.data() == null) {
return Response.of(batchId, Collections.emptyList());
res -> {
EmbeddingResponse embeddingResponse = res.readEntity(EmbeddingResponse.class);
Contributor Author

Deserialize into EmbeddingResponse.

new ByteArrayInputStream(byteArrayOutputStream.toByteArray()));
}
LOGGER.info("Received Bytes: " + receivedBytes);
responseContext.getHeaders().add("sent-bytes", String.valueOf(sentBytes));
Contributor Author

Setting it as a response header so it can be parsed in the embedding provider client class.
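
The buffering step visible in the hunk above (read the entity, count it, then hand back a fresh `ByteArrayInputStream`) can be sketched without any JAX-RS types. `CountedStream` and `EntityByteCounter` are hypothetical names, and this is an illustration of the technique, not the interceptor from this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical holder: a replayable copy of the entity plus its size in bytes.
record CountedStream(InputStream replay, int byteCount) {}

final class EntityByteCounter {
  private EntityByteCounter() {}

  // Reads the stream fully, records its length, and returns a new stream over
  // the same bytes so later readers still see the whole entity.
  static CountedStream bufferAndCount(InputStream entityStream) {
    try {
      byte[] bytes = entityStream.readAllBytes();
      return new CountedStream(new ByteArrayInputStream(bytes), bytes.length);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

In a `ClientResponseFilter`, the input would presumably come from `responseContext.getEntityStream()` and the replay stream would go back through `responseContext.setEntityStream(...)`, guarded by `hasEntity()`. Note that this buffers the whole body in memory.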

Contributor

@amorton amorton left a comment

Reasonable start. Take out the part where the Data API returns the header and just make it about the embedding provider for now, because we are changing things. Also some improvements on the code.

String vectorize = null;
try {

if (!Strings.isNullOrEmpty(vectorizeUsageBean.getModel())) {
Contributor

-1 - just the embedding provider for now

return new Response(batchId, embeddings, new VectorizeUsageInfo(0, 0, 0, "", ""));
}

public static Response of(
Contributor

this of() function has the same signature as the record constructor, why is it needed?

public class VectorizeUsageInfo {
private int requestSize;
private int responseSize;
private int totalTokens;
Contributor

can we also have the input and output tokens?

Contributor Author

Not sure if there is a split like that in all the embedding API responses. Will check.

Contributor Author

Verified: not all providers return these. For example, the Cohere response looks like

"billed_units": {
  "input_tokens": 2
},

Hugging Face doesn't return any token counts.

private String model;

public VectorizeUsageInfo(
int requestSize, int responseSize, int totalTokens, String provider, String model) {
Contributor

do we have an enum for the provider and/or model?

Contributor Author

We don't have an enum, as it's configured in the provider configuration YAML.

modelName = vectorizedBatch.vectorizeUsageInfo().getModel();
}
return Response.of(1, result);
return Response.of(
Contributor

see other comments, do not need a static factory on a record

}

@Override
public void filter(ClientRequestContext requestContext, ClientResponseContext responseContext)
Contributor

if we do not change the way of counting, and stick with this for an initial release, why not count both the request and the response in this function?

Contributor Author

@maheshrajamani maheshrajamani left a comment

I will make the change as suggested and let you know.

sentBytes += vectorizedBatch.vectorizeUsageInfo().getRequestSize();
receivedBytes += vectorizedBatch.vectorizeUsageInfo().getResponseSize();
totalToken += vectorizedBatch.vectorizeUsageInfo().getTotalTokens();
provider = vectorizedBatch.vectorizeUsageInfo().getProvider();
Contributor Author

Currently this assumes there is only one vectorize config per collection/table. Not sure what code refactoring is needed to support multiple vectorize configs per table.
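
One possible direction for the multi-vectorize case, sketched under the assumption that usage can be keyed by provider and model. `ProviderModel`, `PerModelUsage`, and `aggregateByModel` are hypothetical names, and the record is a simplified stand-in for `VectorizeUsageInfo`. Nothing here reflects the planned refactoring.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the usage type discussed in this PR.
record VectorizeUsageInfo(
    int requestSize, int responseSize, int totalTokens, String provider, String model) {}

// Hypothetical grouping key: one entry per provider/model pair.
record ProviderModel(String provider, String model) {}

final class PerModelUsage {
  private PerModelUsage() {}

  // Keeps one running total per provider/model instead of letting the last batch win.
  static Map<ProviderModel, VectorizeUsageInfo> aggregateByModel(List<VectorizeUsageInfo> batches) {
    Map<ProviderModel, VectorizeUsageInfo> totals = new LinkedHashMap<>();
    for (VectorizeUsageInfo usage : batches) {
      totals.merge(
          new ProviderModel(usage.provider(), usage.model()),
          usage,
          (a, b) ->
              new VectorizeUsageInfo(
                  a.requestSize() + b.requestSize(),
                  a.responseSize() + b.responseSize(),
                  a.totalTokens() + b.totalTokens(),
                  a.provider(),
                  a.model()));
    }
    return totals;
  }
}
```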


3 participants