First draft version for vectorize usage response #1865
Conversation
*
* @return
*/
public RestResponse<CommandResult> toRestResponse(String vectorizeHeader) {
Added a method that accepts the vectorize header to be returned.
-1: I do not want to code how the API returns this header in this PR; we are working on changes to how we do vectorize for tables, etc.
So let's make this PR just about what the embedding provider is doing.
EmbeddingProvider.EmbeddingRequestType.INDEX)
.map(res -> res.embeddings());
.map(
res -> {
Setting value from the aggregated vectorize usage info.
For now, we just want it on the response object returned from calling the embedding provider.
Will remove the CDI-related code.
int batchId, List<float[]> embeddings, VectorizeUsageInfo vectorizeUsageInfo) {
public static Response of(int batchId, List<float[]> embeddings) {
return new Response(batchId, embeddings);
return new Response(batchId, embeddings, new VectorizeUsageInfo(0, 0, 0, "", ""));
Added a default VectorizeUsageInfo until all embedding provider classes are refactored.
This is a record, so it has a canonical constructor that will not enforce that VectorizeUsageInfo is non-null.
Either validate this in the canonical constructor, or turn this into a class and validate in a constructor that takes all of the params.
We should remove all the of() functions from the record. Records should be simple structures with properties; if they need a static factory, they are probably not the right fit.
Adapted the existing code to avoid compilation errors in all the embedding providers. Will remove the of() method from this record.
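For reference, the "validate in the canonical constructor" option could look roughly like the sketch below. It is not the PR's code: `VectorizeUsageInfo` is modelled here as a record purely to keep the example short (in the diff it is a class with getters), `ResponseRecordSketch` is a made-up driver class, and the sample values are invented.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical driver class, declared first so the file can be launched directly.
class ResponseRecordSketch {
  public static void main(String[] args) {
    VectorizeUsageInfo usage =
        new VectorizeUsageInfo(128, 4096, 12, "exampleProvider", "exampleModel");
    Response ok = new Response(1, List.of(new float[] {0.1f, 0.2f}), usage);
    System.out.println("tokens=" + ok.vectorizeUsageInfo().totalTokens());

    try {
      new Response(2, List.of(), null); // rejected by the compact constructor
    } catch (NullPointerException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}

// Assumption: a record stand-in for the PR's VectorizeUsageInfo class.
record VectorizeUsageInfo(
    int requestSize, int responseSize, int totalTokens, String provider, String model) {}

// The response as a plain record with no static factory.
record Response(int batchId, List<float[]> embeddings, VectorizeUsageInfo vectorizeUsageInfo) {
  // Compact canonical constructor: every construction path goes through these checks.
  Response {
    Objects.requireNonNull(embeddings, "embeddings must not be null");
    Objects.requireNonNull(vectorizeUsageInfo, "vectorizeUsageInfo must not be null");
  }
}
```

With the checks in the canonical constructor, callers use `new Response(...)` directly and the `of()` overloads are no longer needed to guarantee a non-null usage object.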
int totalToken = 0;
String provider = "";
String modelName = "";
for (Response vectorizedBatch : vectorizedBatches) {
Aggregating the vectorize size from different micro batches.
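A minimal, standalone sketch of this aggregation, assuming every micro batch comes from the same provider and model. `BatchUsage` and `UsageAggregationSketch` are hypothetical names used only for illustration; the field names follow the diff.

```java
import java.util.List;

// Hypothetical driver class for the sketch.
class UsageAggregationSketch {

  // Sums byte and token counts across micro batches. Provider and model are taken
  // from the batches themselves, which assumes all batches share the same ones.
  static BatchUsage aggregate(List<BatchUsage> batches) {
    int sentBytes = 0;
    int receivedBytes = 0;
    int totalTokens = 0;
    String provider = "";
    String model = "";
    for (BatchUsage batch : batches) {
      sentBytes += batch.requestSize();
      receivedBytes += batch.responseSize();
      totalTokens += batch.totalTokens();
      provider = batch.provider();
      model = batch.model();
    }
    return new BatchUsage(sentBytes, receivedBytes, totalTokens, provider, model);
  }

  public static void main(String[] args) {
    List<BatchUsage> batches =
        List.of(
            new BatchUsage(120, 900, 8, "exampleProvider", "exampleModel"),
            new BatchUsage(80, 600, 5, "exampleProvider", "exampleModel"));
    System.out.println(aggregate(batches));
  }
}

// Assumption: a simplified per-batch usage holder, not the PR's actual type.
record BatchUsage(
    int requestSize, int responseSize, int totalTokens, String provider, String model) {}
```

An empty batch list yields zero counts and empty provider and model strings, matching the initial values in the diff.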
import java.io.InputStream;
import java.util.logging.Logger;

public class NetworkUsageInterceptor implements ClientRequestFilter, ClientResponseFilter {
Interceptor to capture the byte size.
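The core of the byte counting can be shown without any JAX-RS types. The sketch below is an assumption about the general approach, not the interceptor's code: it buffers a body stream, records its length, and hands back a fresh stream so a later reader can still consume the body. `ByteCountSketch`, `Measured`, and `measure` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, independent of the PR's NetworkUsageInterceptor.
class ByteCountSketch {

  // The measured size plus a replayable stream over the same bytes.
  record Measured(int byteCount, InputStream replay) {}

  // Reads the whole body into memory, counts it, and returns a new stream so the
  // entity can still be read afterwards (the original stream is consumed).
  static Measured measure(InputStream body) {
    try {
      byte[] bytes = body.readAllBytes();
      return new Measured(bytes.length, new ByteArrayInputStream(bytes));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    InputStream body =
        new ByteArrayInputStream("{\"data\":[]}".getBytes(StandardCharsets.UTF_8));
    Measured measured = measure(body);
    System.out.println("Received Bytes: " + measured.byteCount());
  }
}
```

Buffering keeps the example simple but holds the full body in memory; a counting stream wrapper would avoid that cost for large embedding responses.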
Uni<EmbeddingResponse> response =
// ✅ Create an instance of NetworkUsageInfo and pass it to request properties
Uni<jakarta.ws.rs.core.Response> response =
Accept the raw (not yet deserialized) response as the service response.
if (resp.data() == null) {
return Response.of(batchId, Collections.emptyList());
res -> {
EmbeddingResponse embeddingResponse = res.readEntity(EmbeddingResponse.class);
Deserialize the entity as EmbeddingResponse.
src/main/java/io/stargate/sgv2/jsonapi/service/embedding/operation/VectorizeUsageInfo.java
new ByteArrayInputStream(byteArrayOutputStream.toByteArray()));
}
LOGGER.info("Received Bytes: " + receivedBytes);
responseContext.getHeaders().add("sent-bytes", String.valueOf(sentBytes));
Setting the byte counts as response headers so that the embedding provider client class can parse them.
Reasonable start. Take out the Data API returning the header and make this just about the embedding provider for now, because we are changing things; there are also some improvements to make to the code.
String vectorize = null;
try {

if (!Strings.isNullOrEmpty(vectorizeUsageBean.getModel())) {
-1 - just the embedding provider for now
return new Response(batchId, embeddings, new VectorizeUsageInfo(0, 0, 0, "", ""));
}

public static Response of(
This of() function has the same signature as the record constructor; why is it needed?
public class VectorizeUsageInfo {
private int requestSize;
private int responseSize;
private int totalTokens;
Can we also have the input and output tokens?
Not sure whether all the embedding API responses split tokens like that. Will check.
Verified: not all providers return these. For example, the Cohere response contains only `"billed_units": { "input_tokens": 2 }`.
Hugging Face doesn't return any token counts.
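One way to represent token counts that some providers omit is to make them optional rather than defaulting to 0, so "not reported" stays distinguishable from "zero tokens". The sketch below is only an illustration of that idea; `TokenUsage` and `TokenUsageSketch` are hypothetical names and are not part of the PR.

```java
import java.util.Objects;
import java.util.OptionalInt;

// Hypothetical driver class for the sketch.
class TokenUsageSketch {
  public static void main(String[] args) {
    // A provider that reports only input tokens, as in the Cohere example above.
    TokenUsage inputOnly = new TokenUsage(OptionalInt.of(2), OptionalInt.empty());
    // A provider that reports no token counts at all.
    TokenUsage none = new TokenUsage(OptionalInt.empty(), OptionalInt.empty());

    System.out.println("input only: " + inputOnly.totalTokens());
    System.out.println("none: " + none.totalTokens());
  }
}

// Assumption: token counts modelled as optional values instead of plain ints.
record TokenUsage(OptionalInt inputTokens, OptionalInt outputTokens) {
  TokenUsage {
    Objects.requireNonNull(inputTokens, "inputTokens must not be null");
    Objects.requireNonNull(outputTokens, "outputTokens must not be null");
  }

  // Empty when the provider reported nothing; otherwise the sum of what was reported.
  OptionalInt totalTokens() {
    if (inputTokens.isEmpty() && outputTokens.isEmpty()) {
      return OptionalInt.empty();
    }
    return OptionalInt.of(inputTokens.orElse(0) + outputTokens.orElse(0));
  }
}
```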
private String model;

public VectorizeUsageInfo(
int requestSize, int responseSize, int totalTokens, String provider, String model) {
Do we have an enum for the provider and/or model?
We don't have an enum, as it's configured in the provider configuration YAML.
modelName = vectorizedBatch.vectorizeUsageInfo().getModel();
}
return Response.of(1, result);
return Response.of(
see other comments, do not need a static factory on a record
}

@Override
public void filter(ClientRequestContext requestContext, ClientResponseContext responseContext)
If we do not change the way of counting and stick with this for an initial release, why not count both the request and the response in this function?
src/main/java/io/stargate/sgv2/jsonapi/service/embedding/operation/NetworkUsageInterceptor.java
I will make the change as suggested and let you know.
sentBytes += vectorizedBatch.vectorizeUsageInfo().getRequestSize();
receivedBytes += vectorizedBatch.vectorizeUsageInfo().getResponseSize();
totalToken += vectorizedBatch.vectorizeUsageInfo().getTotalTokens();
provider = vectorizedBatch.vectorizeUsageInfo().getProvider();
Currently this assumes there is only one vectorize configuration per collection/table. I'm not sure how the code will be refactored to support multi-vectorize for tables.
What this PR does:
Vectorize usage header
Checklist