Backpressure Implementation in NoFlo
HUMAN:
In this noflo component, Use backpressure to prevent new input from being processed until the current conversation is done.
```typescript
// Load the NoFlo interface
import { ILLMResultPart, LanguageModelRunner } from '@services/languageModel/interface';
import { Component } from 'noflo';
import { Observable } from 'rxjs';

const runner = LanguageModelRunner.llamaCpp;

class LLaMaChat extends Component {
  description = 'Call local Llama model';
  icon = 'file';

  /**
   * Only allow one conversation at a time. If this is not null, it means there is a conversation in progress.
   * Use backpressure to prevent new input from being processed until the current conversation is done.
   */
  currentConversationID: string | null = null;

  constructor() {
    super();
    // Define the component's inports
    this.inPorts.add('prompt', {
      datatype: 'string',
      description: 'Input for LLM',
      required: true,
    });
    this.inPorts.add('cpu_count', {
      datatype: 'number',
      description: 'Number of CPU cores to use',
      required: true,
      default: 4,
    });
    // token by token
    this.outPorts.add('token', {
      datatype: 'string',
    });
    // full result when done
    this.outPorts.add('result', {
      datatype: 'string',
    });
    // Register a process handler for incoming data
    this.process((input, output) => {
      if (!input.hasData('prompt')) {
        return;
      }
      // Retrieve the incoming data from the inport
      const prompt = input.getData('prompt') as string;
      // eslint-disable-next-line @typescript-eslint/strict-boolean-expressions
      const cpuCount = (input.getData('cpu_count') as number) || 4;
      this.currentConversationID = String(Date.now());
      const runnerResultObserver: Observable<ILLMResultPart> = window.observables.languageModel.runLanguageModel$(runner, {
        completionOptions: {
          prompt,
          nThreads: cpuCount,
        },
        id: this.currentConversationID,
      });
      /**
       * Wait for the Observable to complete, then send the full output.
       */
      let fullResult = '';
      // Subscribe to the observable to process the results
      runnerResultObserver.subscribe({
        next: (resultPart: ILLMResultPart) => {
          // Send each token to the outport as it arrives
          output.send({
            token: resultPart.token,
          });
          fullResult += resultPart.token;
        },
        complete: () => {
          this.currentConversationID = null;
          output.send({
            result: fullResult,
          });
          // Mark the process as finished
          output.done();
        },
        error: (error: Error) => {
          this.currentConversationID = null;
          output.done(error);
        },
      });
    });
  }

  async tearDown() {
    if (this.currentConversationID !== null) {
      await window.service.languageModel.abortLanguageModel(runner, this.currentConversationID);
    }
  }
}

// Register the component
export const getComponent = () => new LLaMaChat();
```

ASSISTANT:
Your `LLaMaChat` component in NoFlo is designed to process an input prompt through the Llama model and send the result token by token, followed by the full result once completed. You also have a mechanism to allow only one conversation at a time using `currentConversationID`.
To achieve backpressure, you should prevent new input data from being processed until `currentConversationID` is null, indicating that the current conversation has finished processing.
Here's how you can implement backpressure in your process handler:
1. When the process handler is invoked, check whether `currentConversationID` is null before processing new input.
2. If `currentConversationID` is not null, simply return without processing the new input.
3. After the observable has completed or encountered an error, set `currentConversationID` back to null so that new input can be processed.
You've already implemented steps 2 and 3. Here's the modification needed for step 1:
```typescript
this.process((input, output) => {
  // If there's a conversation in progress, skip processing the new input
  if (this.currentConversationID !== null) {
    return;
  }
  if (!input.hasData('prompt')) {
    return;
  }
  // ... rest of the code
});
```

With this modification, when the process handler is invoked while a conversation is still in progress, it won't process the new input, effectively implementing backpressure.
Another thing to note is the behavior of NoFlo's `hasData` method: it checks whether there is data on the input port without dequeuing it. This means that even if you return early because of a current conversation, the data remains in the port's buffer, and the process handler will be invoked again the next time NoFlo processes components. Once the current conversation finishes and `currentConversationID` is set back to null, the process handler will pick up the queued input on the next cycle.
HUMAN:
Does this mean that if I don't call `input.getData('prompt')` and `input.getData('cpu_count')`, these two pieces of data will remain in some 'queue', so we can still access them later? But does this mean we should manually trigger `this.process` in some way when the previous Observable is done?
ASSISTANT:
Yes, your understanding is correct. When you call `input.hasData('prompt')`, it checks whether data is available on that port without dequeuing it. If you do not call `input.getData('prompt')`, the data remains in the port's input buffer.
However, NoFlo will automatically call `this.process` again for data that remains in the buffer. The NoFlo scheduler determines which components should be processed based on the data in their input buffers.
This means you do not need to manually trigger `this.process`. If you return early from `this.process` because `currentConversationID` is not null (indicating a conversation is still in progress), the data remains in the input buffer. Once the observable completes and `currentConversationID` is set back to null, `this.process` will be called again by NoFlo during its next processing cycle, and the buffered data will be processed.
This behavior of NoFlo allows you to implement backpressure without having to manually manage the reprocessing of skipped data.
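The buffering-and-retry behaviour described above can be sketched with a simplified model. This is an illustrative mock, not the real noflo API: `MockPort`, the `busy` flag, and the `done` callback are hypothetical stand-ins for the port buffer, `currentConversationID`, and `output.done()`.

```typescript
// Simplified model of NoFlo's input buffering (assumption: illustrative
// mock, not the real noflo internals).

const order: string[] = [];

class MockPort {
  private buffer: string[] = [];
  private busy = false; // plays the role of currentConversationID !== null

  constructor(private handle: (data: string, done: () => void) => void) {}

  // Deliver a packet: queue it, then attempt to process.
  send(data: string): void {
    this.buffer.push(data);
    this.tryProcess();
  }

  private tryProcess(): void {
    // Backpressure: skip processing while a "conversation" is in flight.
    if (this.busy || this.buffer.length === 0) return;
    this.busy = true;
    const data = this.buffer.shift()!; // dequeue, like input.getData()
    this.handle(data, () => {
      // done() clears the flag and re-checks the buffer,
      // mimicking NoFlo's next processing cycle.
      this.busy = false;
      this.tryProcess();
    });
  }
}

const port = new MockPort((data, done) => {
  order.push(`start:${data}`);
  // Simulate an asynchronous "conversation" that finishes later.
  setTimeout(() => {
    order.push(`end:${data}`);
    done();
  }, 0);
});

port.send('a');
port.send('b'); // arrives while 'a' is in flight; stays buffered
```

After the event loop drains, `order` is `['start:a', 'end:a', 'start:b', 'end:b']`: the second packet was queued rather than dropped, and it only started once the first one signalled completion.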
HUMAN:
You said `this.process` will be called again by NoFlo during its next processing cycle. How does that happen, and what triggers it? Is it the `output.done();` in my code? Will `output.send` trigger the "next processing cycle"? Is `output.done();` necessary in every component, or does a normal component just need `output.send`?
ASSISTANT:
NoFlo's processing mechanism can be a bit intricate. Let's break it down:
- **The trigger for processing**: NoFlo's core is event-driven. When data arrives on a component's input port, it triggers that component's `process` function. The exact mechanism depends on how the NoFlo network is built and how data flows through it.
- **`output.send` vs. `output.done`**:
  - `output.send` sends data to the corresponding outport. It does not indicate that processing of the current input data is complete, so `output.send` by itself does not trigger another cycle of the `process` function.
  - `output.done` signals that processing of the current input data has finished; it can also be used to propagate errors. Once `output.done` is called, the data that triggered the current activation is removed from the input buffer, which allows new data to be processed in a subsequent cycle. If more data remains in the input buffer (because you did not call `input.getData` earlier due to an ongoing conversation), this triggers the `process` function again in the next cycle.
- **Necessity of `output.done`**: For components that operate asynchronously (like yours, which uses observables), `output.done` is essential. It tells NoFlo that the component has finished processing the current input and is ready for the next piece of data. Without `output.done`, NoFlo would not know when to consider processing complete and might not handle subsequent data properly.
- **Normal component behavior**: In synchronous components where processing happens immediately, you can often get away with just `output.send` if the processing is straightforward, but it is still good practice to call `output.done` to explicitly indicate completion. For asynchronous components, or ones with multiple exit points, `output.done` is crucial to ensure the component behaves correctly in the NoFlo network.
In your code, output.done() is doing the work of signaling to NoFlo that the current conversation has been processed completely, either successfully (result obtained) or with an error. This makes space for the next conversation in the queue to be processed in the subsequent cycle.
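The send/done distinction can also be shown with a toy model. This is a simplified stand-in, not noflo's real `Output` class: `ToyOutput` and its fields are hypothetical, kept only to make the two behaviours concrete.

```typescript
// Toy model of the output.send vs. output.done distinction
// (assumption: simplified stand-in, not the real noflo Output class).
class ToyOutput {
  sent: Array<Record<string, string>> = [];
  finished = false;

  // send() emits data downstream but does NOT end the activation.
  send(data: Record<string, string>): void {
    this.sent.push(data);
  }

  // done() ends the activation; only now may the scheduler hand the
  // component its next buffered packet. done(error) propagates failures.
  done(error?: Error): void {
    if (error !== undefined) throw error;
    this.finished = true;
  }
}

// A handler may call send() many times (token streaming) before one done():
const out = new ToyOutput();
for (const token of ['Hel', 'lo']) {
  out.send({ token }); // stream partial tokens
}
out.send({ result: 'Hello' }); // full result
out.done(); // activation complete; next buffered packet may be processed
```

This mirrors the component above: many `token` sends, one `result` send, and exactly one `done()` per activation.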