Introduction
Wrapping some great AI code in a CodeProject.AI module is straightforward for cases where your code performs a quick inference and then returns the results to the server. For cases where the AI operation takes longer - generative AI, for example - this flow won't work: the HTTP request will time out, and the user experience will suffer.
This article will show you how to create a module for CodeProject.AI Server that wraps some code that takes a long time to complete. We will focus solely on the code required to write the adapter for our AI code, and not on the AI code itself. For that, and a fun example of an LLM on your desktop, please read the follow-up article by Matthew Dennis, Creating a LLM Chat Module for CodeProject.AI Server.
Getting Started
We're going to assume you have read CodeProject.AI Module creation: A full walkthrough in Python. We'll be creating a module in exactly the same manner, with the small addition that we'll show how to handle long-running processes.
First, as always, clone the CodeProject.AI Server repo and in the /src/modules folder create a new folder for your module. We'll call it PythonLongProcess. A simple name for us simple folk.
We will also assume we have some code that we want to expose via CodeProject.AI Server. The amazing code we'll be wrapping is below:
import time

cancelled = False

def a_long_process(callback):

    global cancelled     # without this, the assignment below would create a local shadow

    result    = ""
    step      = 0
    cancelled = False

    for i in range(1, 11):
        if cancelled:    # stop early if cancel_process() has been called
            break
        time.sleep(1)
        step   = 1 if not step else step + 1
        result = str(step) if not result else f"{result} {step}"
        callback(result, step)   # report interim progress back to the caller

def cancel_process():
    global cancelled
    cancelled = True
All the code does is progressively build a string containing the numbers 1 to 10. At each step it checks whether the process has been cancelled, and calls a callback so the caller can track progress. Nothing exciting, but it'll serve as a good demo.
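If you want to sanity-check this code outside of the server, you can drive it with a callback that just prints each update. This assumes the code above is saved as long_process.py, the file our adapter imports from later:

from long_process import a_long_process

def show_progress(result, step):
    # Print each interim result as the callback receives it
    print(f"step {step}: {result}")

a_long_process(show_progress)   # prints "1", then "1 2", up to "1 2 3 4 5 6 7 8 9 10"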
Creating the adapter
We want to wrap this long process code in a CodeProject.AI Server module, so we'll create an adapter, a modulesettings.json file, install scripts and a test page. We'll start with the adapter.
Our adapter will be very bare-bones. We don't need to get values from the caller, there's not a lot of error checking, and we're not going to log any info.
We need to create a ModuleRunner-derived class and override the initialize and process methods. To provide support for long processes we also need to override command_status and cancel_command_task, and provide a method that will actually call the long process we're wrapping. It's this last piece that provides long process support in modules.
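In outline, the adapter looks like this (signatures only; the full listing follows below):

class PythonLongProcess_adapter(ModuleRunner):
    def initialize(self) -> None: ...                # one-time setup
    def process(self, data: RequestData): ...        # returns self.long_process
    def long_process(self, data: RequestData): ...   # calls the wrapped code in the background
    def command_status(self): ...                    # reports interim results
    def cancel_command_task(self): ...               # cancels the wrapped code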
Long Process support
To allow a CodeProject.AI Server module to handle long processes we do three things:
- Signal to the caller, and to the server itself, that a call to a method is going to result in a long process.
- Run the long process in the background.
- Provide the means to check on its status, and to cancel it if necessary.
To do this we return a Callable from the usual process method, rather than the JSON object that would normally contain the results of the call. Returning a Callable signals to the server that we need to run a method in the background. The caller will then need to poll the module status API to check on progress and, if needed, call the cancel task API to cancel the long-running process.
- To check a module's status you make an API call to /v1/<moduleid>/get_command_status
- To cancel a long process you make an API call to /v1/<moduleid>/cancel_command
These routes are automatically added to each module and do not need to be defined in the module settings files. The calls map to the module's command_status and cancel_command_task methods respectively.
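To make the flow concrete, here is a minimal client-side sketch in Python. It assumes the server is listening on the default port 32168 and that these routes accept form-encoded POST requests (which is how the explorer page later in this article calls them); the commandId, moduleId, commandStatus and result fields are the same ones our test page uses.

import time
import requests

base = "http://localhost:32168/v1"

# Start the long process. The server responds immediately with a commandId
# and moduleId we can use to poll for status.
data   = requests.post(f"{base}/pythonlongprocess/long-process").json()
params = { "commandId": data["commandId"], "moduleId": data["moduleId"] }

# Poll the status route until the command completes or fails
while True:
    time.sleep(1)
    status = requests.post(f"{base}/{params['moduleId']}/get_command_status",
                           data=params).json()
    if not status.get("success") or status.get("commandStatus") in ("completed", "failed"):
        break
    print("So far:", status.get("result"))

# To cancel instead, call the cancel_command route:
# requests.post(f"{base}/{params['moduleId']}/cancel_command", data=params)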
The Code
Here is the (mostly) complete listing for our adapter. Note the usual initialize and process methods, as well as the long_process method which is returned from process to signal that a long process is starting.
Within long_process we don't do much other than call the code we're wrapping (a_long_process) and report back the results.
The command_status and cancel_command_task methods are equally simple: return what we have so far, and cancel the long operation if requested.
The final piece is our long_process_callback, which long_process passes to the wrapped a_long_process. This callback receives updates and gives us the chance to collect interim results.
# ... other imports go here (including ModuleRunner, RequestData and JSON from the SDK)
import time

from long_process import a_long_process, cancel_process

class PythonLongProcess_adapter(ModuleRunner):

    def initialize(self) -> None:
        self.result      = None
        self.step        = 0
        self.cancelled   = False
        self.stop_reason = None

    def process(self, data: RequestData) -> JSON:
        # Returning a Callable tells the server this call starts a long process
        return self.long_process

    def long_process(self, data: RequestData) -> JSON:
        self.cancelled   = False
        self.stop_reason = None
        self.result      = None
        self.step        = 0

        start_time = time.perf_counter()
        a_long_process(self.long_process_callback)
        inferenceMs: int = int((time.perf_counter() - start_time) * 1000)

        if self.stop_reason is None:
            self.stop_reason = "completed"

        response = {
            "success":     True,
            "result":      self.result,
            "stop_reason": self.stop_reason,
            "processMs":   inferenceMs,
            "inferenceMs": inferenceMs
        }
        return response

    def command_status(self) -> JSON:
        # Return whatever interim results we have so far
        return {
            "success": True,
            "result":  self.result or ""
        }

    def cancel_command_task(self) -> None:
        cancel_process()                  # tell the wrapped code to stop
        self.stop_reason    = "cancelled"
        self.force_shutdown = False       # we've stopped cleanly; no forced shutdown needed

    def long_process_callback(self, result, step):
        # Collect interim results as the wrapped code reports them
        self.result = result
        self.step   = step

if __name__ == "__main__":
    PythonLongProcess_adapter().start_loop()
Create the modulesettings.json files
Again, make sure you've reviewed A full walkthrough in Python and The ModuleSettings files. Our modulesettings file is very basic, with the interesting bits being:
- The path to our adapter, which will be used to launch the module, is long_process_demo_adapter.py
- We'll run under python3.9
- We'll define a route "pythonlongprocess/long-process" that accepts a command "command", takes no input values, and returns a string "reply"
- It can run on all platforms
{
  "Modules": {

    "PythonLongProcess": {
      "Name": "Python Long Process Demo",
      "Version": "1.0.0",

      "PublishingInfo" : {
        ...
      },

      "LaunchSettings": {
        "FilePath": "long_process_demo_adapter.py",
        "Runtime":  "python3.9"
      },

      "EnvironmentVariables": {
        ...
      },

      "GpuOptions" : {
        ...
      },

      "InstallOptions" : {
        "Platforms": [ "all" ],
        ...
      },

      "RouteMaps": [
        {
          "Name": "Long Process",
          "Route": "pythonlongprocess/long-process",
          "Method": "POST",
          "Command": "command",
          "MeshEnabled": false,
          "Description": "Demos a long process.",
          "Inputs": [
          ],
          "Outputs": [
            {
              "Name": "success",
              "Type": "Boolean",
              "Description": "True if successful."
            },
            {
              "Name": "reply",
              "Type": "Text",
              "Description": "The reply from the model."
            },
            ...
          ]
        }
      ]
    }
  }
}
There's a fair bit of boilerplate that has been removed from this snippet, so please refer to the source code to see the full Monty.
The installation scripts
We don't actually have any installation to do for our example. When this module is downloaded, the server will unpack it, move the files to the correct folder, and then run the install script so we can perform any actions needed to set up the module. We have nothing to do, but we'll still include empty scripts: not including a script signals to the server that this module should not be installed. First, the Windows install.bat script:
@if "%1" NEQ "install" (
echo This script is only called from ..\..\setup.bat
@goto:eof
)
call "!sdkScriptsDirPath!\utils.bat" WriteLine "No custom setup steps for this module." "!color_info!"
if [ "$1" != "install" ]; then
read -t 3 -p "This script is only called from: bash ../../setup.sh"
echo
exit 1
fi
writeLine "No custom setup steps for this module" "$color_info"
Create the CodeProject.AI Test page (and the Explorer UI)
We have the code we wish to wrap and expose to the world, an adapter to do this, a modulesettings.json file to define how to setup and start our adapter, and our install scripts. The final piece is the demo page that allows us to test our new module.
Our demo page (explore.html) is as basic as it gets: a button to start the long process, a button to cancel, and an output pane to view the results.
<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta charset="utf-8" />
    <title>Python Long Process demo module</title>

    <link id="bootstrapCss" rel="stylesheet" type="text/css" href="http://localhost:32168/assets/bootstrap-dark.min.css">
    <link rel="stylesheet" type="text/css" href="http://localhost:32168/assets/server.css?v=2.6.1.0">
    <script type="text/javascript" src="http://localhost:32168/assets/server.js"></script>
    <script type="text/javascript" src="http://localhost:32168/assets/explorer.js"></script>

    <style>
    </style>
</head>
<body class="dark-mode">
<div class="mx-auto" style="max-width: 800px;">

    <h2 class="mb-3">Python Long Process demo module</h2>

    <form method="post" action="" enctype="multipart/form-data" id="myform">

        <!-- Start / Cancel buttons -->
        <div class="form-group row g-0">
            <input id="_MID_things" class="form-control btn-success" type="button" value="Start long process"
                   style="width:9rem" onclick="_MID_onLongProcess()"/>
            <input id="_MID_cancel" class="form-control btn-warn" type="button" value="Cancel"
                   style="width:5rem" onclick="_MID_onCancel()"/>
        </div>

        <!-- Results pane -->
        <div>
            <h2>Results</h2>
            <div id="results" name="results" class="bg-light p-3" style="min-height: 100px;"></div>
        </div>
    </form>

    <script type="text/javascript">
        let _MID_params = null;

        // Kick off the long process, then poll get_command_status until done
        async function _MID_onLongProcess() {

            if (_MID_params) {
                setResultsHtml("Process already running. Cancel first to start a new process");
                return;
            }

            setResultsHtml("Starting long process...");

            let data = await submitRequest('pythonlongprocess/long-process', 'command', null, null);
            if (data) {
                _MID_params = [['commandId', data.commandId], ['moduleId', data.moduleId]];

                let done = false;
                while (!done) {

                    await delay(1000);

                    if (!_MID_params)    // may have been cancelled
                        break;

                    let results = await submitRequest('pythonlongprocess', 'get_command_status',
                                                      null, _MID_params);
                    if (results && results.success) {
                        if (results.commandStatus == "failed") {
                            done = true;
                            setResultsHtml(results?.error || "Unknown error");
                        }
                        else {
                            let message = results.result;
                            if (results.commandStatus == "completed")
                                done = true;
                            setResultsHtml(message);
                        }
                    }
                    else {
                        done = true;
                        setResultsHtml(results?.error || "No response from server");
                    }
                }

                _MID_params = null;
            }
        }

        // Ask the server to cancel the long-running command
        async function _MID_onCancel() {

            if (!_MID_params)
                return;

            let moduleId = _MID_params[1][1];
            let result = await submitRequest(moduleId, 'cancel_command', null, _MID_params);
            if (result.success) {
                _MID_params = null;
                setResultsHtml("Command stopped");
            }
        }
    </script>
</div>
</body>
</html>
Conclusion
Wrapping code that takes a long time to execute in a CodeProject.AI module is straightforward thanks to help from the server. It helps enormously if the code you are wrapping provides a means of regularly querying its progress, but even that isn't necessary (though the user experience will suffer a little).
We've used this long process support to wrap a text-to-image module using Stable Diffusion, and the Llama large language model to provide ChatGPT-like functionality on your desktop. The only additional work beyond writing a standard CodeProject.AI Server module was adding the methods to the adapter to check status and cancel if necessary, plus the code to actually call these methods in our test HTML page.
Long process support is perfect for generative AI solutions, but is also useful where you wish to support AI operations on low-spec hardware. While OCR, for instance, might take a fraction of a second on a decent machine, running the same text detection and recognition models on large amounts of data on a Raspberry Pi could take a while. Offering the functionality via a long process module can provide a better user experience and avoid issues of HTTP timeouts.