Arm64 enables more memory space, faster texture loading, some faster operations than Arm32, and reduced power consumption. You achieve the best performance boost if you switch entirely to Arm64 to benefit from its native architecture. However, this approach may be impossible when existing applications contain many dependencies. Often, it requires porting all your dependencies to Arm64 before porting the actual app, creating a bottleneck.
Arm64EC is a Windows 11 Application Binary Interface (ABI) that helps you transition existing x64 apps to Arm64. It enables your existing x64 dependencies to load in the same process as the Arm64 binaries, resolving the bottleneck of porting dependencies. This approach improves your app’s performance without changing any code.
Using Arm64EC helps migrate large and complex applications with their own ecosystem. The app vendor may not know of the dependencies the app customer requires. For example, professionals may use plugins from many independent vendors in an image processing application. When that image processing application is compatible with Arm64EC, the users can use all the plugins, whether or not the makers have ported them to Arm64. Arm64EC enables shipping applications that link legacy binary dependencies, even if they have missing source code or an unsupported toolchain.
This series demonstrates how to employ Arm64EC to port a complete application consisting of the main application and dependencies, including separate dynamic link libraries (DLLs). Another article uses a simple single DLL example to explain how Arm64EC works and how to port your existing x64 DLLs to Arm64.
In this article, you will build a Qt-based Python application with two C/C++-based DLL dependencies. This architecture mimics a typical scenario of using Python and Qt for rapid UI prototyping and DLLs for computation-intense work. The Python application demonstrates an alternative way of building a UI for C/C++ based DLL dependencies since native C/C++ dependencies are typical in the Python ecosystem and may not offer native builds yet. However, you could still use Qt-based C/C++ UI, as another article shows.
Port Applications with Arm64EC
This tutorial starts by using the first DLL, Vectors.dll, to calculate the dot product of two vectors. The DLL has a function that runs the calculations several times and returns the overall computation time. The UI displays the time.
The second DLL, Filters.dll, generates a synthetic signal and then truncates it using a predefined threshold. A Qt chart displays the input signal and the filtered one, like the image below.
You will use ctypes to invoke C/C++ DLLs.
Prerequisites
To follow this tutorial, ensure you have the following:
- Visual Studio 2022
- Arm64 build tools. Install these tools through your Visual Studio installer under Individual Components > MSVC v143 > VS 2022 C++ Arm64/ARM64EC build tools. Note that this is the default selected component.
- Python installed on your machine. This demonstration uses Python version 3.11.3.
For a detailed view of the tutorial, review the complete project code.
Project Setup
To set up the project, start by creating the dependencies (the DLLs). This demonstration uses CMake in Visual Studio 2022 to create the base project for your dependencies. You can also use MSBuild/Visual C++ project templates to compile to Arm64EC by adding the architecture to your build configuration. To access CMake, click File > New > Project… and look for CMake Project in the window that appears.
Then, click Next and set the following configurations:
- Project name: Arm64EC.Porting
- Location: Choose any location
- Solution: Create new solution
- Solution name: Arm64EC.Porting
Finally, click Create.
Application Build
Now, build the Python application that invokes functions exported from two DLLs, Vectors.dll and Filters.dll, which you will create shortly. To implement this app, create a folder called Main-app under Arm64EC.Porting. Inside the Main-app folder, create a main.py script, and add a subfolder called Dependencies. The Dependencies folder will contain the DLLs you compile later in this tutorial.
Next, you need to install several dependencies. The first is PySide, which provides Python bindings for Qt. Installing it also installs the Qt binaries it depends on. Install PySide via pip by running pip install pyside6.
Alternatively, you can install PySide in a virtual environment. Create the environment by running python -m venv path_to_virtual_environment. Then, activate it by running path_to_virtual_environment/Scripts/activate.bat and install the dependencies by running pip install -r requirements.txt. Note that for this method, you must first download the requirements.txt file from the companion code.
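Before continuing, it can help to confirm that the packages are visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of the companion code.

```python
import importlib.util

def missing_packages(names=("PySide6",)):
    """Return the subset of names that the current interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Run this with the same interpreter (or activated virtual environment) that will later run main.py.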
In the main.py file, import ctypes, sys, os, and the Qt modules:
import ctypes, sys, os
from PySide6 import QtCore, QtWidgets
from PySide6.QtGui import QPainter
from PySide6.QtCharts import QChart, QChartView, QLineSeries, QValueAxis
Then, get absolute paths to your DLLs using the code below.
rootPath = os.getcwd()
vectorsLibName = os.path.join(rootPath, "Dependencies\\Vectors.dll")
filtersLibName = os.path.join(rootPath, "Dependencies\\Filters.dll")
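The paths above are built from the current working directory, so they only resolve correctly when the script is launched from the Main-app folder. As a hedged alternative, the sketch below builds the paths from an explicit base folder. The helper name dll_path and the use of pathlib are assumptions for illustration, not part of the original project.

```python
from pathlib import Path

def dll_path(base_dir, name, subfolder="Dependencies"):
    """Build an absolute path to a DLL stored under base_dir/subfolder."""
    return str(Path(base_dir).resolve() / subfolder / name)

# Hypothetical usage in main.py: Path(__file__).parent is the script's folder,
# so the result no longer depends on where the terminal was opened.
# vectorsLibName = dll_path(Path(__file__).parent, "Vectors.dll")
# filtersLibName = dll_path(Path(__file__).parent, "Filters.dll")
print(dll_path(".", "Vectors.dll"))
```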
Next, define the MainWindowWidget class and its initializer:
class MainWindowWidget(QtWidgets.QWidget):
    def __init__(self):
        super().__init__()
        self.buttonVectors = QtWidgets.QPushButton("Vectors")
        self.buttonFilters = QtWidgets.QPushButton("Filters")
        self.computationTimeLabel = QtWidgets.QLabel("", alignment=QtCore.Qt.AlignTop)
        self.chart = QChart()
        self.chart.legend().hide()
        self.chartView = QChartView(self.chart)
        self.chartView.setRenderHint(QPainter.Antialiasing)
        self.layout = QtWidgets.QVBoxLayout(self)
        self.layout.addWidget(self.computationTimeLabel)
        self.layout.addWidget(self.buttonVectors)
        self.layout.addWidget(self.buttonFilters)
        self.layout.addWidget(self.chartView)
        self.axisY = QValueAxis()
        self.axisY.setRange(-150, 150)
        self.chart.addAxis(self.axisY, QtCore.Qt.AlignLeft)
        self.buttonVectors.clicked.connect(self.runVectorCalculations)
        self.buttonFilters.clicked.connect(self.runTruncation)
The initializer here defines the UI. Specifically, the code above adds two buttons: Vectors and Filters. It also creates a label to display the computation time. Then, it generates the chart.
The code also adds all UI components to the vertical layout. It specifies the y-axis for plotting and associates two methods, runVectorCalculations and runTruncation, with the buttons. The user invokes those methods by pressing the corresponding buttons.
Next, define runVectorCalculations as follows:
    @QtCore.Slot()
    def runVectorCalculations(self):
        vectorsLib = ctypes.CDLL(vectorsLibName)
        vectorsLib.performCalculations.restype = ctypes.c_double
        computationTime = vectorsLib.performCalculations()
        self.computationTimeLabel.setText(f"Computation time: {computationTime:.2f} ms")
The method loads the DLL using ctypes. Then, it sets the return type of the performCalculations function, which comes from the Vectors.dll library, to double. Finally, the runVectorCalculations method invokes the function from the library, and the label displays the resulting computation time.
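Because ctypes assumes that a foreign function returns a C int unless told otherwise, each exported function needs its return type declared before the first call. One way to keep these declarations in one place is a small table applied once after loading a library. The sketch below is an illustration under assumptions: the SIGNATURES table and the declare_signatures helper are hypothetical names, and the entries simply mirror the functions this article exports from its two DLLs.

```python
import ctypes

# Hypothetical table: exported function name -> (return type, argument types).
SIGNATURES = {
    "performCalculations": (ctypes.c_double, []),
    "getSignalLength": (ctypes.c_int, []),
    "getInputSignal": (ctypes.POINTER(ctypes.c_double), []),
    "getInputSignalAfterFilter": (ctypes.POINTER(ctypes.c_double), []),
}

def declare_signatures(lib, signatures=SIGNATURES):
    """Set restype/argtypes on every listed function that lib exports."""
    for name, (restype, argtypes) in signatures.items():
        func = getattr(lib, name, None)
        if func is None:
            continue  # this library does not export the function
        func.restype = restype
        func.argtypes = argtypes
    return lib

# Hypothetical usage on Windows, once the DLL exists:
# vectorsLib = declare_signatures(ctypes.CDLL(vectorsLibName))
```

Declaring argtypes as an empty list also makes ctypes reject accidental extra arguments at call time.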
Next, define the runTruncation method in the main.py file under the MainWindowWidget class:
    @QtCore.Slot()
    def runTruncation(self):
        filtersLib = ctypes.CDLL(filtersLibName)
        self.chart.removeAllSeries()
        filtersLib.generateSignal()
        filtersLib.getInputSignal.restype = ctypes.POINTER(ctypes.c_double)
        signal = filtersLib.getInputSignal()
        seriesSignal = self.prepareSeries(signal, filtersLib.getSignalLength())
        self.chart.addSeries(seriesSignal)
        filtersLib.truncate()
        filtersLib.getInputSignalAfterFilter.restype = ctypes.POINTER(ctypes.c_double)
        signalAfterFilter = filtersLib.getInputSignalAfterFilter()
        seriesSignalAfterFilter = self.prepareSeries(signalAfterFilter, filtersLib.getSignalLength())
        self.chart.addSeries(seriesSignalAfterFilter)
        seriesSignal.attachAxis(self.axisY)
        seriesSignalAfterFilter.attachAxis(self.axisY)
As before, you first load the DLL. Then, you remove all series from the chart. This way, the chart clears whenever the user clicks the Filters button before plotting new data.
The method then retrieves the input signal and adds it to the chart using a helper method, prepareSeries, which copies data from the underlying pointer into a Qt line series. It also invokes the truncate function, retrieves the filtered signal, and adds it to the plot. To support this, add the following helper method to the MainWindowWidget class in the main.py file:
    def prepareSeries(self, inputData, length):
        series = QLineSeries()
        # Start at index 0 so that the first sample is plotted too
        for i in range(length):
            series.append(i, inputData[i])
        return series
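To see what prepareSeries relies on, the sketch below reads from a ctypes pointer in the same way, without needing the DLL. It assumes a small in-memory ctypes array as a stand-in for the buffer returned by getInputSignal, and the helper name pointer_to_list is hypothetical.

```python
import ctypes

def pointer_to_list(pointer, length):
    """Copy length doubles from a ctypes pointer into a Python list."""
    # Slicing a ctypes pointer reads exactly the requested elements.
    return pointer[:length]

# Stand-in for the buffer owned by the DLL (an assumption for illustration).
buffer = (ctypes.c_double * 4)(1.5, -2.0, 70.0, 100.0)
signal = ctypes.cast(buffer, ctypes.POINTER(ctypes.c_double))

values = pointer_to_list(signal, len(buffer))
print(values)  # [1.5, -2.0, 70.0, 100.0]
```

A pointer carries no length information, so the element count has to come from elsewhere. That is why the tutorial asks the DLL for it through getSignalLength.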
The last step is to add the MainWindowWidget to the Qt application and show the application window. Do this by adding the following statements to the bottom of the main.py file.
if __name__ == "__main__":
    app = QtWidgets.QApplication([])
    widget = MainWindowWidget()
    widget.resize(600, 400)
    widget.show()
    sys.exit(app.exec())
Add Dependencies
To add dependencies for the Python application, you must create two subfolders: Vectors and Filters.
Vectors
In the Vectors folder, create two files: Vectors.h and Vectors.cpp. Here are the contents of Vectors.h:
#pragma once
#include <iostream>
#include <chrono>
using namespace std;
extern "C" __declspec(dllexport) double performCalculations();
After the #pragma once preprocessor directive, the header includes two standard headers: iostream and chrono. Then, it brings in the std namespace and exports one function, performCalculations. Later, you will call this function from the main Python app.
Now, modify Vectors.cpp by including Vectors.h and defining three functions:
#include "Vectors.h"
int* generateRamp(int startValue, int len) {
    int* ramp = new int[len];
    for (int i = 0; i < len; i++) {
        ramp[i] = startValue + i;
    }
    return ramp;
}
double dotProduct(int* vector1, int* vector2, int len) {
    double result = 0;
    for (int i = 0; i < len; i++) {
        result += (double)vector1[i] * vector2[i];
    }
    return result;
}
double msElapsedTime(chrono::system_clock::time_point start) {
    auto end = chrono::system_clock::now();
    return chrono::duration_cast<chrono::milliseconds>(end - start).count();
}
In the above code, the first function, generateRamp, creates a synthetic vector of a given length. Its values start at startValue and increase by one for each of the len elements. Then, the dotProduct function multiplies two input vectors element-wise and sums the products. Finally, the helper function msElapsedTime uses the C++ chrono library to measure the code execution time.
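As a quick sanity check of what these functions compute, here is a small pure-Python sketch of the same ramp and dot product logic. It is a reference for reasoning about expected values, not the code the DLL runs, and the snake_case function names are chosen for this illustration only.

```python
def generate_ramp(start_value, length):
    """Return [start_value, start_value + 1, ...] with the given length."""
    return [start_value + i for i in range(length)]

def dot_product(vector1, vector2):
    """Sum of the element-wise products, returned as a float like the C++ version."""
    return float(sum(a * b for a, b in zip(vector1, vector2)))

ramp1 = generate_ramp(0, 4)    # [0, 1, 2, 3]
ramp2 = generate_ramp(100, 4)  # [100, 101, 102, 103]

# 0*100 + 1*101 + 2*102 + 3*103 = 614
print(dot_product(ramp1, ramp2))  # 614.0
```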
Next, prepare another helper function below to generate two vectors, calculate their dot product, and measure the code execution time. You can use this function later to measure the code performance.
double performCalculations() {
    const int rampLength = 1024;
    const int trials = 100000;
    auto ramp1 = generateRamp(0, rampLength);
    auto ramp2 = generateRamp(100, rampLength);
    auto start = chrono::system_clock::now();
    for (int i = 0; i < trials; i++) {
        dotProduct(ramp1, ramp2, rampLength);
    }
    auto elapsed = msElapsedTime(start);
    // Release the buffers allocated by generateRamp
    delete[] ramp1;
    delete[] ramp2;
    return elapsed;
}
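The measurement pattern used here (record a start time, repeat the work, report the elapsed milliseconds) can be sketched in Python as well. The helper name time_trials is hypothetical, and the trial count is deliberately much smaller than in the DLL. Timings from this sketch say nothing about the DLL's performance; it only illustrates the pattern.

```python
import time

def time_trials(func, trials):
    """Call func() trials times and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    for _ in range(trials):
        func()
    return (time.perf_counter() - start) * 1000.0

# Two ramps of length 1024, matching the values used by the DLL.
ramp1 = list(range(0, 1024))
ramp2 = list(range(100, 1124))

elapsed = time_trials(lambda: sum(a * b for a, b in zip(ramp1, ramp2)), trials=100)
print(f"Computation time: {elapsed:.2f} ms")
```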
Now, create the CMakeLists.txt file below in the Vectors directory. You will use this file for building.
add_library (Vectors SHARED "Vectors.cpp" "Vectors.h")
if (CMAKE_VERSION VERSION_GREATER 3.12)
  set_property(TARGET Vectors PROPERTY CXX_STANDARD 20)
endif()
The above file sets the build target to a DLL using the SHARED keyword in the add_library statement.
Filters
Now you can implement a second DLL similarly. Again, you use CMake (see Filters/CMakeLists.txt). First, create the Filters.h header file in the Filters folder:
#pragma once
#define _USE_MATH_DEFINES
#include <math.h>
#include <iostream>
#include <algorithm>
#include <cstdlib>
using namespace std;
#define SIGNAL_LENGTH 1024
#define SIGNAL_AMPLITUDE 100
#define NOISE_AMPLITUDE 50
#define THRESHOLD 70
double inputSignal[SIGNAL_LENGTH];
double inputSignalAfterFilter[SIGNAL_LENGTH];
extern "C" __declspec(dllexport) int getSignalLength();
extern "C" __declspec(dllexport) void generateSignal();
extern "C" __declspec(dllexport) void truncate();
extern "C" __declspec(dllexport) double* getInputSignal();
extern "C" __declspec(dllexport) double* getInputSignalAfterFilter();
The header exports five functions:
- getSignalLength: Returns the length of the synthetic signal defined under SIGNAL_LENGTH
- generateSignal: Creates the synthetic signal and stores it in the inputSignal global variable
- truncate: Filters the signal by capping all values above THRESHOLD
- getInputSignal: Returns the generated signal (stored in the inputSignal variable)
- getInputSignalAfterFilter: Returns the filtered signal (stored in the inputSignalAfterFilter variable)
Define these functions under Filters.cpp using the code below.
#include "Filters.h"
double* getInputSignal() {
    return inputSignal;
}
double* getInputSignalAfterFilter() {
    return inputSignalAfterFilter;
}
int getSignalLength() {
    return SIGNAL_LENGTH;
}
Then, add the generateSignal function, which creates a sine wave with random additive noise:
void generateSignal() {
    auto phaseStep = 2 * M_PI / SIGNAL_LENGTH;
    for (int i = 0; i < SIGNAL_LENGTH; i++) {
        auto phase = i * phaseStep;
        auto noise = rand() % NOISE_AMPLITUDE;
        inputSignal[i] = SIGNAL_AMPLITUDE * sin(phase) + noise;
    }
}
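To reason about the values the chart will show, the signal can be reproduced in a few lines of Python. This is a sketch under assumptions: it uses Python's random module with a fixed seed instead of C's rand(), so the individual samples will not match the DLL's output, only the shape and range.

```python
import math
import random

SIGNAL_LENGTH = 1024
SIGNAL_AMPLITUDE = 100
NOISE_AMPLITUDE = 50

def generate_signal(rng=None):
    """One sine period plus integer noise in [0, NOISE_AMPLITUDE)."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    phase_step = 2 * math.pi / SIGNAL_LENGTH
    # randrange mirrors rand() % NOISE_AMPLITUDE: an integer from 0 to 49.
    return [
        SIGNAL_AMPLITUDE * math.sin(i * phase_step) + rng.randrange(NOISE_AMPLITUDE)
        for i in range(SIGNAL_LENGTH)
    ]

signal = generate_signal()
print(len(signal), min(signal), max(signal))
```

Because the noise term is never negative, it shifts the sine wave upward rather than centering on it. Every sample stays between -100 and 149, which fits inside the y-axis range of -150 to 150 set in the Python UI.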
Finally, the truncate function appears below:
void truncate() {
    for (int i = 0; i < SIGNAL_LENGTH; i++) {
        inputSignalAfterFilter[i] = min(inputSignal[i], (double)THRESHOLD);
    }
}
This function scans the inputSignal and replaces every value larger than THRESHOLD with THRESHOLD. Other values are unmodified. For example, a value of 100 is replaced by 70, while a value of 50 does not change.
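The same capping rule, written as a short Python sketch for illustration (the function below is a stand-in for the exported C function, not a binding to it):

```python
THRESHOLD = 70

def truncate(signal, threshold=THRESHOLD):
    """Cap every value at threshold; smaller values pass through unchanged."""
    return [min(value, threshold) for value in signal]

print(truncate([100, 50, 70, -20]))  # [70, 50, 70, -20]
```

Only the upper side is capped, so negative samples keep their original values.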
Compilation
Before running the application, you need to compile both libraries. Click the Build > Build All menu item, and the DLL files will be available in the out/build/x64-release folder (Vectors/Vectors.dll and Filters/Filters.dll). You can also build the DLLs using the command line.
Once you generate the DLLs, copy them to Main-app/Dependencies.
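If you rebuild often, the copy step can be scripted. The following sketch assumes the folder layout described above (one subfolder per library inside the build output) and uses a hypothetical helper name, copy_dlls. Adjust the paths if your build output lives elsewhere.

```python
import shutil
from pathlib import Path

def copy_dlls(build_dir, dest_dir, names=("Vectors", "Filters")):
    """Copy <build_dir>/<name>/<name>.dll into dest_dir and return the new paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        source = Path(build_dir) / name / f"{name}.dll"
        # copy2 raises FileNotFoundError if the DLL has not been built yet
        copied.append(Path(shutil.copy2(source, dest)))
    return copied

# Hypothetical usage from the Arm64EC.Porting folder:
# copy_dlls("out/build/x64-release", "Main-app/Dependencies")
```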
Results
Finally, run the main application by following these steps:
- Open the terminal and change the folder to the location of your Arm64EC.Porting project (for example, c:\Users\<User-name>\source\repos\Arm64EC.Porting).
- If you used the virtual environment to install Python packages, activate the environment by typing Scripts\activate.bat. Ensure that you have installed Python dependencies by calling pip install -r requirements.txt.
- Then, change the folder to Main-app, and type python main.py.
The application now launches.
First, click the button labeled Vectors. The application will run dot product calculations, and after a short while, the total computation time will display on a label. Then, click the second button, Filters. This action will plot the original signal in blue and the filtered signal in green, like the screenshot below.
You have now confirmed that the Python Qt-based application can load both dependencies. You can use this approach to implement computation-intense calculations in C++ DLLs and rapidly build the UI using Python bindings for Qt.
Next Steps
Developers often use Python and Qt for rapid UI prototyping. In this article, you learned how to prepare a Qt-based Python application with two C/C++-based DLL dependencies. You configured both DLLs for x64.
Before Arm64EC was available, porting the app containing several C++ dependencies to Arm64 was time-consuming. It required changing the platform to Arm64 and recompiling the entire solution simultaneously, often involving code changes.
Now, you can use Arm64EC to port selected dependencies to Arm64 by simply switching the build target from x64 to Arm64EC. This approach is helpful when the binary’s source is no longer available, you cannot build the dependencies yet because compilers or toolchains are missing or buggy, or you do not control third-party ecosystem dependencies but still want to load them at runtime for the user’s benefit.
The Arm64EC dependencies can work with x64 dependencies in the same process, enabling you to port your app and benefit from Arm64’s native computations. In the next article of this series, you will learn how to port the C++ dependencies to Arm64 using Arm64EC. Specifically, you will learn how to configure your C++ projects to build DLLs for Arm64EC. Then, you will discover how to load those dependencies in the Python application launched with Python for Arm64.
Try Arm64EC on Windows 11 and use the Windows Dev Kit 2023 as a cost-effective way to test your app on Arm64 devices running Windows.