Building Native Windows on Arm Apps with Python

15 May 2023
This article demonstrates the convenience of using native Arm Python 3.11 on Arm-powered devices to experience up to a threefold performance boost over using it in emulation mode.

The Arm architecture brings power and efficiency to edge computing and mobile devices, especially for newer Windows on Arm (WoA) devices.

Python, a widely used programming language, now has native support for Arm platforms using Windows. Starting with Python 3.11, an official installer for WoA is now available, so it’s time to start targeting WoA.

Python on Arm

CPython is the official Python implementation and ships with the standard library. It compiles source code into bytecode before interpreting it, while the interpreter itself and any C extension modules are built for a specific platform. You can install it using the installer for Arm64, use an older version available from NuGet, or build it directly from source.
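To make the bytecode step concrete, here is a minimal sketch that uses the standard `dis` module to look at what CPython compiles a function into. The function `add_noise` is a made-up example, not part of this article's sample app. The bytecode depends on the CPython version rather than on the CPU, so it is the interpreter executing it that differs between x64 and Arm64.

```python
import dis


def add_noise(sample, noise):
    """Hypothetical helper whose bytecode we want to inspect."""
    return sample + 0.1 * noise


# The compiled bytecode is stored on the function's code object.
bytecode = add_noise.__code__.co_code

# dis.get_instructions yields one entry per bytecode instruction.
instructions = [ins.opname for ins in dis.get_instructions(add_noise)]

print(type(bytecode).__name__, len(bytecode), "bytes")
print(instructions)
```

The exact instruction names vary between Python versions, so treat the printed list as illustrative only.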

Prerequisites

This tutorial uses the Windows Dev Kit 2023 (Project Volterra) for development. However, you can achieve a similar performance boost on the Surface Pro 9 5G, on the Lenovo X13s, or on Apple silicon devices by using Parallels Desktop to run Windows 11.

Setting Up

To set up native Arm support for Python, you need native Arm C build tools. As explained in the Arm documentation, you can install them either through Visual Studio 2022 Community for desktop development with C++ or by using the standalone installer for the build tools.

Once you have Visual Studio installed, switch to Visual Studio Code (VS Code) for Arm64 for development so you can use its included Python tools.

After installing VS Code, install the Python extensions.

For this demo, you will install both the Arm64-specific Python and the standard non-Arm Python to compare and contrast them.

Start by installing Arm64 Python 3.11 and then x64 Python 3.11. Choose the default settings for simplicity and consistency.

By default, Python installs both packages in the following file path:

Users\<User_name>\AppData\Local\Programs\Python

There are two subfolders: Python311 for the x64 version and Python311-arm64 for the Arm64 version. By running python.exe from each subfolder, you see they are built with different 64-bit Microsoft C compilers: either AMD64 or ARM64.

Alternatively, to see the different Python versions, you could call py -3.11 or py -3.11-arm64.
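If you prefer to check the build from inside Python, the following sketch collects the same information with the standard `platform` module. The function name `describe_interpreter` is an assumption made for this illustration. On Windows, the compiler string is expected to contain either AMD64 or ARM64, matching the banner described above.

```python
import platform
import struct
import sys


def describe_interpreter():
    """Collect a few facts about the interpreter running this script."""
    return {
        "version": platform.python_version(),
        # Compiler used to build this interpreter, as shown in the startup banner.
        "compiler": platform.python_compiler(),
        # Machine type as reported to this process.
        "machine": platform.machine(),
        # Pointer size in bits: 64 for both the x64 and the Arm64 builds.
        "pointer_bits": struct.calcsize("P") * 8,
        "executable": sys.executable,
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Running this file once with each python.exe gives a quick side-by-side comparison of the two installations.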

The installation process is straightforward. However, because x64 and Arm64-based Python use different C compilers, Python packages can have compatibility and porting issues. Traditionally, you install Python packages using pip, which automatically installs the dependencies. First, pip looks for a prebuilt package (called a wheel) that is compatible with the interpreter, either platform-independent or platform-specific. If it cannot find one, it downloads the source code and builds the package locally.
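As a rough illustration of how a platform-specific wheel is matched, the sketch below derives the platform part of a wheel filename from `sysconfig.get_platform()`. The helper `wheel_platform_tag` is hypothetical and only applies the basic rule of replacing hyphens and periods with underscores; pip's real compatibility logic considers more tags than this. On Windows, the platform name should be win-amd64 for x64 Python and win-arm64 for Arm64 Python.

```python
import sysconfig


def wheel_platform_tag(platform_name):
    """Turn a sysconfig-style platform name into the form used in wheel filenames."""
    return platform_name.replace("-", "_").replace(".", "_")


current = sysconfig.get_platform()
print(current, "->", wheel_platform_tag(current))
```

A wheel whose filename ends in a different platform tag than the one printed here cannot be used by that interpreter, which is why the two Python installations may resolve the same package differently.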

Python Packages on Arm64

If you are writing Python packages to take advantage of Arm64, you must ensure you compile your packages for Arm64, not x64. This problem is not present for pure (platform-independent) Python packages.

Make the Python directory your working directory:

BAT
cd \Users\<User_name>\AppData\Local\Programs\Python

Now set up x64 by typing the following command:

BAT
Python311\python.exe -m pip install --upgrade pip

This upgrades pip to the most recent version.

Now, install the NumPy package, which you will use later to implement your sample application. To install NumPy, type:

BAT
Python311\Scripts\pip.exe install numpy

You’ll see that pip downloads NumPy’s platform-specific wheel for x64 Python.

Now, repeat the procedure for the Arm64 version of Python:

BAT
Python311-arm64\python.exe -m pip install --upgrade pip
Python311-arm64\Scripts\pip.exe install numpy

For Arm64, there was no platform-specific NumPy wheel at the time of writing. So, pip downloads the source code and builds the package to create a local Arm64 wheel.
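To confirm that each installation ended up with NumPy, you could probe an interpreter from a script. The sketch below is one possible approach using only the standard library; `probe_interpreter` and `PROBE` are names invented for this example. It launches the given python.exe, asks it for its compiler string and whether numpy can be found, and parses the JSON reply.

```python
import json
import subprocess
import sys

# One-line program executed inside the interpreter being probed.
PROBE = (
    "import importlib.util, json, platform;"
    "print(json.dumps({"
    "'compiler': platform.python_compiler(),"
    "'has_numpy': importlib.util.find_spec('numpy') is not None"
    "}))"
)


def probe_interpreter(python_exe):
    """Run the probe in another interpreter and return its parsed reply."""
    completed = subprocess.run(
        [python_exe, "-c", PROBE],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


# Probes the current interpreter; pass the path to Python311\python.exe
# or Python311-arm64\python.exe to check those installations instead.
print(probe_interpreter(sys.executable))
```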

Development

You now have all the tools needed to implement the actual Python app.

Start by creating a new file, sample.py, in a working directory (this article uses one named PythonOnWoa).

Then, import the NumPy and time packages.

Python
import numpy as np
import time

The first package is for numerical computations and the second is for measuring the computation time.

Next, define a function that calculates a signal’s fast Fourier transform (FFT). Here, the signal is composed of a single-frequency sine wave with some random noise.

Repeat the FFT multiple times (trial_count) to have a stable estimate of the computation time.

Python
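def perform_sin_fft(signal_length, frequency, trial_count):
    """Return the total time in seconds for trial_count noisy-sine FFTs."""
    start = time.time()

    for _ in range(trial_count):
        # Build a sine wave sampled over [0, 2*pi] and add uniform random noise.
        ramp = np.linspace(0, 2 * np.pi, signal_length)
        noise = np.random.rand(signal_length)
        input_signal = np.sin(ramp * frequency) + 0.1 * noise

        # Only the elapsed time matters, so the spectrum is discarded.
        np.fft.fft(input_signal)

    # Signal generation happens inside the loop, so it is included in the total.
    return time.time() - start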

The above function returns the total time (in seconds) needed for calculating the FFT.
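For intuition about what the FFT computes, here is a small pure-Python sketch of the discrete Fourier transform, evaluated directly rather than with the fast algorithm that NumPy uses. The names `naive_dft` and `dominant_bin` are assumptions for this illustration, and the noise term is left out so the result is deterministic. For a clean sine wave, the largest spectral peak should land on the bin that matches the sine frequency.

```python
import cmath
import math


def naive_dft(signal):
    """Direct O(n^2) evaluation of the discrete Fourier transform."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dominant_bin(signal):
    """Index of the largest-magnitude bin in the first half of the spectrum."""
    spectrum = naive_dft(signal)
    half = len(signal) // 2
    return max(range(1, half), key=lambda k: abs(spectrum[k]))


signal_length = 64  # kept small because this version is quadratic
frequency = 5       # number of full cycles across the signal
sine = [
    math.sin(2 * math.pi * frequency * t / signal_length)
    for t in range(signal_length)
]
print(dominant_bin(sine))
```

This direct version is far too slow for the signal lengths used in this article, which is exactly why the benchmark relies on NumPy's FFT.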

To measure the performance, invoke the perform_sin_fft function for various signal lengths.

Python
signal_lengths = [2**10, 2**11, 2**12, 2**13, 2**14]
trial_count = 5000

for signal_length in signal_lengths:
    # Use a quarter of the signal length as the sine frequency.
    frequency = signal_length // 4
    computation_time = perform_sin_fft(signal_length, frequency, trial_count)
    print(f"Signal length {signal_length}, Computation time {computation_time:.3f} s")
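If you want a per-call figure rather than a single total, one alternative is the standard `timeit` module, which repeats a measurement several times so you can report the best and median values. This is a sketch of that variation, not the method used for the results in this article. `time_workload` and `sample_workload` are hypothetical names, and the workload is a stand-in that you would replace with the FFT call.

```python
import statistics
import timeit


def time_workload(workload, repeats=5, calls=100):
    """Return (best, median) seconds per call over several timed runs."""
    totals = timeit.repeat(workload, repeat=repeats, number=calls)
    per_call = [total / calls for total in totals]
    return min(per_call), statistics.median(per_call)


def sample_workload():
    # Stand-in for the FFT call; any zero-argument callable works here.
    sum(i * i for i in range(1000))


best, median = time_workload(sample_workload)
print(f"best {best:.6f} s, median {median:.6f} s per call")
```

Reporting the median alongside the best time makes it easier to spot runs that were disturbed by other activity on the machine.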

Now run this script using Arm64 and non-Arm64 Python 3.11 to measure the performance difference:

BAT
.\Python311\python.exe <path_to_your_sample.py>
.\Python311-arm64\python.exe <path_to_your_sample.py>

The first command executes the script with x64 Python, which runs in emulation mode on the Arm device. The computation times depend on the signal length. Specifically, for 16,384 points, the computation time is 6.86 seconds.

The second command uses Arm64 Python, producing much shorter computation times. The same 16,384-point computation takes 2.72 seconds, about 40 percent of the time needed in emulation mode (x64). In other words, native Arm64 Python runs this workload about two and a half times faster than emulated x64 Python.
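The two ways of stating the result come from the same pair of numbers. As a quick check of the arithmetic, the sketch below computes both the speedup factor and the fraction of time remaining from the 16,384-point timings quoted above. The function names are chosen for this example only.

```python
def speedup(baseline_seconds, improved_seconds):
    """How many times faster the improved run is than the baseline."""
    return baseline_seconds / improved_seconds


def time_fraction(baseline_seconds, improved_seconds):
    """Share of the baseline time that the improved run still needs."""
    return improved_seconds / baseline_seconds


emulated_x64 = 6.86  # seconds, 16,384-point run in x64 emulation
native_arm64 = 2.72  # seconds, same run on Arm64 Python

print(f"speedup: {speedup(emulated_x64, native_arm64):.2f}x")
print(f"time fraction: {time_fraction(emulated_x64, native_arm64):.1%}")
```

This prints a speedup of 2.52x and a time fraction of 39.7%, which match the rounded figures in the text.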

This graph illustrates the computation times and the corresponding performance boosts.

The Future of Python on Arm

Python 3.11 with native Arm64 support presents a massive opportunity for Python developers looking to get the most out of their Arm-powered devices on Windows 11. As more developers add Arm64 support to their Python packages, you will see even more performance improvements.

One example is this Linaro demonstration of porting TensorFlow to Arm64, which displays impressive speed improvements and offers tremendous possibilities for AI practitioners, data scientists, and researchers who rely on the ease and power of Python.

Conclusion

This article walked you through installing native Arm64 Python 3.11 on Windows 11, including setting up your development environment to ensure all the necessary tools are in place.

You wrote a simple script that applied a fast Fourier transform to a signal and saw the performance improvement that Arm64 Python unlocked. Gains like this encourage broader support for WoA. Many companies are jumping on board to port libraries and the toolset, so they can employ Arm64 to accelerate Python workloads.

Get started with WoA today and try Python 3.11 from the official WoA installer to get the power you need with the efficiency you demand from your Arm devices.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.