CUDA

Great Reads

by Wayne Wood
Verify the execution efficiency of a short CUDA program that uses the Thrust library
by Wayne Wood
Verify the execution efficiency of a series of short .NET 4.0 parallel programming samples
by ObiWan_MCC
A C# SMTP server (receiver).
by billconan, kavinguy
This article describes the implementation of a neural network with CUDA.

All Articles

by Wayne Wood
Verify the execution efficiency of a short CUDA program that uses the Thrust library
by Wayne Wood
Verify the execution efficiency of a series of short .NET 4.0 parallel programming samples
by ObiWan_MCC
A C# SMTP server (receiver).
by billconan, kavinguy
This article describes the implementation of a neural network with CUDA.
by Intel
In this blog post, we highlight one particular class of low precision networks named binarized neural networks (BNNs), the fundamental concepts underlying this class, and introduce a Neon CPU and GPU implementation.
by Intel
Boosting Performance with Intel® FPGA SDK for OpenCL™ Technology
by Dan Buskirk
Understanding the organization of a Visual Studio project for CUDA development
by Nick Kopp
Performing base64 encoding on a graphics processing unit using CUDAfy.NET (CUDA in .NET).
by CodeProject
Version 2.6.5. Our fast, free, self-hosted Artificial Intelligence Server for any platform, any language
by Dhruv__Patel
In this article we compare and contrast SYCL and CUDA, and discuss how the oneAPI compiler can work with SYCL.
by grilialex
Flow and tools to convert Xilinx bitstreams to C source code for programming FPGA/CPLD
by Nick Kopp
This article builds upon the earlier High Performance Queries: GPU vs. PLINQ vs. LINQ and ports this to also support OpenCL devices and adds benchmarking so you can easily compare performance.
by Ryan Scott White
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels and provides an on-the-fly view of your CUDA code.
by Mike Lanzetta
In this post, I'll walk you through how to get one of the most popular toolkits up and running on Windows, and run through and explain some fun examples.
by Roman Ginzburg
A text overlay filter and a JPEG/JPEG2000 encoder using transform filters.
by hax_
Introduction to the open-source hxGrid library for distributed computing. Main benefits of the library: cluster uses only idle time of Windows 2000/XP/Vista workstation (no dedicated workstations required); easy to use; free.
by phoaivu
GPU Implementation of Extended Gaussian mixture model for Background Subtraction
by Afzaal Ahmad Zeeshan
In this post, I am going to walk you through creating your own central hub that lets your connected devices authenticate people using a facial recognition system.
by ChaoJui
High-performance, good-quality image blurring
by Adam Wojnar
Simple .jp2/.j2k viewer using Kakadu executables demonstration pack for decoding
by Ryan Scott White
An assembler/compiler for AMD’s GCN (Graphics Core Next) architecture assembly language
by Intel
Theano is a Python library developed at the LISA lab to define, optimize, and evaluate mathematical expressions, including the ones with multi-dimensional arrays (numpy.ndarray)
by Alesiani Marco
A Wave PDE simulation using GPGPU capabilities
by John Michael Hauck
It has never been easier for C# desktop developers to write code that takes advantage of the amazing computing performance of modern graphics cards. In this post I will share some techniques for solving a simple (but still interesting) image analysis problem. Source Code https://www.assembla.com/co
by Mark H Bishop
Tutorial: GPU computing with JCuda and Nsight (Eclipse)
by Android on Intel
This tutorial shows how to use two powerful features of OpenCL™ 2.0: enqueue_kernel functions that allow you to enqueue kernels from the device and work_group_scan_exclusive_add and work_group_scan_inclusive_add
by Maxim Kartavenkov
This article describes how to make an H.264 video encoder DirectShow filter using the NVIDIA encoder API in C#
by Intel
This article introduces the beta release of the oneAPI product to facilitate heterogeneous programming.
by Nick Kopp
How to get 30x performance increase for queries by using your Graphics Processing Unit (GPU) instead of LINQ and PLINQ.
by Igor Gribanov
Performing linear static analysis on a tetrahedral mesh with a little bit of help from a third-party solver.
by Vangos
This post will show you how to build OpenCV for Windows with CUDA.
by Packt Publishing
In this section, we'll take our first steps in using the low-level TensorFlow API.
by Robert Mueller-Albrecht
Using the Intel® oneAPI Math Kernel Library SYCL API
by Joren Heit
A Hybrid Framework Code-Generator for CUDA
by Kerem Kat
Process webcam images on the CPU and GPU with OpenCV, CUDA and C++ AMP
by Arthur V. Ratz
In this article, we'll demonstrate an approach that increases the performance (by up to 600%) of code that implements the conventional distribution counting algorithm (DCA) using the NVIDIA CUDA 8.0 Runtime API
by Mark H Bishop
Getting Cuda started on a VS Express budget
by Thomas Daniels
In this article, let’s dive into Keras, a high-level library for neural networks.
by Adnan Boz
From spam filters to movie recommendation and face detection, machine learning algorithms are now used everywhere to make the machine think for us. But running these algorithms requires high computational power and, in most cases, supercomputers. This is where the 500-core GPUs step in...
by ChaoJui
Image processing with a burst of performance from CUDA
by Bartlomiej Filipek
A little guide about modern OpenGL and why it gives us so much value.
by Kevin Drzycimski
Unroll loops at compile time, deduced by a template argument.
by Intel
This document demonstrates how a linear algebra Jacobi iterative method written in CUDA* can be migrated to the SYCL* heterogeneous programming language.
by Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing
by headmyshoulder
odeint v2 - Solving ordinary differential equations in C++
by Max R McCarty
OWASP's #6 most vulnerable security risk has to do with keeping secrets secret.
by Andrew Kirillov
This article describes the implementation of parallel computations using plain C#.
by Debdatta Basu
Examine the various approaches to implementing Radix sort on the GPU
by Arthur V. Ratz
This article is a practical guide on using Intel® Threading Building Blocks (TBB) and OpenMP libraries for C++ based on the example of delivering parallel scalable code that implements Burrows-Wheeler Transformation (BWT) algorithm.
by CatchExAs
How to make best use of current technology for computationally intensive applications?
by manythreads
The sixth article in a series on portable multithreaded programming using OpenCL™, in which Rob Farber discusses how to calculate data in OpenCL™ and render it with OpenGL within the same application.
by Shao Voon Wong
Finding lexicographical permutations on GPU
by Jeremy C. Ong
A quick 5-minute introduction to porting a CUDA app to Data Parallel C++ (DPC++)
by Maxim Kartavenkov
This article describes how to make DirectShow filters in .NET; it consists of BaseClasses and a couple of samples
by Matthew Faithfull
Querysoft Open Runtime: Architecture compatibility aspect.
by Shao Voon Wong
How to convert parallel C++ ray-tracing code to CUDA, then to SYCL 2020 via Intel® DPC++
by Sushil Sh.
How to set up an Android development environment using Eclipse and Android Studio.
by Philippe Kirsanov
A small class representing DateTime in seconds elapsed since "01 Jan, 0001 00:00:00".
by headmyshoulder, Denis Demidov
This article shows how ordinary differential equations can be solved with OpenCL. In detail it shows how odeint - a C++ library for ordinary differential equations - can be adapted to work with VexCL - a library for OpenCL. The resulting performance is studied on two examples.
by Dhruv__Patel
In this article we compare and contrast SYCL and CUDA, introduce oneAPI, and discuss how the oneAPI compiler can work with SYCL.
by Alex Mikunov
Runtime MSIL Code Instrumentation and .NET Metadata Extensions
by Dino Konstantopoulos
Running Theano with an Nvidia 1070 GPU on Windows 10, with CUDA 8 and Visual Studio 2015
by Intel
TotalView includes a set of tools that provide scientific and academic developers with control over processes and thread execution, along with deep visibility into program states and data.
by Jeff B. Cromwell
Granger Causality in both R and C#.NET with open source libraries.
by Nick Kopp
Ultra high quality frequency domain image rotation on a GPU.
by Nick Kopp
An introduction to using Cudafy.NET to perform processing on a GPU
by Sergiu Ovidiu Oprea
This article is a hands-on look at the process of converting CUDA to SYCL.
by Denis Demidov
This article is an introduction to VexCL. VexCL is a vector expression template library created to ease C++-based OpenCL development.
by grilialex
How-To Embed Xilinx FPGA Configuration Data to AVRILOS