LOS GATOS, CALIF. – CacheQ Systems, Inc. has announced GPU support for its QCC Acceleration Platform, a heterogeneous compute development environment that delivers faster performance and reduced development time for computer architectures including multi-core processors, GPUs and field-programmable gate arrays (FPGAs).
“Demand for hardware acceleration using GPUs and other heterogeneous compute hardware is growing exponentially,” remarks Clay Johnson, CEO and co-founder of CacheQ Systems, developer of heterogeneous acceleration solutions. “Our goal is to simplify high-performance data center and edge-computing application development. The QCC Acceleration Platform meets that goal and will enable new solutions across a variety of applications.”
GPU deployment has advanced at a rapid pace in the last five years. The $25 billion-a-year industry is expected to continue growing at a compound annual growth rate (CAGR) of approximately 33% through 2028.
The QCC Acceleration Platform Advantage
Heterogeneous compute systems, such as multicore processors and GPUs, as well as FPGAs attached to these processing systems, have relied on software tools supported by hardware vendors and the open-source community. These tools have traditionally relied on software developers to express the parallelism in their code to the compiler, typically through hardware-specific APIs such as CUDA from NVIDIA, HIP from AMD, and oneAPI from Intel.
Other efforts attempt to support pragmas embedded in C, C++, and Fortran through OpenACC, OpenMP, and OpenCL. All require deep knowledge of the target hardware to control memory copies and synchronization events. Developers must also create teams of threads, manually remove loop-carried dependencies and race conditions, and add summations, all to achieve performance and correct code behavior on parallel compute units.
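To make the pragma-based burden concrete, here is a minimal C sketch that contrasts a plain serial summation with the same loop annotated for OpenMP. The function names are hypothetical and chosen only for illustration. The example is not taken from CacheQ material and says nothing about how QCC processes such code.

```c
/* Plain serial dot product: no annotations and no hardware knowledge. */
double dot_serial(const double *a, const double *b, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Pragma-based version: the developer must state that the loop is
 * parallel and that `sum` is a reduction. Without the reduction clause,
 * the threads would race on `sum` and the result could be wrong. */
double dot_openmp(const double *a, const double *b, int n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

When compiled without OpenMP support, the pragma is ignored and the second function simply runs serially, so both versions can be checked against each other on the same inputs.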
CacheQ QCC is the first compiler platform to automatically extract parallelism from standard C, C++, and Fortran code. It does not require the developer to explicitly communicate parallelism to the compiler. QCC automatically accelerates applications using a variety of hardware, exceeding the performance of pragma-based approaches. It can also approach the performance of hand-coded API solutions while requiring minimal hardware knowledge. This allows a developer to write generic code and target high-performance hardware at compile time, either without refactoring the code or with refactoring that is not specific to the target hardware and is easy to verify functionally.
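As an illustration of what hardware-neutral, functionally verifiable refactoring can look like, the following C sketch rewrites a loop that carries a value between iterations into one whose iterations are independent, then checks that both forms agree. All names are hypothetical, and the example is an assumption about the general technique, not a description of what QCC requires or performs.

```c
/* Original form: `pos` carries a value from one iteration to the next,
 * so iteration i depends on the result of iteration i-1. */
void gather_carried(const int *src, int *dst, int n, int stride)
{
    int pos = 0;
    for (int i = 0; i < n; ++i) {
        dst[i] = src[pos];
        pos += stride;
    }
}

/* Refactored form: the index is computed from i alone, so iterations
 * are independent. This is still plain C with no target-specific code. */
void gather_independent(const int *src, int *dst, int n, int stride)
{
    for (int i = 0; i < n; ++i)
        dst[i] = src[i * stride];
}

/* Functional check: run both forms on the same input and compare.
 * Returns 1 if the outputs are identical, 0 otherwise. */
int gather_forms_match(const int *src, int n, int stride,
                       int *out_a, int *out_b)
{
    gather_carried(src, out_a, n, stride);
    gather_independent(src, out_b, n, stride);
    for (int i = 0; i < n; ++i)
        if (out_a[i] != out_b[i])
            return 0;
    return 1;
}
```

Because the refactored loop is ordinary C, the comparison can run on any host CPU before a parallel target is chosen.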
Based on the proprietary CacheQ virtual machine (CQVM), the QCC Acceleration Platform is a heterogeneous compute development environment that converts serial high-level language (HLL) code into a parallel representation in less than 30 seconds, even for the most complex designs. Before generating a compute executable, it supports code profiling, utilization estimates, performance simulation, memory configuration and partitioning across a variety of compute engines, including GPUs, x86, Arm and RISC-V processors, and FPGAs.
Features
Features include a development environment with uniform drivers, protected containers and support for multiple boards from multiple vendors. Its design analysis offers profiling, performance simulation and memory activity reporting. An optimization capability adds code unrolling, user-driven memory configuration, and automatic and user-guided partitioning.
The FPGA implementation includes a resource estimator, pre-configured shells, multiple boards and parts, and implementation tool automation. The memory implementation supports automatic integration, multi-port/multi-access and striping.
Availability and Pricing
The QCC Acceleration Platform is shipping now in limited volume, with general availability projected for late 2023. The 0.18 release supports GPUs from NVIDIA and AMD, FPGA accelerator boards from Xilinx, and CPUs from Intel, AMD, Arm, Apple, and RISC-V.
Pricing is available on request.
Visit the CacheQ website for additional information, or to request a demonstration or early access to the QCC Acceleration Platform.