RAFT

Reusable CUDA-accelerated primitives for high-performance GPU ML

RAFT offers header-only CUDA-accelerated primitives and optional shared libraries, plus lightweight Python wrappers, to speed development of GPU-based machine-learning and data-mining applications.

Overview

Who should use RAFT?

RAFT is aimed at developers and engineers building custom GPU‑accelerated machine‑learning or data‑mining pipelines. It provides low‑level, reusable primitives that can be composed into higher‑level algorithms, making it ideal for library authors and performance‑focused teams.

What RAFT provides

The library is a C++ header‑only template collection with an optional shared library to reduce compile times. It includes host‑accessible runtime APIs, integration with the RAPIDS Memory Manager (RMM), and multi‑dimensional array abstractions: the owning mdarray container and the non‑owning mdspan view. Python access is available through pylibraft, and distributed multi‑GPU support via raft-dask. RAFT once bundled vector‑search utilities, but those have migrated to cuVS; users who need approximate nearest‑neighbor (ANN) search should adopt cuVS directly.
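
To make the mdarray/mdspan split concrete, here is a minimal C++ sketch; it assumes the core headers and the `raft::make_device_matrix` factory as they appear in RAFT's own examples, so treat names and signatures as illustrative.

```cpp
// Sketch only: assumes RAFT's core C++ headers and the make_device_matrix
// factory as shown in the project's documentation examples.
#include <raft/core/device_resources.hpp>
#include <raft/core/device_mdarray.hpp>

int main() {
  // The handle wraps a CUDA stream plus cuBLAS/cuSOLVER handles shared by primitives.
  raft::device_resources handle;

  // mdarray: an owning, RMM-backed 2-D array in device memory.
  auto dataset = raft::make_device_matrix<float, int>(handle, 1000, 64);

  // mdspan: a non-owning view over that memory; this is what RAFT primitives
  // accept as input and output arguments.
  auto dataset_view = dataset.view();
  (void)dataset_view;  // silence unused-variable warnings in this sketch

  handle.sync_stream();  // wait for asynchronous work queued on the handle's stream
  return 0;              // dataset releases its device memory automatically here
}
```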

Getting started and deployment

Install RAFT from source or via RAPIDS package channels, link against RMM, and create a raft::device_resources handle to manage streams and library handles. Use the C++ primitives directly or call them from Python with pylibraft. For multi‑node workloads, integrate raft-dask with Dask clusters to scale across GPUs.
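
A single‑GPU setup could look roughly like the sketch below. The RMM pool configuration (including the 1 GiB initial size) is an assumption for illustration; constructor arguments vary across RMM releases, so check the RMM documentation for your version.

```cpp
// Setup sketch: assumes RMM's pool_memory_resource and RAFT's device_resources
// handle; the 1 GiB initial pool size is an arbitrary placeholder.
#include <cstddef>
#include <raft/core/device_resources.hpp>
#include <raft/core/device_mdarray.hpp>
#include <rmm/mr/device/per_device_resource.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>

int main() {
  // Route device allocations through an RMM pool to avoid repeated cudaMalloc/cudaFree.
  rmm::mr::pool_memory_resource<rmm::mr::device_memory_resource> pool_mr{
      rmm::mr::get_current_device_resource(), std::size_t{1} << 30};
  rmm::mr::set_current_device_resource(&pool_mr);

  // One handle per thread/stream: it carries the CUDA stream and the vendor
  // library handles that RAFT primitives expect.
  raft::device_resources handle;

  // Allocations made through RAFT's mdarray factories now come from the pool.
  auto scratch = raft::make_device_matrix<float, int>(handle, 4096, 128);
  (void)scratch;

  handle.sync_stream();
  return 0;
}
```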

Highlights

Header‑only C++ templates with optional shared library for faster builds
Host‑accessible runtime APIs usable without a CUDA compiler
Lightweight Python wrappers (`pylibraft`) and multi‑GPU Dask integration (`raft-dask`)
Memory management via RAPIDS Memory Manager (RMM) with owning `mdarray` containers and non‑owning `mdspan` views

Pros

  • High‑performance CUDA acceleration for core ML primitives
  • Reduces development time by providing reusable building blocks
  • Seamless interoperability with other RAPIDS libraries
  • Supports both single‑GPU and multi‑GPU distributed workflows

Considerations

  • Requires CUDA‑aware development expertise
  • Vector‑search functionality has moved to cuVS
  • Limited to C++ and Python interfaces
  • Steeper learning curve for developers unfamiliar with RAPIDS ecosystem

Managed products teams compare with

When teams consider RAFT, these hosted platforms usually appear on the same shortlist.

Pinecone

Managed vector database for AI applications

Qdrant

Open-source vector database

Zilliz

Managed vector database service for AI applications

Looking for a hosted option? These are the services engineering teams benchmark against before choosing open source.

Fit guide

Great for

  • GPU‑centric application developers needing custom ML kernels
  • Library authors building higher‑level algorithms on top of primitives
  • Performance engineers optimizing existing CUDA codebases
  • Teams deploying distributed GPU workloads with Dask

Not ideal when

  • Data scientists seeking out‑of‑the‑box analytics tools
  • CPU‑only environments without GPU access
  • Projects that require built‑in vector‑search/ANN capabilities
  • Users unfamiliar with C++ template programming

How teams use it

Custom clustering algorithm

Leverage RAFT sparse operations and random blob generation to implement a GPU‑accelerated clustering pipeline.
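
A sketch of the data‑generation and distance building blocks such a pipeline rests on is shown below. It assumes `raft::random::make_blobs` and the mdspan‑based `raft::distance::pairwise_distance` from RAFT's examples; recent releases move parts of this surface toward cuVS, so verify availability in your version. The clustering loop itself (assignment and update steps) is left to the application.

```cpp
// Building-block sketch: assumes raft::random::make_blobs and the mdspan-based
// raft::distance::pairwise_distance; the assignment/update loop of the custom
// clustering algorithm is intentionally left out.
#include <raft/core/device_resources.hpp>
#include <raft/core/device_mdarray.hpp>
#include <raft/random/make_blobs.cuh>
#include <raft/distance/distance.cuh>

int main() {
  raft::device_resources handle;
  const int n_samples  = 1000;
  const int n_features = 16;
  const int n_clusters = 8;

  auto dataset = raft::make_device_matrix<float, int>(handle, n_samples, n_features);
  auto labels  = raft::make_device_vector<int, int>(handle, n_samples);

  // Synthetic Gaussian blobs to exercise the pipeline end to end.
  raft::random::make_blobs(handle, dataset.view(), labels.view(), n_clusters);

  // All-pairs distances: the primitive a custom assignment step
  // (e.g. nearest-centroid search) can be built on.
  auto dists = raft::make_device_matrix<float, int>(handle, n_samples, n_samples);
  raft::distance::pairwise_distance(handle, dataset.view(), dataset.view(), dists.view(),
                                    raft::distance::DistanceType::L2SqrtExpanded);

  handle.sync_stream();
  return 0;
}
```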

High‑throughput linear algebra in simulations

Use RAFT dense matrix utilities to replace CPU BLAS calls, achieving significant speedups in scientific codes.
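
As one illustration, a dense matrix product routed through RAFT might look like the following sketch. The mdspan‑based `raft::linalg::gemm` overload and its argument order are assumptions drawn from the linear‑algebra API documentation and should be checked against your release.

```cpp
// Hedged sketch: assumes the mdspan-based raft::linalg::gemm overload in
// <raft/linalg/gemm.cuh> (a thin wrapper over cuBLAS); argument order and
// defaults may differ between releases.
#include <raft/core/device_resources.hpp>
#include <raft/core/device_mdarray.hpp>
#include <raft/linalg/gemm.cuh>

int main() {
  raft::device_resources handle;
  const int m = 1024, k = 512, n = 256;

  // Inputs are left uninitialized here purely for brevity.
  auto A = raft::make_device_matrix<float, int>(handle, m, k);
  auto B = raft::make_device_matrix<float, int>(handle, k, n);
  auto C = raft::make_device_matrix<float, int>(handle, m, n);

  // C = A * B, executed by cuBLAS on the handle's stream.
  raft::linalg::gemm(handle, A.view(), B.view(), C.view());

  handle.sync_stream();
  return 0;
}
```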

Python distance computation

Call `pylibraft` distance primitives from CuPy arrays to compute pairwise Euclidean distances without leaving the Python ecosystem.

Distributed GPU training with Dask

Integrate `raft-dask` to share resources and synchronize data across multiple GPU nodes during model training.

Tech snapshot

Cuda 70%
C++ 19%
Jupyter Notebook 7%
Python 2%
Cython 1%
Shell 1%

Tags

nearest-neighbors, primitives, anns, linear-algebra, llm, vector-search, machine-learning, clustering, information-retrieval, sparse, gpu, vector-similarity, building-blocks, statistics, cuda, solvers, random-sampling, neighborhood-methods, vector-store, distance

Frequently asked questions

Which programming languages does RAFT support?

Core primitives are written in C++; lightweight Python wrappers are provided via `pylibraft` and `raft-dask`.

How does RAFT manage GPU memory?

RAFT relies on the RAPIDS Memory Manager (RMM) for device allocations and provides the owning `mdarray` container (viewed through non‑owning `mdspan`), so allocation and deallocation are handled automatically.

Is RAFT compatible with other RAPIDS libraries?

Yes, RAFT is designed to interoperate with cuDF, cuML, and other RAPIDS components.

Where can I find vector‑search functionality?

Vector‑search and clustering have been moved to the dedicated cuVS library; use cuVS for those routines.

How do I install RAFT?

Install via conda or build from source following the RAFT documentation; ensure RMM is available in the environment.

Project at a glance

Status: Active
Stars: 972
Watchers: 972
Forks: 225
License: Apache-2.0
Repo age: 6 years
Last commit: 18 hours ago
Primary language: Cuda
