RAFT

Reusable CUDA-accelerated primitives for high-performance GPU ML

RAFT offers header-only CUDA-accelerated primitives and optional shared libraries, plus lightweight Python wrappers, to speed development of GPU-based machine-learning and data-mining applications.

Overview

Who should use RAFT?

RAFT is aimed at developers and engineers building custom GPU‑accelerated machine‑learning or data‑mining pipelines. It provides low‑level, reusable primitives that can be composed into higher‑level algorithms, making it ideal for library authors and performance‑focused teams.

What RAFT provides

The library is a C++ header‑only template collection with an optional shared library to reduce compile times. It includes host‑accessible runtime APIs, integration with the RAPIDS Memory Manager (RMM), and multi‑dimensional array abstractions: the owning mdarray container and the non‑owning mdspan view. Python access is available through pylibraft, and distributed multi‑GPU support via raft-dask. RAFT once bundled vector‑search utilities, but those have migrated to cuVS; users who need approximate nearest‑neighbor (ANN) search should adopt cuVS directly.
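
To make the mdarray/mdspan split concrete, here is a minimal C++ sketch; it assumes the core headers and the `raft::make_device_matrix` factory as they appear in RAFT's own examples, so treat names and signatures as illustrative.

```cpp
// Sketch only: assumes RAFT's core C++ headers and the make_device_matrix
// factory as shown in the project's documentation examples.
#include <raft/core/device_resources.hpp>
#include <raft/core/device_mdarray.hpp>

int main() {
  // The handle wraps a CUDA stream plus cuBLAS/cuSOLVER handles shared by primitives.
  raft::device_resources handle;

  // mdarray: an owning, RMM-backed 2-D array in device memory.
  auto dataset = raft::make_device_matrix<float, int>(handle, 1000, 64);

  // mdspan: a non-owning view over that memory; this is what RAFT primitives
  // accept as input and output arguments.
  auto dataset_view = dataset.view();
  (void)dataset_view;  // silence unused-variable warnings in this sketch

  handle.sync_stream();  // wait for asynchronous work queued on the handle's stream
  return 0;              // dataset releases its device memory automatically here
}
```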

Getting started and deployment

Install RAFT from source or via RAPIDS package channels, link against RMM, and create a raft::device_resources handle to manage streams and library handles. Use the C++ primitives directly or call them from Python with pylibraft. For multi‑node workloads, integrate raft-dask with Dask clusters to scale across GPUs.
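
A single‑GPU setup could look roughly like the sketch below. The RMM pool configuration (including the 1 GiB initial size) is an assumption for illustration; constructor arguments vary across RMM releases, so check the RMM documentation for your version.

```cpp
// Setup sketch: assumes RMM's pool_memory_resource and RAFT's device_resources
// handle; the 1 GiB initial pool size is an arbitrary placeholder.
#include <cstddef>
#include <raft/core/device_resources.hpp>
#include <raft/core/device_mdarray.hpp>
#include <rmm/mr/device/per_device_resource.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>

int main() {
  // Route device allocations through an RMM pool to avoid repeated cudaMalloc/cudaFree.
  rmm::mr::pool_memory_resource<rmm::mr::device_memory_resource> pool_mr{
      rmm::mr::get_current_device_resource(), std::size_t{1} << 30};
  rmm::mr::set_current_device_resource(&pool_mr);

  // One handle per thread/stream: it carries the CUDA stream and the vendor
  // library handles that RAFT primitives expect.
  raft::device_resources handle;

  // Allocations made through RAFT's mdarray factories now come from the pool.
  auto scratch = raft::make_device_matrix<float, int>(handle, 4096, 128);
  (void)scratch;

  handle.sync_stream();
  return 0;
}
```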

Highlights

Header‑only C++ templates with optional shared library for faster builds
Host‑accessible runtime APIs usable without a CUDA compiler
Lightweight Python wrappers (`pylibraft`) and multi‑GPU Dask integration (`raft-dask`)
Memory management via RAPIDS Memory Manager (RMM) with owning `mdarray` containers and non‑owning `mdspan` views

Pros

  • High‑performance CUDA acceleration for core ML primitives
  • Reduces development time by providing reusable building blocks
  • Seamless interoperability with other RAPIDS libraries
  • Supports both single‑GPU and multi‑GPU distributed workflows

Considerations

  • Requires CUDA‑aware development expertise
  • Vector‑search functionality has moved to cuVS
  • Limited to C++ and Python interfaces
  • Steeper learning curve for developers unfamiliar with RAPIDS ecosystem

Managed products teams compare with

When teams consider RAFT, these hosted platforms usually appear on the same shortlist.

Pinecone

Managed vector database for AI applications

Qdrant

Open-source vector database

Zilliz

Managed vector database service for AI applications

Looking for a hosted option? These are the services engineering teams benchmark against before choosing open source.

Fit guide

Great for

  • GPU‑centric application developers needing custom ML kernels
  • Library authors building higher‑level algorithms on top of primitives
  • Performance engineers optimizing existing CUDA codebases
  • Teams deploying distributed GPU workloads with Dask

Not ideal when

  • Data scientists seeking out‑of‑the‑box analytics tools
  • CPU‑only environments without GPU access
  • Projects that require built‑in vector‑search/ANN capabilities
  • Users unfamiliar with C++ template programming

How teams use it

Custom clustering algorithm

Leverage RAFT sparse operations and random blob generation to implement a GPU‑accelerated clustering pipeline.
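
A sketch of the data‑generation and distance building blocks such a pipeline rests on is shown below. It assumes `raft::random::make_blobs` and the mdspan‑based `raft::distance::pairwise_distance` from RAFT's examples; recent releases move parts of this surface toward cuVS, so verify availability in your version. The clustering loop itself (assignment and update steps) is left to the application.

```cpp
// Building-block sketch: assumes raft::random::make_blobs and the mdspan-based
// raft::distance::pairwise_distance; the assignment/update loop of the custom
// clustering algorithm is intentionally left out.
#include <raft/core/device_resources.hpp>
#include <raft/core/device_mdarray.hpp>
#include <raft/random/make_blobs.cuh>
#include <raft/distance/distance.cuh>

int main() {
  raft::device_resources handle;
  const int n_samples  = 1000;
  const int n_features = 16;
  const int n_clusters = 8;

  auto dataset = raft::make_device_matrix<float, int>(handle, n_samples, n_features);
  auto labels  = raft::make_device_vector<int, int>(handle, n_samples);

  // Synthetic Gaussian blobs to exercise the pipeline end to end.
  raft::random::make_blobs(handle, dataset.view(), labels.view(), n_clusters);

  // All-pairs distances: the primitive a custom assignment step
  // (e.g. nearest-centroid search) can be built on.
  auto dists = raft::make_device_matrix<float, int>(handle, n_samples, n_samples);
  raft::distance::pairwise_distance(handle, dataset.view(), dataset.view(), dists.view(),
                                    raft::distance::DistanceType::L2SqrtExpanded);

  handle.sync_stream();
  return 0;
}
```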

High‑throughput linear algebra in simulations

Use RAFT dense matrix utilities to replace CPU BLAS calls, achieving significant speedups in scientific codes.
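
As one illustration, a dense matrix product routed through RAFT might look like the following sketch. The mdspan‑based `raft::linalg::gemm` overload and its argument order are assumptions drawn from the linear‑algebra API documentation and should be checked against your release.

```cpp
// Hedged sketch: assumes the mdspan-based raft::linalg::gemm overload in
// <raft/linalg/gemm.cuh> (a thin wrapper over cuBLAS); argument order and
// defaults may differ between releases.
#include <raft/core/device_resources.hpp>
#include <raft/core/device_mdarray.hpp>
#include <raft/linalg/gemm.cuh>

int main() {
  raft::device_resources handle;
  const int m = 1024, k = 512, n = 256;

  // Inputs are left uninitialized here purely for brevity.
  auto A = raft::make_device_matrix<float, int>(handle, m, k);
  auto B = raft::make_device_matrix<float, int>(handle, k, n);
  auto C = raft::make_device_matrix<float, int>(handle, m, n);

  // C = A * B, executed by cuBLAS on the handle's stream.
  raft::linalg::gemm(handle, A.view(), B.view(), C.view());

  handle.sync_stream();
  return 0;
}
```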

Python distance computation

Call `pylibraft` distance primitives from CuPy arrays to compute pairwise Euclidean distances without leaving the Python ecosystem.

Distributed GPU training with Dask

Integrate `raft-dask` to share resources and synchronize data across multiple GPU nodes during model training.

Tech snapshot

Cuda 70%
C++ 19%
Jupyter Notebook 7%
Python 2%
Cython 1%
Shell 1%

Tags

nearest-neighbors, primitives, anns, linear-algebra, llm, vector-search, machine-learning, clustering, information-retrieval, sparse, gpu, vector-similarity, building-blocks, statistics, cuda, solvers, random-sampling, neighborhood-methods, vector-store, distance

Frequently asked questions

Which programming languages does RAFT support?

Core primitives are written in C++; lightweight Python wrappers are provided via `pylibraft` and `raft-dask`.

How does RAFT manage GPU memory?

RAFT relies on the RAPIDS Memory Manager (RMM) for device allocations and provides the owning `mdarray` container (viewed through non‑owning `mdspan`), so allocation and deallocation are handled automatically.

Is RAFT compatible with other RAPIDS libraries?

Yes, RAFT is designed to interoperate with cuDF, cuML, and other RAPIDS components.

Where can I find vector‑search functionality?

Vector‑search and clustering have been moved to the dedicated cuVS library; use cuVS for those routines.

How do I install RAFT?

Install via conda or build from source following the RAFT documentation; ensure RMM is available in the environment.

Project at a glance

Status: Active
Stars: 972
Watchers: 972
Forks: 225
License: Apache-2.0
Repo age: 6 years
Last commit: 18 hours ago
Primary language: Cuda
