GraphZero

Zero-copy C++ graph engine to train PyTorch GNNs without loading the graph into RAM.

#Open Source #Developer Tools #Artificial Intelligence #GitHub

GraphZero: zero-copy C++ graph engine for PyTorch GNN training without RAM limits

Summary: GraphZero trains large PyTorch Graph Neural Networks by memory-mapping graph datasets directly from SSD instead of loading them into RAM. Its custom C++20 engine uses nanobind to expose the mapped data to PyTorch as zero-copy NumPy arrays, which allows training on graph datasets of up to 50GB on consumer hardware.
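
As a rough illustration of the zero-copy idea in the summary, the Python sketch below maps a file with the standard mmap module and wraps the mapping in a NumPy array without copying it. This is not GraphZero's code or API: the file name, the raw float32 layout, and the sizes are assumptions chosen for the example.

```python
import mmap
import os
import tempfile

import numpy as np

# Write a small float32 feature matrix to disk as raw bytes.
# (Hypothetical layout; GraphZero's real binary format is not described here.)
num_nodes, feat_dim = 1000, 16
features = np.arange(num_nodes * feat_dim, dtype=np.float32).reshape(num_nodes, feat_dim)
path = os.path.join(tempfile.mkdtemp(), "features.bin")
features.tofile(path)

# Map the file into the process address space. Pages are read from disk
# only when they are first touched.
with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

# Wrap the mapping as a NumPy array. No bytes are copied at this point.
view = np.frombuffer(mapped, dtype=np.float32).reshape(num_nodes, feat_dim)

# Gathering a mini-batch touches only the pages that hold these rows.
batch_ids = np.array([3, 42, 999])
batch = view[batch_ids]  # fancy indexing copies just the selected rows
```

In PyTorch, `torch.from_numpy` creates a tensor that shares memory with the NumPy array it is given, so a view like this could be handed on without another copy. PyTorch does warn when the source array is read-only, as it is here.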

What it does

GraphZero compiles graph data into optimized binary formats and memory-maps them with POSIX mmap, so data is paged in from SSD on demand during training. It hands the raw C++ pointers to PyTorch as zero-copy NumPy arrays. Neighbor sampling is multi-threaded with OpenMP and releases the Python GIL, which lets the sampling threads run in parallel and maximize disk I/O throughput.
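
The compile, map, and sample steps above can be sketched in a few lines of Python. The sketch assumes a CSR (compressed sparse row) layout stored in two raw int64 files; the listing does not say which binary format GraphZero actually uses, and the file names and the `sample_neighbors` helper are hypothetical. It is single-threaded, so the OpenMP parallelism and GIL release are not shown.

```python
import os
import tempfile

import numpy as np

# Toy directed graph as an edge list (src -> dst).
edges = np.array(
    [[0, 1], [0, 2], [0, 3], [1, 2], [2, 0], [2, 3], [3, 1]], dtype=np.int64
)
num_nodes = 4

# "Compile" step: sort edges by source and build CSR offsets and targets.
order = np.argsort(edges[:, 0], kind="stable")
indices = edges[order, 1]
indptr = np.zeros(num_nodes + 1, dtype=np.int64)
indptr[1:] = np.cumsum(np.bincount(edges[:, 0], minlength=num_nodes))

workdir = tempfile.mkdtemp()
indptr_path = os.path.join(workdir, "indptr.bin")    # hypothetical file name
indices_path = os.path.join(workdir, "indices.bin")  # hypothetical file name
indptr.tofile(indptr_path)
indices.tofile(indices_path)

# Training-time step: map both files read-only instead of loading them.
indptr_mm = np.memmap(indptr_path, dtype=np.int64, mode="r")
indices_mm = np.memmap(indices_path, dtype=np.int64, mode="r")


def sample_neighbors(node, fanout, rng):
    """Return up to `fanout` neighbors of `node`, sampled without replacement."""
    start, end = int(indptr_mm[node]), int(indptr_mm[node + 1])
    neighbors = indices_mm[start:end]  # a view into the mapped file
    if len(neighbors) <= fanout:
        return np.array(neighbors)
    return rng.choice(neighbors, size=fanout, replace=False)


rng = np.random.default_rng(0)
sampled = {node: sample_neighbors(node, 2, rng) for node in range(num_nodes)}
```

Only the offsets and the neighbor slice for each requested node are read, so the operating system pages in just the parts of the files that a mini-batch touches.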

Who it's for

It targets developers and researchers working with large-scale graph neural networks who face PyTorch out-of-memory crashes on standard hardware.

Why it matters

Massive graph datasets often do not fit in RAM. By streaming data directly from SSD instead of loading it all up front, GraphZero makes it possible to train large GNN models without out-of-memory crashes.