GFQL Cypher Benchmark: CPU/GPU DataFrames vs Neo4j#

Run Cypher graph queries and analytics directly on Python dataframes — no database required. This benchmark compares Graphistry’s local Cypher (CPU and GPU) against Neo4j + GDS on the same end-to-end pipeline.

Dataset	Neo4j + GDS	GFQL Cypher (CPU)	GFQL Cypher (GPU)	GPU speedup vs Neo4j
Twitter (2.4M edges)	13.83s	2.55s	0.30s	46x
GPlus (30M edges)	>187s	75.78s	3.33s	>56x

Times are the warm median of 5 runs after 2 warmup iterations, measured on the DGX host dgx-spark with a GB10 GPU.
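The speedup column can be reproduced from the table with a single division. The sketch below copies the times from the table; the `speedup` helper and the dictionary layout are illustrative names, not part of the benchmark code. Because the GPlus Neo4j time is a lower bound, the GPlus speedups are lower bounds as well.

```python
# Pipeline times in seconds, copied from the table above.
# The GPlus Neo4j value is a lower bound (the run did not finish),
# so any speedup derived from it is a lower bound too.
times = {
    "Twitter": {"neo4j": 13.83, "cpu": 2.55, "gpu": 0.30},
    "GPlus": {"neo4j": 187.0, "cpu": 75.78, "gpu": 3.33},
}


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


for name, t in times.items():
    gpu_x = speedup(t["neo4j"], t["gpu"])
    cpu_x = speedup(t["neo4j"], t["cpu"])
    print(f"{name}: GPU {gpu_x:.0f}x, CPU {cpu_x:.1f}x vs Neo4j")
```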

The pipeline#

One g.gfql(...) call — search, enrich with PageRank, search again:

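# pip install graphistry
# Assumes `g` is a graphistry graph already bound to your edge/node dataframes
# (with a `degree` node column) and that both cutoff variables are defined.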
result = g.gfql("""
    GRAPH g1 = GRAPH {
      MATCH (n)-[e]-(m)
      WHERE n.degree >= $degree_cutoff
    }
    GRAPH g2 = GRAPH {
      USE g1
      CALL graphistry.cugraph.pagerank.write()
    }
    GRAPH {
      USE g2
      MATCH (n)-[e]-(m)
      WHERE n.pagerank >= $pagerank_cutoff
    }
""",
    params={
        "degree_cutoff": degree_cutoff,
        "pagerank_cutoff": pagerank_cutoff,
    },
    engine="cudf",  # or "pandas" with igraph backend
)

GRAPH g1: find high-degree nodes and their neighbors
GRAPH g2: enrich g1 with PageRank scores (igraph on CPU, cugraph on GPU)
Final GRAPH: keep high-PageRank nodes and their neighbors
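To make the three stages concrete, here is a minimal pure-Python sketch of the same logic on a list of edge tuples. It is an illustration of the pipeline's semantics only, not how GFQL executes it: the real engine is vectorized over dataframes, and the PageRank here is a textbook power iteration rather than igraph or cugraph. All function names are hypothetical, and the sketch assumes edges are treated as directed for PageRank and undirected for the one-hop matches.

```python
from collections import defaultdict


def degrees(edges):
    """Undirected degree of every node that appears in an edge."""
    deg = defaultdict(int)
    for src, dst in edges:
        deg[src] += 1
        deg[dst] += 1
    return dict(deg)


def keep_edges_touching(edges, selected):
    """Edges with at least one endpoint in `selected` (undirected one-hop match)."""
    return [(s, d) for s, d in edges if s in selected or d in selected]


def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank; dangling mass is spread uniformly."""
    nodes = sorted({n for e in edges for n in e})
    out = defaultdict(list)
    for s, d in edges:
        out[s].append(d)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        nxt = {n: base for n in nodes}
        for s in nodes:
            for d in out[s]:
                nxt[d] += damping * rank[s] / len(out[s])
        rank = nxt
    return rank


def pipeline(edges, degree_cutoff, pagerank_cutoff):
    deg = degrees(edges)
    seeds = {n for n, k in deg.items() if k >= degree_cutoff}
    g1 = keep_edges_touching(edges, seeds)            # stage 1: high-degree nodes + neighbors
    scores = pagerank(g1)                             # stage 2: enrich with PageRank
    core = {n for n, r in scores.items() if r >= pagerank_cutoff}
    return keep_edges_touching(g1, core), scores      # stage 3: high-PageRank nodes + neighbors


# Toy graph: a hub "a" with three in-links, plus an unrelated edge.
toy_edges = [("b", "a"), ("c", "a"), ("d", "a"), ("e", "f")]
final_edges, scores = pipeline(toy_edges, degree_cutoff=2, pagerank_cutoff=0.3)
print(final_edges)
```

On the toy graph, the low-degree edge between "e" and "f" is dropped in the first stage, and the hub "a" is the only node that clears the PageRank cutoff.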

The same pipeline shape, different backends:

CPU: engine="pandas", backend="igraph"
GPU: engine="cudf", backend="cugraph"
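Since the two paths differ only in the engine setting, a script can choose one at runtime. The following is a small sketch, assuming that the presence of the `cudf` package is a good enough signal for the GPU path; `pick_engine` is a hypothetical helper, and an importable `cudf` does not by itself guarantee a usable GPU.

```python
import importlib.util


def pick_engine() -> str:
    """Return "cudf" when the cudf package is importable, else "pandas"."""
    return "cudf" if importlib.util.find_spec("cudf") is not None else "pandas"


engine = pick_engine()
# The chosen value would then be passed as g.gfql(..., engine=engine).
print(f"Selected engine: {engine!r}")
```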

The Neo4j equivalent requires ~30 lines of Cypher + GDS projection + batched writes (see Neo4j + GDS analog below).

Twitter (2.4M edges): exact 3-way comparison#

Stacked by workload phase: ETL (load + shape), Search (graph queries), Analytics (PageRank).

Neo4j total lifecycle: ~21.6s (6.0s import + 1.7s prep + 13.8s pipeline)
GFQL Cypher CPU: ~2.8s — 8x faster than Neo4j
GFQL Cypher GPU: ~0.4s — 54x faster than Neo4j

GPlus (30M edges): larger graph#

[Figure: GPlus, Neo4j (lower bound) vs GFQL Cypher CPU vs GFQL Cypher GPU]

Neo4j: >187s (lower bound — the transaction did not finish)
GFQL Cypher CPU: ~85.5s — still faster than Neo4j’s incomplete run
GFQL Cypher GPU: ~7.1s — >26x faster than Neo4j

Why this matters#

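The CPU path already beats Neo4j without a GPU: about 8x faster end to end on Twitter, and on GPlus it finishes in ~85.5s where the Neo4j run had not completed after 187s. You get Cypher-style graph search + PageRank directly on your dataframe, with no database to stand up or maintain.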

The GPU path accelerates every phase (ETL, search, and analytics) because cuDF and cuGraph are drop-in replacements for pandas and igraph under the same GFQL Cypher surface.

Neo4j + GDS analog#

The Neo4j equivalent of the same pipeline:

// 1. Mark seed nodes by degree
MATCH (n:Node)
SET n.seed = n.degree >= $cutoff;

// 2. Expand one hop from seeds
UNWIND $seed_ids AS sid
MATCH (s:Node) WHERE id(s) = sid
MATCH (s)-[r:LINK]-(target:Node)
SET target.in_subgraph = true, r.in_subgraph = true;

// 3. Project subgraph and run PageRank
CALL gds.graph.project.cypher(
  'subgraph',
  'MATCH (n:Node) WHERE n.in_subgraph RETURN id(n) AS id',
  'MATCH (a)-[r:LINK]->(b) WHERE r.in_subgraph
   RETURN id(a) AS source, id(b) AS target
   UNION ALL
   MATCH (a)-[r:LINK]->(b) WHERE r.in_subgraph
   RETURN id(b) AS source, id(a) AS target'
);
CALL gds.pageRank.write('subgraph', {writeProperty: 'pagerank'});

// 4. Keep high-PageRank core + one hop
MATCH (n:Node) WHERE n.pagerank >= $cutoff
SET n.core = true;
UNWIND $core_ids AS cid
MATCH (c:Node) WHERE id(c) = cid
MATCH (c)-[r:LINK]-(target:Node)
SET target.final = true, r.final = true;
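The `$seed_ids` and `$core_ids` parameters in the Cypher above have to be supplied by client code, typically in batches. The sketch below shows one way that batching could look in Python. It is an assumption-laden illustration: `run_query` is a hypothetical callback standing in for whatever driver session call you use, and the default batch size of 10,000 is arbitrary, not a value taken from the benchmark.

```python
from typing import Callable, Iterator, Sequence


def chunks(ids: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `ids` with at most `size` items each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Step 2 of the Cypher above, as a parameterized statement.
EXPAND_SEEDS = """
UNWIND $seed_ids AS sid
MATCH (s:Node) WHERE id(s) = sid
MATCH (s)-[r:LINK]-(target:Node)
SET target.in_subgraph = true, r.in_subgraph = true
"""


def expand_in_batches(
    seed_ids: Sequence[int],
    run_query: Callable[[str, dict], object],  # hypothetical: wraps a driver session
    batch_size: int = 10_000,                  # assumed value, tune for your setup
) -> int:
    """Run the one-hop expansion once per batch; return the number of batches."""
    batches = 0
    for batch in chunks(seed_ids, batch_size):
        run_query(EXPAND_SEEDS, {"seed_ids": list(batch)})
        batches += 1
    return batches


# Dry run with a stub that only records the parameters it receives.
recorded = []
n_batches = expand_in_batches(list(range(25)), lambda q, p: recorded.append(p), batch_size=10)
print(n_batches, [len(p["seed_ids"]) for p in recorded])
```

The same helper shape would apply to step 4 with `$core_ids`.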

Why the GFQL pipeline is shorter#

The difference in pipeline length above is not accidental. It reflects a design difference in how graphs flow through the system:

Graphs as first-class values. GFQL’s GRAPH { } constructors treat graphs as composable values that flow between pipeline stages. Each stage receives a graph, transforms it, and passes a graph to the next stage. Standard Cypher and GQL are constrained to paths and rows as output values, which forces the Neo4j pipeline into explicit property-flag marking, separate GDS projections, and batched write-back steps.

Multi-language, single engine. The GFQL engine is being designed to support Cypher and, over time, additional property graph query languages, all compiled to the same vectorized columnar execution backend. Users write in whichever declarative syntax they prefer, and the engine handles CPU/GPU dispatch transparently. See Cypher Syntax In GFQL for the current Cypher surface and Overview of GFQL for the broader GFQL design.

Modern execution without legacy constraints. Because GFQL does not inherit a database storage layer or a row-at-a-time execution model, it can represent intermediate graph results natively in columnar memory (Arrow / pandas / cuDF). That is what makes the CPU-to-GPU switch a configuration flag (engine="cudf") rather than a rewrite, and what keeps ETL, search, and analytics in the same in-process pipeline.

For more on the GFQL design and supported surface:

Cypher Syntax In GFQL — Cypher syntax through g.gfql("MATCH ...")
Overview of GFQL — GFQL design, features, and GPU acceleration
10 Minutes to GFQL — 10-minute introduction to GFQL

Benchmark environment#

Host: dgx-spark, GPU: GB10, driver 580.126.09
Container: graphistry/test-gpu:latest
Datasets: SNAP Twitter (2.4M edges) and GPlus (30M edges)
Measurement: median of 5 runs after 2 warmup iterations
Results rendered from saved JSON — this page does not rerun benchmarks
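The measurement rule above (median of 5 timed runs after 2 untimed warmups) can be sketched as a small harness. This is not the benchmark's actual harness, which is not shown on this page; `warm_median` is a hypothetical name, and the sketch assumes wall-clock timing with `time.perf_counter`.

```python
import statistics
import time


def warm_median(fn, runs=5, warmup=2):
    """Call `fn` `warmup` times untimed, then return (median, samples) over `runs` timed calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), samples


# Example with a stand-in workload; replace the lambda with the pipeline call.
median_s, samples = warm_median(lambda: sum(range(100_000)))
print(f"median of {len(samples)} runs: {median_s:.6f}s")
```

Reporting the median rather than the mean keeps a single slow run (for example, one hit by a cache miss or garbage collection) from skewing the result.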

Notebook version#

See demos/gfql/benchmark_filter_pagerank_cpu_gpu.ipynb for a notebook version of this writeup using the same saved DGX results.
