.. _gfql-translate:

Translate Between SQL, Pandas, Cypher, and GFQL
=================================================

This guide provides a comparison between **SQL**, **Pandas**, **Cypher**, and **GFQL**, helping you translate familiar queries into GFQL.

Introduction
------------

GFQL (GraphFrame Query Language) is designed to be intuitive for users familiar with SQL, Cypher, or dataframe libraries like Pandas and Spark. By comparing equivalent queries across these languages, you can quickly grasp GFQL's syntax and benefits, and start using its graph querying capabilities within your workflows.

GFQL operates on graph DataFrames - graphs represented as node and edge DataFrames. This DataFrame-native approach enables seamless integration with the PyData ecosystem and natural vectorization for both CPU and GPU processing.
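
As a concrete reference point, the sketch below builds a small pair of node and edge DataFrames of the kind the Pandas snippets in this guide assume. The column names (``id``, ``type``, ``src``, ``dst``, ``weight``) and the sample rows are illustrative assumptions, and the binding step is guarded so the block still runs when PyGraphistry is not installed.

.. code-block:: python

    import pandas as pd

    # Hypothetical sample data: one row per node, one row per edge
    nodes_df = pd.DataFrame({
        'id': ['Alice', 'Bob', 'Acme'],
        'type': ['person', 'person', 'company'],
    })
    edges_df = pd.DataFrame({
        'src': ['Alice', 'Bob', 'Alice'],
        'dst': ['Bob', 'Acme', 'Acme'],
        'type': ['friend', 'works_at', 'works_at'],
        'weight': [0.9, 0.4, 0.7],
    })

    try:
        import graphistry
        # Bind both DataFrames into a graph; 'src', 'dst', and 'id' name the key columns
        g = graphistry.edges(edges_df, 'src', 'dst').nodes(nodes_df, 'id')
    except ImportError:
        g = None  # PyGraphistry not installed; the DataFrames above are still usable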

GFQL accepts both **native chain syntax** (``g.gfql([n(), e(), n()])``) and
**Cypher strings** (``g.gfql("MATCH ...")``). Most examples below show both
forms. GFQL also extends Cypher with ``GRAPH { }`` constructors for
graph-state results and composable multi-stage pipelines — see
:doc:`/gfql/cypher` for the full Cypher-in-GFQL guide.

Who Is This Guide For?
----------------------

- **Data Scientists:** Familiar with Pandas or SQL, exploring graph relationships.
- **Engineers:** Integrating graph queries into applications.
- **DBAs:** Understanding how GFQL complements SQL for graph data.
- **Graph Specialists:** Experienced with Cypher, integrating graph queries into Python.

Common Graph and Query Tasks
----------------------------

We'll cover a range of common graph and query tasks:

.. contents::
   :depth: 2
   :local:

Finding Nodes with Specific Properties
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Objective**: Find all nodes where the ``type`` is ``"person"``.

**SQL**

.. code-block:: sql

    SELECT * FROM nodes
    WHERE type = 'person';

**Pandas**

.. code-block:: python

    people_nodes_df = nodes_df[ nodes_df['type'] == 'person' ]

**Cypher**

.. code-block:: cypher

    MATCH (n {type: 'person'})
    RETURN n;

**GFQL (chain syntax)**

.. code-block:: python

    from graphistry import n

    # df[['id', 'type', ...]]
    g.gfql([ n({"type": "person"}) ])._nodes

**GFQL (Cypher syntax)**

.. code-block:: python

    # Row result — same as standard Cypher MATCH/RETURN
    g.gfql("MATCH (n {type: 'person'}) RETURN n")._nodes

    # Graph result — GFQL extension, keeps graph state
    g.gfql("GRAPH { MATCH (n {type: 'person'}) }")._nodes

**Explanation**:

- **GFQL chain**: ``n({"type": "person"})`` filters nodes where ``type`` is ``"person"``. ``g.gfql([...])`` applies this filter to the graph ``g``, and ``._nodes`` retrieves the resulting nodes. The performance is similar to that of Pandas (CPU) or cuDF (GPU).
- **GFQL Cypher**: The same query as a Cypher string. ``MATCH ... RETURN`` gives row output; ``GRAPH { MATCH ... }`` keeps graph state (both ``_nodes`` and ``_edges``).

.. graphviz::

   digraph find_nodes {
       node [shape=ellipse];
       person1 [label="person", style="filled,bold", fillcolor="#90EE90", penwidth=3, color="#228B22"];
       person2 [label="person", style="filled,bold", fillcolor="#90EE90", penwidth=3, color="#228B22"];
       company1 [label="company", shape=box, style=filled, fillcolor="#D3D3D3", color="#A9A9A9", fontcolor="#696969"];
       person1 -> company1 [color="#A9A9A9"];
   }

---

Exploring Relationships Between Nodes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Objective**: Find all edges connecting nodes of type ``"person"`` to nodes of type ``"company"``.

**SQL**

.. code-block:: sql

    SELECT e.*
    FROM edges e
    JOIN nodes n1 ON e.src = n1.id
    JOIN nodes n2 ON e.dst = n2.id
    WHERE n1.type = 'person' AND n2.type = 'company';

**Pandas**

.. code-block:: python

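    # Attach the source and destination node types to each edge.
    # Renaming up front avoids column-name collisions between the two merges.
    src_types = nodes_df[['id', 'type']].rename(columns={'id': 'src', 'type': 'type_src'})
    dst_types = nodes_df[['id', 'type']].rename(columns={'id': 'dst', 'type': 'type_dst'})
    merged_df = edges_df.merge(src_types, on='src').merge(dst_types, on='dst')

    result = merged_df[
        (merged_df['type_src'] == 'person') &
        (merged_df['type_dst'] == 'company')
    ]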

**Cypher**

.. code-block:: cypher

    MATCH (n1 {type: 'person'})-[e]->(n2 {type: 'company'})
    RETURN e;

**GFQL (chain syntax)**

.. code-block:: python

    from graphistry import n, e_forward

    # df[['src', 'dst', ...]]
    g.gfql([
        n({"type": "person"}), e_forward(), n({"type": "company"})
    ])._edges

**GFQL (Cypher syntax)**

.. code-block:: python

    # Graph result — keeps matched subgraph with edges
    g.gfql(
        "GRAPH { MATCH (n1 {type: 'person'})-[e]->(n2 {type: 'company'}) }"
    )._edges

    # Row result — returns edge properties as rows
    g.gfql(
        "MATCH (n1 {type: 'person'})-[e]->(n2 {type: 'company'}) RETURN e"
    )._nodes

**Explanation**:

- **GFQL chain**: Starts from nodes of type ``"person"``, traverses forward edges, and reaches nodes of type ``"company"``. This version starts to gain the legibility and maintainability benefits of graph query syntax for graph tasks, and maintains the performance benefits of automatically vectorized pandas and GPU-accelerated cuDF.
- **GFQL Cypher**: ``GRAPH { MATCH ... }`` returns the matched subgraph (graph state with ``_edges``); ``MATCH ... RETURN e`` returns edge properties as rows.
- **Same-path constraints**: Use ``where`` to relate attributes across steps
  (same-path scope only; see :doc:`/gfql/where`).

.. graphviz::

   digraph relationships {
       rankdir=LR;
       person [label="person", style="filled,bold", fillcolor="#87CEEB", penwidth=3, color="#4682B4"];
       company [label="company", shape=box, style="filled,bold", fillcolor="#FFFACD", penwidth=3, color="#DAA520"];
       person -> company [label="works_at", style=bold, color="#228B22", penwidth=2];
   }

---

Performing Multi-Hop Traversals
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Objective**: Find nodes that are two hops away from node ``"Alice"``.

**SQL**

.. code-block:: sql

    WITH first_hop AS (
        SELECT e1.dst AS node_id
        FROM edges e1
        WHERE e1.src = 'Alice'
    ),
    second_hop AS (
        SELECT e2.dst AS node_id
        FROM edges e2
        JOIN first_hop fh ON e2.src = fh.node_id
    )
    SELECT * FROM nodes
    WHERE id IN (SELECT node_id FROM second_hop);

**Pandas**

.. code-block:: python

    first_hop = edges_df[ edges_df['src'] == 'Alice' ]['dst']
    second_hop = edges_df[ edges_df['src'].isin(first_hop) ]['dst']
    result_nodes_df = nodes_df[ nodes_df['id'].isin(second_hop) ]

**Cypher**

.. code-block:: cypher

    MATCH (n {id: 'Alice'})-->()-->(m)
    RETURN m;

**GFQL (chain syntax)**

.. code-block:: python

    from graphistry import n, e_forward

    # df[['id', ...]]
    g.gfql([
        n({g._node: "Alice"}), e_forward(), e_forward(), n(name='m')
    ])._nodes.query('m')

**GFQL (Cypher syntax)**

.. code-block:: python

    # Row result — return 2-hop destinations
    g.gfql("MATCH (n {id: 'Alice'})-->()-->(m) RETURN m")._nodes

    # Graph result — return the 2-hop subgraph
    g.gfql("GRAPH { MATCH (n {id: 'Alice'})-->()-->(m) }")

**GFQL (bounded hop alternative)**

.. code-block:: python

    import pandas as pd

    # Same intent using hop() with explicit bounds + optional labels
    g.hop(
        nodes=pd.DataFrame({g._node: ['Alice']}),
        min_hops=2,
        max_hops=2
    )

**Explanation**:

- **GFQL chain**: Starts at node ``"Alice"``, performs two forward hops, and obtains the nodes two steps away. ``n(name='m')`` tags the matched end nodes with a boolean column ``m``, so ``._nodes.query('m')`` keeps only those nodes. Building on the expressive and performance benefits of the previous 1-hop example, this begins to show GFQL's parallel pathfinding advantage over Cypher, which helps on both CPU and GPU.
- ``min_hops``/``max_hops`` express the same bounded traversal intent as a Cypher
  pattern like ``[*2..2]`` after translation into native GFQL. Direct
  ``g.gfql("MATCH ...")`` now supports the endpoint-only ``[*...]`` slice, but
  ``hop()`` still gives you the full native GFQL control surface for labels,
  output slicing, and more complex multihop rewrites. If you set
  ``label_node_hops``/``label_edge_hops``, those column names will store the hop
  step (nodes = first arrival, edges = traversal step); omit them or pass
  ``None`` to skip labels.
- ``output_min_hops``/``output_max_hops`` (optional) slice the displayed hops after traversal. By default, all traversed hops up to ``max_hops`` remain visible; set ``output_min_hops`` if you want to drop early hops (e.g., traverse 2..4 but only show 3..4). Invalid slices (e.g., ``output_min_hops`` > ``max_hops`` or ``output_max_hops`` < ``min_hops``) raise a ``ValueError``.
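
To make the hop bounds and labels concrete, here is a plain-Pandas sketch of a bounded forward traversal that records the hop at which each node is first reached and then slices the output. It is a conceptual illustration only, not GFQL's implementation; ``bounded_hops`` and the sample edges are hypothetical.

.. code-block:: python

    import pandas as pd

    edges_df = pd.DataFrame({
        'src': ['Alice', 'Bob', 'Carol', 'Dave'],
        'dst': ['Bob', 'Carol', 'Dave', 'Erin'],
    })

    def bounded_hops(edges_df, start, max_hops, output_min_hops=1):
        """Label each reached node with the hop of its first arrival."""
        first_arrival = {}
        frontier = {start}
        for hop in range(1, max_hops + 1):
            # Expand the whole frontier in one vectorized step
            reached = set(edges_df.loc[edges_df['src'].isin(frontier), 'dst'])
            # Keep only nodes not seen at an earlier hop (first arrival wins)
            frontier = {
                node for node in reached
                if node not in first_arrival and node != start
            }
            for node in frontier:
                first_arrival[node] = hop
            if not frontier:
                break
        labels = pd.Series(first_arrival, name='hop')
        # Output slicing: traverse every hop, but only keep the later ones
        return labels[labels >= output_min_hops]

    # Traverse two hops from Alice and keep only the nodes first reached at hop 2
    two_hops = bounded_hops(edges_df, 'Alice', max_hops=2, output_min_hops=2)

Here the traversal visits both hops, while the returned labels only include nodes whose first arrival falls inside the output slice.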

.. graphviz::

   digraph multi_hop {
       rankdir=LR;
       Alice [label="Alice\n(start)", style="filled,bold", fillcolor="#87CEEB", penwidth=3, color="#4682B4"];
       n1 [label="?\n(hop 1)", style="filled,bold", fillcolor="#D3D3D3", penwidth=2, color="#A9A9A9"];
       n2 [label="m\n(result)", style="filled,bold", fillcolor="#90EE90", penwidth=3, color="#228B22"];
       Alice -> n1 [label="hop 1", style=bold, color="#4682B4", penwidth=2];
       n1 -> n2 [label="hop 2", style=bold, color="#228B22", penwidth=2];
   }

---

Filtering Edges and Nodes with Conditions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Objective**: Find all edges where the weight is greater than ``0.5``.

**SQL**

.. code-block:: sql

    SELECT * FROM edges
    WHERE weight > 0.5;

**Pandas**

.. code-block:: python

    filtered_edges_df = edges_df[ edges_df['weight'] > 0.5 ]

**Cypher**

.. code-block:: cypher

    MATCH ()-[e]->()
    WHERE e.weight > 0.5
    RETURN e;

**GFQL (chain syntax)**

.. code-block:: python

    from graphistry import e_forward

    # df[['src', 'dst', 'weight', ...]]
    g.gfql([ e_forward(edge_query='weight > 0.5') ])._edges

**GFQL (Cypher syntax)**

.. code-block:: python

    g.gfql(
        "MATCH ()-[e]->() WHERE e.weight > 0.5 RETURN e"
    )._nodes

**Explanation**:

- **GFQL chain**: Uses ``e_forward(edge_query='weight > 0.5')`` to filter edges where ``weight > 0.5``. This example introduces the string query form, a convenient alternative to dictionary filters for comparisons. Underneath, it still benefits from the vectorized execution of Pandas and cuDF.
- **GFQL Cypher**: Same filter as a Cypher ``WHERE`` clause.
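
The ``edge_query`` string reads like a Pandas ``DataFrame.query`` expression. The sketch below, on hypothetical sample edges, checks that a query string and the equivalent boolean mask select the same rows. It illustrates the expression syntax only and does not call GFQL.

.. code-block:: python

    import pandas as pd

    edges_df = pd.DataFrame({
        'src': ['a', 'b', 'c'],
        'dst': ['b', 'c', 'a'],
        'weight': [0.2, 0.5, 0.9],
    })

    # String form, as it would be passed via edge_query='weight > 0.5'
    by_query = edges_df.query('weight > 0.5')

    # Equivalent boolean-mask form
    by_mask = edges_df[edges_df['weight'] > 0.5]

    # Compound conditions combine with 'and' / 'or' inside the string
    heavy_from_c = edges_df.query('weight > 0.5 and src == "c"')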

---

Aggregations and Grouping
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Objective**: Count the number of outgoing edges for each node.

**SQL**

.. code-block:: sql

    SELECT src, COUNT(*) AS out_degree
    FROM edges
    GROUP BY src;

**Pandas**

.. code-block:: python

    out_degree = edges_df.groupby('src').size().reset_index(name='out_degree')

**Cypher**

.. code-block:: cypher

    MATCH (n)-[e]->()
    RETURN n.id AS node_id, COUNT(e) AS out_degree;

**GFQL (dataframe)**

.. code-block:: python

    # df[['src', 'out_degree']]
    g._edges.groupby('src').size().reset_index(name='out_degree')

**GFQL (Cypher syntax)**

.. code-block:: python

    # Enrich graph with degree columns, then query
    g.gfql("CALL graphistry.degree.write()")._nodes

    # Or as a single pipeline — enrich then return top nodes
    g.gfql(
        "GRAPH g1 = GRAPH { CALL graphistry.degree.write() } "
        "USE g1 "
        "MATCH (n) RETURN n.id AS node_id, n.degree_out AS out_degree "
        "ORDER BY out_degree DESC LIMIT 10"
    )._nodes

**Explanation**:

- **GFQL dataframe**: Performs aggregation directly on ``g._edges`` using standard dataframe operations. Or even shorter, call ``g.get_degrees()`` to enrich each node with in, out, and total degrees.
- **GFQL Cypher**: ``CALL graphistry.degree.write()`` enriches the graph with degree columns in graph state. Wrap in ``GRAPH { }`` to compose with subsequent queries in a single expression.
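
For readers who want to see what a per-node degree enrichment amounts to, the following Pandas sketch computes in-, out-, and total degree from an edge list and attaches them to the nodes. The column names ``degree_in``, ``degree_out``, and ``degree`` are chosen to resemble the enrichment described above but are assumptions here, as is the sample data.

.. code-block:: python

    import pandas as pd

    nodes_df = pd.DataFrame({'id': ['Alice', 'Bob', 'Carol']})
    edges_df = pd.DataFrame({
        'src': ['Alice', 'Alice', 'Bob'],
        'dst': ['Bob', 'Carol', 'Carol'],
    })

    # Count edges per endpoint; the result is indexed by node id
    out_degree = edges_df.groupby('src').size()
    in_degree = edges_df.groupby('dst').size()

    # Map counts onto the node table; nodes missing from a column get degree 0
    degrees_df = nodes_df.assign(
        degree_out=nodes_df['id'].map(out_degree).fillna(0).astype(int),
        degree_in=nodes_df['id'].map(in_degree).fillna(0).astype(int),
    )
    degrees_df['degree'] = degrees_df['degree_in'] + degrees_df['degree_out']

Unlike the plain ``groupby`` on ``src``, this keeps nodes with no outgoing edges in the result with an out-degree of zero.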

---

.. _all-paths:

All Paths and Connectivity
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Objective**: Find all paths between nodes ``"Alice"`` and ``"Bob"`` that go through friendships.

**SQL**

.. code-block:: sql

    WITH RECURSIVE path AS (
        -- Base case: Start from "Alice" (no type or edge restrictions)
        SELECT e.src, e.dst, ARRAY[e.src, e.dst] AS full_path, 1 AS hop
        FROM edges e
        WHERE e.src = 'Alice'
        
        UNION ALL

        -- Recursive case: Expand path where intermediate src/dst are 'people' and edge is 'friend'
        SELECT e.src, e.dst, full_path || e.dst, p.hop + 1
        FROM edges e
        JOIN path p ON e.src = p.dst
        JOIN nodes n_src ON e.src = n_src.id  -- Check src type for intermediate nodes
        JOIN nodes n_dst ON e.dst = n_dst.id  -- Check dst type for intermediate nodes
        WHERE n_src.type = 'person' AND n_dst.type = 'person'  -- Intermediate nodes must be 'people'
        AND e.type = 'friend'  -- Intermediate edges must be 'friend'
        AND e.dst != ALL(full_path)  -- Avoid cycles (optional)
    )
    -- Final filter to ensure the path ends with "Bob"
    SELECT *
    FROM path
    WHERE dst = 'Bob';

**Pandas**

.. doc-test: skip

.. code-block:: python

    def find_paths_fixed_point(edges_df, nodes_df, start_node, end_node):
        # Keep only 'friend' edges whose source and destination are both 'person' nodes
        people = nodes_df.loc[nodes_df['type'] == 'person', 'id']
        friend_edges = edges_df[
            (edges_df['type'] == 'friend')
            & edges_df['src'].isin(people)
            & edges_df['dst'].isin(people)
        ]

        # Initialize paths with the base case (start with 'Alice')
        paths = [[start_node]]
        all_paths = []

        # Continue as long as there are paths to expand (fixed-point behavior)
        while paths:
            new_paths = []

            # Expand each path
            for path in paths:
                # Find all valid outgoing edges from the last node of this path
                next_nodes = friend_edges.loc[friend_edges['src'] == path[-1], 'dst']

                for dst in next_nodes:
                    if dst in path:
                        continue  # Avoid cycles so the loop terminates

                    new_path = path + [dst]
                    if dst == end_node:
                        # If we reached 'Bob', record the completed path
                        all_paths.append(new_path)
                    else:
                        # Otherwise, keep expanding in the next round
                        new_paths.append(new_path)

            paths = new_paths

        return all_paths

    # Run the pathfinding function to fixed point
    paths = find_paths_fixed_point(edges_df, nodes_df, 'Alice', 'Bob')

**Cypher**

.. code-block:: cypher

    MATCH p = (n1 {id: 'Alice'})-[e:friend*]-(n2 {id: 'Bob'})
    WHERE ALL(rel IN relationships(p) WHERE type(rel) = 'friend')
    AND ALL(node IN NODES(p) WHERE node.type = 'person')
    RETURN p;

**GFQL**

.. code-block:: python

    from graphistry import n, e_forward

    # g._edges: df[['src', 'dst', ...]]
    # g._nodes: df[['id', ...]]
    g.gfql([
        n({"id": "Alice"}),
        e_forward(
            source_node_query='type == "person"',
            edge_query='type == "friend"',
            destination_node_query='type == "person"',
            to_fixed_point=True),
        n({"id": "Bob"})
    ])

**Explanation**:

- **GFQL**: Uses ``e_forward(..., to_fixed_point=True)`` to find edge sequences of arbitrary length between nodes ``"Alice"`` and ``"Bob"``. The SQL and Pandas versions suffer from a syntactic and semantic impedance mismatch with graph tasks in this example.
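
One way to picture an arbitrary-length match between two endpoints is as a forward sweep from the start node followed by a backward sweep from the end node, keeping only the edges that survive both. The plain-Python sketch below illustrates that idea on hypothetical edges. It is a conceptual model under that assumption, not GFQL's actual implementation, and ``reachable`` and ``edges_on_paths`` are made-up helper names.

.. code-block:: python

    def reachable(edges, start):
        """Return all nodes reachable from `start`, expanding a frontier to a fixed point."""
        seen = {start}
        frontier = {start}
        while frontier:
            # Advance the whole frontier by one hop, dropping already-seen nodes
            frontier = {dst for src, dst in edges if src in frontier} - seen
            seen |= frontier
        return seen

    def edges_on_paths(edges, start, end):
        """Keep edges lying on some directed walk from `start` to `end`."""
        forward = reachable(edges, start)
        # Reverse every edge to sweep backward from the end node
        backward = reachable([(dst, src) for src, dst in edges], end)
        return [(s, d) for s, d in edges if s in forward and d in backward]

    friend_edges = [
        ('Alice', 'Mia'), ('Mia', 'Max'), ('Max', 'Bob'),
        ('Alice', 'Nina'), ('Nina', 'Bob'),
        ('Alice', 'Zed'),  # dead end: never reaches Bob
    ]
    on_paths = edges_on_paths(friend_edges, 'Alice', 'Bob')

The result is a set of edges (a subgraph) rather than an enumerated list of paths, which is why the dead-end edge to ``Zed`` is dropped without tracking individual paths.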

.. graphviz::

   digraph all_paths {
       rankdir=LR;
       Alice [label="Alice\n(start)", style="filled,bold", fillcolor="#87CEEB", penwidth=3, color="#4682B4"];
       Bob [label="Bob\n(end)", style="filled,bold", fillcolor="#90EE90", penwidth=3, color="#228B22"];
       m1 [label="person", style="filled,bold", fillcolor="#87CEEB", penwidth=2, color="#4682B4"];
       m2 [label="person", style="filled,bold", fillcolor="#87CEEB", penwidth=2, color="#4682B4"];
       n1 [label="person", style="filled,bold", fillcolor="#87CEEB", penwidth=2, color="#4682B4"];
       Alice -> m1 [label="friend", style=bold, color="#228B22", penwidth=2];
       m1 -> m2 [label="friend", style=bold, color="#228B22", penwidth=2];
       m2 -> Bob [label="friend", style=bold, color="#228B22", penwidth=2];
       Alice -> n1 [label="friend", style=bold, color="#228B22", penwidth=2];
       n1 -> Bob [label="friend", style=bold, color="#228B22", penwidth=2];
   }

---

Community Detection and Clustering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Objective**: Identify communities within the graph using the Louvain algorithm.

**SQL and Pandas**

- Not designed for complex graph algorithms like community detection.

**Cypher**

.. code-block:: cypher

    CALL algo.louvain.stream() YIELD nodeId, communityId

**GFQL (Python)**

.. code-block:: python

    # g._nodes: df[['id', 'louvain']]
    g.compute_cugraph('louvain')._nodes

**GFQL (Cypher syntax)**

.. code-block:: python

    # Graph-preserving enrichment via CALL .write()
    g.gfql("CALL graphistry.cugraph.louvain.write()")._nodes

    # Full pipeline: filter subgraph, enrich, query results
    g.gfql(
        "GRAPH g1 = GRAPH { MATCH (a)-[r]->(b) WHERE a.score > 5 } "
        "GRAPH g2 = GRAPH { USE g1 CALL graphistry.cugraph.louvain.write() } "
        "USE g2 "
        "MATCH (n) RETURN n.id AS id, n.louvain AS community "
        "ORDER BY community, id"
    )._nodes

**Explanation**:

- **GFQL Python**: Enriches the graph with algorithm results, here via the GPU-accelerated :func:`graphistry.plugins.cugraph.compute_cugraph` for community detection. Any CPU or GPU library can be used, and popular plugins are supported out of the box.
- **GFQL Cypher**: ``CALL graphistry.cugraph.louvain.write()`` runs the same algorithm via GFQL's Cypher surface. Wrapping in ``GRAPH { }`` with ``USE`` enables single-expression pipelines that filter, enrich, and query.

---

Time-Windowed Graph Analytics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Objective**: Find all edges between nodes ``"Alice"`` and ``"Bob"`` that occurred in the last 7 days.

**SQL**

.. code-block:: sql

    SELECT * FROM edges
    WHERE ((src = 'Alice' AND dst = 'Bob') OR (src = 'Bob' AND dst = 'Alice')) 
      AND timestamp >= NOW() - INTERVAL '7 days';

.. warning::

    This version simplifies the task to direct (single-hop) edges between the two nodes. For multihop scenarios, refer to :ref:`all-paths` for more advanced techniques.

**Pandas**

.. code-block:: python

    import pandas as pd

    # Group the two directions first: `&` binds more tightly than `|`
    filtered_edges_df = edges_df[
        (
            ((edges_df['src'] == 'Alice') & (edges_df['dst'] == 'Bob')) |
            ((edges_df['src'] == 'Bob') & (edges_df['dst'] == 'Alice'))
        ) &
        (edges_df['timestamp'] >= pd.Timestamp.now() - pd.Timedelta(days=7))
    ]

.. warning::

    This version simplifies the task to direct (single-hop) edges between the two nodes. For multihop scenarios, refer to :ref:`all-paths` for more advanced techniques.

**Cypher**

.. code-block:: cypher

    MATCH path = (a {id: 'Alice'})-[e]-(b {id: 'Bob'})
    WHERE e.timestamp >= datetime() - duration({days: 7})
    RETURN e;

**GFQL**

.. code-block:: python

    import pandas as pd
    from graphistry import n, e_forward, is_in

    # Use days=7 explicitly: a bare integer would be read as nanoseconds
    past_week = pd.Timestamp.now() - pd.Timedelta(days=7)
    g.gfql([
        n({"id": is_in(["Alice", "Bob"])}),
        e_forward(edge_query=f'timestamp >= "{past_week}"'),
        n({"id": is_in(["Alice", "Bob"])})
    ])._edges

**Explanation**:

- **SQL** and **Pandas**: These versions only match direct (single-hop) edges between the two nodes; for multihop scenarios, refer to :ref:`all-paths`.

- **GFQL**: Uses a ``g.gfql([...])`` chain to keep edges between ``"Alice"`` and ``"Bob"`` whose timestamp falls within the last 7 days. Because the query is expressed over the graph's structure, it extends naturally to multihop relationships, and it can use cuDF for GPU acceleration when available.
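
Two details matter when writing time-window filters by hand in Pandas: ``&`` binds more tightly than ``|`` in Python, and ``pd.Timedelta`` interprets a bare integer as nanoseconds. The sketch below, using hypothetical timestamps and a fixed reference time, shows both effects.

.. code-block:: python

    import pandas as pd

    now = pd.Timestamp('2024-01-15')  # fixed reference time for reproducibility
    edges_df = pd.DataFrame({
        'src': ['Alice', 'Bob'],
        'dst': ['Bob', 'Alice'],
        'timestamp': [now - pd.Timedelta(days=30), now - pd.Timedelta(days=1)],
    })

    a_to_b = (edges_df['src'] == 'Alice') & (edges_df['dst'] == 'Bob')
    b_to_a = (edges_df['src'] == 'Bob') & (edges_df['dst'] == 'Alice')
    recent = edges_df['timestamp'] >= now - pd.Timedelta(days=7)

    # Without parentheses, the time filter only applies to the second direction
    unparenthesized = edges_df[a_to_b | b_to_a & recent]
    # Grouping the OR first applies the time filter to both directions
    parenthesized = edges_df[(a_to_b | b_to_a) & recent]

    # A bare integer is read as nanoseconds, not days
    seven_ns = pd.Timedelta(7)
    seven_days = pd.Timedelta(days=7)

In this sample, the unparenthesized mask keeps the 30-day-old edge while the parenthesized one drops it.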


---

Parallel Pathfinding
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


**Objective**: Find all paths from ``"Alice"`` to ``"Bob"`` and ``"Charlie"`` in parallel. Parallel pathfinding is particularly interesting because it allows for efficient querying of multiple target nodes at the same time, reducing the time and complexity required to compute multiple independent paths, especially in large graphs.

**SQL**

- **Not suitable**: SQL is not designed for pathfinding on graphs.

**Pandas**

- **Not suitable**: Pandas is not designed for pathfinding across graphs.

**Cypher**


.. warning::

    Cypher is **path-oriented** and does not natively support parallel pathfinding. Each path must be processed individually, which can result in performance bottlenecks for large graphs or multiple targets. Neo4j users can utilize the APOC or GDS libraries to add parallelism, but this is a limited external workaround, rather than a native strength.

.. code-block:: cypher

    MATCH (a {id: 'Alice'}), (target)
    WHERE target.id IN ['Bob', 'Charlie']
    CALL algo.shortestPath.stream(a, target)
    YIELD nodeId, cost
    RETURN nodeId, cost;

**GFQL**

.. code-block:: python

    from graphistry import n, e_forward, is_in

    # g._edges: cudf.DataFrame[['src', 'dst', ...]]
    g.gfql([
        n({"id": "Alice"}),
        e_forward(to_fixed_point=True),  # traverse to any depth to cover all paths
        n({"id": is_in(["Bob", "Charlie"])})
    ], engine='cudf')

**Explanation**:


- **Cypher**: Cypher processes paths individually and does not support native parallelism. Libraries like APOC or GDS offer a way to achieve parallel execution, but this adds complexity.

- **GFQL**: GFQL natively supports parallel pathfinding using a bulk wavefront algorithm, processing all paths at once, making it highly efficient in GPU-accelerated environments.

---

GPU Execution
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Objective**: Execute pathfinding queries on the GPU, computing all paths from ``"Alice"`` to ``"Bob"`` and ``"Charlie"`` simultaneously across hardware resources.

**SQL**

- **Not suitable**: SQL is not designed for parallel execution of graph queries.

**Pandas**

- **Not suitable**: Pandas is not designed for parallel execution across graphs.

**Cypher**

- **Not suitable**: Popular Cypher engines like Neo4j do not natively support GPU execution.

**GFQL**

.. code-block:: python

    from graphistry import n, e_forward, is_in

    # Executing pathfinding queries in parallel
    g.gfql([
        n({"id": "Alice"}),
        e_forward(to_fixed_point=True),  # traverse to any depth to cover all paths
        n({"id": is_in(["Bob", "Charlie"])})
    ], engine='cudf')

**Explanation**:

This example builds on the previous one, showing how **GFQL** handles parallel execution natively. GFQL benefits from **bulk vector processing**, which boosts performance in both CPU and GPU modes:

- **In CPU environments**, the bulk processing model accelerates query execution algorithmically and takes advantage of hardware parallelism, improving efficiency.
  
- **In GPU mode**, GFQL **natively parallelizes** pathfinding, further leveraging hardware acceleration to process multiple paths concurrently and quickly, making it highly efficient for large-scale graph traversals.
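
A common pattern is to pick the engine at runtime so the same query runs with or without a GPU stack. The helper below is a hypothetical sketch: ``pick_engine`` is not part of PyGraphistry, it assumes ``'pandas'`` is accepted as the CPU engine name, and it only checks whether the ``cudf`` package can be found, not whether a usable GPU is present.

.. code-block:: python

    import importlib.util

    def pick_engine():
        """Return 'cudf' when the cudf package is importable, else 'pandas'."""
        has_cudf = importlib.util.find_spec('cudf') is not None
        return 'cudf' if has_cudf else 'pandas'

    engine = pick_engine()
    # The result can then be passed along, e.g. g.gfql([...], engine=engine)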

---

GFQL Functions and Equivalents
------------------------------

**Node Matching**

- **SQL**: ``SELECT * FROM nodes WHERE ...``
- **Pandas**: ``nodes_df[ condition ]``
- **Cypher**: ``MATCH (n {property: value})``
- **GFQL chain**: ``n({ "property": value })``
- **GFQL Cypher**: ``g.gfql("MATCH (n {property: value}) RETURN n")``

**Edge Matching**

- **SQL**: ``SELECT * FROM edges WHERE ...``
- **Pandas**: ``edges_df[ condition ]``
- **Cypher**: ``MATCH ()-[e {property: value}]->()``
- **GFQL chain**: ``e_forward({ "property": value })`` or ``e_reverse({ "property": value })`` or ``e({ "property": value })``
- **GFQL Cypher**: ``g.gfql("MATCH ()-[e {property: value}]->() RETURN e")``

**Traversal**

- **SQL**: Complex joins or recursive queries
- **Pandas**: Multiple merges; not efficient for deep traversals
- **Cypher**: Patterns like ``()-[]->()`` for traversal
- **GFQL chain**: Chains of ``n()``, ``e_forward()``, ``e_reverse()``, and ``e()`` functions
- **GFQL Cypher**: ``g.gfql("MATCH (a)-[]->(b) RETURN ...")`` or ``g.gfql("GRAPH { MATCH ... }")``

**Graph-State Results (subgraph extraction)**

- **SQL**: Not applicable
- **Pandas**: Manual node/edge DataFrame filtering
- **Cypher**: Not supported (Cypher always returns rows)
- **GFQL chain**: ``g.gfql([n(...), e_forward(), n()])`` — inherently graph-returning
- **GFQL Cypher**: ``g.gfql("GRAPH { MATCH (a)-[r]->(b) WHERE ... }")`` — GFQL extension

**Graph Enrichment (algorithms)**

- **SQL**: Not applicable
- **Pandas**: External library calls
- **Cypher**: ``CALL algo.pagerank.stream()``
- **GFQL Python**: ``g.compute_cugraph('pagerank')`` or ``g.compute_igraph('pagerank')``
- **GFQL Cypher**: ``g.gfql("CALL graphistry.cugraph.pagerank.write()")`` or ``g.gfql("GRAPH { CALL graphistry.cugraph.pagerank.write() }")``

**Multi-Stage Pipelines**

- **SQL**: CTEs or temp tables
- **Pandas**: Sequential variable assignment
- **Cypher**: Not supported in standard Cypher (row-only)
- **GFQL chain**: Sequential ``g.gfql([...])`` calls
- **GFQL Cypher**: ``GRAPH g1 = GRAPH { ... } GRAPH g2 = GRAPH { USE g1 CALL ... } USE g2 MATCH ... RETURN ...``

Tips for Users
--------------

- **Data Scientists and Analysts**: Use your Pandas knowledge. GFQL operates on dataframes, allowing familiar operations.
- **Engineers and Developers**: Integrate GFQL into Python applications without extra infrastructure.
- **Database Administrators**: Complement SQL queries with GFQL for graph data without changing databases.
- **Graph Enthusiasts**: Start with simple queries and explore complex analytics. Visualize results using PyGraphistry.

Additional Resources
--------------------

- :ref:`gfql-quick`
- :ref:`gfql-predicates-quick`: Use predicates for filtering on node and edge attributes.
- :ref:`10min-pygraphistry`: Visualize GFQL queries with GPU-accelerated tools.

Conclusion
----------

GFQL bridges the gap between traditional querying languages and graph analytics. By translating queries from SQL, Pandas, and Cypher into GFQL, you can leverage powerful graph queries within your Python workflows.

Start exploring GFQL today and unlock new insights from your graph data!
