GFQL Validation Fundamentals
Learn how to use GFQL’s built-in validation system to catch errors early and build robust graph applications.
Note
This guide is accompanied by an interactive Jupyter notebook. To run the examples yourself, see GFQL Validation Fundamentals notebook.
What You’ll Learn
How GFQL automatically validates queries
Understanding structured error messages with error codes
Schema validation against your data
Pre-execution validation for performance
Collecting all errors vs fail-fast mode
Prerequisites
Basic Python knowledge
PyGraphistry installed (pip install graphistry[ai])
Quick Start
from graphistry.compute.chain import Chain
from graphistry.compute.ast import n, e_forward
from graphistry.compute.exceptions import GFQLValidationError

# Automatic validation during construction
try:
    chain = Chain([
        n({'type': 'customer'}),
        e_forward(),
        n()
    ])
    print("Valid chain created!")
except GFQLValidationError as e:
    print(f"Error: [{e.code}] {e.message}")
Key Concepts
Built-in Validation
GFQL validates automatically - no separate validation calls needed:
Syntax validation: Happens during chain construction
Schema validation: Happens by default during g.chain() execution
Structured errors: Error codes (E1xx, E2xx, E3xx) for programmatic handling
Error Types
GFQLSyntaxError (E1xx): Structural issues in query
GFQLTypeError (E2xx): Type mismatches and invalid values
GFQLSchemaError (E3xx): Missing columns, incompatible types
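Because each error class owns a code range, programmatic handling can branch on the code prefix. The following is a minimal sketch, assuming codes are strings shaped like the E1xx/E2xx/E3xx ranges listed above. The helper name classify_gfql_code and the category labels are illustrative and not part of PyGraphistry.

```python
def classify_gfql_code(code: str) -> str:
    """Map an error code such as 'E301' to a coarse category.

    Hypothetical helper: assumes codes follow the E1xx / E2xx / E3xx
    ranges described in this guide.
    """
    categories = {"E1": "syntax", "E2": "type", "E3": "schema"}
    # Only the first two characters decide the category
    return categories.get(code[:2].upper(), "unknown")


# Sample code strings for illustration only
for sample in ("E101", "E201", "E301", "X999"):
    print(sample, "->", classify_gfql_code(sample))
```

In an except block, a call such as classify_gfql_code(e.code) could then decide whether to fix the query shape, coerce a value, or inspect the bound dataframes.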
Common Errors and Fixes
Invalid Parameters
from graphistry.compute.exceptions import GFQLTypeError

# Wrong - negative hops
try:
    chain = Chain([n(), e_forward(hops=-1)])
except GFQLTypeError as e:
    print(f"Error: {e.message}")  # "hops must be a positive integer"

# Correct
chain = Chain([n(), e_forward(hops=2)])
Missing Columns
from graphistry.compute.exceptions import GFQLSchemaError

# Assumes g is a graph with node and edge tables already bound
# Wrong - column doesn't exist
try:
    result = g.chain([n({'category': 'VIP'})])
except GFQLSchemaError as e:
    print(f"Error: {e.message}")  # Column "category" does not exist
    print(f"Suggestion: {e.context.get('suggestion')}")

# Correct - use existing columns
result = g.chain([n({'type': 'customer'})])
Type Mismatches
# Wrong - string value on numeric column
try:
    result = g.chain([n({'score': 'high'})])
except GFQLSchemaError as e:
    print(f"Error: {e.message}")  # Type mismatch

# Correct - use numeric predicate
from graphistry.compute.predicates.numeric import gt
result = g.chain([n({'score': gt(80)})])
Temporal Comparisons
import pandas as pd
from graphistry.compute.predicates.numeric import gt, lt

# Compare datetime columns
result = g.chain([
    n({'created_at': gt(pd.Timestamp('2024-01-01'))})
])

# Find recent activity (last 7 days)
result = g.chain([
    e_forward({
        'timestamp': gt(pd.Timestamp.now() - pd.Timedelta(days=7))
    })
])
How Validation Works
Default Behavior
GFQL validates automatically - just write your queries and run them:
# Validation happens automatically
result = g.chain([n({'type': 'customer'})])

# Errors are caught and reported clearly
try:
    result = g.chain([n({'invalid_column': 'value'})])
except GFQLSchemaError as e:
    print(f"Error: {e.message}")
Pre-Execution Validation Options
Use the inline GFQL entrypoints first:
g.gfql_validate(...) for validate-only preflight (no execution)
g.gfql(..., validate=True) for preflight + execution
validate_chain_schema() for low-level chain-schema checks only
g.gfql_validate(...) (validate-only, no execution) supports:
Input forms: Cypher strings, GFQL JSON payloads, and GFQL Python objects (for example Chain(...), [n(), e(), n()], and ASTLet(...)). String inputs are always validated as Cypher (no separate string-shape precheck).
Predicate + structural validation: yes
Schema validation:
  GFQL JSON and GFQL Python chain-like forms: yes (default schema=True)
  GFQL Let/DAG forms: DAG structure + schema checks for direct graph-bound steps; reference-based steps stay structural-only
  Cypher strings: syntax/compile + schema-aware name checks against the bound graph schema by default (strict=True); pass strict=False for syntax/compile-only preflight
# Chain / JSON-style GFQL
g.gfql_validate([n({'type': 'customer'})], collect_all=True)
# Cypher
g.gfql_validate("MATCH (c) RETURN c.id AS id LIMIT $n", params={"n": 10})
Validation failures raise GFQLValidationError / GFQLSyntaxError with
structured, inspectable context:
from graphistry.compute.exceptions import GFQLValidationError

try:
    g.gfql_validate([n({"missing_col": "x"})], collect_all=True)
except GFQLValidationError as exc:
    payload = exc.to_dict()
    # LM-friendly payload:
    # {
    #   "code": "...",
    #   "message": "...",
    #   "query_type": "chain",
    #   "language": "gfql",
    #   "diagnostics": [...]
    # }
    print(payload)
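Once the payload is a plain dictionary, it can be condensed for logs or prompts. The sketch below assumes only the keys shown in the payload comment above and treats every one of them as optional. The function summarize_payload and the sample values are hypothetical, written by hand for illustration rather than captured from a real run.

```python
from typing import Any, Dict


def summarize_payload(payload: Dict[str, Any]) -> str:
    """Render a validation payload as a short one-line summary.

    Hypothetical helper, not a PyGraphistry API. Missing keys fall
    back to placeholders so partial payloads do not raise.
    """
    code = payload.get("code", "?")
    message = payload.get("message", "")
    n_diag = len(payload.get("diagnostics") or [])
    return f"[{code}] {message} ({n_diag} diagnostic(s))"


# Hand-written sample payload for illustration only
sample = {
    "code": "E301",
    "message": "Column not found",
    "query_type": "chain",
    "language": "gfql",
    "diagnostics": [{}],
}
print(summarize_payload(sample))
```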
g.gfql(..., validate=True) accepts the same query inputs as g.gfql(...)
(Cypher string, GFQL JSON, GFQL Python objects), runs local preflight first, and
executes only when preflight passes. Its preflight uses g.gfql_validate(...)
defaults, so local bound-graph execution runs schema-aware checks by default.
# Run preflight first; execute only if preflight passes
result = g.gfql(
    "MATCH (c) RETURN c.id AS id LIMIT $n",
    params={"n": 10},
    validate=True,
)
Use validate_chain_schema() when you specifically want the low-level chain-schema helper.
It is intentionally narrower than g.gfql_validate(...):
validates chain operations against currently bound node/edge dataframe columns
does not parse/compile Cypher strings
does not run Let/DAG orchestration validation
does not execute query operators
from graphistry.compute.exceptions import GFQLSchemaError
from graphistry.compute.validate_schema import validate_chain_schema

# Step 1: Validate (no execution)
try:
    validate_chain_schema(g, chain)  # Only validates, doesn't execute
    print("Chain is valid for this graph schema")
except GFQLSchemaError as e:
    print(f"Schema incompatibility: {e}")
else:
    # Step 2: Execute (only after validation passes)
    result = g.gfql(chain.chain)
    print(f"Query executed: {len(result._nodes)} nodes")
Execution-time Preflight Toggles
For remote execution, g.gfql_remote(..., validate=True) runs local query
prevalidation before implicit upload/network execution, so invalid queries fail
before data upload when possible. For Cypher strings, remote prevalidation uses
strict=False by default because the authoritative schema is on the remote dataset.
Grounded vs Ungrounded Validation
Schema checks are most useful when local graph tables are bound on g.
If local node/edge tables are missing, GFQL JSON/AST chain validation can only
do structural/predicate checks, and column-existence checks are effectively
ungrounded.
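The distinction can be illustrated with a toy column check. This is a conceptual sketch only and does not reflect how GFQL implements schema validation internally. The function check_columns and its known_columns parameter are hypothetical, with None standing in for a graph that has no local tables bound.

```python
from typing import Dict, Iterable, List, Optional


def check_columns(
    filters: Dict[str, object],
    known_columns: Optional[Iterable[str]],
) -> List[str]:
    """Return the filter keys that are absent from known_columns.

    known_columns=None models the ungrounded case: there is nothing
    to compare against, so no column errors can be reported.
    """
    if known_columns is None:
        return []
    known = set(known_columns)
    return [col for col in filters if col not in known]


# Grounded: the unknown column is reported
print(check_columns({"category": "VIP"}, ["id", "type"]))
# Ungrounded: the same filter passes because no schema is available
print(check_columns({"category": "VIP"}, None))
```

The second call shows why a passing preflight on an unbound graph says little about whether the columns exist.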
Error Collection
Choose between fail-fast and collect-all modes:
# Fail-fast (default)
try:
    chain = Chain([problematic_operations])
except GFQLValidationError as e:
    print(f"First error: {e}")

# Collect all errors
errors = chain.validate(collect_all=True)
for error in errors:
    print(f"[{error.code}] {error.message}")
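The difference between the two modes is easiest to see in a toy validator. The sketch below is a generic illustration of the fail-fast versus collect-all pattern over plain dictionaries. It is not GFQL's implementation, and validate_steps, check_kind, and check_hops are hypothetical names.

```python
from typing import Dict, List, Optional


def check_kind(step: Dict) -> Optional[str]:
    """Return a problem description, or None if the kind is valid."""
    if step.get("kind") not in {"node", "edge"}:
        return "kind must be 'node' or 'edge'"
    return None


def check_hops(step: Dict) -> Optional[str]:
    """Return a problem description, or None if hops is valid."""
    hops = step.get("hops", 1)
    if not isinstance(hops, int) or hops < 1:
        return "hops must be a positive integer"
    return None


def validate_steps(steps: List[Dict], collect_all: bool = False) -> List[str]:
    """Validate toy steps; raise on the first problem unless collect_all."""
    errors: List[str] = []
    for i, step in enumerate(steps):
        for check in (check_kind, check_hops):
            problem = check(step)
            if problem is None:
                continue
            message = f"step {i}: {problem}"
            if not collect_all:
                raise ValueError(message)  # fail-fast: stop at the first error
            errors.append(message)  # collect-all: keep going
    return errors


steps = [{"kind": "node"}, {"kind": "edge", "hops": -1}, {"kind": "bogus"}]
for line in validate_steps(steps, collect_all=True):
    print(line)
```

Fail-fast suits interactive use, where the first message is usually enough. Collect-all suits batch or automated repair loops, where reporting every problem in one pass avoids repeated round trips.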
Next Steps
GFQL Validation for LLMs - AI integration patterns
GFQL Validation in Production - Production deployment patterns
See Also
GFQL Language Specification - Complete language specification
Overview of GFQL - GFQL overview