183 changes: 183 additions & 0 deletions demos/more_examples/graphistry_features/collections.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,183 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Collections in PyGraphistry\n",
"\n",
"Collections define labeled subsets of a graph (nodes, edges, or subgraphs) using full GFQL. They enable advanced, layered styling that overrides base encodings when you need precise highlights.\n",
"\n",
"Use collections when you want:\n",
"- baseline encodings (for example, by entity type) plus overlays for alerts or critical paths\n",
"- multiple overlapping highlights with a priority order\n",
"- a UI panel for toggling focused subsets on and off\n",
"\n",
"Collections are evaluated in priority order, with higher priority collections overriding lower ones for styling.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook, we build sets using GFQL AST helpers, combine them with intersections, and apply node and edge colors. Collections can be based on nodes, edges, or multi-step graph traversals (Chain).\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"from pathlib import Path\n",
"import pandas as pd\n",
"import graphistry\n",
"from graphistry import collection_set, collection_intersection, n, e_forward, Chain\n",
"\n",
"edges = pd.read_csv(Path('demos/data/honeypot.csv'))\n",
Contributor

@mj3cheun mj3cheun Apr 7, 2026

this file is not included in the repo, probably should be or a way to download the data should be provided?

"g = graphistry.edges(edges, \"attackerIP\", \"victimIP\")\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Use Chain to select subgraphs (nodes + edges) by edge attributes\n",
"collections = [\n",
" collection_set(\n",
" expr=Chain([n(), e_forward({\"vulnName\": \"MS08067 (NetAPI)\"}), n()]),\n",
" id='netapi',\n",
" name='MS08067 (NetAPI)',\n",
" node_color='#00BFFF',\n",
" edge_color='#00BFFF',\n",
" ),\n",
" collection_set(\n",
" expr=Chain([n(), e_forward({\"victimPort\": 445.0}), n()]),\n",
" id='port445',\n",
" name='Port 445',\n",
" node_color='#32CD32',\n",
" edge_color='#32CD32',\n",
" ),\n",
" collection_intersection(\n",
" sets=['netapi', 'port445'],\n",
" name='NetAPI + 445',\n",
" node_color='#AABBCC',\n",
" edge_color='#AABBCC',\n",
" ),\n",
"]\n",
"\n",
"g2 = g.collections(\n",
" collections=collections,\n",
" show_collections=True,\n",
" collections_global_node_color='CCCCCC',\n",
" collections_global_edge_color='CCCCCC',\n",
")\n",
"\n",
"g2._url_params\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Render (requires graphistry.register(...))\n",
"g2.plot()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Notes and validation\n",
"\n",
"- Order matters: earlier collections override later ones.\n",
"- Use collections for priority-based subsets and overlaps; use encode_* for simple column-driven colors.\n",
"- Helper constructors: `graphistry.collection_set(...)` and `graphistry.collection_intersection(...)` return JSON-friendly dicts (AST inputs wrap to `gfql_chain`).\n",
"- Provide `id` for sets used by intersections.\n",
"- Global colors apply to nodes/edges not in any collection; `#` is optional.\n",
"- Use `validate='strict'` to raise, or `warn=False` to silence warnings.\n",
"\n",
"Wire protocol and pre-encoded strings:\n",
"\n",
"```python\n",
"collections_wire = [\n",
" {\n",
" \"type\": \"set\",\n",
" \"name\": \"Wire Protocol Example\",\n",
" \"node_color\": \"#AA00AA\",\n",
" \"expr\": {\n",
" \"type\": \"gfql_chain\",\n",
" \"gfql\": [\n",
" {\"type\": \"Node\", \"filter_dict\": {\"status\": \"purchased\"}}\n",
" ]\n",
" }\n",
" }\n",
"]\n",
"g.collections(collections=collections_wire)\n",
"\n",
"g.collections(collections=encoded_collections, encode=False)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run `g2.plot()` in a notebook session with valid credentials to render inline.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Overlap priority example\n",
"\n",
"Earlier collections override later ones when they overlap.\n",
"\n",
"```python\n",
"collections_priority = [\n",
" collection_set(\n",
" expr=Chain([n(), e_forward({\"vulnName\": \"MS08067 (NetAPI)\"}), n()]),\n",
" id=\"netapi\",\n",
" name=\"MS08067 (NetAPI)\",\n",
" node_color=\"#FFAA00\",\n",
" edge_color=\"#FFAA00\",\n",
" ),\n",
" collection_set(\n",
" expr=Chain([n(), e_forward({\"victimPort\": 445.0}), n()]),\n",
" id=\"port445\",\n",
" name=\"Port 445\",\n",
" node_color=\"#00BFFF\",\n",
" edge_color=\"#00BFFF\",\n",
" ),\n",
"]\n",
"g.collections(collections=collections_priority)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For more on color encodings, see the [Color encodings notebook](encodings-colors.ipynb).\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.x"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
22 changes: 22 additions & 0 deletions docs/source/10min.rst
@@ -161,6 +161,28 @@ Example visualization:

Now, edges are colored based on the type of vulnerability, helping you distinguish different attack types.

Advanced: Collections for layered highlights
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use collections when you want GFQL-driven subsets (nodes, edges, or subgraphs) to override base encodings.
This is useful for overlays like alerts or critical paths that take precedence over your normal color rules.

.. code-block:: python

from graphistry import collection_set, n

collections = [
collection_set(
expr=n({"vip": True}),
name="VIP",
node_color="#FF8800",
)
]
g.collections(collections=collections, show_collections=True).plot()

See the :doc:`Collections tutorial notebook </demos/more_examples/graphistry_features/collections>` and
:doc:`GFQL docs </gfql/index>` for full details.

Adjusting Sizes, Labels, Icons, Badges, and More
------------------------------------------------

5 changes: 5 additions & 0 deletions docs/source/cheatsheet.md
@@ -555,6 +555,11 @@ g.encode_point_color('type', as_categorical=True,
categorical_mapping={"cat": "red", "sheep": "blue"}, default_mapping='#CCC')
```

For subset-based coloring and conditional styling across multiple encodings, use Collections
via `g.collections(...)` with GFQL AST helpers. See the
[layout settings](visualization/layout/settings.html)
and the [Collections tutorial notebook](demos/more_examples/graphistry_features/collections.ipynb).

For more in-depth examples, check out the tutorials on [colors](https://github.com/graphistry/pygraphistry/tree/master/demos/more_examples/graphistry_features/encodings-colors.ipynb).

### Custom icons and badges
3 changes: 3 additions & 0 deletions docs/source/gfql/quick.rst
@@ -445,6 +445,9 @@ Run graph algorithms like PageRank, community detection, and layouts directly wi
# Results have x, y coordinates for visualization
result.plot()

Tip: For subset-based coloring after GFQL, use ``result.collections(...)`` and see
:doc:`/visualization/layout/settings`.

Remote Graph References
-----------------------

5 changes: 5 additions & 0 deletions docs/source/gfql/remote.rst
@@ -17,6 +17,11 @@ Run chain remotely and fetch results
g2 = g1.gfql_remote([n(), e(), n()])
assert len(g2._nodes) <= len(g1._nodes)

.. note::
Collections are visualization URL settings; apply them after GFQL results
(for example, ``g2.collections(...)``). The GFQL remote/upload APIs do not
accept collections payloads yet.
Comment on lines +20 to +23
Contributor

not sure why this note was needed but i dont see an issue in having it either


Method :meth:`chain_remote <graphistry.compute.ComputeMixin.ComputeMixin.chain_remote>` runs chain remotely and fetches the computed graph

- **chain**: Sequence of graph node and edge matchers (:class:`ASTObject <graphistry.compute.ast.ASTObject>` instances).
44 changes: 44 additions & 0 deletions docs/source/gfql/spec/wire_protocol.md
@@ -379,6 +379,50 @@ null // null

**Note**: The `timezone` field is optional for DateTime values and defaults to "UTC" if omitted. This ensures consistent behavior across systems while allowing explicit timezone specification when needed.

## Collections Payloads

Collections are Graphistry visualization overlays that use GFQL wire protocol operations to define subsets
of nodes, edges, or subgraphs. They are applied in priority order, with earlier collections overriding later
ones for styling.
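
The override rule can be illustrated with a minimal sketch. This is plain Python, not Graphistry's implementation: `resolve_node_color` and the `members` field are hypothetical names, assumed here only to show that the first matching collection in list order supplies the color.

```python
from typing import Optional


def resolve_node_color(node_id, collections, default_color: Optional[str] = None):
    """Return the color of the first collection (in list order) containing node_id.

    Hypothetical helper for illustration only; `members` is an assumed,
    precomputed set of node IDs and is not a wire protocol field.
    """
    for collection in collections:  # earlier entries take priority
        if node_id in collection["members"]:
            return collection["node_color"]
    return default_color


collections = [
    {"name": "Purchasers", "node_color": "#00BFFF", "members": {"a", "b"}},
    {"name": "VIP", "node_color": "#AA00AA", "members": {"b", "c"}},
]

# "b" belongs to both collections; the earlier one (Purchasers) wins.
# "d" belongs to none, so it falls back to the default color.
colors = {node: resolve_node_color(node, collections, "#CCCCCC") for node in "abcd"}
```

Reordering the list is the only change needed to make the VIP color win on overlapping nodes.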

### Collection Set

Collection sets wrap GFQL operations in a `gfql_chain` object:

```json
{
"type": "set",
"id": "purchasers",
"name": "Purchasers",
"node_color": "#00BFFF",
"expr": {
"type": "gfql_chain",
"gfql": [
{"type": "Node", "filter_dict": {"status": "purchased"}}
]
}
}
```

### Collection Intersection

Intersections reference previously defined set IDs:

```json
{
"type": "intersection",
"name": "High Value Purchasers",
"node_color": "#AA00AA",
"expr": {
"type": "intersection",
"sets": ["purchasers", "vip"]
}
}
```
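
As a rough sketch, the two payload shapes above can be assembled and sanity-checked with plain dictionaries before being sent. The helper names (`make_set`, `make_intersection`, `missing_set_ids`) are hypothetical and are not part of PyGraphistry. The check only confirms that each referenced ID is defined somewhere in the list; it does not assume that a definition must come before its use.

```python
import json


def make_set(set_id, name, node_color, gfql_ops):
    """Build a collection set dict shaped like the Collection Set example."""
    return {
        "type": "set",
        "id": set_id,
        "name": name,
        "node_color": node_color,
        "expr": {"type": "gfql_chain", "gfql": gfql_ops},
    }


def make_intersection(name, node_color, set_ids):
    """Build an intersection dict shaped like the Collection Intersection example."""
    return {
        "type": "intersection",
        "name": name,
        "node_color": node_color,
        "expr": {"type": "intersection", "sets": list(set_ids)},
    }


def missing_set_ids(collections):
    """Return IDs referenced by intersections that no set in the list defines."""
    defined = {c["id"] for c in collections if c["type"] == "set" and "id" in c}
    missing = []
    for c in collections:
        if c["type"] == "intersection":
            missing.extend(s for s in c["expr"]["sets"] if s not in defined)
    return missing


payload = [
    make_set(
        "purchasers",
        "Purchasers",
        "#00BFFF",
        [{"type": "Node", "filter_dict": {"status": "purchased"}}],
    ),
    make_intersection("High Value Purchasers", "#AA00AA", ["purchasers", "vip"]),
]

# "vip" is referenced but never defined as a set in this payload.
missing = missing_set_ids(payload)
encoded = json.dumps(payload)
```

Running a check like this before serializing catches a dangling reference such as `vip` early, rather than leaving it to surface at render time.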

For Python examples and helper constructors, see the
:doc:`Collections tutorial notebook </demos/more_examples/graphistry_features/collections>`.

## Examples

### User 360 Query
2 changes: 1 addition & 1 deletion docs/source/gfql/wire_protocol_examples.md
@@ -513,4 +513,4 @@ filter3 = n(filter_dict={"date": gt({"type": "datetime", "value": "2023-01-01T00
- Temporal predicates leverage pandas' optimized datetime operations
- Timezone conversions are handled efficiently
- For large datasets, ensure datetime columns are properly typed (not object dtype)
- Use `pd.Timestamp` for best performance when creating many predicates programmatically
- Use `pd.Timestamp` for best performance when creating many predicates programmatically
1 change: 1 addition & 0 deletions docs/source/notebooks/visualization.rst
@@ -13,6 +13,7 @@ Encodings
Sizes <../demos/more_examples/graphistry_features/encodings-sizes.ipynb>
Icons <../demos/more_examples/graphistry_features/encodings-icons.ipynb>
Badges <../demos/more_examples/graphistry_features/encodings-badges.ipynb>
Collections <../demos/more_examples/graphistry_features/collections.ipynb>

Geographic (Kepler.gl)
----------------------
19 changes: 19 additions & 0 deletions docs/source/visualization/10min.rst
@@ -224,6 +224,25 @@ You can encode your graph attributes visually using colors, sizes, icons, and mo
- :meth:`graphistry.PlotterBase.PlotterBase.encode_edge_color`
- :meth:`graphistry.PlotterBase.PlotterBase.encode_edge_icon`

* **Collections (advanced coloring)**: Define subsets using GFQL AST helpers and color them consistently:

.. code-block:: python

from graphistry import collection_set, n

collections = [
collection_set(
expr=n({"subscribed_to_newsletter": True}),
name="Subscribers",
node_color="#32CD32",
)
]
g.collections(collections=collections, show_collections=True).plot()

See :doc:`Layout settings <layout/settings>` and the
:doc:`Collections tutorial notebook </demos/more_examples/graphistry_features/collections>`.
Tip: order matters (earlier collections override later ones) and intersections require set IDs.

* **Bind**: Simpler data-driven settings are done through :meth:`graphistry.PlotterBase.PlotterBase.bind`:

.. code-block:: python
2 changes: 2 additions & 0 deletions docs/source/visualization/index.rst
@@ -2,6 +2,8 @@ Visualize
=============

We recommend getting started with :ref:`10 Minutes to PyGraphistry <10min>`, :ref:`10 Minutes to Graphistry Visualization<10min-viz>`, and the :ref:`layout guide <layout-guide>`.
For advanced, subset-based coloring, see the
:doc:`Collections tutorial notebook </demos/more_examples/graphistry_features/collections>`.

For static image export (documentation, reports), see the `static rendering tutorial <../demos/demos_databases_apis/graphviz/static_rendering.ipynb>`_.
