networks

Compute network centralities. There are three network centrality methods available depending on whether you’re using a node-based or segment-based approach, with the former available in both shortest and simplest (angular) variants.

These methods wrap the underlying rust optimised functions for computing centralities. Multiple classes of measures and distances are computed simultaneously to reduce the amount of time required for multi-variable and multi-scalar strategies.

When sample=True, adaptive sampling uses the Hoeffding bound to select a distance-dependent sampling probability. The epsilon parameter controls the error tolerance (lower = more samples, higher accuracy). The default for when sampling is enabled is 0.06.

Distanceε=0.02ε=0.04ε=0.06ε=0.08ε=0.1
1 km100%100%100%100%100%
2 km100%100%100%100%100%
5 km100%100%58.7%33.0%21.1%
10 km100%37.3%16.6%9.3%6.0%
20 km41.5%10.4%4.6%2.6%1.7%

Sampling is exact (100%) at short distances and becomes progressively sparser at longer distances where reachability is high enough to maintain relative accuracy. The theoretical speedup is approximately 1/p. When comparing centrality values across different locations, use the same epsilon to ensure consistent error tolerances and comparable sampling rates.

The reasons for picking one approach over another are varied:

  • Node based centralities compute the measures relative to each reachable node within the threshold distances. For this reason, they can be susceptible to distortions caused by messy graph topologies such redundant and varied concentrations of degree=2 nodes (e.g. to describe roadway geometry) or needlessly complex representations of street intersections. In these cases, the network should first be cleaned using methods such as those available in the graph module (see the graph cleaning guide for examples). If a network topology has varied intensities of nodes but the street segments are less spurious, then segmentised methods can be preferable because they are based on segment distances: segment aggregations remain the same regardless of the number of intervening nodes, however, are not immune from situations such as needlessly complex representations of roadway intersections or a proliferation of walking paths in greenspaces;
  • Node-based harmonic centrality can be problematic on graphs where nodes are erroneously placed too close together or where impedances otherwise approach zero, as may be the case for simplest-path measures or small distance thesholds. This happens because the outcome of the division step can balloon towards \infty once impedances decrease below 1.
  • Simplest (angular) node measures require a dual graph representation. Convert primal graphs with graphs.nx_to_dual before ingesting them.
  • Measures should only be directly compared on the same topology because different topologies can otherwise affect the expression of a measure. Accordingly, measures computed on dual graphs cannot be compared to measures computed on primal graphs because this does not account for the impact of differing topologies. Dual graph representations can have substantially greater numbers of nodes and edges for the same underlying street network; for example, a four-way intersection consisting of one node with four edges translates to four nodes and six edges on the dual. This effect is amplified for denser regions of the network.
  • Segmentised versions of centrality measures should not be computed on dual graph topologies because street segment lengths would be duplicated for each permutation of dual edge spanning street intersections. By way of example, the contribution of a single edge segment at a four-way intersection would be duplicated three times.
  • The usual formulations of closeness or normalised closeness are discouraged because these do not behave suitably for localised graphs. Harmonic closeness or Hillier normalisation (which resembles a simplified form of Improved Closeness Centrality proposed by Wasserman and Faust) should be used instead.
  • Network decomposition can be a useful strategy when working at small distance thresholds, and confers advantages such as more regularly spaced snapshots and fewer artefacts at small distance thresholds where street edges intersect distance thresholds. However, the regular spacing of the decomposed segments will introduce spikes in the distributions of node-based centrality measures when working at very small distance thresholds. Segmentised versions may therefore be preferable when working at small thresholds on decomposed networks.

node_centrality_shortest

node_centrality_shortest
(
network_structure : NetworkStructure
nodes_gdf : geopandas.geodataframe.GeoDataFrame
distances : list[int] | None = None
betas : list[float] | None = None
minutes : list[float] | None = None
compute_closeness : bool = True
compute_betweenness : bool = True
min_threshold_wt : float = 0.01831563888873418
speed_m_s : float = 1.33333
tolerance : float | None = None
random_seed : int | None = None
sample : bool = False
epsilon : float | None = None
)->[ GeoDataFrame ]

Compute node centrality using shortest paths with a single Dijkstra per source. When both compute_closeness and compute_betweenness are True, a single Brandes-style Dijkstra traversal per source produces the data for both closeness accumulation and betweenness backpropagation, halving computation time compared to computing them separately.

.. versionchanged:: 4.24.0 The cycles output now measures the circuit rank of the locally reachable subgraph (m - n + c), computed per source and then target-aggregated using the same source/IPW framework as the other shortest-path metrics. This provides a more stable measure of network meshedness (independent loops / city blocks) than the older tree-cycle heuristic.

When sample=True, sampling probability is derived from each distance threshold using a canonical grid network model (see sampling.compute_distance_p). This produces deterministic, reach-agnostic sample fractions that are comparable across networks.

Parameters

network_structure
None

nodes_gdf
None

A GeoDataFrame representing nodes. The outputs of calculations will be written to this GeoDataFrame.

distances
list[int]

Distances corresponding to the local dmaxd_{max} thresholds to be used for calculations.

betas
list[float]

A list of β\beta to be used for the exponential decay function for weighted metrics.

minutes
list[float]

A list of walking times in minutes to be used for calculations.

compute_closeness
bool

Compute closeness centralities. True by default. The cycles output measures the circuit rank of the source’s locally reachable subgraph and target-aggregates that loopiness contribution over all sources that can reach each node within the threshold.

compute_betweenness
bool

Compute betweenness centralities. True by default.

min_threshold_wt
float

The default min_threshold_wt parameter can be overridden to generate custom mappings between the distance and beta parameters.

speed_m_s
float

The default speed_m_s parameter can be configured to generate custom mappings between walking times and distance thresholds dmaxd_{max}.

tolerance
float

Relative tolerance for betweenness path equality, as a percentage (e.g. 1.0 = 1%). Paths within this percentage of the shortest are treated as near-equal for multi-predecessor Brandes betweenness. A tiny internal epsilon is always enforced as a minimum for floating-point stability.

random_seed
int

Optional seed for reproducible sampling.

sample
bool

If True, uses distance-based sampling. If False, computes exact centrality.

epsilon
float

Normalised additive error tolerance for sampling. Defaults to sampling.HOEFFDING_EPSILON.

Returns

nodes_gdf
GeoDataFrame

The input nodes_gdf parameter is returned with additional centrality columns.

build_od_matrix

build_od_matrix
(
od_df : pandas.DataFrame
zones_gdf : geopandas.geodataframe.GeoDataFrame
network_structure : NetworkStructure
origin_col : str
destination_col : str
weight_col : str
zone_id_col : str | None = None
max_snap_dist : float = 500.0
)->[ OdMatrix ]

Build an OdMatrix from OD flow data and zone boundaries. Computes zone centroids, snaps them to the nearest network nodes, and constructs a sparse OD weight matrix for use with betweenness_od.

Parameters

od_df
pd.DataFrame

Origin-destination flow data with columns for origin zone, destination zone, and weight.

zones_gdf
gpd.GeoDataFrame

Zone boundaries (polygons) or centroids (points). Must be in a projected CRS matching the network, or in EPSG

(will be auto-reprojected).

network_structure
rustalgos.graph.NetworkStructure

The network to snap zone centroids to.

origin_col
str

Column in od_df containing origin zone identifiers.

destination_col
str

Column in od_df containing destination zone identifiers.

weight_col
str

Column in od_df containing trip weights (e.g., number of bicycle commuters).

zone_id_col
str | None

Column in zones_gdf containing zone identifiers matching origin_col/destination_col. If None, uses the GeoDataFrame index.

max_snap_dist
float

Maximum distance (in CRS units, typically metres) for snapping a centroid to a network node. Centroids beyond this distance are excluded with a warning.

Returns

rustalgos.centrality.OdMatrix

Sparse OD matrix ready for use with betweenness_od.

betweenness_od

betweenness_od
(
network_structure : NetworkStructure
nodes_gdf : geopandas.geodataframe.GeoDataFrame
od_matrix : OdMatrix
distances : list[int] | None = None
betas : list[float] | None = None
minutes : list[float] | None = None
min_threshold_wt : float = 0.01831563888873418
speed_m_s : float = 1.33333
tolerance : float | None = None
)->[ GeoDataFrame ]

Compute OD-weighted betweenness centrality using the shortest path heuristic. Weights betweenness by origin-destination trip counts from a sparse OD matrix. Only source nodes with outbound trips are traversed, and each shortest-path contribution is scaled by the corresponding OD weight. Closeness metrics are not computed.

Parameters

network_structure
None

nodes_gdf
None

A GeoDataFrame representing nodes. The outputs of calculations will be written to this GeoDataFrame.

od_matrix
None

An OdMatrix mapping (origin, destination) node pairs to trip weights. Build with config.build_od_matrix.

distances
list[int]

Distances corresponding to the local dmaxd_{max} thresholds to be used for calculations.

betas
list[float]

A list of β\beta to be used for the exponential decay function for weighted metrics.

minutes
list[float]

A list of walking times in minutes to be used for calculations.

min_threshold_wt
float

The default min_threshold_wt parameter can be overridden to generate custom mappings between the distance and beta parameters.

speed_m_s
float

The default speed_m_s parameter can be configured to generate custom mappings between walking times and distance thresholds dmaxd_{max}.

Returns

nodes_gdf
GeoDataFrame

The input nodes_gdf parameter is returned with additional betweenness columns.

node_centrality_simplest

node_centrality_simplest
(
network_structure : NetworkStructure
nodes_gdf : geopandas.geodataframe.GeoDataFrame
distances : list[int] | None = None
betas : list[float] | None = None
minutes : list[float] | None = None
compute_closeness : bool = True
compute_betweenness : bool = True
min_threshold_wt : float = 0.01831563888873418
speed_m_s : float = 1.33333
tolerance : float | None = None
angular_scaling_unit : float = 90
farness_scaling_offset : float = 1
random_seed : int | None = None
sample : bool = False
epsilon : float | None = None
)->[ GeoDataFrame ]

Compute node centrality using simplest (angular) paths with a single Dijkstra per source. When both compute_closeness and compute_betweenness are True, a single Brandes-style Dijkstra traversal per source produces the data for both closeness accumulation and betweenness backpropagation.

.. versionchanged:: 4.24.0 Angular routing now uses endpoint-aware dual-graph traversal instead of bearing-based angular costs. This requires a dual graph representation (convert with graphs.nx_to_dual). The tolerance parameter now uses the same relative-percentage semantics as shortest-path betweenness, but applies to angular route cost instead of metric distance. User-facing tolerance=0.0 means no additional tolerance beyond a tiny internal epsilon used for floating-point stability. Closeness values are nearly identical; betweenness values may differ slightly.

Parameters

network_structure
None

nodes_gdf
None

A GeoDataFrame representing nodes. The outputs of calculations will be written to this GeoDataFrame.

distances
list[int]

Distances corresponding to the local dmaxd_{max} thresholds to be used for calculations.

betas
list[float]

A list of β\beta to be used for the exponential decay function for weighted metrics.

minutes
list[float]

A list of walking times in minutes to be used for calculations.

compute_closeness
bool

Compute closeness centralities. True by default.

compute_betweenness
bool

Compute betweenness centralities. True by default.

min_threshold_wt
float

The default min_threshold_wt parameter can be overridden to generate custom mappings between the distance and beta parameters.

speed_m_s
float

The default speed_m_s parameter can be configured to generate custom mappings between walking times and distance thresholds dmaxd_{max}.

tolerance
float

Relative tolerance for angular betweenness path equality, as a percentage (e.g. 1.0 = 1%). Paths whose angular route cost is within this percentage of the best angular route are treated as near-equal for multi-predecessor Brandes betweenness. A tiny internal epsilon is always enforced as a minimum for floating-point stability.

angular_scaling_unit
float

Scaling unit for angular cost normalisation.

farness_scaling_offset
float

Offset for farness calculation.

random_seed
int

Optional seed for reproducible sampling.

sample
bool

If True, uses distance-based sampling. If False, computes exact centrality.

epsilon
float

Normalised additive error tolerance for sampling. Defaults to sampling.HOEFFDING_EPSILON.

Returns

nodes_gdf
GeoDataFrame

The input nodes_gdf parameter is returned with additional centrality columns.

segment_centrality

segment_centrality
(
network_structure : NetworkStructure
nodes_gdf : geopandas.geodataframe.GeoDataFrame
distances : list[int] | None = None
betas : list[float] | None = None
minutes : list[float] | None = None
compute_closeness : bool | None = True
compute_betweenness : bool | None = True
min_threshold_wt : float = 0.01831563888873418
speed_m_s : float = 1.33333
)->[ GeoDataFrame ]

Compute segment-based network centrality using the shortest path heuristic. > Simplest path heuristics introduce conceptual and practical complications and support is deprecated since v4.

For conceptual and practical reasons, segment based centralities are not weighted by node weights.

Parameters

network_structure
None

nodes_gdf
None

A GeoDataFrame representing nodes. Best generated with the io.network_structure_from_nx method. The outputs of calculations will be written to this GeoDataFrame, which is then returned from the method.

distances
list[int]

Distances corresponding to the local dmaxd_{max} thresholds to be used for calculations. The β\beta for distance-weighted metrics will be determined implicitly using min_threshold_wt. If the distances parameter is not provided, then the beta or minutes parameters must be provided instead.

betas
list[float]

A list of β\beta to be used for the exponential decay function for weighted metrics. The dmaxd_{max} thresholds for unweighted metrics will be determined implicitly. If the betas parameter is not provided, then the distances or minutes parameter must be provided instead.

minutes
list[float]

A list of walking times in minutes to be used for calculations. The dmaxd_{max} thresholds for unweighted metrics and β\beta for distance-weighted metrics will be determined implicitly using the speed_m_s and min_threshold_wt parameters. If the minutes parameter is not provided, then the distances or betas parameters must be provided instead.

compute_closeness
bool

Compute closeness centralities. True by default.

compute_betweenness
bool

Compute betweenness centralities. True by default.

min_threshold_wt
float

The default min_threshold_wt parameter can be overridden to generate custom mappings between the distance and beta parameters. See rustalgos.distances_from_beta for more information.

speed_m_s
float

The default speed_m_s parameter can be configured to generate custom mappings between walking times and distance thresholds dmaxd_{max}.

Returns

nodes_gdf
GeoDataFrame

The input node_gdf parameter is returned with additional columns populated with the calcualted metrics.

Notes

Segment path centralities are available with the following keys:

keyformulanotes
seg_density(a,b)edgesdbda\sum_{(a, b)}^{edges}d_{b} - d_{a}A summation of edge lengths.
seg_harmonic(a,b)edgesabln(b)ln(a)\sum_{(a, b)}^{edges}\int_{a}^{b}\ln(b) -\ln(a)A continuous form of harmonic closeness centrality applied to edge lengths.
seg_beta(a,b)edgesabexp(βb)exp(βa)β\sum_{(a, b)}^{edges}\int_{a}^{b}\frac{\exp(-\beta\cdot b) -\exp(-\beta\cdot a)}{-\beta}A continuous form of beta-weighted (gravity index) centrality applied to edge lengths.
seg_betweennessA continuous form of betweenness: Resembles segment_beta applied to edges situated on shortest paths between all nodes jj and kk passing through ii.

closeness_shortest

closeness_shortest
(
network_structure : NetworkStructure
nodes_gdf : geopandas.geodataframe.GeoDataFrame
distances : list[int] | None = None
betas : list[float] | None = None
minutes : list[float] | None = None
min_threshold_wt : float = 0.01831563888873418
speed_m_s : float = 1.33333
tolerance : float | None = None
random_seed : int | None = None
sample : bool = False
epsilon : float | None = None
)->[ GeoDataFrame ]

Compute closeness centrality using shortest paths. Wraps node_centrality_shortest.

closeness_simplest

closeness_simplest
(
network_structure : NetworkStructure
nodes_gdf : geopandas.geodataframe.GeoDataFrame
distances : list[int] | None = None
betas : list[float] | None = None
minutes : list[float] | None = None
min_threshold_wt : float = 0.01831563888873418
speed_m_s : float = 1.33333
tolerance : float | None = None
angular_scaling_unit : float = 90
farness_scaling_offset : float = 1
random_seed : int | None = None
sample : bool = False
epsilon : float | None = None
)->[ GeoDataFrame ]

Compute closeness centrality using simplest (angular) paths. Wraps node_centrality_simplest.

betweenness_shortest

betweenness_shortest
(
network_structure : NetworkStructure
nodes_gdf : geopandas.geodataframe.GeoDataFrame
distances : list[int] | None = None
betas : list[float] | None = None
minutes : list[float] | None = None
min_threshold_wt : float = 0.01831563888873418
speed_m_s : float = 1.33333
tolerance : float | None = None
random_seed : int | None = None
sample : bool = False
epsilon : float | None = None
)->[ GeoDataFrame ]

Compute betweenness centrality using shortest paths. Wraps node_centrality_shortest.

betweenness_simplest

betweenness_simplest
(
network_structure : NetworkStructure
nodes_gdf : geopandas.geodataframe.GeoDataFrame
distances : list[int] | None = None
betas : list[float] | None = None
minutes : list[float] | None = None
min_threshold_wt : float = 0.01831563888873418
speed_m_s : float = 1.33333
tolerance : float | None = None
random_seed : int | None = None
sample : bool = False
epsilon : float | None = None
)->[ GeoDataFrame ]

Compute betweenness centrality using simplest (angular) paths. Wraps node_centrality_simplest.