Guide
cityseer is developed from the ground up to address a particular range of issues that are prevalent in pedestrian-scale urban analysis:
- It uses localised forms of network analysis (as opposed to global forms of analysis) based on network methods applied over the graph through a ‘moving-window’ methodology. The graph is isolated at the specified distance thresholds for a selected node, and the process subsequently repeats for every other node in the network. These thresholds are conventionally based on either crow-flies Euclidean distances or actual network distances (@Cooper2015): cityseer takes the position that network distances are more representative when working at smaller pedestrian distance thresholds, especially when applied to land-use accessibilities and mixed-use calculations;
- It is common to use either shortest-distance or simplest-path (shortest angular ‘distance’) impedance heuristics. When using simplest-path heuristics, it is necessary to modify the underlying shortest-path algorithms to prevent side-stepping of sharp angular turns (@Turner2007); otherwise, two smaller side-steps can be combined to ‘short-cut’ sharp corners. It is also common for methods to be applied to either primal graph representations (generally used with shortest-path methods such as those applied by multiple centrality assessment (@Porta2006) analysis) or dual graph representations (typically used with simplest-path methods in the tradition of space syntax (@Hillier1984));
- There is a range of possible centrality and mixed-use methods, many of which can be weighted by distances or street lengths. These methods and their implications are explored in detail in the localised centrality methods and localised land-use diversity methods papers. Some conventional methods, even if widely used, have not necessarily proved suitable for localised urban analysis;
- Centrality methods are susceptible to topological distortions arising from ‘messy’ graph representations as well as due to the conflation of topological and geometrical properties of street networks. cityseer addresses these through the inclusion of graph cleaning functions; procedures for splitting geometrical properties from topological representations; and the inclusion of segmentised centrality measures, which are less susceptible to distortions introduced by varying intensities of nodes;
- Micro-morphological analysis requires approaches facilitating the evaluation of respective measures at finely-spaced intervals along street fronts. Further, granular evaluation of land-use accessibilities and mixed-uses requires that land uses be assigned to the street network in a contextually precise manner. These are addressed in cityseer by applying network decomposition combined with algorithms incorporating bidirectional assignment of data points to network nodes based on the closest adjacent street edge.
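The ‘moving-window’ idea from the first point can be illustrated with a minimal pure-Python sketch. It is not cityseer's implementation: the function names, the adjacency-dictionary structure, and the toy network are assumptions made for illustration only. Each node in turn becomes the centre of a window, a shortest-path search is cut off at the distance threshold, and a harmonic closeness value is tallied from whatever remains reachable.

```python
import heapq


def distances_within(adj, source, max_dist):
    """Dijkstra search from `source` that ignores anything beyond `max_dist`."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for nbr, length in adj[node].items():
            nd = d + length
            # Only keep neighbours that fall inside the distance window.
            if nd <= max_dist and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(queue, (nd, nbr))
    return dist


def local_harmonic_closeness(adj, max_dist):
    """Repeat the windowed search for every node and sum 1 / distance."""
    result = {}
    for source in adj:
        reachable = distances_within(adj, source, max_dist)
        result[source] = sum(1.0 / d for n, d in reachable.items() if n != source)
    return result


# Hypothetical toy network: edge lengths in metres, stored in both directions.
adj = {
    "a": {"b": 100.0},
    "b": {"a": 100.0, "c": 100.0},
    "c": {"b": 100.0, "d": 300.0},
    "d": {"c": 300.0},
}
closeness = local_harmonic_closeness(adj, max_dist=250.0)
```

With a 250m threshold, node `d` sees no neighbours at all because its only edge is 300m long, whereas a global (unwindowed) measure would still count every other node.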
The broader emphasis on localised methods and how cityseer addresses these is broached in the associated paper. cityseer includes a variety of convenience methods for the general preparation of networks and their conversion into (and out of) the lower-level data structures used by the underlying algorithms. These graph utility methods are designed to work with NetworkX to facilitate ease of use. A complement of code tests has been developed to maintain the codebase’s integrity through general package maintenance and upgrade cycles. Shortest-path algorithms, harmonic closeness, and betweenness algorithms are tested against NetworkX. Mock data and test plots have been used to visually confirm the intended behaviour for divergent simplest and shortest-path heuristics and to test data assignment to network nodes given various scenarios.
CRS
The x and y node attributes determine the spatial coordinates of the node, and should be in a suitable projected (“flat”) coordinate reference system (CRS) in metres. For convenience, nx_wgs_to_utm can be used for converting a NetworkX graph from the WGS84 lng, lat geographic CRS to the local UTM x, y projected CRS. geopandas data points should likewise be in a projected CRS, and the CRS should match that used by the node attributes.
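As a rough illustration of what “local UTM” means, the sketch below derives the standard WGS84 / UTM EPSG code for a lng, lat pair and applies a crude check for coordinates that still look geographic. Both helper names are hypothetical; this does not describe how nx_wgs_to_utm selects its zone, and it ignores the special-case zones around Norway and Svalbard as well as the fact that UTM is only intended for latitudes between roughly 80°S and 84°N.

```python
import math


def utm_epsg_for(lng, lat):
    """Return the WGS84 / UTM EPSG code for a point (hypothetical helper)."""
    if not (-180.0 <= lng <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("expected lng, lat in degrees")
    # UTM zones are 6 degrees wide and numbered 1 to 60 starting from 180 W.
    zone = min(int(math.floor((lng + 180.0) / 6.0)) + 1, 60)
    # EPSG 326xx covers the northern hemisphere, 327xx the southern.
    base = 32600 if lat >= 0 else 32700
    return base + zone


def looks_geographic(x, y):
    """Heuristic only: values inside degree ranges are probably lng, lat."""
    return -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0


london_epsg = utm_epsg_for(-0.1276, 51.5072)   # UTM zone 30 North
sydney_epsg = utm_epsg_for(151.2093, -33.8688)  # UTM zone 56 South
```

A check like `looks_geographic` can be run over node x, y attributes before analysis to catch graphs that were never projected; it will misfire for projected coordinates that happen to sit very close to the CRS origin.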
Edge Rolloff
When calculating network or layer metrics, the network has to be buffered by a distance equal to the maximum distance threshold being considered by the algorithms. This prevents problematic results arising due to edge roll-off effects. For example, if running centrality and/or land-use analysis using distances of 500, 1000, 2000m, then the network must be buffered by 2000m. When using data layers, the data points should — for the same reasons — cover these buffered extents as well.
The live=True node attribute is used for identifying nodes falling within the original non-buffered graph extents, as opposed to the live=False nodes that fall within the surrounding buffered area. The underlying shortest-path algorithms will have access to both live=True and live=False nodes (thus preventing edge roll-off), but derivative metrics are only tabulated for live=True nodes. This eliminates edge roll-off effects, reduces unnecessary computation, and cleanly identifies which nodes are or are not in the buffered roll-off area. If some other post-processing step will be used for filtering the nodes, or if boundary roll-off is not being considered, then set all nodes to live=True.
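The buffering and live-flag logic can be sketched without any library dependencies. The helpers and the rectangular study area below are hypothetical and are not part of the cityseer API; real boundaries are usually polygons, and nodes would normally live on a NetworkX graph rather than in a plain dictionary.

```python
def buffered_extents(bounds, distances):
    """Expand (min_x, min_y, max_x, max_y) by the largest distance threshold."""
    buffer_dist = max(distances)
    min_x, min_y, max_x, max_y = bounds
    return (
        min_x - buffer_dist,
        min_y - buffer_dist,
        max_x + buffer_dist,
        max_y + buffer_dist,
    )


def flag_live_nodes(nodes, bounds):
    """Set live=True for nodes inside the original extents, else live=False."""
    min_x, min_y, max_x, max_y = bounds
    for attrs in nodes.values():
        inside = min_x <= attrs["x"] <= max_x and min_y <= attrs["y"] <= max_y
        attrs["live"] = inside
    return nodes


# Hypothetical study area in projected metres, analysed at 500, 1000 and 2000m.
study_area = (0.0, 0.0, 1000.0, 1000.0)
distances = [500, 1000, 2000]
download_area = buffered_extents(study_area, distances)

nodes = {
    "inside": {"x": 500.0, "y": 500.0},
    "in_buffer": {"x": -1500.0, "y": 200.0},
}
flag_live_nodes(nodes, study_area)
```

The network and any data points would be gathered for `download_area`, while only the nodes flagged live=True would report metrics.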
Graph Cleaning
Good sources of street network data, such as the Ordnance Survey’s OS Open Roads, typically have two distinguishing characteristics:
- The network has been simplified to its essential structure: i.e. unnecessarily complex representations of intersections, on-ramps, divided roadways, etc., have been reduced to a simpler representation concurring more readily with the core topological structure of street networks. Simplified forms of network representation contrast with those focusing on completeness (e.g. for route way-finding, see OS ITN Layer): the latter introduce unnecessary complexity that hinders rather than helps shortest-path algorithms in the sense used by pedestrian centrality measures.
- The topology of the network is kept distinct from the geometry of the streets. Oftentimes, as can be seen with OpenStreetMap, additional nodes are added to streets to represent geometric twists and turns along a roadway. These additional nodes cause topological distortions that impact network centrality measures.
When a high-quality source is available, it is best not to attempt additional clean-up unless there is a particular reason. On the other hand, many indispensable sources of network information, notably OpenStreetMap data, can be particularly messy for network analysis purposes.
cityseer uses customisable graph cleaning methods that reduce topological idiosyncrasies which may otherwise confound centrality measures. It can, for example, remove dual carriageways while merging nodes and roadways in a manner that is as ‘tidy’ as possible. These are demonstrated in the Graph Cleaning guide.
There are still situations where it can be difficult to automate network cleaning, especially when using the same workflows across different cities. Each city tends to have its own quirks, and for this reason it can be preferable to retain the unsimplified representation but to control for algorithmic artefacts encountered for centrality computations by using other methods. These are demonstrated in the OSM Strategies guide.
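To make the distinction between topology and geometry concrete, the following sketch collapses nodes that exist only to trace bends in a street (nodes with exactly two neighbours) while preserving the total street length. It is a deliberately simplified stand-in and not cityseer's cleaning routine: the function name and adjacency-dictionary structure are assumptions, the bend geometry itself is discarded rather than stored on the merged edge, and cases that would require parallel edges are skipped because a plain dictionary cannot represent them.

```python
def remove_filler_nodes(adj):
    """Collapse degree-2 nodes, summing the lengths of the two edges they join."""
    adj = {node: dict(nbrs) for node, nbrs in adj.items()}  # work on a copy
    for node in list(adj):
        nbrs = adj[node]
        if len(nbrs) != 2:
            continue  # dead-ends and true intersections are kept
        (a, len_a), (b, len_b) = nbrs.items()
        if b in adj[a]:
            continue  # merging would need a parallel edge, so leave it alone
        # Drop the filler node and join its two neighbours directly.
        del adj[a][node]
        del adj[b][node]
        del adj[node]
        adj[a][b] = len_a + len_b
        adj[b][a] = len_a + len_b
    return adj


# Hypothetical street between two endpoints, drawn with two bend nodes.
raw = {
    "j1": {"p": 40.0},
    "p": {"j1": 40.0, "q": 30.0},
    "q": {"p": 30.0, "j2": 50.0},
    "j2": {"q": 50.0},
}
cleaned = remove_filler_nodes(raw)
```

The raw graph reports four nodes for what is topologically a single 120m street; after collapsing, node-based counts no longer depend on how many bends the street was drawn with.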
OSM and NetworkX
cityseer is intended to be data-source agnostic, and is predominantly used in concert with Postgres/PostGIS databases or with OSM data. OSM queries can be used to populate cityseer graphs directly, else OSMnx can be used to gather OSM data which can then be converted into cityseer graphs, an example of which is provided in the code snippet beneath.
cityseer uses NetworkX primarily as an in-out and graph preparation tool for end-user ease of use, not as a means for algorithmic analysis. It avoids NetworkX for algorithmic analysis for two reasons. First, the algorithms employed in cityseer are intended for localised (windowed) graph analysis specifically within an urban analysis context: they use explicit distance thresholds; engage unique variants of centrality measures; handle cases such as simplest-path heuristics and segmentised forms of analysis; and extend these algorithms to handle the derivation of land-use accessibilities, mixed-uses, and statistical aggregations using similarly windowed and network-distance-weighted methods. Second, NetworkX scales very poorly to larger graphs, and can become unusable for large cities or large distance thresholds due to poor performance.
The following points may be helpful when using OSMnx and cityseer together:
- OSMnx prepared graphs can be converted to cityseer compatible graphs by using the tools.io.nx_from_osm_nx method. In doing so, keep the following in mind:
  - OSMnx uses NetworkX MultiDiGraph graph structures that use directional edges. As such, it can be used for understanding vehicular routing, i.e. where one-way routes can have a major impact on the available shortest routes. cityseer is only concerned with pedestrian networks and therefore uses NetworkX MultiGraphs on the premise that pedestrian networks are not ordinarily directional. When using the tools.io.nx_from_osm_nx method, be cognisant that all directional information will be discarded.
  - cityseer graph simplification and consolidation workflows will give different results to those employed in OSMnx. If you’re using OSMnx to ingest networks from OSM but wish to simplify and consolidate the network as part of a cityseer workflow, set the OSMnx simplify argument to False so that the network is not automatically simplified.
  - cityseer uses internal validation workflows to check that the geometries associated with an edge remain connected to the coordinates of the nodes on either side. If performing graph manipulation outside of cityseer before conversion, the conversion function may complain of disconnected geometries. In these cases, you may need to relax the tolerance parameter used for error checking upon conversion to a cityseer MultiGraph, in which case geometries disconnected from their end-nodes (within the tolerance parameter) will be “snapped” to meet their endpoints as part of the conversion process.
See the OSM - cityseer guide.
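The tolerance-based “snapping” described above can be pictured with a small standalone sketch. The helper below is hypothetical and does not reproduce cityseer's internal validation; it only shows the general idea that an edge geometry whose end vertices sit within the tolerance of their nodes is pulled onto them, while a larger gap is reported as an error.

```python
import math


def snap_endpoints(coords, start_xy, end_xy, tolerance):
    """Move the first and last vertices onto the node positions if close enough."""
    fixed = list(coords)
    for idx, node_xy in ((0, start_xy), (-1, end_xy)):
        gap = math.dist(fixed[idx], node_xy)
        if gap > tolerance:
            # Outside the tolerance: treat the geometry as genuinely disconnected.
            raise ValueError(
                f"geometry is {gap:.4f} from its node, beyond tolerance {tolerance}"
            )
        fixed[idx] = tuple(node_xy)
    return fixed


# Hypothetical edge geometry with sub-millimetre drift at both ends.
geom = [(0.0004, 0.0), (5.0, 5.0), (10.0, 10.0003)]
snapped = snap_endpoints(geom, (0.0, 0.0), (10.0, 10.0), tolerance=0.001)
```

Relaxing the tolerance trades strictness for convenience: a larger value accepts more drift introduced by external processing, but it can also hide geometries that were attached to the wrong nodes.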
GeoPandas
GeoPandas adds support for spatial features and related operations to Pandas dataframes. However, dataframes can be slow for purposes of iteratively adding and removing rows, for which reason it is preferable to use NetworkX graphs for the graph cleaning and preparation stage of analysis. After graph preparation steps, cityseer uses GeoDataFrame structures for data state. The tools.io.nx_from_generic_geopandas method can be used to convert GeoPandas LineString DataFrames (such as those used by momepy) to and from cityseer compatible graphs.
Optimised packages
Computational methods for network-based analysis rely extensively on shortest-path algorithms: these present substantial computational complexity due to nested loops. For this reason, methods implemented in pure Python, such as NetworkX or packages that depend on it for algorithmic analysis, can be prohibitively slow. Speed improvements can be found by running intensive algorithms with packages such as Graph-Tool or igraph, which wrap optimised libraries implemented in more performant, lower-level languages such as C++. However, off-the-shelf network analysis packages are not ideal for application to urbanism; they do not ordinarily cater for localised distance thresholds, specialised centrality methods, shortest vs simplest-path heuristics, or calculation of land-use accessibilities and mixed-uses.
cityseer evolved as a work-in-progress package over a number of years, initially used for experimentation and comparative tests of centrality methods and land-use methods during a PhD. Initially, computationally intensive algorithms were wrapped in numba for the sake of performant JIT compilation and parallelisation. The use of numba made it feasible to scale these methods to large and, optionally, decomposed networks with significant numbers of nodes. More recent versions of cityseer have moved the underlying algorithms from numba to Rust, which offers yet better performance, doesn’t require a compilation step when running a function for the first time, and affords the use of more elegant design patterns.