atra package

OI-Analytics utility package for Argentina project

Subpackages

Submodules

atra.adaptation_options module

Estimate costs and benefits under fixed parameters, varying the cost components, durations of disruptions, and GDP growth rates
calc_benefits_and_bcr(x, discount_rates, discount_growth_rates, duration_max=10, min_loss=True, mode='road')[source]

Estimate the total cost and benefits for a road segment. This function is intended to be called row by row through a pandas apply.

Parameters:
  • x – a row from the road segment dataframe that we are considering
  • param_values – numpy array with a set of parameter combinations
  • mnt_dis_cost – adaptation costs for a district road in the mountains
  • mnt_nat_cost – adaptation costs for a national road in the mountains
  • cst_dis_cost – adaptation costs for a district road on flat terrain
  • cst_nat_cost – adaptation costs for a national road on flat terrain
  • pavement – set of paving combinations. This corresponds with the cost table and the param_values
  • mnt_main_cost – maintenance costs for roads in the mountains
  • cst_main_cost – maintenance costs for roads on flat terrain
  • discount_rates – discount rates to be used for the costs
  • discount_growth_rates – discount rates to be used for the losses
  • rehab_costs – rehabilitation costs after a disaster
  • min_main_dr – discount rates for 4-year periodic maintenance
  • max_main_dr – discount rates for 8-year periodic maintenance
  • min_exp (bool, optional) – Specify whether we want to use the minimum or maximum exposure length. The default value is set to True
  • national (bool, optional) – Specify whether we are looking at national roads. The default value is set to False
  • min_loss (bool, optional) – Specify whether we want to use the minimum or maximum economic losses. The default value is set to True
Returns:

  • uncer_output (list) – outcomes for the initial adaptation costs of this road segment
  • tot_uncer_output (list) – outcomes for the total adaptation costs of this road segment
  • rel_share (list) – relative share of each factor in the initial adaptation cost of this road segment
  • tot_rel_share (list) – relative share of each factor in the total adaptation cost of this road segment
  • bc_ratio (list) – benefit cost ratios for this road segment
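
As a rough illustration of how a benefit-cost ratio such as bc_ratio can be formed, the sketch below discounts a yearly stream of avoided losses and a yearly stream of costs, then divides one total by the other. The helper name benefit_cost_ratio and its arguments are hypothetical and are not part of atra; the package's own calculation may differ.

```python
def benefit_cost_ratio(avoided_losses, costs, discount_factors):
    """Hypothetical helper: discounted benefits divided by discounted costs."""
    if not (len(avoided_losses) == len(costs) == len(discount_factors)):
        raise ValueError("all inputs must cover the same years")
    # Discount each yearly value, then sum over the whole period.
    benefits = sum(b * d for b, d in zip(avoided_losses, discount_factors))
    total_cost = sum(c * d for c, d in zip(costs, discount_factors))
    if total_cost == 0:
        raise ValueError("total discounted cost is zero")
    return benefits / total_cost


# Two years of values with illustrative discount factors.
factors = [1.0, 0.5]
bcr = benefit_cost_ratio([100.0, 100.0], [50.0, 50.0], factors)
```

A ratio above 1 means the discounted benefits exceed the discounted costs for that segment.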

calc_costs(x, cst_2L_asphalt, cst_2L_concrete, cst_4L_concrete, cst_rehab, cst_routine, cst_periodic, discount_rates, min_main_dr, max_main_dr, mode='road')[source]

Estimate the total cost and benefits for a road segment. This function is intended to be called row by row through a pandas apply.

Parameters:
  • x – a row from the road segment dataframe that we are considering
  • param_values – numpy array with a set of parameter combinations
  • mnt_dis_cost – adaptation costs for a district road in the mountains
  • mnt_nat_cost – adaptation costs for a national road in the mountains
  • cst_dis_cost – adaptation costs for a district road on flat terrain
  • cst_nat_cost – adaptation costs for a national road on flat terrain
  • pavement – set of paving combinations. This corresponds with the cost table and the param_values
  • mnt_main_cost – maintenance costs for roads in the mountains
  • cst_main_cost – maintenance costs for roads on flat terrain
  • discount_rates – discount rates to be used for the costs
  • discount_growth_rates – discount rates to be used for the losses
  • rehab_costs – rehabilitation costs after a disaster
  • min_main_dr – discount rates for 4-year periodic maintenance
  • max_main_dr – discount rates for 8-year periodic maintenance
  • min_exp (bool, optional) – Specify whether we want to use the minimum or maximum exposure length. The default value is set to True
  • national (bool, optional) – Specify whether we are looking at national roads. The default value is set to False
  • min_loss (bool, optional) – Specify whether we want to use the minimum or maximum economic losses. The default value is set to True
Returns:

  • uncer_output (list) – outcomes for the initial adaptation costs of this road segment
  • tot_uncer_output (list) – outcomes for the total adaptation costs of this road segment
  • rel_share (list) – relative share of each factor in the initial adaptation cost of this road segment
  • tot_rel_share (list) – relative share of each factor in the total adaptation cost of this road segment
  • bc_ratio (list) – benefit cost ratios for this road segment

calculate_discounting_arrays(discount_rate=12, growth_rate=2.7, start_year=2016, end_year=2050, min_period=4, max_period=8)[source]

Set discount rates for yearly and periodic maintenance costs

Parameters:
  • discount_rate – yearly discount rate
  • growth_rate – yearly growth rate
Returns:

  • discount_rate_norm – discount rates to be used for the costs
  • discount_rate_growth – discount rates to be used for the losses
  • min_main_dr – discount rates for 4-year periodic maintenance
  • max_main_dr – discount rates for 8-year periodic maintenance
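
The four returned arrays can be pictured with a minimal sketch, assuming standard compound discounting with rates given in percent and the end year included. The helper names below are hypothetical, and the actual arrays built by calculate_discounting_arrays may be defined differently.

```python
def yearly_discount_factors(discount_rate=12.0, start_year=2016, end_year=2050):
    """Hypothetical: one factor 1 / (1 + r)^t per year, r given in percent."""
    rate = discount_rate / 100.0
    return [1.0 / (1.0 + rate) ** (year - start_year)
            for year in range(start_year, end_year + 1)]


def growth_adjusted_factors(discount_rate=12.0, growth_rate=2.7,
                            start_year=2016, end_year=2050):
    """Hypothetical: losses grow with GDP growth while being discounted."""
    rate = discount_rate / 100.0
    growth = growth_rate / 100.0
    return [((1.0 + growth) / (1.0 + rate)) ** (year - start_year)
            for year in range(start_year, end_year + 1)]


def periodic_factors(factors, period):
    """Hypothetical: keep the factor of every `period`-th year only."""
    return factors[::period]


costs = yearly_discount_factors(discount_rate=10.0, start_year=2016, end_year=2018)
losses = growth_adjusted_factors(10.0, 10.0, 2016, 2018)
every_second = periodic_factors(costs, 2)
```

With equal growth and discount rates the growth-adjusted factors stay at 1, which is a quick sanity check on the formula.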

get_adaptation_options_costs(file_id, data_path, output_path, results_type, discount_rate=10, start_year=2016, end_year=2050, min_period=4, max_period=8, read_from_file=False)[source]
run_adaptation_calculation(roads, file_id, output_path, file_id_col, results_type_index_col, results_type, duration_max=10, discount_rate=10, growth_rate=2.8, start_year=2016, end_year=2050, min_period=4, max_period=8, read_from_file=False)[source]

atra.network module

Network representation and utilities

class Network(nodes=None, edges=None)[source]

Bases: object

A Network is composed of nodes (points in space) and edges (lines)

Parameters:
  • nodes (geopandas.geodataframe.GeoDataFrame, optional) –
  • edges (geopandas.geodataframe.GeoDataFrame, optional) –
nodes
Type: geopandas.geodataframe.GeoDataFrame
edges
Type: geopandas.geodataframe.GeoDataFrame
set_crs(crs=None, epsg=None)[source]

Set network (node and edge) crs

Parameters:
  • crs (dict or str) – Projection parameters as PROJ4 string or in dictionary form.
  • epsg (int) – EPSG code specifying output projection
to_crs(crs=None, epsg=None)[source]

Reproject network (node and edge) geometries to a new crs

Parameters:
  • crs (dict or str) – Projection parameters as PROJ4 string or in dictionary form.
  • epsg (int) – EPSG code specifying output projection
add_endpoints(network)[source]

Add nodes at line endpoints

add_ids(network, id_col='id', edge_prefix='edge', node_prefix='node', update=False)[source]

Add an id column with ascending ids

add_topology(network, id_col='id', update=False)[source]

Add from_id, to_id to edges

add_vertex(line, point)[source]

Add a vertex to a line at a point

concat_dedup(dfs)[source]

Concatenate a list of GeoDataFrames, dropping duplicate geometries - note: repeatedly drops indexes for deduplication to work

d_within(geom, gdf, distance)[source]

Find the subset of a GeoDataFrame within some distance of a shapely geometry

drop_duplicate_geometries(gdf, keep='first')[source]

Drop duplicate geometries from a dataframe

edges_within(point, edges, distance)[source]

Find edges within a distance of point

geometry_column_name(gdf)[source]

Get geometry column name, fall back to ‘geometry’

get_endpoints(network)[source]

Get nodes for each edge endpoint

intersects(geom, gdf, tolerance=1e-09)[source]

Find the subset of a GeoDataFrame intersecting with a shapely geometry

line_endpoints(line)[source]

Return points at first and last vertex of a line

Link nodes to all edges within some distance

Link nodes to all edges within some distance

matching_gdf_from_geoms(gdf, geoms)[source]

Create a geometry-only GeoDataFrame with column name to match an existing GeoDataFrame

merge_edges(network)[source]

Merge edges that share a node with a connectivity degree of 2
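
To make "connectivity degree of 2" concrete, the sketch below counts how many edge ends meet at each node, using plain (from_id, to_id) pairs instead of a GeoDataFrame. The function names are hypothetical and only illustrate the idea; they are not the module's implementation.

```python
from collections import Counter


def node_degrees(edges):
    """Count how many edge ends touch each node id.

    `edges` is a list of (from_id, to_id) pairs.
    """
    degrees = Counter()
    for from_id, to_id in edges:
        degrees[from_id] += 1
        degrees[to_id] += 1
    return degrees


def degree_two_nodes(edges):
    """Nodes where exactly two edge ends meet, i.e. candidates for merging."""
    return sorted(node for node, degree in node_degrees(edges).items()
                  if degree == 2)


# a - b - c, with c branching to d and e
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e")]
mergeable = degree_two_nodes(edges)
```

Here only node b joins exactly two edges, so the edges a-b and b-c could be merged into one without changing the network's connectivity.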

merge_multilinestring(geom)[source]

Merge a MultiLineString to LineString

nearest(geom, gdf)[source]

Find the element of a GeoDataFrame nearest a shapely geometry

nearest_edge(point, edges)[source]

Find nearest edge to a point

nearest_node(point, nodes)[source]

Find nearest node to a point

nearest_point_on_edges(point, edges)[source]

Find nearest point on edges to a point

nearest_point_on_line(point, line)[source]

Return the nearest point on a line

nearest_vertex_idx_on_line(point, line)[source]

Return the index of nearest vertex to a point on a line
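
The nearest-point and nearest-vertex helpers above can be sketched in plain Python on coordinate tuples. This is an illustrative planar version with hypothetical function names; the module itself works on shapely geometries and may compute these differently.

```python
import math


def nearest_point_on_segment(point, start, end):
    """Project `point` onto the segment start-end, clamped to its ends."""
    px, py = point
    (x1, y1), (x2, y2) = start, end
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return (x1, y1)  # degenerate segment
    t = ((px - x1) * dx + (py - y1) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (x1 + t * dx, y1 + t * dy)


def nearest_point_on_polyline(point, vertices):
    """Closest point over all consecutive segments of a polyline."""
    candidates = [nearest_point_on_segment(point, a, b)
                  for a, b in zip(vertices, vertices[1:])]
    return min(candidates,
               key=lambda c: math.hypot(c[0] - point[0], c[1] - point[1]))


def nearest_vertex_index(point, vertices):
    """Index of the vertex closest to `point` (first one wins on ties)."""
    return min(range(len(vertices)),
               key=lambda i: math.hypot(vertices[i][0] - point[0],
                                        vertices[i][1] - point[1]))


line = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
point = (4.0, 3.0)
on_line = nearest_point_on_polyline(point, line)
idx = nearest_vertex_index(point, line)
```

The nearest point generally lies between vertices, which is why snapping a node to an edge may require inserting a new vertex.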

node_connectivity_degree(node, network)[source]
nodes_intersecting(line, nodes, tolerance=1e-09)[source]

Find nodes intersecting line

round_geometries(network, precision=3)[source]

Round coordinates of all node points and vertices of edge linestrings to some precision
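
A minimal sketch of coordinate rounding on plain tuples, assuming that precision means decimal places. The helper round_coords is hypothetical; the documented function operates on the network's node and edge geometries.

```python
def round_coords(coords, precision=3):
    """Round every coordinate value to `precision` decimal places."""
    return [tuple(round(value, precision) for value in xy) for xy in coords]


line = [(-58.381592, -34.603722), (-58.4, -34.61)]
rounded = round_coords(line)
```

Rounding makes nearly coincident endpoints compare equal, which helps when deduplicating nodes or matching edge ends.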

set_precision(geom, precision)[source]

Set geometry precision

snap_line(line, points, tolerance=1e-09)[source]

Snap a line to points within tolerance, inserting vertices as necessary

snap_nodes(network, threshold=None)[source]

Move nodes (within threshold) to edges

split_edge_at_points(edge, points, tolerance=1e-09)[source]

Split edge at point/multipoint

split_edges_at_nodes(network, tolerance=1e-09)[source]

Split network edges where they intersect node geometries

split_line(line, points, tolerance=1e-09)[source]

Split line at point or multipoint, within some tolerance

split_multilinestrings(network)[source]

Create multiple edges from any MultiLineString edge

Ensures that edge geometries are all LineStrings and duplicates attributes over any created multi-edges.

atra.transport_flow_and_failure_functions module

Functions used in the provincial and national-scale network failure analysis

add_dataframe_generalised_costs(G, vehicle_numbers, tonnage)[source]
add_igraph_generalised_costs(G, vehicle_numbers, tonnage)[source]
change_depth_string_to_number(x)[source]
combine_hazards_and_network_attributes_and_impacts(hazard_dataframe, network_dataframe, network_id_column)[source]
correct_exposures(x, length_thr)[source]
create_hazard_scenarios_for_adaptation(all_edge_fail_scenarios, index_cols, length_thr)[source]
edge_failure_sampling(failure_scenarios, edge_column)[source]

Criteria for selecting failure samples

Parameters:
  • failure_scenarios – Pandas DataFrame of failure scenarios
  • edge_column – String name of column to select failed edge ID's
Returns:

edge_failure_samples – List of lists of failed edge sets

get_flow_paths_indexes_of_edges(flow_dataframe, path_criteria)[source]
igraph_scenario_edge_failures_new(network_df_in, edge_failure_set, flow_dataframe, edge_flow_path_indexes, path_criteria, tons_criteria, cost_criteria, time_criteria, transport_mode, new_path=True)[source]

Estimate the network impacts of each failure when the tariff costs of each path are fixed by vehicle weight

Parameters:
  • network_df_in – Pandas DataFrame of network
  • edge_failure_set – List of string edge ID's
  • flow_dataframe – Pandas DataFrame of list of edge paths
  • path_criteria – String name of column of edge paths in flow dataframe
  • tons_criteria – String name of column of path tons in flow dataframe
  • cost_criteria – String name of column of path costs in flow dataframe
  • time_criteria – String name of column of path travel time in flow dataframe
Returns:

edge_failure_dictionary – With attributes:
  • edge_id – String name or list of failed edges
  • origin – String node ID of Origin of disrupted OD flow
  • destination – String node ID of Destination of disrupted OD flow
  • no_access – Boolean 1 (no rerouting) or 0 (rerouting)
  • new_cost – Float value of estimated cost of OD journey after disruption
  • new_distance – Float value of estimated distance of OD journey after disruption
  • new_path – List of string edge ID’s of estimated new route of OD journey after disruption
  • new_time – Float value of estimated time of OD journey after disruption

Return type:

list[dict]

merge_failure_results(flow_df_select, failure_df, id_col, tons_col, dist_col, time_col, cost_col)[source]

Merge failure results with flow results

Parameters:
  • flow_df_select (pandas.DataFrame) – edge flow values
  • failure_df (pandas.DataFrame) – edge failure values
  • tons_col (str) – name of column of tonnages in flow dataframe
  • dist_col (str) – name of column of distance in flow dataframe
  • time_col (str) – name of column of time in flow dataframe
  • cost_col (str) – name of column of cost in flow dataframe
  • vehicle_col (str) – name of column of vehicle counts in flow dataframe
  • changing_tonnages (bool) –
Returns:

flow_df_select – Of edge flow and failure values merged

Return type:

pandas.DataFrame

network_failure_assembly_shapefiles(edge_failure_dataframe, gdf_edges, save_edges=True, shape_output_path='')[source]

Write results to Shapefiles

Outputs gdf_edges - a Shapefile with results of edge failure dataframe

Parameters:
  • edge_failure_dataframe – Pandas DataFrame of edge failure results
  • gdf_edges – GeoDataFrame of network edge set with edge ID’s and geometry
  • save_edges (bool) – Boolean condition to tell code to save created edge shapefile
  • shape_output_path (str) – Path where the output shapefile will be stored
network_od_path_estimations(graph, source, target, cost_criteria, time_criteria)[source]

Estimate the paths, distances, times, and costs for given OD pair

Parameters:
  • graph – igraph network structure
  • source – String/Float/Integer name of Origin node ID
  • target – String/Float/Integer name of Destination node ID
  • tonnage (float) – value of tonnage
  • vehicle_weight (float) – unit weight of vehicle
  • cost_criteria (str) – name of generalised cost criteria to be used: min_gcost or max_gcost
  • time_criteria (str) – name of time criteria to be used: min_time or max_time
  • fixed_cost (bool) –
Returns:

  • edge_path_list (list[list]) – nested lists of Strings/Floats/Integers of edge ID’s in routes
  • path_dist_list (list[float]) – estimated distances of routes
  • path_time_list (list[float]) – estimated times of routes
  • path_gcost_list (list[float]) – estimated generalised costs of routes

rearrange_minmax_values(edge_failure_dataframe)[source]

Rearrange min-max columns so that minimum values do not exceed maximum values

Parameters:
  • edge_failure_dataframe (pandas.DataFrame) – with min-max columns
Returns:

edge_failure_dataframe – With columns where min < max

Return type:

pandas.DataFrame

spatial_scenario_selection(network_shapefile, polygon_dataframe, hazard_dictionary, data_dictionary, network_id_column, network_type='nodes')[source]

Intersect network edges/nodes and boundary Polygons to collect boundary and hazard attributes

Parameters:
  • network_shapefile – Shapefile of edge LineStrings or node Points
  • polygon_shapefile – Shapefile of boundary Polygons
  • hazard_dictionary – Dictionary of hazard attributes
  • data_dictionary – Dictionary of network-hazard-boundary intersection attributes
  • network_type – String value ‘edges’ or ‘nodes’. Default = ‘nodes’
  • name_province – String name of province if needed. Default = ‘’
Returns:

data_dictionary – Dictionary of network-hazard-boundary intersection attributes:
  • edge_id/node_id – String name of intersecting edge ID or node ID
  • length – Float length of intersection of edge LineString and hazard Polygon (only for edges)
  • province_id – String/Integer ID of Province
  • province_name – String name of Province in English
  • district_id – String/Integer ID of District
  • district_name – String name of District in English
  • commune_id – String/Integer ID of Commune
  • commune_name – String name of Commune in English
  • hazard_attributes – Dictionary of all attributes from hazard dictionary

swap_min_max(x, min_col, max_col)[source]

Swap columns if necessary
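
The swap can be sketched on a plain dictionary standing in for a dataframe row. The function swap_if_needed and the column names are hypothetical; the documented function is presumably applied to pandas rows and may differ in detail.

```python
def swap_if_needed(row, min_col, max_col):
    """Return a copy of `row` with the two values ordered so that min <= max."""
    fixed = dict(row)
    if fixed[min_col] > fixed[max_col]:
        fixed[min_col], fixed[max_col] = fixed[max_col], fixed[min_col]
    return fixed


row = {"edge_id": "edge_1", "min_tons": 120.0, "max_tons": 80.0}
fixed = swap_if_needed(row, "min_tons", "max_tons")
```

Working on a copy keeps the input row unchanged, which makes the helper safe to reuse across several min-max column pairs.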

write_flow_paths_to_network_files(save_paths_df, min_industry_columns, max_industry_columns, gdf_edges, save_csv=True, save_shapes=True, shape_output_path='', csv_output_path='')[source]

Write results to Shapefiles

Outputs gdf_edges - a shapefile with minimum and maximum tonnage flows of all commodities/industries for each edge of network.

Parameters:
  • save_paths_df – Pandas DataFrame of OD flow paths and their tonnages
  • industry_columns – List of string names of all OD commodities/industries identified
  • min_max_exist – List of string names of commodity/industry columns for which min-max tonnage column names already exist
  • gdf_edges – GeoDataFrame of network edge set
  • save_csv – Boolean condition to tell code to save created edge csv file
  • save_shapes – Boolean condition to tell code to save created edge shapefile
  • shape_output_path – Path where the output shapefile will be stored
  • csv_output_path – Path where the output csv file will be stored

atra.utils module

Shared plotting functions

class Style

Bases: tuple

Style(color, zindex, label): class to hold an element’s styles

Used to generate legend entries, apply uniform style to groups of map elements (See network_map.py for example.)

color

Alias for field number 0

label

Alias for field number 2

zindex

Alias for field number 1

assign_value_in_area_proportions(poly_1_gpd, poly_2_gpd, poly_attribute)[source]
assign_value_in_area_proportions_within_common_region(poly_1_gpd, poly_2_gpd, poly_attribute, common_region_id)[source]
count_points_in_polygon(x, points_sindex)[source]

Count points in a polygon

Parameters:
  • x – row of dataframe
  • points_sindex – spatial index of dataframe with points in the region to consider
Returns:

Number of points in polygon

extract_gdf_values_containing_nodes(x, sindex_input_gdf, input_gdf, column_name)[source]
extract_nodes_within_gdf(x, input_nodes, column_name)[source]
extract_value_from_gdf(row, gdf_sindex, gdf, column_name)[source]

Parameters:
  • row – row of dataframe
  • gdf_sindex – spatial index of dataframe of which we want to extract the value
  • gdf – GeoDataFrame of which we want to extract the value
  • column_name – column that contains the value we want to extract
Returns:

Extracted value from other gdf

gdf_clip(shape_in, clip_geom)[source]

Parameters:
  • shape_in – path string to shapefile to be clipped
Returns:

province_geom – shapely geometry of province for which we do the calculation

gdf_geom_clip(gdf_in, clip_geom)[source]

Filter a dataframe to contain only features within a clipping geometry

Parameters:
  • gdf_in – GeoDataFrame to be clipped
  • province_geom – shapely geometry of province for which we do the calculation
Returns:

Filtered dataframe

generate_weight_bins(weights, n_steps=9, width_step=0.01, interpolation='linear')[source]

Given a list of weight values, generate <n_steps> bins with a width value to use for plotting e.g. weighted network flow maps.

get_axes(extent=(-74.04, -52.9, -20.29, -57.38), epsg=None)[source]

Get map axes

Default to Argentina extent // Lambert Conformal projection

get_data(filename)[source]

Read in data (as array) and extent of each raster

get_nearest_node(x, sindex_input_nodes, input_nodes, id_column)[source]

Get nearest node in a dataframe

Parameters:
  • x – row of dataframe
  • sindex_nodes – spatial index of dataframe of nodes in the network
  • nodes – dataframe of nodes in the network
  • id_column – name of column of id of closest node
Returns:

Nearest node to geometry of row

get_nearest_node_within_region(x, input_nodes, id_column, region_id)[source]
legend_from_style_spec(ax, styles, loc='lower left')[source]

Plot legend

line_length(line, ellipsoid='WGS-84')[source]

Length of a line in kilometers, given in geographic coordinates.

Adapted from https://gis.stackexchange.com/questions/4022/looking-for-a-pythonic-way-to-calculate-the-length-of-a-wkt-linestring#answer-115285

Returns:

Length of line in kilometers.
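
For intuition, the sketch below measures a line given as (longitude, latitude) pairs with the haversine great-circle formula. This is a swapped-in spherical approximation, not the ellipsoid-based calculation the ellipsoid parameter implies, so results will differ slightly. All names here are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def line_length_km(coords):
    """Sum the distances between consecutive vertices of a line."""
    return sum(haversine_km(a, b) for a, b in zip(coords, coords[1:]))


# One degree of latitude along a meridian, roughly 111 km.
length = line_length_km([(-58.0, -34.0), (-58.0, -35.0)])
```

Summing over consecutive vertex pairs is the same pairwise pattern that any per-segment distance function can plug into.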

load_config()[source]

Read config.json

load_labels(data_path, include_regions)[source]
plot_basemap(ax, data_path, focus='ARG', neighbours=('CHL', 'BOL', 'PRY', 'BRA', 'URY'), country_border='white', plot_regions=True)[source]

Plot countries and regions background

plot_basemap_labels(ax, data_path, labels=None, include_regions=False, include_zorder=2)[source]

Plot countries and regions labels

round_sf(x, places=1)[source]

Round number to significant figures
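
One common way to round to significant figures is to shift the rounding position by the number's order of magnitude. The sketch below uses a hypothetical helper round_to_sf and is not necessarily how round_sf is implemented.

```python
import math


def round_to_sf(x, places=1):
    """Round `x` to `places` significant figures."""
    if x == 0:
        return 0
    # Order of magnitude of the leading digit, e.g. 3 for 1234.5.
    magnitude = math.floor(math.log10(abs(x)))
    return round(x, places - 1 - magnitude)


large = round_to_sf(1234.5, 2)
small = round_to_sf(0.04567, 2)
```

Here 1234.5 becomes 1200.0 and 0.04567 becomes 0.046, which is useful for tidy legend labels.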

save_fig(output_filename)[source]
scale_bar(ax, length=100, location=(0.5, 0.05), linewidth=3)[source]

Draw a scale bar

Adapted from https://stackoverflow.com/questions/32333870/how-can-i-show-a-km-ruler-on-a-cartopy-matplotlib-plot/35705477#35705477

Parameters:
  • ax (axes) –
  • length (int) – length of the scalebar in km.
  • location (tuple) – center of the scalebar in axis coordinates (i.e. 0.5 is the middle of the plot)
  • linewidth (float) – thickness of the scalebar.
set_ax_bg(ax, color='#c6e0ff')[source]

Set axis background color

transform_geo_file(source_file, sink_file, sink_schema, transform_record)[source]

Transform a fiona-readable file

Parameters:
  • source_file (str) – source file path
  • sink_file (str) – destination file path
  • sink_schema (dict) – fiona schema for output
  • transform_record (function) – function that accepts a fiona record and returns a fiona record or None
voronoi_finite_polygons_2d(vor, radius=None)[source]

Reconstruct infinite voronoi regions in a 2D diagram to finite regions.

Source: https://stackoverflow.com/questions/36063533/clipping-a-voronoi-diagram-python

Parameters:
  • vor (Voronoi) – Input diagram
  • radius (float, optional) – Distance to ‘points at infinity’
Returns:

  • regions (list of tuples) – Indices of vertices in each revised Voronoi region.
  • vertices (list of tuples) – Coordinates for revised Voronoi vertices. Same as coordinates of input vertices, with ‘points at infinity’ appended to the end

within_extent(x, y, extent)[source]

Test x, y coordinates against (xmin, xmax, ymin, ymax) extent
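
A minimal sketch of such a bounds test, assuming inclusive boundaries and an extent ordered as (xmin, xmax, ymin, ymax). The function name point_within_extent and the sample coordinates are illustrative only.

```python
def point_within_extent(x, y, extent):
    """True if (x, y) lies inside the (xmin, xmax, ymin, ymax) box."""
    xmin, xmax, ymin, ymax = extent
    return xmin <= x <= xmax and ymin <= y <= ymax


# Illustrative longitude/latitude box and two test points.
extent = (-74.04, -52.9, -57.38, -20.29)
inside = point_within_extent(-58.38, -34.60, extent)
outside = point_within_extent(0.0, 0.0, extent)
```

A test of this shape relies on the minimum preceding the maximum on each axis, so an extent listing the northern latitude first would need reordering before use.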