Content
This document is meant as a tutorial to assist in the process of creating a river network object from widely available geodatasets. The network-based description of river systems greatly facilitates the computation of indices to assess riverscape connectivity and fragmentation. In this tutorial, the package riverconn is used for the calculation of such indices. You can learn more on the package functionalities and architecture by reading the Baldan et al. (2022) paper. If you use this tutorial, do not forget to cite this paper.
1. Introduction
This document is a step-by-step guide to create an
igraph
object starting from commonly available geodata.
Here, an example is provided using data for the Ebro river (Spain),
based on widely available geodata from HydroSHEDS and public
repositories from the Hydrographic Confederation of the Ebro river CHE
.
The source code used to generate this tutorial is available on github in
the file tutorial_network_creation.Rmd
; additional
functions used in the tutorial are stored in the file
functions_network_creation.R
.
After this tutorial is completed, the resulting igraph
object is meant to be used as input to functions to assess riverscape
fragmentation. The riverconn
package (available on CRAN and on github) allows for
the calculation of many fragmentation/connectivity indices. Most graph
theory-based indices to asses fragmentation and connectivity of rivers
(see Jumani et al., 2020 for a review) conceptualize the riverscape as a
graph where vertices (nodes) represent reaches, i.e. distintictly
separated river sections; while edges (links) represent splitting
objects, such as confluences (where two reaches join), barriers (items
that disrupt the longitudial connectivity of the river), or habitat
patches boundaries (borders between sections of the river with
relatively uniform habitat conditions). This tutorial focuses on
barriers (dams), but in principle other fragmentation items can be dealt
with similarly.
The igraph
package implements routines for graphs and network analysis. It can
handle large graphs very well and provides functions for generating
random and regular graphs, graph visualization, centrality methods and
much more. The packages allows for easy construction
of igraph
objects based on symbolic description of edges
and vertices, edges and vertices lists, or adjacency matrices. The book
Statistical
Analysis of Network Data with R by Kolaczyk and Csardi (2014) offers
a comprehensive tutorial on the possibilities offered by the
igraph
package, including network data manipulation and
visualization, descriptive analysis of network graphs, mathematical and
statistical modeling of graphs, and network topology inference.
1.1 Packages needed
This guide relies on several packages for data processing (the links direct to useful resources and tutorials):
sf
,lwgeom
,rgdal
,raster
,elevatr
,RANN
packages for obtaining and processing spatial data;igraph
package for performing network analyses;tidyverse
packages for efficient and easy-to-read data pipeline (see here and here for the most frequently used functions);ggnetwork
for producing easy, ggplot-like, plot of classigraph
objects andggspatial
for easy, ggplot-like plot, of spatial data;corrmorant
to plot nice correlation plots;
riverconn
to calculate the fragmentation indices for the riverscape.
Make sure those libraries are installed, updated, and loaded.
library(tidyverse)
library(sf)
library(raster)
library(ggspatial)
library(viridis)
library(igraph)
library(riverconn)
library(elevatr)
library(gridExtra)
library(ggnetwork)
library(lwgeom)
library(gridExtra)
library(corrmorant)
library(RANN)
library(ggpubr)
library(cowplot)
Most of the geospatial processing in this tutorial occurs on
sf
objects. Thanks to their data.frame-like structure,
spatial operations (e.g., selecting and updating columns and rows, join
with other data frames) are “easier” on such objects. Most spatial
objects are converted to this class as soon as possible in the
script.
1.2 Workflow steps
The workflow suggested in this document follows several steps:
- data retrieval from existing datasets;
- data pre-processing;
- network creation;
- indices calculation and comparison.
1.3 Sources of the spatial data
The construction of the igraph
object requires at least
some information on the spatial geometry of the river system and the
position of the barriers.
The river network shapefile can be retrieved from HydroSHEDS, a global hydrographic atlas with geo-referenced datasets derived from spaceborn elevation data. HydroSHEDS is widely used for global to regional analyses (Lehner et al., 2008). River network data (HydroRIVERS) can be downloaded from here. Basins and catchments shapefiles (HydroBASINS) are also available with 12 detail levels (Lehner et al., 2013). The Ebro catchment border can be selected from the polygons available in the HydroBASINS layer (use level 5 basins shapefile for the identification) and the corresponding network can be extracted from the HydroSHEDS layer.
Other options are available for catchments and rivers. The European Catchments and Rivers network system (ECRINS) dataset provides good quality data covering most of Europe. The ECRINS data for rivers are structured with segments and nodes, making the creation of a network quite straight-forward. ECRINS has a higher resolution than HydroRIVERS. The recent Hydrography90m dataset can also be used. Local sources (e.g. hydrography layers from local authorities) are also generally available, but particular care should be taken in preprocessing to correct the presence of disconnected segments and loops in the shapefile. Different HydroSHEDS and HydroRIVERS versions are available. This tutorial uses HydroRIVERS v1.1: note that the previous versions were slighthly different in terms of attributes (for further details see the HydroRIVERS technical documentation).
The shapefile with the geographic position of the dams was obtained from the Ebro river public Geographic Information System CHE website.
The AMBER dataset described in Belletti et al. (2020) contains barriers data compiled from different national and international sources. Metadata on barriers type (weir, dam, culvert, etc) and height are available but most barriers are unclassified.
The corresponding shapefiles are available here and can be imported in R (note that the river shapefile “Ebre_riv_modif.shp” has been sligtly modified from the original hydrosheds shapefile -see next sections on the checks carried out on the shapefiles. The original shapefile is saved in the Ebro_shape_original/ folder for comparison).
shape_river <- st_read("Ebro_shape_corrected/Ebro_rivers.shp")
## Reading layer `Ebro_rivers' from data source
## `C:\Users\Damiano\Dropbox\riverconn_tutorial\Ebro_shape_corrected\Ebro_rivers.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 4985 features and 14 fields
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: -4.3625 ymin: 40.3625 xmax: 2.116667 ymax: 43.15417
## Geodetic CRS: WGS 84
shape_basin <- st_read("Ebro_catchment/Ebro_mask.shp")
## Reading layer `Ebro_mask' from data source
## `C:\Users\Damiano\Dropbox\riverconn_tutorial\Ebro_catchment\Ebro_mask.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 13 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -4.4 ymin: 40.34167 xmax: 2.154167 ymax: 43.17917
## Geodetic CRS: WGS 84
shape_dams <- st_read("Ebro_dams/Embalses/Embalses.shp")
## Reading layer `Embalses' from data source
## `C:\Users\Damiano\Dropbox\riverconn_tutorial\Ebro_dams\Embalses\Embalses.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 224 features and 3 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 408868.8 ymin: 4505766 xmax: 903207.1 ymax: 4766629
## Projected CRS: ETRS89 / UTM zone 30N
1.4 Functions to support data processing
Some custom functions were developed and were used in the data
processing. The source code and the documentation for these functions is
found in Appendix B. Before starting, make sure those functions are
loaded in the global environment: get_outlet
,
create_network
, snap_to_river
,
headwaters_dam
, multiple_confluences
,
gaps_detection
, check_components
. Detailed
information on the use of these functions is presented in the next
sections.
2. Data pre-processing
Some data preprocessing is needed to prepare the input data for network construction.
2.1 Shapefiles processing
Shapefiles derived from the HydroRIVERS layer have a high detail level: a network with a high detail level will have a large amount of nodes and links, making the algorithms for network analysis slow. Thus it makes sense to reduce the size of the network by pruning the most upstream reaches. The information on the upstream area (suquare kilometers) is stored in the field ‘UPLAND_SKM’.
Note that different HydroRIVERS versions are available. This tutorial uses HydroRIVERS v1.1, wuere different fields are available for pruning the river network based on the size (e.g. ORD_STRA, ORD_CLAS, ORD_FLOW). Previous versions have different attributes. For instance, in HydroRIVERS v1.0 the field UP_CELLS (number of upstream cells) can be used to prune the river network. Check the dataset documentation for further details.
# Set a threshold of 80 square kilometers
threshold = 80
# Prune HydroRIVERS network based on upstream area
shape_river_small <- shape_river[as.numeric(shape_river$UPLAND_SKM) > threshold,]
Next, the dams shapefile can be converted to a shapefile and a unique id can be given to each point. As the dams dataset is a polygon shapefile the centroid of each poligon has to be calculated to obtain a “point” layer. The reference system of the dams layer is changed to WGS84, the same reference system of HydroSHEDS and HydroRIVERS.
dams_to_points <- shape_dams %>%
st_as_sf %>%
st_centroid %>%
mutate(id = 1:nrow(.))%>%
dplyr::select(id)%>%
st_transform(crs = "+proj=longlat +datum=WGS84")
The shapefiles can be plotted to check there are no errors in the geolocalization of the data.
ggplot() +
coord_fixed() +
theme_minimal() +
ggspatial::layer_spatial(shape_basin, fill = NA, color = "gray90") +
ggspatial::layer_spatial(shape_river_small, aes(color = log10(UPLAND_SKM)) )+
ggspatial::layer_spatial(dams_to_points) +
scale_color_viridis(direction = -1, name= "upstream area \n(log10[Km^2])") +
theme(legend.position = "bottom") +
ggspatial::annotation_scale(location = "bl", style = "ticks") +
ggspatial::annotation_north_arrow(location = "br")+
labs(caption = "Black dots are the position of the dams")
2.2 Confluences processing
Every river network contains confluences, i.e. points where two
reaches merge into one. In the river graph defined above, confluences
are edges (links) between nodes. To define confluences the pruned river
shapefile is first simplified and its geometry is casted to “POINT”.
Note that for large shapefiles the function
rgeos::gLineMerge
might be more efficient than
sf::st_union
.
# Simplify river shapefile
shape_river_simple <- shape_river_small %>%
st_as_sf %>%
st_union()
# Convert shapefile to point object
river_to_points <- shape_river_simple %>%
st_as_sf %>%
st_cast("POINT") %>%
mutate(id = 1:nrow(.))
Since points are created at the extremes of every segment in the original shapefile, a confluence is delineated where three (or more) of such nodes are overlapping.
# Check points that overlap
joins_selection <- river_to_points %>%
st_equals() %>%
Filter(function(x){length(x) > 2}, .) %>%
lapply(., FUN = min) %>%
unlist() %>%
unique()
# Filter original point shapefile to retain only confluences
river_joins <- river_to_points %>%
filter(id %in% joins_selection)
The last step is the splitting of the simplified shapefile where it
intersects a confluence. This allows to identify unequivocally the river
sections that are comprised between two confluences. Such elements will
then became vertices (nodes) of the graph. The function
lwgeom::st_split
is used. This function splits an object of
class ‘sf’ with ‘MULTILINESTRING’ geometry into and object of class ‘sf’
containing multiple ‘LINESTRING’ geometries records based on the
position of the points. Note that for this function to work properly,
the point object must perfectly overlap with the polyline (this can be
checked with the function sf::st_distance
applied to the
points and the polyline, which must return a vector of zeroes).
# Split polyline
shape_river_simplified <- lwgeom::st_split(shape_river_simple, river_joins) %>%
st_collection_extract(.,"LINESTRING") %>%
data.frame(id = 1:length(.), geometry = .) %>%
st_as_sf() %>%
mutate(length = st_length(.))
Results can be checked with a plot.
ggplot() +
coord_fixed() +
ggspatial::layer_spatial(shape_river_simplified, aes(color = id))+
scale_color_viridis(direction = -1, name= "Reach ID") +
ggspatial::layer_spatial(river_joins, shape = 1)+
theme_minimal() +
theme(legend.position = "bottom")+
ggspatial::annotation_scale(location = "bl", style = "ticks") +
ggspatial::annotation_north_arrow(location = "br")+
labs(caption = "Hollow points are the position of the junctions")
2.3 Dams processing
Dams are processed in a way similar to confluences. The general idea is to obtain a shapefile that is sliced in points where dams or confluences are located. However, dams shapefiles require special care in the processing of the data. First, dams need to be snapped to the river network. Second, dams that are too far away from the river network need to be removed from the analysis because they are probably located in smaller streams that were removed beforehand.
The function snap_to_river
(see Appendix B) is used to
snap the dams points to the river network based on a tolerance threshold
(here, 1000 m were used). After this function, the distance of snapped
dams from the river network can be checked to make sure the algorithm
worked properly.
# Snap dams
dams_snapped <- snap_to_river(dams_to_points,
shape_river_simple %>% st_sf(),
max_dist = 1000)
# Retain dams that were snapped
dams_snapped_reduced <-
dams_snapped[st_contains(shape_river_simple %>% st_sf(), dams_snapped, prepared = FALSE)[[1]],]
# Check if dams were snapped properly to the network (all the distances should be zero)
st_distance(dams_snapped_reduced, shape_river_simple) %>% sum
## 0 [m]
The final step is to check if two dams were snapped in the same geographic point, and to assign passablity values to the obtained snapped points. Here, uniform (not dam-dependent) passabilities were assigned: 0.8 for downstream passability, and 0.1 for upstream passability. An id is assigned to each of the geographic points. Finally, each dam is assigned the value of the upstream area with a spatial join with the original river shapefile (HydroSheds expresses the upstream area as number of cells).
dams_snapped_reduced_joined <- dams_snapped_reduced %>%
mutate(cluster =
st_equals(dams_snapped_reduced, dams_snapped_reduced) %>%
sapply(., FUN = function(x) paste0(x, collapse = "_"))) %>%
group_by(cluster) %>%
slice(1) %>%
ungroup() %>%
mutate(id_dam = as.character(1:nrow(.)), pass_u = 0.1, pass_d = 0.8) %>%
as.data.frame %>%
st_as_sf() %>%
st_join(., shape_river_small, join = st_is_within_distance, dist = 10 ) %>%
group_by(id) %>%
slice(which.max(UPLAND_SKM)) %>%
ungroup()
The function headwaters_dam
(see Appendix B) can
identify dams that are located in the headwaters, meaning that they do
not have any river network segment upstream. When no upstream segment is
present, the network can not be created (because the dam would be an
edge without an upstream vertex). When calling the flag_headwater you
should get a column with only “FALSE” statements.
headwaters_checking <- headwaters_dam(dams_snapped_reduced_joined, shape_river_simple)
head(headwaters_checking$flag_headwater)
## [,1]
## [1,] FALSE
## [2,] FALSE
## [3,] FALSE
## [4,] FALSE
## [5,] FALSE
## [6,] FALSE
Results can be checked with a plot.
ggplot() +
coord_fixed() +
ggspatial::layer_spatial(shape_river_simple, color = "gray70")+
ggspatial::layer_spatial(dams_snapped_reduced, shape = 1) +
theme_minimal() +
theme(legend.position = "bottom")+
ggspatial::annotation_north_arrow(location = "br")+
ggspatial::annotation_scale(location = "bl", style = "ticks") +
labs(caption = "Hollow points are the position of the dams")
2.4 Shapefile slicing
The last step is to slice the river shapefile based on both the
position of dams and confluences. First a combined ‘junctions’ dataset
is created (the network_links object), then the function
lwgeom::st_split
is applied again. note that the
confluences are assigned NA values to the fields present in the dams
shapefile. Note also that in the river network shapefile, the field
length is calculated. Finally, a spatial join is performed with the
original river shapefile to assign to each reach the corresponding
upstream area.
The field NodeID is created in the river_net_simplified object. This field will be used later for linking graph properties back to the shapefile for further plotting.
# Create junction point shapefile
network_links <- rbind(
dams_snapped_reduced_joined %>%
mutate(type = "dam", id_barrier = id_dam) %>%
dplyr::select(type, id_barrier, pass_u, pass_d),
river_joins %>% mutate(type = "joint") %>%
dplyr::select(type) %>%
mutate(id_barrier = NA, pass_u = NA, pass_d = NA) %>%
rename(geometry = x)) %>%
mutate(id_links = 1:nrow(.))
# Split river network
river_net_simplified <- lwgeom::st_split(shape_river_simple, network_links) %>%
st_collection_extract(.,"LINESTRING") %>%
data.frame(NodeID = 1:length(.), geometry = .) %>%
st_as_sf() %>%
mutate(length = st_length(.)) %>%
st_join(., shape_river_small, join = st_is_within_distance, dist = 0.01 ) %>%
group_by(NodeID) %>%
slice(which.max(UPLAND_SKM)) %>%
ungroup()
A plot can reveal if the process was successful.
ggplot() +
coord_fixed() +
ggspatial::layer_spatial(river_net_simplified, color = "gray70")+
ggspatial::layer_spatial(network_links, aes(shape = type))+
scale_shape_manual(name = "Splitting points", values=c("dam" =17,"joint" = 23))+
theme_minimal() +
theme(legend.position = "bottom")+
ggspatial::annotation_north_arrow(location = "br")+
ggspatial::annotation_scale(location = "bl", style = "ticks")
Sometimes, the HydroSHEDS-derived shapefiles might have confluences
where four or more rivers join. Such type of confluences can cause
problems in the next steps. The function
multiple_confluences
(Appendix B) identifies such
problematic confluences so that they can be corrected in a GIS
environment. The correction consists in editing the river shapefile to
move re-locate one of the reaches (either upstream or downstream) on the
intersections that have more than three reaches (this can be easily
achieved once these intersections are located by the following
function).
Once edited, re-run the tutorial and check again that there are no more four reaches confluences.
confluences <- multiple_confluences(river_net_simplified)
head(confluences)
## Simple feature collection with 6 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -4.047917 ymin: 42.78958 xmax: -1.364583 ymax: 42.97292
## Geodetic CRS: WGS 84
## # A tibble: 6 × 4
## NodeID geometry n_confluences flag_confluences
## <int> <POINT [°]> <int> <lgl>
## 1 1 (-3.46875 42.95208) 3 FALSE
## 2 2 (-4.047917 42.97292) 3 FALSE
## 3 5 (-3.552083 42.96875) 3 FALSE
## 4 7 (-1.364583 42.81042) 3 FALSE
## 5 10 (-1.835417 42.90625) 3 FALSE
## 6 11 (-3.352083 42.78958) 3 FALSE
Another common problem in HydroSHEDS-derived shapefiles is the
presence of gaps between segments. This issue can cause problems in the
network_creation
function, which expects a shapefile with
no gaps. The functions gaps_detection
and
check_components
(Appendix B) can be used to detect such
problematic points in the shapefile. In particular, the function
check_components
returns an sf file that is a copy of the
original river_net_simplified with the additional field component that
contains a progressive number identifying multiple components. A correct
shapefile should contain only ‘1’ values in this field. A plot can help
identifying the location of those components and a manual correction in
a gis environment can fix the problem.
Once edited, re-run the tutorial and check again that the shapefile has only one component.
shp_check <- check_components(network_links, river_net_simplified)
head(shp_check)
## Simple feature collection with 6 features and 2 fields
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: -4.277083 ymin: 42.93958 xmax: -3.464583 ymax: 43.11458
## Geodetic CRS: WGS 84
## # A tibble: 6 × 3
## NodeID geometry compo…¹
## <int> <LINESTRING [°]> <dbl>
## 1 1 (-3.58125 43.11458, -3.572917 43.10625, -3.572917 43.10208, -3… 1
## 2 2 (-4.277083 43.01042, -4.272917 43.01458, -4.264583 43.01458, -… 1
## 3 3 (-3.95625 43.00625, -3.960417 43.00625, -3.964583 43.00208, -3… 1
## 4 4 (-3.927083 43.01458, -3.94375 43.01458, -3.947917 43.01042, -3… 1
## 5 5 (-3.602083 43.01458, -3.597917 43.01042, -3.589583 43.01042, -… 1
## 6 6 (-3.74375 43.00208, -3.739583 42.99792, -3.739583 42.98958, -3… 1
## # … with abbreviated variable name ¹component
2.5 Adding additional attributes to the river shapefile
Additional attributes useful for the network calculation can be added
to the river shapefile. Elevation is an example. Here we use the
function elevatr::get_elev_raster
to get the elevation
raster. Then, the raster is converted to data frame, and elevation is
joined with the shapefile based on the nearest point. This procedure is
approximated, but is fine for most of the large-scale analyses.
# get DEM and transform to data frame with coordinates
elevation <- get_elev_raster(shape_basin, z = 8)
catchment_DEM <- raster::as.data.frame(elevation, xy = TRUE)
# Get coordinates of the river network segments
river_net_simplified_centroids <- river_net_simplified %>%
st_as_sf() %>%
st_centroid()
# Get coordinates of both elements for joining
Coord_Edges <- st_coordinates(river_net_simplified_centroids) #Coordinates of the joins
Coord_DEM <- catchment_DEM[,1:2] #Coordinates of the dams
# Matching each centroid with its closer altittude point to later obtain the altitudes
matching_altitudes <- RANN::nn2(data=Coord_DEM, query = Coord_Edges, k=1, radius = 1)[[1]]
# Get values and add to the river shapefile
catchment_DEM <- catchment_DEM[matching_altitudes,3]
river_net_simplified <- river_net_simplified %>%
mutate(alt = catchment_DEM)
# To avoid issues in the network creation process, retain only the important columns
river_net_simplified <- river_net_simplified %>%
dplyr::select(NodeID, length, alt, DIST_DN_KM, UPLAND_SKM)
Plotting can show weird behaviors of the join operation.
ggplot() +
coord_fixed() +
ggspatial::layer_spatial(river_net_simplified, aes(color = alt))+
scale_color_viridis(name = "Elevation")+
theme_minimal() +
theme(legend.position = "bottom")+
ggspatial::annotation_north_arrow(location = "br")+
ggspatial::annotation_scale(location = "bl", style = "ticks")
3. Creation of the igraph
object
The igraph
object can be created using the functions
get_outlet
and create_network
(see Appendix
B). The function get_outlet
identifies the outlet of the
catchment based on the intersection between the river network shapefile
and the catchment border. This function is meant to be used with
hydroSHEDS-derived data, but the identification can be done manually on
any GIS software. The function create_network
uses the
outlet information, the properly sliced river network shapefile and the
junctions shapefile.
It is important to note that: * for using this function the fields
with barriers informations must be named ‘id_barrier’, ‘pass_u’ and
‘pass_d’; * in case several fields are available in the
river_net_simplified
and network_links
shapefiles, they should have fairly different names (e.g. fields named
elevation
and elev
should not be present,
because a partial match based on field names is done internally in the
create_network
function). The functions
dplyr::select
and dplyr::rename
can be used to
retain only the few fields that are relevant for the next steps.
#outlet <- get_outlet(river_net_simplified, shape_basin, distance = 1)
outlet <- river_net_simplified$NodeID[river_net_simplified$DIST_DN_KM == 0 ]
river_graph <- create_network(network_links, river_net_simplified, outlet)
river_graph
is an object of class igraph
that keeps the attributes present in the shapefile and in the
junction.
# Check igraph object
river_graph
## IGRAPH 3d12149 DN-- 648 647 --
## + attr: name (v/n), length (v/n), alt (v/n), DIST_DN_KM (v/n),
## | UPLAND_SKM (v/n), type (e/c), id_links (e/n), id_barrier (e/c),
## | pass_u (e/n), pass_d (e/n)
## + edges from 3d12149 (vertex names):
## [1] 517->518 522->518 521->522 498->522 276->160 153->160 160->164 162->161
## [9] 161->164 644->642 648->642 643->645 647->645 640->637 642->637 638->631
## [17] 646->631 634->633 641->633 631->630 633->632 632->630 630->629 629->626
## [25] 636->635 635->626 625->628 623->625 598->625 639->628 624->622 627->622
## [33] 595->598 589->598 618->594 645->618 619->618 621->620 620->594 612->606
## [41] 622->606 611->615 613->615 626->610 637->610 608->602 617->616 616->602
## + ... omitted several edges
# check river_graph edges
igraph::get.edge.attribute(river_graph) %>% names
## [1] "type" "id_links" "id_barrier" "pass_u" "pass_d"
# check river_graph vertices
igraph::get.vertex.attribute(river_graph) %>% names
## [1] "name" "length" "alt" "DIST_DN_KM" "UPLAND_SKM"
3.1 Editing the igraph
object
The edges and verices attributes can be further edited. For instance, the ‘length’ vertices attributes is expressed in meters. The value is changed to tens of kilometers for the following calculations.
# update length attribute
V(river_graph)$length <- V(river_graph)$length / 10000
hist(V(river_graph)$length)
Next, the name attribute is converted from numeric to charachter to
avoid issues with the index_calculation
function.
# update length attribute
V(river_graph)$name <- as.character(V(river_graph)$name)
Another step is to add some fields with habitat suitability information. Two different habitat curves are implemented to link elevation and habitat preference, one for high-altitude organisms and one for low-altitude organisms. Relative reach-specific habitat suitability indices (HSI) are added as attributes of the vertices. Weighted usable length (WUL) is also calculated as the product of length and HSI as the reach-specific fraction of length that is available for organisms.
# Function for organism that prefers high elevation
suit_fun_high <- function(x){dnorm(x, mean = 1500, sd = 500)*410*sqrt(3*pi)}
# Function for organism that prefers low elevation
suit_fun_low <- function(x){exp(- 0.001*x)}
# Calculate HSI for the igraph nodes
V(river_graph)$HSI_low <- suit_fun_low(V(river_graph)$alt)
V(river_graph)$HSI_high <- suit_fun_high(V(river_graph)$alt)
# Calculate weighted usable length for igraph nodes
V(river_graph)$WUL_low <- V(river_graph)$HSI_low * V(river_graph)$length
V(river_graph)$WUL_high <- V(river_graph)$HSI_high * V(river_graph)$length
# Plot the two response functions
1:10:2500 %>%
data.frame("Elevation" = .,
"Low" = suit_fun_low(.),
"High" = suit_fun_high(.)) %>%
pivot_longer(c("Low", "High"),
names_to = "Type", values_to = "HSI") %>%
ggplot() +
geom_line(aes(x = Elevation, y = HSI, color = Type))+
theme_bw() + xlab("Elevation (m)") + ylab("Habitat Suitability Index (HSI)")
3.2 Plotting the igraph
object
The ggnetwork
package can be used to fortify
igraph
objects for ggplot-friendly plots. For a nice
network layout that resembles the geometry of the starting network, the
centroids of the reaches are extracted. The habitat suitabilities are
plotted.
# Extract reaches centroids
river_net_simplified_centroids <- river_net_simplified %>%
st_as_sf() %>%
st_centroid()
# get the centroids coordinates
coordinates <- river_net_simplified_centroids %>%
dplyr::mutate(lat = sf::st_coordinates(.)[,1], lon = sf::st_coordinates(.)[,2]) %>%
dplyr::select(lat, lon) %>%
st_set_geometry( NULL)
# fortify the igraph object
gg0 <- ggnetwork(river_graph, layout = coordinates %>% as.matrix(), scale = FALSE)
grid.arrange(
ggplot(gg0, aes(x = x, y = y, xend = xend, yend = yend)) +
coord_fixed() +
geom_nodes(aes(color = HSI_high)) +
geom_edges(alpha = 0.5) +
scale_color_viridis()+
theme_blank()+
ggtitle("High-altitude organism") +
labs(caption = "Network directionality not shown"),
ggplot(gg0, aes(x = x, y = y, xend = xend, yend = yend)) +
coord_fixed() +
geom_nodes(aes(color = WUL_high)) +
geom_edges(alpha = 0.5) +
scale_color_viridis()+
theme_blank()+
ggtitle("High-altitude organism") +
labs(caption = "Network directionality not shown"),
ggplot(gg0, aes(x = x, y = y, xend = xend, yend = yend)) +
coord_fixed() +
geom_nodes(aes(color = HSI_low)) +
geom_edges(alpha = 0.5) +
scale_color_viridis()+
theme_blank()+
ggtitle("Low-altitude organism") +
labs(caption = "Network directionality not shown"),
ggplot(gg0, aes(x = x, y = y, xend = xend, yend = yend)) +
coord_fixed() +
geom_nodes(aes(color = WUL_low)) +
geom_edges(alpha = 0.5) +
scale_color_viridis()+
theme_blank()+
ggtitle("Low-altitude organism") +
labs(caption = "Network directionality not shown"),
ncol=2, nrow=2)
4. Indices calculations
Here some example of indices for river network connectivity applied to river graphs are presented.
4.1 Reach-scale indices
The funtion index_calculation
from the riverconn
package is used to calculate reach-scale connectivity indices (RCI).
Refer to the package vignette for details on the calculation of the
indices. For comparison purposes, different indices are calculated: (A)
Dendritic connectivity index with symmetric dams passabilities; (B)
Dendritic connectivity index with asymmetric dams passabilities; (C)
Integral Index of connectivity (dams are not passable and a threshold on
dispersal distance is used); (D) Integral Index of connectivity with
uniform weights; (E) lowland-preferring organisms with active aquatic
dispersal (e.g. fish), (F) upland organisms with active aquatic
dispersal (e.g. fish), (G) lowland-preferring organism with passive
aquatic dispersal (e.g., invertebrate larval stage), (H) upland organism
with passive aquatic dispersal (e.g., invertebrate larval stage,
bivalve). Note that for all those indices, the option
index_mode = "from"
was used, meaning paths exiting from
the node are considered for calculating the coincidence probabilities.
Thus, such indices can be interpreted as the potential for each node to
support dispersal of organisms to the whole river network. An
alternative could be to specify index_mode = "to"
. In this
case, only incoming links are considered for the calculation of the
index, yielding a prioritization based on the node potential for
recolonization. Note that when symmetric dispersal and symmetric
barriers passability are used, the two indices should overlap.
Before calculating indices, you should consider carefully the size of
the network and its complexity. The index_calculation
function might fail for large networks on comuters with limited
available RAM because R would not be able to allocate enough memory to
efficiently store the \(n * n\)
matrices generated in the process. For example, the function was failing
with a network size of 10^5 nodes on a computer with 8 GB RAM, but was
succesfull on a computer with 32 GB RAM. Few tests might be needed to
ballance complexity and computational resources.
# Initialize list where to store all the index calculation outputs
index <- list()
lab_index <- list()
letter_index <- list()
# 1: Symmetric Dendritic Connectivity Index (no biotic effects)
lab_index[[1]] <- "Symmetric DCI"
letter_index[[1]] <- "A"
index[[1]] <- index_calculation(graph = river_graph,
weight = "length",
B_ij_flag = FALSE,
index_type = "reach",
index_mode = "from")
# 2: Asymmetric Dendritic Connectivity Index (no biotic effects)
lab_index[[2]] <- "Asymmetric DCI"
letter_index[[2]] <- "B"
index[[2]] <- index_calculation(graph = river_graph,
weight = "length",
B_ij_flag = FALSE,
dir_fragmentation_type = "asymmetric",
index_type = "reach",
index_mode = "from")
## Before calculating IIC a binary passability has to be defined
E(river_graph)$pass_u_bin <- ifelse(is.na(E(river_graph)$pass_u), NA, 0)
E(river_graph)$pass_d_bin <- ifelse(is.na(E(river_graph)$pass_d), NA, 0)
# 3: Symmetric Integral Index of Connectivity
lab_index[[3]] <- "IIC"
letter_index[[3]] <- "C"
index[[3]] <- index_calculation(graph = river_graph,
weight = "length",
pass_u = "pass_u_bin",
pass_d = "pass_d_bin",
param = 3,
disp_type = "threshold",
index_type = "reach",
index_mode = "from")
## Defining uniform weights for IIC
V(river_graph)$unif_w <- 1
# 4: Integral Index of Connectivity with uniform weights
lab_index[[4]] <- "IIC with uniform weights"
letter_index[[4]] <- "D"
index[[4]] <- index_calculation(graph = river_graph,
weight = "unif_w",
dir_fragmentation_type = "asymmetric",
pass_u = "pass_u_bin",
pass_d = "pass_d_bin",
param = 3,
disp_type = "threshold",
index_type = "reach",
index_mode = "from")
# 5: Population Connectivity Index for lowland fish
lab_index[[5]] <- "PCI lowland fish"
letter_index[[5]] <- "E"
index[[5]] <- index_calculation(graph = river_graph,
weight = "WUL_low",
param = 0.8,
index_type = "reach",
index_mode = "from")
# 6: Population Connectivity Index for upland fish
lab_index[[6]] <- "PCI upland fish"
letter_index[[6]] <- "F"
index[[6]] <- index_calculation(graph = river_graph,
weight = "WUL_high",
param = 0.8,
index_type = "reach",
index_mode = "from")
# 7: Population Connectivity Index for lowland passive drifter
## note that units of param_d must be consistent with length field (10s of km)
lab_index[[7]] <- "PCI lowland passive"
letter_index[[7]] <- "G"
index[[7]] <- index_calculation(graph = river_graph,
weight = "WUL_low",
dir_distance_type = "asymmetric",
disp_type = "threshold",
param_u = 0,
param_d = 3,
index_type = "reach",
index_mode = "from")
# 8: Population Connectivity Index for upland passive drifter
## note that units of param_d must be consistent with length field (10s of km)
lab_index[[8]] <- "PCI upland passive"
letter_index[[8]] <- "H"
index[[8]] <- index_calculation(graph = river_graph,
weight = "WUL_high",
dir_distance_type = "asymmetric",
disp_type = "threshold",
param_u = 0,
param_d = 3,
index_type = "reach",
index_mode = "from")
A quick check to the variation range.
data.frame("index" = do.call(rbind, lab_index),
"min" = t(sapply(index, sapply, min))[,4],
"mean" = t(sapply(index, sapply, mean))[,4],
"max" = t(sapply(index, sapply, max))[,4] )
## index min mean max
## 1 Symmetric DCI 0.000136068138828111 0.089094209 0.183348672040763
## 2 Asymmetric DCI 0.000198840008657542 0.243700325 0.575848057363519
## 3 IIC 4.21802688098917e-05 0.007121115 0.0209559901505482
## 4 IIC with uniform weights 0.00154320987654321 0.008921087 0.0308641975308642
## 5 PCI lowland fish 5.49335271641571e-05 0.016012598 0.0536956831924345
## 6 PCI upland fish 0.000182545268578881 0.009541135 0.0410567699684218
## 7 PCI lowland passive 3.79438573745419e-05 0.003693739 0.020852867904853
## 8 PCI upland passive 1.66202468527347e-05 0.003000132 0.0353297075947445
Resulting maps can be plotted to visually compare the results. For comparisons between different indices, reach ranks are plotted. The lowest rank is the total number of reaches (ca. 650). Reaches with higher rankings offer higher potential for being dispersal sources.
# Initialize empty list
plot_list <- list()
# iterate through list length
for (i in 1:length(index)) {
# Join d_i information with dams shapefile
river_plot <- river_net_simplified %>%
mutate(name = as.character(NodeID)) %>%
left_join(index[[i]]) %>%
mutate(rank = rank(desc(index)))
# plot
plot_iter <- ggplot() +
coord_fixed() +
ggspatial::layer_spatial(river_plot, aes(color = rank))+
scale_color_viridis(name = "Ranking", direction = -1)+
theme_void()+
guides(size = FALSE) +
theme(legend.position = "bottom")+
ggtitle(paste0(letter_index[[i]], ") ", lab_index[[i]]))
legend <- cowplot::get_legend(plot_iter)
plot_list[[i]] <- plot_iter +
theme(legend.position = "none")
}
plot_list[[length(index)+1]] <- legend
plot_final <- ggpubr::ggarrange( plotlist = plot_list)
plot_final
ggsave(plot = plot_final, "Figures/habitat_prioritization.jpeg",
width = 18, height = 12, units = "cm")
A correlation plot helps identifying the correlated indices
d_index_chart <- data.frame("d_i_1" = index[[1]]$index,
"d_i_2" = index[[2]]$index,
"d_i_3" = index[[3]]$index,
"d_i_4" = index[[4]]$index,
"d_i_5" = index[[5]]$index,
"d_i_6" = index[[6]]$index,
"d_i_7" = index[[7]]$index,
"d_i_8" = index[[8]]$index)
colnames(d_index_chart) <- c(letter_index[[1]], letter_index[[2]], letter_index[[3]],
letter_index[[4]], letter_index[[5]], letter_index[[6]],
letter_index[[7]], letter_index[[8]])
# Spearman corrlations between rankings calculated with different priorities
cor(d_index_chart, method = "spearman")
## A B C D E F G
## A 1.0000000 0.91224397 0.5542838 0.4941878 0.8789122 0.19375360 0.3691311
## B 0.9122440 1.00000000 0.4955071 0.4322901 0.8570346 -0.08416734 0.3894249
## C 0.5542838 0.49550709 1.0000000 0.8277318 0.7113399 0.41582814 0.5582716
## D 0.4941878 0.43229005 0.8277318 1.0000000 0.7109633 0.30121569 0.4021395
## E 0.8789122 0.85703457 0.7113399 0.7109633 1.0000000 0.13442831 0.5122386
## F 0.1937536 -0.08416734 0.4158281 0.3012157 0.1344283 1.00000000 0.1011033
## G 0.3691311 0.38942489 0.5582716 0.4021395 0.5122386 0.10110333 1.0000000
## H -0.2105778 -0.34538754 0.1183752 -0.1075774 -0.2894423 0.55415117 0.4008371
## H
## A -0.2105778
## B -0.3453875
## C 0.1183752
## D -0.1075774
## E -0.2894423
## F 0.5541512
## G 0.4008371
## H 1.0000000
corrmorant::corrmorant(d_index_chart, corr_method = "spearman", style = "binned")+
theme(legend.position = "bottom") +
scale_color_viridis(name = "Correlation", option = "E", direction = -1, limits = c(-1, 1)) +
theme(axis.text = element_text(size=5),
axis.text.x = element_text(angle = 45) )
ggsave("Figures/ranking_habitat_comparison.jpeg", width = 15, height = 15, units = "cm")
4.2 Barriers prioritization
The function d_index_calculation
from the riverconn
package is used to identify barriers whose passability improvement can
best benefit the catchment connectivity. Refer to the package vignette
for details on the calculation of the indices. Check the package
vignette for details on the indices used to this end. Here we use the
same setups described above (section 4.1).
First dams metadata are to be defined. The object is saved as a dataframe that stores the id of the dams to be individually tested in a column, and the relative improvements in upstream and downstream passabilies (here improvement is assumed to be uniformly 1).
barriers_metadata <- data.frame("id_barrier" = E(river_graph)$id_barrier[!is.na(E(river_graph)$id_barrier)],
"pass_u_updated" = 1,
"pass_d_updated" = 1)
head(barriers_metadata)
## id_barrier pass_u_updated pass_d_updated
## 1 76 1 1
## 2 72 1 1
## 3 7 1 1
## 4 6 1 1
## 5 78 1 1
## 6 51 1 1
Next, the dams prioritization can be executed. The calculation may
take a while depending on the computer where it is run. There is the
option to parallelize the process (marking ->
parallel=TRUE
and the number of cores devoted -> ncores=
). It takes 4 minutes with parallel=FALSE
for this example
in a computer with this procesor: 11th Gen Intel Core i7, 3.30 GHz and
32 GbRAM. On the other hand it takes 1.5 minutes when parallelizing with
7 cores (parallel = TRUE
, ncores = 7
). To
parallelize calculations, the doParallel
package is
required.
# Initialize list where to store all the index calculation outputs
d_index <- list()
lab_d_index <- list()
letter_d_index <- list()
# 1: Symmetric Dendritic Connectivity Index (no biotic effects)
lab_d_index[[1]] <- "Symmetric DCI"
letter_d_index[[1]] <- "A"
d_index[[1]] <- d_index_calculation(graph = river_graph,
barriers_metadata = barriers_metadata,
B_ij_flag = FALSE,
parallel = FALSE)
# 2: Asymmetric Dendritic Connectivity Index (no biotic effects)
lab_d_index[[2]] <- "Asymmetric DCI"
letter_d_index[[2]] <- "B"
d_index[[2]] <- d_index_calculation(graph = river_graph,
barriers_metadata = barriers_metadata,
B_ij_flag = FALSE,
parallel = FALSE,
dir_fragmentation_type = "asymmetric")
## Before calculating IIC a binary passability has to be defined
E(river_graph)$pass_u_bin <- ifelse(is.na(E(river_graph)$pass_u), NA, 0)
E(river_graph)$pass_d_bin <- ifelse(is.na(E(river_graph)$pass_d), NA, 0)
# 3: Symmetric Integral Index of Connectivity
lab_d_index[[3]] <- "Symmetric IIC"
letter_d_index[[3]] <- "C"
d_index[[3]] <- d_index_calculation(graph = river_graph,
barriers_metadata = barriers_metadata,
weight = "WUL_low",
pass_u = "pass_u_bin",
pass_d = "pass_d_bin",
param = 3,
disp_type = "threshold",
parallel = FALSE)
## Defining uniform weights for IIC
V(river_graph)$unif_w <- 1
# 4: Integral Index of Connectivity with uniform weights
lab_d_index[[4]] <- "IIC with uniform weights"
letter_d_index[[4]] <- "D"
d_index[[4]] <- d_index_calculation(graph = river_graph,
barriers_metadata = barriers_metadata,
weight = "unif_w",
dir_fragmentation_type = "asymmetric",
pass_u = "pass_u_bin",
pass_d = "pass_d_bin",
param = 3,
disp_type = "threshold",
parallel = FALSE)
# 5: Population Connectivity Index for lowland fish
lab_d_index[[5]] <- "PCI lowland fish"
letter_d_index[[5]] <- "E"
d_index[[5]] <- d_index_calculation(graph = river_graph,
barriers_metadata = barriers_metadata,
weight = "WUL_low",
param = 0.8,
parallel = FALSE)
# 6: Population Connectivity Index for upland fish
lab_d_index[[6]] <- "PCI upland fish"
letter_d_index[[6]] <- "F"
d_index[[6]] <- d_index_calculation(graph = river_graph,
barriers_metadata = barriers_metadata,
weight = "WUL_high",
param = 0.8,
parallel = FALSE )
# 7: Population Connectivity Index for lowland passive drifter
## note that units of param_d must be consistent with length field (10s of km)
lab_d_index[[7]] <- "PCI lowland passive"
letter_d_index[[7]] <- "G"
d_index[[7]] <- d_index_calculation(graph = river_graph,
barriers_metadata = barriers_metadata,
weight = "WUL_low",
dir_distance_type = "asymmetric",
disp_type = "threshold",
param_u = 0,
param_d = 3,
parallel = FALSE)
# 8: Population Connectivity Index for upland passive drifter
## note that units of param_d must be consistent with length field (10s of km)
lab_d_index[[8]] <- "PCI upland passive"
letter_d_index[[8]] <- "H"
d_index[[8]] <- d_index_calculation(graph = river_graph,
barriers_metadata = barriers_metadata,
weight = "WUL_high",
dir_distance_type = "asymmetric",
disp_type = "threshold",
param_u = 0,
param_d = 3,
parallel = FALSE)
# 9: Catchment Area Fragmentation Index
## note that units of param_d must be consistent with length field (10s of km)
lab_d_index[[9]] <- "CAFI"
letter_d_index[[9]] <- "I"
d_index[[9]] <- d_index_calculation(graph = river_graph,
barriers_metadata = barriers_metadata,
weight = "UPLAND_SKM",
dir_distance_type = "symmetric",
B_ij_flag = FALSE,
parallel = FALSE)
A quick check to the variation range of CCI for the baseline.
data.frame("Baseline CCI" = do.call(rbind, lab_d_index),
"min" = t(sapply(d_index, sapply, min))[,4], # baseline in is column 4
"mean" = t(sapply(d_index, sapply, mean))[,4],
"max" = t(sapply(d_index, sapply, max))[,4] )
## Baseline.CCI min mean max
## 1 Symmetric DCI 0.0909335515845068 0.092566524 0.126107269330024
## 2 Asymmetric DCI 0.25182200888365 0.254232665 0.290384217177615
## 3 Symmetric IIC 0.00810297139863181 0.008114263 0.00817098728639399
## 4 IIC with uniform weights 0.00892108672458467 0.008945654 0.00905445054107606
## 5 PCI lowland fish 0.0173825827122186 0.017454205 0.0183293467389933
## 6 PCI upland fish 0.0149304263813881 0.014969916 0.0155246121675143
## 7 PCI lowland passive 0.00494555720870629 0.004950211 0.00497254428181167
## 8 PCI upland passive 0.00811555022541386 0.008120642 0.00819221692543936
## 9 CAFI 0.203495022267748 0.206691818 0.299794408043562
Resulting maps can be plotted to visually compare the results. For comparisons between different indices, barriers ranks are plotted. The lowest rank is the total number of barriers (ca. 80). Barriers with higher ranks should be prioritized for passability improvement.
# Initialize empty list
plot_list <- list()
# iterate through list length
for (i in 1:length(d_index)) {
# Join d_i information with dams shapefile
network_links_plot <- network_links %>%
filter(type == "dam") %>%
left_join(d_index[[i]]) %>%
mutate(d_rank = rank(desc(d_index)))
# plot
plot_iter <- ggplot() +
coord_fixed() +
ggspatial::layer_spatial(river_net_simplified, color = "gray70")+
ggspatial::layer_spatial(network_links_plot,
aes(color = d_rank, size = 1/d_rank), alpha = 0.8)+
scale_color_viridis(name = "Ranking", direction = -1)+
theme_void()+
guides(size = FALSE) +
theme(legend.position = "bottom")+
ggtitle(paste0(letter_d_index[[i]], ") ", lab_d_index[[i]]))
# legend <- cowplot::get_legend(plot_iter)
plot_list[[i]] <- plot_iter #+ theme(legend.position = "none")
}
# plot_list[[length(d_index)+1]] <- legend
plot_final <- ggpubr::ggarrange( plotlist = plot_list,
common.legend = TRUE, legend = "bottom")
plot_final
ggsave(plot = plot_final, "Figures/barriers_prioritization.jpeg",
width = 18, height = 12, units = "cm")
A corrleation plot hepls identifying the correlated indices
d_index_chart <- data.frame("d_i_1" = d_index[[1]]$d_index,
"d_i_2" = d_index[[2]]$d_index,
"d_i_3" = d_index[[3]]$d_index,
"d_i_4" = d_index[[4]]$d_index,
"d_i_5" = d_index[[5]]$d_index,
"d_i_6" = d_index[[6]]$d_index,
"d_i_7" = d_index[[7]]$d_index,
"d_i_8" = d_index[[8]]$d_index,
"d_i_9" = d_index[[9]]$d_index)
colnames(d_index_chart) <- c(letter_d_index[[1]], letter_d_index[[2]], letter_d_index[[3]],
letter_d_index[[4]], letter_d_index[[5]], letter_d_index[[6]],
letter_d_index[[7]], letter_d_index[[8]], letter_d_index[[9]])
# Spearman corrlations between rankings calculated with different priorities
cor(d_index_chart, method = "spearman")
## A B C D E F G
## A 1.0000000 0.8636478 0.637634905 0.26204708 0.9408875 0.3693169 0.64327608
## B 0.8636478 1.0000000 0.648567309 0.29080852 0.8753639 0.2381299 0.67678719
## C 0.6376349 0.6485673 1.000000000 0.54545734 0.7551617 0.2080936 0.97127579
## D 0.2620471 0.2908085 0.545457344 1.00000000 0.3773260 0.2458078 0.47141151
## E 0.9408875 0.8753639 0.755161746 0.37732603 1.0000000 0.2660974 0.73870015
## F 0.3693169 0.2381299 0.208093618 0.24580782 0.2660974 1.0000000 0.23057434
## G 0.6432761 0.6767872 0.971275792 0.47141151 0.7387001 0.2305743 1.00000000
## H -0.2640295 -0.3216590 0.009140793 0.03524209 -0.3429779 0.5990565 0.05841429
## I 0.8891517 0.9147256 0.640364507 0.30417163 0.8815929 0.1295633 0.65949971
## H I
## A -0.264029452 0.8891517
## B -0.321659038 0.9147256
## C 0.009140793 0.6403645
## D 0.035242088 0.3041716
## E -0.342977925 0.8815929
## F 0.599056538 0.1295633
## G 0.058414289 0.6594997
## H 1.000000000 -0.3937625
## I -0.393762511 1.0000000
corrmorant::corrmorant(d_index_chart, corr_method = "spearman", style = "binned")+
theme(legend.position = "bottom") +
scale_color_viridis(name = "Correlation", option = "E", direction = -1, limits = c(-1, 1)) +
theme(axis.text = element_text(size=5),
axis.text.x = element_text(angle = 45))
ggsave("Figures/ranking comparison.jpeg", width = 15, height = 15, units = "cm")
Appendix A: R session information
devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.2.1 (2022-06-23 ucrt)
## os Windows 10 x64 (build 19045)
## system x86_64, mingw32
## ui RTerm
## language (EN)
## collate English_United States.utf8
## ctype English_United States.utf8
## tz Europe/Berlin
## date 2023-03-02
## pandoc 2.19.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## ! package * version date (UTC) lib source
## abind 1.4-5 2016-07-21 [1] CRAN (R 4.2.0)
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.1)
## backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.0)
## bookdown 0.32 2023-01-17 [1] CRAN (R 4.2.2)
## broom 1.0.1 2022-08-29 [1] CRAN (R 4.2.1)
## bslib 0.4.1 2022-11-02 [1] CRAN (R 4.2.2)
## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.2.1)
## callr 3.7.3 2022-11-02 [1] CRAN (R 4.2.2)
## car 3.1-1 2022-10-19 [1] CRAN (R 4.2.2)
## carData 3.0-5 2022-01-06 [1] CRAN (R 4.2.1)
## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.2.1)
## class 7.3-20 2022-01-16 [2] CRAN (R 4.2.1)
## classInt 0.4-8 2022-09-29 [1] CRAN (R 4.2.2)
## cli 3.4.0 2022-09-08 [1] CRAN (R 4.2.1)
## codetools 0.2-18 2020-11-04 [2] CRAN (R 4.2.1)
## colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.1)
## corrmorant * 0.0.0.9007 2022-10-13 [1] Github (r-link/corrmorant@e968860)
## cowplot * 1.1.1 2020-12-30 [1] CRAN (R 4.2.1)
## crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.2)
## curl 4.3.3 2022-10-06 [1] CRAN (R 4.2.2)
## DBI 1.1.3 2022-06-18 [1] CRAN (R 4.2.1)
## dbplyr 2.2.1 2022-06-27 [1] CRAN (R 4.2.1)
## devtools 2.4.5 2022-10-11 [1] CRAN (R 4.2.2)
## digest 0.6.31 2022-12-11 [1] CRAN (R 4.2.2)
## dodgr 0.2.18 2022-12-07 [1] CRAN (R 4.2.2)
## doParallel 1.0.17 2022-02-07 [1] CRAN (R 4.2.1)
## dplyr * 1.0.10 2022-09-01 [1] CRAN (R 4.2.1)
## e1071 1.7-12 2022-10-24 [1] CRAN (R 4.2.2)
## elevatr * 0.4.2 2022-01-07 [1] CRAN (R 4.2.2)
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.1)
## evaluate 0.19 2022-12-13 [1] CRAN (R 4.2.1)
## fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.1)
## farver 2.1.1 2022-07-06 [1] CRAN (R 4.2.1)
## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.1)
## forcats * 0.5.2 2022-08-19 [1] CRAN (R 4.2.1)
## foreach 1.5.2 2022-02-02 [1] CRAN (R 4.2.1)
## fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.1)
## gargle 1.2.1 2022-09-08 [1] CRAN (R 4.2.1)
## generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.1)
## ggnetwork * 0.5.10 2021-07-06 [1] CRAN (R 4.2.2)
## ggplot2 * 3.4.0 2022-11-04 [1] CRAN (R 4.2.2)
## ggpubr * 0.5.0 2022-11-16 [1] CRAN (R 4.2.2)
## ggsignif 0.6.4 2022-10-13 [1] CRAN (R 4.2.2)
## ggspatial * 1.1.7 2022-11-24 [1] CRAN (R 4.2.2)
## glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.1)
## googledrive 2.0.0 2021-07-08 [1] CRAN (R 4.2.1)
## googlesheets4 1.0.1 2022-08-13 [1] CRAN (R 4.2.1)
## gridExtra * 2.3 2017-09-09 [1] CRAN (R 4.2.1)
## gtable 0.3.1 2022-09-01 [1] CRAN (R 4.2.1)
## haven 2.5.1 2022-08-22 [1] CRAN (R 4.2.1)
## highr 0.9 2021-04-16 [1] CRAN (R 4.2.1)
## hms 1.1.2 2022-08-19 [1] CRAN (R 4.2.1)
## htmltools 0.5.4 2022-12-07 [1] CRAN (R 4.2.2)
## htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.2.1)
## httpuv 1.6.6 2022-09-08 [1] CRAN (R 4.2.1)
## httr 1.4.4 2022-08-17 [1] CRAN (R 4.2.1)
## igraph * 1.3.5 2022-09-22 [1] CRAN (R 4.2.2)
## iterators 1.0.14 2022-02-05 [1] CRAN (R 4.2.1)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.2.1)
## jsonlite 1.8.4 2022-12-06 [1] CRAN (R 4.2.2)
## KernSmooth 2.23-20 2021-05-03 [2] CRAN (R 4.2.1)
## knitr 1.41 2022-11-18 [1] CRAN (R 4.2.2)
## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.2.0)
## later 1.3.0 2021-08-18 [1] CRAN (R 4.2.1)
## lattice 0.20-45 2021-09-22 [2] CRAN (R 4.2.1)
## lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.2)
## lubridate 1.9.0 2022-11-06 [1] CRAN (R 4.2.2)
## lwgeom * 0.2-10 2022-11-19 [1] CRAN (R 4.2.2)
## magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.1)
## memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.1)
## mime 0.12 2021-09-28 [1] CRAN (R 4.2.0)
## miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.2.2)
## modelr 0.1.10 2022-11-11 [1] CRAN (R 4.2.2)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.1)
## osmdata 0.1.10 2022-06-09 [1] CRAN (R 4.2.1)
## pillar 1.8.1 2022-08-19 [1] CRAN (R 4.2.1)
## pkgbuild 1.4.0 2022-11-27 [1] CRAN (R 4.2.2)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.1)
## pkgload 1.3.2 2022-11-16 [1] CRAN (R 4.2.2)
## plyr 1.8.8 2022-11-11 [1] CRAN (R 4.2.2)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.1)
## processx 3.8.0 2022-10-26 [1] CRAN (R 4.2.2)
## profvis 0.3.7 2020-11-02 [1] CRAN (R 4.2.2)
## progress 1.2.2 2019-05-16 [1] CRAN (R 4.2.1)
## progressr 0.12.0 2022-12-13 [1] CRAN (R 4.2.1)
## promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.2.1)
## proxy 0.4-27 2022-06-09 [1] CRAN (R 4.2.1)
## ps 1.7.2 2022-10-26 [1] CRAN (R 4.2.2)
## purrr * 0.3.5 2022-10-06 [1] CRAN (R 4.2.2)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.1)
## ragg 1.2.4 2022-10-24 [1] CRAN (R 4.2.2)
## RANN * 2.6.1 2019-01-08 [1] CRAN (R 4.2.2)
## raster * 3.6-11 2022-11-28 [1] CRAN (R 4.2.2)
## Rcpp 1.0.9 2022-07-08 [1] CRAN (R 4.2.1)
## D RcppParallel 5.1.5 2022-01-05 [1] CRAN (R 4.2.1)
## readr * 2.1.3 2022-10-01 [1] CRAN (R 4.2.2)
## readxl 1.4.1 2022-08-17 [1] CRAN (R 4.2.1)
## remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.1)
## reprex 2.0.2 2022-08-17 [1] CRAN (R 4.2.1)
## reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.2.1)
## rgdal 1.6-2 2022-11-09 [1] CRAN (R 4.2.2)
## riverconn * 0.3.23 2022-12-23 [1] local
## rlang 1.0.6 2022-09-24 [1] CRAN (R 4.2.2)
## rmarkdown 2.18 2022-11-09 [1] CRAN (R 4.2.2)
## rmdformats 1.0.4 2022-05-17 [1] CRAN (R 4.2.2)
## rstatix 0.7.1 2022-11-09 [1] CRAN (R 4.2.2)
## rstudioapi 0.14 2022-08-22 [1] CRAN (R 4.2.1)
## rvest 1.0.3 2022-08-19 [1] CRAN (R 4.2.1)
## s2 1.1.1 2022-11-17 [1] CRAN (R 4.2.2)
## sass 0.4.4 2022-11-24 [1] CRAN (R 4.2.2)
## scales 1.2.1 2022-08-20 [1] CRAN (R 4.2.2)
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.2)
## sf * 1.0-9 2022-11-08 [1] CRAN (R 4.2.2)
## shiny 1.7.3 2022-10-25 [1] CRAN (R 4.2.2)
## slippymath 0.3.1 2019-06-28 [1] CRAN (R 4.2.2)
## sp * 1.5-1 2022-11-07 [1] CRAN (R 4.2.2)
## stringi 1.7.8 2022-07-11 [1] CRAN (R 4.2.1)
## stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.2.2)
## systemfonts 1.0.4 2022-02-11 [1] CRAN (R 4.2.2)
## terra 1.6-47 2022-12-02 [1] CRAN (R 4.2.2)
## textshaping 0.3.6 2021-10-13 [1] CRAN (R 4.2.2)
## tibble * 3.1.8 2022-07-22 [1] CRAN (R 4.2.1)
## tidyr * 1.2.1 2022-09-08 [1] CRAN (R 4.2.1)
## tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.2.2)
## tidyverse * 1.3.2 2022-07-18 [1] CRAN (R 4.2.2)
## timechange 0.1.1 2022-11-04 [1] CRAN (R 4.2.2)
## tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.1)
## units 0.8-1 2022-12-10 [1] CRAN (R 4.2.2)
## urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.2.2)
## usethis 2.1.6 2022-05-25 [1] CRAN (R 4.2.2)
## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.1)
## vctrs 0.5.1 2022-11-16 [1] CRAN (R 4.2.2)
## viridis * 0.6.2 2021-10-13 [1] CRAN (R 4.2.1)
## viridisLite * 0.4.1 2022-08-22 [1] CRAN (R 4.2.1)
## withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.1)
## wk 0.7.1 2022-12-09 [1] CRAN (R 4.2.2)
## xfun 0.35 2022-11-16 [1] CRAN (R 4.2.2)
## xml2 1.3.3 2021-11-30 [1] CRAN (R 4.2.1)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 4.2.1)
## yaml 2.3.6 2022-10-18 [1] CRAN (R 4.2.2)
##
## [1] C:/Users/Damiano/AppData/Local/R/win-library/4.2
## [2] C:/Program Files/R/R-4.2.1/library
##
## D ── DLL MD5 mismatch, broken installation.
##
## ──────────────────────────────────────────────────────────────────────────────
Appendix B: Source code of the custom functions used to support the processing the data
#' Function to identify river network outlet
#'
#' @param river_network a polyline shapefile containing the river network
#' (must be of class sf)
#' @param catchment_mask a polygon shapefile with the extent of the catchment
#' (must be of class sf)
#'
#' @return the index of the river network that is corresponding to the outlet
#' (index points to the row in the river_network_shapefile)
#' @export
#'
#' @examples
#'
get_outlet <- function(river_network, catchment_mask, distance) {
catchment_mask %>%
st_union() %>%
st_cast("LINESTRING") %>%
st_is_within_distance(river_network, .,
dist = distance, sparse = FALSE) %>%
match(TRUE, .)
}
#' Function to create an object of class igraph with the river network attributes and
#' the right directionality
#'
#' @param network_links a point shapefile containing the river joins and dams
#' (must be of class sf)
#' @param river_net_simplified a polyline shapefile containing the river
#' network already sliced (must be of class sf). Must contain an ID column named 'NodeID'.
#' @param outlet integer indicating the reach in river_net_simplified that is
#' the outlet of the catchment
#'
#' @return 'igraph' object of the river network
#' @export
#'
#' @examples
#'
create_network <- function(network_links, river_net_simplified, outlet) {
# - use spatial join to get information about the links
# - create a distance matrics in long format
# - create temporaty network: it includes triangular cliques corresponding
# to the joints
# - loop over the triangles to remove the edges that are not good for having a
# nicely directed network
# - create a new undirected graph without loops
# - define the outlet
# - create a directed graph by setting directions based on the distance to the outlet
# Check where river_net_simplified is overlapping with network links,
# retain the information and save to network_links
network_links_df <- st_is_within_distance(network_links, river_net_simplified,
dist = 0.01) %>%
lapply(.,
FUN = function(x){data.frame("from" = x[1], "to" = x[2], "to2" = x[3]) }) %>%
do.call(rbind,.) %>%
cbind(network_links %>% st_drop_geometry())
# Get a full distance matrix
# - note: directions are messed up!
# - note: for the joints I am creating triangles (i.e. there are 3 nodes and 3 links):
# this is to be corrected afterwards when I decide the directionality
full_net_links_df <- rbind(
network_links_df %>%
filter(!is.na(to2)) %>%
dplyr::select(-to2) %>%
rename(from = from, to = to),
network_links_df %>%
filter(!is.na(to2)) %>%
dplyr::select(-to) %>%
rename(from = from, to = to2),
network_links_df %>%
filter(!is.na(to2)) %>%
dplyr::select(-from) %>%
rename(from = to, to = to2) ,
network_links_df %>%
filter(is.na(to2)) %>%
dplyr::select(-to2)
) %>%
#dplyr::select(from, to, type, id_dam, pass_u, pass_d) %>%
mutate(id_links = 1:nrow(.))
# Create vertices vector
vertices <- river_net_simplified %>%
st_drop_geometry %>%
mutate(name = NodeID) %>%
mutate(length = as.numeric(length))
# Create a graph based on this distance matrix and adjust directions
# note: add vertex name so it's easier to identify them in the next steps
river_temp_2 <- igraph::graph_from_data_frame(
d = full_net_links_df,
v = vertices,
directed = FALSE) %>%
igraph::simplify(remove.loops = FALSE, remove.multiple = TRUE, edge.attr.comb="first")
# Remove triangles from graph: this way the joints are keeping the right directionality
cliq3 <- cliques(river_temp_2, 3, 3)
links_to_remove <- list()
for (i in 1:length(cliq3)){
full_dist <- distances(river_temp_2, v =cliq3[[i]], to = outlet)
links_to_remove[[i]] <- data.frame(
"from" = rownames(full_dist)[full_dist != min(full_dist)],
"to" = rownames(full_dist)[full_dist != min(full_dist)] %>% rev)
}
links_to_remove <- do.call(rbind, links_to_remove)
# Create a graph with the edges to remove and subtract it from the original graph
graph_to_remove <- graph_from_data_frame(d = links_to_remove,
directed = FALSE)
river_temp <- difference(river_temp_2, graph_to_remove )
# Make the graph directional
# the algorithms checks edges distances to outlet and assigns direction
river_temp_df <- igraph::as_data_frame(river_temp, what = "edges") %>%
mutate(from = as.numeric(from), to = as.numeric(to))
out <- list()
for (iii in 1:nrow(river_temp_df)) {
df_iter <- river_temp_df[iii, ]
delta <- distances(river_temp, v = df_iter$from, to = outlet) -
distances(river_temp, v = df_iter$to, to = outlet)
if (delta > 0) {from = df_iter$from; to = df_iter$to}
if (delta < 0) {from = df_iter$to; to = df_iter$from}
out[[iii]] <- data.frame("from" = from, "to" = to,
"type" = df_iter$type,
"id_links" = df_iter$id_links,
"id_barrier"=df_iter$id_barrier,
"pass_u" = df_iter$pass_u,
"pass_d" = df_iter$pass_d)
}
river_graph_df <- do.call(rbind, out)
river_graph <- graph_from_data_frame(
d = river_graph_df,
directed = TRUE,
v = vertices)
# Return result
return(river_graph)
}
#' Function to snap points to a dense polyline
#'
#' @param x a point shapefile that contains the points to be snapped
#' (must be of class sf)
#' @param y a polyline shapefile that contains the river network
#' (must be of class sf)
#' @param max_dist maximum distance for snapping points
#'
#' @return a data frame that mirrors x, but with the geometries snapped to y
#' (the result of st_distance(x, y) is a vector of 0s)
#' @export
#'
#' @details this function needs a dense polyline shapefile
# (i.e. a shapefile that has small -short- lines).
# This shapefile can be built in ArcGIS using the functions to generate
# equally spaced points along a polyline and then to
# split line at points (points should be sampled with 100 - 500 m intervals )
#'
#' @examples
#'
snap_to_river <- function(x, y, max_dist = 1000){
## NOTE: This function is needed because sf is not good at snapping
# Simplify y to check distance
y_simple <- y %>% st_union()
# Make the river shapefile dense: extract points at each vertex
y_points <- y %>%
st_as_sf %>%
st_cast("POINT") %>%
mutate(id = 1:nrow(.))
# Make the river shapefile dense: add splits at each vertex
y_complex <- lwgeom::st_split(y, y_points) %>%
st_collection_extract(.,"LINESTRING") %>%
st_as_sf() %>%
mutate(length = st_length(.))
## Check the distribution length of the simplified reaches
#hist(y_complex$length, nclass = 300)
# Initialize output list
out <- list()
# Loop over x rows
for (i in 1:nrow(x)){
# Select the point that will be snapped
x_i.snap <- x[i,]
# Extract the nearest feature
nf <- y_complex[st_nearest_feature(x[i,], y_complex),]
# Check the distance between the point to snap and the nearest feature
# if yes, then snap, otherwise do not snap and keep the original point
if ( as.numeric(st_distance(x[i,], nf)) < max_dist) {
# # Snap the point to the nearest feature NOTE: st_snap is not working properly
# x_i.snap <- st_snap(x[i,], nf %>% st_cast("POINT"), tolerance = max_dist)
# Transform the nearest polyline to points
nf.points <- st_cast(nf, "POINT")
# Select the point that has the minimum distance to the point to snap
nf.points.min <- nf.points[ which.min(st_distance(x[i,],nf.points)),]
# Substitute the geometry of the point to snap with
# the geometry of the closest point in the nearest feature (nf) object
x_i.snap$geometry <- nf.points.min$geometry
# # Check the distance
# st_distance(x_i.snap, y_simple)
}
# Create output data frame
out[[i]] <- x_i.snap
#cat(paste0(i, " "))
}
# out_x should contain the same metadata of dams
out_x <- do.call(rbind, out)
return(out_x)
}
#' Check if dams are located on the headwaters of the network
#'
#' @param dams a sf point shapefile that is snapped to shape
#' @param shape a sf polyline shapefile
#'
#' @return a data.frame that contains all the attributes in the dams shapefile plus two new columns.
#' The new columns are: 'n_intersections', reporting how many segments of the polyline are touching the dam,
#' and dam_headwater that flags dams that are located in the headwaters (if value is TRUE). In this column,
#' a TRUE value identifies dams that should be removed from the analysis.
#' @export
#'
#' @description this functions identifies dams that are located in the headwaters. The workflow is as follows:
#' 1) a circular buffer is created around each dam; 2) sf::st_intersect is used to check how many sections of the
#' river polyline intersect the buffer; 3) if the dam does not have any section of the polyline upstream (i.e. the
#' buffer intersects the polyline only once), then it is located in the headwaters. These dams are flagged in the
#' output so the user can remove them
#'
#' @examples
headwaters_dam <- function(dams, shape){
# Create circular buffer around dams
dams_buffer <- st_buffer(dams, 1)
# Break shapefile at points
shape_points <- shape %>%
st_as_sf %>%
st_cast("POINT") %>%
mutate(id = 1:nrow(.))
shape_complex <- lwgeom::st_split(shape, shape_points) %>%
st_collection_extract(.,"LINESTRING") %>%
st_as_sf()
# Intersect with polyline
intersections_mat <- st_intersects(dams_buffer, shape_complex)
# Check if some element in intersection_mat is smaller than 2
# This means that the buffer is crossed only one time by the river network,
# meaning that the dam does not have an upstream segment
dams_df <- dams %>%
st_drop_geometry() %>%
mutate(
n_intersections = do.call(rbind, lapply(intersections_mat, FUN = length)),
flag_headwater = ifelse(n_intersections == 1, TRUE, FALSE))
return(dams_df)
}
#' Check if there are issues with confluences in hydrosheds shapefile
#'
#' @param shape a sf polyline
#'
#' @return a sf point object with the confluences and two additional fields: n_confluences, containing the number
#' of reaches that enter and exit the confluence, and flag_confluences
#' that flags the 'suspect' confluences with a TRUE value
#' @export
#'
#' @examples
multiple_confluences <- function(shape){
# Break shapefile at points
river_to_points <- shape %>%
mutate(NodeID = 1:nrow(.)) %>%
st_as_sf %>%
st_cast("POINT") %>%
mutate(id = 1:nrow(.))
joins_selection <- river_to_points %>%
st_equals() %>%
Filter(function(x){length(x) > 2}, .) %>%
lapply(., FUN = min) %>%
unlist() %>%
unique()
river_joins <- river_to_points %>%
dplyr::filter(id %in% joins_selection) %>%
dplyr::select(NodeID) %>%
st_as_sf
# Create buffer around shape_points
river_joins_buffer <- river_joins %>%
st_buffer(1)
# Intersect with polyline
intersections_mat <- st_intersects(river_joins_buffer, shape)
# Add information to the joints shapefile
river_joins$n_confluences <- as.vector(do.call(rbind, lapply(intersections_mat, FUN = length)) )
river_joins$flag_confluences <- ifelse(river_joins$n_confluences <=3, FALSE, TRUE)
return(river_joins)
}
#' Detects gaps in the river network polyline that might cause the function network_creation to fail
#'
#' @param shape the river network shapefile to be used as input to the network_creation function
#'
#' @return a point shapefile with the position of the gaps in the river network
#'
gaps_detection <- function(shape){
# Break shapefile at points
river_to_points <- shape %>%
st_as_sf %>%
st_cast("POINT") %>%
mutate(id = 1:nrow(.))
# Retain all confluences, both good and 'broken' ones
joins_selection <- river_to_points %>%
st_is_within_distance(dist = 0.0001) %>%
Filter(function(x){length(x) > 2 }, .) %>%
lapply(., FUN = min) %>%
unlist() %>%
unique()
all_river_joins <- river_to_points %>%
filter(id %in% joins_selection)
# Select casted points that are next to each confluence
close_points <- st_is_within_distance(all_river_joins, river_to_points, dist = 1)
# Loop over each points triplet and calculate the distance
out_dist <- list()
for (i in 1:length(close_points)){
out_dist[[i]] <- river_to_points %>%
filter(id %in% close_points[[i]]) %>%
st_distance() %>%
max()
}
# Problematic points are extracted
problematic_points <- close_points[which(out_dist > 0)] %>%
unlist() %>%
unique()
points_out <- river_to_points %>%
filter(id %in% problematic_points)
return(points_out)
}
#' Creates a river network shapefile with the component attribute that
#' can be useful for identifying disconnected chunks
#'
#' @param network_links a point shapefile containing the river joins and dams
#' (must be of class sf)
#' @param river_net_simplified a polyline shapefile containing the river
#' network already sliced (must be of class sf). Must contain an ID column named 'NodeID'.
#' @return sf object that copies river_net_simplified with an additional
#' 'component' field that can be plotted to spot isolated components
#' @export
#'
#' @examples
#'
check_components <- function(network_links, river_net_simplified) {
# - use spatial join to get information about the links
# - create a distance matrics in long format
# - create temporaty network: it includes triangular cliques corresponding
# to the joints
# - loop over the triangles to remove the edges that are not good for having a
# nicely directed network
# - create a new undirected graph without loops
# - extract the graph components and join with original shapefile
network_links_df <- st_is_within_distance(network_links,
river_net_simplified,
dist = 0.01) %>%
lapply(.,
FUN = function(x){data.frame("from" = x[1], "to" = x[2], "to2" = x[3]) }) %>%
do.call(rbind,.) %>%
cbind(network_links %>% st_drop_geometry())
# Get a full distance matrix
# - note: directions are messed up!
# - note: for the joints I am creating triangles (i.e. there are 3 nodes and 3 links):
# this is to be corrected afterwards when I decide the directionality
full_net_links_df <- rbind(
network_links_df %>%
filter(!is.na(to2)) %>%
dplyr::select(-to2) %>%
rename(from = from, to = to),
network_links_df %>%
filter(!is.na(to2)) %>%
dplyr::select(-to) %>%
rename(from = from, to = to2),
network_links_df %>%
filter(!is.na(to2)) %>%
dplyr::select(-from) %>%
rename(from = to, to = to2) ,
network_links_df %>%
filter(is.na(to2)) %>%
dplyr::select(-to2)
) %>%
#dplyr::select(from, to, type, id_dam, pass_u, pass_d) %>%
mutate(id_links = 1:nrow(.))
# Create vertices vector
vertices <- river_net_simplified %>%
st_drop_geometry %>%
mutate(name = 1:nrow(.)) %>%
mutate(length = as.numeric(length))
# Create a graph based on this distance matrix and adjust directions
# note: add vertex name so it's easier to identify them in the next steps
river_temp_2 <- igraph::graph_from_data_frame(
d = full_net_links_df,
v = vertices,
directed = FALSE) %>%
igraph::simplify(remove.loops = FALSE, remove.multiple = TRUE, edge.attr.comb="first")
# Get the components of river_temp
river_net_simplified_comp <- river_net_simplified %>%
mutate(NodeID = 1:nrow(.)) %>%
dplyr::select(NodeID)
river_net_simplified_comp$component <- as.vector(components(river_temp_2)$membership)
# Return output
return(river_net_simplified_comp)
}
References and key literature
Baldan, D., Cunillera-Montcusí, D., Funk, A., & Hein, T. (2022). ‘Riverconn’: An R package to assess river connectivity indices. Environmental Modelling & Software, 105470.
Lehner, B., Verdin, K., Jarvis, A. (2008): New global hydrography derived from spaceborne elevation data. Eos, Transactions, AGU, 89(10): 93-94.
Lehner, B., Grill G. (2013): Global river hydrography and network routing: baseline data and new approaches to study the world’s large river systems. Hydrological Processes, 27(15): 2171–2186. Data is available at www.hydrosheds.org.
Belletti, Barbara, et al. “More than one million barriers fragment Europe’s rivers.” Nature 588.7838 (2020): 436-441.
Verdin, K. L., & Verdin, J. P. (1999). A topological system for delineation and codification of the Earth’s river basins. Journal of Hydrology, 218(1-2), 1-12.
Cote, D., Kehler, D. G., Bourne, C., & Wiersma, Y. F. (2009). A new measure of longitudinal connectivity for stream networks. Landscape Ecology, 24(1), 101-113.
Kolaczyk, E. D., & Csárdi, G. (2014). Statistical analysis of network data with R (Vol. 65). New York: Springer.
Jumani, S., Deitch, M. J., Kaplan, D., Anderson, E. P., Krishnaswamy, J., Lecours, V., & Whiles, M. R. (2020). River fragmentation and flow alteration metrics: a review of methods and directions for future research. Environmental Research Letters.