Integration & Summary


The greatest challenge in developing a large-scale biogeographic assessment is the synthesis and subsequent analysis of spatial data collected at different scales for varied objectives (Gotway and Young 2002). This is particularly true when attempting to describe meso-scale (10s - 100s of kilometers) spatial patterns using data for a range of taxa that were each collected using different sampling techniques. The taxon-specific sections of this document describe spatial patterns of community structure for marine birds, marine mammals, and marine fishes. The intent of this section is to coalesce these results and construct a unified and biologically relevant assessment of the biogeographic patterns observed.

There are a number of ways to address the challenge of integrating results for multiple taxa, and this section contains results for three (of many) reasonable options. This integration effort has been tailored to the NMSP mission of "...enhancing biodiversity, ecological integrity, and cultural heritage", and specifically focuses on the notion of biodiversity in describing the overall biogeography of the region.

After a thorough assessment of the spatial data for each taxon, it was concluded that the marine mammal data were not robust enough in present form to include in the integration process. As such, only birds and fish were considered here. Additional efforts to reconcile outstanding issues in the marine mammal data are ongoing. The integration alternatives provided in this section include:

Option 1: A co-occurrence analysis of diversity hot spots for marine birds and fishes.
Option 2: A co-occurrence analysis of density hot spots for marine birds and marine fishes.
Option 3: A co-occurrence analysis of hot spots of diversity and density for marine birds and fishes (options 1 and 2 combined).


In the first of these approaches, only patterns of species diversity were analyzed. This index was relatively simple to calculate using the data available for birds and fishes, and represents a common metric for integration. The second option focuses on spatial patterns of density. Density is a more intuitive measure than diversity, and it highlights regions of highest marine bird concentrations (abundance). An added attraction of density is that it is only weakly influenced by effort. The third approach incorporates the two metrics for marine birds and fishes simultaneously by combining results of options one and two.
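The co-occurrence idea behind these options can be sketched as a simple overlay of boolean hot spot masks. The sketch below is illustrative only: it assumes a hot spot is the top 20% of cell values on a surface (the threshold used later in this section for marine bird diversity), the two surfaces are made-up random arrays, and the names `hot_spot_mask`, `co_occurrence`, and `hot_spot_count` are hypothetical.

```python
import numpy as np

def hot_spot_mask(surface, top_fraction=0.20):
    """Flag cells whose value falls in the top `top_fraction` of the surface.

    NaN marks cells outside the analysis extent; they are never hot spots.
    """
    threshold = np.nanquantile(surface, 1.0 - top_fraction)
    with np.errstate(invalid="ignore"):
        return surface >= threshold  # NaN >= threshold evaluates to False

# Made-up 10 x 10 surfaces standing in for interpolated bird and fish diversity.
rng = np.random.default_rng(0)
bird_diversity = rng.random((10, 10))
fish_diversity = rng.random((10, 10))
bird_diversity[0, 0] = np.nan  # a cell clipped from the bird surface

bird_hot = hot_spot_mask(bird_diversity)
fish_hot = hot_spot_mask(fish_diversity)

# Cells that are hot spots for both taxa at once.
co_occurrence = bird_hot & fish_hot
# Number of taxa (0, 1 or 2) for which each cell is a hot spot.
hot_spot_count = bird_hot.astype(int) + fish_hot.astype(int)
```

The same overlay could be repeated with density surfaces (option 2), or with all four masks summed (one possible reading of option 3).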

Metrics used in these three options were chosen to best define the biogeography for each taxon based on the available data. Once each integration parameter was mapped, patterns of community structure were superimposed and interpreted in the context of various biological and physical covariates. These spatial covariates were used to better understand general biogeographic patterns, and through interpretation, suggest reasons for the observed spatial trends. For example, results indicated that a portion of highest observed bird and fish diversity occurred adjacent to the shelf/slope interface. It is well documented that strong upwelling of deep ocean waters consistently occurs in areas along the slope. Nutrients in these waters support high phytoplankton productivity, which stimulates a cascade of productivity at all levels of the marine food web (Bolin and Abbott, 1963; Ryther, 1969; Malone, 1971; Barber and Chavez, 1983; Chavez 1995, 1996; Bakun, 1996).

Furthermore, by combining multiple parameters across taxa (option 3), it was possible to link results presented in earlier sections to an integrated composite. This approach provides a clear and tractable interpretation that the reader can follow as a logical end point to the preceding series of analyses. The combination of diversity and density presents an inclusive view of important areas across taxa, and is less likely to overlook regions of potential importance when compared to maps depicting a single estimate (e.g., options 1 and 2). This is a critical point as the most diverse patch in a seascape is not necessarily the most productive. In addition to the general patterns observed for each metric, the spatial coincidence of hot spots among taxa is emphasized to provide a view into the integrated ecosystem. The metrics used in this section are similar to those described in sections 2.1 (fish) and 2.2 (marine birds); however, data were interpolated to produce a continuous modeled surface rather than estimates per 5 minute grid. This approach takes into consideration the spatial structure in the data to model the gradient of the metric between any given pair of sampling points. This results in smoothed surfaces that permit easier visualization of biologically significant areas. Resulting large-scale patterns have been described in the context of sanctuary boundaries to provide insights that may enhance management efficacy in these protected areas.

Data and Analysis

Integration Metrics. There are a number of ways by which ecologists measure diversity. The simplest metric is a count of the total number of unique species in a community, also called species richness (S). This is a straightforward, though potentially misleading, measure of diversity. Sampling must be conducted at all locations with the same amount of effort for this estimate to be comparable across a study region or between data sets. Unfortunately, this was not the case with any of the source data available for integration. For example, marine bird observation transects were far more numerous (more effort) near shore, and declined dramatically with distance from shore. Because this is often the case with biological sampling, a number of diversity indexes have been developed that are, in theory, more independent of sample size. These are based on the relationship between species richness and the total number of individuals observed (n), both of which increase as a function of effort, and, ideally, cancel out the effect of effort on the resulting index (Ludwig and Reynolds, 1988). Here the Shannon index of diversity (Shannon and Weaver, 1949) was chosen, as this index is the most widely used in community ecology and has relatively small statistical bias when sample sizes are large (as is the case with our source data).

Diversity may be thought of as being composed of two distinct components: 1) species richness, and 2) species evenness. Evenness describes how evenly the individuals in a community are distributed among its species. For example, for a community comprised of five species with 70% of the individuals belonging to one species and 30% distributed among the remaining four species, the evenness component would be lower than if there were a more even distribution of individuals among the five species (Ludwig and Reynolds, 1988). Maximum diversity for a given number of species and individuals is achieved where equal numbers are found for each species in a community. For consistency, data for all taxa included in this section were summarized by five minute grids (see sections 2.1, 2.2, 2.3). Total diversity was estimated within each grid cell using the Shannon diversity index (H'):

$$H' = -\sum_{i=1}^{S} \frac{n_i}{n} \ln\left(\frac{n_i}{n}\right)$$

where n_i is the number of individuals belonging to the i-th of the S species in the sample (5 minute grid), and n is the total number of individuals in the sample (Ludwig and Reynolds 1988). Diversity was calculated independently for birds and fishes using all species observed within a grid cell.
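The evenness example given earlier can be worked through numerically. The following is a minimal sketch, assuming the natural logarithm is used for the index; the counts are invented to match the 70%/30% example, and `shannon_diversity` is a hypothetical helper name.

```python
import math

def shannon_diversity(counts):
    """Shannon index H' = -sum((n_i / n) * ln(n_i / n)) over species with n_i > 0."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

# Hypothetical grid cell: five species, 70% of individuals in one species.
uneven = [70, 8, 8, 7, 7]
# Same richness and total count, individuals spread equally among species.
even = [20, 20, 20, 20, 20]

h_uneven = shannon_diversity(uneven)
h_even = shannon_diversity(even)
h_max = math.log(len(even))  # for S species, H' cannot exceed ln(S)

print(f"uneven: {h_uneven:.3f}  even: {h_even:.3f}  maximum: {h_max:.3f}")
```

Both communities have the same richness (S = 5), so the lower index for the first community reflects its lower evenness alone.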

Once diversity was calculated for each taxon in each sample, a continuous map surface was interpolated to predict diversity patterns throughout the study area. The same process was used to model the bird density surface (see below for detailed methods).

Spatial Modeling. This section details the procedure used to process input data for the integration analyses. While technical in nature, it provides the information necessary for NMSP and others to generate results identical to this study using data provided in the appendix to this document (CD-ROM), and to explore results of alternate modeling options. The observed patterns in diversity and density were found to be robust to changes in model parameters; however, calculations of the areal extent of persistent patterns may be more sensitive. For example, the location of areas of high bird diversity tends to be relatively constant, regardless of model parameters. The quantity (e.g., square kilometers) of these high areas that fall inside sanctuary boundaries, however, may change.

For interpolation and calculation of spatial autocorrelation statistics, data for each 5 minute grid cell were assigned to the cell centroid. All data were analyzed in the Universal Transverse Mercator (UTM) projection. Projection is necessary to ensure that the value of x and y units is equivalent and constant across the study region. The spatial modeling process to generate an interpolated surface consisted of the following sequence of operations:

1) Checking for Spatial Autocorrelation: Prior to interpolation, all data were tested for the presence of spatial autocorrelation. Positive autocorrelation (where values for neighboring pairs of points are more similar to one another than are distant ones) is important for accurate interpolation. Moran's I and Geary's C statistics were calculated for each interpolated variable to test for the presence of significant spatial autocorrelation using CrimeStat (Levine, 2002). Moran's I is the standard autocorrelation statistic and provides a global (i.e. across the study area) test of spatial autocorrelation. Geary's C is more sensitive to autocorrelation within small neighborhoods. Confirmation of statistically significant spatial autocorrelation suggests that point data are suitable for interpolation. As such, interpolation was performed only where this was true for both autocorrelation statistics.
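For readers who wish to see what the two statistics measure, the sketch below computes Moran's I and Geary's C directly from their standard definitions. It is not a reproduction of the CrimeStat procedure: the inverse-distance weighting is an assumption made for illustration, no significance test is included, and the six points are invented.

```python
import numpy as np

def inverse_distance_weights(xy):
    """Pairwise weights w_ij = 1 / d_ij, with zero weight on the diagonal."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    w = np.zeros_like(d)
    nonzero = d > 0
    w[nonzero] = 1.0 / d[nonzero]
    return w

def morans_i(values, w):
    """Moran's I: weighted cross-products of deviations from the mean."""
    z = values - values.mean()
    return len(values) / w.sum() * (w * np.outer(z, z)).sum() / (z ** 2).sum()

def gearys_c(values, w):
    """Geary's C: weighted squared differences between pairs of values."""
    z = values - values.mean()
    diff_sq = (values[:, None] - values[None, :]) ** 2
    return (len(values) - 1) / (2.0 * w.sum()) * (w * diff_sq).sum() / (z ** 2).sum()

# Invented example: two tight clusters of cell centroids with contrasting values.
xy = np.array([[0, 0], [1, 0], [0, 1], [100, 0], [101, 0], [100, 1]], dtype=float)
values = np.array([10.0, 10.0, 10.0, 0.0, 0.0, 0.0])

w = inverse_distance_weights(xy)
print("Moran's I:", morans_i(values, w))
print("Geary's C:", gearys_c(values, w))
```

Positive autocorrelation appears as a Moran's I above its expected value of -1/(N-1) and a Geary's C below 1, which is the direction reported for all variables in the table at the end of this section.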

2) Detrending: Detrending is done to "standardize" the estimate across the analysis extent, and is a prerequisite for the interpolation procedure used here. After interpolation, the removed trend is added back into the model results. Each interpolated variable was plotted against Northing and Easting, and a linear trend was fit to each plot. When a significant trend (p < 0.05) was present for either Northing or Easting, the data were detrended (first order) before variogram modeling and kriging.
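A first-order detrend of this kind can be sketched as a least-squares plane fit on Easting and Northing. The example below is an illustration under stated assumptions rather than the procedure run in the Geostatistical Analyst: the coordinates and values are simulated, the significance test on the trend is omitted, and the function names are hypothetical.

```python
import numpy as np

def fit_first_order_trend(easting, northing, values):
    """Least-squares plane: value ~ b0 + b1 * easting + b2 * northing."""
    design = np.column_stack([np.ones_like(easting), easting, northing])
    coeffs, *_ = np.linalg.lstsq(design, values, rcond=None)
    return coeffs

def trend_at(coeffs, easting, northing):
    """Evaluate the fitted plane at the given coordinates."""
    return coeffs[0] + coeffs[1] * easting + coeffs[2] * northing

# Simulated cell centroids (metres) with a south-to-north gradient plus noise.
rng = np.random.default_rng(1)
easting = rng.uniform(0, 100_000, 200)
northing = rng.uniform(0, 100_000, 200)
values = 1.0 + 2e-5 * northing + rng.normal(0, 0.1, 200)

coeffs = fit_first_order_trend(easting, northing, values)

# Residuals are what would be passed on to variogram modeling and kriging.
residuals = values - trend_at(coeffs, easting, northing)

# After interpolation the trend is added back; here it is restored at the data points.
restored = residuals + trend_at(coeffs, easting, northing)
```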

3) Variogram Modeling: Empirical variograms show the decrease in relatedness between pairs of points as a function of distance. In order to calculate the empirical variogram, pairs of points must be binned by distance, and the semivariance (half the mean squared difference in diversity or density) calculated for all pairs within a given bin. The size of the bin is referred to as the lag size. A variogram model is fit to the empirical variogram and its parameters are later used in interpolation. Empirical variograms were calculated using the default lag size and number, as well as for 1 km, 5 km, and 10 km lag sizes. The appropriate lag size and number of lags were chosen to optimize variogram coherence. Directional variograms were then plotted to investigate possible anisotropy not removed by detrending. Spherical variogram models were fit to the empirical variograms. A spherical model was chosen based on the pattern of the empirical variograms and the lack of data at short lag distances (due to the five minute minimum separation between points), which are necessary to differentiate between spherical and Gaussian models.
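The binning step and the spherical model can be sketched as follows. This is a simplified, isotropic illustration: the sample points are simulated, no model fitting or directional analysis is performed, and the function names are hypothetical. The spherical model is evaluated with the bird diversity parameters reported in the table at the end of this section (nugget 0.25, partial sill 0.117, range 99.851 km).

```python
import numpy as np

def empirical_variogram(xy, values, lag_size, n_lags):
    """Mean pair distance and semivariance for each distance bin of width lag_size."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    half_sq_diff = 0.5 * (values[:, None] - values[None, :]) ** 2
    upper = np.triu_indices(len(values), k=1)  # count each pair once
    d, half_sq_diff = d[upper], half_sq_diff[upper]
    bins = np.floor(d / lag_size).astype(int)
    lags = np.full(n_lags, np.nan)
    gamma = np.full(n_lags, np.nan)
    for k in range(n_lags):
        in_bin = bins == k
        if in_bin.any():
            lags[k] = d[in_bin].mean()
            gamma[k] = half_sq_diff[in_bin].mean()
    return lags, gamma

def spherical_model(h, nugget, partial_sill, range_):
    """Spherical variogram: rises from the nugget to nugget + partial sill at the range."""
    h = np.asarray(h, dtype=float)
    ratio = np.minimum(h / range_, 1.0)
    gamma = nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)
    return np.where(h > 0, gamma, 0.0)  # semivariance is zero at zero distance

# Simulated sample points (km) and values, for illustration only.
rng = np.random.default_rng(2)
xy = rng.uniform(0, 100, size=(150, 2))
values = rng.normal(1.5, 0.5, 150)
lags, gamma = empirical_variogram(xy, values, lag_size=5.0, n_lags=20)

# Modeled semivariance at a few distances using the bird diversity parameters.
modeled = spherical_model([10.0, 50.0, 99.851, 150.0], 0.25, 0.117, 99.851)
```

Beyond the range the modeled semivariance stays at the sill (nugget plus partial sill), which is why pairs of points farther apart than the range contribute no additional spatial information to the interpolation.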

4) Surface Interpolation: The interpolation method used is termed "ordinary kriging". Kriging is a linear interpolation method that allows predictions of unknown values of a random function from observations at known locations (Kaluzny et al., 1998). Ordinary kriging is the kriging method generally used for interpolation of a single continuous variable of unknown mean. Kriging is preferred over other interpolation methods because: 1) weights are based on an empirical assessment of the data's spatial structure (the variogram), 2) kriging is an unbiased predictor, and 3) for many variables, kriging has been shown to outperform other interpolation methods, such as inverse distance weighting (IDW) and triangulated irregular networking (TIN) (Guan et al., 1999). Before kriging can be applied, two assumptions must be checked. The first is stationarity; the mean (and ideally the variance) must be constant across the spatial extent of the data. That is, any large scale trend must be removed (see #2 above). The second assumption is isotropy of the variogram. The covariance between any two points is assumed to be a function only of the distance between the points, not of their location or angle. This assumption can be examined and, if necessary, corrected for during the variogram modeling stage (see #3 above). Trend analysis was conducted using JMP statistical software (SAS Institute), while detrending, variogram modeling, and kriging were conducted using the ArcView (GIS) Geostatistical Analyst Extension (ESRI Inc.).
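The ordinary kriging system itself is compact enough to sketch. The example below is a bare-bones illustration and not the Geostatistical Analyst implementation: it uses every data point rather than a sector-based search neighborhood, the five sample points and their values are invented, and the variogram parameters are borrowed from the bird diversity row of the table at the end of this section.

```python
import numpy as np

def spherical(h, nugget, partial_sill, range_):
    """Spherical variogram with zero semivariance at zero distance."""
    h = np.asarray(h, dtype=float)
    ratio = np.minimum(h / range_, 1.0)
    gamma = nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)
    return np.where(h > 0, gamma, 0.0)

def ordinary_krige(xy, values, target, variogram):
    """Predict the value at `target` from all points; returns (prediction, std error, weights)."""
    n = len(values)
    pair_dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    # Left-hand side: semivariances between data points, bordered by the
    # unbiasedness constraint that forces the weights to sum to one.
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = variogram(pair_dist)
    lhs[n, n] = 0.0
    # Right-hand side: semivariances between each data point and the target.
    rhs = np.ones(n + 1)
    rhs[:n] = variogram(np.linalg.norm(xy - target, axis=1))
    solution = np.linalg.solve(lhs, rhs)
    weights, lagrange = solution[:n], solution[n]
    prediction = float(weights @ values)
    variance = float(weights @ rhs[:n] + lagrange)
    return prediction, float(np.sqrt(max(variance, 0.0))), weights

def bird_diversity_model(h):
    """Spherical model with the bird diversity parameters (distances in km)."""
    return spherical(h, nugget=0.25, partial_sill=0.117, range_=99.851)

# Invented cell centroids (km) and diversity values.
xy = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 5]], dtype=float)
values = np.array([1.2, 1.5, 1.1, 1.9, 1.6])

prediction, standard_error, weights = ordinary_krige(
    xy, values, np.array([4.0, 6.0]), bird_diversity_model
)
```

Because the weights are constrained to sum to one, the predictor is unbiased without requiring the mean to be known, which is the property that distinguishes ordinary kriging from simple kriging.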

The kriging neighborhood was set to the twenty nearest neighbors with a minimum of five neighbors for each 90 degree angular sector for the fish data, and reduced to eight and five for birds in order to capture small scale variability. Cross validation was conducted to assess model accuracy by regressing observed versus predicted values. Maps of the kriging standard error were also generated and used to restrict the analysis extent. In order to avoid unsupported extrapolation into poorly sampled areas, the interpolated maps were clipped so that only grid cells for which the standard error was in the lowest 20% were used for subsequent display and analysis.
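The clipping rule and the cross-validation summary can be sketched in a few lines. The example assumes that the cross-validation r2 is the coefficient of determination from a simple linear regression of predicted on observed values (equal to the squared Pearson correlation); the arrays are invented and the function names are hypothetical.

```python
import numpy as np

def clip_by_standard_error(prediction, standard_error, keep_fraction=0.20):
    """Keep predictions only where the standard error is in the lowest `keep_fraction`."""
    cutoff = np.nanquantile(standard_error, keep_fraction)
    with np.errstate(invalid="ignore"):
        keep = standard_error <= cutoff  # NaN standard errors are never kept
    return np.where(keep, prediction, np.nan)

def cross_validation_r2(observed, predicted):
    """r2 of a simple linear regression between observed and predicted values."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return r ** 2

# Invented surfaces: standard error grows toward the poorly sampled edge of the grid.
rng = np.random.default_rng(3)
predicted_surface = rng.normal(1.5, 0.3, size=(20, 20))
se_surface = np.tile(np.linspace(0.3, 0.6, 20), (20, 1))
clipped = clip_by_standard_error(predicted_surface, se_surface)

# Invented cross-validation pairs (observed value, value predicted with that point withheld).
observed = rng.normal(1.5, 0.5, 100)
predicted = 0.5 * observed + rng.normal(0.75, 0.3, 100)
r2 = cross_validation_r2(observed, predicted)
```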

5) Correcting for Effort: Total effort was calculated as the total length of trawls falling within a grid cell for the NMFS trawl data and as the total area surveyed within a grid cell for the marine bird survey data. Although diversity is less related to effort than other metrics, a statistically significant correlation (p < 0.05) between the two was found for both fish and birds. When such a correlation exists, maps of diversity may simply reflect the distribution of effort. In order to correct for differences in effort across the study region, the following technique was applied: A second order polynomial regression of diversity on effort was conducted and the residuals were interpolated as described above. The interpolated map of residuals depicts areas of higher or lower diversity relative to that expected given the amount of local effort. This map was overlaid on the interpolated map of diversity to visualize the impact of effort on the observed patterns in diversity. Although significantly correlated with effort, fish diversity showed nearly identical patterns to the residual map of fish diversity. Fish diversity was therefore not corrected for effort. Patterns of marine bird diversity, however, differed substantially from patterns of the bird diversity residuals, indicating that differences in effort are responsible for some of the observed pattern. Marine bird diversity hot spots, as represented by the top 20% of diversity cell values, are therefore presented, along with an overlay of the lower third of the diversity residuals for marine birds. Areas of high marine bird diversity that overlap with low residuals should be interpreted with caution, as these hot spots may simply reflect areas of unusually high effort. Since bird and fish density were only weakly correlated with effort, no attempt was made to correct the density maps.
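The effort correction can be sketched as a quadratic regression followed by an overlay of two masks. The example is illustrative only: effort and diversity values are simulated, the masks are computed on cell values rather than on interpolated surfaces, and `effort_residuals` and `caution_mask` are hypothetical names.

```python
import numpy as np

def effort_residuals(effort, diversity):
    """Residuals from a second order polynomial regression of diversity on effort."""
    coeffs = np.polyfit(effort, diversity, deg=2)
    expected = np.polyval(coeffs, effort)
    return diversity - expected, coeffs

def caution_mask(diversity, residuals, hot_fraction=0.20, low_fraction=1.0 / 3.0):
    """Cells in the top 20% of diversity that also fall in the lower third of residuals."""
    hot = diversity >= np.quantile(diversity, 1.0 - hot_fraction)
    low_residual = residuals <= np.quantile(residuals, low_fraction)
    return hot & low_residual

# Simulated grid cells: diversity rises with effort and then levels off.
rng = np.random.default_rng(4)
effort = rng.uniform(1, 50, 300)
diversity = 0.8 + 0.04 * effort - 0.0004 * effort ** 2 + rng.normal(0, 0.15, 300)

residuals, coeffs = effort_residuals(effort, diversity)

# Hot spots that may simply reflect unusually high effort.
interpret_with_caution = caution_mask(diversity, residuals)
```

A cell flagged by this mask has high diversity in absolute terms but lower diversity than expected for its level of effort, which is the situation the text above advises interpreting with caution.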


Spatial Statistics. The table below summarizes the results of spatial autocorrelation tests, variogram fitting, and kriging cross validation. All variables were found to be significantly positively spatially autocorrelated (p < 0.05) by both the Moran's I and Geary's C statistics, for fish and marine birds. Spatial autocorrelation was more pronounced in the marine bird data than in the fish data.

| Variable | Sample size | Moran's I | Geary's C | Detrending? | Lag size (km) | Number of lags | Range (km) (minor range for anisotropic model) | Nugget | Partial sill | Cross-validation r2 | Neighbors (total, minimum per sector) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Bird diversity | 1,163 | 0.141** | 0.830** | Yes | 20.451 | 12 | 99.851 | 0.25 | 0.117 | 0.541 | 8, 2 |
| Bird diversity residual | 1,163 | 0.067** | 0.893** | Yes | 9.966 | 12 | 88.062 | 0.199 | 0.125 | 0.407 | 8, 2 |
| Bird density | 1,403 | 0.058** | 0.891** | Yes | 5 | 30 | 149.54 | 1,189.30 | 1,189.50 | 0.253 | 20, 5 |
| Fish diversity | 301 | 0.018** | 0.973* | No | 5 | 20 | 29.595 | 0.148 | 0.046 | 0.025 | 20, 5 |
| Fish diversity residual | 301 | 0.013* | 0.975* | No | 5 | 10 | 18.394 | 0.11 | 0.061 | 0.076 | 20, 5 |
| Fish density | 301 | 0.020** | 0.969* | No | 5 | 8 | 37.929 | 0.0046 | 0.001 | 0.074 | 20, 5 |

**indicates significance at p = 0.01, * indicates significance at p = 0.05