Estimate the DGB Distribution Parameters

Estimates the parameters of the DGB distribution (also known as the Beta-Cocho distribution) first defined in Mansilla et al. (2007) doi:10.1016/j.joi.2007.01.001 and further characterized in Martinez-Mekler et al. (2009) doi:10.1371/journal.pone.0004791 for a given set of data. BC_param calculates the log of the data and estimates the abundance from the ranking using a linear model. The coefficients of the linear model are then scaled for future use.

Usage

BC_param(
  df_abundance = NULL,
  column = NULL,
  confidence_interval = 0.95,
  nls = FALSE,
  nls_loop = 1,
  nls_algo = "lm",
  nls_control = list(scale = "more"),
  BC_rank_object = NULL,
  ...
)

Arguments

df_abundance: A data frame that contains abundance data.
column: Either a string with the name of the column or the number of the column that stores the abundances in the data frame.
confidence_interval: Numeric. The confidence interval to calculate for the DGB distribution.
nls: Logical. Set to TRUE to use a non-linear least squares fitting method from the 'gslnls' package.
nls_loop: Numeric. Set to values higher than 1 to repeat the nls method as many times as needed, reseeding with the last value. This improves the fit with diminishing returns.
nls_algo: String. The algorithm to use for the gsl_nls function.
nls_control: List. Provided for the control argument for the gsl_nls function.
BC_rank_object: Optional. A previous object generated by BC_rank.
...: passes arguments to BC_rank.

Value

A list with the following elements: The input data frame with added processed ranking data and predicted confidence interval values, the adjusted parameters, the confidence interval of the parameters, the linear model and a summary of the model.

Examples

paramBillionaires <- BC_param(df_abundance=DGBD::Billionaires, column= 2, confidence_interval=0.99)
head(paramBillionaires[[1]])
#>   pre_numerator pre_denominator      lwr      upr BC_rank rank abundance
#> 1          2640               1 496524.8 522807.6       1    1    211000
#> 2          2639               2 298181.3 312223.3       2    2    180000
#> 3          2638               3 221270.4 230940.5       3    3    114000
#> 4          2637               4 179058.0 186455.2       4    4    107000
#> 5          2636               5 151943.5 157940.0       5    5    106000
#> 6          2635               6 132865.4 137909.4       6    6    104000
paramBillionaires[2:5]
#> [[1]]
#>            A            a            b 
#> 1.946909e+05 7.396223e-01 1.221052e-01 
#> 
#> [[2]]
#>                    0.5 %       99.5 %
#> (Intercept) 1.849913e+05 2.048991e+05
#> log_den     7.437022e-01 7.355424e-01
#> log_num     1.180254e-01 1.261851e-01
#> 
#> [[3]]
#> 
#> Call:
#> stats::lm(formula = log_abundance ~ log_den + log_num)
#> 
#> Coefficients:
#> (Intercept)      log_den      log_num  
#>     12.1792      -0.7396       0.1221  
#> 
#> 
#> [[4]]
#> 
#> Call:
#> stats::lm(formula = log_abundance ~ log_den + log_num)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.88157 -0.03868  0.00415  0.03921  0.55573 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 12.179169   0.019826  614.31   <2e-16 ***
#> log_den     -0.739622   0.001583 -467.30   <2e-16 ***
#> log_num      0.122105   0.001583   77.15   <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.06127 on 2637 degrees of freedom
#> Multiple R-squared:  0.9944,	Adjusted R-squared:  0.9944 
#> F-statistic: 2.355e+05 on 2 and 2637 DF,  p-value: < 2.2e-16
#> 
#> 

paramMOMv3.3 <- BC_param(df_abundance=DGBD::MOMv3.3, column=7, nls=TRUE)
head(paramMOMv3.3[[1]])
#>   pre_numerator pre_denominator      lwr      upr BC_rank Continent Status
#> 1          3961             399 4.506527 4.531698     399        AF extant
#> 2          3921             439 4.414756 4.439505     439        AF extant
#> 3          4105             255 4.918282 4.946076     255        AF extant
#> 4          3819             541 4.207317 4.231185     541        AF extant
#> 5          3890             470 4.348100 4.372560     470        AF extant
#> 6          3854             506 4.274844 4.298994     506        AF extant
#>          Order  Family      Genus       Species abundance Combined mass
#> 1 Artiodactyla Bovidae      Addax nasomaculatus      4.85       70000.3
#> 2 Artiodactyla Bovidae  Aepyceros      melampus      4.72       52500.1
#> 3 Artiodactyla Bovidae Alcelaphus    buselaphus      5.23      171001.5
#> 4 Artiodactyla Bovidae Ammodorcas       clarkei      4.45       28049.8
#> 5 Artiodactyla Bovidae Ammotragus        lervia      4.68       48000.0
#> 6 Artiodactyla Bovidae Antidorcas   marsupialis      4.59       39049.9
#>   Reference
#> 1        60
#> 2    63, 70
#> 3    63, 70
#> 4        60
#> 5        75
#> 6        60
paramMOMv3.3[2:5]
#> [[1]]
#>          A          a          b 
#> 0.02024824 0.13548540 0.75075212 
#> 
#> [[2]]
#>       2.5 %     97.5 %
#> A 0.0185832 0.02191329
#> a 0.1332815 0.13768931
#> b 0.7418200 0.75968426
#> 
#> [[3]]
#> Nonlinear regression model
#>   model: abundance ~ A * ((max(BC_rank) + 1 - BC_rank)^b)/(BC_rank^a)
#>    data: ranked_frame
#>       A       a       b 
#> 0.02025 0.13549 0.75075 
#>  residual sum-of-squares: 218.3
#> 
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#> 
#> Number of iterations to convergence: 33 
#> Achieved convergence tolerance: 0
#> 
#> [[4]]
#> 
#> Formula: abundance ~ A * ((max(BC_rank) + 1 - BC_rank)^b)/(BC_rank^a)
#> 
#> Parameters:
#>    Estimate Std. Error t value Pr(>|t|)    
#> A 0.0202482  0.0008493   23.84   <2e-16 ***
#> a 0.1354854  0.0011242  120.52   <2e-16 ***
#> b 0.7507521  0.0045560  164.78   <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.2239 on 4356 degrees of freedom
#> 
#> Number of iterations to convergence: 33 
#> Achieved convergence tolerance: 0
#> 
#>