Abundance of animals in an area (i.e. density) is one of the most basic population parameters desired in applied ecology. Many methods have been developed to estimate abundance within an area such as mark-recapture; however, it is often unclear the exact area within which the sampled population is contained. The issue is that animals move. If you imagine that each animal in the population has a region of activity (i.e. home range) within which it moves, you can further imagine how animals whose region of activity existing on the boundaries of the sampling area may move in and out of the sampling area between sampling events (Figure 1).
Figure 1. An example of a sampling reach in a river. The distributions over each fish represent their home range with the dashed vertical line representing the center of activity.
The effects of this movement can be thought of from two perspectives. One way of thinking about the problem is that the area sampled actually extends beyond the boundaries of the defined sampling area….but how far beyond? How far beyond depends on how much the animals actually move or, in other words, how large the average home range size is. Another way to think about the problem is that the movement of animals in and out of the sampling area affectively dilutes the proportion of tagged to untagged animals in the sampling area resulting in fewer animals available to be recaptured. This causes the abundance estimate to be biased high for the sampling area. From either perspective, density calculated as the abundance estimate divided by the sampling area will be biased high and, at the end of a typical mark-recapture survey, the amount of bias is unknown.
The magnitude of the bias depends on two factors…the size of the average home range and the size of the sampling area. If the home range size is large relative to the sampling area, the bias increases. Alternatively, if the sampling area is increased relative to the home range size, the bias will be reduced. At some point, if the sampling area is large enough relative to the home range, the system will start to approximate a closed population (Figure 2).
Figure 2. An example of two sampling reaches in a river. The dilution of tagged fish in the larger sampling reach will be less relative to the abundance in the reach resulting in less biased density estimates.
This is a key point because the interplay between home range size and sampling area size can provide researchers opportunity to design mark-recapture experiments that minimize bias in resulting density estimates. What is needed to inform the design is some information on home range size or movement patterns. You can find an example of how to achieve this in the attached paper, “Evaluating mark-recapture sampling designs for fish in an open riverine system” (Gwinn et al. 2011). In this paper we evaluated optimal sampling area sizes for Murray cod using published home range information applied in an individual-based model. The results showed a trade-off between the number of sampling events and the length of the sampling reach for achieving bias and precision goals (Figure 3).
Figure 3. A figure taken from Gwinn et al. (2011) showing the trade-off in sampling design elements for Murray cod.
Below is sample R code that runs the simulation. It consists of a series of sections. The first section is a function that simulates a fake data set; the second section is a function that fits the mark-recapture model to the data set; the third section contains the simulation input parameter values; the fourth section runs the simulation for each sampling reach size and each sampling design (i.e. number of sampling events); and the last section is a simple graph of the results. This code requires no additional R package and so, should run if paste it into your R terminal. This particular example simulates a one dimension sampling area such as a river where the system is only open to movement on the upstream and downstream boundaries; however, modifying the code to represent a two dimensional system should be fairly easy.
################################################################################ # A function that simulates mark-recapture data # ################################################################################ Datagen = function(N,pcap,LU,UU,LS,US,HRsd,Event){ #defines the center of each cods home range during each capture event hr_cent = runif(N,LU,UU) #defines the location of each cod during each capture event loc = t(replicate(Event,rnorm(N,hr_cent,HRsd),simplify=T) ) #flags animals within the study area during each capture event vul = ifelse(loc>LS,ifelse(loc<US,1,0),0) #flags captured animals in each capture event catch = matrix(NA,N,Event) for(i in 1:Event){catch[,i] = rbinom(N,vul[i,],pcap)} #Flags recaptured animals in each recapture event recap=matrix(0,N,Event) for (j in 2:Event){ for (i in 1:N){ recap[i,j] = ifelse(sum(catch[i,1:(j-1)])>0,ifelse(catch[i,j]==1,1,0),0)}} #Builds matrix of data where the first row is recaptures, the second row is # catch of unmarked individuals, and the third row is the true abundundance in # the sampling reach datgen = matrix(NA,3,Event) datgen[1,] = colSums(recap)[1:Event] datgen[2,] = colSums(catch)[1:Event]-datgen[1,] datgen[3,] = sum(ifelse(hr_cent>LS,ifelse(hr_cent<US,1,0),0)) datgen } ################################################################################ # A function that fits the mark-recapture model and returns a vector of outputs # Outputs: 1) "bias"=flag that indicates if bias standards have been met, 2)"cv" # =coeficient of variation of Nhat, 2) "convergence"= flag indicating that model # converged, and 3) "recaptures"=flag indicating when there were no recaptures # in the simulated data set. ################################################################################ Solution = function(dat){ UM = dat[2,] R = dat[1,] TrueN = dat[3,1] theta<-c(Ncap=sum(UM)+20) #vector of parameters and starting vlaues small = exp(-10) MRC <- function(theta){ #mark-recapture model function Nhat=theta phat=(UM+R)/Nhat UMp<-vector(length=length(UM)) #Predicted unmarked fish in sample UMp[1]=Nhat*phat[1]; for (j in 1:(length(UM)-1)){UMp[j+1]=(Nhat- sum(UM[1:j]))*phat[j+1]} RCp<-vector(length=length(UM))#Predicted recaptures in sample RCp[1]=0; for (j in 2:length(UM)){RCp[j]=(sum(UM[1:(j-1)]))*phat[j]} nLL = -1*(sum(-(UMp+small)+(UM+small)*log(UMp+small))+sum(-(RCp+small)+ (R+small)*log(RCp+small))) #Negative log like (Poisson kernal) nLL } fit <- try(optim(theta,MRC, method="Nelder-Mead",hessian=T),T)#fits data covar <- try( sqrt(solve(fit$hessian))/(fit$par),T)#coeficient of variation conf <- try( 1.96*sqrt(solve(fit$hessian)),T) #95% confident interval +- parameter = if(inherits(conf,"try-error")){#flags if the true N is within NA #the confidence intervals of the Nest } else { ifelse(abs(TrueN-fit$par)<conf,0,100) } cofvar = if(inherits(covar,"try-error")){ #calculates cv NA } else { covar } converg = if(inherits(fit,"try-error")){#flags model runs that don't converge 999 } else { fit$convergence } recaptures = ifelse(sum(R)==0,1,0) #flags when recaptures are zero c(bias=parameter,cv=cofvar,convergence=converg,recaps=recaptures) #output } ################################################################################ # Simulation input parameters ################################################################################ N = 110 #abundance in universe pcap = 0.20 #detection probability of animals in sampling universe LU = 0 #lower boundary of universe UU = 2000 #upper boundary of universe LS = 200 #lower boundary of sampling universe HR = 20 #Home range diameter HRsd = HR/2/1.96 #standard deviation that places 95% CI at HR boundaries # vector of samping events representing different sampling designs Event = seq(2,6) # vector of upper boudaries of sampling univers representing different spatial # designs US = seq(300,1700,by=100) ################################################################################ # This part runs the simulation ################################################################################ iter = 500 # matrix for results. This matrix will contain ones and zeros. A one indicates # when the iteration resulted in an abundance estimate that met flagmatrix= array(NA,dim=c(iter,length(US),5)) ptm=proc.time() #start time for (l in 1:length(US)){ for (m in 1:length(Event)){ doit= function(){Solution(Datagen(N,pcap,LU,UU,LS,US[l],HRsd,Event[m]))} mat = replicate(iter,doit(),simplify = T) mat2 = matrix(NA,2,iter) for (i in 1:iter){ mat2[1,i] = ifelse(sum(mat[3:4,i])==0,mat[1,i],NA) mat2[2,i] = ifelse(sum(mat[3:4,i])==0,mat[2,i],NA)} flagmatrix[,l,m]=ifelse(mat2[1,]==0,ifelse(mat2[2,]<=0.25,1,0),0) }} endtime=proc.time()-ptm endtime[3]/60 #------------------------------------------------------------------------------# # Graph of results #------------------------------------------------------------------------------# flagmat <- ifelse(flagmatrix[,,]=="NA",0,flagmatrix[,,]) sucess = colMeans(flagmatrix,na.rm=T) xvec = US-LS plot(xvec,sucess[,1],ylim=c(0,1),type="l",xlab="Length of sampling reach", ylab="Probability of acceptable estimate",lty=1) for(i in 2:(max(Event)-1)){lines(xvec,sucess[,i],lty=i)}
This is a fairly simple simulation to perform and is more informative than some of the heuristics published in the past such as adding half a home range distance to the sampling area. Additionally, this may be a more user friendly method for designing mark-recapture studies than using more complex designs. For example, spatially-explicit mark-recapture models account for movement of animals by employing trap grids and explicitly modeling the movement of animals among traps to estimate home range size. While this class of model eliminates the biases discussed above, employing trap grids may not be feasible or the expertise to fit these complex models may not be available.
Hey Dan, Interesting stuff and I’m glad you’ve got the website up and running. I’d be interested if you are planning on putting up some of your more basic simulations and code. So far as I can tell, I don’t see that on the website anywhere here. But it would be nice to be able to share these simulation and analytical tools with others up here in Calgary. -Kyle
Hey Thanks Kyle. I added some example code to the post. I think adding code to these posts is a really good suggestion. Particularly because there are so many simulations I build just to learn something about the behavior of an analysis I’m learn. This would be a good place to share those. Hope you are doing well in Calgary! (looking forward to spring, I imagine:)
Spring can’t get here soon enough! Seriously, 2 days ago it was -30C… tomorrow it might be +13C… make a choice on weather Canada!
I read over Ben Bolker’s website from time to time (http://rpubs.com/bbolker) and I love that style. I think that your website here could be a similarly valuable tool for statistical ecologists and managers to refer to for guidance (Ben’s style focuses more towards the theoretical/conceptual ecology side of things). You do have a great way of describing both the “boots-on-the-ground” reality and the statistical techniques needed to improve our inference. That’s one of the reasons I think example code is really helpful; it’ll help bridge that gap towards the applied managers and ecologists who aren’t particularly quantitative or skilled in programming. Please keep it up, cheers!