You can choose to see or hide all the R code of this paper with the button below

Abstract

Introduction

Light-level geolocators are a well-established technology used to study bird migration. Relying on a simple light sensor and time clock, these devices provide location data based on sunrise and sunset. Thanks to their extremely light weight (0.5 grams), they are currently the main tracking technology available to study migration patterns of very small birds (Bridge et al. 2011; McKinnon and Love 2018). Geolocators have already helped advance our understanding of bird migration on a number of levels: identifying migration routes and non-breeding locations (e.g. Salewski et al. 2013; Smith et al. 2014; Liechti et al. 2015; Kralj et al. 2020), as well as understanding migration strategies (e.g. Adamík et al. 2016; Briedis et al. 2019; Hahn et al. 2020) and migratory connectivity (e.g. Finch et al. 2015; Procházka et al. 2017; McKinnon and Love 2018), among others.

As habitat destruction and climate change accelerate and adversely affect migrant birds, it is urgent to better understand migration patterns to effectively protect migratory bird populations (e.g. Simmons et al. 2004; Sekercioglu 2010; Şekercioĝlu, Primack, and Wormworth 2012; Vickery et al. 2014). Geolocators can be instrumental in helping us understand the routes, timing, triggers and variability of migration, as well as identify breeding, non-breeding and stopover sites to protect. This is of particular relevance for long-distance Afro-tropical migrant whose migration patterns still remain largely unknown (e.g. Bennun 2000; Benson 1982; Bussière, Underhill, and Altwegg 2015; Cox et al. 2011; Nwaogu and Cresswell 2016; Osinubi 2018). In addition, thanks to their low-cost relative to GPS solutions (e.g., Bridge et al. 2011), geolocators are particularly well-suited to projects with limited budgets.

Though the material cost for geolocator studies may be low, the fieldwork can be particularly resource-intensive, especially since the birds equipped must be recaptured to retrieve the data. Surveys therefore need to be designed in a way that optimizes both the data collected and the resources used to do so (Hauser and McCarthy 2009; Moore and McCarthy 2016; Smart et al. 2016). The design of efficient and effective survey protocols has received the much attention in capture recapture studies (e.g., Devineau, Choquet, and Lebreton 2006; Lindberg 2012). This typically involves varying the levels of effort (survey duration, number of individuals surveyed, etc.) to estimate population level characteristic traits such as survival (e.g. Lieury et al. 2017). Optimizing geolocator surveys, on the other hand, generally comes down to maximizing the number of devices retrieved. Indeed, since geolocator studies seek to uncover bird movement information at the individual level, success relies exclusively on capturing enough tracks.

We base our optimization method on the fact that geolocator studies often take place within the context of existing ringing efforts. Indeed, assessing the suitability of the species and the optimal number of individuals to be equipped with logging devices requires basic information on recapture probability. This minimizes not only costs, but also potential negative impacts on a population (Brlík et al. 2020).This initial assessment can be carried out with relatively small datasets with variable effort, or even with data from similar locations or species. Though such historical data could also be leveraged to plan geolocator fieldwork, simple tools and methods to do so are currently lacking. Drawing on a case study carried out on the Red-capped Robin-chat, an Afro-tropical migrant in Kenya, we demonstrate how performing a pre-deployment analysis using an existing ringing database can improve the planning and implementation of geolocator deployment. In particular, we show how this pre-analysis can inform two questions: (1) How many birds can we expect to equip during a full season for a given ringing schedule? Ordering geolocators requires accurately estimating the number of individuals that can realistically be captured per ringing season. Since geolocators are configured to start collecting data for a specific year when assembled, overestimating the individuals captured would result in wasting devices, while underestimating would lead to missed opportunities and inefficient field work. An accurate estimation of number of birds captured also helps design an optimal ringing effort (in terms of number of sessions, duration of sessions, number of nets, etc). (2) How can we maximize bird recapture by equipping specific classes of birds (e.g., sex, age) and targeting specific periods of the year? The ringing database provides the recapture rate of a bird as a function of the equipment date but also of its age and weight, allowing the ringer to make an informed decision about whether to equip a given bird. This decision should also account for the number of geolocators available, the number of sessions left, and the specific research question asked.

Materials and methods

Case study species

In this case study, geolocators were placed on Red-capped Robin-chats Cossypha natalensis (RCRC) (16–17 cm; 24–40 g, (Collar 2020)), a terrestrial thrush living in the forest understory, shrubland and savannas. It tends to reside in forests where the canopy is not too dense, as it relies on low shrubs. Because both resident and intra-African migrant populations coexist, the migratory patterns of this species have been difficult to understand (Collar 2020). Further complexity is added by the fact that three sub-species are found in Africa: larischi in Western Africa (Nigeria and Angola), natalensisis in South Africa and intensahas ranging from N. South Africa to Somalia as well as central Africa.

On the coast of Kenya, RCRC are present from April to October, yet their breeding location and migration routes remain unknown (Nussbaumer and Jackson 2020). They are known to hold territory on their non-breeding sites and show site fidelity as demonstrated by the high recapture rate of 42% on the study site, making them an ideal species for a geolocator study.

Capture site and database

The A Rocha Kenya Conservation centre is located on the coast of Kenya and in the middle of the Northern Zanzibar-Inhambane Coastal Forest Mosaic ecoregion (3°22’36.3”S 39°59’16.9”E). This region is recognized for its high biodiversity value (Marris 2010) yet faces increasing habitat fragmentation due to the expansion of agriculture and charcoal burning (Burgess and Clarke 2000). The Conservation centre is located on a residential coastal scrub/forest that has been protected from limited habitat change over the last 50 years (Alemayehu 2016), in the effort to preserve the ecosystems for tourism. Mist nets are placed in a nature trail that runs through a small patch of forest managed by the Conservation centre.

specie_name <- "Red-capped Robin Chat"

capture <- read.csv("data/ringing_data.csv") %>%
  filter(Location == "Mwamba Plot 28, Watamu" | Location == "Mwamba plot 28") %>%
  filter(!NettingSite %in% c("Beach", "boat shed", "DINING ROOM", "Compost", "by compost", "Kilifi Bofa", "Mwamba Dining Room", "by volunteer rooms", "Spring trap by lamp post", "Office", "Net by bird bath.", "compost site", "Bird Bath")) %>%
  filter(!SessionNotes %in% c("Net at birdbath; BC = Bernard Chege", "1 net behind birdbath")) %>%
  mutate(DayOfYear = Julian) %>%
  mutate(
    NetsOther = str_replace(NetsOther, "11 total", "11"),
    NetsOther = str_replace(NetsOther, "10mx3", "30"),
    NetsOther = str_replace(NetsOther, "10m x3", "11"),
    NetsOther = str_replace(NetsOther, "2 10m", "20"),
    NetsOther = str_replace(NetsOther, "1x10m", "10"),
    NetsOther = str_replace(NetsOther, "10m x1", "10"),
    NetssOther = str_replace(NetsOther, "10m x 1", "10"),
    NetsOther = str_replace(NetsOther, "10mx2", "20"),
    NetsOther = str_replace(NetsOther, "10mx1", "10"),
    NetsOther = str_replace(NetsOther, "10m x 1", "10"),
    NetsOther = str_replace(NetsOther, "10x1", "10"),
    NetsOther = str_replace(NetsOther, "14x1, 10", "24"),
    NetsOther = str_replace(NetsOther, "14x1,10", "24"),
    NetsOther = str_replace(NetsOther, "10x3", "30"),
    NetsOther = replace_na(NetsOther, "0"),
    NetsOther = as.numeric(NetsOther),
    NetsLength = Nets6m * 6 + Nets9m * 9 + Nets12m * 12 + Nets18m * 18 + NetsOther,
    NetsLength = ifelse(NetsLength == 0, NA, NetsLength),
    NetsDuration = as.POSIXct(NetsClosed) - as.POSIXct(NetsOpen),
    NetsOpen = hour(as.POSIXct(NetsOpen)) + minute(as.POSIXct(NetsOpen)) / 60,
    WeatherCat = ifelse(!is.na(Weather), "none", NA),
    WeatherCat = ifelse(grepl("rain", Weather), "little", WeatherCat),
    WeatherCat = ifelse(grepl("drizzle", Weather), "little", WeatherCat),
    WeatherCat = ifelse(grepl("shower", Weather), "little", WeatherCat),
    WeatherCat = ifelse(grepl("heavy", Weather), "strong", WeatherCat),
  ) %>%
  mutate(
    NetsLength = ifelse(SessionID == 971, NA, NetsLength), # remove aberant data
    NetsOpen = ifelse(SessionID == 517, 6, NetsOpen), # correct 18:00->6
    NetsDuration = ifelse(SessionID == 637, NA, NetsDuration), # correct 18:00->6
    NetsOpen = ifelse(NetsOpen > 7 | NetsOpen < 5.5, NA, NetsOpen),
    #   NetsDuration = ifelse(!(abs(NetsDuration - median(NetsDuration, na.rm = T)) < 3*sd(NetsDuration, na.rm = T)),NA,NetsDuration),
  ) %>%
  arrange(Date) %>%
  select(SessionID, RingNo, Age, Sex, CommonName, Date, Year, Month, DayOfYear, NetsLength, NetsOpen, NetsDuration, WeatherCat, Weight) %>%
  group_by(Year) %>% # compute the number of RCRC captured this year so far
  mutate(
    n = ifelse(is.na(CommonName) | CommonName != specie_name, 0, 1),
    CountOfYear = cumsum(n)
  ) %>% #
  ungroup() %>%
  group_by(RingNo) %>%
  mutate(isFirstCaptureOfYear = !(Year == lag(Year, default = FALSE))) %>% # first capture of RCRC of this year
  ungroup()

session0 <- capture %>%
  group_by(Date) %>%
  summarize(
    Date = as.Date(first(Date)),
    DayOfYear = median(DayOfYear),
    Year = median(Year),
    Count = sum(CommonName == specie_name, na.rm = TRUE),
    CountAd = sum(CommonName == specie_name & Age != 0 & Age == 4, na.rm = TRUE),
    CountJuv = sum(CommonName == specie_name & Age != 0 & Age != 4, na.rm = TRUE),
    CountFoY = sum(CommonName == specie_name & isFirstCaptureOfYear, na.rm = TRUE),
    CountFoYAd = sum(CommonName == specie_name & Age != 0 & Age == 4 & isFirstCaptureOfYear, na.rm = TRUE),
    CountFoYJuv = sum(CommonName == specie_name & Age != 0 & Age != 4 & isFirstCaptureOfYear, na.rm = TRUE),
    NetsLength = median(NetsLength),
    NetsOpen = median(NetsOpen),
    NetsDuration = as.numeric(median(NetsDuration)),
    WeatherCat = as.factor(first(WeatherCat)),
    .groups = "drop_last"
  ) %>%
  group_by(Year) %>% # compute the number of RCRC capture so far this year
  mutate(CumCountFoY = cumsum(CountFoY) - ifelse(is.na(CountFoY), 0, CountFoY)) %>%
  ungroup() %>%
  arrange(Date)

session <- session0 %>%
  filter(DayOfYear > 100 & DayOfYear < 340)

tf <- function(t) {
  paste0(
    "M=", format(as.POSIXct(Sys.Date() + mean(t, na.rm = T) / 24), "%H:%M", tz = "UTC"), " ; SD=",
    format(as.POSIXct(Sys.Date() + sd(t, na.rm = T) / 24), "%H:%M", tz = "UTC")
  )
}

In this study, we use the ringing dataset from capture sessions conducted regularly from 2002 to present. Up to early 2019, the dataset consists of 3372 entries of 2532 rings covering 96 species collected during 315 sessions. The ringing effort presents some temporal variability, as well as variability in the metadata recorded (see Appendix 1. Data extends). In general, sessions start at sunrise (M=06:12 ; SD=00:14) and last until bird activity slows down (session duration M=04:08 ; SD=01:01). On average, a total of 160 m (SD=54 m) of nets were used. Descriptive notes on weather conditions were also included and later classified according to their expected influence on the capture rate (none, little, large). We manually checked extreme values in the dataset and removed those that could not be verified. In addition, we present the ringing data of 2020, when the geolocators were first deployed. We did not include this data in the fitting of the models but rather used it for comparison and discussion purposes.

In addition, we present the ringing data of 2020, when the geolocators were first deployed. We did not include this data in the fitting of the models but rather used it for comparison and discussion purposes.

capture2020 <- read.csv("data/ringing_mwamba_rcrc_2020.csv") %>%
  filter(Location == "Mwamba") %>%
  mutate(
    Date = dmy(date),
    CloseTime = hour(hm(CloseTime)) + minute(hm(CloseTime)) / 60,
    NetsOpen = hour(hm(OpenTime)) + minute(hm(OpenTime)) / 60,
    NetsDuration = CloseTime - NetsOpen,
    n = !(Notes == "No RCRC"),
    CountOfYear = cumsum(n),
    isFirstCaptureOfYear = !NewRetrap == "R",
  ) %>%
  select(RingNo, Age, Date, NetsOpen, NetsDuration, Notes, isFirstCaptureOfYear, CountOfYear)

session2020 <- capture2020 %>%
  group_by(Date) %>%
  summarize(
    DayOfYear = as.numeric(format(Date, "%j")),
    Count = sum(!(Notes == "No RCRC"), na.rm = T),
    CountFOY = sum(!(Notes == "No RCRC") & isFirstCaptureOfYear, na.rm = T),
    NetsOpen = median(NetsOpen),
    NetsDuration = median(NetsDuration),
    .groups = "drop_last"
  ) %>%
  unique() %>%
  mutate(
    CumCountFoY = cumsum(CountFOY),
  )

Predicting the number of birds to be equipped

To address the question of how many birds can be equipped, we first model the number of new RCRC captured per session and then compute the total number of RCRC per year given various ringing scenarios. As an individual RCRC can only be equipped once a year, we must estimate the number of birds that have not yet been captured in the same year (i.e., new bird) rather than the count all RCRC. In the first step, we model this number using a generalized additive model (GAM), assuming the number of captures follows a Poisson distribution. To avoid zero counts from being too frequent, we only fitted the model on the surveys performed between the 10th of April to the 6th of December. The predictor variables tested in the model are (1) year, (2) day-of-year, (3) duration of the session, (4) total length of nets, (5) starting time, (6) weather conditions and (7) number of unique RCRC previously captured during the year. After testing several parametrizations of the model (see SM2), we retained the capture model that included day-of year as a smooth function, year as a random fixed effect and total length of nets, total duration, and number of unique RCRC as linear terms: CountFoY ~ s(Year) + s(DayOfYear) + NetsDuration + NetsLength + CumCountFoY.

To overcome the lack of sufficient data for the duration, start time and length of nets, we used multiple imputation methods (Azur et al. 2011) to generate 30 sets of data without any missing values. For each of these sets, a GAM model was fitted.

session.imputed <- session %>%
  mice(print = F, m = 30, seed = 123456789) %>%
  complete("all")

mod <- list()
modAd <- list()
modJuv <- list()
i <- 0
for (d in session.imputed) {
  i <- i + 1
  a <- d %>% mutate(Year = factor(Year))
  mod[[i]] <- gam(CountFoY ~ s(Year, bs = "re") + s(DayOfYear) + NetsDuration + NetsLength + CumCountFoY, family = poisson(), data = a)
  modAd[[i]] <- gam(CountFoYAd ~ s(Year, bs = "re") + s(DayOfYear) + NetsDuration + NetsLength + CumCountFoY, family = poisson(), data = a)
  modJuv[[i]] <- gam(CountFoYJuv ~ s(Year, bs = "re") + s(DayOfYear) + NetsDuration + NetsLength + CumCountFoY, family = poisson(), data = a)
}

In the second step, we predict the total number of RCRC that can be captured over one year as a function of a ringing scenario. A ringing scenario consists of a schedule of ringing sessions over the year together with specific characteristics (session duration, length of nets, etc.). We estimate the total number as the cumulative sum of the estimated new RCRC of the GAM model fitted above for each successive session. This operation has been done iteratively because the GAM model depends on the number of unique RCRC previously captured during the year. In order to plan an optimal ringing scenario, we evaluate the impact of various components of the ringing effort on the total number of birds captured. The different scenarios tested include:

default: 4 hr-ringing sessions every 10 days with 156 m of nets
6 hrs: same as default with longer ringing sessions (from 4 to 6 hrs)
200m: same as default with one longer total additional nets length (156 to to 200 m)
optimized schedule: same as default with weekly ringing sessions from mid-May to early July, bimonthly sessions otherwise

Finally, in order to validate the model, we compare the actual number of RCRC captured in 2020 with the corresponding model prediction using the same schedule, session duration and length of nets.

predictf <- function(modf, dayf, netsLengthf, netsDurationf, scenariof) {
  scenario <- data.frame(
    Year = 0,
    DayOfYear = dayf,
    NetsLength = netsLengthf,
    NetsDuration = netsDurationf,
    scenario = scenariof,
    CountFoY = 0,
    CumCountFoY = 0
  )

  for (i in 1:nrow(scenario)) {
    scenario$CountFoY[i] <- lapply(modf, function(x) {
      predict(x, scenario[i, ], type = "response")
    }) %>%
      unlist() %>%
      mean()
    if (i < nrow(scenario)) {
      scenario$CumCountFoY[i + 1] <- scenario$CountFoY[i] + scenario$CumCountFoY[i]
    }
  }
  scenario
}

bind_pred <- bind_rows(
  predictf(mod, dayf = seq(100, 365, 10), netsLengthf = 156, netsDurationf = 4, scenariof = "default"),
  predictf(mod, dayf = seq(1, 365, 10), netsLengthf = 156, netsDurationf = 6, scenariof = "6h"),
  predictf(mod, dayf = seq(1, 365, 10), netsLengthf = 200, netsDurationf = 4, scenariof = "200m"),
  predictf(mod, dayf = c(seq(1, 127, 14), seq(130.5, 183, 7), seq(190, 365, 14)), netsLengthf = 156, netsDurationf = 6, scenario = "optimized"),
  predictf(mod,
    dayf = session2020$DayOfYear, netsLengthf = 156, netsDurationf = session2020$NetsDuration,
    scenario = "2020"
  ),
)
pred_Ad <- predictf(modAd, dayf = seq(100, 365, 10), netsLengthf = 156, netsDurationf = 4, scenariof = "default")
pred_Juv <- predictf(modJuv, dayf = seq(100, 365, 10), netsLengthf = 156, netsDurationf = 4, scenariof = "default")

Maximizing geolocator retrieval

In this study we choose to focus on two parameters to maximize bird retrieval: the time of year and the age class (adult or juvenile). In this study, we carried out analyses comparing only adults and juveniles because it is not possible to determine the sex of a bird in hand. The retrieval probability is estimated by modelling the binomial response of whether a captured bird is retrieved in the following years: we consider that an individual is retrieved if the bird has been recaptured at least once in any of the following years, and this is independent from whether it was already captured in the past. We modelled the count of adults and juveniles per session separately to reveal the influence of age on recapture rates. The weight of the bird was left out of the model as it showed little to no effect on the recapture rate (see Figure 12). Additionally, we also compare the recapture rate for RCRC captured for the first time and those which have already been equipped.

history <- capture %>%
  filter(CommonName == specie_name) %>%
  arrange(Date) %>%
  group_by(RingNo) %>%
  mutate(
    n = 1,
    retrap_i = cumsum(n) - 1,
    isRetrap = if_else(is.na(lead(retrap_i) > retrap_i), 0, 1),
    duration_next_capture = as.numeric(difftime(lead(Date), Date, units = "days")),
    duration_last_capture = as.numeric(difftime(last(Date), Date, units = "days")),
    nextSeason = last(Year) > Year,
    Yearsince = Year - first(Year),
    isAdult = ifelse(Age == 4, TRUE, FALSE),
    isARetrap = first(Year) < Year
  ) %>%
  select(RingNo, Date, Year, DayOfYear, retrap_i, isRetrap, isARetrap, nextSeason, duration_next_capture, duration_last_capture, isAdult, Yearsince, Weight) %>%
  arrange(RingNo)

modr <- gam(history, formula = nextSeason ~ s(DayOfYear), family = "binomial")
modrA <- gam(history %>% filter(isAdult), formula = nextSeason ~ s(DayOfYear), family = "binomial")
modrJ <- gam(history %>% filter(!isAdult), formula = nextSeason ~ s(DayOfYear), family = "binomial")
modrO <- gam(history %>% filter(retrap_i > 0), formula = nextSeason ~ s(DayOfYear), family = "binomial")
modrN <- gam(history %>% filter(retrap_i == 1), formula = nextSeason ~ s(DayOfYear), family = "binomial")

predictmodr <- data.frame(DayOfYear = seq(min(history$DayOfYear), max(history$DayOfYear))) %>%
  mutate(
    r.fit = predict(modr, ., type = "response"),
    r.se.fit = predict(modr, ., type = "response", se.fit = T) %>% .$se.fit,
    rA.fit = predict(modrA, ., type = "response"),
    rA.se.fit = predict(modrA, ., type = "response", se.fit = T) %>% .$se.fit,
    rJ.fit = predict(modrJ, ., type = "response"),
    rJ.se.fit = predict(modrJ, ., type = "response", se.fit = T) %>% .$se.fit,
  )

All computation was performed on R (R Core Team 2013), using the MCGV package (Wood 2017) for GAM, and the Mice Package (Buuren and Groothuis-Oudshoorn 2011) for the imputation.

Results

Predicting the number of birds to be equipped

As any typical migrant, the number of RCRC captured per session shows a strong dependence with the day of year (P<.001), with a bi-modal shape including a peak passage in early June with almost three RCRC per session and a second smaller peak in mid-September with two birds per session on average (Figure [11]). The number of new captures per session was found to be dependent with year (P<.001), length of the nets (P<.001), duration of sessions (P<.01) and number of previously captured RCRC (P<.1). Starting time was found to be not significant (P<1) and was therefore excluded from the final model as RCRC are generally active and caught equally throughout the morning and no session was held later in the morning or afternoon. Weather category was also found to be not significant (P<1) and consequently excluded. Five different ringing scenarios were used to (1) estimate the total number of unique RCRC which can be captured in a year and (2) compare different ringing strategies and define an optimal scenario (Figure [1]).

Compared with the total number of birds caught with the default scenario (19 birds), increasing the duration of capture sessions by two hours (‘6hr’) or adding 44m of additional nets (‘200m’) results in respectively 1 and 2 more birds caught. This is likely because RCRC are mostly active in the early hours of the day and additional nets are placed in sub-optimal habitats. However, the optimized scenario yields many more birds, with a total of 23 birds captured in fewer sessions (31 instead of 27). In 2020, 41 capture sessions were held with 156 m of mist nets with an average duration of 3 hours 45 minutes (SD: 0:59). As of 1 November, a total of 30 RCRC were captured from 22 unique individuals. For the exact same information, the model predicted an expected total of 22 unique RCRC. The arrival date proved to be earlier than the average and the numbers appeared to be higher than average at the beginning of the season. This could be due to a particularly good breeding season as attested by the high number of juveniles caught during this period.

p <- bind_pred %>%
  ggplot() +
  # geom_ribbon( aes(group = scenario, color = factor(scenario), x=DayOfYear, ymin = fit.cumsum-se.fit.cumsum, ymax = fit.cumsum+se.fit.cumsum), alpha=0.1, color = NA) +
  # geom_step(   aes(group = scenario, color = factor(scenario), x=DayOfYear, y=fit.cumsum), size=1)+
  # geom_ribbon( aes(group = scenario, color = factor(scenario), x=DayOfYear, ymin = fit.cumsum.FOY-3*se.fit.cumsum.FOY, ymax = fit.cumsum.FOY+3*se.fit.cumsum.FOY), alpha=0.1, color = NA) +
  geom_line(aes(group = scenario, color = factor(scenario), x = DayOfYear, y = CumCountFoY), size = 1) +
  geom_point(aes(group = scenario, color = factor(scenario), x = DayOfYear, y = CumCountFoY), size = 2) +
  geom_line(data = session2020, aes(x = DayOfYear, y = CumCountFoY), size = 1) +
  geom_point(data = session2020, aes(x = DayOfYear, y = CumCountFoY), size = 2) +
  labs(x = "Day of Year", y = "Cumulative number of unique RCRC captured") +
  scale_x_continuous(
    breaks = as.numeric(format(ISOdate(2004, 1:12, 1), "%j")),
    labels = format(ISOdate(2004, 1:12, 1), "%b"),
    expand = c(0, 0),
    limits = c(100, 365)
  ) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0), limits = c(0, 25)) +
  theme(aspect.ratio = 9 / 16, legend.position = "top")

# ggsave(filename = 'figures/figure3.pdf', plot = print(p), width = 16, height=9, units = "cm")
# print(p)
ggplotly(p, width = wid, height = hei)

Figure 1: Model predictions of the total unique RCRC caught along a year following different scenarios. The default scenario consists of 4hr ringing sessions using 156m of nets every 10 days. ‘6h’ and ‘225m’ are modifications of the default scenario, and ‘optimized’ increases the number of sessions (to every week) during the peak passage (mid-May – July) and decreases them (to every 2 weeks) during the rest of the year. Finally, using the exact date, duration, and net length used in 2020, the model prediction ‘2020 model’ can be compared to the actual data (‘2020 data’).

Maximizing geolocator retrieval

ring <- capture %>%
  filter(CommonName == specie_name) %>%
  arrange(Date) %>%
  group_by(RingNo) %>%
  mutate(
    n = 1,
    retrap_i = cumsum(n) - 1,
    isRetrap = if_else(is.na(lead(retrap_i) > retrap_i), 0, 1),
    nextSeason = last(Year) > Year,
    Yearsince = Year - first(Year),
    isAdult = ifelse(Age == 4, TRUE, FALSE),
    isARetrap = first(Year) < Year
  ) %>%
  select(RingNo, Date, Year, DayOfYear, retrap_i, isRetrap, isARetrap, nextSeason, isAdult, Yearsince, Weight) %>%
  arrange(RingNo)

Over the 161 unique RCRC individuals captured in the dataset, 67 (42%) were recaptured at least once (including recapture the same year). When considering the 301 capture events (including same individuals), the general recapture rate increases to 47%. However, looking at captures with recapture in any subsequent year, the recapture rate is 30%. In this study, we consider the retrieval rate as a recapture any subsequent years (latter definition), similar to the procedure employed with geolocators.


    2-sample test for equality of proportions with continuity
    correction

data:  c(sum(ring %>% filter(isAdult) %>% .$nextSeason), sum(ring %>% filter(!isAdult) %>% .$nextSeason)) out of c(nrow(ring %>% filter(isAdult)), nrow(ring %>% filter(!isAdult)))
X-squared = 1.3316, df = 1, p-value = 0.1243
alternative hypothesis: greater
95 percent confidence interval:
 -0.02686599  1.00000000
sample estimates:
   prop 1    prop 2 
0.3385827 0.2701149

In general, adults show a slightly higher recapture rate (34% ; n=127) than juveniles (27% ; n=174), but statistically not significant (test of proportions, P<.1). When modelled over the day of year (Figure 2a), the recapture rate in subsequent years shows that birds equipped later in the year are twice as likely to be recaptured, with a recapture rate increasing from 25% to almost 50%. Separating adults from juveniles allows us to identify further trends. The increase in juveniles’ recapture rate is not significant. By contrast, adults show a clear increase from May to August, before stabilising from September to October. Modelling the number of captures of adults and juveniles (Figure 2b), we observe an earlier arrival of juveniles (late May vs early June for adults) and earlier departure in August, while adults show a second peak early October.

Finally, a RCRC which has already been captured in the past has a higher recapture rate (36%) compared to a bird without a ring (24). We did not find a significant effect of weight on retrieval rate (see Figure 12) which is known to usually impact survival rate (e.g., Saether 1989).

p1 <- ggplot() +
  geom_point(data = history, aes(x = DayOfYear, y = as.numeric(nextSeason), color = isAdult), size = 2) +
  geom_line(data = predictmodr, aes(x = DayOfYear, y = rJ.fit), color = wes_palette("Cavalcanti1")[1], size = 1) +
  geom_ribbon(data = predictmodr, aes(x = DayOfYear, ymin = rJ.fit - 3 * rJ.se.fit, ymax = rJ.fit + 3 * rJ.se.fit), alpha = 0.3, fill = wes_palette("Cavalcanti1")[1]) +
  geom_line(data = predictmodr, aes(x = DayOfYear, y = rA.fit), color = wes_palette("Cavalcanti1")[2], size = 1) +
  geom_ribbon(data = predictmodr, aes(x = DayOfYear, ymin = rA.fit - rA.se.fit, ymax = rA.fit + rA.se.fit), alpha = 0.3, fill = wes_palette("Cavalcanti1")[2]) +
  geom_line(data = predictmodr, aes(x = DayOfYear, y = r.fit), color = wes_palette("Cavalcanti1")[4], size = 2) +
  geom_ribbon(data = predictmodr, aes(x = DayOfYear, ymin = r.fit - r.se.fit, ymax = r.fit + r.se.fit), alpha = 0.3, fill = wes_palette("Cavalcanti1")[4]) +
  scale_color_manual(values = wes_palette("Cavalcanti1")) +
  scale_x_continuous(
    breaks = as.numeric(format(ISOdate(2004, 1:12, 1), "%j")),
    labels = format(ISOdate(2004, 1:12, 1), "%b"),
    # expand = c(0,0),
    minor_breaks = c(),
    limits = c(100, 350)
  ) +
  scale_y_continuous(
    minor_breaks = c(),
    # expand = c(0,0),
    labels = scales::percent
  ) +
  theme(aspect.ratio = 9 / 16, legend.position = "none") +
  labs(y = "Probability of Recapture", color = "Adult", x = "Day of Year")

p2 <- session %>%
  ggplot() +
  geom_point(aes(x = DayOfYear, ifelse(CountAd > 2, 2, CountAd), size = CountAd), color = wes_palette("Cavalcanti1")[2]) +
  geom_point(aes(x = DayOfYear, y = ifelse(CountJuv > 2, 2, CountJuv), size = CountJuv), color = wes_palette("Cavalcanti1")[1]) +
  geom_line(data = pred_Juv, aes(x = DayOfYear, y = CountFoY), color = wes_palette("Cavalcanti1")[2], size = 1) +
  geom_line(data = pred_Ad, aes(x = DayOfYear, y = CountFoY), color = wes_palette("Cavalcanti1")[1], size = 1) +
  # geom_point(data = dm2020s, aes( x=DayOfYear, y=ifelse(CountJuv>2,2,CountJuv), size=CountJuv),stroke=0, colour=wes_palette('Cavalcanti1')[5])+
  # geom_point(data = dm2020s, aes( x=DayOfYear, y=ifelse(CountAd>2,2,CountAd), size=CountAd),stroke=0, colour=wes_palette('Cavalcanti1')[4] )+
  labs(x = "Day of Year", y = "Count") +
  scale_x_continuous(
    breaks = as.numeric(format(ISOdate(2004, 1:12, 1), "%j")),
    labels = format(ISOdate(2004, 1:12, 1), "%b"),
    minor_breaks = c(),
    # expand = c(0,0),
    limits = c(100, 365)
  ) +
  scale_y_continuous(
    limits = c(0, 2),
    # expand = c(0,0),
    minor_breaks = c()
  ) +
  theme(aspect.ratio = 9 / 16, legend.position = "none")

p <- arrangeGrob(p1, p2)

subplot(ggplotly(p1, width = wid, height = hei * 2), ggplotly(p2, width = wid, height = hei * 2), shareX = T, nrows = 2)

Figure 2: Comparison of adult (green) and juvenile (yellow) trend in (a) recapture rate in subsequent year and (b) number of captures per session throughout the year.

Informing geolocator deployment

We illustrate how the results above have informed the practical deployment of geolocators on RCRC in 2020 and following years. In light of the results, rather than increasing session duration or net length, we favoured increasing the frequency of ringing sessions when numbers are highest (mid-May to early July). Results show that while waiting for July/August seemed preferable to increase the retrieval rate, the number of birds captured decreases strongly in this period. We thus sought to find a trade-off between deploying all the devices and equipping as many as possible later in the year by releasing some birds captured earlier in the year without a device. In order to test the hypothesis of variable departure/arrival dates based on age, we wanted to equip both juveniles and adults. We therefore only equipped six RCRC (of a total of 15 geolocators) before mid-June, when juveniles were more common to keep enough geolocators for July and August, when adults were more common.

Discussion

The new method presented in this study leverages ringing datasets to enable fact-based planning of a geolocator studies. The main objective is to support cost-effective planning of fieldwork and to determine the optimal number of logging devices, taking into account the specific research question. This relatively simple method does not replace or compete with the various more complex existing capture-recapture models (McCrea and Morgan 2014).

Predicting the number of birds to be equipped

Varying the different components of the survey design can greatly affect the number of captures, and in turn the cost-efficiency of a study (e.g., Lieury et al. 2017). Our method allows to test the effect of each effort component on the total number of captures, thus enabling researchers to refine the survey design in view of optimizing number of captures. In the RCRC case study, we found that increasing net length or session duration carried limited added-value compared to increasing the frequency of ringing sessions when bird numbers are highest (mid-May to early July). The pre-analysis presented here is particularly useful in contexts where the ringing database is limited, or past ringing effort has been variable (e.g., (Ruiz-Gutiérrez et al. 2012).

Maximizing geolocator retrieval

For the geolocator study to be successful, not only do we need to maximize the number of birds equipped (with minimal ringing effort), but we also need to optimize the device retrieval rate. Modelling the geolocator retrieval probability does not need to separate the survival, recapture and emigration rates. Therefore, a complex capture recapture model (McCrea and Morgan 2014) was not required. Instead, we use a simple approach able to identify the time of year and class of bird which yield the highest geolocator retrieval rates. In the case of the Red-capped Robin-chats, retrieval is highest among adults, which is typically the case for birds (e.g., Gardali et al. 2003), and among birds who have already been ringed in the past. The results provided useful conclusions to adjust the timing and intensity of the field work, almost doubling the expected logger retrieval rate. The approach followed in this paper contains some limitations. Firstly, a bird was considered retrieved if it was captured again in any year following the initial capture. However, for a geolocator study, the retrieval needs to happen within the duration of the study. The retrieval rate of the full dataset (30%) reduces to 19% when restricting to retrievals taking place the following year, 26% for the following two years, and 28% for the following three years. This suggests that ringing should continue for at least two years following equipment to benefit from maximal geolocator data. Secondly, by using retrieval data from the ringing database as a proxy for retrievals involving geolocators, we ignore the effect geolocators may have on survival compared to ringed birds (e.g., Brlík et al. 2020; Streby et al. 2015; Weiser et al. 2015). Therefore, it does not replace the need for a control group. The potential weight dependent impact of geolocator on survival (e.g., Brlík et al. 2020) would not be accounted for in this current analysis and would result in preference of equipment of heavier bird. Finally, equipping specific classes of birds needs to be carefully considered in light of the research question. Geolocator studies are inherently biased in that we only learn about birds that (1) have initially been captured in the mist net, (2) have survived until the subsequent year, and (3) have come back to the exact same area. Our approach relies on the fact that individual bird respond differently to these effects based on their age, sex or weight. Targeting specific classes can only accentuates the biases inherent to geolocator studies. This should be duly acknowledged in the study and accounted for in the analysis.

Outlook

Although the model and results of this study are tailored to the specific case of the RCRC on coastal Kenya, the application of this methodology can be extended to other situations where some historical ringing data is available for a given study site and species. In general, our methodology is only applicable for cases where the deployment of geolocators is performed with a similar protocol (e.g., mist net, nest trap, spring nest trap) and context (e.g., place, time, general ringing effort) as the ringing database. In this study, we carried out analyses comparing only adults and juveniles because it is not possible to determine the sex of a bird in hand. The same analysis can be performed on any class of bird identifiable in hand (sex, molt stage, breeding status, subspecies). The approach presented can be applied to any study relying on the recapture of specific individuals (e.g., archival GPS, (Hallworth and Marra 2015)).

Appendix

Appendix 1. Data extends

The capture sessions are relatively well-spread throughout the year (y-axis in Figure 3), although with a slightly higher intensity in March-April than June-July or December-January. The distribution is more heterogeneous when comparing different years (x-axis in Figure 3): there is very good coverage between 2003 and 2007, variable from 2008 to 2012, and relatively stable since then.

d_sesMY <- capture %>%
  group_by(SessionID) %>%
  summarise(Month = first(Month), Year = first(Year), .groups = "drop") %>%
  group_by(Month, Year) %>%
  summarise(nb = n(), .groups = "drop")

p1 <- ggplot(d_sesMY, aes(x = Year, y = Month, color = nb)) +
  geom_point(size = 10, shape = 15) +
  scale_colour_gradientn(colours = brewer.pal(9, "YlGnBu")) +
  coord_fixed() +
  scale_y_continuous(breaks = 1:12) +
  scale_x_continuous(breaks = min(d_sesMY$Year):max(d_sesMY$Year))
p3 <- ggplot(d_sesMY, aes(x = Year)) +
  geom_histogram(bins = length(min(d_sesMY$Year):max(d_sesMY$Year))) +
  scale_x_continuous(breaks = min(d_sesMY$Year):max(d_sesMY$Year))
p2 <- ggplot(d_sesMY, aes(x = Month)) +
  geom_histogram(bins = 12) +
  coord_flip() +
  scale_x_continuous(breaks = 1:12)

p <- arrangeGrob(p1, p2, p3, layout_matrix = cbind(c(2, 2, 2, 6), c(1, 1, 1, 3), c(1, 1, 1, 3), c(1, 1, 1, 3)))

# ggsave(filename = 'figures/figureSM1.pdf', plot = p, width = 50, height=50, units = "cm")
grid.arrange(p)

Figure 3: Distribution of the ringing sessions according to year and month. Colour scale indicates the number of ringing sessions.

# subplot(ggplotly(p1), ggplotly(p2),ggplotly(p3), nrows=1)

Additional information for each session was available for some sessions: start time (data available for 75% of the sessions), closing time (36%), sum of net lengths (21%), weather conditions (48%).

plot1 <- session %>% ggplot() +
  geom_histogram(aes(x = NetsLength), binwidth = 25) +
  scale_x_continuous(breaks = seq(0, 250, 50), minor_breaks = c(), expand = c(0, 0)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0))
plot2 <- session %>% ggplot() +
  geom_histogram(aes(x = NetsDuration), binwidth = 0.5) +
  scale_x_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0))
plot3 <- session %>%
  filter(!is.na(WeatherCat)) %>%
  ggplot() +
  geom_histogram(aes(x = WeatherCat), stat = "count") +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0))
plot4 <- session %>% ggplot() +
  geom_histogram(aes(x = NetsOpen), binwidth = 0.25) +
  scale_x_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0))
p <- arrangeGrob(plot1, plot2, plot3, plot4, ncol = 2)

# ggsave(filename = 'figures/figureSM2.pdf', plot = p, width = 16, height=9, units = "cm")
# grid.arrange(p)
subplot(ggplotly(plot1, width = wid, height = hei), ggplotly(plot2, width = wid, height = hei), ggplotly(plot3, width = wid, height = hei), ggplotly(plot4, width = wid, height = hei), nrows = 2)

Figure 4: Histograms of the metadata collected for each ringing session (N=317) of (a) total length of nets (N=73), (b) duration of ringing session (N=124), (c) weather category (N=143) and (d) time of session start (N=235).

DT::datatable(session %>% arrange(Date),
  width = "100%",
  extensions = "Buttons",
  options = list(scrollX = TRUE, dom = "Bfrtip", buttons = c("csv"))
)

Figure 5: Sessions dataset

DT::datatable(capture %>% arrange(Date),
  width = "100%",
  extensions = "Buttons",
  options = list(scrollX = TRUE, dom = "Bfrtip", buttons = c("csv"))
)

Figure 6: Capture dataset

DT::datatable(capture2020 %>% arrange(Date),
  width = "100%",
  extensions = "Buttons",
  options = list(scrollX = TRUE, dom = "Bfrtip", buttons = c("csv"), pageLength = 5)
)

Figure 7: 2020 dataset

Appendix 2. Capture model

Before fitting the GAM modelling the count of new RCRC captured per session, we explored the response of each of the variables separately with a GAM or GLM (Figure 8).

Year (Figure8a). A general decline in the overall number of birds is observed over the 20 years of the dataset. However, this trend was not estimated to be realistic but possibly due to change in survey effort or net location. Year was included as a random fixed effect.
Day-of-year (Figure 8b). Day-of-year has a strong influence on the number of captures and varies non-linearly. This variable is thus included in the model as a smoothing term.
Duration (Figure 8c). The duration of the session computed as the difference between closing time and opening time shows a positive correlation with the number of captures. It is thus included in the model as a linear term.
Net opening time (Figure 8d). The fit of the opening time seems to indicate a higher capture rate for sessions starting later. This relationship is contrary to common knowledge and considered non-meaningful. It is thus not retained for the model.
Sum of net lengths (Figure 8e). Between 50 and 200m, the fit shows an increase of captures as the total length of the nets increases. Yet, above 200m, the fit shows a stabilisation of the count. This is explained by the fact that the nets added above 200m are located in habitats which are not ideal for RCRC and thus do not contribute to an increase in capture. This term is included as a smoothing term.
Weather categories (Figure 8f). The weather categories do not show a clear pattern and are thus not included in the model.

plot1 <- # ggplot(data=predictmod %>% filter(date<'2019-01-01' & date>'2002-01-01'),aes(x=date)) +
  ggplot(data = session %>% filter(Date > "2002-01-01"), aes(x = Year, y = Count)) +
  geom_smooth(method = "gam", formula = y ~ s(x), method.args = list(family = "poisson")) +
  # geom_point(data= session, aes(Year, Count)) +
  geom_point(data = (session %>% filter(Date > "2002-01-01") %>% mutate(Countt = ifelse(Count > 6, 6, Count)) %>% group_by(Year, Countt) %>% tally() %>% filter(!is.na(Year))), aes(x = Year, y = Countt, size = n, color = n)) +
  scale_x_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_color_gradientn(colours = wes_palette(pal.name, type = "continuous"), aesthetics = c("colour"), limits = c(0, 25)) +
  geom_point(data = session2020 %>% mutate(Count = ifelse(Count > 6, 6, Count)) %>% group_by(Count) %>% tally(), aes(x = 2020, y = Count, size = n)) +
  ylab("Number of RCRC capture per session") +
  xlab("(a) Year") +
  theme(aspect.ratio = 9 / 16)

plot2 <- session %>% ggplot(aes(y = Count, x = DayOfYear)) +
  geom_smooth(method = "gam", formula = y ~ s(x), method.args = list(family = "poisson")) +
  geom_point(data = (session %>% mutate(nlg = round(DayOfYear / 31) * 31, Countt = ifelse(Count > 6, 6, Count)) %>% group_by(nlg, Countt) %>% tally() %>% filter(!is.na(nlg))), aes(x = nlg, y = Countt, size = n, color = n)) +
  scale_x_continuous(
    minor_breaks = c(), expand = c(0, 0),
    breaks = as.numeric(format(ISOdate(2004, 1:12, 1), "%j")),
    labels = format(ISOdate(2004, 1:12, 1), "%b")
  ) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_color_gradientn(colours = wes_palette(pal.name, type = "continuous"), aesthetics = c("colour"), limits = c(0, 25)) +
  theme(aspect.ratio = 9 / 16) +
  xlab("(b) DayOfYear")


plot3 <- session %>% ggplot(aes(y = Count, x = NetsDuration)) +
  geom_point(data = (session %>% mutate(nlg = round(NetsDuration / 0.5) * 0.5, Countt = ifelse(Count > 6, 6, Count)) %>% group_by(nlg, Countt) %>% tally() %>% filter(!is.na(nlg))), aes(x = nlg, y = Countt, size = n, color = n)) +
  geom_smooth(method = "glm", formula = y ~ x, method.args = list(family = "poisson")) +
  scale_x_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_color_gradientn(colours = wes_palette(pal.name, type = "continuous"), aesthetics = c("colour"), limits = c(0, 25)) +
  theme(aspect.ratio = 9 / 16) +
  xlab("(c) Duration")

plot4 <- session %>% ggplot(aes(y = Count, x = NetsOpen)) +
  geom_smooth(formula = y ~ x, method = "glm", method.args = list(family = "poisson")) +
  geom_point(data = (session %>% mutate(nlg = round(NetsOpen / 0.125) * 0.125, Countt = ifelse(Count > 6, 6, Count)) %>% group_by(nlg, Countt) %>% tally() %>% filter(!is.na(nlg))), aes(x = nlg, y = Countt, size = n, color = n)) +
  scale_x_continuous(minor_breaks = seq(5.5, 7, 0.125), expand = c(0, 0)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_color_gradientn(colours = wes_palette(pal.name, type = "continuous"), aesthetics = c("colour"), limits = c(0, 25)) +
  theme(aspect.ratio = 9 / 16) +
  xlab("(d) Net opening time")

plot5 <- session %>% ggplot(aes(y = Count, x = NetsLength)) +
  geom_smooth(method = "gam", formula = y ~ s(x, k = 2), method.args = list(family = "poisson")) +
  geom_point(data = (session %>% mutate(nlg = round(NetsLength / 25) * 25, Countt = ifelse(Count > 6, 6, Count)) %>% group_by(nlg, Countt) %>% tally() %>% filter(!is.na(nlg))), aes(x = nlg, y = Countt, size = n, color = n)) +
  scale_x_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_color_gradientn(colours = wes_palette(pal.name, type = "continuous"), aesthetics = c("colour"), limits = c(0, 25)) +
  theme(aspect.ratio = 9 / 16) +
  xlab("(e) Sum of net lengths")

plot6 <- session %>%
  filter(!is.na(WeatherCat)) %>%
  ggplot(aes(x = Count, fill = WeatherCat)) +
  geom_histogram(binwidth = 1) +
  scale_x_continuous(minor_breaks = c(), expand = c(0, 0), limits = c(-1, 7)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0), limits = c(0, 25)) +
  theme(aspect.ratio = 9 / 16) +
  coord_flip() +
  xlab("(f) Weather categories")

p <- arrangeGrob(plot1, plot2, plot3, plot4, plot5, plot6, ncol = 2, nrow = 3)

# ggsave(filename = 'figures/figureSM3.pdf', plot = p, width = 16*2, height=9*3, units = "cm")

# grid.arrange(p)
subplot(ggplotly(plot1, width = wid, height = hei * 3 / 2),
  ggplotly(plot2, width = wid, height = hei * 3 / 2),
  ggplotly(plot3, width = wid, height = hei * 3 / 2),
  ggplotly(plot4, width = wid, height = hei * 3 / 2),
  ggplotly(plot5, width = wid, height = hei * 3 / 2),
  ggplotly(plot6, width = wid, height = hei * 3 / 2),
  nrows = 3
)

Figure 8: Number of RCRC captured by session as a function of (a) total length of nets, (b) duration of ringing session, (c) weather category and (d) time of session start. The red line with shaded area is a smoothed curved fitted on the data (GAM or GLM)

Model fit summary for the final model

summary(mod[[1]])


Family: poisson 
Link function: log 

Formula:
CountFoY ~ s(Year, bs = "re") + s(DayOfYear) + NetsDuration + 
    NetsLength + CumCountFoY

Parametric coefficients:
              Estimate Std. Error z value Pr(>|z|)  
(Intercept)  -0.592754   0.410193  -1.445   0.1484  
NetsDuration -0.098981   0.063432  -1.560   0.1187  
NetsLength    0.002588   0.001263   2.049   0.0404 *
CumCountFoY  -0.010199   0.016051  -0.635   0.5252  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                edf Ref.df Chi.sq p-value    
s(Year)      11.739 17.000  62.59  <2e-16 ***
s(DayOfYear)  7.644  8.449  77.21  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.382   Deviance explained = 47.7%
UBRE = 0.33766  Scale est. = 1         n = 218

par(mfrow = c(3, 2), mar = c(2, 2, 0, 0))
visreg(mod[[1]], scale = "response", ylim = c(0, 6))
# plot(mod[[1]],all.terms = T, scale=0, shade=T)

Figure 9: Model fit

Model fit summary for the model including all possible variables for comparison

a <- session.imputed[[1]]
tmp <- gam(CountFoY ~ s(Year) + s(DayOfYear) + NetsDuration + NetsLength + NetsOpen + WeatherCat + CumCountFoY, family = poisson(), data = a)
summary(tmp)


Family: poisson 
Link function: log 

Formula:
CountFoY ~ s(Year) + s(DayOfYear) + NetsDuration + NetsLength + 
    NetsOpen + WeatherCat + CumCountFoY

Parametric coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)      -1.134424   2.466119  -0.460 0.645513    
NetsDuration     -0.234604   0.069266  -3.387 0.000707 ***
NetsLength        0.003592   0.001361   2.640 0.008297 ** 
NetsOpen          0.171697   0.394989   0.435 0.663789    
WeatherCatlittle  0.787836   0.189103   4.166  3.1e-05 ***
WeatherCatstrong  0.353602   0.247061   1.431 0.152363    
CumCountFoY      -0.045877   0.013645  -3.362 0.000773 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
               edf Ref.df Chi.sq p-value    
s(Year)      1.000  1.000  36.24  <2e-16 ***
s(DayOfYear) 7.998  8.673  51.50  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.421   Deviance explained = 46.5%
UBRE = 0.29663  Scale est. = 1         n = 218

par(mfrow = c(4, 2), mar = c(2, 2, 0, 0))
visreg(tmp, scale = "response", ylim = c(0, 9))

Figure 10: Model fit including all variables

p <- ggplot() +
  geom_point(data = session %>% mutate(CountT = ifelse(Count > 6, 6, Count)), aes(x = DayOfYear, y = CountT, size = Count, color = Year), stroke = 0, alpha = .6) +
  geom_point(data = session2020, aes(x = DayOfYear, y = Count, size = Count), stroke = 0) +
  # geom_ribbon(data = bind_pred %>% filter(scenario=="default"), aes(x=DayOfYear, ymin = lwrY, ymax = uprY ), color = NA, alpha=0.3) +
  geom_line(data = bind_pred %>% filter(scenario == "default"), aes(x = DayOfYear, y = CountFoY), size = 1) +
  scale_color_gradientn(colours = wes_palette(pal.name, type = "continuous"), aesthetics = c("colour")) +
  xlab("Day of Year") +
  ylab("Number of Capture per Session") +
  scale_x_continuous(
    breaks = as.numeric(format(ISOdate(2004, 1:12, 1), "%j")),
    labels = format(ISOdate(2004, 1:12, 1), "%b"),
    minor_breaks = c(),
    # expand = c(0,0),
    limits = c(100, 365)
  ) +
  scale_y_continuous(
    breaks = 0:6,
    minor_breaks = c(),
    labels = c(0, 1, 2, 3, 4, 5, "6+"),
    # expand = c(0,0),
    # limits = c(0,6)
  ) +
  theme(aspect.ratio = 9 / 16) +
  scale_size(limits = c(-2, 11))

# ggsave(filename = 'figures/figure1.pdf', plot = print(p), width = 16, height=9, units = "cm")
ggplotly(p, width = wid, height = hei)

Figure 11: Seasonal variation in the number of captured RCRC per session. Points represent actual data points while the black line is the model estimate for the default/average case (156m, 4hr) with its corresponding uncertainty illustrated as shaded area. Sessions with 6 or more RCRC are illustrated on the “6+” line but with proportionate size and colour. Black points correspond to the 2020 data, when the geolocators were equipped.

session %>% ggplot() +
  geom_line(aes(group = Year, color = factor(Year), x = DayOfYear, y = CumCountFoY), size = 1)

bind_pred_y <- data.frame()
for (y in unique(session$Year)) {
  mody <- list()
  i <- 0
  for (d in session.imputed) {
    i <- i + 1
    a <- d %>%
      filter(Year != y) %>%
      mutate(Year = factor(Year))
    mody[[i]] <- gam(CountFoY ~ s(Year, bs = "re") + s(DayOfYear) + NetsDuration + NetsLength + CumCountFoY, family = poisson(), data = a)
  }
  tmp <- session.imputed[[1]] %>% filter(Year == y)
  bind_pred_y <- bind_rows(
    bind_pred_y,
    predictf(mody, dayf = tmp$DayOfYear, netsLengthf = tmp$NetsLength, netsDurationf = tmp$NetsDuration, scenariof = y)
  )
}

p <- left_join(
  session %>% group_by(Year) %>%
    summarise(
      session = max(CumCountFoY)
    ),
  bind_pred_y %>% mutate(Year = scenario) %>% group_by(Year) %>%
    summarise(
      model = max(CumCountFoY)
    ),
  by = "Year"
) %>%
  ggplot() +
  geom_point(aes(x = session, y = model, color = Year)) +
  geom_abline(slope = 1, intercept = 0) +
  coord_fixed() +
  labs(x = "Actual data", y = "Model estimation") +
  scale_x_continuous(minor_breaks = c(), expand = c(0, 0), limits = c(0, 30)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0), limits = c(0, 30)) +
  theme(aspect.ratio = 1, legend.position = "top")

# ggsave(filename = 'figures/figureSM8.pdf', plot = print(p), width = 9, height=9, units = "cm")
ggplotly(p, width = wid, height = hei)

Appendix 3. Recapture model

p1 <- ring %>%
  filter(Weight < 50) %>%
  ggplot(aes(x = Weight, colour = nextSeason)) +
  geom_histogram(binwidth = 1) +
  scale_x_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0), limits = c(0, 50)) +
  theme(aspect.ratio = 9 / 16, legend.position = "none")


p2 <- ring %>%
  filter(Weight < 50) %>%
  ggplot(aes(y = as.numeric(nextSeason), x = Weight)) +
  geom_smooth(method = "glm", formula = y ~ x, method.args = list(family = "binomial")) +
  scale_x_continuous(minor_breaks = c(), expand = c(0, 0)) +
  scale_y_continuous(minor_breaks = c(), expand = c(0, 0), limits = c(0, 1), labels = scales::percent) +
  theme(aspect.ratio = 9 / 16)

p <- arrangeGrob(p1, p2)

# ggsave(filename = 'figures/figureSM4.pdf', plot = p, width = 16, height=9*2, units = "cm")
# grid.arrange(p)
subplot(ggplotly(p1, width = wid, height = hei * 2), ggplotly(p2, width = wid, height = hei * 2), nrows = 2)

Figure 12: Histograms of the weight of RCRC recaptured in a following year and those not recaptured, together with the model fit. The uncertainty of the model shows that weight has an unclear influence on the recatpure rate.

p <- bind_pred %>%
  filter(scenario == "default") %>%
  left_join(pred_Juv, by = c("Year", "DayOfYear", "NetsLength", "NetsDuration", "scenario"), suffix = c("", "juv")) %>%
  left_join(pred_Ad, by = c("Year", "DayOfYear", "NetsLength", "NetsDuration", "scenario"), suffix = c("", "ad")) %>%
  left_join(predictmodr, by = "DayOfYear") %>%
  mutate(
    exAd = CountFoYad * rA.fit,
    exJuv = CountFoYjuv * rJ.fit,
    ex = CountFoY * r.fit
  ) %>%
  ggplot() +
  geom_line(aes(x = DayOfYear, y = exJuv), color = wes_palette("Cavalcanti1")[1], size = 1) +
  geom_line(aes(x = DayOfYear, y = exAd), color = wes_palette("Cavalcanti1")[2], size = 1) +
  geom_line(aes(x = DayOfYear, y = ex), color = wes_palette("Cavalcanti1")[4], size = 2) +
  scale_x_continuous(
    breaks = as.numeric(format(ISOdate(2004, 1:12, 1), "%j")),
    labels = format(ISOdate(2004, 1:12, 1), "%b"),
    limits = c(100, 350)
  ) +
  theme(aspect.ratio = 9 / 16, legend.position = "top") +
  labs(y = "Expection of Recapture", x = "Day of Year")

# ggsave(filename = 'figures/figureSM5.pdf', plot = p, width = 16, height=9, units = "cm")
# print(p)
ggplotly(p, width = wid, height = hei)

p <- ring %>% ggplot(aes(x = DayOfYear, y = as.numeric(isARetrap))) +
  geom_point(size = 2) +
  geom_smooth(method = "glm", formula = y ~ x, method.args = list(family = "binomial")) +
  scale_color_manual(values = wes_palette("Cavalcanti1")) +
  scale_x_continuous(
    breaks = as.numeric(format(ISOdate(2004, 1:12, 1), "%j")),
    labels = format(ISOdate(2004, 1:12, 1), "%b"),
    # expand = c(0,0),
    minor_breaks = c(),
    limits = c(100, 350)
  ) +
  scale_y_continuous(
    minor_breaks = c(),
    # expand = c(0,0),
    labels = scales::percent
  ) +
  theme(aspect.ratio = 9 / 16, legend.position = "none") +
  labs(y = "Probability of Capturing a recapture", color = "Adult", x = "Day of Year")

# ggsave(filename = 'figures/figureSM6.pdf', plot = p, width = 16, height=9, units = "cm")
# print(p)
ggplotly(p, width = wid, height = hei)

Figure 13: Variation throughout the year of the probability that a bird captured is a bird that has been captured in previous years. Blue dots denote the data for each capture of the database and the red line denote the GLM fit of a binomial model. The probability doesn’t change throughout the year indicating that maximizing the capture of bird captured previous season is the same as maximizing the number of capture.

nextSeasonI <- c()
for (i in (1:10)) {
  nextSeasonI[i] <- mean(pmap_lgl(ring %>% select(Year, RingNo), ~ any(.x < ring$Year & .x >= ring$Year - i & .y == ring$RingNo)))
}


p <- data.frame(x = (1:10), y = nextSeasonI) %>% ggplot(aes(y = y, x = x)) +
  geom_point(size = 2) +
  geom_line() +
  scale_y_continuous(expand = c(0, 0), minor_breaks = c(), labels = scales::percent, lim = c(0, .5)) +
  scale_x_continuous(expand = c(0, 0), breaks = c(1:10), minor_breaks = c()) +
  theme(aspect.ratio = 9 / 16, legend.position = "none") +
  labs(y = "Probability of recapture", color = "Adult", x = "Year after capture")

# ggsave(filename = 'figures/figureSM7.pdf', plot = p, width = 16, height=9, units = "cm")
# print(p)
ggplotly(p, width = wid, height = hei)

Acknowledgements

This work was supported by a grant from the Swiss Ornithological Institute and by the Swiss National Science Foundation (grant no. 191138). We thank A Rocha Kenya for providing the ringing dataset, and for carrying out field work to equip Red-capped Robin-chats with geolocators. We thank Martins Briedis, Matthew Sjaarda, Burri Reto, Wesley Hochachka, Alison Johnston and Améline Nussbaumer for their valuable contributions to the paper.

Data Accessibility

The data and code that support the findings of this study are openly available in [Github] at github.com/A-Rocha-Kenya/MaximizingRecapture.

Adamík, Peter, Tamara Emmenegger, Martins Briedis, Lars Gustafsson, Ian Henshaw, Miloš Krist, Toni Laaksonen, et al. 2016. “Barrier crossing in small avian migrants: Individual tracking reveals prolonged nocturnal flights into the day as a common migratory strategy.” Scientific Reports 6 (February): 1–9. https://doi.org/10.1038/srep21560.

Alemayehu, Fikir. 2016. “Land Use and Land Cover Change in the Coastal Area of Watamu Mida Creek, Kenya.” Open Journal of Forestry 06 (04): 230–42. https://doi.org/10.4236/ojf.2016.64019.

Azur, Melissa J., Elizabeth A. Stuart, Constantine Frangakis, and Philip J. Leaf. 2011. “Multiple imputation by chained equations: what is it and how does it work?” International Journal of Methods in Psychiatric Research 20 (1): 40–49. https://doi.org/10.1002/mpr.329.

Bennun, Leon. 2000. “Assessing and monitoring bird populations in Africa: an overview.” Ostrich 71 (1-2): 214–15. https://doi.org/10.1080/00306525.2000.9639915.

Benson, C. W. 1982. “Migrants in the Afrotropical Region South of the Equator.” Ostrich 53 (1): 31–49. https://doi.org/10.1080/00306525.1982.9634723.

Bridge, Eli S., Kasper Thorup, Melissa S. Bowlin, Phillip B. Chilson, Robert H. Diehl, René W. Fléron, Phillip Hartl, et al. 2011. “Technology on the Move: Recent and Forthcoming Innovations for Tracking Migratory Birds.” BioScience 61 (9): 689–98. https://doi.org/10.1525/bio.2011.61.9.7.

Briedis, Martins, Silke Bauer, Peter Adamík, José A. Alves, Joana S. Costa, Tamara Emmenegger, Lars Gustafsson, et al. 2019. “A full annual perspective on sex-biased migration timing in long-distance migratory birds.” Proceedings of the Royal Society B: Biological Sciences 286 (1897). https://doi.org/10.1098/rspb.2018.2821.

Brlík, Vojtěch, Jaroslav Koleček, Malcolm Burgess, Steffen Hahn, Diana Humple, Miloš Krist, Janne Ouwehand, et al. 2020. “Weak effects of geolocators on small birds: A meta‐analysis controlled for phylogeny and publication bias.” Edited by Jenny Dunn. Journal of Animal Ecology 89 (1): 207–20. https://doi.org/10.1111/1365-2656.12962.

Burgess, N, and G Clarke. 2000. “Coastal forests of eastern Africa.” In.

Bussière, Elsa M. S., Les G. Underhill, and Res Altwegg. 2015. “Patterns of bird migration phenology in South Africa suggest northern hemisphere climate as the most consistent driver of change.” Global Change Biology 21 (6): 2179–90. https://doi.org/10.1111/gcb.12857.

Buuren, Stef van, and Karin Groothuis-Oudshoorn. 2011. “mice : Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software 45 (3): 1–67. https://doi.org/10.18637/jss.v045.i03.

Collar, Nigel. 2020. “Red-capped Robin-Chat (Cossypha natalensis).” In Birds of the World, edited by Josep del Hoyo, Andrew Elliott, Jordi Sargatal, David Christie, and Eduardo de Juana, Birds of t. Cornell Lab of Ornithology. https://doi.org/10.2173/bow.rcrcha1.01.

Cox, Daniel, Miriam Brandt, Ross Mcgregor, Ulf Ottosson, Mathew Stevens, and Will Cresswell. 2011. “Patterns of seasonal and yearly mass variation in West African tropical savannah birds.” Ibis 153 (4): 672–83. https://doi.org/10.1111/j.1474-919X.2011.01150.x.

Devineau, Olivier, Rémi Choquet, and Jean-dominique Lebreton. 2006. “Planning Capture–Recapture Studies: Straightforward Precision, Bias, and Power Calculations.” Wildlife Society Bulletin 34 (4): 1028–35. https://doi.org/10.2193/0091-7648(2006)34[1028:PCSSPB]2.0.CO;2.

Finch, Tom, Philip Saunders, Jesús Miguel Avilés, Ana Bermejo, Inês Catry, Javier de la Puente, Tamara Emmenegger, et al. 2015. “A pan-European, multipopulation assessment of migratory connectivity in a near-threatened migrant bird.” Diversity and Distributions 21 (9): 1051–62. https://doi.org/10.1111/ddi.12345.

Gardali, Thomas, Daniel C Barton, Jennifer D White, and Geoffrey R Geupel. 2003. “Juvenile and Adult Survival of Swainson’s Thrush (Catharus Ustulatus) in Coastal California: Annual Estimates Using Capture-Recapture Analyses.” Edited by F. R. Thompson III. The Auk 120 (4): 1188–94. https://doi.org/10.1093/auk/120.4.1188.

Hahn, Steffen, José A. Alves, Kiril Bedev, Joana S. Costa, Tamara Emmenegger, Martin Schulze, Peter Tamm, Pavel Zehtindjiev, and Kiran L. Dhanjal-Adams. 2020. “Range-wide migration corridors and non-breeding areas of a northward expanding Afro-Palaearctic migrant, the European Bee-eater Merops apiaster.” Ibis 162 (2): 345–55. https://doi.org/10.1111/ibi.12752.

Hallworth, Michael T., and Peter P. Marra. 2015. “Miniaturized GPS Tags Identify Non-breeding Territories of a Small Breeding Migratory Songbird.” Scientific Reports 5: 1–6. https://doi.org/10.1038/srep11069.

Hauser, Cindy E., and Michael A. McCarthy. 2009. “Streamlining ’search and destroy’: Cost-effective surveillance for invasive species management.” Ecology Letters 12 (7): 683–92. https://doi.org/10.1111/j.1461-0248.2009.01323.x.

Kralj, Jelena, Miloš Martinović, Luka Jurinović, Péter Szinai, Szandra Süto, and Bálint Preiszner. 2020. “Geolocator study reveals east African migration route of Central European Common Terns.” Avian Research 11 (1): 1–11. https://doi.org/10.1186/s40657-020-00191-z.

Liechti, Felix, Chiara Scandolara, Diego Rubolini, Roberto Ambrosini, Fränzi Korner-Nievergelt, Steffen Hahn, Roberto Lardelli, et al. 2015. “Timing of migration and residence areas during the non-breeding period of barn swallows Hirundo rustica in relation to sex and population.” Journal of Avian Biology 46 (3): 254–65. https://doi.org/10.1111/jav.00485.

Lieury, Nicolas, Sébastien Devillard, Aurélien Besnard, Olivier Gimenez, Olivier Hameau, Cécile Ponchon, and Alexandre Millon. 2017. “Designing cost-effective capture-recapture surveys for improving the monitoring of survival in bird populations.” Biological Conservation 214 (August): 233–41. https://doi.org/10.1016/j.biocon.2017.08.011.

Lindberg, Mark S. 2012. “A review of designs for capture-mark-recapture studies in discrete time.” Journal of Ornithology 152 (SUPPL. 2): 355–70. https://doi.org/10.1007/s10336-010-0533-9.

Marris, Emma. 2010. “Conservation: Biodiversity as a bonus prize.” Nature 468 (7326): 895. https://doi.org/10.1038/468895a.

McCrea, Rachel S, and Byron J T Morgan. 2014. Analysis of capture-recapture data. CRC Press.

McKinnon, Emily A., and Oliver P. Love. 2018. “Ten years tracking the migrations of small landbirds: Lessons learned in the golden age of bio-logging.” The Auk 135 (4): 834–56. https://doi.org/10.1642/auk-17-202.1.

Moore, Alana L., and Michael A. McCarthy. 2016. “Optimizing ecological survey effort over space and time.” Methods in Ecology and Evolution 7 (8): 891–99. https://doi.org/10.1111/2041-210X.12564.

Nussbaumer, Raphaël, and Colin Jackson. 2020. “Using geolocators to unravel intra-African migrant strategies.” March. Watamu: A Rocha Kenya. https://doi.org/10.13140/RG.2.2.34477.10721.

Nwaogu, Chima Josiah, and Will Cresswell. 2016. “Body reserves in intra-African migrants.” Journal of Ornithology 157 (1): 125–35. https://doi.org/10.1007/s10336-015-1259-5.

Osinubi, Samuel Temidayo. 2018. “Intra-African Bird Migration Project : ecology and conservation.” FitzPatrick Institude of African Ornithology.

Procházka, Petr, Steffen Hahn, Simon Rolland, Henk van der Jeugd, Tibor Csörgő, Frédéric Jiguet, Tomasz Mokwa, Felix Liechti, Didier Vangeluwe, and Fränzi Korner-Nievergelt. 2017. “Delineating large-scale migratory connectivity of reed warblers using integrated multistate models.” Diversity and Distributions 23 (1): 27–40. https://doi.org/10.1111/ddi.12502.

R Core Team. 2013. “R: A language and environment for statistical computing.” Vienna, Austria.

Ruiz-Gutiérrez, Viviana, Paul F. Doherty, Eduardo C. Santana, Sarahy Contreras Martínez, Jorge Schondube, Heriberto Verdugo Munguía, and Eduardo Iñigo-Elias. 2012. “Survival of resident Neotropical birds: Considerations for sampling and analysis based on 20 years of bird-banding efforts in Mexico.” Auk 129 (3): 500–509. https://doi.org/10.1525/auk.2012.11171.

Saether, B. E. 1989. “Survival rates in relation to body weight in European birds.” Ornis Scandinavica 20 (1): 13–21. https://doi.org/10.2307/3676702.

Salewski, Volker, Martin Flade, Anatolii Poluda, Grzegorz Kiljan, Felix Liechti, Simeon Lisovski, and Steffen Hahn. 2013. “An unknown migration route of the ’globally threatened’ Aquatic Warbler revealed by geolocators.” Journal of Ornithology 154 (2): 549–52. https://doi.org/10.1007/s10336-012-0912-5.

Sekercioglu, Cagan H. 2010. “IN FOCUS: Partial migration in tropical birds: the frontier of movement ecology.” Journal of Animal Ecology 79 (5): 933–36. https://doi.org/10.1111/j.1365-2656.2010.01739.x.

Şekercioĝlu, çaĝan H., Richard B. Primack, and Janice Wormworth. 2012. “The effects of climate change on tropical birds.” Biological Conservation 148 (1): 1–18. https://doi.org/10.1016/j.biocon.2011.10.019.

Simmons, Robert E., Phoebe Barnard, W. R. J. Dean, Guy F. Midgley, Wilfried Thuiller, and Greg Hughes. 2004. “Climate change and birds: Perspectives and prospects from southern Africa.” Ostrich 75 (4): 295–308. https://doi.org/10.2989/00306520409485458.

Smart, Adam S., Andrew R. Weeks, Anthony R. van Rooyen, Alana Moore, Michael A. McCarthy, and Reid Tingley. 2016. “Assessing the cost-efficiency of environmental DNA sampling.” Methods in Ecology and Evolution 7 (11): 1291–98. https://doi.org/10.1111/2041-210X.12598.

Smith, Malcolm, Mark Bolton, David J. Okill, Ron W. Summers, Pete Ellis, Felix Liechti, and Jeremy D. Wilson. 2014. “Geolocator tagging reveals Pacific migration of Red-necked Phalarope <i>Phalaropus lobatus</i> breeding in Scotland.” Ibis 156 (4): 870–73. https://doi.org/10.1111/ibi.12196.

Streby, Henry M., Tara L. McAllister, Sean M. Peterson, Gunnar R. Kramer, Justin A. Lehman, and David E. Andersen. 2015. “Minimizing marker mass and handling time when attaching radio-transmitters and geolocators to small songbirds.” The Condor 117 (2): 249–55. https://doi.org/10.1650/condor-14-182.1.

Vickery, Juliet A., Steven R. Ewing, Ken W. Smith, Deborah J. Pain, Franz Bairlein, Jana Škorpilová, and Richard D. Gregory. 2014. “The decline of Afro-Palaearctic migrants and an assessment of potential causes.” Edited by Tony Fox. Ibis 156 (1): 1–22. https://doi.org/10.1111/ibi.12118.

Weiser, Emily L., Richard B. Lanctot, Stephen C. Brown, José A. Alves, Phil F. Battley, Rebecca Bentzen, Joël Bêty, et al. 2015. “Effects of geolocators on hatching success, return rates, breeding movements, and change in body mass in 16 species of Arctic-breeding shorebirds.” Movement Ecology 4 (1). https://doi.org/10.1186/s40462-016-0077-6.

Wood, Simon N. 2017. Generalized additive models: an introduction with R. CRC press.

Using ringing data to inform geolocator deployment

Abstract

Introduction

Materials and methods

Case study species

Capture site and database

Predicting the number of birds to be equipped

Maximizing geolocator retrieval

Results

Predicting the number of birds to be equipped

Maximizing geolocator retrieval

Informing geolocator deployment

Discussion

Predicting the number of birds to be equipped

Maximizing geolocator retrieval

Outlook

Appendix

Appendix 1. Data extends

Appendix 2. Capture model

Appendix 3. Recapture model

Acknowledgements

Data Accessibility

References