Correlation between stream temperatures in Slovenian streams in which marble trout live

This is an update on my research; I will try to post more often in these last months of my Marie Curie Fellowship. Files are hosted on my GitHub page. Data have been collected by Alain Crivelli and Dušan Jesenšek since 1996. Some background on marble trout, the conservation program, and Western Slovenian streams follows.

1. Marble trout and Western Slovenian streams

Marble trout is a freshwater resident salmonid endemic to the Adriatic basin of Slovenia. Whether there are still pure marble trout populations living in the Po river system (Northern Italy) is the subject of current research. Marble trout live in streams with mean summer temperature below 15 °C and winter temperature ranging from 0 to 5 °C. Marble trout spawn in November-December and offspring emerge in May-June.
The Marble Trout Conservation Program started in 1993 in the upper reaches of the Soca River basin and its tributaries, the Idrijca and Baca Rivers, in Western Slovenia. Eight pure marble trout populations, all isolated and separated from the downstream hybrid marble-brown trout zone by impassable waterfalls, live in headwater streams in the basins of the Soca, Baca, and Idrijca Rivers: Huda, Lower Idrijca, Upper Idrijca (in the map below Lower and Upper Idrijca are grouped together), Lipovesck, Studenc, Svenica, Zadlascica, Trebuscica.
Two other populations (Zakojska and Gacnik) were created by translocating the progeny of the Zadlascica (Zakojska) and Trebuscica X Lipovesck (Gacnik) populations in 1996 and 1998, respectively.


2. Analyses

For some of the analyses I intended to carry out (temperature-dependent survival, growth, and recruitment), it was necessary to have complete temperature records for all streams since the start of the sampling. However, there were some missing data (sometimes whole seasons or years) in every stream. The temperature .csv files are named (stream_name)_temp.csv; the first column is the date (Date) and the second is the mean daily temperature (Temp). Start by sourcing the file Temp.r, which reads all the temperature files and merges them together (R scripts are here).


The output is temp.all.df, a data.frame with columns Date, Temp, Year, Month, Stream, and Calc (Meas = temperature was recorded in the stream; the other values are described further down). Its first rows are shown below. The script also produces a ten-panel plot with stream-specific monthly temperature boxplots for 2009-2013.


Daily Water Temperature (C)
Date Temp Year Month Stream Calc
1996-07-04 10.98 1996 7 Zak Meas
1996-07-05 10.99 1996 7 Zak Meas
1996-07-06 11.26 1996 7 Zak Meas
1996-07-07 11.19 1996 7 Zak Meas
1996-07-08 11.06 1996 7 Zak Meas
1996-07-09 9.96 1996 7 Zak Meas
1996-07-10 9.85 1996 7 Zak Meas
1996-07-11 10.07 1996 7 Zak Meas
1996-07-12 10.51 1996 7 Zak Meas
1996-07-13 11.08 1996 7 Zak Meas
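
As a rough illustration of how a data frame with this layout could be assembled, here is a minimal sketch. It is not the actual Temp.r script: the function name read_stream_temps, the demo folder, and the made-up values for Gac are all assumptions, and it assumes dates are stored as YYYY-MM-DD.

```r
# Hypothetical helper, not the author's Temp.r: reads every
# <stream_name>_temp.csv file in `dir` and stacks them into one data frame.
read_stream_temps <- function(dir) {
  files <- list.files(dir, pattern = "_temp\\.csv$", full.names = TRUE)
  one_stream <- function(path) {
    d <- read.csv(path, stringsAsFactors = FALSE)
    d$Date <- as.Date(d$Date)                   # assumes ISO dates (YYYY-MM-DD)
    d$Year <- as.integer(format(d$Date, "%Y"))
    d$Month <- as.integer(format(d$Date, "%m"))
    d$Stream <- sub("_temp\\.csv$", "", basename(path))
    d$Calc <- "Meas"                            # rows read from file are measured values
    d[, c("Date", "Temp", "Year", "Month", "Stream", "Calc")]
  }
  do.call(rbind, lapply(files, one_stream))
}

# Small demonstration with two made-up files written to a temporary folder.
demo_dir <- file.path(tempdir(), "stream_temps_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Date = c("1996-07-04", "1996-07-05"), Temp = c(10.98, 10.99)),
          file.path(demo_dir, "Zak_temp.csv"), row.names = FALSE)
write.csv(data.frame(Date = c("1996-07-04", "1996-07-05"), Temp = c(9.50, 9.70)),
          file.path(demo_dir, "Gac_temp.csv"), row.names = FALSE)
temp.demo.df <- read_stream_temps(demo_dir)
```

The stream name is taken from the file name, so the sketch relies on the (stream_name)_temp.csv naming convention described above.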

Then, I tested the correlation between water temperatures for pairs of streams (one is the target, i.e. the one with missing data, and the other is the tested stream). I used the temperature data of the tested stream with the highest correlation with the target stream to impute the missing temperature data in the target stream.
The Temp.corr.f function tests the correlation between water temperature data recorded in different streams.

source("Temp.corr.r") # contains Temp.corr.f
Temp.tb = Temp.corr.f(temp.all.df)

The Temp.tb data.frame has columns: target stream (tar), tested stream (var), correlation between the water temperatures of the two streams (cor), years with common number of days with temperature recorded (common.years, with the count of those years in years.cor), years with missing data for the target stream (miss.years), and years with missing data for the target stream but with complete data for the tested stream (the last column of the table below). The years in that last column can thus be used to impute the missing data.
Correlations between water temperatures of streams are typically very high (mean correlation [sd] = 0.97 [0.01]).

Correlation of water temperature
tar var cor years.cor common.years miss.years
Zak Gac 0.95 5 2001-2002-2005-2009-2013 1996-1997-1999-2000-2006-2008-2010-2011-2014 2006-2008-2010-2011
Zak Sve 0.95 3 2002-2009-2013 1996-1997-1999-2000-2006-2008-2010-2011-2014 2006-2008-2010-2011
Zak Stu 0.97 3 2005-2009-2013 1996-1997-1999-2000-2006-2008-2010-2011-2014 2006-2008-2010-2011
Zak LIdri 0.97 3 2005-2009-2013 1996-1997-1999-2000-2006-2008-2010-2011-2014 2006-2008-2010-2011
Zak UIdri 0.96 3 2003-2005-2013 1996-1997-1999-2000-2006-2008-2010-2011-2014 2006-2010-2011
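
The pairwise test can be sketched as follows. This is a simplified stand-in and not the Temp.corr.f function itself: the name pairwise_temp_cor and the n.days column are assumptions, it works on shared days rather than on shared years, and the toy streams A, B, and C are simulated.

```r
# Hypothetical stand-in for Temp.corr.f: Pearson correlation of daily
# temperatures for every ordered (target, tested) pair of streams,
# using only the days on which both streams have a measured value.
pairwise_temp_cor <- function(df) {
  meas <- df[df$Calc == "Meas" & !is.na(df$Temp), c("Date", "Temp", "Stream")]
  meas$Stream <- as.character(meas$Stream)
  streams <- sort(unique(meas$Stream))
  out <- expand.grid(tar = streams, var = streams, stringsAsFactors = FALSE)
  out <- out[out$tar != out$var, ]
  out$cor <- NA_real_
  out$n.days <- 0L
  for (i in seq_len(nrow(out))) {
    a <- meas[meas$Stream == out$tar[i], c("Date", "Temp")]
    b <- meas[meas$Stream == out$var[i], c("Date", "Temp")]
    both <- merge(a, b, by = "Date", suffixes = c(".tar", ".var"))
    out$n.days[i] <- nrow(both)
    if (nrow(both) > 2) out$cor[i] <- cor(both$Temp.tar, both$Temp.var)
  }
  # Best-correlated tested stream first within each target stream.
  out[order(out$tar, -out$cor), ]
}

# Simulated data: B tracks A closely, C is a much noisier stream.
set.seed(1)
days <- seq(as.Date("2009-01-01"), as.Date("2009-12-31"), by = "day")
season <- 8 + 4 * sin(2 * pi * (as.integer(format(days, "%j")) - 110) / 365)
n <- length(days)
toy <- rbind(
  data.frame(Date = days, Temp = season + rnorm(n, sd = 0.3), Stream = "A", Calc = "Meas"),
  data.frame(Date = days, Temp = 0.9 * season + rnorm(n, sd = 0.3), Stream = "B", Calc = "Meas"),
  data.frame(Date = days, Temp = season + rnorm(n, sd = 2), Stream = "C", Calc = "Meas")
)
toy.cor <- pairwise_temp_cor(toy)
```

For each target stream, the first row of the result names the tested stream that would be the natural candidate for imputation.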

In each stream, I imputed the missing data (1) using the temperature data from the tested stream with the highest correlation with the target stream and (2) by applying the best model (linear or non-linear GAM, chosen according to best prediction) linking the water temperature data of the two streams. The R script for imputing missing data is Temp.filling.r.
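
A minimal sketch of the linear version of this step is below. It is not the code in Temp.filling.r: the function name impute_from_stream is an assumption, the toy data are simulated with an exact linear link, and the input is assumed to have exactly the columns Date, Temp, Year, Month, Stream, and Calc. The GAM alternative is not shown; one could fit it with, for instance, mgcv::gam and keep whichever model predicts held-out days better.

```r
# Hypothetical sketch, not the author's Temp.filling.r: fill missing days in
# `target` from `donor` using a linear model fitted on the days both measured.
impute_from_stream <- function(df, target, donor) {
  tar <- df[df$Stream == target & !is.na(df$Temp), c("Date", "Temp")]
  don <- df[df$Stream == donor & df$Calc == "Meas" & !is.na(df$Temp),
            c("Date", "Temp")]
  names(don)[2] <- "Temp.donor"
  both <- merge(tar, don, by = "Date")
  fit <- lm(Temp ~ Temp.donor, data = both)
  # Days recorded in the donor stream but absent from the target stream.
  gap <- don[!(don$Date %in% tar$Date), ]
  if (nrow(gap) == 0) return(df)
  filled <- data.frame(
    Date = gap$Date,
    Temp = unname(predict(fit, newdata = gap)),
    Year = as.integer(format(gap$Date, "%Y")),
    Month = as.integer(format(gap$Date, "%m")),
    Stream = target,
    Calc = donor          # record which stream the value was imputed from
  )
  out <- rbind(df[!(df$Stream == target & is.na(df$Temp)), ], filled[, names(df)])
  out[order(out$Stream, out$Date), ]
}

# Toy example: stream A is missing the last 20 of 120 days.
days <- seq(as.Date("2010-01-01"), by = "day", length.out = 120)
donor_temp <- 5 + 3 * sin(seq_along(days) / 10)
target_temp <- 1 + 2 * donor_temp      # exact linear link, for illustration only
toy <- rbind(
  data.frame(Date = days[1:100], Temp = target_temp[1:100], Stream = "A"),
  data.frame(Date = days, Temp = donor_temp, Stream = "B")
)
toy$Year <- as.integer(format(toy$Date, "%Y"))
toy$Month <- as.integer(format(toy$Date, "%m"))
toy$Calc <- "Meas"
filled.df <- impute_from_stream(toy, target = "A", donor = "B")
```

Storing the donor stream's name in Calc keeps measured and imputed values distinguishable in later analyses.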


The output of the script is the data frame temp.all.df with columns Date, Temp, Year, Month, Stream, Calc, degree_days, and Sampling_Season.
Calc records how each value was obtained: Meas = temperature recorded in the stream; Stream_name = stream whose temperature data were used to impute the missing data; Same_as_a = same temperature as the days after (used when just a few days were missing); Same_as_b = same temperature as the days before (used when just a few days were missing); Gac2005 = in one year (1997) we had missing data for Gac and the only acceptable data for imputing came from Gac in 2005; Zak2012 = in one year (1997) we had missing data for Zak and the only acceptable data for imputing came from Zak in 2012. degree_days gives the degree days for the day using 5 °C as base temperature. Sampling_Season is Summer for June through September and Winter for the rest of the year. Sampling occurred either in June or September or in both.

Daily Water Temperature (C)
Date Temp Year Month Stream Calc degree_days Sampling_Season
2006-08-19 13 2006 8 Stu Meas 7.8 Summer
2006-08-20 13 2006 8 Stu Meas 7.8 Summer
2006-08-21 13 2006 8 Stu Meas 8.1 Summer
2006-08-22 13 2006 8 Stu Meas 7.6 Summer
2006-08-23 13 2006 8 Stu Meas 7.6 Summer
2006-08-24 12 2006 8 Stu Meas 7.3 Summer
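
The two derived columns can be sketched as below. The function name add_degree_days is an assumption, as is the choice to count days colder than the base temperature as zero degree days. Since the rows above label August as Summer, the sketch treats June through September as Summer.

```r
# Hypothetical helper: adds degree_days and Sampling_Season to a data frame
# that already has Temp and Month columns.
add_degree_days <- function(df, base_temp = 5, summer_months = 6:9) {
  # Assumption: days colder than the base contribute zero degree days.
  df$degree_days <- pmax(df$Temp - base_temp, 0)
  df$Sampling_Season <- ifelse(df$Month %in% summer_months, "Summer", "Winter")
  df
}

# Made-up values: a warm August day, a cold January day, a mild November day.
example.df <- add_degree_days(data.frame(Temp = c(12.8, 4.2, 7.0),
                                         Month = c(8, 1, 11)))
```

With a base of 5 °C, a day at 12.8 °C contributes 7.8 degree days, which matches the first rows of the table once Temp is shown rounded.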

Temperature data are now ready to be used to test differences in water temperature between streams, and temperature-dependent survival, growth, and recruitment.

Slides of a recent talk I gave at the Hopkins Marine Station (De Leo's lab)

Slides of a recent talk I gave at the Hopkins Marine Station; I was invited by my former PhD supervisor Giulio De Leo.


Eco-evolutionary responses to extreme events

Reference paper

Extinction risk and eco-evolutionary dynamics in a variable environment with increasing frequency of extreme events

Slides (some colors are missing for unexplained reasons)

Paper published and media coverage

My colleagues and I (Simone Vincenzi, Scott Hatch, Thomas Merkling, Alexander S. Kitaysky) recently published the paper "Carry-over effects of food supplementation on recruitment and breeding performance of long-lived seabirds" in the Proceedings of the Royal Society B: Biological Sciences (you can find the ungated paper here).

It has been long and challenging work, from data preparation to multiple manuscript revisions, but it has been worthwhile, as the results are intriguing and of general interest.

From the Cover Letter we sent to PRSB's Editor: "This is the first experimental test of the long-term effects of controlled variation in early food availability in long-lived wild animals. In addition to casting light on some of the ecological consequences of variation in early food availability, our results also have pivotal consequences for conservation science".

------ Some other excerpts from the cover letter here below

The supplementation of food for wild animals is extensively applied as a conservation tool to increase the local production of young, but the effects of such food supplementation on the subsequent recruitment of long-lived animals into natal populations are largely unknown. For long-lived species, studies are generally observational due to the long time periods required for individuals to reproduce and/or complete their life cycles (SV note: I am lucky enough to work on two model systems with tagging that started in both cases in 1996). Our experimental study, more than a decade long, of the long-term effects of early food supplementation on long-term performance of a long-lived species is thus a novel, original, and exciting contribution. We used the unique experimental system of kittiwakes breeding on Middleton Island (Alaska) to test the alternative hypotheses that food supplementation early in life (a) increases overall fitness of birds, or (b) delays viability selection, with no consequences for the long-term dynamics of the species.

The results of our study are exciting and surprising. Through rigorous statistical and modeling analyses, we found that delayed viability selection is decreasing the recruitment rate of food-supplemented chicks with respect to control birds. We also identified a potential mechanism for the delayed viability selection, i.e. more intensive brood reduction in control nests. Lifetime reproductive success of a subset of kittiwakes that thus far had completed their life cycle was not affected by the food supplementation during development. However, per-nest contribution of recruits was still higher for food-supplemented nests due to their greater productivity compared to control nests, thus suggesting a positive net effect of food supplementation on recruitment.


The paper received media coverage on ScienceNews, with a very clear and accurate article written by Sarah Zielinski in the Wild Things blog.

Boring, but important, note (as we all know that talk is cheap, but money buys whiskey) about funding:

Fieldwork and modeling were supported by the US Geological Survey and North Pacific Research Board (Project no. 320, BESTBSIERP Projects B74, B67 and B77). S.V. is supported by an IOF Marie Curie Fellowship FP7-PEOPLE-2011-IOF for the project ‘RAPIDEVO’ on rapid evolutionary responses to climate change in natural populations, and by the Center for Stock Assessment Research (CSTAR). The MC Fellowship FP7-PEOPLE-2011-IOF and the Institute of Arctic Biology at UAF provided funds to cover the publication costs.

Manuscript submitted

I recently submitted a new manuscript on vital rates and life histories in marble trout. Dense paper, lots of models, lots of results. Currently under review. Here below are the Title and Abstract.


Within- and among-population variation in vital rates and population dynamics in a variable environment --- Vincenzi, Mangel, Jesensek, Garza, Crivelli.


Understanding the causes of within- and among-population differences in vital rates, life histories, and population dynamics is a central topic in ecology. In order to understand how within- and among-population variation emerges, we need long-term studies that include episodic events and contrasting environmental conditions, tag-recapture data for the estimation and characterization of individual and shared variation, and statistical models that can tease apart population, shared, and individual contributions to the observed variation.

We used long-term tag-recapture data and novel statistical and modeling techniques to investigate and estimate within- and among-population differences in vital rates, life histories and population dynamics of marble trout Salmo marmoratus, a narrow endemic freshwater salmonid. Only ten populations of pure marble trout still persist in Western Slovenian headwaters. Marble trout populations are also threatened by floods and landslides, which have already caused the extinction of two populations in recent years.

In particular, we estimated and determined causes of variation and trade-offs within and among populations in growth, survival, and recruitment in response to variation in water temperature, density, sex, early conditions, and extreme events.

In all ten populations, we found that the effects of population density on traits were mostly limited to the early stages of life and that individual growth trajectories were established early in life. We found no clear effects of water temperature on survival and recruitment. Population density was variable over time in all populations, with flash floods and debris flows causing massive mortalities and threatening population persistence. Apart from flood events, variation in population density within streams was largely determined by variation in recruitment, with survival of older fish being relatively constant over time within populations, but substantially different among populations. A fast-to-slow continuum of life histories in marble trout populations seemed to emerge, with slow growth associated with higher survival at the population level, possibly determined by food conditions and age at maturity.

Our work provides unprecedented insight into the causes of variation in vital rates, life histories, and population dynamics in an endemic species that is teetering on the edge of extinction.

Some reflections on my science

Since I had to change the links to my publications due to some obscure passage of pdfs from one folder to another, I had a chance to have a look at all the papers I have published so far: 40 in total, 28 as first author, plus 4 under review (2 as first author). Surprisingly (or not, upon further reflection) I barely remember the content of most of my papers and I have little idea of how they were originally conceived, how they developed, what the co-authors contributed, or why I used certain methods and not others. I saw big tables I did not remember I had prepared. I saw a Figure in which fish are one year older than they should be (I also thought I had sent the correct Figure during the revision process; apparently not). I read long Introductions and longer Discussions (I write a lot, no doubt). I remember long struggles to get papers accepted, even if I currently do not remember the major contentious points.

Just to be clear, I do not have any memory disorder. However, I have published in many different areas, in part because I prefer to zig-zag rather than follow a straight-ish line, in part because I have been supported by soft money throughout my career and I haven't been too rigid in my research/grant choices. I also tried to use novel methods (for me or in general), since I like to challenge myself and expand my research tools. I tend to go very deep and very fast in my research and this, like cramming for a test, is not conducive to long-term retention of information.

This "discovery" made me think about my research trajectory, what kind of tools and skills I have acquired, and whether production of science is like the production of eggs in fish: you give your contribution and you let it find its way.

Akin's Laws of Spacecraft Design

A great read with wider application

Some of my favorites:

1. Engineering is done with numbers. Analysis without numbers is only an opinion.

4. Your best design efforts will inevitably wind up being useless in the final design. Learn to live with the disappointment.

6. (Mar's Law) Everything is linear if plotted log-log with a fat magic marker.

9. Not having all the information you need is never a satisfactory excuse for not starting the analysis.

16. The previous people who did a similar analysis did not have a direct pipeline to the wisdom of the ages. There is therefore no reason to believe their analysis over yours. There is especially no reason to present their analysis as yours.

17. The fact that an analysis appears in print has no relationship to the likelihood of its being correct.

19. The odds are greatly against you being immensely smarter than everyone else in the field. If your analysis says your terminal velocity is twice the speed of light, you may have invented warp drive, but the chances are a lot better that you've screwed up.

21. (Larrabee's Law) Half of everything you hear in a classroom is crap. Education is figuring out which half is which.

24. It's called a "Work Breakdown Structure" because the Work remaining will grow until you have a Breakdown, unless you enforce some Structure on it.

29. (von Tiesenhausen's Law of Program Management) To get an accurate estimate of final program requirements, multiply the initial time estimates by pi, and slide the decimal point on the cost estimates one place to the right.

32. (Atkin's Law of Demonstrations) When the hardware is working perfectly, the really important visitors don't show up.

34. (Roosevelt's Law of Task Planning) Do what you can, where you are, with what you have.

37. (Henshaw's Law) One key to success in a mission is establishing clear lines of blame.

41. Space is a completely unforgiving environment. If you screw up the engineering, somebody dies (and there's no partial credit because most of the analysis was right...)


Links to R-related stuff

Git and GitHub

Predicting Baseball Game Attendance with R

Data wrangling process as seen by Hadley Wickham

Bayesian Rugby

Practical Data Science by Sebastian Raschka

10 things statistics taught us about big data analysis

Writing Scientific Papers Using Markdown

Data Processing with dplyr & tidyr

Hierarchical Bayesian Survival Analysis for CVD Risk Prediction in Diabetic Individuals

RStudio & git/github Demonstration (Video)

Using generalized linear models to compare group means in R

R graph catalog


Paper submitted

With Scott Hatch, Thomas Merkling, Sasha Kitaysky

Title: Food supplementation early in life delays viability selection in a long‑lived animal


Supplementation of food to wild animals is extensively applied as a conservation tool to increase local production of young. However, the effects of food supplementation on the subsequent recruitment as breeders of long-lived migratory animals into natal populations and their lifetime reproductive success are largely unknown. We examine how experimental food supplementation affects (a) recruitment as breeders of kittiwakes Rissa tridactyla born in a colony on Middleton Island (Alaska) between 1996 and 2006 (n = 1629) that bred in the same colony through 2013 (n = 235); and (b) breeding success of individuals that have completed their life cycle at the colony (n = 56). Birds were raised in nests that were either supplemented with food (Fed) or unsupplemented (Unfed). Fledging success was higher in Fed compared to Unfed nests. After accounting for hatching rank, growth, and oceanic conditions at fledging, Fed fledglings had a lower probability of recruiting as breeders in the Middleton colony than Unfed birds, but the per-nest contribution of breeders was still significantly higher for Fed nests. Lifetime reproductive success of a subset of breeders that completed their life cycle was not affected by the food supplementation during development. Our results cast light on the interaction between intrinsic quality and early food conditions in determining fitness of long-lived animals.

Keywords: Individual quality; supplemental feeding; long-lived animals; viability selection.


Paper submitted to Axios Review

A few months ago, my colleagues and I submitted to Fish and Fisheries a manuscript on the trade-offs between complexity and accuracy in random-effects models of body growth.

The paper was rejected mostly on the basis of lack of fit (i.e. the topic was only marginally interesting for the journal's readership). One Reviewer found the paper interesting and valuable, and recommended the submission of the manuscript to a more general journal, such as Ecology or Oikos. The other Reviewer commented on some unclear technical aspects of the work (the review was quite detailed and the recommendations/suggestions/critiques were valuable, thanks anonymous Reviewer).

I believe the paper should be of interest to a large audience of biologists, ecologists, and computational scientists/statisticians. The main motivation of the paper is quite simple and very general: "We often face trade-offs between model complexity, biological interpretability of parameters, and goodness of fit." Then, with reference to models of growth: "Depending on formulation, parameters of some growth models may or may not be biologically interpretable. For instance the parameters of the widely used von Bertalanffy growth function (von Bertalanffy 1957) to model growth of fish may be considered either curve fitting parameters with no biological interpretation (i.e. providing phenomenological description of growth) or parameters that describe how anabolic and catabolic processes govern the growth of the organism (i.e. mechanistic description); see Mangel (2006). The classic von Bertalanffy growth function has 3 parameters: asymptotic size, growth coefficient, and theoretical age at which size is equal to 0. In the original mechanistic formulation of von Bertalanffy, asymptotic size results from the relationship between environmental conditions and behavioral traits and the growth coefficient is closely related to metabolic rates and behavioral traits (i.e. the same physiological processes affect both growth and asymptotic size). However, in the literature asymptotic size and growth rate are commonly treated as independent parameters with no connection to physiological functions, thus offering just a phenomenological description of growth."
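
For readers less familiar with it, the classic three-parameter form mentioned in the quote can be written as a short R function. This is only the standard phenomenological curve, not the random-effects models of the manuscript; the function name vbgf and the parameter values are made up for illustration.

```r
# Classic three-parameter von Bertalanffy growth function.
# Linf = asymptotic size, k = growth coefficient,
# t0 = theoretical age at which size equals 0.
vbgf <- function(age, Linf, k, t0) {
  Linf * (1 - exp(-k * (age - t0)))
}

# Illustrative (made-up) parameter values, not estimates from the manuscript.
ages <- 0:10
size.at.age <- vbgf(ages, Linf = 400, k = 0.3, t0 = -0.5)
```

Size is 0 at age t0 and approaches Linf as age increases, with k setting how quickly the asymptote is approached.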

However, I understand Editors may not fully grasp the relevance of the paper for their journal. For instance, the manuscript was previously submitted to another journal, but the Editor wrote: "I feel that the work is too specialised, as relatively few researcher work on growth curves". I disagree with the claim that few researchers work on growth curves: I am sure that lots of scientists use growth models in their work. I might agree, though, that few people work on the development of growth models or of methods for the estimation of growth model parameters (it is also quite hard).

My colleagues and I (my idea, my colleagues agreed) decided to submit the manuscript to Axios Review, a new service that should help authors publish their papers in higher profile journals. This is how it works: "Axios Review solves this problem by putting papers through rigorous external peer review and then referring them to the appropriate journal. When a journal asks the authors to revise and submit, the journal has effectively said that: i) the paper is within their scope, ii) that it is not fatally flawed, and iii) that it could be published in their journal. The Axios Review process effectively eliminates rejections on the grounds of novelty and significantly reduces the chances of rejection on quality. It’s similar to getting a ‘reject, encourage resubmission’ decision from the journal itself; for comparison, about 75% of resubmissions to top tier evolution journals get accepted. Authors submitting to Axios Review can have the reviewers comment on the suitability of their paper for any journal they choose, allowing them to aim for a high profile journal without the effort of formally submitting."

I submitted the manuscript to Axios Review a couple of days ago (target journals following an order that may or may not be the one I chose: Oikos, Ecology, Journal of Theoretical Biology, Ecological Applications). So far, communication with the Editorial staff has been excellent.

I did not upload the manuscript to arXiv or bioRxiv (I don't know where the manuscript will end up, and thus which pre-print policy I should follow). Please send an email if you'd like to read a pre-print.