As of now, I am halfway through my internship with my employer, the US EPA. The first half has had some challenges. We were not expecting the last species distribution model (SDM) to take as long as it did to finalize. The species had far more observation data points and a relatively small distribution compared to one of the other species that we modeled (the bird-voiced frog).
In the end, our cutoff values were quite low (~100). Cutoff values are important because they are used to convert the projection output from a continuous range to a binary (presence/absence) prediction. Because our values were so low, the model indicated significant distribution increases in the future projections.
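To make the effect of the cutoff concrete, here is a minimal sketch in Python (not biomod2 code, which is R). The suitability scores are made-up values on an assumed 0-1000 scale, and the greater-than-or-equal rule in the hypothetical `to_binary` helper is my assumption, not necessarily how biomod2 applies the threshold.

```python
# Hypothetical suitability scores for six map cells (illustrative values only,
# assumed to be on a 0-1000 scale).
suitability = [40, 95, 120, 300, 750, 980]


def to_binary(scores, cutoff):
    """Return 1 (presence) where score >= cutoff, otherwise 0 (absence).

    The >= comparison is an assumption made for this sketch.
    """
    return [1 if s >= cutoff else 0 for s in scores]


low_cutoff = to_binary(suitability, cutoff=100)
high_cutoff = to_binary(suitability, cutoff=500)

print(low_cutoff)   # [0, 0, 1, 1, 1, 1]
print(high_cutoff)  # [0, 0, 0, 0, 1, 1]

# Number of cells predicted as "presence" under each cutoff.
print(sum(low_cutoff), sum(high_cutoff))  # 4 2
```

With the low cutoff, even marginally suitable cells are counted as presence, which is why a low cutoff can make the projected distribution look much larger.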
To resolve this, we searched the GitHub page for biomod2 to better understand how the cutoff value is calculated. I was not able to find anything relevant, so I opened my own issue. Through that issue, I was able to discuss our questions and better understand what could be causing this uncertainty.
We determined that the low cutoff value was likely driven by disagreement between the different algorithms used to build the ensemble model (GBM, XGBOOST, and GLM). Some algorithms were weighting explanatory variables more than others. It was this lack of agreement that likely led to the uncertainties in the model and the lower cutoff values.
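As a rough illustration of what disagreement between algorithms looks like, the Python sketch below summarizes made-up scores from three algorithms for three map cells. The numbers are hypothetical, and the unweighted mean and standard deviation in the hypothetical `summarize` helper are my simplifying assumptions; they are not necessarily the ensemble rule used in our biomod2 models.

```python
from statistics import mean, pstdev

# Hypothetical per-cell suitability scores (assumed 0-1000 scale) from the
# three algorithms. These values are invented for illustration.
predictions = {
    "GBM":     [900, 150, 50],
    "XGBOOST": [850, 700, 40],
    "GLM":     [880, 100, 60],
}


def summarize(preds):
    """Return a (mean, population standard deviation) pair for each cell,
    computed across the algorithms."""
    cells = zip(*preds.values())  # regroup scores cell by cell
    return [(mean(c), pstdev(c)) for c in cells]


for i, (m, sd) in enumerate(summarize(predictions)):
    print(f"cell {i}: mean={m:.0f}, sd={sd:.0f}")
```

In this toy example, the second cell is where the algorithms disagree the most: one model scores it high while the other two score it low, so the averaged score lands in an ambiguous middle range with a large spread.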
During this process, I realized that two of the observation data points were well outside the current distribution and were likely errors, so I removed them. This improved the ensemble cutoff value slightly. The biomod2 developer, Maya, gave us two potential solutions: change the number of pseudo-absence (PA) values generated, or find a new variable that better defines the species' current distribution. For PA value selection, we were following the guidelines from Barbet-Massin et al. (2012). As for selecting a new variable, we were apprehensive about including any habitat parameters because those are likely to change as the climate changes, which may constrict the future projections more than they should.
Since we followed that PA method and used only the climate, elevation, and land cover explanatory variables for the other two species, we did not want to deviate from our methods. We decided to use the final output despite the lower cutoff values and to discuss the uncertainty of the oak toad projection in the research article that results from this project. Below is an example of an ensemble projection output from biomod2.
In addition to the internship, I have been working on updates to my LinkedIn profile. To improve the profile, I included a few more media samples and updated my current position. I also added an experience for my internship to highlight some of the work that I have been doing and linked my blog to the experience. My goal with the updates was to include my current work experience and add more examples of my work to better highlight things that can’t be included in a resume. Here is the link to my LinkedIn profile.
