Species distribution modelling (SDM), also called environmental niche modelling (ENM), habitat suitability modelling, or range mapping, uses ecological models to predict where a species lives across different areas and times. These models rely on environmental data, such as temperature, rainfall, soil type, water depth, and land cover. SDMs are used in conservation biology, ecology, and evolutionary studies. They help scientists understand how environmental conditions affect a species' presence or numbers and can predict future or past species distributions. For example, predictions might show where a species could live in the future due to climate change, where it lived in the past to study evolutionary connections, or where an invasive species might spread. These predictions can aid in managing ecosystems, such as helping to reintroduce endangered species or plan protected areas for future climate conditions.
There are two main types of SDMs. Correlative SDMs, also known as climate envelope models or bioclimatic models, compare a species' current location to environmental conditions to find patterns. Mechanistic SDMs, also called process-based models or biophysical models, use information about a species' biology, such as how it survives in different climates, to predict where it can live.
How accurately these models reflect real-world species locations depends on several factors. These include the quality and complexity of the models, the accuracy of environmental data, the availability of reliable species data, and the impact of factors like physical barriers, geological history, or interactions with other species that affect where a species actually lives. Environmental niche modelling is part of the field of biodiversity informatics.
History
A. F. W. Schimper studied how geography and the environment affect where plants grow. He wrote about this in his 1898 book Plant Geography Upon a Physiological Basis and again in a 1908 book with the same title. Andrew Murray examined how the environment influences where mammals live in his 1866 book The Geographical Distribution of Mammals. Robert Whittaker's research on plants and Robert MacArthur's work on birds showed that the environment plays a major role in determining where species are found. Elgene O. Box created models to predict the areas where tree species can grow. He used computer simulations, which were among the first examples of species distribution modeling.
The adoption of generalized linear models (GLMs), a more flexible family of statistical models, allowed scientists to build more accurate and realistic models of species distributions. The growth of satellite remote sensing and the development of geographic information systems (GIS) increased the amount of environmental data available for modeling and made it easier to analyze.
Correlative vs mechanistic models
Species distribution models (SDMs) began as correlative models. Correlative SDMs use statistical methods to relate the places where a species has been recorded to the climate and other environmental conditions at those places. Given records of where a species is present and maps of environmental conditions, the fitted model predicts the areas most likely to be suitable for the species. These models assume that the species is in equilibrium with its environment and that the environmental data adequately cover the relevant conditions. They can also help estimate where a species might live even if only a few locations are known.
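The idea of relating occurrence records to an environmental variable can be sketched with a one-predictor logistic regression, the simplest GLM for presence/absence data. This is a minimal illustration rather than any particular package's method: the data are invented, the predictor is assumed to be temperature in standardised units, and `fit_logistic` is a hypothetical helper written for this example.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, n_iter=5000):
    """Fit p(presence) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        grad0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys))
        grad1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys))
        b0 += lr * grad0 / n
        b1 += lr * grad1 / n
    return b0, b1

# Invented example: standardised temperature at nine surveyed sites and
# whether the species was recorded there (1) or not (0).
temperature = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
recorded = [0, 0, 1, 0, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(temperature, recorded)
suitability_warm = sigmoid(b0 + b1 * 2.0)   # predicted suitability at a warm site
suitability_cold = sigmoid(b0 + b1 * -2.0)  # predicted suitability at a cold site
```

A real analysis would use several predictors and an established statistics library; the sketch only shows how a fitted response curve turns environmental values into suitability scores.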
For these models to work well, scientists need records both of where a species is found and of where it is not found. However, reliable absence records are rare. Scientists often use "random background" or "pseudo-absence" data to fill this gap. If the presence records are incomplete, these methods can lead to errors. Correlative SDMs describe the environments a species currently occupies, which is called the "realized niche." This is different from the "fundamental niche," which includes all environments where a species could live if there were no limits like competition or movement barriers. A species' realized niche may be smaller than its fundamental niche if it cannot reach certain areas due to these factors.
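Drawing random background points can be sketched as follows. This is a minimal example that assumes the study area is a rectangular grid and that presences are stored as (row, column) cells; `sample_background` is a hypothetical helper, and excluding presence cells is only one of several conventions in use.

```python
import random

def sample_background(n_rows, n_cols, presences, n_points, seed=0):
    """Draw random background (pseudo-absence) cells from a grid.

    Cells with a known presence record are excluded here; some workflows
    sample the whole study area instead.
    """
    presence_set = set(presences)
    candidates = [
        (r, c)
        for r in range(n_rows)
        for c in range(n_cols)
        if (r, c) not in presence_set
    ]
    if n_points > len(candidates):
        raise ValueError("not enough non-presence cells to sample from")
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return rng.sample(candidates, n_points)

presences = [(0, 1), (2, 3), (4, 4)]
background = sample_background(5, 5, presences, n_points=10)
```

Because the points are drawn without knowing whether the species is truly absent, some of them may fall in suitable habitat, which is one way incomplete presence data can bias the fitted model.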
Correlative SDMs are easier and quicker to build than mechanistic SDMs and use available data efficiently. However, because they are based on correlations, they do not explain why species live in certain places, and they are unreliable when extrapolated to conditions outside those in the data used to build them. They may also be inaccurate if a species is not in equilibrium with its environment, such as when a species is newly introduced and still expanding its range.
In standard SDMs, scientists often model the distribution of one species at a time, using parameters that describe how environmental factors affect its likelihood of being present. This allows each species to respond differently to environmental changes but can be difficult when data about a species is limited. Multi-species SDMs study several species together, using shared parameters to compare how they respond to the environment. Neither standard nor multi-species SDMs consider how species interact with each other, which can influence biodiversity. Joint SDMs (J-SDMs) address this by modeling how species coexist, showing how both environmental factors and interactions with other species affect a species' presence. This can improve predictions for rare species and help understand community ecology. Both standard and J-SDMs can calculate community-level metrics, like the number of species in an area, which helps with decisions like conservation planning.
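One common way to obtain a community-level metric from single-species models is to stack them: summing the predicted occurrence probabilities of all species at a site gives the expected number of species there. The sketch below assumes each model has already produced one probability per site; the species names and values are invented.

```python
def expected_richness(prob_by_species):
    """Sum per-species occurrence probabilities site by site.

    prob_by_species maps a species name to a list of probabilities,
    one per site, with all lists in the same site order.
    """
    site_lists = list(prob_by_species.values())
    n_sites = len(site_lists[0])
    if any(len(p) != n_sites for p in site_lists):
        raise ValueError("all species need one probability per site")
    return [sum(p[i] for p in site_lists) for i in range(n_sites)]

# Invented predictions for three species at two sites.
predictions = {
    "species_a": [0.9, 0.1],
    "species_b": [0.5, 0.5],
    "species_c": [0.25, 0.0],
}
richness = expected_richness(predictions)  # approximately [1.65, 0.6]
```

Stacking treats species as independent given the environment, which is exactly the assumption that joint SDMs relax.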
Mechanistic SDMs are newer models that use information about a species' biology, such as how it survives in different conditions, to predict where it can live. These models aim to describe the full range of environments a species can survive in, called the fundamental niche, and map this across landscapes. A simple model might show temperature limits that prevent a species from surviving. A more complex model might include steps like how body temperature affects survival, how much energy a species needs, and how populations grow or shrink. These models use climate and other environmental data as inputs. Because they do not rely on where a species is currently found, they are useful for species whose ranges are changing, like invasive species.
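The simplest case described above, temperature limits that prevent survival, can be sketched as a rule applied to every grid cell. The tolerance limits and temperature grids here are invented for illustration, and the function names are hypothetical.

```python
def survivable(cell_min_temp, cell_max_temp, lower_limit, upper_limit):
    """A cell is survivable if its temperature extremes stay within
    the species' physiological tolerance limits."""
    return cell_min_temp >= lower_limit and cell_max_temp <= upper_limit

def map_fundamental_niche(min_grid, max_grid, lower_limit, upper_limit):
    """Apply the tolerance rule to every cell of two matching grids."""
    return [
        [survivable(lo, hi, lower_limit, upper_limit)
         for lo, hi in zip(min_row, max_row)]
        for min_row, max_row in zip(min_grid, max_grid)
    ]

# Invented 2x2 grids of coldest and hottest temperatures (degrees Celsius).
min_temp = [[-5.0, 2.0], [4.0, 10.0]]
max_temp = [[20.0, 28.0], [31.0, 36.0]]

# Assumed tolerance limits, e.g. from laboratory measurements.
niche_map = map_fundamental_niche(min_temp, max_temp, lower_limit=0.0, upper_limit=32.0)
# niche_map == [[False, True], [True, False]]
```

No occurrence records are used anywhere in this calculation, which is why the approach remains usable for species whose ranges are still changing.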
Mechanistic SDMs explain why species live in certain places and are better for predicting conditions outside known ranges. However, they are harder to create and require detailed biological data that may not always be available. These models also need many assumptions and can become very complicated.
Factors like how species move, how they interact with others, and evolutionary changes are often not included in either correlative or mechanistic models.
Correlative and mechanistic models can be used together to improve understanding. For example, a mechanistic model might identify areas where a species cannot survive, which can then be used as "absences" in correlative models.
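A minimal sketch of this combination, assuming a mechanistic model has already produced a grid of survivable (True) and non-survivable (False) cells: cells flagged as non-survivable become absences and are merged with the observed presences into a labelled training set. `build_training_set` is a hypothetical helper and the data are invented.

```python
def build_training_set(presences, mechanistic_suitable):
    """Combine observed presences with mechanistically derived absences.

    presences: list of (row, col) cells with occurrence records.
    mechanistic_suitable: 2D grid of booleans from a mechanistic model.
    Returns a list of ((row, col), label) with label 1 = presence, 0 = absence.
    """
    records = [(cell, 1) for cell in presences]
    presence_set = set(presences)
    for r, row in enumerate(mechanistic_suitable):
        for c, suitable in enumerate(row):
            # An observed presence overrides the mechanistic prediction.
            if not suitable and (r, c) not in presence_set:
                records.append(((r, c), 0))
    return records

suitable_grid = [[True, False], [False, True]]
training = build_training_set([(0, 0)], suitable_grid)
# training == [((0, 0), 1), ((0, 1), 0), ((1, 0), 0)]
```

Cells the mechanistic model considers survivable but where the species has not been recorded are left unlabelled, since they may be unoccupied for reasons other than physiology.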
Niche models (correlative)
There are many mathematical methods used to fit, select, and evaluate correlative species distribution models (SDMs). These methods include "profile" methods, which are simple statistical techniques that use environmental distance to known sites of occurrence, such as BIOCLIM and DOMAIN; "regression" methods, such as generalized linear models; and "machine learning" methods, such as maximum entropy (MAXENT). An incomplete list of models that have been used for niche modeling includes:
- BIOCLIM
- DOMAIN
- Ecological niche factor analysis (ENFA)
- Mahalanobis distance
- Isodar analysis
- Generalized linear model (GLM)
- Generalized additive model (GAM)
- Multivariate adaptive regression splines (MARS)
- Maxlike
- Favourability Function (FF)
- MAXENT
- Artificial neural networks (ANN)
- Genetic Algorithm for Rule Set Production (GARP)
- Boosted regression trees (BRT)/gradient boosting machines (GBM)
- Random forest (RF)
- Support vector machines (SVM)
- XGBoost (XGB)
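As an illustration of the "profile" family, the sketch below builds a rectangular climate envelope from the range of each variable at known occurrence sites and tests whether a new site falls inside it. This is a simplified envelope in the spirit of BIOCLIM, not the BIOCLIM algorithm itself; the variable names and values are invented.

```python
def fit_envelope(occurrence_env):
    """Record the minimum and maximum of each environmental variable
    across the sites where the species has been recorded."""
    variables = list(occurrence_env[0].keys())
    return {
        v: (min(site[v] for site in occurrence_env),
            max(site[v] for site in occurrence_env))
        for v in variables
    }

def inside_envelope(envelope, site):
    """A site is predicted suitable only if every variable lies
    within the range observed at occurrence sites."""
    return all(lo <= site[v] <= hi for v, (lo, hi) in envelope.items())

# Invented conditions at three occurrence sites.
occurrences = [
    {"temp": 12.0, "rain": 800.0},
    {"temp": 18.0, "rain": 1200.0},
    {"temp": 15.0, "rain": 950.0},
]
envelope = fit_envelope(occurrences)
mild_site = inside_envelope(envelope, {"temp": 16.0, "rain": 1000.0})  # True
hot_site = inside_envelope(envelope, {"temp": 22.0, "rain": 1000.0})   # False
```

Profile methods of this kind need only presence records, which is part of their appeal when absence data are unavailable.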
Ensemble models combine the outputs of several models so that the result captures features of each. Often, the mean or median prediction across several models is used as the ensemble. Consensus models are those whose predictions fall closest to the mean or median of all models. A consensus model can be a single model run or itself an ensemble of several models.
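Both ideas can be sketched in a few lines, assuming each model has produced one suitability score per site. The model names and scores are invented, and squared distance to the ensemble is only one possible measure of closeness.

```python
from statistics import mean, median

def ensemble(predictions, how="mean"):
    """Combine per-site predictions from several models with the mean or median."""
    aggregate = mean if how == "mean" else median
    site_lists = list(predictions.values())
    n_sites = len(site_lists[0])
    return [aggregate([p[i] for p in site_lists]) for i in range(n_sites)]

def consensus_model(predictions, how="mean"):
    """Return the name of the model closest to the ensemble,
    measured by summed squared difference across sites."""
    center = ensemble(predictions, how)

    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(predictions[name], center))

    return min(predictions, key=distance)

# Invented suitability scores from three models at two sites.
predictions = {
    "glm": [0.2, 0.8],
    "rf": [0.4, 0.6],
    "maxent": [0.9, 0.1],
}
mean_ensemble = ensemble(predictions, "mean")      # approximately [0.5, 0.5]
median_ensemble = ensemble(predictions, "median")  # [0.4, 0.6]
closest = consensus_model(predictions, "mean")     # "rf"
```

The median is less sensitive than the mean to a single model that disagrees strongly with the others, as the "maxent" scores do here.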
Niche modelling software (correlative)
SPACES is an online tool that helps scientists study how plants and animals live in different environments. It lets users try many different methods to create models using a web browser, which works on computers and mobile devices.
MaxEnt is a popular method that uses presence-only data, that is, records of where a species has been found without matching absence records. It works well even when there are few occurrence records.
ModEco is another tool that includes several different methods for modeling.
Qb.SDM uses methods like Random Forest, XGBoost, and MaxEnt. It also connects to a database called GBIF to access information about species.
DIVA-GIS is a simple tool that is good for teaching. It includes a method called BIOCLIM, which helps predict where species might live based on environmental conditions.
The Biodiversity and Climate Change Virtual Laboratory (BCCVL) is a website that makes it easier to study how climate change affects plants and animals. It gives users access to global environmental data or lets them upload their own data. Users can run six types of experiments, such as species distribution models, climate change projections, and ensemble analysis, using 17 different methods. Results can be viewed and compared easily. Examples of model outputs are available online.
Ecocrop is a tool that helps scientists determine if a plant can grow in a specific area. It also predicts how much crops might produce and how climate change could affect plant growth.
Many niche modeling methods are available in R packages such as 'dismo', 'biomod2', and 'mopa'.
Developers interested in creating new tools might use the openModeller project.
The Collaboratory for Adaptation to Climate Change (adapt.nd.edu) has created an online version of openModeller. This version allows users to run experiments quickly using a web browser, without needing powerful local computers. It supports running multiple experiments at the same time.
SDM applications
Species distribution models (SDMs) are widely used tools in ecological research, conservation planning, and environmental management. These models help scientists understand and predict where different species live, both now and in the future. They are not only useful for studying ecosystems but also help people make important decisions about protecting nature and creating policies to address environmental changes.
One important use of SDMs is in protecting biodiversity and designing nature reserves. By showing where habitats are most suitable for certain species, SDMs help choose areas that need protection or restoration. In places where there is little field data, SDMs can predict where rare, unique, or endangered species might live, helping to expand protected areas. They also help evaluate how well current reserves will work under today’s and future environmental conditions.
Another key use of SDMs is studying how climate change affects species. These models predict how species’ ranges might change under different climate conditions, showing whether their areas could shrink, grow, or move. This information helps planners and conservationists prepare for changes in biodiversity and take steps to protect species. SDMs also help find "climate refugia", areas that are likely to stay suitable for a species even as the climate changes and that are therefore important for long-term conservation efforts.
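Whether a range shrinks, grows, or moves can be summarised by comparing a present-day suitability map with a projected one cell by cell. The sketch below assumes both maps have already been converted to binary suitable/unsuitable grids of the same shape; the grids are invented and `range_change` is a hypothetical helper.

```python
def range_change(current, future):
    """Count cells lost, gained, and stable between two binary maps."""
    lost = gained = stable = 0
    for cur_row, fut_row in zip(current, future):
        for cur, fut in zip(cur_row, fut_row):
            if cur and fut:
                stable += 1   # suitable now and in the projection
            elif cur:
                lost += 1     # suitable now, unsuitable later
            elif fut:
                gained += 1   # unsuitable now, suitable later
    return {"lost": lost, "gained": gained, "stable": stable,
            "net_change": gained - lost}

# Invented binary suitability maps (1 = suitable, 0 = unsuitable).
current_map = [[1, 1, 0],
               [0, 1, 0]]
future_map = [[0, 1, 1],
              [0, 1, 1]]

summary = range_change(current_map, future_map)
# summary == {"lost": 1, "gained": 2, "stable": 2, "net_change": 1}
```

In this example the range shifts toward the right-hand columns while growing slightly overall, and the stable cells are the ones that would be examined as candidate refugia.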
In managing invasive species, SDMs are becoming more important. They help predict where non-native species might spread, supporting efforts to stop invasions and protect ecosystems. Similarly, in disease research, SDMs help predict where diseases might spread by combining data about the environment, climate, and where animals live.
In ecological and evolutionary studies, SDMs help scientists understand how species interact with their environment, how different species use space, and patterns of where species live. They are often used with genetic data to study how populations connect, how species evolved, and how they respond to changes in their environment.
Finally, SDMs are used in studying ecosystem services, restoring habitats, and planning land use. By identifying where important species like pollinators or keystone species live, SDMs help create plans that balance conservation with human needs.
In summary, SDMs are used in many areas, from basic research to practical conservation. As technology improves, for example through better satellite data and greater computing power, SDMs are becoming even more useful for connecting scientific knowledge with real-world environmental decisions.