MaxEnt

A program for maximum entropy modelling of species geographic distributions, written by Steven Phillips, Miro Dudik and Rob Schapire, with support from AT&T Labs-Research, Princeton University, and the Center for Biodiversity and Conservation, American Museum of Natural History.  Thank you to the authors of the following free software packages which we have used here: ptolemy/plot, gui/layouts, gnu/getopt and com/mindprod/ledatastream.

This page contains reference information for the MaxEnt program.  Background information on the method can be found in the following two papers:

   Steven J. Phillips, Robert P. Anderson, Robert E. Schapire.
   Maximum entropy modeling of species geographic distributions.
   Ecological Modelling, Vol 190/3-4 pp 231-259, 2006.

   Steven J. Phillips, Miroslav Dudik.
   Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation.
   Ecography, Vol 31 pp 161-175, 2008.

The model for a species is determined from a set of environmental or climate layers (or "coverages") for a set of grid cells in a landscape, together with a set of sample locations where the species has been observed.  The model expresses the suitability of each grid cell as a function of the environmental variables at that grid cell.  A high value of the function at a particular grid cell indicates that the grid cell is predicted to have suitable conditions for that species.  The computed model is a probability distribution over all the grid cells.  The distribution chosen is the one that has maximum entropy subject to some constraints: it must have the same expectation for each feature (derived from the environmental layers) as the average over sample locations.

Inputs, Outputs and Parameters

Input files, output directory and algorithm parameters can be specified through the user interface, or on a command line.  The user interface is best for doing single runs, while the command line is useful for repeated runs or automatically performing a sequence of runs with variations in the set of inputs. 

Inputs:

Algorithm Parameters:

Outputs:

All output files are written in the output directory. The summary of a maxent run is given in In addition, maxent produces several files for every species. For a species called mySpecies, it produces files

The output format for predicted distributions is either raw, logistic (the default) or cumulative.

For raw output, the output values are probabilities (between 0 and 1) such that the sum over all cells used during training is 1. Typical values are therefore extremely small.

For logistic output, the values are again probabilities (between 0 and 1), but scaled up in a non-linear way for easier interpretation. If typical presences used during training are from environmental conditions where probability of presence is around 0.5, then the logistic output can be interpreted as predicted probability of presence (otherwise it can be interpreted as relative suitability). If p(x) is the raw output for environmental conditions x, the corresponding logistic value is c p(x) / (1 + c p(x)) for a particular value of c (namely, the exponential of the entropy of the raw distribution).

For the cumulative output format, the value at a grid cell is the sum of the probabilities of all grid cells with no higher probability than the grid cell, times 100.  For example, the grid cell that is predicted as having the best conditions for the species, according to the model, will have cumulative value 100, while cumulative values close to 0 indicate predictions of unsuitable conditions.
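
The relationship between the three output formats can be illustrated with a short Python sketch. It is not taken from the MaxEnt source code: the function names are hypothetical, and the entropy is computed over the small list of raw values given here, on the assumption that this list is the whole raw distribution.

```python
import math

def logistic_from_raw(raw):
    """Convert raw values (which sum to 1) to logistic values."""
    # Entropy of the raw distribution; zero-probability cells contribute nothing.
    entropy = -sum(p * math.log(p) for p in raw if p > 0)
    c = math.exp(entropy)
    return [c * p / (1 + c * p) for p in raw]

def cumulative_from_raw(raw):
    """For each cell, 100 times the sum of probabilities of cells with no higher probability."""
    return [100 * sum(q for q in raw if q <= p) for p in raw]

# Illustrative raw distribution over four grid cells (made-up values).
raw = [0.1, 0.2, 0.3, 0.4]
logistic = logistic_from_raw(raw)
cumulative = cumulative_from_raw(raw)
```

Under these assumptions the cell with the highest raw value receives a cumulative value of 100, and a uniform raw distribution maps every cell to a logistic value of 0.5.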

ESRI ASCII Grid Format

(Copied from the ArcWorkstation 8.3 Help File)

The ASCII file must consist of header information containing a set of keywords, followed by cell values in row-major order. The file format is

<NCOLS xxx>
<NROWS xxx>
<XLLCENTER xxx | XLLCORNER xxx>
<YLLCENTER xxx | YLLCORNER xxx>
<CELLSIZE xxx>
{NODATA_VALUE xxx}
row 1
row 2
...
row n
where xxx is a number, and the keyword nodata_value is optional and defaults to -9999. Row 1 of the data is at the top of the grid, row 2 is just under row 1 and so on. For example:
ncols 386
nrows 286
xllcorner -128.66338
yllcorner 13.7502065
cellsize 0.2
NODATA_value -9999
-9999 -9999 -123 -123 -123 -9999 -9999 -9999 -9999 -9999 ...
-9999 -9999 -123 -123 -123 -9999 -9999 -9999 -9999 -9999 ...
-9999 -9999 -117 -117 -117 -119 -119 -119 -119 -119 -9999 ...
...
The nodata_value is the value in the ASCII file to be assigned to those cells whose true value is unknown. Cell values should be delimited by spaces. No carriage returns are necessary at the end of each row in the grid. The number of columns in the header is used to determine when a new row begins. The number of cell values must be equal to the number of rows times the number of columns.

The current implementation of maxent requires fields xllcorner, yllcorner and nodata_value.
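
As a rough illustration of the format, the following Python sketch parses a small grid given as text. The function name read_ascii_grid is hypothetical, and the sketch assumes the header keywords are followed by purely numeric cell values, as in the example above. It is not the reader used by maxent.

```python
def read_ascii_grid(text):
    """Parse ESRI ASCII grid text into (header dict, list of rows)."""
    tokens = text.split()
    header = {}
    pos = 0
    # Header entries are keyword/value pairs; stop at the first numeric token.
    while pos < len(tokens) and tokens[pos][0].isalpha():
        header[tokens[pos].lower()] = float(tokens[pos + 1])
        pos += 2
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    nodata = header.get("nodata_value", -9999.0)
    values = [float(t) for t in tokens[pos:]]
    # The number of cell values must equal rows times columns.
    if len(values) != nrows * ncols:
        raise ValueError("expected %d cell values, found %d" % (nrows * ncols, len(values)))
    # The column count decides where each row ends; row 1 is the top of the grid.
    rows = [
        [None if v == nodata else v for v in values[r * ncols:(r + 1) * ncols]]
        for r in range(nrows)
    ]
    return header, rows

# A tiny made-up grid with 3 columns and 2 rows.
sample = """ncols 3
nrows 2
xllcorner -128.66338
yllcorner 13.7502065
cellsize 0.2
NODATA_value -9999
-9999 -123 -123
-117 -119 -9999
"""
header, rows = read_ascii_grid(sample)
```

Cells equal to the nodata value are returned as None, so later code can tell "unknown" apart from a genuine measurement.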


How it works

This is a very brief description -- for more details, please see the papers cited above.  Here we first describe an unregularized version (with the regularization value set to zero); in practice, we always recommend using regularization. Without regularization, the distribution being computed is the one that has maximum entropy among those satisfying the constraints that the expectation of each feature matches its empirical average.  This distribution can be proved to be the same as the Gibbs distribution that maximizes the product of the probabilities of the sample locations, where a Gibbs distribution takes the form

   P(x) = exp(c1 * f1(x) + c2 * f2(x) + c3 * f3(x) ...) / Z

Here c1, c2, ... are constants, f1, f2, ... are the features, and Z is a scaling constant that ensures that P sums to 1 over all grid cells.  The algorithm that is implemented by this program is guaranteed to converge to values of c1, c2, ..., that give the (unique) optimum distribution P.
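
To make the formula concrete, here is a minimal Python sketch that evaluates a Gibbs distribution for given constants. It only evaluates P; it does not implement the fitting algorithm. The function name and the feature values are made up for illustration.

```python
import math

def gibbs_distribution(features, coefficients):
    """features[i][j] is the value of feature j at grid cell i; returns P over all cells."""
    exponents = [sum(c * f for c, f in zip(coefficients, row)) for row in features]
    # Subtract the largest exponent before exponentiating, for numerical stability.
    # This rescales Z but leaves P unchanged.
    top = max(exponents)
    weights = [math.exp(e - top) for e in exponents]
    z = sum(weights)
    return [w / z for w in weights]

# Three grid cells, two features per cell (illustrative values).
features = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
p = gibbs_distribution(features, [2.0, -1.0])
uniform = gibbs_distribution(features, [0.0, 0.0])
```

With all constants set to zero the result is the uniform distribution, which is the starting point of the iterations described in this section.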

For each species, the program starts with a uniform distribution, and performs a number of iterations, each of which increases the probability of the sample locations for the species.  The probability is displayed in terms of "gain", which is the log of the number of grid cells minus the log loss (average of the negative log probabilities of the sample locations).  The gain starts at zero (the gain of the uniform distribution), and increases as the program increases the probabilities of the sample locations.  The gain increases iteration by iteration, until the change from one iteration to the next falls below the convergence threshold, or until the maximum number of iterations has been performed.

In the regularized case, the gain is lower by an additional term, which is the weighted sum of the absolute values of c1, c2, ... .  This limits overfitting and prevents c1, c2, ...  from becoming arbitrarily large. Minimizing the regularized loss (or equivalently, maximizing the regularized gain) corresponds to maximizing the entropy of the distribution subject to a relaxed constraint that feature expectations be only close to feature averages over sample locations rather than exactly equal to them.
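
The gain and its regularized counterpart can be written down directly from these definitions. The sketch below is illustrative only: the function names are hypothetical, and the argument weights stands for the per-feature regularization constants used in the weighted sum.

```python
import math

def gain(probabilities, sample_cells):
    """Log of the number of grid cells minus the log loss over the sample locations."""
    log_loss = -sum(math.log(probabilities[i]) for i in sample_cells) / len(sample_cells)
    return math.log(len(probabilities)) - log_loss

def regularized_gain(probabilities, sample_cells, coefficients, weights):
    """Gain reduced by the weighted sum of the absolute values of the coefficients."""
    penalty = sum(w * abs(c) for w, c in zip(weights, coefficients))
    return gain(probabilities, sample_cells) - penalty

# Four grid cells; the species was observed in cells 2 and 3 (made-up data).
uniform = [0.25, 0.25, 0.25, 0.25]
fitted = [0.1, 0.1, 0.4, 0.4]
samples = [2, 3]
g_uniform = gain(uniform, samples)
g_fitted = gain(fitted, samples)
g_regularized = regularized_gain(fitted, samples, [1.5, -0.5], [0.1, 0.1])
```

Here the uniform distribution has gain zero, and the distribution that concentrates probability on the sample cells has positive gain, reduced by 0.2 once the penalty is applied.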

 

Regularization and feature class selection

The predictive performance of MaxEnt is influenced by the choice of feature types and the regularization constants.  Here we describe the default settings, which can be overridden, if desired, using the command line flags described below.  By default (i.e., when using "Auto features"), all feature types are used when there are at least 80 training samples;  from 15 to 79 samples, linear, quadratic and hinge features are used;  from 10 to 14 samples, linear and quadratic features are used;  below 10 samples, only linear features are used.
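
The default selection rule can be summarized in a few lines of Python. This is a sketch of the rule as stated here, not the program's code: the function name is hypothetical, and the list returned for "all feature types" is assumed from the feature flags (linear, quadratic, product, threshold, hinge) in the batch-mode flag table.

```python
def auto_feature_classes(n_samples):
    """Feature classes used by default for a given number of training samples."""
    if n_samples >= 80:
        # "All feature types": assumed to be the five classes with their own flags.
        return ["linear", "quadratic", "product", "threshold", "hinge"]
    if n_samples >= 15:
        return ["linear", "quadratic", "hinge"]
    if n_samples >= 10:
        return ["linear", "quadratic"]
    return ["linear"]

small = auto_feature_classes(8)
medium = auto_feature_classes(40)
large = auto_feature_classes(120)
```

The cut-offs 10, 15 and 80 correspond to the defaults of the l2lqthreshold, hingethreshold and lq2lqptthreshold flags.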

The default value of each regularization constant (the weight applied to the absolute value of the corresponding c1, c2, ... in the regularized gain described above) is an empirically tuned value (called "beta", which depends on the feature type and the number of samples) divided by the square root of the number of samples.  The default values of beta for the various feature types are given in the following tables, with interpolation in between:


Linear (2-9 samples)
Sample size | 0   | 10  | 30  | 100+
Beta        | 1.0 | 1.0 | 0.2 | 0.05

Linear + Quadratic (10-79 samples)
Sample size | 0   | 10  | 17  | 30   | 100+
Beta        | 1.3 | 0.8 | 0.5 | 0.25 | 0.05

Linear + Quadratic + Product (80+ samples)
Sample size | 0   | 10  | 17  | 30   | 100+
Beta        | 2.6 | 1.6 | 0.9 | 0.55 | 0.05

Threshold (80+ samples)
Sample size | 0   | 100+
Beta        | 2.0 | 1.0

Hinge (15+ samples)
Sample size | 0+
Beta        | 0.5

Categorical (15+ samples)
Sample size | 0+   | 10  | 17+
Beta        | 0.65 | 0.5 | 0.25
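
As a worked example, the sketch below looks up beta in the first table and divides it by the square root of the number of samples. It assumes that "interpolation in between" means linear interpolation between adjacent table entries, with the last value held constant beyond the final entry. The function names are hypothetical.

```python
import math

# (sample size, beta) pairs from the first table above.
LINEAR_BETA = [(0, 1.0), (10, 1.0), (30, 0.2), (100, 0.05)]

def interpolate_beta(table, n_samples):
    """Beta for n_samples, assuming linear interpolation between table entries."""
    if n_samples >= table[-1][0]:
        return table[-1][1]
    for (x0, y0), (x1, y1) in zip(table, table[1:]):
        if x0 <= n_samples <= x1:
            return y0 + (y1 - y0) * (n_samples - x0) / (x1 - x0)
    return table[0][1]

def default_regularization(table, n_samples):
    """Default regularization value: beta divided by the square root of the sample count."""
    return interpolate_beta(table, n_samples) / math.sqrt(n_samples)

beta_20 = interpolate_beta(LINEAR_BETA, 20)
reg_100 = default_regularization(LINEAR_BETA, 100)
```

Under the linear-interpolation assumption, 20 samples fall halfway between the entries for 10 and 30, giving a beta of 0.6.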

Projections

The values of c1, c2, ... and Z that were computed for features derived from the "Environmental layers" are used to compute weights using the layers in the "Projection directory".  Note that these weights are not probabilities and they need not sum to one since they use the normalization constant computed for "Environmental layers" rather than the one for "Projection directory". Their relative magnitudes represent how much a given locale is favored by the species over another locale. For each species, the weights are written in a file mySpecies_<dir>.asc in the output directory, where <dir> is the name of the projection directory.

By default, two kinds of "clamping" are done during the projection process.  First, the environmental layers are clamped: if a layer in the projection directory has values that are greater than the maximum of the corresponding layer used during training, those values are reduced to the maximum, and similarly for values below the corresponding minimum.  Second, features are also clamped: if a feature derived from the projection layers has value greater than its maximum on the training data, it is reduced to the maximum, and similarly for values below the corresponding minimum. Both forms of clamping help to alleviate problems that can arise from making predictions outside the range of data used in training the model.
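
Clamping of a single layer can be sketched as follows. The function names and values are made up for illustration; the same operation applies to a feature, using the feature's minimum and maximum on the training data.

```python
def clamp(value, lo, hi):
    """Restrict value to the interval [lo, hi]."""
    return max(lo, min(hi, value))

def clamp_layer(projection_values, training_values):
    """Clamp projection-layer values to the range seen in the training layer."""
    lo, hi = min(training_values), max(training_values)
    return [clamp(v, lo, hi) for v in projection_values]

# The training layer ranges from 10 to 30; the projection layer goes beyond both ends.
clamped = clamp_layer([5, 15, 25, 40], [10, 20, 30])
```

Values inside the training range pass through unchanged, while 5 is raised to 10 and 40 is reduced to 30.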

Background Points

As described above, the maxent distribution is calculated over the set of pixels that have data for all environmental variables.  However, if the number of pixels is very large, processing time increases without a significant improvement in modeling performance.  For that reason, when the number of pixels with data is larger than 10,000 a random sample of 10,000 "background" pixels is used to represent the variety of environmental conditions present in the data.  The maxent distribution is then computed over the union of the "background" pixels and the samples for the species being modeled.  The number 10,000 can be changed from the "Settings" panel, or by using a command-line flag: see the batch-mode section below.
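
The selection of background pixels can be sketched as below. This is an illustration under simplifying assumptions, not the program's implementation: pixels are identified by integer ids, the function name is hypothetical, and a sample is added to the background whenever its pixel id is not already present.

```python
import random

def choose_background(cells_with_data, sample_cells, max_background=10000, seed=0):
    """Pick at most max_background random pixels, then add the species samples."""
    rng = random.Random(seed)
    if len(cells_with_data) > max_background:
        background = rng.sample(cells_with_data, max_background)
    else:
        background = list(cells_with_data)
    # Union with the sample locations, without duplicating pixels.
    present = set(background)
    for cell in sample_cells:
        if cell not in present:
            background.append(cell)
            present.add(cell)
    return background

# 50,000 pixels with data, three sample locations (made-up ids).
cells = list(range(50000))
points = choose_background(cells, [1, 2, 3])
```

A landscape with fewer pixels than the limit is used whole, so the limit only matters for large grids.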

Memory Issues

If the environmental layers are very large files, you may get "out of memory" or "heap space" errors when you try to run the program.  There are a number of ways to address this problem.  

Format of the lambda file

The coefficients of the Maxent model for a species are output in a file called species.lambdas. The entries in the lambdas file are lines of the form: feature, lambda, min, max. The exponent of the Maxent model is calculated as

exponent = lambda_1 * (f_1(x) - min_1)/(max_1 - min_1) + ... + lambda_n * (f_n(x) - min_n)/(max_n - min_n) - linearPredictorNormalizer

In other words, features are scaled so that their values would lie between 0 and 1 on the training data. By default, all features are clamped prior to projection of the model onto new data - see section "Projections" above. The linearPredictorNormalizer is a constant chosen so that the exponent is always non-positive (for numerical stability). Terms corresponding to hinge features are evaluated slightly differently. For example, the hinge feature prec' derived from the layer prec and described by the line: prec', lambda, min, max evaluates to the term

lambda * clamp_at_0(prec-min)/(max-min)

i.e., if prec < min then the value is 0; otherwise it is (prec-min)/(max-min). For the reverse hinge feature prec`, lambda, min, max, the term is

lambda * clamp_at_0(max-prec)/(max-min)

The densityNormalizer is the normalization term Z calculated over the background. The Maxent raw output is therefore:

raw = exp(sum_i lambda_i * (f_i(x) - min_i)/(max_i - min_i) - linearPredictorNormalizer) / densityNormalizer

Lastly, logistic output is calculated using the entropy given at the end of the lambda file: logistic = raw * exp(entropy) / (1 + raw * exp(entropy)).
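
The calculation can be sketched in Python as follows. The sketch takes the entries as already-parsed tuples and does not read a lambdas file. It covers only features named after a layer and the two hinge forms described above; other feature types are omitted. All names and numbers are made up for illustration.

```python
import math

def term(name, lam, lo, hi, layers):
    """Contribution of one lambdas entry (feature, lambda, min, max) to the exponent."""
    if name.endswith("'"):   # hinge feature: 0 below min
        return lam * max(0.0, layers[name[:-1]] - lo) / (hi - lo)
    if name.endswith("`"):   # reverse hinge feature: 0 above max
        return lam * max(0.0, hi - layers[name[:-1]]) / (hi - lo)
    return lam * (layers[name] - lo) / (hi - lo)

def raw_output(entries, layers, linear_predictor_normalizer, density_normalizer):
    """Raw output for one set of environmental conditions."""
    exponent = sum(term(n, lam, lo, hi, layers) for n, lam, lo, hi in entries)
    return math.exp(exponent - linear_predictor_normalizer) / density_normalizer

def logistic_output(raw, entropy):
    """Logistic output from the raw output and the entropy of the raw distribution."""
    scaled = raw * math.exp(entropy)
    return scaled / (1 + scaled)

# Hypothetical entries: one plain feature, one hinge and one reverse hinge feature.
entries = [("prec", 2.0, 0.0, 100.0), ("temp'", 1.0, 10.0, 30.0), ("temp`", -0.5, 5.0, 30.0)]
layers = {"prec": 50.0, "temp": 20.0}
raw = raw_output(entries, layers, linear_predictor_normalizer=2.0, density_normalizer=100.0)
logistic = logistic_output(raw, entropy=5.0)
```

In this example the three terms are 1.0, 0.5 and -0.2, so the exponent is 1.3 - 2.0 = -0.7 before dividing by the density normalizer.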



Batch mode

All parts of the interface can be set from the command line, and the Run button can be automatically pressed after startup.  This allows for the program to be invoked in batch mode, multiple times in sequence, if required.  The command line flags can also be added to the maxent.bat file, at the end of the "java ..." line, to change the default settings of the program. Some of the more common flags have abbreviations that can be used instead of the full flag. As an example, the following two invocations are equivalent:

java -mx512m -jar maxent.jar environmentallayers=layers samplesfile=samples\bradypus.csv outputdirectory=outputs togglelayertype=ecoreg redoifexists autorun

java -mx512m -jar maxent.jar -e layers -s samples\bradypus.csv -o outputs -t ecoreg -r -a

Any boolean flag can be given the prefix "no" or "dont" to turn the flag off. Some abbreviations turn boolean flags off, and are shown preceded by "!" in the table below. The available flags are, in no particular order:


Flag | Abbrv | Type | Default | Meaning
responsecurves | P | boolean | false | Create graphs showing how predicted relative probability of occurrence depends on the value of each environmental variable
pictures | K | boolean | true | Create a .png image for each output grid
jackknife | J | boolean | false | Measure importance of each environmental variable by training with each environmental variable first omitted, then used in isolation
outputformat | | string | logistic | Representation of probabilities used in writing output grids. See Help for details
outputfiletype | | string | asc | File format used for writing output grids
outputdirectory | o | directory | | Directory where outputs will be written. This should be different from the environmental layers directory.
projectionlayers | j | file/directory | | Location of an alternate set of environmental variables. Maxent models will be projected onto these variables. Can be a .csv file (in SWD format) or a directory containing one file per variable. Multiple projection files/directories can be separated by commas.
samplesfile | s | file | | Please enter the name of a file containing presence locations for one or more species.
environmentallayers | e | file/directory | | Environmental variables can be in a directory containing one file per variable, or all together in a .csv file in SWD format. Please enter a directory name or file name.
randomseed | | boolean | false | If selected, a different random seed will be used for each run, so a different random test/train partition will be made and a different random subset of the background will be used, if applicable.
logscale | | boolean | true | If selected, all pictures of models will use a logarithmic scale for color-coding.
warnings | | boolean | true | Pop up windows to warn about potential problems with input data. Regardless of this setting, warnings are always printed to the log file.
tooltips | | boolean | true | Show messages that explain various parts of the interface, like this message
askoverwrite | !r | boolean | true | If output files already exist for a species being modeled, pop up a window asking whether to overwrite or skip. Default is to overwrite.
skipifexists | S | boolean | false | If output files already exist for a species being modeled, skip the species without remaking the model.
removeduplicates | u | boolean | true | Remove duplicate presence records. If environmental data are in grids, duplicates are records in the same grid cell. Otherwise, duplicates are records with identical coordinates.
writeclampgrid | | boolean | true | Write a grid that shows the spatial distribution of clamping. At each point, the value is the absolute difference between prediction values with and without clamping.
randomtestpoints | X | integer | 0 | Percentage of presence localities to be randomly set aside as test points, used to compute AUC, omission etc.
betamultiplier | b | double | 1.0 | Multiply all automatic regularization parameters by this number. A higher number gives a more spread-out distribution.
maximumbackground | MB | integer | 10000 | If the number of background points / grid cells is larger than this number, then this number of cells is chosen randomly for background points
biasfile | | file | | Sampling is assumed to be biased according to the sampling distribution given in this grid file. Values in this file must not be zero or negative. MaxEnt will factor out the bias. Requires environmental data to be in grids, rather than a SWD format file
testsamplesfile | T | file | | Use the presence localities in this file to compute statistics (AUC, omission etc.) The file can contain different localities for different species. It takes precedence over the random test percentage.
replicates | | integer | 1 | Number of replicate runs to do when cross-validating, bootstrapping or doing sampling with replacement runs
replicatetype | | string | crossvalidate | If replicates > 1, do multiple runs of this type: Crossvalidate: samples divided into replicates folds; each fold in turn used for test data. Bootstrap: replicate sample sets chosen by sampling with replacement. Subsample: replicate sample sets chosen by removing random test percentage without replacement to be used for evaluation.
perspeciesresults | | boolean | false | Write separate maxentResults file for each species
writebackgroundpredictions | | boolean | false | Write .csv file with predictions at background points
responsecurvesexponent | | boolean | false | Instead of showing the logistic value for the y axis in response curves, show the exponent (a linear combination of features)
linear | !l | boolean | true |
quadratic | !q | boolean | true |
product | !p | boolean | true |
threshold | | boolean | true |
hinge | !h | boolean | true |
addsamplestobackground | !d | boolean | true | Add to the background any sample that has a combination of environmental values that isn't already present in the background
addallsamplestobackground | | boolean | false | Add all samples to the background, even if they have combinations of environmental values that are already present in the background
autorun | a | boolean | false | Start running as soon as the program starts up
writeplotdata | | boolean | false | Write output files containing the data used to make response curves, for import into external plotting software
fadebyclamping | | boolean | false | Reduce prediction at each point in projections by the difference between clamped and non-clamped output at that point
extrapolate | | boolean | true | Predict to regions of environmental space outside the limits encountered during training
visible | !z | boolean | true | Make the Maxent user interface visible
autofeature | !A | boolean | true | Automatically select which feature classes to use, based on number of training samples
doclamp | | boolean | true | Apply clamping when projecting
outputgrids | !x | boolean | true | Write output grids. Turning this off when doing replicate runs causes only the summary grids (average, std deviation etc.) to be written, not those for the individual runs.
plots | | boolean | true | Write various plots for inclusion in .html output
maximumiterations | m | integer | 500 | Stop training after this many iterations of the optimization algorithm
convergencethreshold | c | double | 1.0E-5 | Stop training when the drop in log loss per iteration falls below this number
adjustsampleradius | | integer | 0 | Add this number of pixels to the radius of white/purple dots for samples on pictures of predictions. Negative values reduce size of dots.
threads | | integer | 1 | Number of processor threads to use. Matching this number to the number of cores on your computer speeds up some operations, especially variable jackknifing.
lq2lqptthreshold | | integer | 80 | Number of samples at which product and threshold features start being used
l2lqthreshold | | integer | 10 | Number of samples at which quadratic features start being used
hingethreshold | | integer | 15 | Number of samples at which hinge features start being used
beta_threshold | | double | -1.0 | Regularization parameter to be applied to all threshold features; negative value enables automatic setting
beta_categorical | | double | -1.0 | Regularization parameter to be applied to all categorical features; negative value enables automatic setting
beta_lqp | | double | -1.0 | Regularization parameter to be applied to all linear, quadratic and product features; negative value enables automatic setting
beta_hinge | | double | -1.0 | Regularization parameter to be applied to all hinge features; negative value enables automatic setting
logfile | | string | maxent.log | File name to be used for writing debugging information about a run in output directory
cache | | boolean | true | Make a .mxe cached version of ascii files, for faster access
applythresholdrule | | string | | Apply a threshold rule, generating a binary output grid in addition to the regular prediction grid.
togglelayertype | t | string | | Toggle continuous/categorical for environmental layers whose names begin with this prefix (default: all continuous)
togglespeciesselected | E | string | | Toggle selection of species whose names begin with this prefix (default: all selected)
togglelayerselected | N | string | | Toggle selection of environmental layers whose names begin with this prefix (default: all selected)
verbose | v | boolean | false | Give detailed diagnostics for debugging
allowpartialdata | | boolean | false | During model training, allow use of samples that have nodata values for one or more environmental variables.
prefixes | | boolean | true | When toggling samples or layers selected or layer types, allow toggle string to be a prefix rather than an exact match.
printversion | | boolean | false | Print Maxent software version number and exit
nodata | n | integer | -9999 | Value to be interpreted as nodata values in SWD sample data
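
For a sequence of runs that vary one setting, the command lines can be generated by a small script. The Python sketch below only builds the argument lists. The helper maxent_command and the output directory names are hypothetical; the flags are taken from the table above and from the example invocations, with boolean flags turned off using the "no" prefix.

```python
def maxent_command(settings, jar="maxent.jar", memory="512m"):
    """Build a java command line from a dict mapping flag names to values."""
    args = ["java", "-mx" + memory, "-jar", jar]
    for flag, value in settings.items():
        if value is True:
            args.append(flag)            # boolean flag switched on
        elif value is False:
            args.append("no" + flag)     # boolean flag switched off
        else:
            args.append("%s=%s" % (flag, value))
    return args

# One run per regularization multiplier, each with its own output directory.
commands = [
    maxent_command({
        "environmentallayers": "layers",
        "samplesfile": "samples/bradypus.csv",
        "outputdirectory": "outputs_beta_%s" % beta,
        "betamultiplier": beta,
        "visible": False,
        "redoifexists": True,
        "autorun": True,
    })
    for beta in (0.5, 1.0, 2.0)
]
```

Each list can then be passed to subprocess.run(command, check=True) so that the runs execute one after another; each output directory is assumed to exist already.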