ga_optimization
- PySeismoSoil.helper_site_response.ga_optimization(n_param: int, lower_bound: float, upper_bound: float, loss_function: Callable[[float, ...], float], damping_data: ndarray, use_scipy: bool = True, pop_size: int = 100, n_gen: int = 100, eta: float = 0.1, seed: int = 0, crossover_prob: float = 0.8, mutation_prob: float = 0.8, suppress_warnings: bool = True, verbose: bool = False, parallel: bool = False, n_cores: int | None = None) → list[float] | numpy.ndarray
Perform a genetic algorithm (GA) process to fit the data.
It supports any loss function, including non-differentiable or non-parametric ones, as long as the loss function maps the model parameters to a loss value.
The evolutionary process generated by this function consists of mutation and crossover, applied in a uniform fashion within the specified bounds.
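As a rough illustration of what mutation and crossover within a single shared range can look like, here is a minimal, standard-library-only sketch. It is not the algorithm used by this function, by scipy, or by DEAP: `toy_ga`, `sphere`, the tournament selection, and the default values are all assumptions made for this example. It shows only one possible reading of "uniform" (uniform crossover plus uniform re-sampling of genes inside the bounds).

```python
import random


def toy_ga(n_param, lower_bound, upper_bound, loss, pop_size=40, n_gen=60,
           crossover_prob=0.8, mutation_prob=0.2, seed=0):
    """Toy genetic algorithm (hypothetical, for illustration only)."""
    rng = random.Random(seed)

    def random_individual():
        # Every parameter is drawn from the same shared range.
        return [rng.uniform(lower_bound, upper_bound) for _ in range(n_param)]

    population = [random_individual() for _ in range(pop_size)]
    best = min(population, key=loss)

    for _ in range(n_gen):
        children = []
        while len(children) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            parent_a = min(rng.sample(population, 3), key=loss)
            parent_b = min(rng.sample(population, 3), key=loss)
            child = list(parent_a)
            if rng.random() < crossover_prob:
                # Uniform crossover: each gene comes from either parent.
                child = [a if rng.random() < 0.5 else b
                         for a, b in zip(parent_a, parent_b)]
            # Uniform mutation: redraw a gene anywhere inside the bounds.
            child = [rng.uniform(lower_bound, upper_bound)
                     if rng.random() < mutation_prob else gene
                     for gene in child]
            children.append(child)
        population = children
        # Keep the best individual seen so far across all generations.
        best = min(population + [best], key=loss)
    return best


def sphere(params):
    """Simple test loss with its minimum at the origin."""
    return sum(x * x for x in params)


best_params = toy_ga(2, -1.0, 1.0, sphere)
```

Because offspring are built only from parent genes or from fresh draws inside `[lower_bound, upper_bound]`, every candidate stays within the search range without any extra clipping step.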
- Parameters:
n_param (int) – Number of parameters in the model.
lower_bound (float) – Lower bound of the search range (i.e., the range within which the evolution of parameter values is constrained). Note that all the model parameters share this range. You cannot have a different range for each parameter.
upper_bound (float) – Upper bound of the search range (i.e., the range within which the evolution of parameter values is constrained). Note that all the model parameters share this range. You cannot have a different range for each parameter.
loss_function (Callable[[float, ...], float]) – Function to be minimized by the genetic algorithm. It should map a set of parameters to a loss value. It takes a tuple/list of all the parameters and the damping data as input, and it needs to return a single float.
damping_data (np.ndarray) – Damping data for curve fitting. Must have two columns (strain and damping), with values in units of 1 (not percent).
use_scipy (bool) – Whether to use the “differential_evolution” algorithm implemented in scipy (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html) to perform the optimization. If False, use the algorithm implemented in the DEAP package.
pop_size (int) – The number of individuals in a generation. A larger number leads to potentially better curve-fitting, but a longer computing time.
n_gen (int) – Number of generations that the evolution lasts. A larger number leads to potentially better curve-fitting, but a longer computing time. If use_scipy is True (using “differential evolution”), n_gen means the maximum number of generations, i.e., the evolution could end early if no loss reduction is found.
eta (float) – Crowding degree of the mutation or crossover. A high eta will produce children resembling their parents, while a low eta will produce solutions that differ much more from them. (Only effective if use_scipy is False.)
seed (int) – Seed value for the random number generator.
crossover_prob (float) – Probability of cross-over. “Cross-over” means producing offspring from more than one parent. Larger values introduce more demographic diversity into the evolutionary process, which could help escape local minima, but at the cost of slower convergence.
mutation_prob (float) – Probability of mutation. Larger values introduce more demographic diversity into the evolutionary process, which could help escape the local minima, but at a cost of converging slower. (mutation_prob is only effective when use_scipy is False.)
suppress_warnings (bool) – Whether to suppress warning messages.
verbose (bool) – Whether to display information (statistics of the loss in each generation) on the console.
parallel (bool) – Whether to use parallel computing to simultaneously evaluate different individuals in a population. Note that different generations still evolve one after another. Only effective for the differential evolution for now. Also note that if using parallelization in differential evolution, you may need more generations to achieve the same optimization loss, because the best solution is being updated only once per generation.
n_cores (int | None) – Number of CPU cores to use. If None, all cores are used. No effect if parallel is set to False.
- Returns:
opt_result – The optimization result: an array of parameters that gives the lowest loss.
- Return type:
list[float] | np.ndarray
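The following is a minimal sketch of a loss function that matches the description above: it takes a tuple/list of parameters plus the two-column damping data and returns a single float. The damping model (`toy_damping_model`), its three parameters, the penalty value, and the synthetic data are all hypothetical and chosen only for illustration; they are not part of PySeismoSoil. The call to `ga_optimization` is shown in comments and is not executed here; its argument names follow the signature documented above.

```python
import math


def toy_damping_model(strain, xi_min, xi_max, gamma_ref):
    """Hypothetical hyperbolic-type damping curve (illustration only)."""
    ratio = strain / gamma_ref
    return xi_min + (xi_max - xi_min) * ratio / (1.0 + ratio)


def toy_loss(params, damping_data):
    """Root-mean-square misfit between the toy model and the damping data.

    params: tuple/list (xi_min, xi_max, gamma_ref), all in units of 1.
    damping_data: rows of (strain, damping), both in units of 1.
    """
    xi_min, xi_max, gamma_ref = params
    if gamma_ref <= 0.0:
        # Assumed penalty: avoids division by zero at the lower bound.
        return 1e6
    squared_errors = [
        (toy_damping_model(strain, xi_min, xi_max, gamma_ref) - damping) ** 2
        for strain, damping in damping_data
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic data generated from known (made-up) parameters.
true_params = (0.01, 0.25, 0.001)
strains = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2]
synthetic_data = [(s, toy_damping_model(s, *true_params)) for s in strains]

loss_at_truth = toy_loss(true_params, synthetic_data)
loss_elsewhere = toy_loss((0.02, 0.20, 0.005), synthetic_data)

# With PySeismoSoil and NumPy installed, a call might look like this
# (not executed here; all three parameters share the range [0, 1]):
#
#   import numpy as np
#   from PySeismoSoil.helper_site_response import ga_optimization
#   opt_result = ga_optimization(
#       n_param=3, lower_bound=0.0, upper_bound=1.0,
#       loss_function=toy_loss, damping_data=np.array(synthetic_data),
#   )
```

Since every parameter shares one search range, a model like this one works best when its parameters have comparable scales; otherwise, rescaling inside the loss function is one way to cope with the single pair of bounds.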