Evolutionary Computing is underrated. Common objections include “Evolution takes thousands of years” and “I don’t see the point.”
These techniques are (or should be) very important to the AI community because they are primitive precursors of intelligence. Understanding Evolutionary Computation (EC) will make LLMs easier to understand.
So that we don’t have to redefine “AI”, let’s now define a new term: SHAG. It stands for “SuperHuman Answer Generator”.
A SHAG is a computer-based system that can provide answers to questions for which the clients or programmers using the system cannot (or need not) compute the answers themselves. More briefly: a SHAG generates answers that humans cannot generate.
Naturally, all LLMs are SHAGs: they can generate answers that their programmers cannot, such as understanding and replying in Finnish.
Many people are even more surprised to learn that all SHAGs (and therefore all LLMs) are holistic. Because SHAGs are holistic, we can identify some problems that all holistic systems share (discussed in The Red Pill):
– The answers provided may not be accurate.
– It is not known whether the provided answer is optimal, complete, repeatable, concise, explainable, or transparent.
As evidence, we recognize that these problems are currently endemic to LLMs.
But are there SHAGs that are not LLMs? Besides today’s LLMs (including my Deep-Discrete-Neuron-Network based LLMs), my list of SHAGs includes Genetic Algorithms (GA), Genetic Programming (GP), and Simulated Annealing (SA). In what follows we will mainly discuss GAs.
Here is my favorite way to use a GA:
– Don’t define the solution itself; instead, define the individuals (candidate solutions) that must compete for survival, and initialize each one randomly for diversity.
– Create a population of these and store them all in an array. Let’s say the population size is 1000.
– Define an objective function that returns a number indicating how good (fit) the individual is.
– Define a crossover function that breeds two successful individuals together to produce (we hope) better offspring.
– Define a mutation function that reintroduces diversity into some individuals.
Repeat the following until the system stops improving; this can be detected by checking that the elite no longer changes. For a well-behaved problem this may take about 10 cycles:
– Compute each individual’s fitness using the objective function.
– Sort the array by fitness.
– Start with the worst individuals and work your way toward better ones.
– Replace each individual with the offspring (via crossover) of two superior individuals.
– Stop replacing slightly below the top (to preserve the “elite”).
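The loop above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the names (run_ga, POP_SIZE, ELITE_COUNT), the mutation rate, and the staleness test are my own choices, and the problem-specific pieces (individual creation, fitness, crossover, mutation) are passed in as functions.

```python
import random

POP_SIZE = 1000        # population size, as in the recipe above
ELITE_COUNT = 10       # top individuals preserved unchanged each generation
MUTATION_RATE = 0.02   # fraction of offspring that get mutated

def run_ga(random_individual, fitness, crossover, mutate, max_stale=10):
    """Run the generational loop until the best fitness stops improving."""
    population = [random_individual() for _ in range(POP_SIZE)]
    best_fit, stale = None, 0
    while stale < max_stale:
        population.sort(key=fitness, reverse=True)   # best individuals first
        top = fitness(population[0])
        stale = stale + 1 if top == best_fit else 0  # count stagnant generations
        best_fit = top
        # Replace the worst individuals, working upward, stopping below the elite.
        for i in range(POP_SIZE - 1, ELITE_COUNT - 1, -1):
            p1, p2 = random.sample(population[:i], 2)  # two superior parents
            child = crossover(p1, p2)
            if random.random() < MUTATION_RATE:
                child = mutate(child)
            population[i] = child
    population.sort(key=fitness, reverse=True)
    return population[0]
```

Note that the elite (the first ELITE_COUNT array slots) is never overwritten inside the replacement loop, so the best fitness seen can never decrease.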
All GAs use individuals that contain some kind of genome. Part of the design challenge for the crossover and objective functions is that the practitioner must understand the problem well enough to determine not only which individual (viewed as a candidate solution) is better, but also the parameters that shape the solution.
A simple GA actually has no phenotype; the objective function evaluates genotypes directly. This is a very radical shortcut, but it is a method that is actually starting to be used in wet-lab genomics: breeders can create crops with better yields without having to grow the seeds for a year, because they know what higher-yield genomes look like.
Suppose we use a GA to design a rectangular box that can hold 200 pounds of grain (with a known average density) as cheaply as possible, optimizing for transportation cost. The genome contains the X, Y, and Z dimensions of the box, which are initially completely random for each individual. The objective function returns a fitness of 0 for any box with insufficient volume for the required amount of grain; otherwise it returns a fitness based on the shipping cost, calculated in the traditional way from the box’s length and girth.
A crossover function might take X from one parent and Y and Z from the other, or sometimes X and Y from one and Z from the other. Each parent has better fitness than the individual being replaced. We hope that recombination will exploit the partial solutions of the parents to produce even better offspring.
We’ll see very quickly that the best boxes of each generation get smaller and cheaper to ship. Once the elite is stable and its members all have the same X, Y, and Z, we have found the optimal solution.
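The box-design GA can be sketched end to end in Python. Everything here is an illustrative assumption rather than the setup described above: the required volume, the cost model (length plus girth, a common carrier convention), and all constants are invented for the sake of the example.

```python
import random

REQUIRED_VOLUME = 4.0            # hypothetical: cubic feet needed for 200 lb of grain
ELITE, POP, GENERATIONS = 10, 1000, 50

def random_box():
    return [random.uniform(0.1, 10.0) for _ in range(3)]  # X, Y, Z in feet

def shipping_cost(box):
    # Hypothetical cost model: carriers commonly price by length plus girth.
    length = max(box)
    girth = 2.0 * (sum(box) - length)   # perimeter of the two smaller dimensions
    return length + girth

def fitness(box):
    x, y, z = box
    if x * y * z < REQUIRED_VOLUME:
        return 0.0                      # too small to hold the grain at all
    return 1.0 / shipping_cost(box)     # cheaper boxes are fitter

def crossover(p1, p2):
    cut = random.choice([1, 2])         # X from one parent, or X and Y, rest from the other
    return p1[:cut] + p2[cut:]

def mutate(box):
    box = box[:]
    box[random.randrange(3)] *= random.uniform(0.8, 1.25)  # jiggle one dimension
    return box

pop = [random_box() for _ in range(POP)]
for _ in range(GENERATIONS):
    pop.sort(key=fitness, reverse=True)
    for i in range(ELITE, POP):         # replace everyone below the elite
        child = crossover(*random.sample(pop[:ELITE], 2))
        if random.random() < 0.02:
            child = mutate(child)
        pop[i] = child
best = max(pop, key=fitness)
```

Under this cost model the optimum is a cube, and the evolved best box should approach one.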
Now consider a larger problem involving 500 numerical parameters and an objective function that uses all of them. Developing that function may be expensive, but if it is the only way forward, we adopt it. A well-behaved problem converges quickly.
A typical individual keeps its 500 values in an array (similar to genes on a chromosome), and crossover brutally creates new individuals by alternating segments of “DNA” from one parent with segments from the other, at randomly selected cut points.
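A sketch of such a segment-alternating crossover for array genomes; the n_cuts parameter (number of random cut points) is a hypothetical knob added for illustration:

```python
import random

def crossover(parent_a, parent_b, n_cuts=3):
    """Build a child by alternating segments of the two parents' arrays,
    switching source at randomly chosen cut points."""
    length = len(parent_a)
    cuts = sorted(random.sample(range(1, length), n_cuts))  # distinct interior cuts
    child, take_from_a, prev = [], True, 0
    for cut in cuts + [length]:
        source = parent_a if take_from_a else parent_b
        child.extend(source[prev:cut])   # copy one segment from the current parent
        take_from_a = not take_from_a    # switch parents at the cut point
        prev = cut
    return child
```

Every position in the child comes from the same position in one of the two parents, which is exactly what lets offspring inherit the traits that made the parents successful.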
Mutations are much less important than crossover, to the point of being optional. Beginners get this backwards, and even textbooks don’t emphasize it enough. If you don’t use crossover, you are doing random search and throwing away the entire point and power of a GA.
At a population of 1000 I keep 10 elites, so each generation I loop and replace the 990 worst individuals with potentially superior offspring. I typically apply mutations to at most a few percent of all individuals.
Offspring must have the opportunity to inherit some of the characteristics that made their parents successful. I’ve seen beginners accidentally discard all record of the parents, creating a crossover function that degrades the system to random search. This matters for tasks such as choosing cut points (when using array representations) where some properties tend to move together for synergy.
You can catch this mistake with a measurement. To test whether a GA works, compare the convergence of the full GA to the convergence of a version where the crossover function is replaced by one that generates new random individuals (as at start-up) with no history from the parents. That is now a random-search system. If it is not significantly slower than the fully functional GA, you will have to start over. This also shows how much EC can speed things up compared to linear or random search (which are equivalent).
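That baseline measurement can be sketched as follows, on a deliberately trivial toy objective (counting 1-bits) so the comparison runs in moments. All names and constants are illustrative assumptions:

```python
import random

GENOME_LEN, POP, ELITE, GENS = 50, 200, 5, 40

def fitness(genome):
    return sum(genome)             # toy objective: count the 1-bits

def real_crossover(p1, p2):
    cut = random.randrange(1, GENOME_LEN)
    return p1[:cut] + p2[cut:]     # child inherits a segment from each parent

def random_restart(p1, p2):
    # A broken "crossover" that ignores its parents entirely,
    # degrading the GA to random search.
    return [random.randint(0, 1) for _ in range(GENOME_LEN)]

def evolve(crossover):
    pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=fitness, reverse=True)
        for i in range(ELITE, POP):
            child = crossover(*random.sample(pop[:POP // 2], 2))  # superior parents
            if random.random() < 0.05:                            # light mutation
                child = child[:]
                j = random.randrange(GENOME_LEN)
                child[j] = 1 - child[j]
            pop[i] = child
    return max(fitness(g) for g in pop)

random.seed(1)
ga_best = evolve(real_crossover)
random_best = evolve(random_restart)
```

If ga_best is not clearly ahead of random_best on your real problem, the crossover function is not transmitting parental traits and needs to be redesigned.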
Now we can address the objection that “evolution takes thousands of years.” Naturally it does, especially for species that only get the opportunity to breed once a year. Your computer works much faster.
Modern CPUs run at 3 GHz (3E9 cps), which is 3,000,000,000 clock cycles per second.
Let’s say we have a problem where computing the objective function takes 1000 clock cycles per individual; this is often a generous estimate. A GA with a population of 1000 can then run 3000 generations per second per thread. If you’re in a hurry, there are GA frameworks that support multithreading and can run in the cloud.
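The arithmetic behind this estimate, as a sketch (both the cycle count per evaluation and the clock rate are the assumptions stated above):

```python
clock_hz = 3e9            # 3 GHz CPU
cycles_per_eval = 1000    # assumed cost of one objective-function call
population = 1000

evals_per_second = clock_hz / cycles_per_eval        # 3,000,000 evaluations/s
generations_per_second = evals_per_second / population
print(int(generations_per_second))  # 3000
```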
So speed is not an issue.
Other objections: the point of a SHAG is that it can provide answers to problems that humans cannot solve, including problems without reliable and complete input data, and NP-Hard and NP-Complete problems such as the Knapsack problem. This is discussed in The Red Pill.
GAs shine in situations where many parameters affect the outcome in complex (even complicated) ways, where no one knows how to find the optimal answer, but where it is relatively cheap to determine how good the answer represented by an individual’s genome is.
If you find yourself in this situation, it may be worth exploring holistic methods, keeping in mind that a GA is sometimes an option. A GA can be a million times cheaper than an LLM for many well-matched tasks, which is economically significant for frequently encountered problems.
So the SHAGs available on computers are LLMs and other (future?) AI, plus GAs, GP, and SA. But the biggest SHAG of all is Darwinian evolution of natural species. Humans obviously could not have created the platypus from scratch, but evolution did. For more about this, check out the following article:
To understand how LLMs solve problems that humans cannot solve on their own (such as protein folding), we must first study a simpler case: genetic algorithms.