Open Access is an initiative that aims to make scientific research freely available to all. To date our community has made over 100 million downloads. It’s based on principles of collaboration, unobstructed discovery, and, most importantly, scientific progression. As PhD students, we found it difficult to access the research we needed, so we decided to create a new Open Access publisher that levels the playing field for scientists across the world. How? By making research easy to access, and puts the academic needs of the researchers before the business interests of publishers.
We are a community of more than 103,000 authors and editors from 3,291 institutions spanning 160 countries, including Nobel Prize winners and some of the world’s most-cited researchers. Publishing on IntechOpen allows authors to earn citations and find new collaborators, meaning more people see your work not only from your own field of study, but from other related fields too.
To purchase hard copies of this book, please contact the representative in India:
CBS Publishers & Distributors Pvt. Ltd.
www.cbspd.com
|
customercare@cbspd.com
Interactive genetic algorithm(IGA) is a kind of genetic algorithms in which individuals’ fitness values are evaluated by human subjectively. They combine canonical genetic algorithms with human intelligent evaluations. Now they have been applied to optimization problems whose objectives can not be expressed by explicit fitness functions, such as music composition(Biles,1996), production design (Takagi,1998), image retrieval (Takagi,1999) and so on. However there exits human fatigue in evaluation, which limits population size and the number of evolution generations. Moreover, subjective fitness values given by human mainly reflect human cognition and preference, which is related to domain knowledge. So IGAs need more knowledge than other genetic algorithms for explicit optimization functions.
In IGAs, human fatigue is a key problem which limits the application of the algorithm. According to physiological characters of human, if human are absorbed in one work for a long time, they are easy to feel tired. However, human need to evaluate each individual in each generation. More individuals human evaluate, they feel more tired. If cognition knowledge about human preference to optimization problems can be extracted from the evolution and utilized to evaluate individuals instead of human, the number of individuals evaluated by human shall decrease which will alleviate human fatigue. This cognition knowledge is generally described by surrogate models.
Surrogate models are trained by samples which consist of human evaluated individuals and their fitness values. Hence, they approximatively reflect human preference. When human feel tired, surrogate models can replace human to evaluate all individuals or part of individuals in each generation so as to reduce the number of individuals evaluated by human. This can reduce human burden and alleviate human fatigue.
Here, four key issues about surrogate models which influenc the performances of the algorithm are taken into account. Firstly, surrogate models must exactly keep consistency with human cognition and preference in order to ensure the convergence of the algorithms. So how to obtain the models with good prediction precision and generalization is the base. The second issue is when to use surrogate models instead of human in evaluation. The last two problems are how many and which individuals are selected to calculate fitness values by surrogate models in each generation.
Up to now, many researches on surrogate models have been done. But most of them focus on the structure of surrogate models. Different models adopting artificial neural networks(Zhou,2005) or support vector machines(Wang,2003) are proposed. However, how to rationally utilize surrogate models are not taken into account enough. Firstly, in existing researches, no matter whether surrogate models exactly reflect human real preference, the models are used to evaluate individuals when human feel tired. Secondly, population size is small and fixed all the time. And the number of individuals evaluated by the models in each generation is also fixed. These make IGAs alleviating human fatigue limitedly. Thirdly, few of researches concern which individuals are selected to be evaluated by models. To solve above problems, latter three issues are mainly studied, including when the models are used in evaluation, how many and which individuals are evaluated by the models in each generation.
In the chapter, assume that human preference to optimization problems is stable. A novel interactive genetic algorithms adopting adaptive evaluation strategy based on surrogate models(AES-IGAs) is proposed. Firstly, two measures reflecting the models’ precision and human fatigue are given. Based on them, evaluation process is divided into three phases. In different phase, human participate in evaluation to different extent. Secondly, the number of individuals evaluated by surrogate models is adaptively tuned in evolution according to above two measures so as to alleviate human fatigue at most. Because surrogate models calculate individuals’ fitness values by computers which do not need human participation, population size is enlarged when only surrogate models are adopted in evaluation so as to improve convergence speed. Thirdly, how to select rational individuals evaluated by the models from population is studied. Fuzzy c-means clustering algorithm is adopted And three kinds of approximate distance is put forward as measures. To validate the rationality of AES-IGAs, experiments based on fashion evolutionary design system are done. And testing results are analyzed at last.
Startup mechanism offers some conditions which decide when to use surrogate model in evaluation process. That is, in which generation these conditions are satisfied, population can be evaluated by surrogate models in proper proportion. In general, when human feel tired or surrogate models have learned human preference exactly, the models are adopted to calculate the individuals’ fitness values. So startup mechanism about surrogate models normally includes two conditions. They are the condition of human fatigue and the condition of models’ precision. When any of conditions is satisfied, surrogate models are start up to evaluate.
Suppose FM is the fitness value calculated by surrogate models. FU is the fitness value given by human. xM and xU respectively denote individuals evaluated by the models and human. X(t) expresses the population in t-th generation. Adaptive evaluation strategy is shown as follows(Guo,2007).
Here, Fa(t)≥ε describes the condition of human fatigue where Fa(t) expresses the degree of human fatigue and ε is the threshold of Fa(t).
Definition 1: the degree of human fatigue
The degree of human fatigue reflects how tired human are. In general, human will spend more time evaluating individuals when there are more similar individuals in the population. Therefore, human feel more tired when the total number of individuals evaluated by human is more and time for evaluation in each generation is more.
Letting v(t) denotes how long human spend evaluating. β(t) is the proportion of human evaluated individuals to population. The degree of human fatigue is defined as follows(Guo, 2007).
Fa(t)=1-e-tv(t)β(t)S(t)E2
In formula(2), S(t) denotes the similarity of population, which describes average similarity of individuals in population, shown as follows.
where |P| is population size and n is the length of individuals. σl(xi(t),xj(t)) expresses the similarity of l-th bit between two individuals. σl(xi(t),xj(t))=1 if l-th bit of xj(t) is same as it of xj(t), otherwise σl(xi(t),xj(t))=0.
In formula(1), Pr(t)≤Ψ describes the condition of models’ precision where Pr(t) expresses the reliability of surrogate models and Ψ is the threshold of Pr(t).
Definition 2: the reliability of surrogate models
The reliability of surrogate models reflect the consistency between surrogate models and human preference. It is measured by average Euclid distance between individuals’ fitness values calculated by the models and them given by human, shown as follows.
Pr(t)=1|P|∑i=1|P|(FU(xi,t)-FM(xi,t))2E4
Obviously Pr(t) also describes the generalization of surrogate models.
In a word, whether surrogate models are utilized lies on two conditions: whether or not the degree of human fatigue exceed the threshold; whether or not reliability of surrogate models exceed the threshold.
3. The Number of Substituted Individuals Evaluated by Surrogate Models
In AES-IGAs, some individuals are evaluated by human, whereas others’ fitness values are calculated by surrogate models. Here, individuals evaluated by the models in each generation are called substituted individuals. How many substituted individuals are there is a key problem, which influences the performance of AES-IGAs. Up to now, in most of IGAs adopting surrogate models based evaluation strategy, the number of substituted individuals is fixed and equal to fifty percent of population size. This limits the effect of surrogate models on performances. Aiming at this problem, the number of substituted individuals adaptively varies in AES-IGAs.
In general, one hopes to evaluate less individuals in IGAs when he feels tired. And when surrogate models can reflect human preference better, it shall be used to evaluated more individuals instead of human so as to reduce human burden. Therefore, two factors including human fatigue and models’ precision are considered to decide the number of substituted individuals. When human feel more tired or models’ precision is better, less individuals are evaluated by themselves. So the number of substituted individuals in t-th generation is defined as
NM(t)=⌊|P|ρ(t)⌋E5
Here, ρ(t) is the proportion of substituted individuals to population, called the substituted proportion, shown as follows.
ρ(t)=e-(Pr(t)/f¯)(ε-Fa(t))E6
Here, f¯denotes the upper bound of the fitness values. Obviously the substituted proportion also satisfies ρ(t)+β(t)=1.
In existing researches, the substituted proportion in evolution is normally fixed after surrogate models are used. Corresponding evaluation process in IGAs can be divided into two phases: human-evaluation phases and models-evaluation phases. According to the substituted proportion, two kinds of dual-phase evaluation strategies with different ρ(t) are analyzed as follows.
(1) While the conditions of startup mechanism are satisfied, ρ(t)=1. That is, the evaluation process is divided into two phases. When surrogate models are not used, population is evaluated by human only. Otherwise, all individuals are evaluated by surrogate models.
(2) While the conditions of startup mechanism are satisfied, ρ(t)=C(0<C<1). Here C is a constant (Zhou,2005). That is, when surrogate models are adopted, population is not evaluated by surrogate models only. Some of individuals are evaluated by human, others are evaluated by surrogate models. And the number of substituted individuals is⌊|P|ρ(t)⌋.
In above two evaluation strategies, fixed substituted proportion does not make surrogate models used enough. Therefore, variable substituted proportion is adopted in this chapter. Corresponding evaluation process is different from above two instances. It contains three phases.
Phase I: Population is evaluated by human only
In this phase, the models are not used and all of individuals are evaluated by human. The evaluation mode is defined as follows.
It is obvious that in this phase, human do not feel tired and surrogate models can not exactly reflect human preference. Above phenomena possibly appear at the beginning of evolution. So this evaluation mode is usually adopted in the former evolution. Note that surrogate models shall be continually update according to evaluated individuals in this phase.
Phase II: Population is mixed evaluated by human and surrogate models
In this phase, surrogate models are startup because they learn human preference approximatively. However, the degree of human fatigue does not exceed the threshold. That means human still participate in the evaluation process. And human evaluated individuals are used as samples to improve models’ precision further. Obviously some of individuals are evaluated by human and others’ fitness values are calculated by the models. So the evaluation mode are shown as follows.
In order to reduce human burden gradually, the number of substituted individuals in phase II is increasing along with the degree of human fatigue.
NM(t)=⌊|P|e−(Pr(t)/f¯)(ε−Fa(t))⌋E10
In above two phases, population size is fixed and small because human participate in the evaluation process. In general, population size in IGAs is less than ten so as to alleviate human visual fatigue.
Phase III: Population is evaluated by surrogate models only
In this phase, all of individuals are evaluated by surrogate models. So the evaluation mode is described as follows.
In phase III, human often feel very tired. Individuals’ fitness values given by human may differ from their real preference. So these individuals can not be used to train surrogate models. And the models are not updated any more. This evaluation mode is normally adopted in the latter evolution. Because the evaluation process based on surrogate models is done by computers, there does not exit human fatigue in phase III. That means the evolution of AES-IGAs is the same as traditional genetic algorithms. So population size can be enlarged. But how to increase population size is a key problem.
Higher the models’ precision is, the generalization of the model is better. So enlarged population size is defined as follows.
Np(t)=|P|⌊f¯Pr(τk)+0.5⌋E12
Here, Pr(τk) expresses the reliability of surrogate models in Phase III. It is obvious that the exponent in formula(12) may be 1, 2 or 3. So population size may be enlarged to corresponding multiple of |P|.
5. Surrogate Model-based Evaluation Strategy with Oriented Selection of Substituted Individuals
In existing researches on surrogate model-based IGAs, substituted individuals are often randomly selected from population after the evaluation mode is decided. But if selected substituted individuals are far away from existing training samples, their fitness values calculated by surrogate models may differ from human cognition. This will misguide the evolution away from human real preference. Aiming at above problem, an oriented selection method of substituted individuals is proposed. Here, substituted individuals are selected in terms of the distance between individuals and samples so as to improve the evaluation precision to the utmost extent.
Taken oriented selection process in t-th generation as example, algorithm steps are shown as follows(Guo,2007).
Step1: The number of substituted individuals NM(t) is decided in terms of human fatigue.
Step2: Samples are composed of evaluated individuals and their fitness values given by human. They are grouped into sub-sample sets adopting Fuzzy-C mean algorithm.
Step3: Aiming at each sub-sample set, a surrogate model described by artificial neural networks is trained.
Step4: The distance between each individual in population and each samples or each sub-sample sets are calculated.
Step5: Individuals are sorted according to above distance. Then individuals with less distance are regarded as substituted individuals.
Step6: The fitness values of substituted individuals are calculated by their affiliated clusters’ surrogate model.
Obviously how to group samples, train surrogate models and select substituted individuals are three main issues. Here, former two problems are mainly discussed.
5.1. Sample clustering
How to construct sample set and decide the cluster number are two key problems in sample clustering.
Sample set are composed of evaluated individuals and their fitness values given by human. They are expressed by xU(t) and f(xU(t)). Because human cognition may vary in evolution, the fitness value of same individual may be variant in different generation. In order to track human preference, latest fitness values of any individuals are reserved. Therefore, sample set are described as follows.
Q(t)={(xU,f(xU))|xUl≠xUk,l,k=1,2,⋯|Q(t)|}E13
Sample set are grouped into clusters by fuzzy c-means clustering algorithm (Karayiannis, 1997). Suppose L=2,3,…,K(K<|Q(t)|) is the cluster number, which is normally decided according to domain knowledge. Improper K will lead to wrong clustering result. Aiming at this problem, comparison method (Cheng,2006) is adopted to obtain suitable cluster number. Assume that Qi(t)(i=1,2, …,L) is sub-sample set. And each cluster center is
Here, m∈[1,∞)is the weighing index. uil∈[0,1]denotes the subjection degree of xUl(t) to Qi(t), which satisfies∑i=1Luil=1,∀l=1,2,⋯,|Q(t)|.
5.2. Selection of substituted individuals
How many substituted individuals are there in each generation is the principal issue, which is decided by ρ(t). However, which individuals are selected as substituted individuals depends on their approximate distances. Here, the distances between an individual and each sample or each sub-sample set are called approximate distances.
Suppose xj(t) is an individual. Three kinds of approximate distances are defined as follows(Cheng,2009).
(1) Center approximate distance
The clustering center of each sub-sample set is used to judge which cluster does an individual belong to. Suppose ci(t) is clustering center of i-th cluster. The minimum distance between xj(t) and ci(t) is defined as center approximate distance, expressed by D1.
D1j=mini∈{1,2,⋯L}d1(xj(t),ci(t))E15
If individuals are encoded by real number, d1 is described as
d1(xj(t),ci(t))=∑h=1n(xjh(t)−cih(t))2E16
If individuals are encoded by binary number, d1 is described as
d1(xj(t),ci(t))=∪h=1n(xjh(t)⊕Cih(t))E17
Because only clustering centers of sub-sample sets are taken into account, the computation complexity of center approximate distance is O(|P|L).
(2) Cluster approximate distance
All of samples in each sub-sample set are used to judge which cluster does an individual belong to. The minimum distance beween xj(t) and samples in Qi(t) is defined as cluster approximate distance, expressed by D2.
D2j=mini∈{1,2,⋯L}d2(xj(t),Qi(t))E18
Suppose xUi(t) is a sample in i-th cluster. If individuals are encoded by real number, d2 is described as
Here,|Q(t)|=∑i=1L|Qi(t)|. So the computation complexity of cluster approximate distance is O(|P||Q(t)|).
(3) Generalized approximate distance
All of samples in sample set are used to judge which cluster does an individual belong to. The minimum distance between xj(t) and xU(t) is defined as generalized approximate distance, expressed by D3.
D3j=minl∈{1,2,⋯|Q(t)|}d3(xj(t),xUl(t))E21
If individuals are encoded by real number, d3 is described as
d3(xj(t),xUl(t))=∑h=1n(xjh(t)−xUlh(t))2E22
If individuals are encoded by binary number, d3 is described as
d3(xj(t),xUl(t))=h=1n(xjh(t)⊕xUlh(t))E23
Obviously the computation complexity of generalized approximate distance is O(|P||Q(t)|).
Through comparison of the computation complexity of above three approximate distances,
we know that the computation complexity of generalized approximate distance is the same as it of cluster approximate distance. In addition, population size is normally more than the cluster number, namely |Q(t)|≥L. So the computation complexity of center approximate distance is least.
In genetic algorithms, whether and how fast they converge are used to measure the algorithm’s performances. And many researches have discussed genetic algorithms’ convergence. However, IGAs’ convergence is seldom taken into account. In this chapter, AES-IGAs’ convergence is analyzed in terms of drift analysis and the first hitting time condition proposed by Jun He(He, 2001).
Definition 3: critical generation
In AES-IGAs, when t>τk, surrogate models are used to calculate all individuals’ fitness values. If the algorithms can converge to optimal solution under above condition, τk is called critical generation as shown in follows(Guo,2007). Whether τk exists is a key problem which influences AES-IGAs’ convergence.
Here, D(xi(t+ζ))is simplified description ofD(xi(t+ζ),x*),x*∈S*.S* is the optimal variable space. D() denotes the distance between two individuals. If individuals are encoded by binary number, D() is calculated by Hamming distance. Otherwise, if individuals are encoded by real number, D() is calculated by Euclidean distance.
Theorem 1: For any generation satisfied t>τk, all individuals are evaluated by surrogate models. During this phase, the evolution of AES-IGAs is the same as genetic algorithms. Based on the first hitting time drift condition with exponential form(He,2001), there existsζ¯=min{ζ|D(X(t+ζ))=0}, which satisfies drift conditionE[ζ¯|d(X(τk))0]h(|P|). Here, h(|P|) is a polynomial function of |P|.
Proof: Assume that individuals are encoded by binary number. So the distance between an individual and optimal solution is
D(xi(t))=∑l=1nσl(xi(t),x*)E26
Given a random sequence{D(X(τk+j)),j=0,1,⋯}, ∀X(τk+j),D(X(τk+j))≤|P|is obviously satisfied. If the algorithms adopt one-point crossover, bit mutation and roulette selection, following inequation is obtained in terms of drift condition.
E[D(X(τk+j+1))−D(X(τk+j))|D(X(τk+j))0]≥h1(|P|)E27
Here, h1(|P|)>0.From Theorem.1 in reference(He,2001), we knowE[ζ¯|d(X(τk))0]|P|h1(|P|).
Leth(|P|)=|P|h1(|P|). Then E[ζ¯|d(X(τk))0]h(|P|) is satisfied.
Theorem 2: When t>τk, the substituted proportion ρ(t) converges to 1.
Proof: Suppose {⋯ρ(t),ρ(t+1)⋯|∀ρ(t)0} is a random sequence. The gradient of ρ(t) is
Consideringε−Fa(t)0andFa(t+1)Fa(t)⇒ε−Fa(t+1)ε−Fa(t), then
Therefore,ρ(t+1)ρ(t)1. This shows that random sequence {⋯ρ(t),ρ(t+1)⋯|∀ρ(t)0} increases by degrees. And ρ(t) continuously varies along with t. Whent≥τk,. So
ρ(t)=e−[(Pr(t)/f¯)(ε−Fa(t))]≥1E29
This makes limρ(t)t→τk=1 while∃τk≤T. Therefore, total number of individuals evaluated by human in evolution is
Here, fashion evolutionary design system is adopted as a typical background to validate the rationality of IGAs with adaptive evaluation strategy (AES-IGAs). The goal of the system is to find a dress which wins human favor (Kim,2000). Visual Basic 6.0 as programming tool for human-machine interface and Microsoft Access as database are utilized. Matlab 6.5 is adopted to train surrogate models based on artificial neural networks. System’s human-computer interface is shown as follows.
Figure 1.
Human-computer interface in AES-IGAs.
In fashion evolutionary design system, each dress is composed of collar, skirt and sleeve. Each part has two factors including pattern and color which described by two bits. So each dress is expressed by twelve bits, which act as six gene-meaning-units (GM-units) (Guo,2007). Each gene-meaning-unit has four alleles. The meanings of each allele in gene-meaning-unit are shown in Table 1.
GM-unit
allele
meaning
code
meaning
code
meaning
code
meaning
code
collar’s pattern
medium collar
00
high collar
01
wide collar
10
gallus
11
sleeve’s pattern
long sleeve
00
medium sleeve
01
short sleeve
10
non-sleeve
11
skirt’s pattern
long skirt
00
formal skirt
01
medium skirt
10
short skirt
11
color
pink
00
green
01
black
10
white
11
Table 1.
The meanings of each allele in gene-meaning-unit.
7.2. Desired objectives and parameters in experiments
In order to validate the rationality of adaptive evaluation strategy and the influence on performance of IGAs, two groups of experiments are designed. They have different desired objectives which reflect different psychological requirements of human. Desired objectives of experiments are shown as follows.
Experiment I: To find a favorite dress fitting for summer without the limit of color.
Experiment II: To find a favorite dress fitting for summer and the color is pink.
In both experiments, artificial neural network is adopted as surrogate model. The values of parameters about the model and the evolution process are shown in Table 2.
parameters about the evolution
crossover probability
mutation probability
population size
generation
ε
ψ
0.6
0.01
8
40
0.8
0.9
parameters about the model
input neurons
hidden neurons
output neurons
learning rate
epochs
error
6
15
1
0.09
15000
10-2
Table 2.
The values of parameters.
7.3. Comparison of different substituted proportion
In order to validate the rationality of AES-IGAs, thirty persons are gathered to do two groups of experiments aiming at desired objective of experiment II. In experiments, when surrogate models are used in evaluation process, two kinds of fixed substituted proportion and adaptive substituted proportion are adopted respectively. Testing results done by all persons are integrated, as shown in Table 3.
the substituted proportion
average generation
average number of human evaluated individuals
average generation when the models are adopted
ρ(t)=0.5
15
104
11
ρ(t)=1
14
88
11
ρ(t)=e-(Pr(t)/f¯)(ε-Fa(t))
12
55
8
Table 3.
Comparison of the performance by different substituted proportion.
Obviously when different ρ(t) are adopted, convergence speed of the algorithms are similar. But total number of individuals evaluated by human varies. The reason for this is that surrogate models are adopted in evaluation only when formula(1) is satisfied. Firstly, if ρ(t)=0.5, half of population is evaluated by human when surrogate models are adopted. However, human shall evaluate individuals all along. So total number of individuals evaluated by human is more and human will feel more tired. Secondly, if ρ(t)=1, all individuals are evaluated by surrogate models when human feel tired and the models can reflect human preference exactly. Although human evaluate fewer individuals than first strategy, the models are used later than adaptive evaluation strategy. That is, less individuals are evaluated by human in AES-IGAs. Therefore, IGAs adopting adaptive evaluation strategy can effectively alleviate human fatigue more.
7.4. Comparison of different population size in Phase III
When population is evaluated by surrogate models only, fixed population size |P| and population size NP are adopted in experiments respectively. Testing results are shown in Table 4.
population size
average generation
average number of human evaluated individuals
|P|
12
55
Np
10
55
Table 4.
Comparison of different population size in Phase III.
It is obvious that different population size in Phase III do not influence the degree of human fatigue in evolution because average number of individuals evaluated by human is same. As we know, no matter how many individuals are there in Phase III, human do not participate in evaluation any more. That is, total number of human evaluated individuals in evolution only depends on the evaluation in Phase I and Phase II. However, convergence speed is faster while population size in Phase III is enlarged. The reason for this is that exploration of the algorithm with larger population size is better.
7.5. Critical generation when Phase III is started
Thirty persons are gathered to do two groups of experiments aiming at above two desired objectives. Results in experiments are shown in Figure 2.
Figure 2.
ρ(t) in experiments.
From Figure 2, we know AES-IGAs converges. This matches Theorem.2. In addition, the substituted proportion is increasing along with the evolution. Aiming at desired objective of experiment II, all individuals are evaluated by the models when t≥8. That is, critical generation is 8. Obviously desired objective of experiment II include more detail requirements. So human need to pay attention to more GM-units, which make human feel more tired. This leads to larger critical generation. However, aiming at desired objective of experiment I, AES-IGAs converge when t=7. At this time, the degree of human fatigue do not exceed the threshold. So surrogate models are not completely used to replace human in evaluation process. That means critical generation do not exist in experiment I.
7.6. Comparison of different selection methods of substituted individuals
In order to analyze the performance of oriented selection methods, following three measures are given.
(1) Generation when stop rules are satisfied or human find the satisfied individuals reflects convergence speed, as shown in follows.
T=argmint|f(x(t))-f*|δE31
(2) Total number of individuals evaluated by human in evolution influences the degree of human fatigue. Obviously T or NU is larger, human feel more tired.
NU=∑t=1T(|P|−NM(t))E32
(3) Evaluation error of surrogate model describes square root error of the fitness values given by human human fU(xM(t)) and the models fM(xM(t)). Note that fU(xM(t)) is only used as reference and not utilized in evolution. Obviously this measure reflects generalization of the models.
EF(t)=1NM(t)f¯∑k=1NM(t)(fM(xMk(t))−fU(xMk(t))2E33
Sixteen persons are divided into four groups. Aiming at desired objective of experiment II, each group does experiments adopting random selection method and three kinds of oriented selection methods respectively. Here, cluster number L=3. This means human preference is divided into three ranks, including favor, normal, dislike. Testing results done by each group are integrated, as shown in Table 5.
measures
T
NU
random selection
18
112
oriented selection (D1)
value
14
92
comparison with random selection
↓22.2%
↓17.9%
oriented selection (D2)
value
12
76
comparison with random selection
↓33.3%
↓32.1%
oriented selection (D3)
value
12
80
comparison with random selection
↓33.3%
↓28.6%
Table 5.
Comparison of different selection methods.
From Table5, we know that: (I) Because the number of samples is small in AES-IGAs, generalization of surrogate models are bad. If substituted individuals are randomly selected, they may be far from samples, which may lead to large evaluation error of surrogate model. So convergence of IGAs adopting random selection method is worse. (II) Human may favor more than one satisfied dress. Hence desired objectives of both experiments in fact are implicit multi-mode functions. Because oriented selection methods adopting D2 or D3 all consider the distances between all samples and individuals, useful information embodied in evolution are utilized enough. So their performance are better. However, oriented selection methods adopting D1 only takes the distribution of cluster center into account. Incomplete information of individuals may result in larger evaluation error so as to make the performance worse.
In above experiments, evaluation error of surrogate models varies during the evolution, as
shown in Figure 3. Obviously no matter which selection method is adopted, EFincreases in former evolution. During this stage, human do not feel tired and can not exactly decide which one is he most like. Surrogate models are trained based on samples and do not participate in evolution. Along with the evolution, human gradually know what they favor. Surrogate models are close to human preference. So evaluation error decrease in latter evolution. In addition, oriented selection methods adopting D2 or D3 reserve samples’ information to largest degree. Hence AES-IGAs using these selection methods have less evaluation error. This can effectively avoid misguiding evolution.
Figure 3.
Evaluation error of surrogate models.
In each generation, different substituted individuals are selected when above four selection methods are respectively adopted. In order to compare the difference among random selection method and oriented selection methods, the number of dissimilar substituted individuals in each generation are noted, as shown in Table 6. Here, NMD(t)denotes the number of dissimilar substituted individuals.
generation
6
7
8
10
11
12
15
18
random selection
NM(t)
0
0
0
0
4
4
4
4
oriented selection (D1)
NM(t)
0
0
0
4
4
4
4
-
NMD(t)
0
0
0
4
3
2
2
-
oriented selection (D2)
NM(t)
0
4
4
4
4
-
-
-
NMD(t)
0
4
4
4
3
-
-
-
oriented selection (D3)
NM(t)
0
0
4
4
4
4
-
-
NMD(t)
0
0
4
4
3
3
-
-
Table 6.
Comparison of dissimilar substituted individuals selected by different methods.
In a word, the performances of oriented selection methods are better than random selection method. Through comparison of three kinds of oriented selection methods, oriented selection methods using D2 or D3 has better performance and stability, yet larger computation complexity. On the contrary, computation complexity of oriented selection methods using D1 is less, whereas the algorithm’s performance is worse. Considering multi-mode property of human preference and small population size in IGAs, oriented selection methods using D2 is more fit for IGAs than others.
7.7. Comparison of selection methods with different cluster number
Experiments adopting AES-IGAs with different cluster number are done. Here, two fixed cluster number including 3 or 6 and adaptive cluster number are adopted. Testing results are shown in Table 7.
L
3
6
adaptive
measures
T
NU
T
NU
T
NU
random selection
18
112
17
108
15
100
oriented selection (D1)
value
14
92
12
80
11
72
comparison with random selection
↓22.2%
↓17.9%
↓29.4%
↓25.9%
↓26.7%
↓28.0%
oriented selection (D2)
value
12
76
11
76
12
76
comparison with random selection
↓33.3%
↓32.1%
↓29.4%
↓29.6%
↓20.0%
↓24.0%
oriented selection (D3)
value
12
80
12
84
11
76
comparison with random selection
↓33.3%
↓28.6%
↓35.3%
↓22.2%
↓26.7%
↓24.0%
Table 7.
Comparison of selection methods with different cluster number.
By compare the results, we know that: (I) AES-IGAs using adaptive cluster number have best performance. Because cluster number is more reasonable, the clustering precision of individuals is better. This will speed up convergence. (II) No matter which kind of approximate distance is adopted, oriented selection methods converge faster than random selection method. And oriented selection method adopting D2 has better stability. The reason for this is individuals in each cluster contains more information than cluster center, which makes evaluation error less. Hence, surrogate model can participate in evaluation earlier so as to alleviate human fatigue at most.
In order to analyze the influence of selection methods and cluster number on human fatigue, the curves describing the degree of human fatigue are shown in Figure 4. Note that the threshold of human fatigue is 0.95 in these experiments.
Figure 4.
The trend of human fatigue.
Obviously no matter which kind of selection methods are used, human fatigue all can be alleviated as long as cluster number is reasonable. Especially the degree of human fatigue is less if oriented selection methods using D2 or D3 are adopted.
7.8. Comparison of performance about IGAs
In order to validate the improvement in performance of IGAs with adaptive evaluation strategy, thirty persons are gathered. Everyone do experiments aiming at both desired objectives adopting conventional IGA and AES-IGA respectively. Testing results for above four experiments done by all persons are integrated, as shown in Table 8.
experiments
I
II
algorithms
IGA
AES-IGA
IGA
AES-IGA
average generation
14
7
31
12
average number of individuals evaluated by human
112
38
248
55
average number of individuals evaluated by human in each generation
8
5
8
5
generations when Fa(t)≥ε
-
-
-
8
Table 8.
Comparison of the performance between IGAs and AES-IGAs.
Through comparison of testing results in experiment I, generation adopting AES-IGA averagely reduces 52.1% than IGA. And total number of individuals evaluated by human adopting AES-IGA averagely reduces 71.5%. These indicate adaptive evaluation strategy can effectively alleviate human fatigue and speed up convergence.
In order to further alleviate human fatigue in interactive genetic algorithms, surrogate models are used to evaluate individuals instead of human. When to utilize surrogate models, how many and which individuals are calculate fitness values adopting the models are studied. Startup mechanism about surrogate models considering the degree of human fatigue and the models’ precision is given. Variable proportion of population evaluated by surrogate models is proposed. Three phases including evaluated by human only, mixed evaluated by human and the models, evaluated by the models only, are discussed according to the number of substitution individuals. Especially, population size is enlarged in third phase. Surrogate model-based evaluation strategy with oriented selection of substituted individuals is proposed. Three kinds of approximate distance are presented so as to judge which cluster individuals belongs to. At last, convergence of AES-IGAs is proved.
Taking fashion evolutionary design system as a testing platform, the rationality of adaptive evolution strategy is validated aiming at different psychological requirements of human. Testing results indicate AES-IGAs converge faster than canonical IGAs and human feel less tired. Therefore, AES-IGAs can effectively alleviate human fatigue and improve the convergence speed so as to reduce human burden for evaluation which makes human absorbed in more creative design work.
This work was supported by National Natural Science Foundation of China (Grant No. 60805025) and the 863 Project of China (Grant No. 2007AA12Z162).
References
1.BilesJ. A.AndersonP. G.LoggiL. W.1996 Neural network fitness functions for a musical IGA, Proceedings of the International ICSC Symposium on Intelligent Industrial Automation and Soft Computing, 3944
2.TakagiH.1998 Interactive evolutionary computation: system optimization based on human subjective evolution. Proceedings of IEEE Conference on Intelligent Engineering System, 16
3.TakagiH.ChoSung-BaeKodaToshihiko1999 Evaluation of an IGA-based Image Retrieval System Using Wavelet Coefficients. IEEE International Fuzzy Systems Conference Proceedings, 17751780
4.ZhouYong; Dun-weiGong; Guo-shengHao; et al.2005 Neural network based phase estimation of individual fitness in interactive genetic algorithm. Control and Decision, 20234236
5.WangS. F.; WangS. H.; WangX. F.2003 Improved interactive genetic algorithm incorporating with SVM and its application. Journal of Data Acquisition & Processing,18429433
6.GuoYi-nan; GongDun-wei; HuiWang. 2007Adaptive evaluation strategy based on surrogate model. Lecture Notes in Computer Science. 4550472481
7.GuoYi-nan; GongDun-wei. 2007 Extraction and utilization about knowledge in hierarchical interactive genetic algorithms. Control and Decision, 1213291335
8.KarayiannisN. B.BezdekJ. C.1997 Integrated approach to fuzzy learning vector quantization and fuzzy c-means clustering. IEEE Transactions on Fuzzy Systems, 54622630
9.ChengJian; GuoYi-nan; QianJian-sheng. 2006 A multiple neural network architecture based on fuzzy c-means clustering algorithm. Proceedings of 6th World Congress on Intelligent Control and Automation, 18751878
10.ChengJian; GuoYi-nan; GongDun-weiet al.2009 Surrogate model-based evaluation strategy with non-random selection of substituted individuals. Chinese Journal of Electronics, 1190195
11.HeJunYaoXin2001Drift analysis and average time complexity of evolutionary algorithms. Artificial Intelligence.
12.KimH.ChoS. B.2000 Application of interactive genetic algorithm to fashion design. Engineering Applications of Artificial Intelligence, 13635644