1. Introduction
Existing approaches to the H∞ control problem typically pose it as a differential game whose solution is a minimax saddle point; see [1, 2, 3, 4, 5, 6, 7] for example. This approach is natural from the H∞ perspective since the problem is indeed mathematically a minimax problem. However, once it was observed that the resulting Riccati equations closely resemble the standard Riccati equations of related LQR problems with the same system matrices, considerable interest arose in reconciling and combining the two approaches in a fruitful way. This gave rise to the once intensively studied mixed H2/H∞ control problem [1, 8, 9, 10]. Specifically, the system model considered is of the form:
$$\frac{dx}{dt} = Ax + B_u u + E_2 w_2 + E_\infty w_\infty, \qquad z_2 = C_2 x + D_2 w_2, \qquad z_\infty = C_\infty x + D_\infty w_\infty, \qquad x(0) = 0 \tag{1}$$
Let T2 and T∞ denote the closed-loop transfer functions from w2 to z2 and from w∞ to z∞, respectively. The problem is to minimize the H2 norm of T2 subject to the H∞ norm of T∞ being smaller than a prescribed positive value γ. Clearly, if w∞ = 0, the H2 norm gives a performance measure of the effect of w2. Conversely, if w2 = 0, the H∞ norm gives a performance bound on the effect of w∞. But what if w2 and w∞ are both nonzero? The main focus of such an approach is usually to optimize the H2 norm for performance, while the H∞ norm bound serves more as "robustness" against model uncertainties than as a worst-case performance measure. Thus, one starts with a signal-based worst-case control problem but eventually ends up with a "robust" (H∞-norm-bounded) LQR control solution. Such an approach is justifiable, but an implicit implication is that performance is optimized only in the H2 sense and hence not, at least not expressly, in the worst-case sense. Another fact to note is that although solving the relevant Riccati equation (Eq. (36)) yields a controller that guarantees the H∞ bound of the closed-loop system, the bound is generally not tight. This can be seen by noting that as we set γ−2 → 0, we recover the LQR. The H∞ norm of the closed-loop system under LQR control is obviously finite; so as γ2 → ∞, the closed-loop H∞ norm does not approach ∞, and hence γ does not equal the closed-loop H∞ norm in general. An implication of this fact is that the controller obtained by solving Eq. (36) with γ = γo is not the one that minimizes the closed-loop H2 norm subject to the H∞ norm being at most γo. For example, suppose the closed-loop H∞ norm of the pure LQR control system is γLQR. Then this LQR is also the optimal solution to the minimum closed-loop H2 norm control problem subject to any closed-loop H∞ norm bound larger than γLQR, and this solution ought to be unique. But if we solve Eq. (36) with any finite γ > γLQR, we will not obtain the LQR. This implies that the obtained solution cannot be optimal.
Other salient features of such an approach are: (i) the performance considered is usually (with some exceptions such as [7]) over the infinite time horizon; (ii) the initial state is zero; and (iii) as already pointed out above, the H2 and H∞ criteria are, loosely speaking, in a mutually exclusive "either-or" situation, in that the combined effect when both the stochastic and the deterministic disturbances are present simultaneously is not considered. In other words, while the H∞ norm bounds the worst-case performance of the closed-loop system under a deterministic disturbance (with no stochastic noise) and the H2 norm indicates the performance in the presence of stochastic noise (in the absence of deterministic disturbances), the combined effect on the closed-loop system when both types of exogenous disturbances are present is not explicitly and quantitatively addressed. Another problem is that the H2 norm obtained in a mixed H2/H∞ design actually pertains to the nominal model. When the system matrices are perturbed, this H2 norm changes as well. Hence, the question of the worst-case performance of the closed-loop system when the nominal model is perturbed remains open. In other words, stability robustness may be established, but nothing quantitative has been said about worst-case performance.
In this chapter, we revisit the mixed H2/H∞ control problem from a different perspective. Specifically, we pose the original problem as a stochastic H∞ control problem where the system is subject to both possibly "worst-case" (bounded) deterministic disturbances and stochastic zero-mean white noises simultaneously. We are interested in minimizing the worst-case combined effect of these disturbances, along with an uncertain nonzero initial condition [6, 11], on the closed-loop performance of the system. Such problems arise when one wishes to drive the state of a system from some uncertain nonzero value to the origin in an optimal way, in the presence of both high- and low-frequency noises. Note that while we model the disturbances as two separate signals, they could actually be one single disturbance consisting of both low- and high-frequency components, thus modeling a nonzero-mean disturbance.
Our results here are derived from the Riccati differential equation of the finite-horizon LQR problem, unlike most existing approaches, which are based on differential game theory. Rather than looking for a minimax saddle point, we minimize the expected worst-case cost and derive our solution by working explicitly with P(0) of the finite-horizon LQR Riccati equation, which is directly related to the value of the cost functional when the initial state is nonzero. Clearly, if the deterministic disturbance is absent, the problem reduces to a stochastic LQR problem, and if the stochastic white noise is absent, we recover the standard H∞ control problem. We believe this approach offers a new (in addition to existing works such as [1, 2, 3, 4, 5]), and arguably also simpler, perspective on the solution of the H∞ control problem and of H∞ control with stochastic disturbances (compare [5, 8, 9, 10] for example).
It should be noted that the problem we consider here differs from the problem considered in some of the existing stochastic H∞ control literature. For example, in [5, 10], the stochastic H∞ approach incorporates stochastic disturbances into the system matrices to obtain a stochastic system with multiplicative white noise perturbations. This is quite different from the standard and familiar stochastic LQR/LQG setting, where the system matrices are deterministic and not subject to stochastic perturbations or state-dependent multiplicative noises.
The organization of the chapter is as follows. In Section 2, we formally formulate the control problem of interest. In Section 3, we state and derive some mathematical preliminaries. In Section 4, we consider how one may synthesize a finite-horizon controller under an uncertain initial state. In Section 5, we consider steady-state solutions. Section 6 considers control with output feedback when the states are not directly measurable. In Section 7, we give an example. Section 8 contains our conclusions.
Notation: ‖·‖2 denotes the Euclidean norm of a vector; ‖·‖L2 the L2-norm; ≜ denotes "equal by definition"; E{·} denotes expectation; ∫T(·)dt denotes the definite integral with respect to t from 0 to T; ‖x(t)‖T2 denotes E{∫Tx(t)Tx(t)dt}; and A ≥ B means A − B is positive semidefinite.
2. Problem formulation
Consider the continuous-time linear time-invariant system with input disturbances w(t) and v(t):
$$\frac{dx(t)}{dt} = Ax(t) + B_u u(t) + B_w w(t) + v(t) \tag{2}$$
$$E\{x(0)\} = 0, \qquad E\{x(0)x(0)^T\} \le X_0 \tag{3}$$
A, Bu, Bw are known real matrices, u(t) is the control input, w(t) is an unknown but bounded exogenous input with E{∫Tw(t)Tw(t)dt} ≤ 1, and v(t) is a stationary zero-mean Gaussian white noise with known covariance matrix Θ. For simplicity, we assume that the system is controllable and, in the case of output feedback, observable.
The feedback control problem of our interest is to find a linear causal control law u(t) = K(t)x(t) to minimize
$$J_u = \|\Pi^{1/2}x(t)\|_T^2 + \|R^{1/2}u(t)\|_T^2 + E\{x(T)^T Q x(T)\} \tag{4}$$
where Π and R are positive-definite weight matrices and Q is a positive-semidefinite weight matrix.
Note that (4) may be re-written as
$$J_u = \|\Pi^{1/2}x(t)\|_T^2 + \|R^{1/2}K(t)x(t)\|_T^2 + E\{\|Q^{1/2}x(T)\|^2\} \tag{5}$$
In what follows, we shall sometimes drop the argument t (especially for the system matrices), or denote z(t) as zt, for notational simplicity when there is no danger of confusion.
Advertisement
3. Preliminaries
Lemma 3.1: Let A, B, L = LT, and QL = QLT be given matrices. Suppose there exists on the interval 0 ≤ t ≤ T a solution Po(t) of the differential Riccati equation (DRE)
$$\frac{dP(t)}{dt} = -P(t)A - A^T P(t) + P(t)BB^T P(t) - L, \qquad P(T) = Q_L \tag{6}$$
Then the minimum of the cost functional
$$J \triangleq E\left\{\int_0^T (x^T L x + w^T w)\,dt + x(T)^T Q_L x(T)\right\} \tag{7}$$
for the system dx/dt = Ax(t) + Bw(t) + v, E{v(t)} = 0, E{v(t)v(t)T} = ϴ, E{x(0)} = 0, and E{x(0)x(0)T} = X0 is given by
$$J_{\min} = \mathrm{Tr}[P_o(0)X_0] + \int_0^T \mathrm{Tr}[P_o(t)\Theta]\,dt \tag{8}$$
The minimizing w(t) is given by
$$w(t) = -B^T P_o(t) x(t) \tag{9}$$
Proof: Well-known stochastic LQR result (see Ref. [12], p. 221).
We next give a similar result concerning the deterministic problem and the related Riccati differential equation, which we will find useful later in studying some of the properties of RDE (6).
Lemma 3.2: Let A(t), B(t), L(t) = L(t)T, and QL = QLT be given matrices. Suppose there exists on the interval 0 ≤ t ≤ T a solution P0(t) of the differential Riccati equation (DRE)
$$\frac{dP(t)}{dt} = -P(t)A - A^T P(t) + P(t)BB^T P(t) - L, \qquad P(T) = Q_L \tag{10}$$
Then the cost functional
$$\eta \triangleq \int_0^T (x^T L x + w^T w)\,dt + x(T)^T Q_L x(T) \tag{11}$$
for system dx/dt = Ax(t) + Bw(t): x(0) = x0 is given by
$$\eta = \|w(t) + B^T P_o(t)x(t)\|_T^2 + x_0^T P_o(0) x_0 \tag{12}$$
The minimum value of η is given by x0TP0(0)x0, and the minimizing w(t) is given by
$$w(t) = -B^T P_o(t) x(t) \tag{13}$$
Proof: See Ref. [13] (proof of Theorem 1, p. 131).
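As a numerical sanity check, the completion-of-squares identity (12) can be verified directly: integrate the DRE (10) backward from P(T) = QL, simulate the system forward under an arbitrary w(t), and compare the two sides. The scalar data below (a = −0.5, b = 1, L = 1, QL = 0.5, x0 = 1.3, w(t) = 0.3 sin t) are illustrative choices, not taken from the chapter.

```python
import numpy as np

# Illustrative scalar data (not from the chapter): dx/dt = a*x + b*w
a, b, L, QL = -0.5, 1.0, 1.0, 0.5
T, N = 2.0, 20000
dt = T / N
t = np.linspace(0.0, T, N + 1)

# Integrate the DRE (10), dP/dt = -2aP + b^2 P^2 - L, backward from P(T) = QL
P = np.empty(N + 1)
P[N] = QL
for k in range(N - 1, -1, -1):
    dPdt = -2.0 * a * P[k + 1] + b**2 * P[k + 1] ** 2 - L
    P[k] = P[k + 1] - dt * dPdt

# Simulate x forward under an arbitrary disturbance w(t)
x0 = 1.3
w = 0.3 * np.sin(t)
x = np.empty(N + 1)
x[0] = x0
for k in range(N):
    x[k + 1] = x[k] + dt * (a * x[k] + b * w[k])

# Left-hand side of (12): eta = int(x'Lx + w'w)dt + x(T)' QL x(T)
eta = np.sum(L * x[:N] ** 2 + w[:N] ** 2) * dt + QL * x[N] ** 2

# Right-hand side of (12): ||w + B'P x||_T^2 + x0' P(0) x0
resid = w + b * P * x
rhs = np.sum(resid[:N] ** 2) * dt + P[0] * x0**2

print(eta, rhs)  # the two sides agree up to discretization error
```

The two sides agree for any choice of w(t), up to the O(dt) Euler discretization error, which is what distinguishes the identity (12) from the mere bound obtained at the minimizing w.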
Remark 3.1: Note that QL and L are symmetric but not necessarily nonnegative definite. Furthermore, as pointed out in [13], because L is not assumed to be nonnegative definite, it is not generally true that a minimum η exists. However, we know from LQR theory that such minimum η exists when L is nonnegative definite. Hence, it is reasonable to expect that a minimum η exists if L is not “exceedingly” negative definite.
Lemma 3.3: Let A, B, L = LT ≤ 0, and QL = QLT be given matrices. Suppose there exists on the interval 0 ≤ t ≤ T a solution Ps(t) of the differential Riccati inequality (DRI), [14]:
$$\frac{dP(t)}{dt} \ge -P(t)A - A^T P(t) + P(t)BB^T P(t) - L, \qquad P(T) = Q_L \tag{14}$$
Then for the system dx/dt = Ax(t) + Bw(t), x(0) = x0, η as defined in (11) satisfies:
$$\eta \ge \|w(t) + B^T P_s(t)x(t)\|_T^2 + x_0^T P_s(0) x_0 \tag{15}$$
Proof: Since L ≤ 0, there exists L1 ≤ L such that
$$\frac{dP_s(t)}{dt} = -P_s(t)A - A^T P_s(t) + P_s(t)BB^T P_s(t) - L_1, \qquad P_s(T) = Q_L \tag{16}$$
Let
$$\eta_1 \triangleq \int_0^T (x^T L_1 x + w^T w)\,dt + x(T)^T Q_L x(T) \tag{17}$$
Clearly, η ≥ η1 for any x and w since L ≥ L1. Applying Lemma 3.2 with L replaced by L1 gives η1 = ‖w(t) + BTPs(t)x(t)‖T2 + x0TPs(0)x0, which completes the proof.
Remark 3.2: Since the inequality sign in (14) is non-strict, the set of solutions that solve Lemma 3.3 contains P0(t) of Lemma 3.2 as a member.
Remark 3.3: In Lemmas 3.1-3.3, P(t) is independent of the initial condition x0, even though the cost η is a function of x0.
Proposition 3.1: Let P0(t) be the solution that satisfies Lemma 3.2 and P1(t) be any solution that satisfies (14). Then
$$P_o(0) \ge P_1(0) \tag{18}$$
Proof: Since η ≥ η1 for any x and w, it follows that x0TP0(0)x0 ≥ x0TP1(0)x0 for any x0. Thus (18) must follow.
Remark 3.4: Note that time 0 may be any time before time T. So (18) also implies Po(t) ≥ P1(t) for all t < T.
Proposition 3.2: Let P1(t) be a solution of (14) and let P2(t) be any solution of (14). Then P1(t) also satisfies Lemma 3.2 if and only if
$$\mathrm{Tr}[P_1(0)] \ge \mathrm{Tr}[P_2(0)] \tag{19}$$
for all admissible P2(t).
Proof: Necessity is obvious from Proposition 3.1. To prove sufficiency, suppose P1(t) satisfies (19) but is not a solution to Lemma 3.2, so that P1(0) ≠ Po(0). Since P1(t) is a solution of (14), Proposition 3.1 gives Po(0) ≥ P1(0). But Po(t) is also an admissible solution of Lemma 3.3, so (19) gives Tr[P1(0)] ≥ Tr[Po(0)]. Together these force Tr[Po(0) − P1(0)] = 0 with Po(0) − P1(0) ≥ 0, hence Po(0) = P1(0), a contradiction. This completes the proof.
Proposition 3.3: Let P1(t) be a solution to Lemma 3.2 and P2(t) be any solution that satisfies (14). Then the following statements are equivalent:
$$(i)\ \mathrm{Tr}[P_1(0)] \ge \mathrm{Tr}[P_2(0)]; \qquad (ii)\ \frac{d\,\mathrm{Tr}[P_1(t)]}{dt} \le \frac{d\,\mathrm{Tr}[P_2(t)]}{dt} \tag{20}$$
Proof: Obvious consequence of Bellman’s principle of optimality, since both P1(t) and P2(t) are solutions computed backward in time from final time T with Ps(T) = QL.
Remark 3.5: Proposition 3.3 tells us that, in order to find the w(t) which minimizes (7) and (11) from the set of suboptimal solutions characterized by Lemma 3.3, it suffices to search for the one with the maximum Tr[P(t)], or the minimum d Tr[P(t)]/dt, over all t ∈ [0, T]. These properties will be useful later when we search for the optimal controller via a linear matrix inequality (LMI) as follows.
Consider the LMI:
$$\begin{bmatrix} \Gamma(t) + PA + A^T P + L & PB \\ B^T P & I \end{bmatrix} \ge 0 \tag{21}$$
Taking the Schur complement in (21) gives Γ(t) ≥ −P(t)A − ATP(t) + P(t)BBTP(t) − L. It follows from Proposition 3.3 that the Γ(t) that solves min TrΓ subject to (21) at each t ∈ [0, T] gives the desired dP/dt that solves dP/dt = −PA − ATP + PBBTP − L for all t ∈ [0, T], where Γ = dP/dt.
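The Schur-complement equivalence between the LMI (21) and the quadratic matrix inequality Γ ≥ −PA − ATP + PBBTP − L can be checked numerically. The matrices below are randomly generated for illustration; Γ is placed slightly above (then slightly below) the Riccati right-hand side to confirm that the block matrix is positive semidefinite exactly when the quadratic inequality holds.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
S = rng.standard_normal((n, n))
P = 0.5 * (S + S.T)                      # symmetric P
Lt = rng.standard_normal((n, n))
L = -(Lt @ Lt.T)                         # L = L^T <= 0, as in Lemma 3.3

# Riccati right-hand side: -PA - A'P + P B B' P - L
rhs = -P @ A - A.T @ P + P @ B @ B.T @ P - L

def lmi_block(Gamma):
    # Block matrix of (21): [[Gamma + PA + A'P + L, PB], [B'P, I]]
    return np.block([
        [Gamma + P @ A + A.T @ P + L, P @ B],
        [B.T @ P, np.eye(m)],
    ])

# Gamma above the RHS: LMI holds; Gamma below the RHS: LMI fails
eig_ok = np.linalg.eigvalsh(lmi_block(rhs + 0.1 * np.eye(n))).min()
eig_bad = np.linalg.eigvalsh(lmi_block(rhs - 0.1 * np.eye(n))).min()
print(eig_ok, eig_bad)  # first is (numerically) nonnegative, second negative
```

Since the (2,2) block is the identity, the block matrix is positive semidefinite if and only if its Schur complement Γ + PA + ATP + L − PBBTP is, which is exactly the quadratic inequality above.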
Before proceeding further, let us recollect what we have done so far. Lemmas 3.1 and 3.2 show that the minimizing w(t) is the same whether or not the system is subjected to an input stochastic white noise. However, the resulting minimum costs differ, as given by (8) and by x0TPo(0)x0 (implicit in (12)), respectively. Then, from Lemma 3.3 through Proposition 3.3, we transformed the problem of finding P(t) for (6) into an equivalent LMI problem.
4. Finite-horizon state regulation
Let u = Kx where K is the state feedback controller. Let Acl = A + BuK. Let
$$L = -(\Pi + K^T R K) \tag{22}$$
Then it is clear
$$J_u = E\left\{\int_0^T x^T(\Pi + K^T R K)x\,dt + x(T)^T Q x(T)\right\} \tag{23}$$
We are interested in finding a w(t) to maximize Ju. For this purpose, we consider
$$-\gamma^{-2}\left[\int_0^T x^T(\Pi + K^T R K)x\,dt + x(T)^T Q x(T)\right] + \|w\|_T^2 - \|w\|_T^2 \tag{24}$$
Let
$$\frac{dP(t)}{dt} = -P(t)A_{cl} - A_{cl}^T P(t) + P(t)B_w B_w^T P(t) + \gamma^{-2}(\Pi + K^T R K), \qquad P(T) = -\gamma^{-2} Q \tag{25}$$
Lemma 4.1: Let Acl = A + BuK and Po be given by (25) for some chosen value of γ2. Then for the system (2), the state feedback control law u = Kx achieves the closed-loop performance, for all admissible w(t):
$$J_u \le \gamma^2\|w\|_T^2 - \mathrm{Tr}[P_o(0)X_0] - \int_0^T \mathrm{Tr}[P_o(t)\Theta]\,dt \tag{26}$$
and the worst-case wo achieves
$$J_u = \gamma^2\|w_o\|_T^2 - \mathrm{Tr}[P_o(0)X_0] - \int_0^T \mathrm{Tr}[P_o(t)\Theta]\,dt \tag{27}$$
Proof: From Lemma 3.1,
$$\min_w\left\{-\gamma^{-2}\int_0^T x^T(\Pi + K^T R K)x\,dt + \|w\|_T^2\right\} - \|w\|_T^2 = \mathrm{Tr}[P_o(0)X_0] + \int_0^T \mathrm{Tr}[P_o(t)\Theta]\,dt - \|w\|_T^2$$
Multiplying throughout by −γ2, we obtain:
$$\max_w\{J_u - \gamma^2\|w\|_T^2\} + \gamma^2\|w\|_T^2 = -\gamma^2\left(\mathrm{Tr}[P_o(0)X_0] + \int_0^T \mathrm{Tr}[P_o(t)\Theta]\,dt\right) + \gamma^2\|w\|_T^2 \tag{28}$$
This completes the proof.
Remark 4.2: From (26), we recover the conventional H∞ inequality [4, 12]:
$$J_u \le \gamma^2\|w\|_T^2 \quad \text{if } x_0 = 0 \text{ and } \Theta = 0 \tag{29}$$
Corollary 4.1: Let Acl = A + BuK and Po be given by Lemma 4.1. If wo = −BwTPo(t)x(t), then
$$J_u = \gamma^2\|w_o\|_T^2 - \mathrm{Tr}[P_o(0)X_0] - \int_0^T \mathrm{Tr}[P_o(t)\Theta]\,dt \tag{30}$$
Proof: Obvious from (28) and Lemma 3.1.
To derive the optimal controller for Lemma 4.1, we substitute Acl = A + BuK into (6) with P(T) = −γ−2Q. Thus we obtain
$$\frac{dP(t)}{dt} + P(t)(A + B_u K) + (A + B_u K)^T P(t) - P(t)B_w B_w^T P(t) - \gamma^{-2}(\Pi + K^T R K) = 0 \tag{31}$$
From (27), it is clear that the "larger" Tr[Po(0)X0] and ∫T Tr[Po(t)Θ]dt are, the smaller Ju would be. This implies that we want to find the least negative-definite Po(t) for all t < T. Proposition 3.2 implies that we should search for the Po(t) with the least negative Tr[P(t)]. Also, since we will be working backward in time when solving (31), we want to find a K(t) which minimizes Tr[dP/dt] at every instant t (principle of optimality), so as to maximize Tr[Po(t)] (Propositions 3.1 and 3.2). To achieve this, we note
$$\frac{dP(t)}{dt} = -P(t)(A + B_u K) - (A + B_u K)^T P(t) + P(t)B_w B_w^T P(t) + \gamma^{-2}(\Pi + K^T R K) \tag{32}$$
Differentiating Tr[dP/dt] with respect to K and setting the result to zero gives:
$$0 = -2P(t)B_u + 2\gamma^{-2}K^T R \tag{33}$$
$$K = \gamma^2 R^{-1}B_u^T P, \qquad \frac{d^2\,\mathrm{Tr}[dP/dt]}{dK^2} = 2\gamma^{-2}R > 0 \tag{34}$$
Substituting (34) into (32), we obtain
$$\frac{dP}{dt} = -PA - A^T P + P(B_w B_w^T - \gamma^2 B_u R^{-1}B_u^T)P + \gamma^{-2}\Pi \tag{35}$$
If we let γ−2Pγ = –P, then we may show that (35) is equivalent to
$$-\frac{dP_\gamma}{dt} = P_\gamma A + A^T P_\gamma - P_\gamma(B_u R^{-1}B_u^T - \gamma^{-2}B_w B_w^T)P_\gamma + \Pi \tag{36}$$
Note that (36) is the same RDE obtained in the conventional approach to H∞ control with R = I and a zero initial state ([4, 12]). We have thus established the connection between (35) and (36). However, in our derivation, we are able to concurrently derive a quantitative cost corresponding to the solution based on the initial condition and the simultaneous disturbances, as is the case for stochastic LQR control.
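The stated equivalence can be confirmed numerically: under the change of variable Pγ = −γ2P, the value of −dPγ/dt computed from the right-hand side of (36) must equal γ2 times the right-hand side of (35) at every P. The matrices below are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, mu, mw = 3, 2, 2
A = rng.standard_normal((n, n))
Bu = rng.standard_normal((n, mu))
Bw = rng.standard_normal((n, mw))
R = 2.0 * np.eye(mu)
Pi = 3.0 * np.eye(n)
g2 = 5.0                                   # gamma^2
S = rng.standard_normal((n, n))
P = -(S @ S.T)                             # P <= 0, as in (35)

Rinv = np.linalg.inv(R)
# dP/dt from the right-hand side of (35)
dP = (-P @ A - A.T @ P
      + P @ (Bw @ Bw.T - g2 * Bu @ Rinv @ Bu.T) @ P
      + Pi / g2)

# Change of variable: P_gamma = -gamma^2 P
Pg = -g2 * P
# -dP_gamma/dt from the right-hand side of (36)
neg_dPg = (Pg @ A + A.T @ Pg
           - Pg @ (Bu @ Rinv @ Bu.T - Bw @ Bw.T / g2) @ Pg
           + Pi)

print(np.max(np.abs(neg_dPg - g2 * dP)))  # ~0: the two RDEs coincide
```

Since dPγ/dt = −γ2 dP/dt, the two right-hand sides must agree after scaling by γ2, which the check confirms to machine precision.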
Theorem 4.1: Let Acl = A + BuK and Po be given by (35) with P(T) = −γ−2Q. Then K = γ2R−1BuTP achieves
$$J_u \le \gamma^2\|w\|_T^2 - \mathrm{Tr}[P_o(0)X_0] - \int_0^T \mathrm{Tr}[P_o(t)\Theta]\,dt \tag{37}$$
and the worst-case wo achieves
$$J_u = \gamma^2\|w_o\|_T^2 - \mathrm{Tr}[P_o(0)X_0] - \int_0^T \mathrm{Tr}[P_o(t)\Theta]\,dt \tag{38}$$
Proof: Substituting K = γ2R−1BuTP into (25) and applying Lemma 4.1 completes the proof.
To compute ||w0||T, the system with w0 = –BwTP0(t)x(t) is given by
$$\frac{dx}{dt} = Ax + B_u Kx - B_w B_w^T P_o x + v = (A + B_u K - B_w B_w^T P_o)x + v \triangleq A_w x + v \tag{39}$$
Define
$$E\{x(t)x(t)^T\} = X_t \tag{40}$$
Then Xt satisfies [12]:
$$\frac{dX_t}{dt} = A_w X_t + X_t A_w^T + \Theta, \qquad X(0) = X_0 \tag{41}$$
$$E\|w_o\|_T^2 = E\int_0^T x(t)^T P_o B_w B_w^T P_o x(t)\,dt = \mathrm{Tr}\int_0^T P_o B_w B_w^T P_o X_t\,dt \tag{42}$$
To check whether the selected value of γ2 is the correct one, we require γ2 to be such that ‖wo‖T2 = 1. This may be achieved by searching over values of γ2. We note that ‖wo‖T2 is generally a decreasing function of γ2, as is already known from conventional H∞ control theory.
Once the desired value of γ2 and hence P(t), 0 ≤ t ≤ T, are found, we may then compute the cost via Theorem 4.1. We summarize the results in the following conceptual algorithm:
Conceptual Algorithm 1.
Step 1: Choose a value of γ2 and compute P(t) with (35). If P(t) does not exist over the entire time interval [0, T], the chosen γ2 is too small. Set lower bound γl2 = γ2, increase γ2, and repeat this step.
Step 2: If P(t) exists over the entire time interval [0, T], compute ||w0||T2. If ||w0||T2 = 1, we have the desired solution. The optimal state control law is given by (34).
Step 3: If ||w0||T2 > 1, the chosen γ2 is too small. Set lower bound γl2 = γ2, increase γ2, and go to step 1.
Step 4: If ||w0||T2 < 1, the chosen γ2 is too large. Set upper bound γu2 = γ2, decrease γ2, and go to step 1.
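The steps above can be sketched numerically for a scalar plant (the plant data match the example of Section 7; the finite horizon T = 1 and the Euler discretization are illustrative choices): integrate (35) backward for P(t), propagate (41) forward for Xt under the worst-case disturbance, evaluate ‖wo‖T2 via (42), and bisect on γ2.

```python
import numpy as np

# Scalar plant dx/dt = a x + bu u + bw w + v (data as in Section 7's example)
a, bu, bw = 1.0, 1.0, 2.0
Pi, R, Q = 100.0, 1.0, 0.0
Theta, X0 = 1.0, 1.0
T, N = 1.0, 4000              # illustrative finite horizon, Euler grid
dt = T / N

def norm_w_sq(g2):
    """||w_o||_T^2 for a given gamma^2, or None if P(t) escapes (gamma^2 too small)."""
    # Backward pass, Eq. (35): dP/dt = -2aP + (bw^2 - g2 bu^2/R) P^2 + Pi/g2
    P = np.empty(N + 1)
    P[N] = -Q / g2                             # P(T) = -Q/gamma^2
    for k in range(N - 1, -1, -1):
        dP = -2*a*P[k+1] + (bw**2 - g2*bu**2/R)*P[k+1]**2 + Pi/g2
        P[k] = P[k+1] - dt * dP
        if abs(P[k]) > 1e6:
            return None                        # finite escape time
    # Forward pass, Eq. (41), under worst-case w_o = -bw P x
    K = g2 * bu * P / R                        # feedback gain, Eq. (34)
    Aw = a + bu*K - bw**2 * P                  # Eq. (39), scalar
    X = np.empty(N + 1)
    X[0] = X0
    for k in range(N):
        X[k+1] = X[k] + dt * (2*Aw[k]*X[k] + Theta)
    # Eq. (42): ||w_o||_T^2 = int_0^T Tr[P bw bw' P X] dt
    return float(np.sum(bw**2 * P[:N]**2 * X[:N]) * dt)

# Bisection on gamma^2 so that ||w_o||_T^2 = 1 (Steps 1-4)
lo, hi = 1.0, 200.0
for _ in range(50):
    mid = 0.5 * (lo + hi)
    nw = norm_w_sq(mid)
    if nw is None or nw > 1.0:
        lo = mid                               # gamma^2 too small
    else:
        hi = mid                               # gamma^2 too large
g2_star = 0.5 * (lo + hi)
print(g2_star, norm_w_sq(g2_star))
```

The optimal state-feedback gain along the horizon is then K(t) = γ2R−1BuTP(t) as in (34).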
Remark 4.1: It is important to note the presence of Θ in (41). In other words, v does have an impact on the value of Xt and hence on wo. This distinguishes our approach from previous mixed H2/H∞ solutions, which do not explicitly take Θ into consideration.
5. Infinite-horizon state regulation
It is well known from conventional H∞ control theory [4, 12, 15] that if γ2 is large enough, the feedback gain approaches a steady-state value K∞ far from the final time. In such a case, the feedback system approaches a linear time-invariant system. Unfortunately, this is not the steady-state solution for us, at least not for the problem formulated in Section 2. This is because as T → ∞ and the system approaches steady state, limT→∞ Tr∫TΘPo(t)dt → limT→∞ Tr[TΘPo(∞)] → −∞ dominates the other two cost components, namely Tr[Po(∞)X0] and limT→∞‖w‖T = ‖w‖L2 < ∞. It follows that for formulations which require ‖w‖L2 < ∞ and finite x0, the optimal infinite-horizon solution is indeed the LQR, since it minimizes Tr[ΘPo(∞)] with γ−2 = 0.
Instead, we shall explore an alternative formulation. First, we note that the existence of a steady-state solution P∞ of the algebraic Riccati equation means that P∞ solves the RDE with −γ−2Q = P(t) = P∞, 0 ≤ t ≤ T. Substituting this into the right-hand side of (26), we obtain and define
$$\eta_{ss} \le \gamma^2 E\|w\|_T^2 - \mathrm{Tr}[P_\infty X_0 + T\Theta P_\infty] \tag{43}$$
where
$$\eta_{ss} \triangleq E\{\|\Pi^{1/2}x\|_T^2 + \|R^{1/2}u\|_T^2\} - \gamma^2 E\{x(T)^T P_\infty x(T)\} \tag{44}$$
Dividing (43) by T gives
$$\eta_{ss}/T \le \gamma^2 E\|w\|_T^2/T - \mathrm{Tr}[P_\infty X_0]/T - \mathrm{Tr}[\Theta P_\infty] \tag{45}$$
Assume limT → ∞E{||w||T2}/T = p2. Note that p2 may be interpreted as “power”. We obtain the following result:
Lemma 5.1: Let P∞, K∞, and Acl = A + BuK∞ be the steady-state solutions corresponding to (35) with T → ∞. If A + BuK∞–BwBwTP∞ is stable, then K∞ achieves the infinite-horizon performance
$$\eta_\infty \triangleq \lim_{T\to\infty} \eta_{ss}/T \le \gamma^2 p^2 - \mathrm{Tr}[\Theta P_\infty] \tag{46}$$
Proof: If A + BuK∞ − BwBwTP∞ is stable, then letting Q = −γ2P∞ makes P(t) = P∞ exist for all t < ∞. It follows that (45) holds. Noting that limT→∞ Tr[P∞X0]/T = 0 completes the proof.
Remark 5.1: Substituting limT→∞E{‖w‖T2}/T = p2 into (46) and setting Θ = 0 shows that the H∞ norm of the closed-loop map from w to ηss is bounded by γ. Similarly, setting w = 0 shows that the steady-state LQR cost averaged over T as T → ∞ would be γ2Tr{−ΘP∞}.
Finally, since A + BuK − BwBwTP∞ is stable, there exists an X∞ which satisfies the following Lyapunov equation:
$$(A + B_u K - B_w B_w^T P_\infty)X_\infty + X_\infty(A + B_u K - B_w B_w^T P_\infty)^T + \Theta = 0 \tag{47}$$
So wo = −BwTP∞x gives
$$E\|w_o\|_T^2 = \int_0^T \mathrm{Tr}[(B_w^T P_\infty)^T(B_w^T P_\infty)X_\infty]\,dt = \mathrm{Tr}[(B_w^T P_\infty)^T(B_w^T P_\infty)X_\infty]\,T \tag{48}$$
since Bw, P∞, and X∞ are all constant matrices. Dividing both sides of (48) by T gives
$$\lim_{T\to\infty} E\|w_o\|_T^2/T = \mathrm{Tr}[(B_w^T P_\infty)^T(B_w^T P_\infty)X_\infty] \tag{49}$$
where X∞ is given by (47).
We may further assume that limT→∞ E{‖w‖T2}/T ≤ 1, and that Θ is unknown with worst-case Tr Θ ≤ σ, where σ ≥ 0. This corresponds to the system being subject to both a worst-case w(t) and a worst-case v(t), where v(t) belongs to the class of white noises satisfying E{vTv} ≤ σ. The controller designed may then be interpreted as one which optimizes the expected worst-case performance in the presence of both bounded-power deterministic disturbances and stochastic white noises. We summarize the procedure for finding such a controller in Conceptual Algorithm 2.
Conceptual Algorithm 2.
Step 1: Guess a value of γ2 and compute P∞ from (35) with dP/dt = 0. If no solution P∞ < 0 exists, the chosen γ2 is too small. Set the lower bound γl2 = γ2, increase γ2, and repeat this step.
Step 2: If P∞ exists, find the Θo which solves min{Tr[P∞Θ] : Tr Θ = σ}. Compute X∞ from (47) with Θ = Θo and compute p2 = limT→∞E‖wo‖T2/T via (49). If p2 = 1, we have the desired solution.
Step 3: If p2 > 1, the chosen γ2 is too small. Set the lower bound γl2 = γ2, increase γ2, and go to Step 1.
Step 4: If p2 < 1, the chosen γ2 is too large. Set the upper bound γu2 = γ2, decrease γ2, and go to Step 1.
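For a scalar plant, Conceptual Algorithm 2 reduces to a quadratic equation for P∞, the scalar version of the Lyapunov equation (47), and a bisection on γ2. The sketch below uses the plant data of the example in Section 7, and takes Θ as known (Θ = σ = 1), so the minimization over Θ in Step 2 is trivial.

```python
import numpy as np

# Scalar plant dx/dt = a x + bu u + bw w + v (data as in Section 7's example)
a, bu, bw = 1.0, 1.0, 2.0
Pi, R, Theta = 100.0, 1.0, 1.0

def steady_state(g2):
    """Solve (35) with dP/dt = 0; return (P_inf, p2) or None if no valid root."""
    # 0 = (bw^2 - g2 bu^2/R) P^2 - 2a P + Pi/g2   -- quadratic in P
    c2 = bw**2 - g2 * bu**2 / R
    for P in sorted(np.real(np.roots([c2, -2.0 * a, Pi / g2]))):
        K = g2 * bu * P / R                  # feedback gain, Eq. (34)
        Aw = a + bu * K - bw**2 * P          # worst-case closed loop
        if P < 0.0 and Aw < 0.0:             # negative, stabilizing root
            X = -Theta / (2.0 * Aw)          # scalar Lyapunov equation (47)
            p2 = bw**2 * P**2 * X            # power of w_o, Eq. (49)
            return P, p2
    return None

# Bisection on gamma^2 so that p^2 = 1 (Steps 1-4)
lo, hi = 4.0, 100.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    ss = steady_state(mid)
    if ss is None or ss[1] > 1.0:
        lo = mid                             # gamma^2 too small
    else:
        hi = mid                             # gamma^2 too large
g2_star = 0.5 * (lo + hi)
P_star, p2_star = steady_state(g2_star)
print(g2_star, P_star, p2_star)
```

At γ2 = 8 this sketch gives P∞ ≈ −2.035, matching the −Po∞ entry of Table 1; the p2 values differ somewhat from the table, as this is only an illustrative version of the iteration.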
6. H∞ control with output feedback
In this section, we consider the case where the state of the system is not available for feedback. In this case, we consider dynamical controllers of the form u(t) = K(t)*z(t) where z(t) denotes the measured output available to the controller and * denotes the convolution operator. Let
$$z(t) = C_z(t)x \tag{50}$$
Assume that the state equation of the controller is given by
$$\frac{dx_K}{dt} = A_K(t)x_K + B_K(t)z \tag{51}$$
$$u = C_K(t)x_K + D_K(t)z = C_K x_K + D_K C_z x \tag{52}$$
The closed-loop system may then be represented by the augmented state-space model with xcl ≜ [xT xKT]T, [3, 16]:
$$\frac{dx_{cl}}{dt} = A_{cl}x_{cl} + B_{cl}w + [I\ 0]^T v \triangleq A_{cl}x_{cl} + B_{cl}w + \zeta \tag{53}$$
$$A_{cl} = A_p + BGC \tag{54}$$
$$B_{cl} = [B_w^T\ 0]^T \tag{55}$$
$$A_p = \begin{bmatrix} A & 0 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} B_u & 0 \\ 0 & I \end{bmatrix}, \quad C = \begin{bmatrix} C_z & 0 \\ 0 & I \end{bmatrix}, \quad G = \begin{bmatrix} D_K & C_K \\ B_K & A_K \end{bmatrix} \tag{56}$$
Although in the above formulation the dimensions of A and AK need not be the same, we shall assume they are, for simplicity and better performance. We shall also set xK(0) = E{x0} = 0. Note that the covariance of the stochastic input in (53) is E{ζζT} = [I 0]TΘ[I 0].
u(t) in (52) may be further re-written as
$$u = (YGV^TV + YGY^TC_zY)x_{cl} \tag{57}$$
since
$$D_K = [I\ 0]\,G\,[I\ 0]^T \triangleq YGY^T \tag{58}$$
$$C_K = [I\ 0]\,G\,[0\ I]^T \triangleq YGV^T \tag{59}$$
Substitute (57) into the cost functional:
$$x^T\Pi x + u^TRu = x_{cl}^T\left[Y^T\Pi Y + (YGV^TV + YGY^TC_zY)^T R\,(YGV^TV + YGY^TC_zY)\right]x_{cl} \tag{60}$$
and define
$$Q_z \triangleq Y^T\Pi Y + (YGV^TV + YGY^TC_zY)^T R\,(YGV^TV + YGY^TC_zY)$$
Then it follows
$$x^T\Pi x + u^TRu = x_{cl}^T Q_z x_{cl} \tag{61}$$
Now, substituting all these parameters into (25), we obtain
$$\frac{dP(t)}{dt} + P(t)A_{cl} + A_{cl}^TP(t) - P(t)B_{cl}B_{cl}^TP(t) - \gamma^{-2}Q_z = 0, \qquad P(T) = -\gamma^{-2}Y^TQY \tag{62}$$
Since Acl and YGVTV + YGYTCzY are affine in G, (62) may be transformed into an LMI in G:
$$\begin{bmatrix} \Gamma + PA_{cl} + A_{cl}^TP - \gamma^{-2}Y^T\Pi Y & PB_{cl} & (YGV^TV + YGY^TC_zY)^T \\ B_{cl}^TP & I & 0 \\ YGV^TV + YGY^TC_zY & 0 & \gamma^2 R^{-1} \end{bmatrix} \ge 0 \tag{63}$$
Hence, (62) may be solved by finding, at each time step t, Γ(t) and G(t) from the following optimization problem:
$$\min_G \mathrm{Tr}\,\Gamma \quad \text{subject to (63)} \tag{64}$$
With Γ(t) = dP/dt found, P(t − ε) may be computed (note that a minimum dP/dt gives a maximum P(t − ε) given P(t)). Hence, we may iteratively work backward (numerically) toward time 0 to construct P(t) and G(t).
With G(t) found, the worst case w may be found as
$$w_o(t) = -B_{cl}^TP_o(t)x_{cl}(t) \tag{65}$$
and let
$$A_w = A_p + BGC - B_{cl}B_{cl}^TP_o \tag{66}$$
The rest of the development is hence similar to the full information case and left to the reader.
Conceptual Algorithm 3.
Step 1: Choose a value of γ2 and find G(t) and P(t) via (63)–(64). With G(t), compute Acl and Bcl from (54) and (55), respectively.
Step 2: With the P(t) obtained, compute ||w0||T2 with xcl(0) = 0. If ||w0||T2 = 1, we have the optimal solution with CK and DK defining the optimal control law (52).
Step 3: If ||w0||T2 > 1, the chosen γ2 is too small. Set lower bound γl2 = γ2, increase γ2, and go to step 1.
Step 4: If ||w0||T2 < 1, the chosen γ2 is too large. Set upper bound γu2 = γ2, decrease γ2, and go to step 1.
Another variant output feedback problem is if we assume
$$z(t) = C_z(t)x + D_w w + m \tag{67}$$
where we assume DwBwT = 0, and m is a stationary zero-mean white noise process with E{mmT} = Θm and E{mvT} = 0. In this case, we may consider a strictly causal controller
$$\frac{dx_K}{dt} = A_K(t)x_K + B_K(t)z, \qquad u = C_K(t)x_K \quad (\text{i.e., } D_K = 0) \tag{68}$$
Besides mathematical simplicity, choosing a strictly causal controller when the plant is not strictly proper may also be desirable for robustness, because a strictly proper return ratio is generally more robust against unmodeled high-frequency dynamics.
The augmented system in this case is given by:
$$\frac{dx_{cl}}{dt} = A_{cl}x_{cl} + B_{cl}w + B_\xi\xi, \qquad B_\xi = \big[\,[I\ 0]^T \quad BG[I\ 0]^T\,\big] \tag{69}$$
where Acl and G are given by (54)–(56) with DK = 0,
$$\xi = [v^T\ m^T]^T, \qquad E\{\xi\xi^T\} = \mathrm{diag}[\Theta, \Theta_m] \tag{70}$$
where diag[Θ, Θm] denotes a block-diagonal matrix with diagonal sub-blocks Θ and Θm.
$$B_{cl} = [B_w^T\ 0]^T + BG[D_w^T\ 0]^T \tag{71}$$
$$u = C_K x_K = C_K V x_{cl} \tag{72}$$
Thus
$$\|\Pi^{1/2}x(t)\|_T^2 + \|R^{1/2}u(t)\|_T^2 = E\int_0^T x_{cl}^T(Y^T\Pi Y + V^TC_K^TRC_KV)x_{cl}\,dt \tag{73}$$
Hence, we may solve the problem by letting
$$Q_z = Y^T\Pi Y + V^TC_K^TRC_KV \tag{74}$$
and substituting (54), (71), and (56) with DK = 0, together with (74), into (62) to obtain the appropriate RDE. The rest of the details and steps are similar to the Dw = 0 case above and are left to the reader.
7. Example
We consider the following example, modified from ([11], p. 335):
$$\frac{dx}{dt} = x + u + 2w + v \tag{75}$$
and
$$\Pi = 100, \quad R = 1, \quad Q = 0.$$
We further assume
$$E\{v\} = 0, \quad E\{vv^T\} = 1, \quad E\{x_0\} = 0, \quad E\{x_0x_0^T\} = 1 \tag{76}$$
We shall apply Conceptual Algorithm 2 to find a linear time-invariant controller that minimizes the infinite-horizon performance index η∞ ≜ limT→∞(ηss/T) of Section 5. By iterating with different values of γ2, the steady-state values of the relevant parameters are tabulated in Table 1 below:
| γ² | −Po∞ | X∞ | E‖w‖₂² | η∞ | η∞/p² | η∞/(p² + q²) |
|---|---|---|---|---|---|---|
| 4.5 | 8.44 | 0.118 | 33.6 | 42.04 | 1.25 | 1.22 |
| 5 | 5.58 | 0.094 | 11.74 | 17.32 | 1.47 | 1.35 |
| 8 | 2.035 | 0.063 | 1.03 | 3.065 | 2.97 | 1.51 |
| 10 | 1.468 | 0.058 | 0.50 | 1.968 | 3.94 | 1.31 |
Table 1.
Iterations with different values of γ2.
where q2 ≜ TrΘ = 1 (note also p2 ≜ E‖w‖22). It can be seen that the desired solution occurs at about γ2 = 8. We also see that the values of limT→∞E‖w‖2 and Θ determine the desirable value of γ2 to be used. Note also that while η∞/p2 monotonically increases with γ2, η∞/(p2 + TrΘ) has a maximum with respect to γ2, which corresponds to the worst-case combination of w and v from a power-ratio point of view. Note that p2 + TrΘ is a measure of the combined power of the two disturbances.
8. Conclusion
We have considered the H∞ state regulation problem in which the initial state of the plant is possibly nonzero and the system is subject to a worst-case nonzero-mean exogenous input and a zero-mean white noise disturbance. It should be clear that the two signals may be combined to represent a nonzero-mean disturbance consisting of both high- and low-frequency components.
We note that our results were derived from the Riccati differential equation related to the LQR, unlike most existing approaches, which are based on differential game theory. Rather than looking for a minimax saddle point as a solution, we derive our solution by maximizing P(0) of the Riccati differential equation via Bellman's principle of optimality; P(0) is directly related to the value of the cost functional when the initial state is nonzero. This should be familiar to anyone acquainted with LQR theory. We believe that this approach offers a new (in addition to existing works such as [1, 2, 3, 4, 5]), and arguably also simpler, perspective on the solution of the H∞ control problem and of H∞ control with stochastic disturbances (compare [5, 8, 9, 10] for example). In fact, it is the recognition that the H∞ and LQR problems may be solved via the same Riccati equation that enables us to develop the present simple way of incorporating a stochastic exogenous input into H∞ control, in addition to nonzero initial conditions, based on well-known stochastic LQR theory. We anticipate that this approach will enable control engineers to extend many H∞ results to include stochastic disturbances, capitalizing on, and drawing from, the vast pool of results already developed for stochastic LQR/LQG. Hence, we believe that this is one of the main contributions of the present chapter. Note that because we are still working with deterministic system matrices, established and well-known techniques such as the singular-value method or the small-μ test are directly applicable to analyze the designed system [15].
We further note that the worst-case deterministic disturbance is of the form w(t) = −BwTPo(t)x(t). This is useful because, if we assume the uncertain A coefficient matrix of the open-loop system to be A + Δ in the absence of deterministic disturbances, then the resultant system may be modeled as dx/dt = Ax + w with w = Δx. It follows that if PoBwBwTPo ≥ ΔTΔ for all admissible Δ, then the controller obtained by our approach also guarantees the performance of the closed-loop system with performance bound (46)–(48) when the open-loop A matrix is subject to the perturbation Δ. Hence, Bw may here be construed as a free parameter/weight matrix which, together with γ2, we may vary to search for and construct a Po that satisfies (35) with dP/dt = 0 and PoBwBwTPo ≥ ΔTΔ for all admissible Δ (for example, if Δ is assumed to be an unstructured perturbation [15], we may assume σ2I ≥ ΔTΔ, where σ denotes the singular-value bound), thus solving the robust stochastic control problem with an uncertain A-matrix. Note that if we let WB = PoBwBwTPo, then for any Po obtained we may find Bw = (Po−1WBPo−1)1/2. Thus, we may substitute PoBwBwTPo = σ2I into (35) with dP/dt = 0 to obtain
$$0 = -PA - A^TP + \sigma^2 I - \gamma^2 PB_uR^{-1}B_u^TP + \gamma^{-2}\Pi \tag{77}$$
as the Riccati equation to be searched over γ2 for solving the robust performance control problem with unstructured A-matrix perturbations satisfying ΔTΔ ≤ σ2I.
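A scalar sketch of this search (all numbers are illustrative assumptions): for a chosen γ2 and singular-value bound σ, (77) becomes a quadratic in P whose negative root gives the robust-performance solution, and the corresponding Bw follows from Bw = (Po−1WBPo−1)1/2.

```python
import numpy as np

# Illustrative scalar data (assumptions, not from the chapter)
a, bu, r, Pi = 1.0, 1.0, 1.0, 100.0
sigma2 = 0.25            # unstructured bound Delta' Delta <= sigma^2 I
g2 = 8.0                 # chosen gamma^2

# Scalar form of (77): 0 = -2aP + sigma^2 - g2 (bu^2/r) P^2 + Pi/g2
roots = np.roots([-g2 * bu**2 / r, -2.0 * a, sigma2 + Pi / g2])
P = min(np.real(roots))                     # negative (definite) root

residual = -2*a*P + sigma2 - g2*(bu**2/r)*P**2 + Pi/g2
bw = np.sqrt(sigma2) / abs(P)               # scalar Bw = (P^-1 W_B P^-1)^(1/2)
print(P, residual, bw)
```

In the scalar case PoBwBwTPo = σ2I collapses to bw = σ/|Po|, so the weight Bw implied by the chosen σ can be read off directly once P is found.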
Of course, assuming that just the A matrix may be uncertain is unrealistic in practice. But it does suggest a possibly fruitful approach to address the “worst-case performance when subject to system matrices perturbations” problem if we are able to extend the approach to address uncertainties in the input and output matrices as well. This will be a direction of our future research.