Open access peer-reviewed chapter - ONLINE FIRST

Optimal Stochastic H∞ Controller Synthesis via LQR Theory

Written By

Yung Kuan Foo and Yeng Chai Soh

Submitted: 28 February 2024 Reviewed: 28 February 2024 Published: 04 June 2024

DOI: 10.5772/intechopen.1004889


From the Edited Volume

Stochastic Processes - Theoretical Advances and Applications in Complex Systems [Working Title]

Prof. Don Kulasiri


Abstract

Existing published works on H∞ and mixed H∞/H2 control generally solve the problem via game theory and saddle-point solutions. We provide an alternative solution via the stochastic LQR approach. Specifically, we consider the H∞ state regulation problem for continuous-time systems subject to both energy-bounded deterministic disturbances and stochastic white noise, with an uncertain initial system state x(0) satisfying E{x(0)} = 0 and E{x(0)x(0)^T} ≤ X_0. We derive feedback laws which minimize sup_{||w||_T ≤ 1} E{||Π^{1/2}x(t)||_T² + ||R^{1/2}u(t)||_T²}, where Π and R are positive-definite weight matrices. We further demonstrate how the approach may be extended to solve the robust stochastic LQR control problem with uncertainty in the system A matrix.

Keywords

  • H∞ control
  • LQ regulator
  • stochastic regulator
  • robust control
  • mixed H2/H∞ control

1. Introduction

Existing approaches to solving the H∞ control problem typically pose it as a differential game whose solution is obtained by finding the minimax saddle point; see [1, 2, 3, 4, 5, 6, 7], for example. This approach is intuitive from the H∞ perspective, since the problem is indeed mathematically a minimax problem. However, when it was found that the resulting Riccati equations are very similar to the standard Riccati equations obtained from related LQR problems with the same system matrices, tremendous interest was generated in how to reconcile and combine the two approaches in a fruitful way. This gave rise to the once-very-hotly investigated mixed H2/H∞ control problem [1, 8, 9, 10]. Specifically, the system model considered is of the form:

dx(t)/dt = A x(t) + B_u u(t) + E_2 w_2(t) + E_∞ w_∞(t)
z_2(t) = C_2 x(t) + D_2 w_2(t)
z_∞(t) = C_∞ x(t) + D_∞ w_∞(t),  x(0) = 0    (1)

Let T_2 and T_∞ denote the closed-loop transfer functions from w_2 to z_2 and from w_∞ to z_∞, respectively. The problem was to minimize the H2 norm of T_2 subject to the H∞ norm of T_∞ being smaller than some prescribed positive value γ. Clearly, if w_∞ = 0, then the H2 norm gives a performance measure on the effect of w_2. Conversely, if w_2 = 0, then the H∞ norm gives a performance bound on the effect of w_∞. But what if w_2 and w_∞ are both nonzero? The main focus of such an approach is usually to optimize the H2 norm for performance, and the H∞ norm bound is more for "robustness" against model uncertainties than for worst-case performance. Thus, we start with a signal-based worst-case control problem but eventually end up with a "robust" (bound on the H∞ norm) LQR control solution. Such an approach is justifiable, but an implicit implication is that the performance is optimized only for the H2 norm, and hence it is not, at least not expressly, about worst-case performance optimization. Another fact to note is that although we can guarantee the H∞ bound of the closed-loop system by solving the relevant Riccati equation (Eq. (36)), such a bound is generally not tight. This can be seen by noting that as we set γ⁻² → 0, we recover the LQR. Obviously, the H∞ norm of the closed-loop system corresponding to an LQR cannot be ∞; so it is clear that as we set γ² → ∞, the closed-loop H∞ norm does not approach ∞, and hence γ does not equal the closed-loop H∞ norm in general. An implication of this fact is that the controller obtained by solving Eq. (36) with γ² = γ_o² is not one that minimizes the closed-loop H2 norm subject to the H∞ norm being less than or equal to γ_o. For example, suppose the closed-loop H∞ norm of the pure LQR control system is γ_LQR. Then this LQR is also the optimal solution to the minimum closed-loop H2 norm control problem subject to any closed-loop H∞ norm bound larger than γ_LQR, and this solution ought to be unique. But if we solve Eq. (36) with any finite γ > γ_LQR, we will not obtain the LQR. This implies that the solution so obtained cannot be optimal.

Other salient features of such an approach are: (i) the performance considered is usually (with some exceptions, such as [7]) over the infinite time horizon; (ii) the initial state is zero; and (iii) as already pointed out above, the performances with respect to the H2 and H∞ criteria are, loosely speaking, in a mutually exclusive "either-or" situation, in that the combined effect when both the stochastic and the deterministic disturbances are present simultaneously is not considered. In other words, while the H∞ norm gives a bound on the worst-case performance of the closed-loop system when there is a deterministic disturbance (but no stochastic noise), and the H2 norm indicates the performance in the presence of stochastic noise (but in the absence of deterministic disturbances), the combined effect on the closed-loop system when both types of exogenous disturbances are present is not explicitly and quantitatively addressed. Another problem we may identify is that the H2 norm obtained in a mixed H2/H∞ design actually refers to the nominal model. When the system matrices are subject to perturbations, this H2 norm is also subject to change. Hence, the question of what the worst-case performance of the closed-loop system would be when the nominal model has been perturbed remains to be answered. In other words, stability robustness may be established, but nothing quantitative has been said about worst-case performance.

In this chapter, we revisit the mixed H2/H∞ control problem from a different perspective. Specifically, we pose the original problem as a stochastic H∞ control problem where the system is subject simultaneously to possibly "worst-case" (bounded) deterministic disturbances and to stochastic zero-mean white noises. We are interested in minimizing the worst-case combined effect of these disturbances, along with an uncertain nonzero initial condition [6, 11], on the closed-loop performance of the system. It should be clear that such problems arise when one is interested in driving the state of a system from some uncertain nonzero value to the origin in some optimal way, in the presence of both high- and low-frequency noises. Note that while we model the disturbances as two separate noises, they could actually be one single disturbance consisting of both low- and high-frequency components, thus modeling a nonzero-mean disturbance.

Our results here will be derived from the Riccati differential equation related to the finite-horizon LQR problem, unlike most existing approaches, which are based on differential game theory. Rather than looking for a minimax saddle point as a solution, we minimize the expected worst-case cost and derive our solution by explicitly working with P(0) of the finite-horizon LQR Riccati equation, which is directly related to the value of the cost functional concerned when the initial state is nonzero. Clearly, if we assume the deterministic noise to be absent, the problem reduces to a stochastic LQR problem, and if we assume the stochastic white noise to be absent, we recover the standard H∞ control problem. We believe that this approach offers a new (in addition to existing works such as [1, 2, 3, 4, 5], for example), and arguably also simpler, perspective on the solution to the H∞ control problem, and to H∞ control with stochastic disturbances (compare [5, 8, 9, 10], for example).

It should be noted that the problem we consider here is different from the problem considered in some of the existing stochastic H∞ control literature. For example, in [5, 10], the stochastic H∞ approach was to incorporate stochastic disturbances into the system matrices to obtain a stochastic system with multiplicative white noise perturbations. That approach is quite different from the standard and familiar stochastic LQR/LQG approach, where the system matrices are assumed to be deterministic, not subject to stochastic perturbations or state-dependent multiplicative noises.

The organization of the chapter is as follows. In Section 2, we formally formulate the control problem of interest. In Section 3, we state and derive some mathematical preliminaries. In Section 4, we consider how one may synthesize a finite-horizon controller with an uncertain initial state. In Section 5, we consider steady-state solutions. Section 6 considers control with output feedback when the states are not directly measurable. In Section 7, we give an example. Section 8 contains our conclusions.

Notation: ||·||₂ denotes the Euclidean norm of a vector; ||·||_L2 the L2-norm; ≜ denotes "equal by definition"; E{·} denotes expectation; ∫_0^T (·) dt denotes the definite integral with respect to t from 0 to T; ||x(t)||_T² denotes E{∫_0^T x(t)^T x(t) dt}; and A ≥ B means A − B is positive semi-definite.


2. Problem formulation

Consider the continuous-time linear time-invariant system with input disturbances w(t) and v(t):

dx(t)/dt = A x(t) + B_u u(t) + B_w w(t) + v(t)    (2)
E{x(0)} = 0,  E{x(0)x(0)^T} ≤ X_0    (3)

A, B_u, and B_w are known real matrices, u(t) is the control input, w(t) is an unknown but bounded exogenous input with E{∫_0^T w(t)^T w(t) dt} ≤ 1, and v(t) is a stationary Gaussian zero-mean white noise with known covariance matrix Θ. For simplicity, we assume that the system is controllable, and observable in the case of output feedback.

The feedback control problem of our interest is to find a linear causal control law u(t) = K(t)x(t) to minimize

J_u = ||Π^{1/2}x(t)||_T² + ||R^{1/2}u(t)||_T² + E{x(T)^T Q x(T)}    (4)

where Π and R are positive-definite weight matrices and Q is a positive-semi-definite weight matrix.

Note that (4) may be re-written as

J_u = ||Π^{1/2}x(t)||_T² + ||R^{1/2}K(t)x(t)||_T² + E{||Q^{1/2}x(T)||²}    (5)

In what follows, we shall sometimes drop the argument t (especially for the system matrices), or denote z(t) as z_t, for notational simplicity when there is no danger of confusion.
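Before moving on, a minimal simulation sketch may help fix ideas about how the two disturbance classes in (2) and (3) enter the dynamics. The following Python fragment (all data, the gain K, and the horizon are hypothetical placeholders, not part of the chapter's development) propagates the state under an energy-bounded deterministic w(t) and a white noise v(t) via an Euler-Maruyama discretization:

import numpy as np

rng = np.random.default_rng(0)
A  = np.array([[0.0, 1.0], [-2.0, -1.0]])        # hypothetical data
Bu = np.array([[0.0], [1.0]]); Bw = np.array([[1.0], [0.5]])
Theta = 0.1 * np.eye(2); X0 = np.eye(2)
K = np.array([[-1.0, -2.0]])                     # any stabilizing gain
h, T = 1e-3, 5.0; N = int(T / h)

x  = rng.multivariate_normal(np.zeros(2), X0)    # E{x(0)} = 0, E{x(0)x(0)^T} = X0
Lc = np.linalg.cholesky(Theta)                   # v(t) has covariance Theta
for k in range(N):
    t = k * h
    w = np.array([np.sin(t)]) / np.sqrt(T / 2)   # so that ||w||_T^2 is about 1
    u = K @ x
    x = x + h * (A @ x + Bu @ u + Bw @ w) + np.sqrt(h) * (Lc @ rng.standard_normal(2))
print("x(T) =", x)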


3. Preliminaries

Lemma 3.1: Let A, B, L = L^T, and Q_L = Q_L^T be given matrices. Suppose there exists on the interval 0 ≤ t ≤ T a solution P_o(t) of the differential Riccati equation (DRE)

dP(t)/dt = −P(t)A − A^T P(t) + P(t)BB^T P(t) − L,  P(T) = Q_L    (6)

Then the minimum of the cost functional

J ≜ E{ ∫_0^T (x^T L x + w^T w) dt + x(T)^T Q_L x(T) }    (7)

for the system dx/dt = Ax(t) + Bw(t) + v, E{v(t)} = 0, E{v(t)v(t)^T} = Θ, E{x(0)} = 0, and E{x(0)x(0)^T} = X_0, is given by

J_min = Tr{P_o(0)X_0} + ∫_0^T Tr{P_o(t)Θ} dt    (8)

The minimizing w(t) is given by

w(t) = −B^T P_o(t) x(t)    (9)

Proof: This is a well-known stochastic LQR result; see Ref. [12], p. 221.
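As a minimal numerical illustration of Lemma 3.1 (a sketch with hypothetical system data; L is chosen only mildly negative so that the DRE solution exists over the whole interval), one may integrate (6) backward from P(T) = Q_L and then evaluate the minimum cost (8):

import numpy as np
from scipy.integrate import solve_ivp, trapezoid

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # hypothetical data
B = np.array([[0.0], [1.0]])
L = -np.eye(2)                             # indefinite weight, as the lemma allows
Q_L = -0.1 * np.eye(2)                     # terminal condition P(T) = Q_L
X0 = np.eye(2); Theta = 0.5 * np.eye(2); T = 2.0

def dre_rhs(t, p):
    # Right-hand side of (6): dP/dt = -PA - A^T P + P BB^T P - L.
    P = p.reshape(2, 2)
    return (-P @ A - A.T @ P + P @ B @ B.T @ P - L).ravel()

# Integrate backward from t = T to t = 0 (solve_ivp accepts a decreasing span).
sol = solve_ivp(dre_rhs, [T, 0.0], Q_L.ravel(), dense_output=True, rtol=1e-8)
P0 = sol.y[:, -1].reshape(2, 2)            # P_o(0)

# Minimum cost (8): Tr{P_o(0)X_0} plus the integral of Tr{P_o(t)Theta} over [0, T].
ts = np.linspace(0.0, T, 400)
traces = [np.trace(sol.sol(t).reshape(2, 2) @ Theta) for t in ts]
print("J_min =", np.trace(P0 @ X0) + trapezoid(traces, ts))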

We next give a similar result concerning the deterministic problem and the related Riccati differential equation, which will be useful later in studying some of the properties of the RDE (6).

Lemma 3.2: Let A(t), B(t), L(t) = L(t)^T, and Q_L = Q_L^T be given matrices. Suppose there exists on the interval 0 ≤ t ≤ T a solution P_o(t) of the differential Riccati equation (DRE)

dP(t)/dt = −P(t)A − A^T P(t) + P(t)BB^T P(t) − L,  P(T) = Q_L    (10)

Then the cost functional

η ≜ ∫_0^T (x^T L x + w^T w) dt + x(T)^T Q_L x(T)    (11)

for the system dx/dt = Ax(t) + Bw(t), x(0) = x_0, is given by

η = ||w(t) + B^T P_o(t) x(t)||_T² + x_0^T P_o(0) x_0    (12)

The minimum value of η is x_0^T P_o(0) x_0, and the minimizing w(t) is given by

w(t) = −B^T P_o(t) x(t)    (13)

Proof: See Ref. [13] (proof of Theorem 1, p. 131).

Remark 3.1: Note that Q_L and L are symmetric but not necessarily nonnegative definite. Furthermore, as pointed out in [13], because L is not assumed to be nonnegative definite, it is not generally true that a minimum of η exists. However, we know from LQR theory that such a minimum exists when L is nonnegative definite. Hence, it is reasonable to expect that a minimum of η exists provided L is not "exceedingly" negative definite.

Lemma 3.3: Let A, B, L = L^T ≤ 0, and Q_L = Q_L^T be given matrices. Suppose there exists on the interval 0 ≤ t ≤ T a solution P_s(t) of the differential Riccati inequality (DRI) [14]:

dP(t)/dt ≥ −P(t)A − A^T P(t) + P(t)BB^T P(t) − L,  P(T) = Q_L    (14)

Then for the system dx/dt = Ax(t) + Bw(t), x(0) = x_0, the cost η as defined in (11) satisfies:

η ≥ ||w(t) + B^T P_s(t) x(t)||_T² + x_0^T P_s(0) x_0    (15)

Proof: Since L ≤ 0, there exists an L_1 ≤ L such that

dP_s(t)/dt = −P_s(t)A − A^T P_s(t) + P_s(t)BB^T P_s(t) − L_1,  P_s(T) = Q_L    (16)

Let

η_1 ≜ ∫_0^T (x^T L_1 x + w^T w) dt + x(T)^T Q_L x(T)    (17)

Clearly, η ≥ η_1 for any x and w, since L ≥ L_1. Applying Lemma 3.2 with L replaced by L_1 gives η_1 = ||w(t) + B^T P_s(t) x(t)||_T² + x_0^T P_s(0) x_0, which completes the proof.

Remark 3.2: Since the inequality sign in (14) is non-strict, the set of solutions characterized by Lemma 3.3 contains P_o(t) of Lemma 3.2 as a member.

Remark 3.3: In Lemmas 3.1-3.3, P(t) is independent of the initial condition x0, even though the cost η is a function of x0.

Proposition 3.1: Let P_o(t) be the solution that satisfies Lemma 3.2 and P_1(t) be any solution that satisfies (14). Then

P_o(0) ≥ P_1(0)    (18)

Proof: Since η ≥ η_1 for any x and w, minimizing over w gives x_0^T P_o(0) x_0 ≥ x_0^T P_1(0) x_0 for any x_0. Thus (18) must follow.

Remark 3.4: Note that time 0 may be any time before the final time T, so (18) also implies P_o(t) ≥ P_1(t) for all t < T.

Proposition 3.2: Let P_1(t) and P_2(t) be any solutions that satisfy (14). Then P_1(t) also satisfies Lemma 3.2 if and only if

Tr{P_1(0)} ≥ Tr{P_2(0)}    (19)

for all admissible P_2(t).

Proof: Necessity is obvious from Proposition 3.1. To prove sufficiency, suppose there is a P_1(t) that satisfies (19) but is not a solution to Lemma 3.2. Since P_1(t) is a solution to (14), it follows from Proposition 3.1 that P_o(0) ≥ P_1(0). But P_o(t) is also an admissible solution of Lemma 3.3. Then, since Tr{P_1(0)} ≥ Tr{P_o(0)} while P_1(0) ≠ P_o(0), the relation P_o(0) ≥ P_1(0) cannot hold (a positive semi-definite difference with nonpositive trace must be zero). This contradicts Proposition 3.1 and completes the proof.

Proposition 3.3: Let P_1(t) be a solution to Lemma 3.2 and P_2(t) be any solution that satisfies (14). Then the following two statements are equivalent:

(i) Tr{P_1(t)} ≥ Tr{P_2(t)} for all t ∈ [0, T];  (ii) dTr{P_1(t)}/dt ≤ dTr{P_2(t)}/dt for all t ∈ [0, T]    (20)

Proof: This is an obvious consequence of Bellman's principle of optimality, since both P_1(t) and P_2(t) are solutions computed backward in time from the final time T with P(T) = Q_L.

Remark 3.5: Proposition 3.3 tells us that, in order to find the minimizing w(t) which minimizes (7) and (11) from the set of sub-optimal solutions characterized by Lemma 3.3, it suffices to search for the one with maximum Tr{P(t)}, or minimum dTr{P(t)}/dt, over all t ∈ [0, T]. These properties will be useful later when we search for the optimal controller via a linear matrix inequality (LMI), as follows.

Consider the LMI:

[ Γ(t) + PA + A^T P + L    PB ]
[ B^T P                     I ]  ≥ 0    (21)

Applying the Schur complement to (21) gives Γ(t) ≥ −P(t)A − A^T P(t) + P(t)BB^T P(t) − L. It follows from Proposition 3.3 that the Γ(t) that solves min TrΓ subject to (21) at each t ∈ [0, T] gives the desired dP/dt that solves dP/dt = −PA − A^T P + PBB^T P − L for all t ∈ [0, T], where Γ = dP/dt.
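As a concrete (hypothetical) illustration of this step, the semidefinite program min TrΓ subject to (21) at a single time instant, with P(t) given, can be written with cvxpy along the following lines; by the Schur-complement argument above, the minimizer recovers Γ = dP/dt at that instant:

import numpy as np
import cvxpy as cp

n = 2
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # hypothetical data
B = np.array([[0.0], [1.0]])
L = -np.eye(n)                             # L = L^T <= 0, as in Lemma 3.3
P = -0.5 * np.eye(n)                       # current value P(t), assumed known

Gam = cp.Variable((n, n), symmetric=True)  # plays the role of dP/dt
lmi = cp.bmat([[Gam + P @ A + A.T @ P + L, P @ B],
               [B.T @ P, np.eye(B.shape[1])]])
# Symmetrize explicitly so that the PSD constraint is well posed.
prob = cp.Problem(cp.Minimize(cp.trace(Gam)), [0.5 * (lmi + lmi.T) >> 0])
prob.solve()
print("dP/dt at this instant:\n", Gam.value)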

Before proceeding further, let us recollect what we have done so far. Lemmas 3.1 and 3.2 show that the minimizing w(t) is the same whether or not the system is subjected to a stochastic white noise input. However, the resulting minimum costs differ, being given by (8) and by x_0^T P_o(0) x_0 (implicitly given by (12)), respectively. Then, from Lemma 3.3 through Proposition 3.3, we transformed the problem of finding P(t) for (6) into an equivalent LMI problem.


4. Finite-horizon state regulation

Let u = Kx, where K is the state feedback gain. Let A_cl = A + B_u K, and let

L = Π + K^T R K    (22)

Then it is clear that

J_u = E{ ∫_0^T x^T(Π + K^T R K)x dt + x(T)^T Q x(T) }    (23)

We are interested in finding the w(t) that maximizes J_u. For this purpose, we consider

−γ⁻²E{ ∫_0^T x^T(Π + K^T R K)x dt + x(T)^T Q x(T) } + ||w||_T² − ||w||_T²    (24)

Let

dP(t)/dt = −P(t)A_cl − A_cl^T P(t) + P(t)B_w B_w^T P(t) + γ⁻²(Π + K^T R K),  P(T) = −γ⁻²Q    (25)

Lemma 4.1: Let A_cl = A + B_u K and P_o be given by (25) for some chosen value of γ². Then, for the system (2), the state feedback control law u = Kx achieves the following closed-loop performance for all admissible w(t):

J_u ≤ γ²( ||w||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (26)

and the worst-case w_o achieves

J_u = γ²( ||w_o||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (27)

Proof: Apply Lemma 3.1 with L = −γ⁻²(Π + K^T R K) and Q_L = −γ⁻²Q. This gives

min_w { −γ⁻²E[ ∫_0^T x^T(Π + K^T R K)x dt + x(T)^T Q x(T) ] + ||w||_T² } = Tr{P_o(0)X_0} + ∫_0^T Tr{P_o(t)Θ} dt

that is, −γ⁻²J_u + ||w||_T² ≥ Tr{P_o(0)X_0} + ∫_0^T Tr{P_o(t)Θ} dt for every admissible w, with equality at the minimizing w. Multiplying throughout by −γ², we obtain

J_u ≤ γ²||w||_T² − γ²( Tr{P_o(0)X_0} + ∫_0^T Tr{P_o(t)Θ} dt )    (28)

with equality at the worst-case w_o. This completes the proof.

Remark 4.1: From (26), we recover the conventional H∞ inequality [4, 12]:

J_u ≤ γ²||w||_T²  if x_0 = 0 and Θ = 0    (29)

Corollary 4.1: Let A_cl = A + B_u K and P_o be given by Lemma 4.1. If w_o = −B_w^T P_o(t)x(t), then

J_u = γ²( ||w_o||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (30)

Proof: Obvious from (28) and Lemma 3.1. This completes the proof.

To derive the optimal controller of Lemma 4.1, we substitute A_cl = A + B_u K into (6), with L = −γ⁻²(Π + K^T R K) and P(T) = −γ⁻²Q. Thus we obtain

dP(t)/dt + P(t)(A + B_u K) + (A + B_u K)^T P(t) − P(t)B_w B_w^T P(t) − γ⁻²(Π + K^T R K) = 0    (31)

From (27), it is clear that the "larger" Tr{P_o(0)X_0} and ∫_0^T Tr{P_o(t)Θ} dt are, the smaller J_u will be. This implies that we want to find the least negative-definite P_o(t) for all t < T; Proposition 3.2 implies that we should search for the P_o(t) with the least negative Tr{P(t)}. Also, since we will be working backward in time when solving (31), we want a K(t) which minimizes Tr{dP/dt} at every instant t (principle of optimality), so as to maximize Tr{P_o(t)} (Propositions 3.1 and 3.2). To achieve this, we note

dP(t)/dt = −P(t)(A + B_u K) − (A + B_u K)^T P(t) + P(t)B_w B_w^T P(t) + γ⁻²(Π + K^T R K)    (32)

Differentiating Tr{dP/dt} with respect to K and setting the result to zero:

0 = −2P(t)B_u + 2γ⁻²K^T R    (33)
K = γ²R⁻¹B_u^T P,  d²Tr{dP/dt}/dK² = 2γ⁻²R > 0    (34)

Substituting (34) into (32), we obtain

dP/dt = −PA − A^T P + P( B_w B_w^T − γ²B_u R⁻¹B_u^T )P + γ⁻²Π    (35)

If we let γ⁻²P_γ = −P, then we may show that (35) is equivalent to

dP_γ/dt = −P_γ A − A^T P_γ + P_γ( B_u R⁻¹B_u^T − γ⁻²B_w B_w^T )P_γ − Π    (36)

Note that (36) is the same RDE obtained in the conventional approach to H∞ control with R = I and zero initial state ([4, 12]). We have thus established the connection between (35) and (36). However, in our derivation here we are able to concurrently derive a quantitative cost corresponding to the solution, based on the initial condition and the simultaneous disturbances, as in the case of stochastic LQR control.

Theorem 4.1: Let A_cl = A + B_u K and P_o be given by (35) with P(T) = −γ⁻²Q. Then K = γ²R⁻¹B_u^T P_o achieves

J_u ≤ γ²( ||w||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (37)

and the worst-case w_o achieves

J_u = γ²( ||w_o||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (38)

Proof: Substituting K = γ²R⁻¹B_u^T P into (25) and applying Lemma 4.1 completes the proof.

To compute ||w_o||_T, note that with w_o = −B_w^T P_o(t)x(t) the closed-loop system is given by

dx/dt = Ax + B_u Kx − B_w B_w^T P_o x + v = (A + B_u K − B_w B_w^T P_o)x + v ≜ A_w x + v    (39)

Define

ExtxtT=XtE40

Then Xt satisfies [12]:

dXt/dt=AwXt+XtAwT+Θ,Xt0=X0E41
EwoT2=ETxtTPoBwBwTPoxtdt=TrTPoBwBwTPoXtdtE42

To check whether the selected value of γ² is the correct one, we require γ² to be such that ||w_o||_T² = 1. This can be achieved by varying and searching over values of γ². We note that ||w_o||_T² is generally a decreasing function of γ², as is already known from conventional H∞ control theory.

Once the desired value of γ2 and hence P(t), 0 ≤ t ≤ T, are found, we may then compute the cost via Theorem 4.1. We summarize the results in the following conceptual algorithm:

Conceptual Algorithm 1.

Step 1: Choose a value of γ² and compute P(t) from (35). If P(t) does not exist over the entire time interval [0, T], the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and repeat this step.

Step 2: If P(t) exists over the entire time interval [0, T], compute ||w_o||_T². If ||w_o||_T² = 1, we have the desired solution, and the optimal state feedback law is given by (34).

Step 3: If ||w_o||_T² > 1, the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and go to Step 1.

Step 4: If ||w_o||_T² < 1, the chosen γ² is too large. Set upper bound γ_u² = γ², decrease γ², and go to Step 1.

Remark 4.2: It is important to note the presence of Θ in (41); in other words, v does have an impact on the value of X_t and hence on w_o. This distinguishes our approach from previous mixed H2/H∞ solutions, which do not explicitly take Θ into consideration.
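A minimal numerical sketch of Conceptual Algorithm 1 follows (the bisection bracket and the horizon T are assumed values; the plant data are borrowed from the example of Section 7). It integrates the RDE (35) backward, propagates X_t forward via (41), evaluates ||w_o||_T² via (42), and bisects on γ²:

import numpy as np
from scipy.integrate import solve_ivp, trapezoid

A  = np.array([[1.0]]); Bu = np.array([[1.0]]); Bw = np.array([[2.0]])
Pi = np.array([[100.0]]); R = np.array([[1.0]]); Q = np.array([[0.0]])
X0 = np.array([[1.0]]); Theta = np.array([[1.0]]); T = 5.0
n = A.shape[0]

def solve_P(g2):
    # Integrate the RDE (35) backward from P(T) = -Q/g2.
    def rhs(t, p):
        P = p.reshape(n, n)
        S = Bw @ Bw.T - g2 * Bu @ np.linalg.solve(R, Bu.T)
        return (-P @ A - A.T @ P + P @ S @ P + Pi / g2).ravel()
    sol = solve_ivp(rhs, [T, 0.0], (-Q / g2).ravel(), dense_output=True)
    return sol if sol.success else None    # failure: P escaped in finite time

def wo_norm_sq(g2, sol):
    # Forward Lyapunov ODE (41) for X_t, then the worst-case energy (42).
    def rhs(t, x):
        P = sol.sol(t).reshape(n, n)
        K = g2 * np.linalg.solve(R, Bu.T) @ P      # gain (34)
        Aw = A + Bu @ K - Bw @ Bw.T @ P            # worst-case loop (39)
        X = x.reshape(n, n)
        return (Aw @ X + X @ Aw.T + Theta).ravel()
    xs = solve_ivp(rhs, [0.0, T], X0.ravel(), dense_output=True)
    ts = np.linspace(0.0, T, 400)
    vals = [np.trace(sol.sol(t).reshape(n, n) @ Bw @ Bw.T
                     @ sol.sol(t).reshape(n, n) @ xs.sol(t).reshape(n, n))
            for t in ts]
    return trapezoid(vals, ts)

lo, hi = 1.0, 100.0                        # assumed bisection bracket for gamma^2
for _ in range(40):
    g2 = 0.5 * (lo + hi)
    sol = solve_P(g2)
    if sol is None or wo_norm_sq(g2, sol) > 1.0:
        lo = g2                            # Steps 1 and 3: gamma^2 too small
    else:
        hi = g2                            # Step 4: gamma^2 too large
print("gamma^2 ~", 0.5 * (lo + hi))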


5. Infinite-horizon state regulation

It is well known from conventional H∞ control theory ([4, 12, 15]) that if γ² is large enough, the feedback gain approaches a steady-state value K far from the final time; the feedback system then approaches a linear time-invariant system. Unfortunately, this is not the steady-state solution for us, at least not for the problem formulated in Section 2. This is because as T → ∞ and the system approaches steady state, lim_{T→∞} Tr{∫_0^T ΘP_o(t) dt} → lim_{T→∞} Tr{TΘP_o(∞)} → −∞ dominates the other two cost components, namely Tr{P_o(∞)X_0} and lim_{T→∞}||w||_T = ||w||_L2 < ∞. It follows that for formulations which require ||w||_L2 < ∞ and finite x_0, the optimal infinite-horizon solution is indeed the LQR, since it minimizes Tr{ΘP_o(∞)} with γ⁻² = 0.

Instead, we shall explore an alternative formulation. First, we note that the existence of a steady-state solution P of the algebraic Riccati equation means that P is the solution to the RDE with −γ⁻²Q = P(t) = P, 0 ≤ t ≤ T. Substituting this into the RHS of (26), we obtain and define

η_ss ≤ γ²( E{||w||_T²} − Tr{PX_0} − T·Tr{ΘP} )    (43)

where

η_ss ≜ E{ ||Π^{1/2}x||_T² + ||R^{1/2}u||_T² } − γ²E{x(T)^T P x(T)}    (44)

Dividing (43) by T gives

η_ss/T ≤ γ²( E{||w||_T²}/T − Tr{PX_0}/T − Tr{ΘP} )    (45)

Assume lim_{T→∞} E{||w||_T²}/T = p². Note that p² may be interpreted as "power". We obtain the following result:

Lemma 5.1: Let P, K, and A_cl = A + B_u K be the steady-state solutions corresponding to (35) with T → ∞. If A + B_u K − B_w B_w^T P is stable, then K achieves the infinite-horizon performance

η_∞ ≜ lim_{T→∞} η_ss/T ≤ γ²( p² − Tr{ΘP} )    (46)

Proof: If A + B_u K − B_w B_w^T P is stable, it implies that if we let Q = −γ²P, then P(t) = P exists for all t < ∞. It follows that (45) holds. Noting lim_{T→∞} Tr{PX_0}/T = 0 completes the proof.

Remark 5.1: Substituting lim_{T→∞} E{||w||_T²}/T = p² into (46) and setting Θ = 0 shows that the closed-loop H∞ norm from w is bounded by γ. Similarly, setting w = 0 shows that the steady-state LQR cost, averaged over T as T → ∞, is γ²Tr{−ΘP}.

Finally, since A + B_u K − B_w B_w^T P is stable, there exists an X which satisfies the following Lyapunov equation:

(A + B_u K − B_w B_w^T P)X + X(A + B_u K − B_w B_w^T P)^T + Θ = 0    (47)

So w_o = −B_w^T P x gives

E{||w_o||_T²} = ∫_0^T Tr{(B_w^T P)^T(B_w^T P)X} dt = Tr{(B_w^T P)^T(B_w^T P)X}·T    (48)

since B_w, P, and X are all constant matrices. Dividing both sides of (48) by T gives

lim_{T→∞} E{||w_o||_T²}/T = Tr{(B_w^T P)^T(B_w^T P)X}    (49)

where X is given by (47).

We may further assume that lim_{t→∞} E{||w||₂²} = lim_{T→∞} ||w||_T²/T ≤ 1, and that Θ is unknown with worst-case TrΘ ≤ σ, where σ ≥ 0. This corresponds to the system being subject to both a worst-case w(t) and v(t), where v(t) belongs to the class of white noises satisfying E{v^Tv} ≤ σ. The controller designed may then be interpreted as one which optimizes the expected worst-case performance in the presence of both bounded-power deterministic disturbances and stochastic white noises. We summarize the procedure for finding such a controller in Conceptual Algorithm 2.

Conceptual Algorithm 2.

Step 1: Guess a value of γ² and compute P from (35) with dP/dt = 0. If no solution P < 0 exists, the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and repeat this step.

Step 2: If P exists, find the Θ_o which solves min_{TrΘ=σ} Tr{PΘ}. Compute X from (47) with Θ = Θ_o and compute p² = lim_{T→∞} E{||w_o||_T²}/T via (49). If p² = 1, we have the desired solution.

Step 3: If p² > 1, the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and go to Step 1.

Step 4: If p² < 1, the chosen γ² is too large. Set upper bound γ_u² = γ², decrease γ², and go to Step 1.
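A corresponding sketch of Conceptual Algorithm 2 (again with assumed data, taken from the example of Section 7; for a scalar Θ with σ = 1, the worst-case Θ_o of Step 2 is simply Θ = 1). The steady-state (35) is solved through the equivalent ARE (36) for P_γ = −γ²P, via the stable invariant subspace of the associated Hamiltonian matrix; X then comes from the Lyapunov equation (47) and p² from (49):

import numpy as np
from scipy.linalg import schur, solve_continuous_lyapunov

A  = np.array([[1.0]]); Bu = np.array([[1.0]]); Bw = np.array([[2.0]])
Pi = np.array([[100.0]]); R = np.array([[1.0]]); Theta = np.array([[1.0]])
n = A.shape[0]

def steady_state(g2):
    # ARE (36): P_g A + A^T P_g - P_g (Bu R^{-1} Bu^T - Bw Bw^T / g2) P_g + Pi = 0,
    # solved from the stable invariant subspace of the Hamiltonian matrix.
    S = Bu @ np.linalg.solve(R, Bu.T) - Bw @ Bw.T / g2
    H = np.block([[A, -S], [-Pi, -A.T]])
    _, Z, k = schur(H, output='real', sort='lhp')
    if k != n:
        return None                              # no stabilizing solution
    Pg = Z[n:, :n] @ np.linalg.inv(Z[:n, :n])
    P = -Pg / g2                                 # recover P of (35)
    K = g2 * np.linalg.solve(R, Bu.T) @ P        # gain (34)
    Aw = A + Bu @ K - Bw @ Bw.T @ P
    if np.any(np.linalg.eigvals(Aw).real >= 0):
        return None                              # Lemma 5.1 requires Aw stable
    X = solve_continuous_lyapunov(Aw, -Theta)    # Lyapunov equation (47)
    p2 = np.trace(P @ Bw @ Bw.T @ P @ X)         # power of w_o, Eq. (49)
    return P, K, p2

lo, hi = 4.0, 20.0                               # assumed bracket for gamma^2
for _ in range(60):
    g2 = 0.5 * (lo + hi)
    out = steady_state(g2)
    if out is None or out[2] > 1.0:
        lo = g2                                  # Steps 1 and 3
    else:
        hi = g2                                  # Step 4
P, K, p2 = steady_state(hi)
print(f"gamma^2 ~ {hi:.3f}, P = {P}, K = {K}, p2 = {p2:.3f}")

For the scalar example, this search settles near γ² ≈ 8, consistent with the observation in Section 7.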


6. H∞ control with output feedback

In this section, we consider the case where the state of the system is not available for feedback. We then consider dynamical controllers of the form u(t) = K(t)*z(t), where z(t) denotes the measured output available to the controller and * denotes the convolution operator. Let

z(t) = C_z(t)x    (50)

Assume that the state equation of the controller is given by

dx_K/dt = A_K(t)x_K + B_K(t)z    (51)
u = C_K(t)x_K + D_K(t)z = C_K x_K + D_K C_z x    (52)

The closed-loop system may then be represented by the augmented state-space model with x_cl ≜ [x^T x_K^T]^T, [3, 16]:

dx_cl/dt = A_cl x_cl + B_cl w + [I 0]^T v ≜ A_cl x_cl + B_cl w + ζ    (53)
A_cl = A_p + BGC    (54)
B_cl = [B_w^T 0]^T    (55)
A_p = [A 0; 0 0],  B = [B_u 0; 0 I],  C = [C_z 0; 0 I],  G = [D_K C_K; B_K A_K]    (56)

Although in the above formulation the dimensions of A and A_K need not be the same, we shall assume they are the same, for simplicity and better performance. We shall also set x_K(0) = E{x_0} = 0. Note that the covariance of the stochastic input in (53) is E{ζζ^T} = [I 0]^T Θ [I 0].

u(t) in (52) may be further rewritten as

u = [ Y G V^T V + Y G Y^T C_z Y ] x_cl    (57)

since

D_K = [I 0] G [I 0]^T ≜ Y G Y^T    (58)
C_K = [I 0] G [0 I]^T ≜ Y G V^T    (59)

Substituting (57) into the cost functional:

x^T Π x + u^T R u = x_cl^T { Y^T Π Y + [Y G V^T V + Y G Y^T C_z Y]^T R [Y G V^T V + Y G Y^T C_z Y] } x_cl    (60)

and defining

Q_z ≜ Y^T Π Y + [Y G V^T V + Y G Y^T C_z Y]^T R [Y G V^T V + Y G Y^T C_z Y]

it then follows that

x^T Π x + u^T R u = x_cl^T Q_z x_cl    (61)

Now, substituting all these parameters into (25), we obtain

dP(t)/dt + P(t)A_cl + A_cl^T P(t) − P(t)B_cl B_cl^T P(t) − γ⁻²Q_z = 0,  P(T) = −γ⁻²Y^T Q Y    (62)

Since A_cl and [Y G V^T V + Y G Y^T C_z Y] are affine in G, (62) may be transformed into an LMI in G:

[ Γ + P A_cl + A_cl^T P − γ⁻²Y^T Π Y    P B_cl    (Y G V^T V + Y G Y^T C_z Y)^T ]
[ B_cl^T P                              I         0                             ]
[ Y G V^T V + Y G Y^T C_z Y             0         γ²R⁻¹                         ]  ≥ 0    (63)

Hence, (62) may be solved by, at each time step t, finding Γ(t) and G(t) from the following optimization problem:

min_G TrΓ subject to (63)    (64)

With Γ(t) = dP/dt found, P(t−ε) may be computed (note that a minimum dP/dt gives a maximum P(t−ε), given P(t)). Hence, we may iteratively work backward (numerically) toward time 0 to construct P(t) and G(t).
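A sketch of a single such backward step, written with cvxpy under hypothetical data (controller order equal to the plant order, and u, z, and x all of dimension n, so that the selector matrices Y and V are as defined above):

import numpy as np
import cvxpy as cp

n = 2
A  = np.array([[0.0, 1.0], [-2.0, -3.0]])        # hypothetical data
Bu = np.eye(n); Bw = np.array([[0.0], [1.0]]); Cz = np.eye(n)
Pi = np.eye(n); R = np.eye(n); g2 = 4.0
q  = Bw.shape[1]                                 # dimension of w

Ap  = np.block([[A, np.zeros((n, n))], [np.zeros((n, n)), np.zeros((n, n))]])
B   = np.block([[Bu, np.zeros((n, n))], [np.zeros((n, n)), np.eye(n)]])
C   = np.block([[Cz, np.zeros((n, n))], [np.zeros((n, n)), np.eye(n)]])
Y   = np.hstack([np.eye(n), np.zeros((n, n))])   # x  = Y x_cl
V   = np.hstack([np.zeros((n, n)), np.eye(n)])   # xK = V x_cl
Bcl = np.vstack([Bw, np.zeros((n, q))])
P   = -0.1 * np.eye(2 * n)                       # P(t) from the previous step

Gam = cp.Variable((2 * n, 2 * n), symmetric=True)
G   = cp.Variable((2 * n, 2 * n))                # G = [D_K C_K; B_K A_K]
PAcl = P @ Ap + P @ B @ G @ C                    # P A_cl, affine in G
M    = Y @ G @ V.T @ V + Y @ G @ Y.T @ Cz @ Y    # u = M x_cl, Eq. (57)
lmi = cp.bmat([
    [Gam + PAcl + PAcl.T - Y.T @ Pi @ Y / g2, P @ Bcl, M.T],
    [(P @ Bcl).T, np.eye(q), np.zeros((q, n))],
    [M, np.zeros((n, q)), g2 * np.linalg.inv(R)],
])
prob = cp.Problem(cp.Minimize(cp.trace(Gam)), [0.5 * (lmi + lmi.T) >> 0])
prob.solve()
eps = 1e-2                                       # assumed step size
P_next = P - eps * Gam.value                     # one backward Euler step
print(prob.status, "Tr(Gamma) =", prob.value)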

With G(t) found, the worst-case w may be found as

w_o(t) = −B_cl^T P_o(t) x_cl(t)    (65)

and we let

A_w = A_p + BGC − B_cl B_cl^T P_o    (66)

The rest of the development is hence similar to the full information case and left to the reader.

Conceptual Algorithm 3.

Step 1: Choose a value of γ² and find G(t) and P(t) via (63) and (64). With G(t), compute A_cl and B_cl from (54) and (55), respectively.

Step 2: With the P(t) obtained, compute ||w_o||_T² with x_cl(0) = 0. If ||w_o||_T² = 1, we have the optimal solution, with C_K and D_K defining the optimal control law (52).

Step 3: If ||w_o||_T² > 1, the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and go to Step 1.

Step 4: If ||w_o||_T² < 1, the chosen γ² is too large. Set upper bound γ_u² = γ², decrease γ², and go to Step 1.

Another variant of the output feedback problem arises if we assume

z(t) = C_z(t)x + D_w w + m    (67)

where we assume D_w B_w^T = 0, and m is a stationary zero-mean white noise process with E{mm^T} = ʘ and E{mv^T} = 0. In this case, we may consider a strictly causal controller

dx_K/dt = A_K(t)x_K + B_K(t)z,  u = C_K(t)x_K  (i.e., D_K = 0)    (68)

Other than mathematical simplicity, choosing a strictly causal controller when the plant is not strictly proper may also be desirable for robustness, because a strictly proper return ratio is generally more robust against unmodeled dynamics at high frequencies.

The augmented system in this case is given by:

dx_cl/dt = A_cl x_cl + B_cl w + B_ξ ξ,  B_ξ = [ [I 0]^T   BG[I 0]^T ]    (69)

where A_cl and G are given by (54)-(56) with D_K = 0,

ξ = [v^T m^T]^T, and E{ξξ^T} = diag[Θ, ʘ]    (70)

where diag[ϴ, ʘ] denotes a block diagonal matrix with diagonal sub-blocks ϴ and ʘ.

B_cl = [B_w^T 0]^T + BG[D_w^T 0]^T    (71)
u = C_K x_K = C_K V x_cl    (72)

Thus

||Π^{1/2}x(t)||_T² + ||R^{1/2}u(t)||_T² = E{ ∫_0^T x_cl^T [ Y^T Π Y + V^T C_K^T R C_K V ] x_cl dt }    (73)

Hence, we may solve the problem by letting

Q_z = Y^T Π Y + V^T C_K^T R C_K V    (74)

and substituting (54), (71), and (56) with D_K = 0, together with (74), into (62) to obtain the appropriate RDE. The remaining details and steps are similar to those of the D_w = 0 case above and are left to the reader.


7. Example

We consider the following example, modified from ([11], p. 335):

dx/dt = x + u + 2w + v    (75)

with

Π = 100,  R = 1,  Q = 0.

We further assume

E{v} = 0,  E{vv^T} = 1,  E{x(0)} = 0,  E{x(0)x(0)^T} = 1    (76)

We shall apply Conceptual Algorithm 2 to find a linear time-invariant controller that minimizes the infinite-horizon performance index η_∞ ≜ lim_{T→∞}(η_ss/T) of Section 5. Iterating with different values of γ², the steady-state solutions for the relevant parameters are tabulated in Table 1 below:

γ²      −P_o      X        E{||w||₂²}    η        η/p²     η/(p² + q²)
4.5     8.44      0.118    33.6          42.04    1.25     1.22
5       5.58      0.094    11.74         17.32    1.47     1.35
8       2.035     0.063    1.03          3.065    2.97     1.51
10      1.468     0.058    0.50          1.968    3.94     1.31

Table 1.

Iterations with different values of γ².

where q² ≜ TrΘ = 1 (note also p² ≜ E{||w||₂²}). It can be seen that the desired solution occurs at about γ² = 8. We also see that the values of lim_{t→∞} E{||w||²} and of Θ determine the desirable value of γ² to be used. Note also that while η/p² monotonically increases with γ², η/(p² + TrΘ) has a maximum with respect to γ², which identifies the worst-case combination of w and v from a power-ratio point of view. Note that p² + TrΘ is a measure of the combined power of the two disturbances.
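For this scalar example, the steady-state quantities can also be computed in closed form. The following sketch (a hypothetical reconstruction, not the authors' code) recomputes, for each tabulated γ², the negative root P of the scalar steady-state version of (35), X from (47), p² from (49), and the combination p² − Tr{ΘP}, which the η column of Table 1 matches row by row. The printed values land close to the tabulated ones, with small differences attributable to rounding:

import numpy as np

A, Bu, Bw, Pi, R, Theta = 1.0, 1.0, 2.0, 100.0, 1.0, 1.0

for g2 in (4.5, 5.0, 8.0, 10.0):
    # Scalar steady-state (35): 0 = -2AP + (Bw^2 - g2 Bu^2/R) P^2 + Pi/g2.
    P = np.roots([Bw**2 - g2 * Bu**2 / R, -2.0 * A, Pi / g2]).real.min()
    K = g2 * Bu * P / R                  # gain (34)
    Aw = A + Bu * K - Bw**2 * P          # worst-case loop, w_o = -Bw P x
    X = -Theta / (2.0 * Aw)              # scalar Lyapunov equation (47)
    p2 = (Bw * P)**2 * X                 # Eq. (49)
    eta = p2 - Theta * P                 # the combination tabulated as eta
    print(f"g2={g2:5.1f}  -P={-P:6.3f}  X={X:.4f}  p2={p2:.3f}  "
          f"eta={eta:.3f}  eta/p2={eta/p2:.2f}  eta/(p2+1)={eta/(p2+1):.2f}")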


8. Conclusion

We have considered the H∞ state regulation problem in which the initial state of the plant is possibly nonzero and the system is subject to a worst-case nonzero-mean exogenous input and a zero-mean white noise disturbance. It should be clear that the two signals may be combined to represent a nonzero-mean disturbance consisting of both high- and low-frequency components.

We note that our results here were derived from the Riccati differential equation related to the LQR, unlike most existing approaches, which are based on differential game theory. Rather than looking for a minimax saddle point as a solution, we derive our solution by maximizing P(0) of the Riccati differential equation via Bellman's principle of optimality, P(0) being directly related to the value of the cost functional concerned when the initial state is nonzero. This should be familiar to anyone acquainted with LQR theory. We believe that this approach offers a new (in addition to existing works such as [1, 2, 3, 4, 5], for example), and arguably also simpler, perspective on the solution of the H∞ control problem, and of H∞ control with stochastic disturbances (compare [5, 8, 9, 10], for example). In fact, it is the recognition that the H∞ and LQR problems may be solved via the same Riccati equation that enables us to develop the present simple way of incorporating a stochastic exogenous input, in addition to nonzero initial conditions, into H∞ control, based on well-known stochastic LQR theory. We anticipate that this approach will enable control engineers to extend many H∞ results to include stochastic disturbances, capitalizing on, and drawing from, the vast pool of results that have already been developed for stochastic LQR/LQG control. We believe this to be one of the main contributions of the present chapter. Note that because we are still working with deterministic system matrices, established and well-known techniques such as the singular-value method or the small-μ test are directly applicable to the analysis of the designed system [15].

We further note that the worst-case deterministic disturbance is of the form w(t) = −B_w^T P_o(t)x(t). This is useful because, if we assume the uncertain A coefficient matrix of the open-loop system to be A + Δ, with no deterministic disturbances, then the resultant system may be modeled as Ax + w where w = Δx. It follows that if P_o B_w B_w^T P_o ≥ Δ^TΔ for all admissible Δ, then the controller obtained with our approach will also guarantee the performance of the closed-loop system, with performance bounds (46)-(48), when the open-loop A matrix is subject to the perturbation Δ. Hence, B_w may here be construed as a free parameter/weight matrix which, together with γ², we may vary in order to search for a P_o that satisfies (35) with dP/dt = 0 and P_o B_w B_w^T P_o ≥ Δ^TΔ for all admissible Δ (for example, if Δ is assumed to be an unstructured perturbation [15], we may assume σ²I ≥ Δ^TΔ, where σ denotes the singular-value bound), thus solving the robust stochastic control problem with an uncertain A-matrix. Note that if we let W_B = P_o B_w B_w^T P_o, then for any P_o obtained we may find B_w = (P_o⁻¹ W_B P_o⁻¹)^{1/2}. Thus, we may substitute P_o B_w B_w^T P_o = σ²I into (35) with dP/dt = 0 to obtain

0 = −PA − A^T P + σ²I − γ²P B_u R⁻¹B_u^T P + γ⁻²Π    (77)

as the Riccati equation to be solved, searching over γ⁻², for the robust performance control problem with unstructured A-matrix perturbations satisfying Δ^TΔ ≤ σ²I.
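Under the substitution γ⁻²P_γ = −P, (77) becomes the standard ARE A^T P_γ + P_γ A − P_γ B_u R⁻¹B_u^T P_γ + (Π + γ²σ²I) = 0, which off-the-shelf solvers handle directly. A minimal sketch with hypothetical data:

import numpy as np
from scipy.linalg import solve_continuous_are

A  = np.array([[0.0, 1.0], [-1.0, -1.0]])  # hypothetical data
Bu = np.array([[0.0], [1.0]])
Pi = np.eye(2); R = np.array([[1.0]])
g2, sigma2 = 10.0, 0.25                    # assumed gamma^2 and sigma^2

Pg = solve_continuous_are(A, Bu, Pi + g2 * sigma2 * np.eye(2), R)
P  = -Pg / g2                              # recover P of (77)
K  = g2 * np.linalg.solve(R, Bu.T) @ P     # = -R^{-1} Bu^T P_gamma, gain (34)
print("K =", K, "; closed-loop eigenvalues:", np.linalg.eigvals(A + Bu @ K))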

Of course, assuming that only the A matrix is uncertain is unrealistic in practice. But it does suggest a possibly fruitful approach to the problem of worst-case performance under system-matrix perturbations, if we are able to extend the approach to address uncertainties in the input and output matrices as well. This will be a direction of our future research.

References

  1. Bernstein DS, Haddad WM. LQG control with an H∞ performance bound: A Riccati equation approach. IEEE Transactions on Automatic Control. 1989;34(3):293-305
  2. Doyle JC, Glover K, Khargonekar PP, Francis BA. State-space solutions to standard H2 and H∞ control problems. IEEE Transactions on Automatic Control. 1989;34(8):831-847
  3. Dullerud GE, Paganini F. A Course in Robust Control Theory: A Convex Approach. New York, NY: Springer; 2000
  4. Green M, Limebeer DJN. Linear Robust Control. Englewood Cliffs, NJ: Prentice-Hall; 1995
  5. Hinrichsen D, Pritchard AJ. Stochastic H∞. SIAM Journal on Control and Optimization. 1998;36:1505
  6. Foo YK. H∞ control with initial condition. IEEE Transactions on Circuits and Systems II: Express Briefs. 2006;53(9):867-871
  7. Li X, Xu J, Zhang H. Standard solution to mixed H2/H∞ control with regular Riccati equation. IET Control Theory and Applications. 2020;14(20):3643-3651
  8. Khargonekar PP, Rotea MA. Mixed H2/H∞ control: A convex optimization approach. IEEE Transactions on Automatic Control. 1991;36(7):824-837
  9. Saberi A, Chen BM, Sannuti P, Ly U. Simultaneous H2/H∞ optimal control: The state feedback case. Automatica. 1993;29(6):1611
  10. Zhang W, Xie L, Chen BS. Stochastic H2/H∞ Control: A Nash Game Approach. Boca Raton, FL, USA: CRC Press; 2017
  11. Khargonekar PP, Nagpal KM, Poolla KR. H∞ control with transients. SIAM Journal on Control and Optimization. 1991;29:1372-1393
  12. Burl JB. Linear Optimal Control: H2 and H∞ Methods. California, USA: Addison-Wesley; 1999
  13. Brockett RW. Finite Dimensional Linear Systems. New York: Wiley; 1970
  14. Foo YK. Strengthened H∞ control via state feedback: A majorization approach using algebraic Riccati inequalities. IEEE Transactions on Automatic Control. 2004;49(5):824-827
  15. Skogestad S, Postlethwaite I. Multivariable Feedback Control: Analysis and Design. West Sussex, England: Wiley; 1996
  16. Skelton RE, Iwasaki T, Grigoriadis KM. A Unified Algebraic Approach to Linear Control Design. New York: Taylor and Francis; 1998
