Open access peer-reviewed chapter - ONLINE FIRST

Optimal Stochastic H∞ Controller Synthesis via LQR Theory

Written By

Yung Kuan Foo and Yeng Chai Soh

Submitted: 28 February 2024 Reviewed: 28 February 2024 Published: 04 June 2024

DOI: 10.5772/intechopen.1004889


From the Edited Volume

Stochastic Processes - Theoretical Advances and Applications in Complex Systems [Working Title]

Prof. Don Kulasiri


Abstract

Existing published works on H∞ and mixed H∞/H2 control generally solve the problem via game theory and saddle-point solutions. We provide an alternative solution via the stochastic LQR approach. Specifically, we consider the H∞ state regulation problem for continuous-time systems subject to both energy-bounded deterministic disturbances and stochastic white noise, with an uncertain initial system state x(0) satisfying E{x(0)} = 0 and E{x(0)x(0)^T} ≤ X_0. We derive feedback laws which minimize sup_{||w||_T ≤ 1} E{||Π^{1/2}x(t)||_T² + ||R^{1/2}u(t)||_T²}, where Π and R are positive-definite weight matrices. We further demonstrate how the approach may be extended to solve the robust stochastic LQR control problem with uncertainty in the system A matrix.

Keywords

  • H∞ control
  • LQ regulator
  • stochastic regulator
  • robust control
  • mixed H2/H∞ control

1. Introduction

Existing approaches to solving the H∞ control problem typically pose it as a differential game whose solution is obtained by finding the minimax saddle point; see [1, 2, 3, 4, 5, 6, 7], for example. This approach is intuitive from the H∞ perspective, since the problem is indeed mathematically a minimax problem. However, when it was found that the resulting Riccati equations are very similar to the standard Riccati equations obtained from related LQR problems with the same system matrices, tremendous interest was generated in how to reconcile and combine the two approaches in a fruitful way. This gave rise to the once-very-hotly investigated mixed H2/H∞ control problem [1, 8, 9, 10]. Specifically, the system model considered is of the form:

dx(t)/dt = A x(t) + B_u u(t) + E_2 w_2(t) + E_∞ w_∞(t)
z_2(t) = C_2 x(t) + D_2 w_2(t)
z_∞(t) = C_∞ x(t) + D_∞ w_∞(t),  x(0) = 0    (1)

Let T_2 and T_∞ denote the closed-loop transfer functions from w_2 to z_2 and from w_∞ to z_∞, respectively. The problem was to minimize the H2 norm of T_2 subject to the H∞ norm of T_∞ being smaller than some prescribed positive value γ. Clearly, if w_∞ = 0, then the H2 norm gives a performance measure on the effect of w_2. Conversely, if w_2 = 0, then the H∞ norm gives a performance bound on the effect of w_∞. But what if w_2 and w_∞ are both nonzero? The main focus of such an approach is usually to optimize the H2 norm for performance, and the H∞ norm bound is more for "robustness" against model uncertainties than for worst-case performance. Thus, we start with a signal-based worst-case control problem but eventually end up with a "robust" (bound on the H∞ norm) LQR control solution. Such an approach is justifiable, but an implicit implication is that the performance is optimized only for the H2 norm, and hence it is not, at least not expressly, about worst-case performance optimization. Another fact to note is that although we can guarantee the H∞ bound of the closed-loop system by solving the relevant Riccati equation (Eq. (36)), such a bound is generally not tight. This can be seen by noting that as we set γ⁻² → 0, we recover the LQR. Obviously, the H∞ norm of the closed-loop system corresponding to an LQR cannot be ∞; so it is clear that as we set γ² → ∞, the closed-loop H∞ norm does not approach ∞, and hence γ does not equal the closed-loop H∞ norm in general. An implication of this fact is that the controller obtained by solving Eq. (36) with γ² = γ_o² is not one that minimizes the closed-loop H2 norm subject to the H∞ norm being less than or equal to γ_o. For example, suppose the closed-loop H∞ norm of the pure LQR control system is γ_LQR. Then this LQR is also the optimal solution to the minimum closed-loop H2 norm control problem subject to any closed-loop H∞ norm bound larger than γ_LQR, and this solution ought to be unique. But if we solve Eq. (36) with any finite γ > γ_LQR, we will not obtain the LQR. This implies that the solution so obtained cannot be optimal.

Other salient features of such an approach are: (i) the performance considered is usually (with some exceptions, such as [7]) over the infinite time horizon; (ii) the initial state is zero; and (iii) as already pointed out above, the performances with respect to the H2 and H∞ criteria are, loosely speaking, in a mutually exclusive "either-or" situation, in that the combined effect when both the stochastic and the deterministic disturbances are present simultaneously is not considered. In other words, while the H∞ norm gives a bound on the worst-case performance of the closed-loop system when there is a deterministic disturbance (but no stochastic noise), and the H2 norm indicates the performance in the presence of stochastic noise (but in the absence of deterministic disturbances), the combined effect on the closed-loop system when both types of exogenous disturbances are present is not explicitly and quantitatively addressed. Another problem we may identify is that the H2 norm obtained in a mixed H2/H∞ design actually refers to the nominal model. When the system matrices are subject to perturbations, this H2 norm is also subject to change. Hence, the question of what the worst-case performance of the closed-loop system would be when the nominal model has been perturbed remains to be answered. In other words, stability robustness may be established, but nothing quantitative has been said about worst-case performance.

In this chapter, we revisit the mixed H2/H∞ control problem from a different perspective. Specifically, we pose the original problem as a stochastic H∞ control problem where the system is subject simultaneously to possibly "worst-case" (bounded) deterministic disturbances and to stochastic zero-mean white noises. We are interested in minimizing the worst-case combined effect of these disturbances, along with an uncertain nonzero initial condition [6, 11], on the closed-loop performance of the system. It should be clear that such problems arise when one is interested in driving the state of a system from some uncertain nonzero value to the origin in some optimal way, in the presence of both high- and low-frequency noises. Note that while we model the disturbances as two separate noises, they could actually be one single disturbance consisting of both low- and high-frequency components, thus modeling a nonzero-mean disturbance.

Our results here will be derived from the Riccati differential equation related to the finite-horizon LQR problem, unlike most existing approaches, which are based on differential game theory. Rather than looking for a minimax saddle point as a solution, we minimize the expected worst-case cost and derive our solution by explicitly working with P(0) of the finite-horizon LQR Riccati equation, which is directly related to the value of the cost functional concerned when the initial state is nonzero. Clearly, if we assume the deterministic noise to be absent, the problem reduces to a stochastic LQR problem, and if we assume the stochastic white noise to be absent, we recover the standard H∞ control problem. We believe that this approach offers a new (in addition to existing works such as [1, 2, 3, 4, 5], for example), and arguably also simpler, perspective on the solution to the H∞ control problem, and to H∞ control with stochastic disturbances (compare [5, 8, 9, 10], for example).

It should be noted that the problem we consider here is different from the problem considered in some of the existing stochastic H∞ control literature. For example, in [5, 10], the stochastic H∞ approach was to incorporate stochastic disturbances into the system matrices to obtain a stochastic system with multiplicative white noise perturbations. That approach is quite different from the standard and familiar stochastic LQR/LQG approach, where the system matrices are assumed to be deterministic, not subject to stochastic perturbations or state-dependent multiplicative noises.

The organization of the chapter is as follows. In Section 2, we formally formulate the control problem of interest. In Section 3, we state and derive some mathematical preliminaries. In Section 4, we consider how one may synthesize a finite-horizon controller with an uncertain initial state. In Section 5, we consider steady-state solutions. Section 6 considers control with output feedback when the states are not directly measurable. In Section 7, we give an example. Section 8 contains our conclusions.

Notation: ||·||₂ denotes the Euclidean norm of a vector; ||·||_L2 the L2-norm; ≜ denotes "equal by definition"; E{·} denotes expectation; ∫_0^T (·) dt denotes the definite integral with respect to t from 0 to T; ||x(t)||_T² denotes E{∫_0^T x(t)^T x(t) dt}; and A ≥ B means A − B is positive semi-definite.


2. Problem formulation

Consider the continuous-time linear time-invariant system with input disturbances w(t) and v(t):

dx(t)/dt = A x(t) + B_u u(t) + B_w w(t) + v(t)    (2)
E{x(0)} = 0,  E{x(0)x(0)^T} ≤ X_0    (3)

A, B_u, and B_w are known real matrices, u(t) is the control input, w(t) is an unknown but bounded exogenous input with E{∫_0^T w(t)^T w(t) dt} ≤ 1, and v(t) is a stationary Gaussian zero-mean white noise with known covariance matrix Θ. For simplicity, we assume that the system is controllable, and observable in the case of output feedback.

The feedback control problem of our interest is to find a linear causal control law u(t) = K(t)x(t) to minimize

J_u = ||Π^{1/2}x(t)||_T² + ||R^{1/2}u(t)||_T² + E{x(T)^T Q x(T)}    (4)

where Π and R are positive-definite weight matrices and Q is a positive-semi-definite weight matrix.

Note that (4) may be re-written as

J_u = ||Π^{1/2}x(t)||_T² + ||R^{1/2}K(t)x(t)||_T² + E{||Q^{1/2}x(T)||²}    (5)

In what follows, we shall sometimes drop the argument t (especially for the system matrices), or denote z(t) as z_t, for notational simplicity when there is no danger of confusion.
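Before moving on, a minimal simulation sketch may help fix ideas about how the two disturbance classes in (2) and (3) enter the dynamics. The following Python fragment (all data, the gain K, and the horizon are hypothetical placeholders, not part of the chapter's development) propagates the state under an energy-bounded deterministic w(t) and a white noise v(t) via an Euler-Maruyama discretization:

import numpy as np

rng = np.random.default_rng(0)
A  = np.array([[0.0, 1.0], [-2.0, -1.0]])        # hypothetical data
Bu = np.array([[0.0], [1.0]]); Bw = np.array([[1.0], [0.5]])
Theta = 0.1 * np.eye(2); X0 = np.eye(2)
K = np.array([[-1.0, -2.0]])                     # any stabilizing gain
h, T = 1e-3, 5.0; N = int(T / h)

x  = rng.multivariate_normal(np.zeros(2), X0)    # E{x(0)} = 0, E{x(0)x(0)^T} = X0
Lc = np.linalg.cholesky(Theta)                   # v(t) has covariance Theta
for k in range(N):
    t = k * h
    w = np.array([np.sin(t)]) / np.sqrt(T / 2)   # so that ||w||_T^2 is about 1
    u = K @ x
    x = x + h * (A @ x + Bu @ u + Bw @ w) + np.sqrt(h) * (Lc @ rng.standard_normal(2))
print("x(T) =", x)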


3. Preliminaries

Lemma 3.1: Let A, B, L = L^T, and Q_L = Q_L^T be given matrices. Suppose there exists on the interval 0 ≤ t ≤ T a solution P_o(t) of the differential Riccati equation (DRE)

dP(t)/dt = −P(t)A − A^T P(t) + P(t)BB^T P(t) − L,  P(T) = Q_L    (6)

Then the minimum of the cost functional

J ≜ E{ ∫_0^T (x^T L x + w^T w) dt + x(T)^T Q_L x(T) }    (7)

for the system dx/dt = Ax(t) + Bw(t) + v, E{v(t)} = 0, E{v(t)v(t)^T} = Θ, E{x(0)} = 0, and E{x(0)x(0)^T} = X_0, is given by

J_min = Tr{P_o(0)X_0} + ∫_0^T Tr{P_o(t)Θ} dt    (8)

The minimizing w(t) is given by

w(t) = −B^T P_o(t) x(t)    (9)

Proof: This is a well-known stochastic LQR result; see Ref. [12], p. 221.
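As a minimal numerical illustration of Lemma 3.1 (a sketch with hypothetical system data; L is chosen only mildly negative so that the DRE solution exists over the whole interval), one may integrate (6) backward from P(T) = Q_L and then evaluate the minimum cost (8):

import numpy as np
from scipy.integrate import solve_ivp, trapezoid

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # hypothetical data
B = np.array([[0.0], [1.0]])
L = -np.eye(2)                             # indefinite weight, as the lemma allows
Q_L = -0.1 * np.eye(2)                     # terminal condition P(T) = Q_L
X0 = np.eye(2); Theta = 0.5 * np.eye(2); T = 2.0

def dre_rhs(t, p):
    # Right-hand side of (6): dP/dt = -PA - A^T P + P BB^T P - L.
    P = p.reshape(2, 2)
    return (-P @ A - A.T @ P + P @ B @ B.T @ P - L).ravel()

# Integrate backward from t = T to t = 0 (solve_ivp accepts a decreasing span).
sol = solve_ivp(dre_rhs, [T, 0.0], Q_L.ravel(), dense_output=True, rtol=1e-8)
P0 = sol.y[:, -1].reshape(2, 2)            # P_o(0)

# Minimum cost (8): Tr{P_o(0)X_0} plus the integral of Tr{P_o(t)Theta} over [0, T].
ts = np.linspace(0.0, T, 400)
traces = [np.trace(sol.sol(t).reshape(2, 2) @ Theta) for t in ts]
print("J_min =", np.trace(P0 @ X0) + trapezoid(traces, ts))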

We next give a similar result concerning the deterministic problem and the related Riccati differential equation, which will be useful later in studying some of the properties of the RDE (6).

Lemma 3.2: Let A(t), B(t), L(t) = L(t)^T, and Q_L = Q_L^T be given matrices. Suppose there exists on the interval 0 ≤ t ≤ T a solution P_o(t) of the differential Riccati equation (DRE)

dP(t)/dt = −P(t)A − A^T P(t) + P(t)BB^T P(t) − L,  P(T) = Q_L    (10)

Then the cost functional

η ≜ ∫_0^T (x^T L x + w^T w) dt + x(T)^T Q_L x(T)    (11)

for the system dx/dt = Ax(t) + Bw(t), x(0) = x_0, is given by

η = ||w(t) + B^T P_o(t) x(t)||_T² + x_0^T P_o(0) x_0    (12)

The minimum value of η is x_0^T P_o(0) x_0, and the minimizing w(t) is given by

w(t) = −B^T P_o(t) x(t)    (13)

Proof: See Ref. [13] (proof of Theorem 1, p. 131).

Remark 3.1: Note that Q_L and L are symmetric but not necessarily nonnegative definite. Furthermore, as pointed out in [13], because L is not assumed to be nonnegative definite, it is not generally true that a minimum of η exists. However, we know from LQR theory that such a minimum exists when L is nonnegative definite. Hence, it is reasonable to expect that a minimum of η exists provided L is not "exceedingly" negative definite.

Lemma 3.3: Let A, B, L = L^T ≤ 0, and Q_L = Q_L^T be given matrices. Suppose there exists on the interval 0 ≤ t ≤ T a solution P_s(t) of the differential Riccati inequality (DRI) [14]:

dP(t)/dt ≥ −P(t)A − A^T P(t) + P(t)BB^T P(t) − L,  P(T) = Q_L    (14)

Then for the system dx/dt = Ax(t) + Bw(t), x(0) = x_0, the cost η as defined in (11) satisfies:

η ≥ ||w(t) + B^T P_s(t) x(t)||_T² + x_0^T P_s(0) x_0    (15)

Proof: Since L ≤ 0, there exists an L_1 ≤ L such that

dP_s(t)/dt = −P_s(t)A − A^T P_s(t) + P_s(t)BB^T P_s(t) − L_1,  P_s(T) = Q_L    (16)

Let

η_1 ≜ ∫_0^T (x^T L_1 x + w^T w) dt + x(T)^T Q_L x(T)    (17)

Clearly, η ≥ η_1 for any x and w, since L ≥ L_1. Applying Lemma 3.2 with L replaced by L_1 gives η_1 = ||w(t) + B^T P_s(t) x(t)||_T² + x_0^T P_s(0) x_0, which completes the proof.

Remark 3.2: Since the inequality sign in (14) is non-strict, the set of solutions characterized by Lemma 3.3 contains P_o(t) of Lemma 3.2 as a member.

Remark 3.3: In Lemmas 3.1-3.3, P(t) is independent of the initial condition x0, even though the cost η is a function of x0.

Proposition 3.1: Let P_o(t) be the solution that satisfies Lemma 3.2 and P_1(t) be any solution that satisfies (14). Then

P_o(0) ≥ P_1(0)    (18)

Proof: Since η ≥ η_1 for any x and w, minimizing over w gives x_0^T P_o(0) x_0 ≥ x_0^T P_1(0) x_0 for any x_0. Thus (18) must follow.

Remark 3.4: Note that time 0 may be any time before the final time T, so (18) also implies P_o(t) ≥ P_1(t) for all t < T.

Proposition 3.2: Let P_1(t) and P_2(t) be any solutions that satisfy (14). Then P_1(t) also satisfies Lemma 3.2 if and only if

Tr{P_1(0)} ≥ Tr{P_2(0)}    (19)

for all admissible P_2(t).

Proof: Necessity is obvious from Proposition 3.1. To prove sufficiency, suppose there is a P_1(t) that satisfies (19) but is not a solution to Lemma 3.2. Since P_1(t) is a solution to (14), it follows from Proposition 3.1 that P_o(0) ≥ P_1(0). But P_o(t) is also an admissible solution of Lemma 3.3. Then, since Tr{P_1(0)} ≥ Tr{P_o(0)} while P_1(0) ≠ P_o(0), the relation P_o(0) ≥ P_1(0) cannot hold (a positive semi-definite difference with nonpositive trace must be zero). This contradicts Proposition 3.1 and completes the proof.

Proposition 3.3: Let P_1(t) be a solution to Lemma 3.2 and P_2(t) be any solution that satisfies (14). Then the following two statements are equivalent:

(i) Tr{P_1(t)} ≥ Tr{P_2(t)} for all t ∈ [0, T];  (ii) dTr{P_1(t)}/dt ≤ dTr{P_2(t)}/dt for all t ∈ [0, T]    (20)

Proof: This is an obvious consequence of Bellman's principle of optimality, since both P_1(t) and P_2(t) are solutions computed backward in time from the final time T with P(T) = Q_L.

Remark 3.5: Proposition 3.3 tells us that, in order to find the minimizing w(t) which minimizes (7) and (11) from the set of sub-optimal solutions characterized by Lemma 3.3, it suffices to search for the one with maximum Tr{P(t)}, or minimum dTr{P(t)}/dt, over all t ∈ [0, T]. These properties will be useful later when we search for the optimal controller via a linear matrix inequality (LMI), as follows.

Consider the LMI:

[ Γ(t) + PA + A^T P + L    PB ]
[ B^T P                     I ]  ≥ 0    (21)

Applying the Schur complement to (21) gives Γ(t) ≥ −P(t)A − A^T P(t) + P(t)BB^T P(t) − L. It follows from Proposition 3.3 that the Γ(t) that solves min TrΓ subject to (21) at each t ∈ [0, T] gives the desired dP/dt that solves dP/dt = −PA − A^T P + PBB^T P − L for all t ∈ [0, T], where Γ = dP/dt.
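As a concrete (hypothetical) illustration of this step, the semidefinite program min TrΓ subject to (21) at a single time instant, with P(t) given, can be written with cvxpy along the following lines; by the Schur-complement argument above, the minimizer recovers Γ = dP/dt at that instant:

import numpy as np
import cvxpy as cp

n = 2
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # hypothetical data
B = np.array([[0.0], [1.0]])
L = -np.eye(n)                             # L = L^T <= 0, as in Lemma 3.3
P = -0.5 * np.eye(n)                       # current value P(t), assumed known

Gam = cp.Variable((n, n), symmetric=True)  # plays the role of dP/dt
lmi = cp.bmat([[Gam + P @ A + A.T @ P + L, P @ B],
               [B.T @ P, np.eye(B.shape[1])]])
# Symmetrize explicitly so that the PSD constraint is well posed.
prob = cp.Problem(cp.Minimize(cp.trace(Gam)), [0.5 * (lmi + lmi.T) >> 0])
prob.solve()
print("dP/dt at this instant:\n", Gam.value)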

Before proceeding further, let us recollect what we have done so far. Lemmas 3.1 and 3.2 show that the minimizing w(t) is the same whether or not the system is subjected to a stochastic white noise input. However, the resulting minimum costs differ, being given by (8) and by x_0^T P_o(0) x_0 (implicitly given by (12)), respectively. Then, from Lemma 3.3 through Proposition 3.3, we transformed the problem of finding P(t) for (6) into an equivalent LMI problem.


4. Finite-horizon state regulation

Let u = Kx, where K is the state feedback gain. Let A_cl = A + B_u K, and let

L = Π + K^T R K    (22)

Then it is clear that

J_u = E{ ∫_0^T x^T(Π + K^T R K)x dt + x(T)^T Q x(T) }    (23)

We are interested in finding the w(t) that maximizes J_u. For this purpose, we consider

−γ⁻²E{ ∫_0^T x^T(Π + K^T R K)x dt + x(T)^T Q x(T) } + ||w||_T² − ||w||_T²    (24)

Let

dP(t)/dt = −P(t)A_cl − A_cl^T P(t) + P(t)B_w B_w^T P(t) + γ⁻²(Π + K^T R K),  P(T) = −γ⁻²Q    (25)

Lemma 4.1: Let A_cl = A + B_u K and P_o be given by (25) for some chosen value of γ². Then, for the system (2), the state feedback control law u = Kx achieves the following closed-loop performance for all admissible w(t):

J_u ≤ γ²( ||w||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (26)

and the worst-case w_o achieves

J_u = γ²( ||w_o||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (27)

Proof: Apply Lemma 3.1 with L = −γ⁻²(Π + K^T R K) and Q_L = −γ⁻²Q. This gives

min_w { −γ⁻²E[ ∫_0^T x^T(Π + K^T R K)x dt + x(T)^T Q x(T) ] + ||w||_T² } = Tr{P_o(0)X_0} + ∫_0^T Tr{P_o(t)Θ} dt

that is, −γ⁻²J_u + ||w||_T² ≥ Tr{P_o(0)X_0} + ∫_0^T Tr{P_o(t)Θ} dt for every admissible w, with equality at the minimizing w. Multiplying throughout by −γ², we obtain

J_u ≤ γ²||w||_T² − γ²( Tr{P_o(0)X_0} + ∫_0^T Tr{P_o(t)Θ} dt )    (28)

with equality at the worst-case w_o. This completes the proof.

Remark 4.1: From (26), we recover the conventional H∞ inequality [4, 12]:

J_u ≤ γ²||w||_T²  if x_0 = 0 and Θ = 0    (29)

Corollary 4.1: Let A_cl = A + B_u K and P_o be given by Lemma 4.1. If w_o = −B_w^T P_o(t)x(t), then

J_u = γ²( ||w_o||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (30)

Proof: Obvious from (28) and Lemma 3.1. This completes the proof.

To derive the optimal controller of Lemma 4.1, we substitute A_cl = A + B_u K into (6), with L = −γ⁻²(Π + K^T R K) and P(T) = −γ⁻²Q. Thus we obtain

dP(t)/dt + P(t)(A + B_u K) + (A + B_u K)^T P(t) − P(t)B_w B_w^T P(t) − γ⁻²(Π + K^T R K) = 0    (31)

From (27), it is clear that the "larger" Tr{P_o(0)X_0} and ∫_0^T Tr{P_o(t)Θ} dt are, the smaller J_u will be. This implies that we want to find the least negative-definite P_o(t) for all t < T; Proposition 3.2 implies that we should search for the P_o(t) with the least negative Tr{P(t)}. Also, since we will be working backward in time when solving (31), we want a K(t) which minimizes Tr{dP/dt} at every instant t (principle of optimality), so as to maximize Tr{P_o(t)} (Propositions 3.1 and 3.2). To achieve this, we note

dP(t)/dt = −P(t)(A + B_u K) − (A + B_u K)^T P(t) + P(t)B_w B_w^T P(t) + γ⁻²(Π + K^T R K)    (32)

Differentiating Tr{dP/dt} with respect to K and setting the result to zero:

0 = −2P(t)B_u + 2γ⁻²K^T R    (33)
K = γ²R⁻¹B_u^T P,  d²Tr{dP/dt}/dK² = 2γ⁻²R > 0    (34)

Substituting (34) into (32), we obtain

dP/dt = −PA − A^T P + P( B_w B_w^T − γ²B_u R⁻¹B_u^T )P + γ⁻²Π    (35)

If we let γ⁻²P_γ = −P, then we may show that (35) is equivalent to

dP_γ/dt = −P_γ A − A^T P_γ + P_γ( B_u R⁻¹B_u^T − γ⁻²B_w B_w^T )P_γ − Π    (36)

Note that (36) is the same RDE obtained in the conventional approach to H∞ control with R = I and zero initial state ([4, 12]). We have thus established the connection between (35) and (36). However, in our derivation here we are able to concurrently derive a quantitative cost corresponding to the solution, based on the initial condition and the simultaneous disturbances, as in the case of stochastic LQR control.

Theorem 4.1: Let A_cl = A + B_u K and P_o be given by (35) with P(T) = −γ⁻²Q. Then K = γ²R⁻¹B_u^T P_o achieves

J_u ≤ γ²( ||w||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (37)

and the worst-case w_o achieves

J_u = γ²( ||w_o||_T² − Tr{P_o(0)X_0} − ∫_0^T Tr{P_o(t)Θ} dt )    (38)

Proof: Substituting K = γ²R⁻¹B_u^T P into (25) and applying Lemma 4.1 completes the proof.

To compute ||w_o||_T, note that with w_o = −B_w^T P_o(t)x(t) the closed-loop system is given by

dx/dt = Ax + B_u Kx − B_w B_w^T P_o x + v = (A + B_u K − B_w B_w^T P_o)x + v ≜ A_w x + v    (39)

Define

ExtxtT=XtE40

Then Xt satisfies [12]:

dXt/dt=AwXt+XtAwT+Θ,Xt0=X0E41
EwoT2=ETxtTPoBwBwTPoxtdt=TrTPoBwBwTPoXtdtE42

To check whether the selected value of γ² is the correct one, we require γ² to be such that ||w_o||_T² = 1. This can be achieved by varying and searching over values of γ². We note that ||w_o||_T² is generally a decreasing function of γ², as is already known from conventional H∞ control theory.

Once the desired value of γ2 and hence P(t), 0 ≤ t ≤ T, are found, we may then compute the cost via Theorem 4.1. We summarize the results in the following conceptual algorithm:

Conceptual Algorithm 1.

Step 1: Choose a value of γ² and compute P(t) from (35). If P(t) does not exist over the entire time interval [0, T], the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and repeat this step.

Step 2: If P(t) exists over the entire time interval [0, T], compute ||w_o||_T². If ||w_o||_T² = 1, we have the desired solution, and the optimal state feedback law is given by (34).

Step 3: If ||w_o||_T² > 1, the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and go to Step 1.

Step 4: If ||w_o||_T² < 1, the chosen γ² is too large. Set upper bound γ_u² = γ², decrease γ², and go to Step 1.

Remark 4.2: It is important to note the presence of Θ in (41); in other words, v does have an impact on the value of X_t and hence on w_o. This distinguishes our approach from previous mixed H2/H∞ solutions, which do not explicitly take Θ into consideration.
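A minimal numerical sketch of Conceptual Algorithm 1 follows (the bisection bracket and the horizon T are assumed values; the plant data are borrowed from the example of Section 7). It integrates the RDE (35) backward, propagates X_t forward via (41), evaluates ||w_o||_T² via (42), and bisects on γ²:

import numpy as np
from scipy.integrate import solve_ivp, trapezoid

A  = np.array([[1.0]]); Bu = np.array([[1.0]]); Bw = np.array([[2.0]])
Pi = np.array([[100.0]]); R = np.array([[1.0]]); Q = np.array([[0.0]])
X0 = np.array([[1.0]]); Theta = np.array([[1.0]]); T = 5.0
n = A.shape[0]

def solve_P(g2):
    # Integrate the RDE (35) backward from P(T) = -Q/g2.
    def rhs(t, p):
        P = p.reshape(n, n)
        S = Bw @ Bw.T - g2 * Bu @ np.linalg.solve(R, Bu.T)
        return (-P @ A - A.T @ P + P @ S @ P + Pi / g2).ravel()
    sol = solve_ivp(rhs, [T, 0.0], (-Q / g2).ravel(), dense_output=True)
    return sol if sol.success else None    # failure: P escaped in finite time

def wo_norm_sq(g2, sol):
    # Forward Lyapunov ODE (41) for X_t, then the worst-case energy (42).
    def rhs(t, x):
        P = sol.sol(t).reshape(n, n)
        K = g2 * np.linalg.solve(R, Bu.T) @ P      # gain (34)
        Aw = A + Bu @ K - Bw @ Bw.T @ P            # worst-case loop (39)
        X = x.reshape(n, n)
        return (Aw @ X + X @ Aw.T + Theta).ravel()
    xs = solve_ivp(rhs, [0.0, T], X0.ravel(), dense_output=True)
    ts = np.linspace(0.0, T, 400)
    vals = [np.trace(sol.sol(t).reshape(n, n) @ Bw @ Bw.T
                     @ sol.sol(t).reshape(n, n) @ xs.sol(t).reshape(n, n))
            for t in ts]
    return trapezoid(vals, ts)

lo, hi = 1.0, 100.0                        # assumed bisection bracket for gamma^2
for _ in range(40):
    g2 = 0.5 * (lo + hi)
    sol = solve_P(g2)
    if sol is None or wo_norm_sq(g2, sol) > 1.0:
        lo = g2                            # Steps 1 and 3: gamma^2 too small
    else:
        hi = g2                            # Step 4: gamma^2 too large
print("gamma^2 ~", 0.5 * (lo + hi))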


5. Infinite-horizon state regulation

It is well known from conventional H∞ control theory ([4, 12, 15]) that if γ² is large enough, the feedback gain approaches a steady-state value K far from the final time; the feedback system then approaches a linear time-invariant system. Unfortunately, this is not the steady-state solution for us, at least not for the problem formulated in Section 2. This is because as T → ∞ and the system approaches steady state, lim_{T→∞} Tr{∫_0^T ΘP_o(t) dt} → lim_{T→∞} Tr{TΘP_o(∞)} → −∞ dominates the other two cost components, namely Tr{P_o(∞)X_0} and lim_{T→∞}||w||_T = ||w||_L2 < ∞. It follows that for formulations which require ||w||_L2 < ∞ and finite x_0, the optimal infinite-horizon solution is indeed the LQR, since it minimizes Tr{ΘP_o(∞)} with γ⁻² = 0.

Instead, we shall explore an alternative formulation. First, we note that the existence of a steady-state solution P of the algebraic Riccati equation means that P is the solution to the RDE with −γ⁻²Q = P(t) = P, 0 ≤ t ≤ T. Substituting this into the RHS of (26), we obtain and define

η_ss ≤ γ²( E{||w||_T²} − Tr{PX_0} − T·Tr{ΘP} )    (43)

where

η_ss ≜ E{ ||Π^{1/2}x||_T² + ||R^{1/2}u||_T² } − γ²E{x(T)^T P x(T)}    (44)

Dividing (43) by T gives

η_ss/T ≤ γ²( E{||w||_T²}/T − Tr{PX_0}/T − Tr{ΘP} )    (45)

Assume lim_{T→∞} E{||w||_T²}/T = p². Note that p² may be interpreted as "power". We obtain the following result:

Lemma 5.1: Let P, K, and A_cl = A + B_u K be the steady-state solutions corresponding to (35) with T → ∞. If A + B_u K − B_w B_w^T P is stable, then K achieves the infinite-horizon performance

η_∞ ≜ lim_{T→∞} η_ss/T ≤ γ²( p² − Tr{ΘP} )    (46)

Proof: If A + B_u K − B_w B_w^T P is stable, it implies that if we let Q = −γ²P, then P(t) = P exists for all t < ∞. It follows that (45) holds. Noting lim_{T→∞} Tr{PX_0}/T = 0 completes the proof.

Remark 5.1: Substituting lim_{T→∞} E{||w||_T²}/T = p² into (46) and setting Θ = 0 shows that the closed-loop H∞ norm from w is bounded by γ. Similarly, setting w = 0 shows that the steady-state LQR cost, averaged over T as T → ∞, is γ²Tr{−ΘP}.

Finally, since A + B_u K − B_w B_w^T P is stable, there exists an X which satisfies the following Lyapunov equation:

(A + B_u K − B_w B_w^T P)X + X(A + B_u K − B_w B_w^T P)^T + Θ = 0    (47)

So w_o = −B_w^T P x gives

E{||w_o||_T²} = ∫_0^T Tr{(B_w^T P)^T(B_w^T P)X} dt = Tr{(B_w^T P)^T(B_w^T P)X}·T    (48)

since B_w, P, and X are all constant matrices. Dividing both sides of (48) by T gives

lim_{T→∞} E{||w_o||_T²}/T = Tr{(B_w^T P)^T(B_w^T P)X}    (49)

where X is given by (47).

We may further assume that lim_{t→∞} E{||w||₂²} = lim_{T→∞} ||w||_T²/T ≤ 1, and that Θ is unknown with worst-case TrΘ ≤ σ, where σ ≥ 0. This corresponds to the system being subject to both a worst-case w(t) and v(t), where v(t) belongs to the class of white noises satisfying E{v^Tv} ≤ σ. The controller designed may then be interpreted as one which optimizes the expected worst-case performance in the presence of both bounded-power deterministic disturbances and stochastic white noises. We summarize the procedure for finding such a controller in Conceptual Algorithm 2.

Conceptual Algorithm 2.

Step 1: Guess a value of γ² and compute P from (35) with dP/dt = 0. If no solution P < 0 exists, the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and repeat this step.

Step 2: If P exists, find the Θ_o which solves min_{TrΘ=σ} Tr{PΘ}. Compute X from (47) with Θ = Θ_o and compute p² = lim_{T→∞} E{||w_o||_T²}/T via (49). If p² = 1, we have the desired solution.

Step 3: If p² > 1, the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and go to Step 1.

Step 4: If p² < 1, the chosen γ² is too large. Set upper bound γ_u² = γ², decrease γ², and go to Step 1.
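A corresponding sketch of Conceptual Algorithm 2 (again with assumed data, taken from the example of Section 7; for a scalar Θ with σ = 1, the worst-case Θ_o of Step 2 is simply Θ = 1). The steady-state (35) is solved through the equivalent ARE (36) for P_γ = −γ²P, via the stable invariant subspace of the associated Hamiltonian matrix; X then comes from the Lyapunov equation (47) and p² from (49):

import numpy as np
from scipy.linalg import schur, solve_continuous_lyapunov

A  = np.array([[1.0]]); Bu = np.array([[1.0]]); Bw = np.array([[2.0]])
Pi = np.array([[100.0]]); R = np.array([[1.0]]); Theta = np.array([[1.0]])
n = A.shape[0]

def steady_state(g2):
    # ARE (36): P_g A + A^T P_g - P_g (Bu R^{-1} Bu^T - Bw Bw^T / g2) P_g + Pi = 0,
    # solved from the stable invariant subspace of the Hamiltonian matrix.
    S = Bu @ np.linalg.solve(R, Bu.T) - Bw @ Bw.T / g2
    H = np.block([[A, -S], [-Pi, -A.T]])
    _, Z, k = schur(H, output='real', sort='lhp')
    if k != n:
        return None                              # no stabilizing solution
    Pg = Z[n:, :n] @ np.linalg.inv(Z[:n, :n])
    P = -Pg / g2                                 # recover P of (35)
    K = g2 * np.linalg.solve(R, Bu.T) @ P        # gain (34)
    Aw = A + Bu @ K - Bw @ Bw.T @ P
    if np.any(np.linalg.eigvals(Aw).real >= 0):
        return None                              # Lemma 5.1 requires Aw stable
    X = solve_continuous_lyapunov(Aw, -Theta)    # Lyapunov equation (47)
    p2 = np.trace(P @ Bw @ Bw.T @ P @ X)         # power of w_o, Eq. (49)
    return P, K, p2

lo, hi = 4.0, 20.0                               # assumed bracket for gamma^2
for _ in range(60):
    g2 = 0.5 * (lo + hi)
    out = steady_state(g2)
    if out is None or out[2] > 1.0:
        lo = g2                                  # Steps 1 and 3
    else:
        hi = g2                                  # Step 4
P, K, p2 = steady_state(hi)
print(f"gamma^2 ~ {hi:.3f}, P = {P}, K = {K}, p2 = {p2:.3f}")

For the scalar example, this search settles near γ² ≈ 8, consistent with the observation in Section 7.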


6. H∞ control with output feedback

In this section, we consider the case where the state of the system is not available for feedback. We then consider dynamical controllers of the form u(t) = K(t)*z(t), where z(t) denotes the measured output available to the controller and * denotes the convolution operator. Let

z(t) = C_z(t)x    (50)

Assume that the state equation of the controller is given by

dx_K/dt = A_K(t)x_K + B_K(t)z    (51)
u = C_K(t)x_K + D_K(t)z = C_K x_K + D_K C_z x    (52)

The closed-loop system may then be represented by the augmented state-space model with x_cl ≜ [x^T x_K^T]^T, [3, 16]:

dx_cl/dt = A_cl x_cl + B_cl w + [I 0]^T v ≜ A_cl x_cl + B_cl w + ζ    (53)
A_cl = A_p + BGC    (54)
B_cl = [B_w^T 0]^T    (55)
A_p = [A 0; 0 0],  B = [B_u 0; 0 I],  C = [C_z 0; 0 I],  G = [D_K C_K; B_K A_K]    (56)

Although in the above formulation the dimensions of A and A_K need not be the same, we shall assume they are the same, for simplicity and better performance. We shall also set x_K(0) = E{x_0} = 0. Note that the covariance of the stochastic input in (53) is E{ζζ^T} = [I 0]^T Θ [I 0].

u(t) in (52) may be further rewritten as

u = [ Y G V^T V + Y G Y^T C_z Y ] x_cl    (57)

since

D_K = [I 0] G [I 0]^T ≜ Y G Y^T    (58)
C_K = [I 0] G [0 I]^T ≜ Y G V^T    (59)

Substituting (57) into the cost functional:

x^T Π x + u^T R u = x_cl^T { Y^T Π Y + [Y G V^T V + Y G Y^T C_z Y]^T R [Y G V^T V + Y G Y^T C_z Y] } x_cl    (60)

and defining

Q_z ≜ Y^T Π Y + [Y G V^T V + Y G Y^T C_z Y]^T R [Y G V^T V + Y G Y^T C_z Y]

it then follows that

x^T Π x + u^T R u = x_cl^T Q_z x_cl    (61)

Now, substituting all these parameters into (25), we obtain

dP(t)/dt + P(t)A_cl + A_cl^T P(t) − P(t)B_cl B_cl^T P(t) − γ⁻²Q_z = 0,  P(T) = −γ⁻²Y^T Q Y    (62)

Since A_cl and [Y G V^T V + Y G Y^T C_z Y] are affine in G, (62) may be transformed into an LMI in G:

[ Γ + P A_cl + A_cl^T P − γ⁻²Y^T Π Y    P B_cl    (Y G V^T V + Y G Y^T C_z Y)^T ]
[ B_cl^T P                              I         0                             ]
[ Y G V^T V + Y G Y^T C_z Y             0         γ²R⁻¹                         ]  ≥ 0    (63)

Hence, (62) may be solved by, at each time step t, finding Γ(t) and G(t) from the following optimization problem:

min_G TrΓ subject to (63)    (64)

With Γ(t) = dP/dt found, P(t−ε) may be computed (note that a minimum dP/dt gives a maximum P(t−ε), given P(t)). Hence, we may iteratively work backward (numerically) toward time 0 to construct P(t) and G(t).
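A sketch of a single such backward step, written with cvxpy under hypothetical data (controller order equal to the plant order, and u, z, and x all of dimension n, so that the selector matrices Y and V are as defined above):

import numpy as np
import cvxpy as cp

n = 2
A  = np.array([[0.0, 1.0], [-2.0, -3.0]])        # hypothetical data
Bu = np.eye(n); Bw = np.array([[0.0], [1.0]]); Cz = np.eye(n)
Pi = np.eye(n); R = np.eye(n); g2 = 4.0
q  = Bw.shape[1]                                 # dimension of w

Ap  = np.block([[A, np.zeros((n, n))], [np.zeros((n, n)), np.zeros((n, n))]])
B   = np.block([[Bu, np.zeros((n, n))], [np.zeros((n, n)), np.eye(n)]])
C   = np.block([[Cz, np.zeros((n, n))], [np.zeros((n, n)), np.eye(n)]])
Y   = np.hstack([np.eye(n), np.zeros((n, n))])   # x  = Y x_cl
V   = np.hstack([np.zeros((n, n)), np.eye(n)])   # xK = V x_cl
Bcl = np.vstack([Bw, np.zeros((n, q))])
P   = -0.1 * np.eye(2 * n)                       # P(t) from the previous step

Gam = cp.Variable((2 * n, 2 * n), symmetric=True)
G   = cp.Variable((2 * n, 2 * n))                # G = [D_K C_K; B_K A_K]
PAcl = P @ Ap + P @ B @ G @ C                    # P A_cl, affine in G
M    = Y @ G @ V.T @ V + Y @ G @ Y.T @ Cz @ Y    # u = M x_cl, Eq. (57)
lmi = cp.bmat([
    [Gam + PAcl + PAcl.T - Y.T @ Pi @ Y / g2, P @ Bcl, M.T],
    [(P @ Bcl).T, np.eye(q), np.zeros((q, n))],
    [M, np.zeros((n, q)), g2 * np.linalg.inv(R)],
])
prob = cp.Problem(cp.Minimize(cp.trace(Gam)), [0.5 * (lmi + lmi.T) >> 0])
prob.solve()
eps = 1e-2                                       # assumed step size
P_next = P - eps * Gam.value                     # one backward Euler step
print(prob.status, "Tr(Gamma) =", prob.value)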

With G(t) found, the worst-case w may be found as

w_o(t) = −B_cl^T P_o(t) x_cl(t)    (65)

and we let

A_w = A_p + BGC − B_cl B_cl^T P_o    (66)

The rest of the development is hence similar to the full information case and left to the reader.

Conceptual Algorithm 3.

Step 1: Choose a value of γ² and find G(t) and P(t) via (63) and (64). With G(t), compute A_cl and B_cl from (54) and (55), respectively.

Step 2: With the P(t) obtained, compute ||w_o||_T² with x_cl(0) = 0. If ||w_o||_T² = 1, we have the optimal solution, with C_K and D_K defining the optimal control law (52).

Step 3: If ||w_o||_T² > 1, the chosen γ² is too small. Set lower bound γ_l² = γ², increase γ², and go to Step 1.

Step 4: If ||w_o||_T² < 1, the chosen γ² is too large. Set upper bound γ_u² = γ², decrease γ², and go to Step 1.

Another variant of the output feedback problem arises if we assume

z(t) = C_z(t)x + D_w w + m    (67)

where we assume D_w B_w^T = 0, and m is a stationary zero-mean white noise process with E{mm^T} = ʘ and E{mv^T} = 0. In this case, we may consider a strictly causal controller

dx_K/dt = A_K(t)x_K + B_K(t)z,  u = C_K(t)x_K  (i.e., D_K = 0)    (68)

Other than mathematical simplicity, choosing a strictly causal controller when the plant is not strictly proper may also be desirable for robustness, because a strictly proper return ratio is generally more robust against unmodeled dynamics at high frequencies.

The augmented system in this case is given by:

dx_cl/dt = A_cl x_cl + B_cl w + B_ξ ξ,  B_ξ = [ [I 0]^T   BG[I 0]^T ]    (69)

where A_cl and G are given by (54)-(56) with D_K = 0,

ξ = [v^T m^T]^T, and E{ξξ^T} = diag[Θ, ʘ]    (70)

where diag[ϴ, ʘ] denotes a block diagonal matrix with diagonal sub-blocks ϴ and ʘ.

B_cl = [B_w^T 0]^T + BG[D_w^T 0]^T    (71)
u = C_K x_K = C_K V x_cl    (72)

Thus

||Π^{1/2}x(t)||_T² + ||R^{1/2}u(t)||_T² = E{ ∫_0^T x_cl^T [ Y^T Π Y + V^T C_K^T R C_K V ] x_cl dt }    (73)

Hence, we may solve the problem by letting

Q_z = Y^T Π Y + V^T C_K^T R C_K V    (74)

and substituting (54), (71), and (56) with D_K = 0, together with (74), into (62) to obtain the appropriate RDE. The remaining details and steps are similar to those of the D_w = 0 case above and are left to the reader.


7. Example

We consider the following example, modified from ([11], p. 335):

dx/dt = x + u + 2w + v    (75)

with

Π = 100,  R = 1,  Q = 0.

We further assume

E{v} = 0,  E{vv^T} = 1,  E{x(0)} = 0,  E{x(0)x(0)^T} = 1    (76)

We shall apply Conceptual Algorithm 2 to find a linear time-invariant controller that minimizes the infinite-horizon performance index η_∞ ≜ lim_{T→∞}(η_ss/T) of Section 5. Iterating with different values of γ², the steady-state solutions for the relevant parameters are tabulated in Table 1 below:

γ²      −P_o      X        E{||w||₂²}    η        η/p²     η/(p² + q²)
4.5     8.44      0.118    33.6          42.04    1.25     1.22
5       5.58      0.094    11.74         17.32    1.47     1.35
8       2.035     0.063    1.03          3.065    2.97     1.51
10      1.468     0.058    0.50          1.968    3.94     1.31

Table 1.

Iterations with different values of γ².

where q² ≜ TrΘ = 1 (note also p² ≜ E{||w||₂²}). It can be seen that the desired solution occurs at about γ² = 8. We also see that the values of lim_{t→∞} E{||w||²} and of Θ determine the desirable value of γ² to be used. Note also that while η/p² monotonically increases with γ², η/(p² + TrΘ) has a maximum with respect to γ², which identifies the worst-case combination of w and v from a power-ratio point of view. Note that p² + TrΘ is a measure of the combined power of the two disturbances.
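For this scalar example, the steady-state quantities can also be computed in closed form. The following sketch (a hypothetical reconstruction, not the authors' code) recomputes, for each tabulated γ², the negative root P of the scalar steady-state version of (35), X from (47), p² from (49), and the combination p² − Tr{ΘP}, which the η column of Table 1 matches row by row. The printed values land close to the tabulated ones, with small differences attributable to rounding:

import numpy as np

A, Bu, Bw, Pi, R, Theta = 1.0, 1.0, 2.0, 100.0, 1.0, 1.0

for g2 in (4.5, 5.0, 8.0, 10.0):
    # Scalar steady-state (35): 0 = -2AP + (Bw^2 - g2 Bu^2/R) P^2 + Pi/g2.
    P = np.roots([Bw**2 - g2 * Bu**2 / R, -2.0 * A, Pi / g2]).real.min()
    K = g2 * Bu * P / R                  # gain (34)
    Aw = A + Bu * K - Bw**2 * P          # worst-case loop, w_o = -Bw P x
    X = -Theta / (2.0 * Aw)              # scalar Lyapunov equation (47)
    p2 = (Bw * P)**2 * X                 # Eq. (49)
    eta = p2 - Theta * P                 # the combination tabulated as eta
    print(f"g2={g2:5.1f}  -P={-P:6.3f}  X={X:.4f}  p2={p2:.3f}  "
          f"eta={eta:.3f}  eta/p2={eta/p2:.2f}  eta/(p2+1)={eta/(p2+1):.2f}")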


8. Conclusion

We have considered the H∞ state regulation problem in which the initial state of the plant is possibly nonzero and the system is subject to a worst-case nonzero-mean exogenous input and a zero-mean white noise disturbance. It should be clear that the two signals may be combined to represent a nonzero-mean disturbance consisting of both high- and low-frequency components.

We note that our results here were derived from the Riccati differential equation related to the LQR, unlike most existing approaches, which are based on differential game theory. Rather than looking for a minimax saddle point as a solution, we derive our solution by maximizing P(0) of the Riccati differential equation via Bellman's principle of optimality, P(0) being directly related to the value of the cost functional concerned when the initial state is nonzero. This should be familiar to anyone acquainted with LQR theory. We believe that this approach offers a new (in addition to existing works such as [1, 2, 3, 4, 5], for example), and arguably also simpler, perspective on the solution of the H∞ control problem, and of H∞ control with stochastic disturbances (compare [5, 8, 9, 10], for example). In fact, it is the recognition that the H∞ and LQR problems may be solved via the same Riccati equation that enables us to develop the present simple way of incorporating a stochastic exogenous input, in addition to nonzero initial conditions, into H∞ control, based on well-known stochastic LQR theory. We anticipate that this approach will enable control engineers to extend many H∞ results to include stochastic disturbances, capitalizing on, and drawing from, the vast pool of results that have already been developed for stochastic LQR/LQG control. We believe this to be one of the main contributions of the present chapter. Note that because we are still working with deterministic system matrices, established and well-known techniques such as the singular-value method or the small-μ test are directly applicable to the analysis of the designed system [15].

We further note that the worst-case deterministic disturbance is of the form w(t) = −B_w^T P_o(t)x(t). This is useful because, if we assume the uncertain A coefficient matrix of the open-loop system to be A + Δ, with no deterministic disturbances, then the resultant system may be modeled as Ax + w where w = Δx. It follows that if P_o B_w B_w^T P_o ≥ Δ^TΔ for all admissible Δ, then the controller obtained with our approach will also guarantee the performance of the closed-loop system, with performance bounds (46)-(48), when the open-loop A matrix is subject to the perturbation Δ. Hence, B_w may here be construed as a free parameter/weight matrix which, together with γ², we may vary in order to search for a P_o that satisfies (35) with dP/dt = 0 and P_o B_w B_w^T P_o ≥ Δ^TΔ for all admissible Δ (for example, if Δ is assumed to be an unstructured perturbation [15], we may assume σ²I ≥ Δ^TΔ, where σ denotes the singular-value bound), thus solving the robust stochastic control problem with an uncertain A-matrix. Note that if we let W_B = P_o B_w B_w^T P_o, then for any P_o obtained we may find B_w = (P_o⁻¹ W_B P_o⁻¹)^{1/2}. Thus, we may substitute P_o B_w B_w^T P_o = σ²I into (35) with dP/dt = 0 to obtain

0 = −PA − A^T P + σ²I − γ²P B_u R⁻¹B_u^T P + γ⁻²Π    (77)

as the Riccati equation to be solved, searching over γ⁻², for the robust performance control problem with unstructured A-matrix perturbations satisfying Δ^TΔ ≤ σ²I.
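Under the substitution γ⁻²P_γ = −P, (77) becomes the standard ARE A^T P_γ + P_γ A − P_γ B_u R⁻¹B_u^T P_γ + (Π + γ²σ²I) = 0, which off-the-shelf solvers handle directly. A minimal sketch with hypothetical data:

import numpy as np
from scipy.linalg import solve_continuous_are

A  = np.array([[0.0, 1.0], [-1.0, -1.0]])  # hypothetical data
Bu = np.array([[0.0], [1.0]])
Pi = np.eye(2); R = np.array([[1.0]])
g2, sigma2 = 10.0, 0.25                    # assumed gamma^2 and sigma^2

Pg = solve_continuous_are(A, Bu, Pi + g2 * sigma2 * np.eye(2), R)
P  = -Pg / g2                              # recover P of (77)
K  = g2 * np.linalg.solve(R, Bu.T) @ P     # = -R^{-1} Bu^T P_gamma, gain (34)
print("K =", K, "; closed-loop eigenvalues:", np.linalg.eigvals(A + Bu @ K))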

Of course, assuming that only the A matrix is uncertain is unrealistic in practice. But it does suggest a possibly fruitful approach to the problem of worst-case performance under system-matrix perturbations, if we are able to extend the approach to address uncertainties in the input and output matrices as well. This will be a direction of our future research.

References

  1. Bernstein DS, Haddad WM. LQG control with an H∞ performance bound: A Riccati equation approach. IEEE Transactions on Automatic Control. 1989;34(3):293-305
  2. Doyle JC, Glover K, Khargonekar PP, Francis BA. State-space solutions to standard H2 and H∞ control problems. IEEE Transactions on Automatic Control. 1989;34(8):831-847
  3. Dullerud GE, Paganini F. A Course in Robust Control Theory: A Convex Approach. New York, NY: Springer; 2000
  4. Green M, Limebeer DJN. Linear Robust Control. Englewood Cliffs, NJ: Prentice-Hall; 1995
  5. Hinrichsen D, Pritchard AJ. Stochastic H∞. SIAM Journal on Control and Optimization. 1998;36:1505
  6. Foo YK. H∞ control with initial condition. IEEE Transactions on Circuits and Systems II: Express Briefs. 2006;53(9):867-871
  7. Li X, Xu J, Zhang H. Standard solution to mixed H2/H∞ control with regular Riccati equation. IET Control Theory and Applications. 2020;14(20):3643-3651
  8. Khargonekar PP, Rotea MA. Mixed H2/H∞ control: A convex optimization approach. IEEE Transactions on Automatic Control. 1991;36(7):824-837
  9. Saberi A, Chen BM, Sannuti P, Ly U. Simultaneous H2/H∞ optimal control: The state feedback case. Automatica. 1993;29(6):1611
  10. Zhang W, Xie L, Chen BS. Stochastic H2/H∞ Control: A Nash Game Approach. Boca Raton, FL, USA: CRC Press; 2017
  11. Khargonekar PP, Nagpal KM, Poolla KR. H∞ control with transients. SIAM Journal on Control and Optimization. 1991;29:1372-1393
  12. Burl JB. Linear Optimal Control: H2 and H∞ Methods. California, USA: Addison-Wesley; 1999
  13. Brockett RW. Finite Dimensional Linear Systems. New York: Wiley; 1970
  14. Foo YK. Strengthened H∞ control via state feedback: A majorization approach using algebraic Riccati inequalities. IEEE Transactions on Automatic Control. 2004;49(5):824-827
  15. Skogestad S, Postlethwaite I. Multivariable Feedback Control: Analysis and Design. West Sussex, England: Wiley; 1996
  16. Skelton RE, Iwasaki T, Grigoriadis KM. A Unified Algebraic Approach to Linear Control Design. New York: Taylor and Francis; 1998
