A New Method for Evaluating Air Quality Using an Ideal Near-Gray Function Cluster Correlation Analysis Method

A sample drawn from the monitoring data reports of environmental management services is first classified by ideal near-gray function cluster analysis. The sample level is then determined by gray correlation analysis, and comprehensive assessment conclusions are drawn from the degree of correlation between the sample classification and the levels specified in GB3095-2012.

Classification of the sample to be evaluated

Establishment of the evaluation index sequence matrix for the selected sample

Let \(S\) be the sequence of clustering objects, \(S = \{s_1, s_2, \ldots, s_m\}\), and let \(X\) be the sequence of variables influencing air quality, \(X = \{x_1, x_2, \ldots, x_n\}\). \(x_{ik}\) is the original monitored value for object \(s_i\) (\(i = 1, 2, \ldots, m\)) and index \(x_k\) (\(k = 1, 2, \ldots, n\)); \(i\) runs over the \(m\) objects considered in clustering, and \(k\) runs over the \(n\) influence indices, which are the pollutants mentioned above. The following matrix can then be established (Eq. 1).

$$S = \begin{array}{*{20}c} {s_{1}} \\ {s_{2}} \\ \vdots \\ {s_{m}} \end{array} \left[ {\begin{array}{*{20}c} {x_{11}} & {x_{12}} & \ldots & {x_{1n}} \\ {x_{21}} & {x_{22}} & \ldots & {x_{2n}} \\ \ldots & \ldots & \ldots & \ldots \\ {x_{m1}} & {x_{m2}} & \ldots & {x_{mn}} \end{array}} \right]$$

(1)

Establishment of the ideal near-gray function cluster matrix

Let \(X_0 = \{x_{01}, x_{02}, \ldots, x_{0n}\}\) be the sequence of ideal values corresponding to each influencing index. The ideal value is determined as follows (Eqs. 2, 3, 4).

The first situation: the larger the influence index (\(x_k\)) is, the better the air quality; in this case, the ideal value is

$$x_{0k} = \max \left\{ {x_{ik},\; i = 1,2, \ldots, m} \right\}, \quad k = 1,2, \ldots, n.$$

(2)

The second situation: the smaller the influence index (\(x_k\)) is, the better the air quality; in this case, the ideal value is

$$x_{0k} = \min \left\{ {x_{ik},\; i = 1,2, \ldots, m} \right\}, \quad k = 1,2, \ldots, n.$$

(3)

The third situation: air quality is best when the influence index (\(x_k\)) takes a moderate value; in this case, the ideal value is

$$x_{0k} = {\text{M}}.$$

(4)
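The three rules in Eqs. 2 to 4 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `ideal_values`, the labels `"max"`, `"min"` and `"moderate"`, and the sample numbers are all assumptions introduced here.

```python
def ideal_values(data, kinds, moderate=None):
    """Return the ideal value x_0k for each column k of `data`.

    data     : list of m rows, each holding n monitored values x_ik
    kinds    : list of n labels, "max", "min" or "moderate" (hypothetical names)
    moderate : dict {k: M} with the moderate value M for "moderate" columns
    """
    moderate = moderate or {}
    ideals = []
    for k, kind in enumerate(kinds):
        column = [row[k] for row in data]
        if kind == "max":          # Eq. 2: larger is better
            ideals.append(max(column))
        elif kind == "min":        # Eq. 3: smaller is better
            ideals.append(min(column))
        elif kind == "moderate":   # Eq. 4: a supplied moderate value M
            ideals.append(moderate[k])
        else:
            raise ValueError(f"unknown kind: {kind!r}")
    return ideals


# Made-up data: 3 monitoring objects (rows) and 2 pollutant indices (columns).
data = [[40.0, 80.0],
        [20.0, 100.0],
        [50.0, 60.0]]
x0 = ideal_values(data, ["min", "min"])
print(x0)  # [20.0, 60.0]
```

For pollutant concentrations the second situation (smaller is better) is the natural choice, which is why the usage line selects `"min"` for both columns.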

According to the ideal value \(x_{0k}\) (Eq. 2, 3 or 4) and the original monitored data (\(x_{ik}\)), the value of the near-gray function \(y_{ik}\) is calculated using Eq. 5.

$$y_{ik} = \frac{{x_{0k}}}{{x_{ik}}} \quad \left( {i = 1,2, \ldots, m;\; k = 1,2, \ldots, n} \right)$$

(5)

where \(x_{ik}\) are the original monitored data and \(x_{0k}\) is the ideal value corresponding to the k-th influencing index. The function value \(y_{ik}\) is dimensionless, and \(y_{ik} \in [0,1]\). \(y_{ik}\) denotes the degree of correlation between \(s_i\) and the ideal object \(s_0\) for the k-th index: the larger \(y_{ik}\) is, the closer \(s_i\) is to \(s_0\), and the smaller \(y_{ik}\) is, the farther \(s_i\) is from \(s_0\).
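As a small numerical illustration of Eq. 5, the sketch below computes \(y_{ik}\) for made-up data. It assumes strictly positive monitored values and the smaller-is-better case (Eq. 3), under which every ratio falls in (0, 1]; the helper name `near_gray_values` is hypothetical.

```python
def near_gray_values(data, ideals):
    """Eq. 5: y_ik = x_0k / x_ik for every object i and index k.

    Assumes every x_ik is strictly positive (no division by zero).
    """
    return [[x0k / xik for xik, x0k in zip(row, ideals)] for row in data]


# Made-up data: 3 monitoring objects (rows) and 2 pollutant indices (columns).
data = [[40.0, 80.0],
        [20.0, 100.0],
        [50.0, 60.0]]
ideals = [min(column) for column in zip(*data)]  # Eq. 3: column minima
Y = near_gray_values(data, ideals)

for row in Y:
    print([round(y, 3) for y in row])
```

An object that attains the ideal value for an index gets \(y_{ik} = 1\) for that index, matching the ideal row of ones in the near-gray matrix.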

Thus, the following near-gray matrix \(Y\) can be established (Eq. 6).

$$Y = \left[ {\begin{array}{*{20}c} {y_{11}} & {y_{12}} & \ldots & {y_{1n}} \\ {y_{21}} & {y_{22}} & \ldots & {y_{2n}} \\ \ldots & \ldots & \ldots & \ldots \\ {y_{m1}} & {y_{m2}} & \ldots & {y_{mn}} \\ {y_{01}} & {y_{02}} & \ldots & {y_{0n}} \end{array}} \right]$$

(6)

Here, \(y_{ik}\) is the value of the near-gray function. Moreover, \((y_{01}, y_{02}, \ldots, y_{0n}) = (1, 1, \ldots, 1)_{1 \times n}\) is the ideal sequence; the larger \(y_{ik}\) is, the better \(s_i\) is, and the maximum value of \(y_{ik}\) is 1.

Classification of the sample to be evaluated

Since each influence index affects air quality to a different extent, the weight of each index must be taken into account. Let \(P_i\) be the comprehensive analysis value of \(s_i\). \(P_i\) can be expressed as follows (Eq. 7).

$$P_{i} = \sum\limits_{k = 1}^{n} {W_{k} y_{ik}} \quad \left( {i = 1,2, \ldots, m} \right)$$

(7)

where \(W_k\) is the weight of the k-th influence index; since there are \(n\) indices, there are also \(n\) weights (\(W_1, W_2, \ldots, W_n\)). Correspondingly, the following equation can be established (Eq. 8).

$$W_{k} = \frac{{\sum\limits_{i = 1}^{m} {x_{ik}}}}{{\sum\limits_{i = 1}^{m} {\sum\limits_{k = 1}^{n} {x_{ik}}}}} \quad \left( {k = 1,2, \ldots, n} \right)$$

(8)
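Eqs. 7 and 8 can be checked on a toy example. The sketch below is illustrative only: the data and the function names `index_weights` and `comprehensive_values` are assumptions, and the near-gray values are taken from the same made-up data with column minima as ideal values.

```python
def index_weights(data):
    """Eq. 8: W_k = (sum over i of x_ik) / (sum over i and k of x_ik)."""
    column_sums = [sum(column) for column in zip(*data)]
    total = sum(column_sums)
    return [s / total for s in column_sums]


def comprehensive_values(Y, weights):
    """Eq. 7: P_i = sum over k of W_k * y_ik."""
    return [sum(w * y for w, y in zip(weights, row)) for row in Y]


# Made-up monitored data and the matching near-gray values y_ik (Eq. 5).
data = [[40.0, 80.0],
        [20.0, 100.0],
        [50.0, 60.0]]
Y = [[0.5, 0.75],
     [1.0, 0.6],
     [0.4, 1.0]]

W = index_weights(data)           # weights sum to 1 by construction
P = comprehensive_values(Y, W)    # one comprehensive value per object
print([round(w, 4) for w in W])
print([round(p, 4) for p in P])
```

Because the weights sum to 1 and each \(y_{ik}\) lies in [0, 1], each \(P_i\) also lies in [0, 1].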

Based on the actual comprehensive analysis values \(P_i\), let \(P_j = (P_1, P_2, \ldots, P_m)^{\text{T}}\). The following equation (Eq. 9) can be used to calculate the near-gray value \(P_{ij}\) of \(P_i\) relative to \(P_j\).

$$P_{ij} = \frac{{\min (P_{i}, P_{j})}}{{\max (P_{i}, P_{j})}} \quad \left( {i, j = 1,2, \ldots, m} \right)$$

(9)

Then,

$$P = \left( {P_{ij}} \right)_{m \times m}.$$

(10)
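The matrix in Eqs. 9 and 10 is straightforward to build from the comprehensive values. The sketch below uses made-up values of \(P_i\) and a hypothetical helper name, and assumes every \(P_i\) is strictly positive.

```python
def similarity_matrix(P):
    """Eqs. 9-10: P_ij = min(P_i, P_j) / max(P_i, P_j) for all pairs (i, j).

    Assumes every P_i is strictly positive.
    """
    return [[min(pi, pj) / max(pi, pj) for pj in P] for pi in P]


# Made-up comprehensive analysis values for 3 objects.
P_values = [0.67, 0.73, 0.81]
P_matrix = similarity_matrix(P_values)

for row in P_matrix:
    print([round(v, 3) for v in row])
```

By construction the diagonal entries equal 1, the matrix is symmetric, and every entry lies in (0, 1].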

If \(P\) (Eq. 10) satisfies the following three conditions: (1) reflexivity, where \(P_{ij} = 1\) (\(i = j\)); (2) symmetry, where \(P_{ij} = P_{ji}\); and (3) normativity, where \(P_{ij} \in [0,1]\), we can select an appropriate threshold value \(\lambda\) from the \(P\) matrix, cut off the branches with weight values less than \(\lambda\), which is the similarity coefficient [4, 5], and establish the classification \(S_{t}^{\prime}\) (\(t = 1, 2, \ldots, c\)) when the level \(\lambda\) meets the relevant requirement. \(S_{t}^{\prime}\) represents each classification of air in a given region. The following equations (Eqs. 11, 12) can be established.

$$S_{t}^{\prime} = \left( {S_{1}^{\prime}, S_{2}^{\prime}, \ldots, S_{c}^{\prime}} \right)^{{\text{T}}}$$

(11)

$$S_{tk}^{\prime} = \left( {S_{t1}^{\prime}, S_{t2}^{\prime}, \ldots, S_{tn}^{\prime}} \right)$$

(12)

where \(S_{t}^{\prime}\) is the t-th classification, \(S_{tk}^{\prime}\) is the k-th index of the t-th classification, \(t = 1, 2, \ldots, c\) runs over the classifications, and \(k = 1, 2, \ldots, n\) runs over the influence indices.

\(S_{tk}^{\prime}\) can be expressed in the following matrix form (Eq. 13).

$$S_{tk}^{\prime} = \left[ {\begin{array}{*{20}c} {s_{11}^{\prime}} & {s_{12}^{\prime}} & \ldots & {s_{1n}^{\prime}} \\ {s_{21}^{\prime}} & {s_{22}^{\prime}} & \ldots & {s_{2n}^{\prime}} \\ \ldots & \ldots & \ldots & \ldots \\ {s_{c1}^{\prime}} & {s_{c2}^{\prime}} & \ldots & {s_{cn}^{\prime}} \end{array}} \right]$$

(13)
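The threshold step that produces the classifications can be sketched as follows. The text does not spell out how branches are cut, so this example rests on an assumption: objects are grouped whenever they are linked by a chain of similarities \(P_{ij} \ge \lambda\), that is, the connected components of the thresholded similarity graph. The function name, the values of \(P_i\), and the choice \(\lambda = 0.9\) are illustrative only.

```python
def threshold_clusters(sim, lam):
    """Group objects linked by a chain of similarities >= lam.

    sim : m x m symmetric similarity matrix (Eq. 10)
    lam : threshold (similarity coefficient)
    Returns a list of clusters, each a sorted list of object indices.
    Assumption: clusters are connected components of the thresholded graph.
    """
    m = len(sim)
    label = [None] * m
    clusters = []
    for start in range(m):
        if label[start] is not None:
            continue                      # already assigned to a cluster
        label[start] = len(clusters)
        members, stack = [], [start]
        while stack:                      # depth-first search from `start`
            i = stack.pop()
            members.append(i)
            for j in range(m):
                if label[j] is None and sim[i][j] >= lam:
                    label[j] = len(clusters)
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters


# Made-up comprehensive values for 4 objects and their similarity matrix (Eq. 9).
P_values = [0.67, 0.73, 0.81, 0.40]
sim = [[min(a, b) / max(a, b) for b in P_values] for a in P_values]

print(threshold_clusters(sim, 0.9))   # [[0, 1, 2], [3]]
```

Raising \(\lambda\) splits the objects into more, tighter classes; lowering it merges them, so the number of classifications \(c\) depends on the threshold chosen.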

Analysis of the degree of correlation of the sample to be evaluated

Let \(S_{t}^{\prime}\) be the sample to be evaluated, and let \(X = (x_1, x_2, \ldots, x_n)\) be the set of influence indices mentioned above, which serves as the evaluation index set for \(S_{t}^{\prime}\). Let \({\text{S}}_{0}^{\prime}\) be the air quality classification specified in GB3095-2012. The correlation coefficient is then given by Eq. 14 [14].

$$\zeta_{t}(k) = \frac{{\mathop {\min }\limits_{t \in c} \mathop {\min }\limits_{k \in n} \left| {S_{t}^{\prime}(k) - {\text{S}}_{0}^{\prime}(k)} \right| + \epsilon \mathop {\max }\limits_{t \in c} \mathop {\max }\limits_{k \in n} \left| {S_{t}^{\prime}(k) - {\text{S}}_{0}^{\prime}(k)} \right|}}{{\left| {S_{t}^{\prime}(k) - {\text{S}}_{0}^{\prime}(k)} \right| + \epsilon \mathop {\max }\limits_{t \in c} \mathop {\max }\limits_{k \in n} \left| {S_{t}^{\prime}(k) - {\text{S}}_{0}^{\prime}(k)} \right|}}$$

(14)

where \(\zeta_{t}(k)\) is the correlation coefficient and \(\epsilon\) is the resolution coefficient, with a general value of 0.5 [4, 5].

In addition, the degree of correlation \(R_t\) is given by Eq. 15.

$$R_{t} = \frac{1}{n}\sum\limits_{k = 1}^{n} {\zeta_{t}(k)}$$

(15)

The value of \(R_t\) is calculated using Eq. 15. The maximum value of \(R_t\) indicates that the sample to be evaluated has the highest degree of correlation with the air quality level considered. Therefore, the sample is classified accordingly.
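Eqs. 14 and 15 can be illustrated with a short sketch. The numbers below are made up and are not the limits in GB3095-2012; the function name `gray_relational_degrees` is hypothetical, and the minimum and maximum differences are taken over all samples and indices, as Eq. 14 is written.

```python
def gray_relational_degrees(samples, reference, eps=0.5):
    """Eqs. 14-15: degree of correlation R_t of each sample with `reference`.

    samples   : list of c rows S'_t, each with n index values
    reference : list of n values S'_0 (one standard level)
    eps       : resolution coefficient (0.5 is the usual choice)
    """
    diffs = [[abs(s - r) for s, r in zip(row, reference)] for row in samples]
    d_min = min(min(row) for row in diffs)   # min over t and k
    d_max = max(max(row) for row in diffs)   # max over t and k
    if d_max == 0:                           # every sample equals the reference
        return [1.0 for _ in samples]
    degrees = []
    for row in diffs:
        zeta = [(d_min + eps * d_max) / (d + eps * d_max) for d in row]  # Eq. 14
        degrees.append(sum(zeta) / len(zeta))                            # Eq. 15
    return degrees


# Made-up class representatives (2 classes, 2 indices) and one made-up level.
samples = [[30.0, 70.0],
           [60.0, 120.0]]
reference = [35.0, 75.0]

R = gray_relational_degrees(samples, reference)
print([round(r, 4) for r in R])
```

In this toy case the first sample lies much closer to the reference level than the second, so it receives the larger \(R_t\). Assigning a level to a sample would mean repeating the comparison against each standard level and keeping the level with the largest \(R_t\), as described above.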