Research on the establishment of NDVI long-term data set based on … – Nature.com

Data

This paper selects parts of China and surrounding areas as the research area. The research data are the NDVI products of the MODIS sensors on Terra and Aqua (NDVIm), the AVHRR sensor (NDVIa), and the VIRR sensor on the Fengyun satellite (NDVIv) [31]. The work proceeds in three steps: (I) compare NDVIv with NDVIa, and NDVIa with NDVIm; (II) from these comparisons, derive the functional relationship between NDVIa and NDVIm and the functional relationship between NDVIv and NDVIa; (III) use NDVIa to correct the NDVIv data to a level equivalent to NDVIm.

The data used in this study (see Table 1) are NDVIa from 1982 to 2015, NDVIm from 2000 to 2019, and NDVIv from 2015 to 2020, all at a resolution of 0.05. Because both NDVIa and NDVIm data are available for 2005, we use the data of that year to compare NDVIa and NDVIm and explore the correlation between the two. Likewise, because both NDVIv and NDVIa data are available for 2015, we use the data of that year to compare NDVIv and NDVIa and explore the correlation between the two. Finally, we compare the corrected NDVIv of 2019 with the NDVIm of 2019 to validate the model we constructed.

Figure 1 shows the spectral response function curves of the different satellite sensors in the visible and near-infrared spectrum [32]. In the visible band, the spectral response function of MODIS is narrower than that of AVHRR, which in turn is narrower than that of VIRR. In the near-infrared band, MODIS again has the narrowest spectral response function, followed by VIRR, and AVHRR has the widest. The channels, wavelength ranges, corresponding spectra, and sub-satellite resolutions of the MODIS, AVHRR, and VIRR sensors are listed in Table 2.

Spectral response function curves of different satellite sensors in the visible and near-infrared spectrum [29].

A linear model is one form of machine learning model. It is relatively simple and easy to build, yet it embodies several important basic ideas in machine learning: many more powerful nonlinear models can be obtained by introducing hierarchical structure or high-dimensional mappings on top of linear models. Linear models take many forms, of which linear regression is a common one. Linear regression tries to learn a linear model that predicts real-valued outputs as accurately as possible. A linear model is fitted to the data set, a loss function is defined, and the model parameters are determined by minimizing that loss function, which yields the model used for subsequent prediction. The general linear regression workflow is shown in Fig. 2.

Schematic diagram of the linear regression algorithm flow.
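As a minimal sketch of the fit-then-minimize idea described above, the following uses the closed-form ordinary least-squares solution for a single predictor, with mean squared error as the loss. The function names and the sample values are illustrative assumptions, not taken from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(x, y, slope, intercept):
    """Loss that the least-squares fit minimizes."""
    return sum((slope * xi + intercept - yi) ** 2 for xi, yi in zip(x, y)) / len(x)


# Synthetic, illustrative values only (generated as y = 1.2 * x + 0.05).
x = [0.10, 0.20, 0.30, 0.40, 0.50]
y = [0.17, 0.29, 0.41, 0.53, 0.65]
slope, intercept = fit_line(x, y)
loss = mean_squared_error(x, y, slope, intercept)
```

Because the sample points lie on a line, the fit recovers that line and the loss is essentially zero; with real NDVI pairs the loss would measure the residual scatter around the fitted line.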

The detailed procedure is as follows [33]:

The data are standardized and preprocessed. Preprocessing includes data cleaning, screening, and organization, so that the data can be fed into the machine learning model as feature variables.

Different machine learning algorithms are trained on a separate data set to find the best-performing model, and a machine learning model is then established based on the normalized difference vegetation index product retrieved from the Fengyun satellite.

The long-term normalized difference vegetation index series of the Fengyun satellite is verified and output.
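The cleaning and screening step could look roughly like the sketch below, which keeps only co-located pixel pairs whose values are finite and inside the theoretical NDVI range of -1 to 1. The function name `clean_pairs`, the fill value -9999.0, and the sample values are assumptions for illustration; the source does not specify how invalid pixels are flagged.

```python
import math


def clean_pairs(reference, target, valid_min=-1.0, valid_max=1.0):
    """Keep only pixel pairs where both values are finite and within range."""

    def is_valid(value):
        return (
            value is not None
            and math.isfinite(value)
            and valid_min <= value <= valid_max
        )

    kept = [(r, t) for r, t in zip(reference, target) if is_valid(r) and is_valid(t)]
    ref_clean = [r for r, _ in kept]
    tgt_clean = [t for _, t in kept]
    return ref_clean, tgt_clean


# Illustrative values only; NaN, None, and -9999.0 stand in for missing pixels.
avhrr = [0.21, float("nan"), 0.35, -9999.0, 0.48]
modis = [0.30, 0.33, None, 0.41, 0.62]
avhrr_clean, modis_clean = clean_pairs(avhrr, modis)
```

Filtering the two series together keeps them aligned, so every remaining value in one list still corresponds to the same pixel in the other.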

For 2001 to 2005, both AVHRR NDVI data and MODIS NDVI data are available. We therefore used the data of these 5 years to compare NDVIa and NDVIm and explore the correlation between the two. Because 2015 has both VIRR NDVI data and AVHRR NDVI data, we used the data of that year to compare NDVIv and NDVIa and explore the correlation between the two. Finally, we compared the corrected NDVIv of 2019 with the NDVIm of 2019 to validate the model we constructed.

The linear machine learning model is used to construct the optimal functional relationship between the NDVIa and the NDVIm. The formula is as presented in formula (1):

$$Y_{\text{NDVIm}} = \left\{ k_{2001}, k_{2002}, k_{2003}, k_{2004}, k_{2005}, k_{\min}, k_{\max}, k_{\text{ave}} \right\} \times X_{\text{NDVIa}} + \left\{ m_{2001}, m_{2002}, m_{2003}, m_{2004}, m_{2005}, m_{\min}, m_{\max}, m_{\text{mean}} \right\}$$

(1)

In formula (1), XNDVIa is the NDVI value of AVHRR and YNDVIm is the NDVI value of MODIS. k is the coefficient of the linear relationship between NDVIa and NDVIm; k2001, k2002, k2003, k2004, k2005, kmin, kmax, and kave are the coefficients for 2001, 2002, 2003, 2004, and 2005, the 5-year minimum, the 5-year maximum, and the 5-year average, respectively. m is the intercept of the linear relationship between NDVIa and NDVIm; m2001, m2002, m2003, m2004, m2005, mmin, mmax, and mmean are the intercepts for 2001, 2002, 2003, 2004, and 2005, the 5-year minimum, the 5-year maximum, and the 5-year average, respectively.

Through multiple cross-comparison analysis, the optimal coefficient k and the optimal coefficient m are selected, and then the optimal functional relationship between NDVIa and NDVIm is determined.
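One way to picture this selection is the sketch below. It fits a line per year, builds the minimum, maximum, and average candidates listed in formula (1), and picks the candidate with the lowest root-mean-square error over all years pooled. The RMSE criterion, the pairing of kmin with mmin (and likewise for max and average), the function names, and the synthetic data are all assumptions; the source does not spell out how the cross-comparison is scored.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = k * x + m; returns (k, m)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    k = sxy / sxx
    return k, mean_y - k * mean_x


def rmse(x, y, k, m):
    """Root-mean-square error of the line k * x + m against y."""
    return math.sqrt(sum((k * xi + m - yi) ** 2 for xi, yi in zip(x, y)) / len(x))


def candidate_models(yearly_pairs):
    """Candidate (k, m) pairs: one per year plus min, max, and average."""
    fits = {year: fit_line(x, y) for year, (x, y) in yearly_pairs.items()}
    ks = [k for k, _ in fits.values()]
    ms = [m for _, m in fits.values()]
    candidates = {str(year): km for year, km in fits.items()}
    candidates["min"] = (min(ks), min(ms))
    candidates["max"] = (max(ks), max(ms))
    candidates["ave"] = (sum(ks) / len(ks), sum(ms) / len(ms))
    return candidates


def select_best(candidates, yearly_pairs):
    """Return (name, (k, m)) with the lowest pooled RMSE (assumed criterion)."""
    all_x = [v for x, _ in yearly_pairs.values() for v in x]
    all_y = [v for _, y in yearly_pairs.values() for v in y]
    return min(candidates.items(), key=lambda item: rmse(all_x, all_y, *item[1]))


# Synthetic NDVIa/NDVIm pairs for two hypothetical years (not study data).
x = [0.1, 0.3, 0.5, 0.7]
yearly_pairs = {
    2001: (x, [1.1 * v + 0.02 for v in x]),
    2002: (x, [1.3 * v + 0.04 for v in x]),
}
candidates = candidate_models(yearly_pairs)
best_name, (best_k, best_m) = select_best(candidates, yearly_pairs)
```

In this toy case the averaged coefficients sit between the two yearly lines and therefore give the smallest pooled error; with real data any of the candidates could win.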

Based on the above analysis, we continue to construct the functional relationship between NDVIa and NDVIv, according to formula (2).

$$X_{\text{NDVIa}} = aZ_{\text{NDVIv}} + b.$$

(2)

In formula (2), ZNDVIv is the NDVI value of VIRR, XNDVIa is the NDVI value of AVHRR, a is the coefficient of the fitted linear relationship between NDVIv and NDVIa, and b is the intercept of that fitted relationship.

Substituting the functional relationship between NDVIa and NDVIv into the selected optimal relationship between NDVIa and NDVIm gives the refitted NDVIv, denoted CNDVIcv in formula (3):

$$C_{\text{NDVIcv}} = kX_{\text{NDVIa}} + m = k\left( aZ_{\text{NDVIv}} + b \right) + m = kaZ_{\text{NDVIv}} + kb + m.$$

(3)

In formula (3), CNDVIcv is the corrected NDVIv (NDVIcv), k is the optimal coefficient of the relationship between NDVIa and NDVIm, and m is the optimal intercept of the relationship between NDVIa and NDVIm.
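The composition in formula (3) and the final check against NDVIm can be sketched as follows. The coefficient values, the sample series, and the choice of RMSE and Pearson correlation as agreement measures are illustrative assumptions; the source only states that corrected NDVIv is compared with NDVIm for 2019.

```python
import math


def correct_ndviv(z_ndviv, a, b, k, m):
    """Formula (3): C = k * (a * Z + b) + m = k*a*Z + k*b + m."""
    slope = k * a
    intercept = k * b + m
    return [slope * z + intercept for z in z_ndviv]


def rmse(pred, obs):
    """Root-mean-square difference between two equal-length series."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


def pearson_r(u, v):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    var_u = sum((ui - mean_u) ** 2 for ui in u)
    var_v = sum((vi - mean_v) ** 2 for vi in v)
    return cov / math.sqrt(var_u * var_v)


# Hypothetical coefficients and pixel values, for illustration only.
a, b = 0.9, 0.05   # NDVIv -> NDVIa, formula (2)
k, m = 1.2, 0.03   # NDVIa -> NDVIm, formula (1)
ndviv = [0.1, 0.3, 0.5, 0.7]
ndvim = [0.20, 0.41, 0.64, 0.84]  # made-up reference values

ndvicv = correct_ndviv(ndviv, a, b, k, m)
error = rmse(ndvicv, ndvim)
r = pearson_r(ndvicv, ndvim)
```

Collapsing the two linear steps into one slope (ka) and one intercept (kb + m) means the correction can be applied to VIRR data directly, without producing an intermediate AVHRR-equivalent series.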

The data of 2005 were selected to compare NDVIm and NDVIa in some parts of China and surrounding areas. The data of 2015 were selected to compare NDVIv and NDVIa in some parts of China and surrounding areas. Through analysis, the correlation among NDVIv, NDVIa and NDVIm is found.
