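The wind speed-power curve is widely used in power prediction, condition monitoring, and fault diagnosis of wind turbines, and it is mainly constructed by fitting wind farm SCADA (supervisory control and data acquisition) data. However, because of wind curtailment, instrument faults, and other factors, SCADA data contain some abnormal power data. To ensure accurate and reliable fitting results, these abnormal data should be removed first. This paper proposes a method for removing abnormal power data of wind turbines: first, a quantile method removes scattered points far from the normal data; next, K-means clustering is combined with an improved time-series method to remove the points stacked in the middle of the curve; finally, a combination of the quantile method and DBSCAN (density-based spatial clustering of applications with noise) clustering removes scattered points close to the normal data. The quantile method, the basic time-series method, and the proposed method are compared on a simulated data set and a measured data set. The results show that the proposed method performs best and removes both the mid-curve stacked points and the scattered points effectively.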
English abstract:
The wind speed-power curve is widely used in power prediction, condition monitoring, and fault diagnosis of wind turbines. It is mainly constructed by fitting supervisory control and data acquisition (SCADA) data from wind farms. However, owing to wind curtailment, instrument failure, and other factors, SCADA data contain some abnormal power data. To ensure accurate and reliable fitting results, these abnormal data should be eliminated first. In this paper, a method for eliminating abnormal power data of wind turbines is proposed. First, the quantile method is used to eliminate scattered points far from the normal data. Then, K-means clustering and an improved time-series method are combined to eliminate the points stacked in the middle of the curve. Finally, a combination of the quantile method and density-based spatial clustering of applications with noise (DBSCAN) is used to eliminate scattered points close to the normal data. The quantile method, the basic time-series method, and the proposed method are compared on a simulated data set and a measured data set. The results show that the proposed method performs best and eliminates both the mid-curve stacked points and the scattered points effectively.
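As an illustration of the first step only, the following is a minimal sketch of a quantile-based filter. The abstract does not specify how the quantile method is applied, so the scheme here is an assumption: points are grouped into wind-speed bins, and within each bin a power value is dropped if it lies outside the interquartile fences. The function name `quantile_filter`, the bin width of 0.5 m/s, and the fence multiplier `k = 1.5` are hypothetical choices for illustration, not values from the paper.

```python
from collections import defaultdict
from statistics import quantiles


def quantile_filter(wind_speed, power, bin_width=0.5, k=1.5):
    """Return a list of booleans, True where the point is kept.

    Assumed scheme (not taken from the paper): group points into
    wind-speed bins, then within each bin drop power values outside
    [Q1 - k*IQR, Q3 + k*IQR].
    """
    # Collect the indices of the points that fall into each wind-speed bin.
    bins = defaultdict(list)
    for i, v in enumerate(wind_speed):
        bins[int(v // bin_width)].append(i)

    keep = [True] * len(power)
    for idx in bins.values():
        if len(idx) < 4:
            continue  # too few points to estimate quartiles; keep them all
        q1, _, q3 = quantiles([power[i] for i in idx], n=4, method="inclusive")
        iqr = q3 - q1
        lo, hi = q1 - k * iqr, q3 + k * iqr
        for i in idx:
            keep[i] = lo <= power[i] <= hi
    return keep


# Seven points in one wind-speed bin; the zero-power reading is the outlier.
ws = [5.0, 5.1, 5.2, 5.3, 5.4, 5.1, 5.2]
p = [500, 510, 505, 495, 502, 0, 508]
mask = quantile_filter(ws, p)
```

A filter of this kind only catches points far from the bulk of each bin. Dense clusters of curtailed readings can shift the quartiles themselves, which is presumably why the method described above adds clustering and time-series steps for the stacked points.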