With the continuing growth of renewable energy, chiefly wind power and photovoltaics, flexible direct current (DC) transmission systems have developed rapidly in recent years because they can transmit large amounts of renewable energy cleanly and efficiently over long distances. The development of these systems urgently requires DC line protection schemes that identify the fault area and the fault type quickly and accurately while meeting the requirements of selectivity, speed, sensitivity, and reliability. Theoretical analysis shows that when a fault occurs on a DC line of a modular multilevel converter based high-voltage direct current (MMC-HVDC) transmission system, the transient voltages on the two sides of the line smoothing reactor differ significantly. Discrete wavelet analysis is therefore used to extract the RMS values of the transient voltages on both sides of the line smoothing reactor, and comparing these values distinguishes faults inside the protection zone from faults outside it. The K-means clustering algorithm is then applied to voltage and current data collected by a single-ended protection unit within a short time window after DC line faults of different types and at different locations, yielding the centroids and thresholds of these data. After this training process, the corresponding protection criteria are determined so that the faulted pole can be identified. Simulation results in PSCAD/EMTDC show that the protection scheme is not affected by fault location or transition resistance, and that it quickly and accurately detects faults on MMC-HVDC transmission lines and identifies the fault type.
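
The two stages described above can be illustrated with a minimal Python sketch. It is not the scheme evaluated in the paper, and several details are assumptions made only for illustration: a one-level Haar transform stands in for the discrete wavelet analysis, an internal fault is assumed to show a larger high-frequency RMS on the line side of the reactor than on the converter side, each cluster threshold is taken as the largest member-to-centroid distance, and all function names (`haar_detail`, `is_internal_fault`, `train_kmeans`, `classify`) and feature vectors are hypothetical.

```python
import math


def haar_detail(signal):
    """One-level Haar DWT detail (high-frequency) coefficients."""
    n = len(signal) - len(signal) % 2  # drop a trailing odd sample
    return [(signal[i] - signal[i + 1]) / math.sqrt(2) for i in range(0, n, 2)]


def rms(values):
    """Root-mean-square value; 0.0 for an empty sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values)) if values else 0.0


def is_internal_fault(v_line_side, v_converter_side, ratio=1.0):
    # ASSUMPTION: a fault is treated as internal when the line-side
    # high-frequency RMS exceeds `ratio` times the converter-side RMS.
    line_rms = rms(haar_detail(v_line_side))
    converter_rms = rms(haar_detail(v_converter_side))
    return line_rms > ratio * converter_rms


def _nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def _assign(points, centroids):
    """Group points by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        clusters[_nearest(p, centroids)].append(p)
    return clusters


def train_kmeans(points, k, max_iter=100):
    """Lloyd's K-means; returns (centroids, thresholds)."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(max_iter):
        clusters = _assign(points, centroids)
        updated = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    # ASSUMPTION: threshold = largest distance from a centroid to its members.
    clusters = _assign(points, centroids)
    thresholds = [
        max((math.dist(p, centroids[j]) for p in c), default=0.0)
        for j, c in enumerate(clusters)
    ]
    return centroids, thresholds


def classify(point, centroids, thresholds):
    """Return the matching cluster index, or None if outside every threshold."""
    j = _nearest(point, centroids)
    return j if math.dist(point, centroids[j]) <= thresholds[j] else None
```

In such a sketch, `train_kmeans` would be run once on feature vectors gathered from simulated faults, and `classify` would then map a new measurement to a cluster (for example, one cluster per faulted pole) or reject it when it lies outside every threshold.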