Non-intrusive load decomposition is a key technology for energy monitoring by power grids and household users: it quantifies energy consumption and provides data to support rational energy distribution. Although existing algorithms have greatly improved decomposition accuracy when trained and tested on the same dataset, they generalize poorly and decompose with low accuracy across datasets. To address this, this paper proposes a sequence translation optimization model based on the sliding window method and uses transfer learning to achieve cross-dataset decomposition. The model reads the mains active power time series through a sliding window, is pre-trained as a sequence-to-point model built on an LSTM encoder-decoder, and is then adapted through transfer learning to perform load decomposition on different datasets. Case study results show that the proposed deep learning model achieves high decomposition accuracy when trained and tested on different datasets, which improves the generalization ability of the algorithm.
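The sliding-window, sequence-to-point input described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `make_seq2point_windows`, the window length, and the choice of the window midpoint as the target are assumptions made for illustration only, and the LSTM encoder-decoder and transfer-learning stages are not shown.

```python
import numpy as np


def make_seq2point_windows(mains, appliance, window):
    """Pair each sliding window of mains power with one appliance reading.

    Assumption (not stated in the source): the target for each window is
    the appliance power at the window's midpoint.
    """
    mains = np.asarray(mains, dtype=float)
    appliance = np.asarray(appliance, dtype=float)
    if mains.ndim != 1 or mains.shape != appliance.shape:
        raise ValueError("mains and appliance must be 1-D and equally long")
    if window < 1 or window > mains.size:
        raise ValueError("window must be between 1 and the series length")

    # Each row is one window of consecutive mains samples (stride of 1).
    inputs = np.lib.stride_tricks.sliding_window_view(mains, window)

    # One target per window: the appliance sample aligned with the midpoint.
    mid = window // 2
    targets = appliance[mid : mid + inputs.shape[0]]
    return inputs, targets


# Toy series with hypothetical values, only to show the resulting shapes.
X, y = make_seq2point_windows(
    [0, 1, 2, 3, 4, 5], [0, 10, 20, 30, 40, 50], window=3
)
print(X.shape, y.shape)
```

Under these assumptions, each row of `X` would be fed to the sequence model and the matching entry of `y` would serve as its single-point training target.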