Abstract: The Multi-Kernel Learning Relevance Vector Machine (MKL-RVM) can integrate multiple feature spaces and output the probability that a sample belongs to each state. In this paper, a cost-sensitive mechanism is introduced into the MKL-RVM to construct the Cost-Sensitive Multi-Kernel Learning Relevance Vector Machine (CS-MKL-RVM). The algorithm uses Bayesian risk theory to predict the fault category of a sample so that the expected cost of misdiagnosis is minimized, which addresses the failure of conventional classifiers to account for the differing costs of different misdiagnoses. Because the kernel function parameters would otherwise have to be set manually, K-fold Cross Validation combined with Particle Swarm Optimization (PSO) is adopted to optimize them. A case analysis based on dissolved gas analysis (DGA) data shows that, compared with the BP Neural Network, the Support Vector Machine and the MKL-RVM, CS-MKL-RVM achieves higher diagnosis accuracy and the lowest misdiagnosis cost.
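The minimum-risk decision described in the abstract can be illustrated with a short sketch. It assumes the classifier already supplies posterior class probabilities, and that a cost matrix is given in which entry (i, j) is the cost of diagnosing class j when the true class is i. The function name `min_risk_decision` and the numbers in the usage note are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def min_risk_decision(posteriors, cost):
    """Pick, for each sample, the class with the lowest expected cost.

    posteriors: array of shape (n_samples, n_classes), rows are P(class | x).
    cost: array of shape (n_classes, n_classes); cost[i, j] is the assumed
          cost of predicting class j when the true class is i.
    Returns (labels, risk), where risk[n, j] = sum_i P(i | x_n) * cost[i, j].
    """
    posteriors = np.asarray(posteriors, dtype=float)
    cost = np.asarray(cost, dtype=float)
    risk = posteriors @ cost          # expected cost of each candidate decision
    labels = risk.argmin(axis=1)      # decision with minimum Bayesian risk
    return labels, risk
```

For example, with posteriors `[[0.6, 0.4]]` and an illustrative cost matrix `[[0, 1], [5, 0]]`, the expected costs are 2.0 for class 0 and 0.6 for class 1, so the rule selects class 1 even though class 0 has the higher posterior probability. This is the sense in which a cost-sensitive decision can differ from a maximum-probability decision.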