Considering the transmission characteristics of carbon emission flow and power flow in a power grid, this paper proposes the concept of carbon-energy combined-flow. It then adopts a PSO-Q(λ) learning algorithm to optimize the carbon-energy combined-flow, with carbon emission loss, active power loss, and voltage deviation chosen as the optimization objectives. The algorithm maps load sections and controllable variables to states and actions, respectively, and dynamically searches for the optimal action strategy through repeated trial and error, action correction, and iteration. Simulation on the IEEE 118-bus system indicates that the PSO-Q(λ) learning algorithm improves convergence speed while retaining the ability to find the global optimum, providing a feasible and effective approach to on-line receding-horizon optimization of carbon-energy combined-flow in a complex power grid.
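
For readers unfamiliar with Q(λ) learning, the following is a minimal sketch of one tabular Watkins-style Q(λ) update with accumulating eligibility traces. It illustrates only the generic Q(λ) component; the PSO part, the state and action encoding, and the reward definition used in the paper are not given in this abstract. The function name `q_lambda_step`, the array shapes, and the values of `alpha`, `gamma`, and `lam` are all assumptions made for illustration.

```python
import numpy as np

def q_lambda_step(Q, E, s, a, reward, s_next, a_next,
                  alpha=0.1, gamma=0.9, lam=0.8):
    """One Watkins-style Q(lambda) update, modifying Q and E in place.

    Q, E   : (n_states, n_actions) value table and eligibility traces
    s, a   : current state and action indices
    reward : scalar reward (hypothetical; e.g. a negated weighted cost)
    s_next, a_next : next state and the action actually chosen there
    Parameter values are illustrative assumptions, not taken from the paper.
    """
    # Decide before updating Q whether the next action is greedy (ties count).
    next_is_greedy = Q[s_next, a_next] == Q[s_next].max()
    # TD error bootstraps from the greedy value of the next state.
    delta = reward + gamma * Q[s_next].max() - Q[s, a]
    E[s, a] += 1.0              # accumulating trace for the visited pair
    Q += alpha * delta * E      # credit all recently visited pairs
    if next_is_greedy:
        E *= gamma * lam        # decay traces along the greedy path
    else:
        E[:] = 0.0              # exploratory action: cut the traces
    return delta

# Toy usage with 3 states and 2 actions.
Q = np.zeros((3, 2))
E = np.zeros((3, 2))
d1 = q_lambda_step(Q, E, s=0, a=1, reward=1.0, s_next=1, a_next=0)
d2 = q_lambda_step(Q, E, s=1, a=0, reward=0.0, s_next=0, a_next=0)
```

In a setting like the one described above, `s` would presumably index a load section and `a` a discretized setting of the controllable variables, with the reward derived from the three optimization objectives; that mapping is an assumption here, not a statement of the authors' formulation.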