In the new-type power system, the randomness and volatility of renewable generation, together with increasingly active loads and the growing share of power-electronic equipment in the grid, pose new challenges to the operation and real-time control of the power grid. Reactive power and voltage optimization control is fundamental to safe and stable grid operation. This paper proposes a reactive power and voltage optimization control method based on deep reinforcement learning to address the problems caused by large-scale integration of renewable energy and flexible deployment of energy storage systems. The method first establishes a comprehensive reactive power optimization model that accounts for operational efficiency, economy, and security. The optimization problem is then formulated as a Markov decision process, which turns it into a sequential decision problem that reinforcement learning can solve while fully accounting for the spatiotemporal coupling of reactive power regulation. The Deep Deterministic Policy Gradient (DDPG) algorithm is used to solve the resulting model. Finally, simulations on a modified IEEE 33-bus test system verify the effectiveness and reliability of the proposed method for reactive power decision-making.
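As an illustration of the Markov decision process formulation and the DDPG solver mentioned above, the following is a minimal sketch only, not the formulation used in the paper. The reward weights `w_loss` and `w_volt`, the voltage band of 0.95 to 1.05 p.u., and the function names are all assumptions introduced for illustration. The reward penalizes network loss (efficiency and economy) and voltage limit violations (security), and `soft_update` shows the standard target-network update that DDPG applies to its actor and critic parameters.

```python
from typing import List, Sequence


def reward(
    p_loss_mw: float,
    voltages_pu: Sequence[float],
    v_min: float = 0.95,   # assumed lower voltage limit (p.u.)
    v_max: float = 1.05,   # assumed upper voltage limit (p.u.)
    w_loss: float = 1.0,   # assumed weight on active power loss
    w_volt: float = 10.0,  # assumed weight on voltage violations
) -> float:
    """Hypothetical per-step reward: negative weighted sum of network
    loss and total voltage limit violation over all buses."""
    violation = sum(
        max(0.0, v_min - v) + max(0.0, v - v_max) for v in voltages_pu
    )
    return -(w_loss * p_loss_mw + w_volt * violation)


def soft_update(
    target: Sequence[float], online: Sequence[float], tau: float = 0.005
) -> List[float]:
    """DDPG-style soft target update:
    theta_target <- (1 - tau) * theta_target + tau * theta_online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


if __name__ == "__main__":
    # All buses within limits: reward is just the negative loss term.
    print(reward(0.2, [1.0, 0.98, 1.02]))
    # Two buses outside limits: the violation penalty is added.
    print(reward(0.2, [0.94, 1.06]))
    # Target parameters move a fraction tau toward the online parameters.
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
```

In a full implementation the state would typically collect bus voltages, loads, and renewable and storage outputs, and the action would set reactive power devices; those choices are left open here because the abstract does not specify them.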