Abstract: Competition between parallel flights is becoming increasingly fierce. In this study, to improve airline revenue, flights and passengers were modeled separately within the ticket sale system. The dynamic pricing of competing flights was formulated as a Markov game, and a Logit choice model was used to represent mixed-type passengers. Multi-agent reinforcement learning was adopted to solve the resulting problem. The results indicated that the WoLF-PHC algorithm required more iterations to converge than Nash-Q, but it converged more frequently and showed strong adaptability. In addition, the pricing strategy over the flight ticket sale process differed from that of other perishable products: prices generally followed an upward trend. The pricing strategy also adjusted when the environment parameters describing the passengers were changed. The pricing policy obtained by the WoLF-PHC algorithm had a positive effect on revenue.
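As a rough illustration of the passenger side of the model, the sketch below computes multinomial Logit choice probabilities for a passenger choosing between parallel flights or not buying at all. It is not the specification used in this study: the linear utility form, the no-purchase option, and the names `price_sensitivity`, `base_utilities`, and `no_purchase_utility` are all assumptions made for the example.

```python
import math


def logit_choice_probabilities(prices, base_utilities, price_sensitivity=0.01,
                               no_purchase_utility=0.0):
    """Return choice probabilities for each flight plus a final no-purchase option.

    Assumed (hypothetical) utility of flight i: base_utilities[i] - price_sensitivity * prices[i].
    """
    utilities = [b - price_sensitivity * p for b, p in zip(base_utilities, prices)]
    utilities.append(no_purchase_utility)  # last entry: passenger buys nothing

    # Subtract the maximum utility before exponentiating for numerical stability.
    max_u = max(utilities)
    weights = [math.exp(u - max_u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Two parallel flights with equal assumed base utility; the cheaper one is chosen more often.
probs = logit_choice_probabilities(prices=[500.0, 600.0], base_utilities=[6.0, 6.0])
print(probs)
```

A mixed passenger population could be approximated in this sketch by calling the function with a different `price_sensitivity` for each passenger type and weighting the results by each type's share of arrivals.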