
Markov Decision Process Framework (Dynamic Pricing)

A flexible Python framework for defining and solving time-dependent MDPs, including utilities to estimate stochastic transitions from data and compute optimal policies.

2025
MDP · Optimization · Simulation · Dynamic pricing

The problem

Many pricing and capacity-constrained problems are sequential: today’s decision changes tomorrow’s state. That’s an MDP—but real-world MDPs often have:

  • time-dependent states
  • stochastic transitions learned from data
  • constraints and business rules
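To make those three ingredients concrete, here is a toy time-dependent pricing MDP. All names and numbers are illustrative assumptions, not taken from the framework: the state is remaining inventory, the actions are two price points, and demand decays over the horizon.

```python
# Toy finite-horizon pricing MDP (hypothetical names and numbers).
# State: units of inventory left; action: price level; horizon: T periods.
T = 3                      # selling horizon
states = range(0, 3)       # 0, 1, or 2 units left
actions = ["low", "high"]  # two price points

def sale_prob(t, action):
    """Probability of a sale in period t; demand decays over time,
    and the low price sells more often (time-dependent transition)."""
    base = 0.8 if action == "low" else 0.4
    return base * (1 - t / (2 * T))

def transitions(t, s, a):
    """Transition distribution over next states: with probability
    sale_prob we sell one unit (if any remain), otherwise we stay."""
    if s == 0:
        return {0: 1.0}  # sold out: absorbing state
    p = sale_prob(t, a)
    return {s - 1: p, s: 1 - p}
```

Even this tiny example shows why time indexing matters: the same (state, action) pair has different transition probabilities at `t = 0` and `t = 2`.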

What I built

A framework that makes it straightforward to:

  • define states/actions/rewards with explicit time dependence
  • estimate transition probabilities from observed data
  • compute optimal policies (and compare to baselines)
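A minimal sketch of that workflow, under assumed interfaces (the function names, the `(t, s, a, s')` observation format, and the default self-loop for unseen triples are my illustrative choices, not the framework's actual API): estimate transition probabilities as empirical frequencies, then run backward induction over the finite horizon.

```python
from collections import Counter, defaultdict

def estimate_transitions(observations):
    """Empirical transition probabilities from (t, s, a, s_next) tuples."""
    counts = defaultdict(Counter)
    for t, s, a, s_next in observations:
        counts[(t, s, a)][s_next] += 1
    return {
        key: {s2: n / sum(c.values()) for s2, n in c.items()}
        for key, c in counts.items()
    }

def backward_induction(T, states, actions, P, reward):
    """Finite-horizon optimal policy via dynamic programming.

    P maps (t, s, a) -> {s_next: prob}; reward(t, s, a) is immediate.
    Unseen (t, s, a) triples default to a self-loop (an assumption).
    """
    V = {s: 0.0 for s in states}  # terminal value
    policy = {}
    for t in reversed(range(T)):
        V_new = {}
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in actions:
                probs = P.get((t, s, a), {s: 1.0})
                q = reward(t, s, a) + sum(p * V[s2] for s2, p in probs.items())
                if q > best_q:
                    best_a, best_q = a, q
            policy[(t, s)] = best_a
            V_new[s] = best_q
        V = V_new
    return policy, V
```

Backward induction suits the time-dependent setting because each period gets its own transition table and its own slice of the policy, rather than one stationary policy for all times.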

Why it’s interesting

  • clear abstractions (state/action/reward)
  • measurable evaluation (policy performance under simulation)
  • practical handling of constraints (not just textbook RL)
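"Measurable evaluation" can be as simple as a Monte Carlo rollout: estimate each policy's expected total reward under the learned dynamics and compare. A minimal sketch, with the transition and reward callables assumed to have the shapes shown (illustrative, not the framework's API):

```python
import random

def simulate(policy, T, s0, transitions, reward, n_episodes=1000, seed=0):
    """Monte Carlo estimate of a policy's expected total reward.

    policy(t, s) -> action; transitions(t, s, a) -> {s_next: prob};
    reward(t, s, a) -> float. Seeded for reproducible comparisons.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_episodes):
        s = s0
        for t in range(T):
            a = policy(t, s)
            total += reward(t, s, a)
            probs = transitions(t, s, a)
            # Sample the next state from the transition distribution.
            s = rng.choices(list(probs), weights=list(probs.values()))[0]
    return total / n_episodes
```

Running the same function on the optimized policy and on a fixed-price baseline (same seed, same dynamics) gives the head-to-head comparison the framework reports.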