Markov Decision Process Framework (Dynamic Pricing)
A flexible Python framework for defining and solving time-dependent MDPs, including utilities to estimate stochastic transitions from data and compute optimal policies.
2025
MDP · Optimization · Simulation · Dynamic pricing
The problem
Many pricing and capacity-constrained problems are sequential: today's decision changes tomorrow's state. That is a Markov decision process (MDP), but real-world MDPs often have:
- time-dependent states
- stochastic transitions learned from data
- constraints and business rules
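For a finite horizon with time-dependent dynamics, the standard solution is the backward (Bellman) recursion, where the reward \(r_t\) and transition kernel \(P_t\) are allowed to vary with the stage \(t\) and the feasible action set \(\mathcal{A}_t(s)\) can encode business rules:

```latex
V_T(s) = 0, \qquad
V_t(s) = \max_{a \in \mathcal{A}_t(s)}
\Big[\, r_t(s, a) + \sum_{s'} P_t(s' \mid s, a)\, V_{t+1}(s') \,\Big]
```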
What I built
A framework that makes it straightforward to:
- define states/actions/rewards with explicit time dependence
- estimate transition probabilities from observed data
- compute optimal policies (and compare to baselines)
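The two core utilities can be sketched roughly as follows (a minimal illustration, not the framework's actual API; all names are hypothetical): transitions are estimated as empirical frequencies from observed `(t, s, a, s_next)` tuples, and the optimal time-dependent policy is computed by backward induction.

```python
from collections import defaultdict

def estimate_transitions(observations):
    """Estimate P_t(s' | s, a) as empirical frequencies from
    observed (t, s, a, s_next) tuples."""
    counts = defaultdict(lambda: defaultdict(int))
    for t, s, a, s_next in observations:
        counts[(t, s, a)][s_next] += 1
    probs = {}
    for key, nxt in counts.items():
        total = sum(nxt.values())
        probs[key] = {s_next: c / total for s_next, c in nxt.items()}
    return probs

def backward_induction(states, actions, reward, probs, horizon):
    """Compute an optimal time-dependent policy by dynamic programming.

    actions(t, s) -> iterable of feasible actions (business rules live here)
    reward(t, s, a) -> float
    probs[(t, s, a)] -> {s_next: probability}
    """
    V = {s: 0.0 for s in states}  # terminal values V_T(s) = 0
    policy = {}
    for t in reversed(range(horizon)):
        V_new = {}
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in actions(t, s):
                p = probs.get((t, s, a), {})
                q = reward(t, s, a) + sum(pr * V[sn] for sn, pr in p.items())
                if q > best_q:
                    best_a, best_q = a, q
            policy[(t, s)] = best_a
            V_new[s] = best_q
        V = V_new
    return policy, V
```

Keeping `actions` a function of both time and state is what makes constraints first-class: an infeasible price at a given stage simply never enters the maximization.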
Why it’s interesting
- clear abstractions (state/action/reward)
- measurable evaluation (policy performance under simulation)
- practical handling of constraints (not just textbook RL)
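"Measurable evaluation" here means scoring a policy by average cumulative reward under a stochastic simulator, so a learned policy can be compared against a fixed-price baseline on the same trajectories. A minimal sketch, with hypothetical names and a caller-supplied `step` function as the simulator:

```python
import random

def simulate(policy, step, s0, horizon, n_runs=1000, seed=0):
    """Average cumulative reward of a policy under a stochastic simulator.

    policy(t, s) -> action
    step(t, s, a, rng) -> (reward, s_next)
    """
    rng = random.Random(seed)  # fixed seed: baselines see the same randomness
    total = 0.0
    for _ in range(n_runs):
        s = s0
        for t in range(horizon):
            r, s = step(t, s, policy(t, s), rng)
            total += r
    return total / n_runs
```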