Gridworld MDP
A stochastic gridworld with editable walls. We’ll solve for the optimal policy via value iteration and then simulate rollouts under transition noise.
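As a rough illustration of the solve-then-simulate flow described above, here is a minimal Python sketch. It is not the demo's actual code. The 10x10 size is inferred from the terminal cell (9,9) and γ = 0.99 is taken from the status readout; everything else is an assumption made for illustration: the step reward of -0.05 (guessing that this is what the unlabeled -0.050 control sets), the goal reward of +1, the slip probability of 0.1 to each side, and all function names.

```python
import random

N = 10                          # grid side, inferred from Terminal (9,9)
START, TERMINAL = (0, 0), (9, 9)
GAMMA = 0.99                    # from the status readout
STEP_REWARD = -0.05             # assumption: meaning of the -0.050 control
GOAL_REWARD = 1.0               # assumption
SLIP = 0.1                      # assumption: chance of slipping to each side
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # (dx, dy), clockwise order


def move(cell, action, walls):
    """Deterministic move; stay put if blocked by the boundary or a wall."""
    nxt = (cell[0] + action[0], cell[1] + action[1])
    if not (0 <= nxt[0] < N and 0 <= nxt[1] < N):
        return cell
    if frozenset((cell, nxt)) in walls:
        return cell
    return nxt


def outcomes(cell, a, walls):
    """(probability, next_cell) pairs for action index a, with side slips."""
    left, right = (a - 1) % 4, (a + 1) % 4
    return [(1 - 2 * SLIP, move(cell, ACTIONS[a], walls)),
            (SLIP, move(cell, ACTIONS[left], walls)),
            (SLIP, move(cell, ACTIONS[right], walls))]


def q_value(V, cell, a, walls):
    """Expected one-step reward plus discounted value of the next cell."""
    total = 0.0
    for p, nxt in outcomes(cell, a, walls):
        r = STEP_REWARD + (GOAL_REWARD if nxt == TERMINAL else 0.0)
        total += p * (r + GAMMA * V[nxt])
    return total


def value_iteration(walls, tol=1e-7, max_iters=10_000):
    """Synchronous sweeps until the largest value change drops below tol."""
    V = {(x, y): 0.0 for x in range(N) for y in range(N)}
    for it in range(1, max_iters + 1):
        delta, new_V = 0.0, {}
        for cell in V:
            if cell == TERMINAL:
                new_V[cell] = 0.0   # terminal state has no future reward
                continue
            best = max(q_value(V, cell, a, walls) for a in range(4))
            delta = max(delta, abs(best - V[cell]))
            new_V[cell] = best
        V = new_V
        if delta < tol:
            break
    policy = {c: max(range(4), key=lambda a: q_value(V, c, a, walls))
              for c in V if c != TERMINAL}
    return V, policy, it, delta


def rollout(policy, walls, max_steps=500, seed=0):
    """Follow the greedy policy from START, sampling the noisy transitions."""
    rng = random.Random(seed)
    cell, steps = START, 0
    while cell != TERMINAL and steps < max_steps:
        probs, cells = zip(*outcomes(cell, policy[cell], walls))
        cell = rng.choices(cells, weights=probs)[0]
        steps += 1
    return cell, steps
```

Calling `value_iteration(set())` solves the wall-free grid, and `rollout(policy, set())` then simulates one noisy episode. Because the reward and noise parameters here are guesses, the resulting values and iteration counts should not be expected to match the readouts on this page.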
Controls
-0.050
Solve
γ = 0.99 · iterations 55 · Δ 7.45e-8
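For reference, the quantities in this readout can be related to the standard value-iteration update. The notation below is the textbook form, and it assumes that Δ is the largest change in any state's value during the last sweep; the page itself does not define it.

```latex
V_{k+1}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma V_k(s') \bigr],
\qquad
\Delta_k = \max_{s} \bigl| V_{k+1}(s) - V_k(s) \bigr|

\| V_{k+1} - V^{*} \|_{\infty} \le \frac{\gamma}{1 - \gamma}\, \Delta_k
```

Under that reading, γ = 0.99 and Δ = 7.45e-8 give an error bound of about 99 × 7.45e-8 ≈ 7.4e-6 on the reported values.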
V(start) ≈ 0.390
Sim steps: 0
Grid (click edges to toggle walls)
Start (0,0) · Terminal (9,9)
Tip: click near an edge inside a cell to toggle a wall. Start is S, terminal is T.
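One way to model editable walls is as a set of edges between adjacent cells, so that clicking the same edge from either side toggles the same wall. The sketch below is an assumption about a reasonable data structure, not the demo's implementation; the `SIDES` table, the `toggle_wall` name, and the default size of 10 are all hypothetical.

```python
# Hypothetical side labels mapped to (dx, dy) offsets; y grows downward.
SIDES = {"N": (0, -1), "E": (1, 0), "S": (0, 1), "W": (-1, 0)}


def toggle_wall(walls, cell, side, size=10):
    """Return a new wall set with the edge on `side` of `cell` toggled."""
    dx, dy = SIDES[side]
    other = (cell[0] + dx, cell[1] + dy)
    if not (0 <= other[0] < size and 0 <= other[1] < size):
        return set(walls)            # outer boundary: nothing to toggle
    edge = frozenset((cell, other))  # unordered pair, same from both sides
    return set(walls) ^ {edge}       # symmetric difference adds or removes


walls = toggle_wall(set(), (0, 0), "E")   # add a wall east of the start cell
walls = toggle_wall(walls, (1, 0), "W")   # same edge from the other side
```

Storing each wall as an unordered pair keeps the two neighbouring cells consistent without extra bookkeeping.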