Gridworld MDP

A stochastic gridworld with editable walls. We’ll solve for the optimal policy via value iteration and then simulate rollouts under transition noise.
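As a rough illustration of the solve step, here is a minimal value-iteration sketch using the default control values shown on this page. Several details are assumptions rather than the demo's actual implementation: the leftover probability 1 − p(intended) is split evenly between the two perpendicular directions, a move that leaves the grid keeps the agent in place and adds the bump penalty on top of the step cost, the terminal reward is paid on entering the terminal cell, and interior walls are omitted. All function and constant names are hypothetical.

```python
import numpy as np

# Defaults taken from the controls on this page.
N = 10
P_INTENDED = 0.80
STEP_COST = -0.02
TERMINAL_REWARD = 1.0
BUMP_PENALTY = -0.05
GAMMA = 0.99
START, TERMINAL = (0, 0), (N - 1, N - 1)

# Actions as (d_row, d_col): up, right, down, left.
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def outcomes(a):
    """(probability, actual action) pairs. ASSUMPTION: slips go to the two
    perpendicular directions with equal probability."""
    slip = (1.0 - P_INTENDED) / 2.0
    return [(P_INTENDED, a), (slip, (a - 1) % 4), (slip, (a + 1) % 4)]


def step(state, a):
    """Effect of actually moving in direction a: (next_state, reward).
    ASSUMPTION: leaving the grid counts as a bump and the agent stays put."""
    r, c = state
    nr, nc = r + ACTIONS[a][0], c + ACTIONS[a][1]
    if not (0 <= nr < N and 0 <= nc < N):
        return state, STEP_COST + BUMP_PENALTY
    nxt = (nr, nc)
    bonus = TERMINAL_REWARD if nxt == TERMINAL else 0.0
    return nxt, STEP_COST + bonus


def q_values(V, state):
    """Expected one-step return plus discounted value for each action."""
    q = np.zeros(4)
    for a in range(4):
        for p, actual in outcomes(a):
            nxt, reward = step(state, actual)
            q[a] += p * (reward + GAMMA * V[nxt])
    return q


def value_iteration(tol=1e-7, max_iters=10_000):
    """Synchronous Bellman backups until the largest change drops below tol."""
    V = np.zeros((N, N))
    delta = float("inf")
    for it in range(1, max_iters + 1):
        V_new = np.zeros_like(V)  # the terminal cell keeps value 0
        for r in range(N):
            for c in range(N):
                if (r, c) != TERMINAL:
                    V_new[r, c] = q_values(V, (r, c)).max()
        delta = float(np.abs(V_new - V).max())
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = np.array([[int(np.argmax(q_values(V, (r, c)))) for c in range(N)]
                       for r in range(N)])
    return V, policy, it, delta


V, policy, iterations, delta = value_iteration()
print(f"iterations {iterations}, delta {delta:.2e}, V(start) = {V[START]:.3f}")
```

Because the noise model and bump handling here are guesses, the printed iteration count and V(start) need not match the readout shown by the demo.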

Controls

Grid size: 10×10

p(intended): 0.80

Step cost: -0.020

Terminal reward: 1.00

Bump penalty: -0.050

Solve

γ = 0.99 · iterations: 55 · Δ = 7.45e-8

V(start) ≈ 0.390

Sim steps: 0

Show policy · Show values

Sim seed: 23
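To make the role of the seed concrete, the following sketch simulates one seeded rollout under transition noise. It is only an illustration: the policy is a hypothetical stand-in (head right, then down) rather than the solved optimal policy, the slip model splits the leftover probability evenly between the two perpendicular directions (an assumption), off-grid moves are treated as bumps, interior walls are omitted, and the return is undiscounted.

```python
import random

# Defaults taken from the controls on this page.
N = 10
P_INTENDED = 0.80
STEP_COST, TERMINAL_REWARD, BUMP_PENALTY = -0.02, 1.0, -0.05
START, TERMINAL = (0, 0), (N - 1, N - 1)

# Actions as (d_row, d_col): up, right, down, left.
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def naive_policy(state):
    """Hypothetical stand-in policy: go right until the last column, then down."""
    _, c = state
    return 1 if c < N - 1 else 2


def sample_actual(rng, a):
    """Sample the direction actually taken. ASSUMPTION: slips are perpendicular
    to the intended direction, each with probability (1 - p) / 2."""
    u = rng.random()
    slip = (1.0 - P_INTENDED) / 2.0
    if u < P_INTENDED:
        return a
    return (a - 1) % 4 if u < P_INTENDED + slip else (a + 1) % 4


def rollout(policy, seed=23, max_steps=1000):
    """Run one episode; returns (final_state, steps, undiscounted_return)."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    state, total, steps = START, 0.0, 0
    while state != TERMINAL and steps < max_steps:
        actual = sample_actual(rng, policy(state))
        nr = state[0] + ACTIONS[actual][0]
        nc = state[1] + ACTIONS[actual][1]
        total += STEP_COST
        if 0 <= nr < N and 0 <= nc < N:
            state = (nr, nc)
            if state == TERMINAL:
                total += TERMINAL_REWARD
        else:
            total += BUMP_PENALTY  # bumped the boundary: stay in place
        steps += 1
    return state, steps, total


final_state, steps, total = rollout(naive_policy, seed=23)
print(f"reached {final_state} in {steps} steps, return {total:.3f}")
```

Running the rollout twice with the same seed reproduces the same trajectory, which is the point of exposing the seed as a control.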

Grid (click edges to toggle walls)

Start (0,0) · Terminal (9,9)

Tip: click near an edge inside a cell to toggle a wall. Start is S, terminal is T.