We study the problem of dynamically assigning jobs to workers with two key aspects: (i) workers gain or lose familiarity with jobs over time depending on whether or not they are assigned to those jobs, and (ii) the availability of workers and jobs is stochastic. This problem is motivated by applications in operating room management, where a fundamental challenge is maintaining familiarity across the workforce over time while accounting for heterogeneous worker learning rates and stochastic availability. We model this problem as a Markov decision process (MDP) whose exact solution is intractable due to the complex dynamics of the endogenous state, which tracks familiarity levels, and the exogenously evolving stochastic combinatorial constraints, which ensure assignment feasibility. To simplify the dynamics, we develop familiarity-agnostic policies and show that they are near-optimal when worker learning rates are low. To handle the constraints, we design Lagrangian relaxation policies and establish their near-optimality when the “cost of feasibility” of ensuring these constraints is small. For scenarios where both of these policies are suboptimal (i.e., when learning rates and the cost of feasibility are both high), we propose an approximate linear programming policy that jointly models the dynamics and the constraints but relies on simple approximation architectures for computational tractability. Numerical experiments show that the best of our proposed policies delivers small optimality gaps and outperforms benchmark policies. More broadly, our MDP and policies offer new insights and algorithms for dynamically managing workforce familiarity in stochastic environments, with applications extending beyond operating room management.
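To fix ideas, a minimal sketch of the familiarity dynamics is given below; the specific update rule and the symbols $\alpha_w$ (learning rate), $\delta_w$ (forgetting rate), $\bar{s}$ (familiarity cap), and the assignment indicator $x^t_{wj}$ are illustrative assumptions for this sketch, not the paper's exact model:
$$
s^{t+1}_{wj} =
\begin{cases}
\min\{\, s^t_{wj} + \alpha_w,\ \bar{s} \,\} & \text{if } x^t_{wj} = 1 \text{ (worker } w \text{ assigned to job } j \text{ in period } t\text{)},\\[2pt]
\max\{\, s^t_{wj} - \delta_w,\ 0 \,\} & \text{if } x^t_{wj} = 0.
\end{cases}
$$
In this sketch, heterogeneous learning corresponds to worker-specific $\alpha_w$ and $\delta_w$, and the stochastic availability of workers and jobs would enter as random constraints on which indicators $x^t_{wj}$ may be set to one in each period.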