Optimal Inventory Management with Model Predictive Control


Suppose you operate a store that sells a particular good. Every day there is a certain amount of demand for your goods, which results in sales and revenue. However, you can only sell goods if you have enough inventory. With insufficient inventory to meet the demand, you lose sales and revenue. Ideally you would always maintain sufficient inventory, but you do not know the demand ahead of time, and it varies somewhat randomly. In this post I explore how we can solve this tricky problem. Code for the experiment is available here.

The Inventory Management Problem

We will formulate the inventory management problem as a controllable dynamical system. A basic mathematical model of the inventory problem for a shop selling one good is $$ \begin{aligned} q_{t+1} &= q_t + u_t - s_t \\ y_{t+1} &\sim P(y|y_t,...,y_0) \\ s_t &= \begin{cases} y_t, & y_t < q_t \\ q_t, & \text{otherwise} \end{cases}\\ k_{t+1} &= k_t + \alpha s_t - \beta u_t - \gamma q_t \end{aligned} $$ That is, sales in period $t$ are capped by the inventory on hand, $s_t = \min(y_t, q_t)$. The variables are

  • $q$ - inventory
  • $u$ - new inventory ordered
  • $y$ - demand
  • $s$ - number of units actually sold
  • $k$ - capital

$\alpha$ is the revenue we earn per unit sold, $\beta$ is the cost per new unit ordered, and $\gamma$ is the cost of storing each unit of inventory held for one period.
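As a minimal sketch of one step of this model, the function below assumes that sales in a period are the smaller of that period's demand and the inventory on hand. The name `step` and its argument order are illustrative choices, not taken from the linked code.

```python
def step(q, k, u, y, alpha, beta, gamma):
    """Advance inventory q and capital k by one period.

    u is the order placed in this period, y is the demand in this period.
    """
    s = min(y, q)                                   # cannot sell more than is in stock
    q_next = q + u - s                              # inventory balance
    k_next = k + alpha * s - beta * u - gamma * q   # revenue - order cost - storage cost
    return q_next, k_next, s


# One period with 5 units in stock, an order of 3 units and a demand of 8 units
q1, k1, s1 = step(q=5, k=10.0, u=3, y=8, alpha=2.0, beta=1.0, gamma=0.5)
print(q1, k1, s1)  # 3 14.5 5
```

Here only 5 of the 8 units demanded can be sold, so the remaining demand is lost rather than backordered.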

We can write this as a system $$ x_{t+1} = Ax_t + Bu_t + Cs_t(y_t) $$ where $$ x_t = \begin{bmatrix} q_t \\ k_t \end{bmatrix} $$ $$ A = \begin{bmatrix} 1 & 0 \\ -\gamma & 1 \end{bmatrix} $$ $$ B = \begin{bmatrix} 1 \\ -\beta \end{bmatrix} $$ $$ C = \begin{bmatrix} -1 \\ \alpha \end{bmatrix} $$ So we see that the inventory management problem is essentially a control problem, with an additional stochastic exogenous variable.
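The matrix form can be checked numerically. The sketch below uses plain Python lists rather than a linear algebra library, and the helper names `matvec` and `step_matrix`, as well as the parameter values, are assumptions made for illustration.

```python
def matvec(M, v):
    """Multiply a small matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


alpha, beta, gamma = 2.0, 1.0, 0.5   # illustrative parameter values
A = [[1.0, 0.0], [-gamma, 1.0]]
B = [1.0, -beta]
C = [-1.0, alpha]


def step_matrix(x, u, y):
    """One step of x_{t+1} = A x_t + B u_t + C s_t, with x = [q, k]."""
    s = min(y, x[0])                 # sales are capped by the inventory q
    Ax = matvec(A, x)
    return [Ax[i] + B[i] * u + C[i] * s for i in range(2)]


x_next = step_matrix([5.0, 10.0], u=3.0, y=8.0)
print(x_next)  # [3.0, 14.5]
```

The first component is the new inventory and the second is the new capital, matching the scalar update equations term by term.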

The matrix $A$ has one repeated eigenvalue of $1$ with an eigenvector of $[0,1]^T$, which tells us that if we have $0$ inventory and place no orders, our capital will remain the same for all time. This makes sense in our model as, without inventory, there are no costs but also no sales.
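To make the eigenvalue claim concrete, here is a short worked derivation, assuming $\gamma \neq 0$: $$ \det(A - \lambda I) = \det \begin{bmatrix} 1-\lambda & 0 \\ -\gamma & 1-\lambda \end{bmatrix} = (1-\lambda)^2 = 0 \ \Rightarrow\ \lambda = 1 $$ $$ (A - I)v = \begin{bmatrix} 0 & 0 \\ -\gamma & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ -\gamma v_1 \end{bmatrix} = 0 \ \Rightarrow\ v_1 = 0 $$ so every eigenvector is a multiple of $[0,1]^T$. For a state $x_t = [0, k_t]^T$ with $u_t = 0$ (and hence $s_t = 0$), this gives $x_{t+1} = Ax_t = x_t$.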

Problem Statement

The problem we face is to decide how to place our orders now to maximize our future capital.

In mathematical terms the inventory management problem can be stated as: how do we pick the order amounts $u_0,...,u_{T-1}$ to maximize the final capital, $k_T$, subject to the stochastic demand $y_t$ and system of equations outlined above?

Solving this problem is not that easy. First, to optimize the controls now, we need to know what the system will do in the future, which depends on demand we have not yet observed. Second, the fact that we can run out of inventory makes the system nonlinear, which makes finding solutions hard. We could simplify the problem or make certain assumptions, e.g. constant demand or allowing backorders, which could allow us to find an analytical solution. But in this post I want to explore how to solve the full inventory management problem without simplifications.

For us, this means that we need to be able to forecast demand. To deal with the nonlinearities, we can then use numerical methods to find the optimal controls. In particular, I will show you how we can use Receding Horizon Control (RHC) together with Random Search and a time series forecasting method to create an inventory control optimization program.

Demand Forecasting

To be able to forecast demand we need to define a model for how the demand evolves over time. Demand typically fluctuates with seasonal variations, which our model needs to incorporate. For example, weekends tend to be busier for retail sales. We can capture this using an autoregressive model.

We will use a weekly seasonal model here. With our model the demand evolves according to $$ y_t = c+y_{t-7}+\epsilon_t $$ Here, $c$ is a constant trend, $\epsilon_t$ is normal random noise and the term $y_{t-7}$ is the demand 7 days ago. To forecast the series we would then estimate the trend and noise parameters from historical data. Our model captures weekly seasonality, but cannot handle quarterly or yearly seasonality (e.g. vacation periods, busy seasons etc.).
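A minimal sketch of estimating and forecasting with this model is shown below. The estimator is an assumption on my part (the mean and sample standard deviation of the seasonal differences $y_t - y_{t-7}$), as are the function names and the made-up history.

```python
import statistics


def estimate_params(history, period=7):
    """Estimate the trend c and noise std from seasonal differences y_t - y_{t-period}."""
    diffs = [history[t] - history[t - period] for t in range(period, len(history))]
    return statistics.mean(diffs), statistics.stdev(diffs)


def forecast(history, horizon, c, period=7):
    """Point forecast of y_t = c + y_{t-period}, with the noise set to its mean of zero."""
    series = list(history)
    for _ in range(horizon):
        series.append(c + series[-period])
    return series[len(history):]


# Two weeks of made-up, noise-free demand that grows by 0.5 per week
history = [1, 1, 2, 2, 7, 10, 8, 1.5, 1.5, 2.5, 2.5, 7.5, 10.5, 8.5]
c_hat, sigma_hat = estimate_params(history)
print(c_hat, sigma_hat)                # 0.5 0.0
print(forecast(history, 3, c_hat))     # [2.0, 2.0, 3.0]
```

Because each forecast value is appended to the series, horizons longer than one week reuse earlier forecasts, so the trend accumulates week over week.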

Receding Horizon Control

I wrote about RHC in an earlier post. RHC is a method for optimizing the control of dynamical systems. Instead of optimizing the controls for the entire trajectory we are considering, RHC plans only a short number of steps ahead. The controls are then optimized for this shorter horizon. We then execute one or more of the controls and replan for a new horizon.
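The plan, execute, replan loop can be sketched generically as follows. The toy system ($x_{t+1} = x_t + u_t$ with a target of 10) and the simple `greedy_plan` planner are stand-ins invented for illustration, not the inventory problem or a real optimizer.

```python
def receding_horizon(x0, steps, plan, dynamics):
    """Replan at every step and execute only the first planned control."""
    x = x0
    applied, trajectory = [], [x0]
    for t in range(steps):
        controls = plan(x, t)      # plan over a short horizon from the current state
        u = controls[0]            # execute only the first control
        x = dynamics(x, u, t)
        applied.append(u)
        trajectory.append(x)
    return applied, trajectory


def greedy_plan(x, t, horizon=3, target=10, u_max=3):
    """Toy planner: move toward the target by at most u_max per step."""
    controls = []
    for _ in range(horizon):
        u = max(-u_max, min(u_max, target - x))
        controls.append(u)
        x = x + u
    return controls


applied, trajectory = receding_horizon(0, 5, greedy_plan, lambda x, u, t: x + u)
print(applied)      # [3, 3, 3, 1, 0]
print(trajectory)   # [0, 3, 6, 9, 10, 10]
```

Only the planner needs to change to apply the same loop to a different system, which is what makes the approach easy to pair with a generic optimizer.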

Random Search

Random search is a basic optimization method, typically used to optimize hyperparameters for machine learning algorithms. It is gradient-free and so can be used when the objective to be optimized is non-differentiable, as with our inventory control problem. Random search works by defining a probability distribution for the parameters we want to optimize. We then sample many possible parameters, calculate the objective with each sample, and choose the sample that produced the best objective value.

The downside to random search is that it is not particularly clever in how it samples, so we may need many trials to find good parameters. The upside is that it is very generic and broadly applicable.
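As a small illustration, the sketch below maximizes a non-differentiable toy objective, $-|x - 3|$, by sampling $x$ uniformly from $[0, 20]$. The objective, the sampling range, and the function names are assumptions chosen for the example.

```python
import random


def random_search(objective, sample, n_trials, rng):
    """Return the sampled parameter with the highest objective value."""
    best_x, best_val = None, float("-inf")
    for _ in range(n_trials):
        x = sample(rng)
        val = objective(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


def objective(x):
    """Toy objective with a kink at its maximum, x = 3."""
    return -abs(x - 3.0)


def sample(rng):
    """Draw one candidate from the uniform distribution on [0, 20]."""
    return rng.uniform(0.0, 20.0)


best_x, best_val = random_search(objective, sample, n_trials=2000, rng=random.Random(0))
print(best_x, best_val)
```

With one parameter a few thousand samples land close to the maximum, but the number of samples needed grows quickly with the number of parameters.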

Optimal Inventory Management with Receding Horizon Control and Random Search

Our final algorithm will work as follows:

  • for $t$ from 1 to $T$ do
    • for $n$ from 1 to $N_{trials}$ do
      • Sample trial controls $u_t,...,u_{t+N-1}$
      • Forecast the demand $y_t,...,y_{t+N-1}$
      • Evolve the system $N$ steps ahead to determine the final capital $k_{t+N}$
      • if $k_{t+N} > best(k_{t+N})$, set $u_t,...,u_{t+N-1}$ as the current best controls
    • Execute the best found control $u_t$ and repeat

Here, $best(k_{t+N})$ denotes the highest final capital found so far in that round of optimization.
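The algorithm above can be sketched in Python as follows. This is a sketch under stated assumptions, not the code linked from this post. It uses a point forecast with the noise set to zero, computed once per replanning step rather than once per trial, and it samples orders uniformly. All function names and the example numbers are illustrative.

```python
import random


def simulate(q, k, controls, demand, alpha, beta, gamma):
    """Roll the model forward over paired (order, demand) values; return final (q, k)."""
    for u, y in zip(controls, demand):
        s = min(y, q)  # sales are capped by the inventory on hand
        # The right-hand side is evaluated with the old q, matching the model.
        q, k = q + u - s, k + alpha * s - beta * u - gamma * q
    return q, k


def forecast(history, horizon, c, period=7):
    """Point forecast of y_t = c + y_{t-period}, with the noise set to zero."""
    series = list(history)
    for _ in range(horizon):
        series.append(c + series[-period])
    return series[len(history):]


def plan(q, k, history, horizon, n_trials, u_max, c, params, rng):
    """Random search over order sequences; return the best sequence found."""
    demand = forecast(history, horizon, c)
    best_controls, best_k = None, float("-inf")
    for _ in range(n_trials):
        controls = [rng.uniform(0.0, u_max) for _ in range(horizon)]
        _, k_final = simulate(q, k, controls, demand, *params)
        if k_final > best_k:
            best_controls, best_k = controls, k_final
    return best_controls


def run_rhc(q, k, history, true_demand, horizon, n_trials, u_max, c, params, rng):
    """Replan every day, execute only the first planned order, then observe demand."""
    history = list(history)
    orders = []
    for y in true_demand:
        u = plan(q, k, history, horizon, n_trials, u_max, c, params, rng)[0]
        q, k = simulate(q, k, [u], [y], *params)
        history.append(y)
        orders.append(u)
    return q, k, orders


rng = random.Random(0)
params = (2.0, 1.0, 0.5)                 # alpha, beta, gamma (illustrative)
past_week = [1, 1, 2, 2, 7, 10, 8]       # illustrative demand history
next_days = [1.5, 1.5, 2.5, 2.5, 7.5]    # illustrative demand that is then observed
q_end, k_end, orders = run_rhc(0.0, 10.0, past_week, next_days, horizon=10,
                               n_trials=200, u_max=20.0, c=0.5, params=params, rng=rng)
print(q_end, k_end, orders)
```

Because the objective is the capital at the end of the horizon, leftover inventory has no value to the planner, so it tends to avoid ordering late in each plan. Replanning every day limits the effect of this, since only the first order of each plan is executed.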

Example solution

Let us solve the inventory problem now! We will consider an inventory problem over $T=50$ days. We now need to define the parameters in our inventory system, the RHC horizon and the distribution from which to sample trial controls.

For the system parameters we will use $\alpha=2.0$, $\beta=1.0$ and $\gamma=0.5$. We will further assume that the initial inventory and capital are $q_1=0$ and $k_1=10$ respectively.

For the demand, we will assume that the initial weekly demand is $[y_1,...,y_7]=[1,1,2,2,7,10,8]$. Thereafter we assume that the demand evolves according to $$ y_t = 0.5 + y_{t-7} + \epsilon_t $$ with $$ \epsilon_t \sim N(0, 0.8^2) $$ i.e. the noise is normally distributed with 0 mean and a standard deviation of 0.8. We will see that these parameters represent a growing demand with strong weekly seasonality.

For the RHC horizon we will use $N=10$, i.e. we plan 10 days ahead. For the random search algorithm, we assume that we sample all controls from $u_i \sim U(0,20)$, allowing us to order between 0 and 20 units of our good every day.

As a baseline for the RHC inventory management algorithm, I also show results from a naive policy that orders 4 units every period, which is roughly the average daily demand in the initial week ($31/7 \approx 4.4$).
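A minimal sketch of the naive baseline is given below, using the parameters stated above. Two things are assumptions rather than part of the model as written: sampled demand is clipped at zero so that noise cannot produce negative demand, and the function names and random seed are illustrative. The result depends on the seed and will not reproduce the figures in this post.

```python
import random


def simulate_demand(first_week, n_days, c, sigma, rng):
    """Extend the first week with y_t = c + y_{t-7} + eps_t, eps_t ~ N(0, sigma^2)."""
    y = list(first_week)
    while len(y) < n_days:
        # Assumption: clip at zero, since negative demand has no meaning here.
        y.append(max(0.0, c + y[-7] + rng.gauss(0.0, sigma)))
    return y[:n_days]


def run_fixed_orders(q, k, demand, order, alpha, beta, gamma):
    """Order the same amount every day and return the capital after each day."""
    capital = []
    for y in demand:
        s = min(y, q)  # sales are capped by the inventory on hand
        q, k = q + order - s, k + alpha * s - beta * order - gamma * q
        capital.append(k)
    return capital


demand = simulate_demand([1, 1, 2, 2, 7, 10, 8], n_days=50, c=0.5, sigma=0.8,
                         rng=random.Random(0))
capital = run_fixed_orders(q=0.0, k=10.0, demand=demand, order=4.0,
                           alpha=2.0, beta=1.0, gamma=0.5)
print(round(capital[-1], 2))
```

Returning the capital after every day, rather than only the final value, makes it straightforward to plot the baseline against another policy over time.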

Here are the results from running a simulation. The RHC inventory management algorithm led to a much higher final capital, roughly 120 versus 70 for the naive algorithm. In particular, as time goes on, the naive algorithm lags further behind the RHC optimized algorithm.

Here is the final demand curve. The weekly seasonal variations are very clear, as is a general increasing trend in demand. The inability of the naive algorithm to handle the increasing demand is one of the reasons the RHC optimized algorithm wins: it is able to forecast the increased demand and adjust the size of its orders accordingly.

We can now look at the orders and the inventory. With the RHC optimization the orders tend to "front-run" the demand: if a peak in the demand is forecast, then large orders are placed, and if the demand is forecast to be low, then few or no orders are placed.

So we see that demand forecasting and random search can be combined with model predictive control to optimize the ordering schedule for the inventory management problem, at least in the simple model we formulated here. In particular, because random search does not need gradients, we were able to tackle the full nonlinear inventory management problem.

There are some caveats. The results shown are for one particular instance of the problem. Since the demand includes random noise, running the code again will produce slightly different results, but the RHC method still usually wins. Here we also knew the exact time series model to use to forecast demand. In practice we would only have an approximate model for demand, which could degrade the performance of the RHC optimization. So this highlights the importance of accurate forecasting models.