Chapter 14: State-Machine DP - Best Time to Buy and Sell Stock with Cooldown

So far, our DP states have primarily been defined by an index or two in an array or string. Now, we'll explore problems where the state needs to capture more than just position—it needs to capture a status. These problems can be modeled as state machines, where each day (or step) involves transitioning from one status to another.

A classic example of this pattern is LeetCode's "Best Time to Buy and Sell Stock with Cooldown" (Problem #309). Here, we want to maximize profit, but our actions are constrained by a "cooldown" rule, making it a perfect candidate for a state-based DP approach.

The Problem:

You are given an array prices where prices[i] is the price of a given stock on the i-th day.

Find the maximum profit you can achieve. You may complete as many transactions as you like (i.e., buy one and sell one share of the stock multiple times) with the following restrictions:

  • You may not engage in multiple transactions simultaneously (i.e., you must sell the stock before you buy again).

  • After you sell your stock, you cannot buy stock on the next day (i.e., cooldown period is 1 day).

For example:

  • prices = [1, 2, 3, 0, 2]

  • Output: 3

  • Explanation: transactions = [buy, sell, cooldown, buy, sell] -> profit = (2-1) + (2-0) = 3

Approach 1: The Recursive Solution with State

For this problem, knowing the day i is not enough. We also need to know if we are currently holding a stock. So, let's define our state by (i, holding), where holding is a boolean.

Our function solve(i, holding) will return the maximum profit we can make from day i onwards.

  1. If holding is False (we can buy):

    • Option A: Buy. We buy the stock at prices[i]. Buying costs money, so we subtract prices[i] and add whatever we can make from day i+1 onwards in a holding = True state. So, -prices[i] + solve(i + 1, True).

    • Option B: Don't buy (rest). We do nothing. Our profit is whatever we can make from day i+1 onwards in a holding = False state. So, solve(i + 1, False).

    • We take the maximum of these two options.

  2. If holding is True (we can sell):

    • Option A: Sell. We sell the stock at prices[i]. We gain prices[i], plus whatever we can make from day i+2 onwards (skipping day i+1 due to the cooldown) in a holding = False state. So, prices[i] + solve(i + 2, False).

    • Option B: Don't sell (rest). We do nothing. Our profit is whatever we can make from day i+1 onwards in a holding = True state. So, solve(i + 1, True).

    • We take the maximum of these two options.

The base case is when i >= len(prices), at which point no more profit can be made, so we return 0. We start the process at day 0 in a holding = False state.
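These cases translate directly into a recursive function. The sketch below follows the structure described above; the name max_profit_recursive is chosen here for illustration. Without caching, this runs in exponential time, which the next approach fixes.

```python
def max_profit_recursive(prices):
    n = len(prices)

    def solve(i, holding):
        # Base case: past the last day, no more profit can be made.
        if i >= n:
            return 0
        if holding:
            # Sell today (cooldown skips day i + 1), or rest.
            return max(prices[i] + solve(i + 2, False),
                       solve(i + 1, True))
        # Buy today (paying prices[i]), or rest.
        return max(solve(i + 1, True) - prices[i],
                   solve(i + 1, False))

    # Start on day 0, not holding any stock.
    return solve(0, False)
```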

Approach 2: Memoization with a Decorator (Top-Down DP)

The recursive state is (i, holding), which is a tuple and can be easily cached. Applying our decorator will prevent re-computation and make the solution efficient.

With memoization, each state (i, holding) is computed just once. The time complexity becomes O(n), because there are n * 2 possible states and each is resolved in constant time.
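A sketch of the memoized version follows. The book's own decorator is not reproduced here; the standard-library functools.lru_cache plays the same role, caching results keyed on the (i, holding) arguments.

```python
from functools import lru_cache

def max_profit_memo(prices):
    n = len(prices)

    @lru_cache(maxsize=None)  # cache results per (i, holding) state
    def solve(i, holding):
        if i >= n:
            return 0
        if holding:
            return max(prices[i] + solve(i + 2, False),  # sell, then cooldown
                       solve(i + 1, True))               # rest while holding
        return max(solve(i + 1, True) - prices[i],       # buy today
                   solve(i + 1, False))                  # rest without buying

    return solve(0, False)
```

Note that the inner function's arguments must be hashable for lru_cache to work, which an int and a bool are.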

Approach 3: Iteration (Bottom-Up DP)

For the bottom-up approach, we can track the maximum profit for three different states at the end of each day i:

  • hold: The maximum profit if we are holding a stock.

  • sold: The maximum profit if we just sold a stock today.

  • rest: The maximum profit if we are "resting" (not holding a stock and not in cooldown).

We can iterate through the prices and update these three state variables:

  • hold_new = max(hold, rest - price): The only way to be holding today is if we were holding yesterday, OR if we were resting yesterday and bought today.

  • sold_new = hold + price: The only way to sell today is if we were holding yesterday.

  • rest_new = max(rest, sold): The only way to be resting today is if we were resting yesterday, OR if we finished our cooldown from selling yesterday.

We can initialize hold to negative infinity (holding is impossible before the first buy), and sold and rest to 0. After processing all days, the answer is max(sold, rest): ending while still holding an unsold stock is never optimal.

This O(1) space, O(n) time solution is extremely efficient. It elegantly captures the state transitions without recursion or large DP arrays.
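The three transitions above can be sketched as a single loop. The tuple assignment updates all three variables simultaneously, so each update reads yesterday's values, which is exactly what the recurrences require.

```python
def max_profit(prices):
    # hold: best profit while holding a stock.
    # sold: best profit having sold today (cooldown tomorrow).
    # rest: best profit while free to buy (not holding, not in cooldown).
    hold, sold, rest = float('-inf'), 0, 0
    for price in prices:
        hold, sold, rest = (
            max(hold, rest - price),  # keep holding, or buy after resting
            hold + price,             # sell the stock we were holding
            max(rest, sold),          # keep resting, or cooldown just ended
        )
    # An optimal answer never ends holding an unsold stock.
    return max(sold, rest)
```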

Conclusion

The "Buy and Sell Stock with Cooldown" problem is a gateway to a powerful DP paradigm. By defining the state to include not just the position but also the status of our agent (holding, sold, resting), we can model complex decision-making processes. This state-machine approach is applicable to a wide range of problems where actions in the present constrain actions in the future.