Chapter 12: Reconstructing All Solutions - Word Break II

In the previous chapter, we solved a decision problem: can a string be segmented? Now, we'll tackle its natural successor, a constructive problem: find all possible ways to segment the string. This requires a significant evolution in our DP approach. Instead of storing a boolean true/false, our DP state must store the actual list of valid segmentations.

This pattern of reconstructing solutions is common in dynamic programming. LeetCode problem #140, "Word Break II," is the classic example. It builds directly on the logic of its predecessor but challenges us to manage and combine partial results.

The Problem:

Given a string s and a dictionary of strings wordDict, add spaces in s to construct a sentence where each word is a valid dictionary word. Return all such possible sentences in any order.

Note that the same word in the dictionary may be reused multiple times in the segmentation.

For example:

s = "catsanddog", wordDict = ["cat", "cats", "and", "sand", "dog"]
Output: ["cats and dog", "cat sand dog"]
s = "pineapplepenapple", wordDict = ["apple", "pen", "applepen", "pine", "pineapple"]
Output: ["pine apple pen apple", "pineapple pen apple", "pine applepen apple"]

Approach 1: Recursive Backtracking

The most direct way to think about this is with a backtracking recursive function. Let's define solve(i) to return a list of all valid sentence endings for the suffix s[i:].

We iterate through all possible split points j. If s[i:j] is a valid word, we recursively call solve(j) to get all possible ways to segment the rest of the string. Then, we combine our current word with each of those subsequent segmentations.

The base case for the recursion is when i reaches the end of the string. This signifies a complete, valid segmentation, and we should return a list containing an empty string [""]. This empty string acts as a sentinel to allow the previous call to properly construct its sentences.

from typing import List

def wordBreak_backtracking(s: str, wordDict: List[str]) -> List[str]:
    word_set = set(wordDict)

    def solve(i):
        # Base case: we've reached the end. Return a list with an empty string
        # to signify a valid end-point for the previous call to build upon.
        if i == len(s):
            return [""]

        results = []
        # Try every possible end point for the current word
        for j in range(i + 1, len(s) + 1):
            word = s[i:j]
            if word in word_set:
                # Get all possible sentences for the rest of the string
                sub_sentences = solve(j)
                for sub_sentence in sub_sentences:
                    # Append the current word to each sub-sentence
                    if sub_sentence: # If it's not the final empty string
                        results.append(word + " " + sub_sentence)
                    else: # If it is the final empty string, just add the word
                        results.append(word)
        return results

    return solve(0)

# Trying it out
# print(wordBreak_backtracking("catsanddog", ["cat", "cats", "and", "sand", "dog"]))
# This is very slow for inputs with many overlapping subproblems.

This approach works, but just like in the previous chapter, it's highly inefficient because it repeatedly solves for the same suffix s[i:].

Approach 2: Memoization with a Decorator (Top-Down DP)

This problem is a perfect fit for top-down DP with memoization. The recursive structure is complex, making it much more intuitive than a bottom-up approach. We can cache the list of results for each starting index i.

The logic is identical to the backtracking solution, but we'll use a dictionary (or lru_cache) to store the computed list of sentences for each i.

import functools
from typing import List

class Solution:
    def wordBreak_memoized(self, s: str, wordDict: List[str]) -> List[str]:
        word_set = set(wordDict)

        @functools.lru_cache(maxsize=None)
        def solve(i):
            if i == len(s):
                return [""]

            results = []
            for j in range(i + 1, len(s) + 1):
                word = s[i:j]
                if word in word_set:
                    sub_sentences = solve(j)
                    for sub_sentence in sub_sentences:
                        if sub_sentence:
                            results.append(word + " " + sub_sentence)
                        else:
                            results.append(word)
            return results

        return solve(0)

# Trying it out
# solver = Solution()
# print(solver.wordBreak_memoized("pineapplepenapple", ["apple", "pen", "applepen", "pine", "pineapple"]))

Memoization drastically improves performance. The function solve(i) is now computed only once for each index i. While the number of possible sentences can still be large, we avoid the redundant computations of the brute-force method.

Approach 3: Iteration (Bottom-Up DP)

A bottom-up solution is also possible, though often less intuitive for reconstruction problems. Here, dp[i] will store the list of all valid sentences for the prefix s[:i].

Initialize dp as a list of empty lists, of size n+1.
We can think of dp[0] as representing a valid starting point.
Iterate i from 1 to n. For each i, we will compute all segmentations of s[:i].
Inside, iterate j from 0 to i-1. This j is our potential split point.
Check if s[j:i] is a valid word.
If it is, then for every previously found sentence in dp[j], we can form a new sentence by appending s[j:i].
The final result is dp[n].

def wordBreak_iterative(s: str, wordDict: List[str]) -> List[str]:
    word_set = set(wordDict)
    n = len(s)
    dp = [[] for _ in range(n + 1)]
    # We don't need a special value for dp[0], we'll handle it as a special case.

    for i in range(1, n + 1):
        for j in range(i):
            word = s[j:i]
            if word in word_set:
                if j == 0:
                    dp[i].append(word)
                else:
                    # For each valid sentence ending at j, append the new word.
                    for sentence in dp[j]:
                        dp[i].append(sentence + " " + word)

    return dp[n]

# Trying it out
# print(wordBreak_iterative("catsanddog", ["cat", "cats", "and", "sand", "dog"]))

While this bottom-up approach works, the top-down memoized solution often maps more directly to the problem's recursive nature and can be easier to write correctly.

Conclusion

"Word Break II" demonstrated a critical dynamic programming skill: solution reconstruction. You learned that by changing the value stored in your DP table from a simple boolean or number to a list of partial solutions, you can solve a whole new class of problems. This transition from decision problems to constructive problems is a key step in becoming a proficient DP practitioner.

PreviousChapter 11: Segmenting a String - Word Break NextChapter 13: Conditional Transitions - Decode Ways

Last updated 1 month ago