Saturday, November 28, 2020

Interesting Properties of Data Structures

1. The stack data structure can come in handy here in representing this recursive structure of the problem. We can't really process this from the inside out because we don't have an idea about the overall structure. But, the stack can help us process this recursively i.e. from outside to inwards.

2. The capability of finding matching prefix is where the data structure called Trie would shine, comparing the hashset data structure. Not only can Trie tell the membership of a word, but also it can instantly find the words that share a given prefix. As it turns out, the choice of data structure (Trie VS. hashset), which could end with a solution that ranks either the top 5\%5% or the bottom 5\%5%.


Wednesday, November 25, 2020

Below are the Big O performance of common functions of different Java Collections.

List                 | Add  | Remove | Get  | Contains | Next | Data Structure
---------------------|------|--------|------|----------|------|---------------
ArrayList            | O(1) |  O(n)  | O(1) |   O(n)   | O(1) | Array
LinkedList           | O(1) |  O(1)  | O(n) |   O(n)   | O(1) | Linked List
CopyOnWriteArrayList | O(n) |  O(n)  | O(1) |   O(n)   | O(1) | Array



Set                   |    Add   |  Remove  | Contains |   Next   | Size | Data Structure
----------------------|----------|----------|----------|----------|------|-------------------------
HashSet               | O(1)     | O(1)     | O(1)     | O(h/n)   | O(1) | Hash Table
LinkedHashSet         | O(1)     | O(1)     | O(1)     | O(1)     | O(1) | Hash Table + Linked List
EnumSet               | O(1)     | O(1)     | O(1)     | O(1)     | O(1) | Bit Vector
TreeSet               | O(log n) | O(log n) | O(log n) | O(log n) | O(1) | Red-black tree
CopyOnWriteArraySet   | O(n)     | O(n)     | O(n)     | O(1)     | O(1) | Array
ConcurrentSkipListSet | O(log n) | O(log n) | O(log n) | O(1)     | O(n) | Skip List



Queue                   |  Offer   | Peak |   Poll   | Remove | Size | Data Structure
------------------------|----------|------|----------|--------|------|---------------
PriorityQueue           | O(log n) | O(1) | O(log n) |  O(n)  | O(1) | Priority Heap
LinkedList              | O(1)     | O(1) | O(1)     |  O(1)  | O(1) | Array
ArrayDequeue            | O(1)     | O(1) | O(1)     |  O(n)  | O(1) | Linked List
ConcurrentLinkedQueue   | O(1)     | O(1) | O(1)     |  O(n)  | O(n) | Linked List
ArrayBlockingQueue      | O(1)     | O(1) | O(1)     |  O(n)  | O(1) | Array
PriorirityBlockingQueue | O(log n) | O(1) | O(log n) |  O(n)  | O(1) | Priority Heap
SynchronousQueue        | O(1)     | O(1) | O(1)     |  O(n)  | O(1) | None!
DelayQueue              | O(log n) | O(1) | O(log n) |  O(n)  | O(1) | Priority Heap
LinkedBlockingQueue     | O(1)     | O(1) | O(1)     |  O(n)  | O(1) | Linked List



Map                   |   Get    | ContainsKey |   Next   | Data Structure
----------------------|----------|-------------|----------|-------------------------
HashMap               | O(1)     |   O(1)      | O(h / n) | Hash Table
LinkedHashMap         | O(1)     |   O(1)      | O(1)     | Hash Table + Linked List
IdentityHashMap       | O(1)     |   O(1)      | O(h / n) | Array
WeakHashMap           | O(1)     |   O(1)      | O(h / n) | Hash Table
EnumMap               | O(1)     |   O(1)      | O(1)     | Array
TreeMap               | O(log n) |   O(log n)  | O(log n) | Red-black tree
ConcurrentHashMap     | O(1)     |   O(1)      | O(h / n) | Hash Tables
ConcurrentSkipListMap | O(log n) |   O(log n)  | O(1)     | Skip List

Monday, November 2, 2020

Designing Data Intensive Application | Notes 1 | Introduction

Data-intensive applications are pushing the boundaries of what is possible by making use of these technological developments. We call an application data-intensive if data is its primary challenge—the quantity of data, the complexity of data, or the speed at which it is changing—as opposed to compute-intensive, where CPU cycles are the bottleneck

The tools and technologies that help data-intensive applications store and process data have been rapidly adapting to these changes. New types of database systems (“NoSQL”) have been getting lots of attention, but message queues, caches, search indexes, frameworks for batch and stream processing, and related technologies are very important too. Many applications use some combination of these.

Sometimes, when discussing scalable data systems, people make comments along the lines of, “You’re not Google or Amazon. Stop worrying about scale and just use a relational database.” There is truth in that statement: building for scale that you don’t need is wasted effort and may lock you into an inflexible design. In effect, it is a form of premature optimization. However, it’s also important to choose the right tool for the job, and different technologies each have their own strengths and weaknesses. As we shall see, relational databases are important but not the final word on dealing with data.

Ref:https://github.com/ept/ddia-references

This book is arranged into three parts:

In Part I, we discuss the fundamental ideas that underpin the design ofdata-intensive applications. We start in Chapter 1 by discussing what we’re actuallytrying to achieve: reliability, scalability, and maintainability; how we need to think aboutthem; and how we can achieve them. In Chapter 2 we compare several different datamodels and query languages, and see how they are appropriate to different situations. InChapter 3 we talk about storage engines: how databases arrange data on disk so that wecan find it again efficiently. Chapter 4 turns to formats for data encoding (serialization)and evolution of schemas over time.

In Part II, we move from data stored on one machine to data that isdistributed across multiple machines. This is often necessary for scalability, but brings with ita variety of unique challenges. We first discuss replication (Chapter 5),partitioning/sharding (Chapter 6), and transactions (Chapter 7). We thengo into more detail on the problems with distributed systems (Chapter 8) and what itmeans to achieve consistency and consensus in a distributed system (Chapter 9).

In Part III, we discuss systems that derive some datasets from other datasets. Deriveddata often occurs in heterogeneous systems: when there is no one database that can do everythingwell, applications need to integrate several different databases, caches, indexes, and so on. InChapter 10 we start with a batch processing approach to derived data, and we build upon it with stream processing in Chapter 11. Finally, in Chapter 12 we put everythingtogether and discuss approaches for building reliable, scalable, and maintainable applications inthe future.

Wednesday, October 28, 2020

Trie 3 | Leetcode 1268. Search Suggestions System | Amazon | OA 2020 | Keyword Suggestions

 Question:

Given an array of strings products and a string searchWord. We want to design a system that suggests at most three product names from products after each character of searchWord is typed. Suggested products should have common prefix with the searchWord. If there are more than three products with a common prefix return the three lexicographically minimums products.

Return list of lists of the suggested products after each character of searchWord is typed. 

Example 1:

Input: products = ["mobile","mouse","moneypot","monitor","mousepad"], searchWord = "mouse"

Output: [

["mobile","moneypot","monitor"],

["mobile","moneypot","monitor"],

["mouse","mousepad"],

["mouse","mousepad"],

["mouse","mousepad"]

]

Explanation: products sorted lexicographically = ["mobile","moneypot","monitor","mouse","mousepad"]

After typing m and mo all products match and we show user ["mobile","moneypot","monitor"]

After typing mou, mous and mouse the system suggests ["mouse","mousepad"]


Example 2:

Input: products = ["havana"], searchWord = "havana"

Output: [["havana"],["havana"],["havana"],["havana"],["havana"],["havana"]]

Example 3:

Input: products = ["bags","baggage","banner","box","cloths"], searchWord = "bags"

Output: [["baggage","bags","banner"],["baggage","bags","banner"],["baggage","bags"],["bags"]]

Example 4:

Input: products = ["havana"], searchWord = "tatiana"

Output: [[],[],[],[],[],[],[]]


Solution :


class Solution {

    class TrieNode{

        TrieNode[] children=new TrieNode[26];

        LinkedList<String> suggestions=new LinkedList<>();

        boolean isEnd=false;

    }

    

    private void insert(String p,TrieNode root){

        TrieNode node=root;

        for(char ch:p.toCharArray()){

            if(node.children[ch-'a']==null){

               node.children[ch-'a']=new TrieNode(); 

            }

            node=node.children[ch-'a'];

            //add the word in the suggestions

            node.suggestions.add(p);

            Collections.sort(node.suggestions);

            

            if(node.suggestions.size() >3){

               node.suggestions.pollLast(); 

            }

        }

    }

    private List<List<String>> search(String searchWord,TrieNode root){

        List<List<String>> ans=new ArrayList<>();

        TrieNode node=root;

        for(char ch:searchWord.toCharArray()){

            if(node!=null)

                node=node.children[ch-'a'];

            ans.add(node== null ? Arrays.asList():node.suggestions);

            

        }

        return ans;

        

    }

    

    public List<List<String>> suggestedProducts(String[] products, String searchWord) {

        TrieNode root=new TrieNode();

        for(String s:products){

            insert(s,root);

        }

        return search(searchWord,root);

    }

}


Analysis:


Complexity depends on the sorting, the process of building Trie and the length of searchWord. Sorting cost time O(m * n), due to involving comparing String, which cost time O(m) for each comparison, building Trie cost O(m * n). Therefore,

Time: O(m * n + L), space: O(m * n + L * m) - including return list ans, where m = average length of products, n = products.length, L = searchWord.length().

Trie | Part 2 Implement Trie (Prefix Tree)

Leetcode 208. Implement Trie (Prefix Tree)

Implement a trie with insert, search, and startsWith methods.

Example:

Trie trie = new Trie();

trie.insert("apple");

trie.search("apple");   // returns true

trie.search("app");     // returns false

trie.startsWith("app"); // returns true

trie.insert("app");   

trie.search("app");     // returns true

Note:

You may assume that all inputs are consist of lowercase letters a-z.

All inputs are guaranteed to be non-empty strings.


Solution :

class Trie {

    

    static class TrieNode{

        TrieNode[] children=new TrieNode[26];

        boolean isEnd=false;

    }

    

    private TrieNode root;

    /** Initialize your data structure here. */

    public Trie() {

        root=new TrieNode();

    }

    

    /** Inserts a word into the trie. */

    public void insert(String word) {

        TrieNode node=root;

        for(int i=0;i<word.length();i++){

            char ch=word.charAt(i);

            if(node.children[ch-'a']==null){

                node.children[ch-'a']=new TrieNode();

            }

            node=node.children[ch-'a'];

        }

        node.isEnd=true;

        

    }

    

    /** Returns if the word is in the trie. */

    public boolean search(String word) {

        TrieNode node=root;

        for(int i=0;i<word.length();i++){

            char ch=word.charAt(i);

            if(node.children[ch-'a']==null)

                return false;

            node=node.children[ch-'a'];

        }

        return node.isEnd;

        

    }

    

    /** Returns if there is any word in the trie that starts with the given prefix. */

    public boolean startsWith(String prefix) {

        TrieNode node=root;

        for(int i=0;i<prefix.length();i++){

            char ch=prefix.charAt(i);

            if(node.children[ch-'a']==null)

                return false;

            node=node.children[ch-'a'];

        }

        return node != null;

    }

}


/**

 * Your Trie object will be instantiated and called as such:

 * Trie obj = new Trie();

 * obj.insert(word);

 * boolean param_2 = obj.search(word);

 * boolean param_3 = obj.startsWith(prefix);

 */

Trie | Part 1 Introduction

 Trie (pronounce as "Try") is a very popular data structure, usually known as Prefix Tree as well.

Applications:

1. AutoComplete

2. Spell Checker

3. IP Routing (Longest Prefix Matching)

There are several other data structures, like balanced trees and hash tables, which give us the possibility to search for a word in a dataset of strings. Then why do we need trie? Although hash table has O(1) time complexity for looking for a key, it is not efficient in the following operations :

Finding all keys with a common prefix.

Enumerating a dataset of strings in lexicographical order.

Another reason why trie outperforms hash table, is that as hash table increases in size, there are lots of hash collisions and the search time complexity could deteriorate to O(n), where n is the number of keys inserted. Trie could use less space compared to Hash Table when storing many keys with the same prefix. In this case using trie has only O(m) time complexity, where mm is the key length. Searching for a key in a balanced tree costs O(mlogn) time complexity.


Trie node structure

Trie is a rooted tree. Its nodes have the following fields:

Maximum of R links to its children, where each link corresponds to one of R character values from dataset alphabet. In this article we assume that R is 26, the number of lowercase latin letters.

Boolean field which specifies whether the node corresponds to the end of the key, or is just a key prefix.












static class TrieNode{

        TrieNode[] children=new TrieNode[26];
        boolean isEnd=false;
    }


Two of the most common operations in a trie are insertion of a key and search for a key.

Insertion of a key to a trie

We insert a key by searching into the trie. We start from the root and search a link, which corresponds to the first key character. There are two cases :

A link exists. Then we move down the tree following the link to the next child level. The algorithm continues with searching for the next key character.

A link does not exist. Then we create a new node and link it with the parent's link matching the current key character. We repeat this step until we encounter the last character of the key, then we mark the current node as an end node and the algorithm finishes.













    /** Inserts a word into the trie. */

    public void insert(String word) {

        TrieNode node=root;

        for(int i=0;i<word.length();i++){

            char ch=word.charAt(i);

            if(node.children[ch-'a']==null){

                node.children[ch-'a']=new TrieNode();

            }

            node=node.children[ch-'a'];

        }

        node.isEnd=true;

        

    }

Complexity Analysis

Time complexity : O(m), where m is the key length.

In each iteration of the algorithm, we either examine or create a node in the trie till we reach the end of the key. This takes only mm operations.


Space complexity : O(m).

In the worst case newly inserted key doesn't share a prefix with the the keys already inserted in the trie. We have to add mm new nodes, which takes us O(m)O(m) space.


Search for a key in a trie

Each key is represented in the trie as a path from the root to the internal node or leaf. We start from the root with the first key character. We examine the current node for a link corresponding to the key character. There are two cases :


A link exist. We move to the next node in the path following this link, and proceed searching for the next key character.


A link does not exist. If there are no available key characters and current node is marked as isEnd we return true. Otherwise there are possible two cases in each of them we return false :


There are key characters left, but it is impossible to follow the key path in the trie, and the key is missing.

No key characters left, but current node is not marked as isEnd. Therefore the search key is only a prefix of another key in the trie.










  /** Returns if the word is in the trie. */

    public boolean search(String word) {

        TrieNode node=root;

        for(int i=0;i<word.length();i++){

            char ch=word.charAt(i);

            if(node.children[ch-'a']==null)

                return false;

            node=node.children[ch-'a'];

        }

        return node.isEnd;

        

    }

Complexity Analysis

Time complexity : O(m)

Space complexity : O(1)

Reference : https://leetcode.com/problems/implement-trie-prefix-tree/solution/

Saturday, October 10, 2020

Deleting data from ArrayList with a For-loop

 Suppose you want to remove all the Strings which contains "abc" in them.

for (int i = 0; i < size; i++){

    if (data.get(i).getCaption().contains("abc")){

        data.remove(i);

    }

}


The above code will not work. Why?

The Problem here is you are iterating from 0 to size and inside the loop you are deleting items. Deleting the items will reduce the size of the list which will fail when you try to access the indexes which are greater than the effective size(the size after the deleted items).


There are three approaches to do this.


Delete using iterator if you do not want to deal with index.


for (Iterator<Object> it = data.iterator(); it.hasNext();) {

if (it.next().getCaption().contains("_Hardi")) {

    it.remove();

}

}

Else, delete from the end.


for (int i = size-1; i >= 0; i--){

    if (data.get(i).getCaption().contains("_Hardi")){

            data.remove(i);

    }

 }


Otherwise use a while loop :

int i = 0;
while (i < data.size()) {
    if (data.get(i).getCaption().contains("_Hardi"))
        data.remove(i);
    else i++;
}

Plus, Keep an eye for ConcurrentModificationExceltion as well, although it 
will not com in this case.

Tuesday, September 22, 2020

int[] and Integer[] arrays - What is the difference?

This image should help you to understand the difference:






int is a number, it's a primitive type.

Integer is an object.

When you have an array of Integers, you actually have an array of objects. Array of ints is an array of primitive types.

Since arrays are objects, they're allocated on the heap. If it's an array of ints, these ints will be allocated on the heap too, within the array.

Further adding. to this..

There is a difference at run-time.

int[] is an array of primitive int values. Integer[] is an "object" array, holding references to Integer objects.

Most important practical difference: int[] cannot hold null values.

int[] does store primitive types. And the array itself lives on the heap. However, those primitives are allocated as part of the array. They are not stored separately elsewhere on the heap. This is very similar to how a primitive field is part of an object instance: The object is on the heap, and its field is an integral part of that object (whereas for a non-primitive field, only the reference is stored inside the object and the target instance that reference points at is stored separately on the heap).


Ref : this link

Wednesday, September 2, 2020

FAANG : The Journey | Post 10 | Top 50 Google tagged questions. (With Links)

 

  1. Two Sum (https://leetcode.com/problems/two-sum)
  2. Insert Interval (https://leetcode.com/problems/insert-interval)
  3. Text Justification (https://leetcode.com/problems/text-justification)
  4. Minimum Window Substring (https://leetcode.com/problems/minimum-windowsubstring)
  5. Maximal Rectangle (https://leetcode.com/problems/maximal-rectangle)
  6. The Skyline Problem (https://leetcode.com/problems/the-skyline-problem)
  7. Maximal Square (https://leetcode.com/problems/maximal-square)
  8. Meeting Rooms II (https://leetcode.com/problems/meeting-rooms-ii)
  9. Find Median from Data Stream (https://leetcode.com/problems/find-median-from-datastream)
  10. Bulls and Cows (https://leetcode.com/problems/bulls-and-cows)
  11. Count of Smaller Numbers After Self (https://leetcode.com/problems/count-of-smallernumbers-after-self)
  12. Longest Increasing Path in a Matrix (https://leetcode.com/problems/longest-increasingpath-in-a-matrix)
  13. Longest Substring with At Most K Distinct Characters
    (https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters)
  14. Moving Average from Data Stream (https://leetcode.com/problems/moving-averagefrom-data-stream)
  15. Logger Rate Limiter (https://leetcode.com/problems/logger-rate-limiter)
  16. Design Hit Counter (https://leetcode.com/problems/design-hit-counter)
  17. Max Sum of Rectangle No Larger Than K (https://leetcode.com/problems/max-sum-ofrectangle-no-larger-than-k)
  18. Decode String (https://leetcode.com/problems/decode-string)
  19. Evaluate Division (https://leetcode.com/problems/evaluate-division)
  20. Split Array Largest Sum (https://leetcode.com/problems/split-array-largest-sum)
  21. Robot Room Cleaner (https://leetcode.com/problems/robot-room-cleaner)
  22. Random Pick with Weight (https://leetcode.com/problems/random-pick-with-weight)
  23. Subarray Sum Equals K (https://leetcode.com/problems/subarray-sum-equals-k)
  24. Longest Line of Consecutive One in Matrix (https://leetcode.com/problems/longest-lineof-consecutive-one-in-matrix)
  25. Design Search Autocomplete System (https://leetcode.com/problems/design-searchautocomplete-system)
  26. Split Array into Consecutive Subsequences (https://leetcode.com/problems/split-arrayinto-consecutive-subsequences)
  27. 24 Game (https://leetcode.com/problems/24-game)
  28. Minimum Window Subsequence (https://leetcode.com/problems/minimum-windowsubsequence)
  29. Network Delay Time (https://leetcode.com/problems/network-delay-time)
  30. Open the Lock (https://leetcode.com/problems/open-the-lock)
  31. Expressive Words (https://leetcode.com/problems/expressive-words)
  32. Find And Replace in String (https://leetcode.com/problems/find-and-replace-in-string)
  33. Guess the Word (https://leetcode.com/problems/guess-the-word)
  34. Hand of Straights (https://leetcode.com/problems/hand-of-straights)
  35. Shortest Subarray with Sum at Least K (https://leetcode.com/problems/shortestsubarray-with-sum-at-least-k)
  36. X of a Kind in a Deck of Cards (https://leetcode.com/problems/x-of-a-kind-in-a-deck-ofcards)
  37. Minimum Area Rectangle (https://leetcode.com/problems/minimum-area-rectangle)
  38. Validate Stack Sequences (https://leetcode.com/problems/validate-stack-sequences)
  39. Flip Equivalent Binary Trees (https://leetcode.com/problems/flip-equivalent-binarytrees)
  40. Minimum Domino Rotations For Equal Row (https://leetcode.com/problems/minimumdomino-rotations-for-equal-row)
  41. Longest String Chain (https://leetcode.com/problems/longest-string-chain)
  42. Shortest Way to Form String (https://leetcode.com/problems/shortest-way-to-formstring)
  43. Confusing Number II (https://leetcode.com/problems/confusing-number-ii)
  44. Delete Nodes And Return Forest (https://leetcode.com/problems/delete-nodes-andreturn-forest)
  45. Snapshot Array (https://leetcode.com/problems/snapshot-array)
  46. String Transforms Into Another String (https://leetcode.com/problems/string-transformsinto-another-string)
  47. Divide Chocolate (https://leetcode.com/problems/divide-chocolate)
  48. Divide Array in Sets of K Consecutive Numbers (https://leetcode.com/problems/dividearray-in-sets-of-k-consecutive-numbers)
  49. Minimum Distance to Type a Word Using Two Fingers
    (https://leetcode.com/problems/minimum-distance-to-type-a-word-using-two-fingers)
  50. Time Needed to Inform All Employees (https://leetcode.com/problems/time-needed-toinform-all-employees)

Saturday, August 8, 2020

How to solve the Knapsack Problem with dynamic programming

The Knapsack Problem is a really interesting problem in combinatorics — to cite Wikipedia,

“given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible.”

From Wikipedia, we see that there are a few variations of the Knapsack Problem: 0–1 knapsack, bounded knapsack, and unbounded knapsack. Today, we’ll be focusing on the most common (and simplest) variation: the 0–1 knapsack problem.

A slightly more interesting and relatable phrasing of the 0–1 knapsack problem is:

Consider a thief gets into a home to rob and he carries a knapsack. There are fixed number of items in the home — each with its own weight and value — Jewelry, with less weight and highest value vs tables, with less value but a lot heavy. To add fuel to the fire, the thief has an old knapsack which has limited capacity. Obviously, he can’t split the table into half or jewelry into 3/4ths. He either takes it or leaves it. Source

Unfortunately, I had some difficulty understanding some parts of the Hackerearth article, which is why I decided to write my own article. This article will be largely based off Hackerearth’s article and code snippets are written in Java. I’ll be tacking on additional explanations and elaborations where I feel they are necessary.

Dynamic programming

We’ll be solving this problem with dynamic programming. Dynamic programming requires an optimal substructure and overlapping sub-problems, both of which are present in the 0–1 knapsack problem, as we shall see.

It’s fine if you don’t understand what “optimal substructure” and “overlapping sub-problems” are (that’s an article for another day). Essentially, it just means a particular flavor of problems that allow us to reuse previous solutions to smaller problems in order to calculate a solution to the current problem. You’ll see what I mean in a bit.

Problem details

Suppose we have a knapsack which can hold int w = 10 weight units. We have a total of int n = 4 items to choose from, whose values are represented by an array int[] val = {10, 40, 30, 50} and weights represented by an array int[] wt = {5, 4, 6, 3}.

Since this is the 0–1 knapsack problem, we can either include an item in our knapsack or exclude it, but not include a fraction of it, or include it multiple times.

Solution

Step 1:

First, we create a 2-dimensional array (i.e. a table) of n + 1 rows and w + 1 columns.

A row number i represents the set of all the items from rows 1— i. For instance, the values in row 3 assumes that we only have items 1, 2, and 3.

A column number j represents the weight capacity of our knapsack. Therefore, the values in column 5, for example, assumes that our knapsack can hold 5 weight units.

Putting everything together, an entry in row i, column j represents the maximum value that can be obtained with items 1, 2, 3 … i, in a knapsack that can hold j weight units.

Let’s call our table mat for matrix. Therefore, int[][] mat = new int[n + 1][w + 1].

Step 2:

We can immediately begin filling some entries in our table: the base cases, for which the solution is trivial. For instance, at row 0, when we have no items to pick from, the maximum value that can be stored in any knapsack must be 0. Similarly, at column 0, for a knapsack which can hold 0 weight units, the maximum value that can be stored in it is 0. (We’re assuming that there are no massless, valuable items.)

We can do this with 2 for loops:

// Populate base cases
int[][] mat = new int[n + 1][w + 1];
for (int r = 0; r < w + 1; r++) {
mat[0][r] = 0;
}
for (int c = 0; c < n + 1; c++) {
mat[c][0] = 0;
}

Step 3 (the crux of the problem):

Now, we want to begin populating our table. As with all dynamic programming solutions, at each step, we will make use of our solutions to previous sub-problems.

I’ll first describe the logic, before showing a concrete example.

The relationship between the value at row i, column j and the values to the previous sub-problems is as follows:

Recall that at row i and column j, we are tackling a sub-problem consisting of items 1, 2, 3 … i with a knapsack of j capacity. There are 2 options at this point: we can either include item i or not. Therefore, we need to compare the maximum value that we can obtain with and without item i.

The maximum value that we can obtain without item i can be found at row i-1, column j. This part’s easy. The reasoning is straightforward: whatever maximum value we can obtain with items 1, 2, 3 … i must obviously be the same maximum value we can obtain with items 1, 2, 3 … i - 1, if we choose not to include item i.

To calculate the maximum value that we can obtain with item i, we first need to compare the weight of item i with the knapsack’s weight capacity. Obviously, if item i weighs more than what the knapsack can hold, we can’t include it, so it does not make sense to perform the calculation. In that case, the solution to this problem is simply the maximum value that we can obtain without item i (i.e. the value in the row above, at the same column).

However, suppose that item i weighs less than the knapsack’s capacity. We thus have the option to include it, if it potentially increases the maximum obtainable value. The maximum obtainable value by including item i is thus = the value of item i itself + the maximum value that can be obtained with the remaining capacity of the knapsack. We obviously want to make full use of the capacity of our knapsack, and not let any remaining capacity go to waste.

Therefore, at row i and column j (which represents the maximum value we can obtain there), we would pick either the maximum value that we can obtain without item i, or the maximum value that we can obtain with item i, whichever is larger.

Here’s a concrete example of what I mean.


Image for post


At row 3 (item 2), and column 5 (knapsack capacity of 4), we can choose to either include item 2 (which weighs 4 units) or not. If we choose not to include it, the maximum value we can obtain is the same as if we only have item 1 to choose from (which is found in the row above, i.e. 0). If we want to include item 2, the maximum value we can obtain with item 2 is the value of item 2 (40) + the maximum value we can obtain with the remaining capacity (which is 0, because the knapsack is completely full already).

At row 3 (item 2), and column 10 (knapsack capacity of 9), again, we can choose to either include item 2 or not. If we choose not to, the maximum value we can obtain is the same as that in the row above at the same column, i.e. 10 (by including only item 1, which has a value of 10). If we include item 2, we have a remaining knapsack capacity of 9 - 4 = 5. We can find out the maximum value that can be obtained with a capacity of 5 by looking at the row above, at column 5. Thus, the maximum value we can obtain by including item 2 is 40 (the value of item 2) + 10 = 50.

We pick the larger of 50 vs 10, and so the maximum value we can obtain with items 1 and 2, with a knapsack capacity of 9, is 50.

The algorithm can be expressed in Java like this:

// Main logic
for (int item = 1; item <= n; item++) {
for (int capacity = 1; capacity <= w; capacity++) {
int maxValWithoutCurr = mat[item - 1][capacity]; // This is guaranteed to exist
int maxValWithCurr = 0; // We initialize this value to 0
int weightOfCurr = wt[item - 1]; // We use item -1 to account for the extra row at the top
if (capacity >= weightOfCurr) { // We check if the knapsack can fit the current item
maxValWithCurr = val[item - 1]; // If so, maxValWithCurr is at least the value of the current item
int remainingCapacity = capacity - weightOfCurr; // remainingCapacity must be at least 0
maxValWithCurr += mat[item - 1][remainingCapacity]; // Add the maximum value obtainable with the remaining capacity
}
mat[item][capacity] = Math.max(maxValWithoutCurr, maxValWithCurr); // Pick the larger of the two
}
}

Step 4 (final solution):

Once the table has been populated, the final solution can be found at the last row in the last column, which represents the maximum value obtainable with all the items and the full capacity of the knapsack.

return mat[n][w];

Working code:

Note: here, I printed the final answer instead of returning it, since everything is housed under public static void main.

import java.util.Arrays;
public class Knapsack {
public static void main(String args[]) {
int w = 10;
int n = 4;
int[] val = {10, 40, 30, 50};
int[] wt = {5, 4, 6, 3};
// Populate base cases
int[][] mat = new int[n + 1][w + 1];
for (int r = 0; r < w + 1; r++) {
mat[0][r] = 0;
}
for (int c = 0; c < n + 1; c++) {
mat[c][0] = 0;
}
// Main logic
for (int item = 1; item <= n; item++) {
for (int capacity = 1; capacity <= w; capacity++) {
int maxValWithoutCurr = mat[item - 1][capacity]; // This is guaranteed to exist
int maxValWithCurr = 0; // We initialize this value to 0
int weightOfCurr = wt[item - 1]; // We use item -1 to account for the extra row at the top
if (capacity >= weightOfCurr) { // We check if the knapsack can fit the current item
maxValWithCurr = val[item - 1]; // If so, maxValWithCurr is at least the value of the current item
int remainingCapacity = capacity - weightOfCurr; // remainingCapacity must be at least 0
maxValWithCurr += mat[item - 1][remainingCapacity]; // Add the maximum value obtainable with the remaining capacity
}
mat[item][capacity] = Math.max(maxValWithoutCurr, maxValWithCurr); // Pick the larger of the two
}
}
System.out.println(mat[n][w]); // Final answer
System.out.println(Arrays.deepToString(mat)); // Visualization of the table
}
}

Thanks for reading!


PS : This is not my original article. The original can be found here