Tackling Hash Table Problems in JavaScript
February 20, 2018
Author: Dave Cohen
Algorithms and data structures can be intimidating. The goal of this post is to explain some approaches to solving hash table algorithm problems providing clear examples with detailed solution explanations. I’m hoping this will help anyone uninitiated alleviate fears and make bigger strides in conquering these challenging problems.
Why Hash Tables?
While attending Fullstack Academy, I participated in an algorithms class taught by alumnus Clément Mihailescu, an employee of Google. I found the class to be enlightening and engaging. (I was also star-struck because he works at Google.) He founded AlgoExpert which is a top-notch platform for practicing interview problems. I highly recommend it for those who are serious about acing their interviews. I was inspired by the power of hash tables after the class and went on a journey to solve as many hash table problems as I could find.
Hash Tables introduction
A hash table is a data structure that provides a dictionary-style lookup where a “key” is provided and a “value” is returned. If you’ve worked with object literals in JavaScript (aka associative arrays), you’re familiar with a common implementation of the hash table data structure. Here’s one:
// association: a "key" maps to a "value"
// in this case, names are keys and their phone numbers are the values:
const phoneNumberLookup = {
'John Smith': '555-123-4567'
'Dan the Man': '555-212-2122'
};
Let’s see a more algorithmic use for the data structure. This example assigns string characters as the “key” and their frequency count as the “value:”
// using the Array.prototype.reduce to count the occurrence of characters:
// association: { [character]: number }
const count = Array.from('aabbccddeee').reduce((accumulator, char) => {
if (char in accumulator) {
accumulator[char]++;
} else {
accumulator[char] = 1;
}
return accumulator;
}, {});
// result:
// {a: 2, b: 2, c: 2, d: 2, e: 3}
// 'a', 'b', 'c', and 'd' occur 2 times, 'e' occurs 3 times
Hash Functions
JavaScript provides us this data structure without us having to know how it works. I want to go a little behind the scenes into the hash function algorithm, its dangers, benefits, and drawbacks.
The process of looking up a value from a given key is particularly clever (in my opinion):
- A key is provided as input to a hash function.
- The hash algorithm converts this key into an integer.
- This integer is an array index of a location in memory.
- The location contains the value data that corresponds with the given key.
Example: “Lisa Smith” goes through a hash function and results in array index 1
. “Sam Doe” goes through a hash function and results in array index 4
. Their respective data is stored at their respective array indices.
If you’re a little wary of the image above where both “John Smith” and “Sandra Dee” point to the same address, you’re not alone. If two or more different hashed keys resulted in the same array index, those are considered “collisions.” Not accounting for collisions is dangerous because we risk losing data. To prevent data loss, the hashing algorithm must have collision resolving methods and (ideally) prevent collisions in the first place. For a data loss example, let’s say both key1
and key2
when hashed resulted in array index 0
. Then, data for key1
is written to that index. Then, data for key2
is written to that index. Finally, if I want the data from key1
, it’ll have been overwritten by the data of key2
! No good.
Let’s talk “Big O.” In terms of time complexity, writing and retrieving data using a hash table are O(1) operations. That’s amazing in terms of efficiency! Less optimal is sorting the data, you’re better off using a different data structure if that’s important for your use case.
I encourage you to check out the articles below for more on “Big O” and in-depth hash table lessons.
Big O Notation:
Hash Tables:
- Hash function | Wikipedia
- Algorithms in JavaScript: Hash Tables | by Rohan Paul | JavaScript in Plain English
- Hashing GeeksforGeeks
- Hash Table Map Data Structure - Interview Cake
Coding Challenge Sites: Leetcode, Hackerrank, Codewars, and more
There are many great “code challenge” websites to practice at computer science problems. When I wanted to work more with hash tables, I discovered that LeetCode has a tagging system which links to 81 hash table problems!
Other great sites for coding challenges:
- HackerRank my general favorite. The “Data Structures” and “Algorithms” tracks are phenomenal. I also enjoyed “10 Days of JavaScript.”
- Codewars - has an excellent community and platform, as well as some very fun problems.
Summary of Approaches
After solving around 6 problems of various difficulty on LeetCode, I observed some general principles and forms:
- Hash tables allow for very fast, O(1), lookups. If my solution to a problem was “timing out,” I was probably over-using array methods.
- Nested
for
loops (which have O(n^2) time efficiency) can often be alleviated with hash tables. - It’s often necessary to make two passes through an array to gather the hash table data to allow for complete processing.
- It’s sometimes possible to make a single pass if by the end of the pass, you have all the data you need for processing.
Javascript objects (what I’ve used for hash tables) are extremely versatile. I was able to implement them with the following patterns:
- a “map” of one data type to another:
{a: 'dog'}
- the “visited” pattern:
{1: true}
. You can also use aSet
for this. - the “count” or “accumulator” pattern:
{a: 2, b: 12}
. This can be implemented with.reduce()
or a simplefor
loop.
Word Pattern (tagged as ‘Easy’)
Link to problem & description: Word Pattern - LeetCode
Basic Gist: If a pattern (like ‘abba’) matches an input string (like ‘dog cat cat dog’) then it’s a match. (‘dog cat cat fish’ wouldn’t be a match.)
My Solution Stats: Status: Accepted. 33 / 33 test cases passed. Runtime: 48 ms
/**
* @param {string} pattern
* @param {string} str
* @return {boolean}
*/
var wordPattern = function(pattern, str) {
if (pattern === '' || str === '') return false;
const arrStr = str.split(' ');
if (arrStr.length !== pattern.length) return false;
let patternHash = {};
let wordHash = {};
for (let i = 0; i < arrStr.length; i++) {
if (!patternHash.hasOwnProperty(pattern[i]) && !wordHash[arrStr[i]]) {
// add the pattern letter to hash table, mapped to current word
patternHash[pattern[i]] = arrStr[i];
// make sure current word is accounted for
wordHash[arrStr[i]] = true;
} else {
// check if the word in the pattern hash matches the current word
if (patternHash[pattern[i]] !== arrStr[i]) {
return false;
}
}
}
return true;
};
Explanation:
- the first 3 lines are guards against invalid input and split the input
str
into an array which was separated by spaces. patternHash
andwordHash
will keep track of slightly different things. (I’ll explain as I go.)
// patternHash example: { a: 'dog', b: 'cat' }
// wordHash example: { dog: true, cat: true }
- we use a
for
loop to iterate over the entire input string as array. -
the first
if
block checks 2 things have NOT happened yet:- if we’ve mapped a word to a letter in the pattern string.
- if we’ve accounted for the current word - which is
arrStr[i]
. - if neither has happened, we can safely initialize both hashes. (as described above).
else
block: now is the moment of truth. IfpatternHash[pattern[i]]
doesn’t match up witharrStr[i]
, we can definitivelyreturn false
. We don’t have an exact match of pattern to words.- finally, if the
for
loop has completed, we can safelyreturn true
. We have a match!
Reflection: This certainly works, but I can probably do this with only one hash table. {dog: true}
doesn’t have extra meaningful data and between my two hashes, I store each word twice.
Two Sum (tagged as ‘Easy’)
Link to problem & description: Two Sum - LeetCode
Basic Gist: Given an array of integers, return indices of the two numbers such that they add up to a specific target. (target could be 9, for example.)
My Solution Stats: 19 / 19 test cases passed. Status: Accepted. Runtime: 84 ms
/**
* @param {number[]} nums
* @param {number} target
* @return {number[]}
*/
var twoSum = function(nums, target) {
let hashed = {};
nums.forEach((n, i) => {
if (!hashed.hasOwnProperty(n)) {
hashed[n] = i;
} else {
hashed[n] = [hashed[n], i];
}
});
let hashedKeys = Object.keys(hashed);
let i = 0;
for (i; i < nums.length; i++) {
let subtractor = target - +hashedKeys[i];
if (subtractor in hashed) {
let check = hashed[hashedKeys[i]];
if (!Array.isArray(check)) {
return [check, hashed[subtractor]];
} else {
return check.slice(0, 2);
}
}
}
};
Explanation:
- I begin by initializing a hash object to store the numbers as unique keys. For the values, each number key initially maps to its index. If there are multiples of a number key, I store an array of the indices. Each value can only be used once, so this prepares for the solution format (which asks for an array of two indices).
hashedKeys
is the hash converted to an array of its keys.- the
for
loop iterates over all the numbers in the array. Everything below runs inside the loop. - the
subtractor
is the number we’re looking for to add with the current number to see if it adds up to thetarget
. I saved it in a variable for easier readability going forward. - Now for the main
if
block which checks if thesubtractor
is present in the hash. -
the
check
variable will be type checked.- If we have a number, that means it’s unique and we can return in the array format requested.
- If it’s an array, that means the values are equivalent, and we can return what we initially stored. I do a slice to ensure that it’s the correct length.
Reflection: I worked on this problem 5 months before working on it recently. I unknowingly created an O(n^2) algorithm by using indexOf
to check the entire array sliced at each index as I iterated over it. The runtime was 548 ms (more than 5 times slower than the hash solution.) I think I can improve my hash solution by implementing it without Object.keys(hash)
. I can’t actually explain (at the moment) why I used that, and didn’t use nums[i]
.
Note: I used the “Two Pass” solution. Read more about that and other solutions here: Two Sum - LeetCode Articles. It may be locked until you’ve solved it yourself.
Single Number (tagged as ‘Easy’)
Link to problem & description: Single Number - LeetCode
Basic Gist: Given an array of integers, every element appears twice except for one. Find that single one.
My Solution Stats: 15 / 15 test cases passed. Status: Accepted. Runtime: 84 ms
/**
* @param {number[]} nums
* @return {number}
*/
var singleNumber = function(nums) {
let once = new Set();
nums.forEach(num => {
if (!once.has(num)) {
once.add(num);
} else {
once.delete(num);
}
});
return Array.from(once)[0];
};
Explanation:
- I realized I could use a
Set
to handle storing unique numbers and for fast lookup. -
Using a
forEach
loop:- The first time a number has been seen, I store it in the
once
set. - If the number had been seen before, I delete it from the
once
set.
- The first time a number has been seen, I store it in the
- Assuming our inputs are valid, we return the only element in the
once
set. We get to it by converting it to an array withArray.from()
and getting the[0]
element.
Reflection: Initially, I created two hashes to make sure the algorithm was running correctly. I realized I could use a Set
later and and simply delete the num
entry in once
, without storing it in twice
. Oddly enough, my runtime went from 64ms to 84ms with the Set
implementation. Maybe objects are optimized to run more quickly than sets (?)
Note: there is a very nice solution explanation here: Single Number - LeetCode Articles. It may be locked until you’ve solved it yourself.
Set Mismatch (tagged as ‘Easy’)
Link to problem & description: Set Mismatch - LeetCode
Basic Gist: You have to find where “an error occurred.” Any given input will have a number missing and a number duplicated. Example: [1,2,2,4]
My Solution Stats: 49 / 49 test cases passed. Status: Accepted. Runtime: 84 ms
/**
* @param {number[]} nums
* @return {number[]}
*/
var findErrorNums = function(nums) {
let visited = {};
let result = [null, null];
let max = 0;
for (let i = 0; i < nums.length; i++) {
let current = nums[i];
if (!visited[current]) {
visited[current] = true;
} else {
// current is the duplicated number
result[0] = current;
}
if (current > max) {
max = current;
}
}
current = max;
while (current > 1) {
let oneBefore = current - 1;
if (!visited[oneBefore]) {
// oneBefore is the skipped number
result[1] = oneBefore;
return result;
}
current--;
}
if (!result[1]) {
result[1] = max + 1;
return result;
}
return result;
};
Explanation:
-
This problem was difficult to optimize because the input numbers were not necessarily sorted. There were a few short cases that needed to be accounted for. Note:
result[0]
is the duplicated number andresult[1]
is the missing number.- input
[2,2]
expected[2,1]
output. - input
[1,1]
expected[1,2]
output. - I’ll describe further down how I accounted for this.
- input
visited
is our initialized hash. We’ll set a given number as a key and ‘true’ as the value to keep track of us visiting the number.result
is an array initialized with 2null
values. This way, I was forcing myself to update it without having an accidental solution.max
is the highest number in the array. I wanted to avoid sorting, so the highest number in the array would be where I search from to find the missing number. More below.-
the
for
loop iterates over all the numbers.- If the number isn’t in
visited
, add it to it. - If the number is in
visited
, it’s the duplicated number. We save it asresult[0]
. - I check all numbers and store it in
max
if it’s the highest so far.
- If the number isn’t in
-
The
while
loop begins atmax
and traverses downward.oneBefore
is our search, it’s simply the current number - 1.- If we don’t see
oneBefore
invisited
, it’s the missing number! We can assign it to our result and return it. - If we do see it, we simply go down and try
current--
.
- After this
while
loop, we now account for the case where both numbers are 1. We know thatresult[1]
should be one number higher than max.
Reflection: I feel like the efficiency of the algorithm is quite good. I only store what’s necessary. Keeping track of the edge cases was a bit cumbersome for me. I’m sure there’s a graceful way to do it without the final check that I do.
Official Solution: Set Mismatch - LeetCode Articles
- Note: It may be locked until you try to solve it!
Longest Harmonious Subsequence (tagged as ‘Easy’)
Link to problem & description: Longest Harmonious Subsequence - LeetCode
Basic Gist: Return the length of the longest sequence of 2 unique numbers which are consecutive. Example subsequences: [3,2,2,2,3]
, [3,2,3,2,3]
, [3,3,3,2,2,2]
. (Other numbers in between aren’t counted.) (This one’s a bit difficult to summarize, so check out the problem description.)
My Solution Stats: 201 / 201 test cases passed. Status: Accepted. Runtime: 108 ms
/**
* @param {number[]} nums
* @return {number}
*/
var findLHS = function(nums) {
let hash = {};
let current;
let i;
let max = 0;
let maxCheck1 = 0;
let maxCheck2 = 0;
for (i = 0; i < nums.length; i++) {
current = nums[i];
if (current in hash) {
hash[current]++;
} else {
hash[current] = 1;
}
if (hash[current + 1]) {
maxCheck1 = hash[current] + hash[current + 1];
}
if (hash[current - 1]) {
maxCheck2 = hash[current] + hash[current - 1];
}
max = Math.max(max, maxCheck1, maxCheck2);
}
return max;
};
Explanation:
- The
hash
in this problem stores a number as unique key and the number of times it occurs as the value. - I initialized all the variables outside of the loop. It isn’t necessary to do it this way, I just felt like it.
-
The
for
loop is the main scene of this algorithm. I iterate over all of the numbers in the input array.- The current number is stored in the
current
variable. - The first
if
block checks if the current number is in the hash table. If so, we increase it’s value by 1. If not, we initialize it to be 1. - The second
if
block checks if the current number plus 1 is in the hash table. If so, we store the total number of occurrences ofcurrent
andcurrent + 1
in themaxCheck1
variable. (This will be used soon below.) - The third and final
if
block checks if the current number minus 1 is in the hash table. If so, we store the total number of occurrences ofcurrent
andcurrent - 1
in themaxCheck2
variable. (This will be used soon below.) max
is the maximum ofmax
,maxCheck1
, andmaxCheck2
.max
represents the length of the longest sequence.
- The current number is stored in the
Reflection: I first implemented this solution in 2 passes. After seeing the official solution, I decided to wrangle mine into the single pass version. There’s even more I could do to clean it up, but I’m pretty happy with it.
Official Solution: Longest Harmonious Subsequence - LeetCode
- Note: It may be locked until you try to solve it!
Parting Notes
I’m no expert at algorithms in general, but I’m definitely enjoying learning about them. The more I learn, the more I’m able to visualize different approaches to problems and begin to optimize my solutions. Please contact me below if you have comments about this article and/or suggestions for improvement.
This “enrichment piece” was written as a student at Fullstack Academy.