Is it a good idea to store data as keys in HashMap with empty/null values?

Tags: java, performance, arrays, hashmap, asymptotic-complexity

I had originally written an ArrayList and stored unique values (usernames, i.e. Strings) in it. I later needed to check whether a user existed in that ArrayList, which is an O(n) search.

My tech lead wanted me to change that to a HashMap and store the usernames as keys in the map, with empty Strings as the values.

So, in Java -

hashmap.put("johndoe","");

I can see if this user exists later by running -

hashmap.containsKey("johndoe");

This is O(1) right?

My lead said this was a more efficient way to do this and it made sense to me, but it just seemed a bit off to put null/empty as values in the hashmap and store elements in it as keys.

My question is: is this a good approach? The efficiency beats ArrayList#contains or an array search in general. It works. My worry is that, after searching, I haven't found anyone else doing this. I may be missing an obvious issue somewhere, but I can't see it.

569

asked Aug 01 '16 05:08

dozer

2 Answers

Since you have a set of unique values, a Set is the appropriate data structure. You can put your values inside HashSet, an implementation of the Set interface.

> My lead said this was a more efficient way to do this and it made sense to me, but it just seemed a bit off to put null/empty as values in the hashmap and store elements in it as keys.

The advice of the lead is flawed. Map is not the right abstraction for this; Set is. A Map is appropriate for key-value pairs, but you don't have values, only keys.

Example usage:

    Set<String> users = new HashSet<>(Arrays.asList("Alice", "Bob"));
    System.out.println(users.contains("Alice")); // -> prints true
    System.out.println(users.contains("Jack")); // -> prints false

Using a Map would be awkward, because what should the type of the values be? That question makes no sense in your use case, as you have just keys, not key-value pairs. With a Set, you don't need to ask it at all, and the usage is perfectly natural.
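As a minimal sketch of the question's use case with a set instead of a map: the class name `UsernameRegistry` and its methods are hypothetical, introduced only for illustration. It shows an existing list of usernames being copied into a `HashSet`, and `Set#add` reporting whether a name was already taken.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not taken from the question's code base.
class UsernameRegistry {
    private final Set<String> usernames = new HashSet<>();

    // Copies an existing list of usernames into the set; duplicates collapse.
    UsernameRegistry(List<String> existing) {
        usernames.addAll(existing);
    }

    // Returns true if the name was newly added, false if it was already present.
    boolean register(String username) {
        return usernames.add(username);
    }

    // Hash-based membership test, replacing ArrayList#contains.
    boolean exists(String username) {
        return usernames.contains(username);
    }

    int size() {
        return usernames.size();
    }

    public static void main(String[] args) {
        UsernameRegistry registry =
                new UsernameRegistry(Arrays.asList("johndoe", "alice"));
        System.out.println(registry.exists("johndoe"));   // true
        System.out.println(registry.register("johndoe")); // false, already present
        System.out.println(registry.register("bob"));     // true, newly added
        System.out.println(registry.size());              // 3
    }
}
```

No dummy value appears anywhere in this code, which is the practical benefit of choosing `Set` over a `Map` with empty values.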

> This is O(1) right?

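Yes, with a caveat on the wording: a lookup in a HashMap or a HashSet is O(1) on average (expected time, assuming the hash function spreads the keys well), not in the worst case. If many keys collide in one bucket, the lookup degrades; since Java 8, large buckets are converted to balanced trees, so the worst case for Comparable keys such as Strings is O(log n). Searching in a List or an array is O(n) in the worst case.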
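To make the difference in lookup cost concrete, here is a small sketch that counts `equals()` calls instead of measuring time. The class `LookupCost` and the key type `Name` are hypothetical and exist only for this demonstration; the keys have distinct hash codes, so the hash lookup sees no collisions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class: counts equals() calls made during a single lookup.
class LookupCost {
    static int equalsCalls = 0;

    // Key type that records every equals() invocation.
    static final class Name {
        final int id;

        Name(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Name && ((Name) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id); // distinct ids give distinct hashes
        }
    }

    // equals() calls needed to find the last of n names in an ArrayList.
    static int listCost(int n) {
        List<Name> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(new Name(i));
        }
        equalsCalls = 0; // count only the lookup itself
        list.contains(new Name(n - 1));
        return equalsCalls;
    }

    // equals() calls needed to find the same name in a HashSet.
    static int setCost(int n) {
        Set<Name> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new Name(i));
        }
        equalsCalls = 0; // count only the lookup itself
        set.contains(new Name(n - 1));
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("list comparisons: " + listCost(1000));
        System.out.println("set comparisons:  " + setCost(1000));
    }
}
```

The list has to compare the probe against every element until it reaches the match, while the set jumps to the right bucket and compares once. With colliding hash codes the set's count would grow, which is why the O(1) bound is an average-case statement.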

Some comments point out that a HashSet is implemented in terms of HashMap. That's fine, at that level of abstraction. At the level of abstraction of the task at hand, which is storing a collection of unique usernames, a set is the natural choice, more natural than a map.

151

answered Sep 24 '22 09:09

janos

This is basically how HashSet is implemented, so I guess you can say it's a good approach. You might as well use HashSet instead of your HashMap with empty values.

For example, HashSet's implementation of add is:

    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

where map is the backing HashMap and PRESENT is a dummy value.
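The same idea can be sketched in a few lines. This is a simplified illustration, not the JDK source: the class name `MapBackedSet` is hypothetical, and only `add`, `contains`, `remove` and `size` are shown.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified illustration of a set built on a map; NOT the JDK's HashSet source.
class MapBackedSet<E> {
    // Shared dummy value stored for every key, playing the role of PRESENT.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    // put returns the previous value, which is null only if the key was absent.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    // remove returns the removed value, i.e. PRESENT if the key was there.
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> users = new MapBackedSet<>();
        System.out.println(users.add("johndoe"));      // true
        System.out.println(users.add("johndoe"));      // false
        System.out.println(users.contains("johndoe")); // true
        System.out.println(users.remove("johndoe"));   // true
        System.out.println(users.size());              // 0
    }
}
```

The dummy value stays hidden inside the wrapper, so callers only ever deal with elements. That is what you get for free by using HashSet rather than managing the empty values yourself.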

> My worry is, I haven't seen anyone else do this after a search. I may be missing an obvious issue somewhere but I can't see it.

As I mentioned, the developers of the JDK are using this same approach.

answered Sep 24 '22 09:09

Eran
