
Should I use a C# Dictionary if I only need fast lookup of keys, and values are irrelevant?

I need a data type that lets me insert entries and then quickly determine whether an entry has already been inserted. A Dictionary seems to suit this need (see example). However, I have no use for the dictionary's values. Should I still use a dictionary, or is there a better-suited data type?

    public class Foo
    {
        private Dictionary<string, bool> Entities;

        ...

        public void AddEntity(string bar)
        {
            if (!Entities.ContainsKey(bar))
            {
                // bool value true here has no use and is just a placeholder
                Entities.Add(bar, true);
            }
        }

        public string[] GetEntities()
        {
            return Entities.Keys.ToArray();
        }
    }
reformed asked Mar 03 '17 17:03

2 Answers

You can use HashSet<T>.

The HashSet<T> class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order.
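Applied to the class in the question, that could look roughly like the sketch below. It assumes the part of Foo elided as "..." in the question does not depend on the dictionary's values. The field initializer and the HasEntity helper are additions for illustration, not part of the original code. HashSet<T>.Add simply returns false when the item is already present, so the ContainsKey check is no longer needed.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    // Assumption: the question does not show how Entities is created,
    // so it is initialized inline here.
    private readonly HashSet<string> Entities = new HashSet<string>();

    public void AddEntity(string bar)
    {
        // Add is a no-op (and returns false) if bar is already in the set.
        Entities.Add(bar);
    }

    // Hypothetical helper, not in the question: the fast lookup itself.
    public bool HasEntity(string bar)
    {
        return Entities.Contains(bar);
    }

    public string[] GetEntities()
    {
        return Entities.ToArray();
    }
}
```

Calling AddEntity("a") twice and AddEntity("b") once leaves two entries in the set.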

Habib answered Oct 12 '22 04:10


Habib's answer is excellent, but HashSet<T> is not thread-safe, so in a multi-threaded environment you have to use locks to protect access to it. I find myself more prone to creating deadlocks with lock statements. Also, locks yield a worse speedup per Amdahl's law: code inside a lock runs serially, so every lock statement reduces the fraction of your code that actually runs in parallel.
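For reference, the lock-protected approach described above might look something like this. It is only a minimal sketch: the class name LockedEntitySet and the private _gate object are made up for illustration. Every operation takes the same lock, so all threads queue up behind one another, which is the serialization cost mentioned above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper: a HashSet<string> guarded by a single lock.
public class LockedEntitySet
{
    private readonly HashSet<string> _entities = new HashSet<string>();
    private readonly object _gate = new object();

    // Returns true if bar was newly added, false if it was already present.
    public bool Add(string bar)
    {
        lock (_gate) { return _entities.Add(bar); }
    }

    public bool Contains(string bar)
    {
        lock (_gate) { return _entities.Contains(bar); }
    }

    // Copies the contents while holding the lock so the snapshot is consistent.
    public string[] ToArray()
    {
        lock (_gate) { return _entities.ToArray(); }
    }
}
```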

For those reasons, a ConcurrentDictionary<T,object> fits the bill in multi-threaded environments. If you end up using one, then wrap it like you did in your question. Just new up objects to toss in as values as needed, since the values won't be important. You can verify that there are no lock statements in its source code.
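A wrapper along those lines might look like the sketch below. ConcurrentEntitySet is a hypothetical name, and I've used byte as the throwaway value type instead of the object values suggested above, purely to avoid allocating placeholder objects; either works since the values are never read. TryAdd returns false if the key is already there, so there is no separate "check then add" step that could race.

```csharp
using System.Collections.Concurrent;
using System.Linq;

// Hypothetical wrapper that exposes a ConcurrentDictionary as a set of strings.
public class ConcurrentEntitySet
{
    // The byte value is a placeholder and is never read.
    private readonly ConcurrentDictionary<string, byte> _entities =
        new ConcurrentDictionary<string, byte>();

    // Returns true if bar was newly added, false if it was already present.
    public bool Add(string bar)
    {
        return _entities.TryAdd(bar, 0);
    }

    public bool Contains(string bar)
    {
        return _entities.ContainsKey(bar);
    }

    public string[] ToArray()
    {
        return _entities.Keys.ToArray();
    }
}
```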

If you didn't need mutability of the collection then this would be moot. But your question implies that you do need it, since you have an AddEntity method.

Additional info 2017-05-19 - actually, ConcurrentDictionary does use locks internally, although not lock statements per se--it uses Monitor.Enter (check out the TryAddInternal method). However, it seems to lock on individual buckets within the dictionary, which means there will be less contention than putting the entire thing in a lock statement.
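To illustrate why finer-grained locking means less contention, here is a rough sketch of a set that splits its items across several independently locked partitions ("lock striping"). This is not ConcurrentDictionary's actual code; StripedSet, StripeFor and the default of 16 stripes are all made up for the example. Two threads only block each other when their items happen to land in the same stripe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of lock striping; not how any BCL type is written.
public class StripedSet<T>
{
    private readonly HashSet<T>[] _stripes;
    private readonly object[] _locks;

    public StripedSet(int stripeCount = 16)
    {
        if (stripeCount < 1) throw new ArgumentOutOfRangeException(nameof(stripeCount));
        _stripes = new HashSet<T>[stripeCount];
        _locks = new object[stripeCount];
        for (int i = 0; i < stripeCount; i++)
        {
            _stripes[i] = new HashSet<T>();
            _locks[i] = new object();
        }
    }

    // Maps an item to a stripe index using its hash code (sign bit masked off).
    private int StripeFor(T item)
    {
        int hash = item == null ? 0 : item.GetHashCode();
        return (hash & 0x7FFFFFFF) % _stripes.Length;
    }

    // Returns true if item was newly added; only that item's stripe is locked.
    public bool Add(T item)
    {
        int i = StripeFor(item);
        lock (_locks[i]) { return _stripes[i].Add(item); }
    }

    public bool Contains(T item)
    {
        int i = StripeFor(item);
        lock (_locks[i]) { return _stripes[i].Contains(item); }
    }
}
```

The trade-off is that operations spanning the whole set (a count or a consistent snapshot) would need to take every stripe's lock, which is why the sketch leaves them out.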

So all in all, ConcurrentDictionary is often better for multithreaded environments.

It's actually quite difficult (impossible?) to make a concurrent hash set using only the Interlocked methods. I tried on my own and kept running into the problem of needing to alter two things at the same time--something that only locking can do in general. One workaround I found was to use singly-linked lists for the hash buckets and intentionally create cycles in a list when one thread needed to operate on a node without interference from other threads; this would cause other threads to get caught spinning around in the same spot until that thread was done with its node and undid the cycle. Sure, it technically didn't use locks, but it did not scale well.

Matt Thomas answered Oct 12 '22 05:10
