c# - para - ¿Qué es más eficiente: Diccionario TryGetValue o ContainsKey+Item?

new dictionary c# (10)

De la entrada de MSDN en el Método Dictionary.TryGetValue :

Este método combina la funcionalidad del método ContainsKey y la propiedad Item.
Si no se encuentra la clave, el parámetro de valor obtiene el valor predeterminado apropiado para el tipo de valor TValue; por ejemplo, 0 (cero) para tipos enteros, falso para tipos booleanos y nulo para tipos de referencia.
Use el método TryGetValue si su código intenta con frecuencia acceder a claves que no están en el diccionario. Usar este método es más eficiente que capturar la excepción KeyNotFoundException lanzada por la propiedad Item.
Este método se acerca a una operación O (1).

A partir de la descripción, no queda claro si es más eficiente o simplemente más conveniente que llamar a ContainsKey y luego realizar la búsqueda. ¿La implementación de TryGetValue simplemente llama a ContainsKey y luego a Item o en realidad es más eficiente que hacer una sola búsqueda?

En otras palabras, lo que es más eficiente (es decir, cuál realiza menos búsquedas):

Dictionary<int,int> dict; //...// int ival; if(dict.ContainsKey(ikey)) { ival = dict[ikey]; } else { ival = default(int); }

Dictionary<int,int> dict; //...// int ival; dict.TryGetValue(ikey, out ival);

Nota: ¡No estoy buscando un punto de referencia!

¿Por qué no lo pruebas?

Pero estoy bastante seguro de que TryGetValue es más rápido, porque solo hace una búsqueda. Por supuesto, esto no está garantizado, es decir, diferentes implementaciones pueden tener diferentes características de rendimiento.

La forma en que implementaría un diccionario es mediante la creación de una función de Find interna que encuentre la ranura para un elemento, y luego construya el resto encima de eso.

A quien le importa :-)

Probablemente lo esté preguntando porque es un dolor usar TryGetValue que TryGetValue así con un método de extensión.

public static class CollectionUtils { // my original method // public static V GetValueOrDefault<K, V>(this Dictionary<K, V> dic, K key) // { // V ret; // bool found = dic.TryGetValue(key, out ret); // if (found) // { // return ret; // } // return default(V); // } // EDIT: one of many possible improved versions public static TValue GetValueOrDefault<K, V>(this IDictionary<K, V> dictionary, K key) { // initialized to default value (such as 0 or null depending upon type of TValue) TValue value; // attempt to get the value of the key from the dictionary dictionary.TryGetValue(key, out value); return value; }

Entonces solo llama

dict.GetValueOrDefault("keyname")

(dict.GetValueOrDefault("keyname") ?? fallbackValue)

Además de diseñar un microbenchmark que dará resultados precisos en un entorno práctico, puede inspeccionar la fuente de referencia de .NET Framework.

Todos ellos llaman al FindEntry(TKey) que realiza la mayor parte del trabajo y no memoriza su resultado, por lo que llamar a TryGetValue es casi el doble de rápido que ContainsKey + Item .

La interfaz inconveniente de TryGetValue se puede adaptar utilizando un método de extensión :

using System.Collections.Generic; namespace Project.Common.Extensions { public static class DictionaryExtensions { public static TValue GetValueOrDefault<TKey, TValue>( this IDictionary<TKey, TValue> dictionary, TKey key, TValue defaultValue = default(TValue)) { if (dictionary.TryGetValue(key, out TValue value)) { return value; } return defaultValue; } } }

^{Desde C # 7.1, puede reemplazar el default(TValue) con el default simple.} ^{El tipo es inferido.}

Uso:

var dict = new Dictionary<string, string>(); string val = dict.GetValueOrDefault("theKey", "value used if theKey is not found in dict");

Devuelve null para los tipos de referencia cuya búsqueda falla, a menos que se especifique un valor predeterminado explícito.

var dictObj = new Dictionary<string, object>(); object valObj = dictObj.GetValueOrDefault("nonexistent"); Debug.Assert(valObj == null); val dictInt = new Dictionary<string, int>(); int valInt = dictInt.GetValueOrDefault("nonexistent"); Debug.Assert(valInt == 0);

Al hacer un programa de prueba rápido, definitivamente hay una mejora al usar TryGetValue con 1 millón de artículos en un diccionario.

Resultados:

Artículo de ContainsKey + para 1000000 hits: 45ms

TryGetValue por 1000000 hits: 26ms

Aquí está la aplicación de prueba:

static void Main(string[] args) { const int size = 1000000; var dict = new Dictionary<int, string>(); for (int i = 0; i < size; i++) { dict.Add(i, i.ToString()); } var sw = new Stopwatch(); string result; sw.Start(); for (int i = 0; i < size; i++) { if (dict.ContainsKey(i)) result = dict[i]; } sw.Stop(); Console.WriteLine("ContainsKey + Item for {0} hits: {1}ms", size, sw.ElapsedMilliseconds); sw.Reset(); sw.Start(); for (int i = 0; i < size; i++) { dict.TryGetValue(i, out result); } sw.Stop(); Console.WriteLine("TryGetValue for {0} hits: {1}ms", size, sw.ElapsedMilliseconds); }

Como ninguna de las respuestas hasta ahora responde realmente a la pregunta, esta es una respuesta aceptable que encontré después de algunas investigaciones:

Si descompilas TryGetValue, verás que está haciendo esto:

public bool TryGetValue(TKey key, out TValue value) { int index = this.FindEntry(key); if (index >= 0) { value = this.entries[index].value; return true; } value = default(TValue); return false; }

mientras que el método ContainsKey es:

public bool ContainsKey(TKey key) { return (this.FindEntry(key) >= 0); }

por lo tanto, TryGetValue es solo ContainsKey más una búsqueda de matriz si el elemento está presente.

Source

Parece que TryGetValue será casi el doble de rápido que la combinación ContainsKey + Item.

En mi máquina, con un montón de RAM, cuando se ejecuta en modo LIBERAR (no DEBUG), ContainsKey es igual a TryGetValue / try-catch si se encuentran todas las entradas en el Dictionary<> .

ContainsKey supera a todos con diferencia cuando hay solo unas pocas entradas de diccionario que no se encuentran (en mi ejemplo a continuación, establezca MAXVAL en algo más grande que ENTRIES para que se MAXVAL algunas entradas):

Resultados:

Finished evaluation .... Time distribution: Size: 000010: TryGetValue: 53,24%, ContainsKey: 1,74%, try-catch: 45,01% - Total: 2.006,00 Size: 000020: TryGetValue: 37,66%, ContainsKey: 0,53%, try-catch: 61,81% - Total: 2.443,00 Size: 000040: TryGetValue: 22,02%, ContainsKey: 0,73%, try-catch: 77,25% - Total: 7.147,00 Size: 000080: TryGetValue: 31,46%, ContainsKey: 0,42%, try-catch: 68,12% - Total: 17.793,00 Size: 000160: TryGetValue: 33,66%, ContainsKey: 0,37%, try-catch: 65,97% - Total: 36.840,00 Size: 000320: TryGetValue: 34,53%, ContainsKey: 0,39%, try-catch: 65,09% - Total: 71.059,00 Size: 000640: TryGetValue: 32,91%, ContainsKey: 0,32%, try-catch: 66,77% - Total: 141.789,00 Size: 001280: TryGetValue: 39,02%, ContainsKey: 0,35%, try-catch: 60,64% - Total: 244.657,00 Size: 002560: TryGetValue: 35,48%, ContainsKey: 0,19%, try-catch: 64,33% - Total: 420.121,00 Size: 005120: TryGetValue: 43,41%, ContainsKey: 0,24%, try-catch: 56,34% - Total: 625.969,00 Size: 010240: TryGetValue: 29,64%, ContainsKey: 0,61%, try-catch: 69,75% - Total: 1.197.242,00 Size: 020480: TryGetValue: 35,14%, ContainsKey: 0,53%, try-catch: 64,33% - Total: 2.405.821,00 Size: 040960: TryGetValue: 37,28%, ContainsKey: 0,24%, try-catch: 62,48% - Total: 4.200.839,00 Size: 081920: TryGetValue: 29,68%, ContainsKey: 0,54%, try-catch: 69,77% - Total: 8.980.230,00

Aquí está mi código:

using System; using System.Collections.Generic; using System.Diagnostics; namespace ConsoleApplication1 { class Program { static void Main(string[] args) { const int ENTRIES = 10000, MAXVAL = 15000, TRIALS = 100000, MULTIPLIER = 2; Dictionary<int, int> values = new Dictionary<int, int>(); Random r = new Random(); int[] lookups = new int[TRIALS]; int val; List<Tuple<long, long, long>> durations = new List<Tuple<long, long, long>>(8); for (int i = 0;i < ENTRIES;++i) try { values.Add(r.Next(MAXVAL), r.Next()); } catch { --i; } for (int i = 0;i < TRIALS;++i) lookups[i] = r.Next(MAXVAL); Stopwatch sw = new Stopwatch(); ConsoleColor bu = Console.ForegroundColor; for (int size = 10;size <= TRIALS;size *= MULTIPLIER) { long a, b, c; Console.ForegroundColor = ConsoleColor.Yellow; Console.WriteLine("Loop size: {0}", size); Console.ForegroundColor = bu; // --------------------------------------------------------------------- sw.Start(); for (int i = 0;i < size;++i) values.TryGetValue(lookups[i], out val); sw.Stop(); Console.WriteLine("TryGetValue: {0}", a = sw.ElapsedTicks); // --------------------------------------------------------------------- sw.Restart(); for (int i = 0;i < size;++i) val = values.ContainsKey(lookups[i]) ? values[lookups[i]] : default(int); sw.Stop(); Console.WriteLine("ContainsKey: {0}", b = sw.ElapsedTicks); // --------------------------------------------------------------------- sw.Restart(); for (int i = 0;i < size;++i) try { val = values[lookups[i]]; } catch { } sw.Stop(); Console.WriteLine("try-catch: {0}", c = sw.ElapsedTicks); // --------------------------------------------------------------------- Console.WriteLine(); durations.Add(new Tuple<long, long, long>(a, b, c)); } Console.ForegroundColor = ConsoleColor.Yellow; Console.WriteLine("Finished evaluation .... Time distribution:"); Console.ForegroundColor = bu; val = 10; foreach (Tuple<long, long, long> d in durations) { long sum = d.Item1 + d.Item2 + d.Item3; Console.WriteLine("Size: {0:D6}:", val); Console.WriteLine("TryGetValue: {0:P2}, ContainsKey: {1:P2}, try-catch: {2:P2} - Total: {3:N}", (decimal)d.Item1 / sum, (decimal)d.Item2 / sum, (decimal)d.Item3 / sum, sum); val *= MULTIPLIER; } Console.WriteLine(); } } }

Si está intentando sacar el valor del diccionario, TryGetValue (clave, valor de salida) es la mejor opción, pero si está comprobando la presencia de la clave, para una nueva inserción, sin sobrescribir las claves antiguas, y solo con ese alcance, ContainsKey (clave) es la mejor opción, el punto de referencia puede confirmar esto:

using System; using System.Threading; using System.Diagnostics; using System.Collections.Generic; using System.Collections; namespace benchmark { class Program { public static Random m_Rand = new Random(); public static Dictionary<int, int> testdict = new Dictionary<int, int>(); public static Hashtable testhash = new Hashtable(); public static void Main(string[] args) { Console.WriteLine("Adding elements into hashtable..."); Stopwatch watch = Stopwatch.StartNew(); for(int i=0; i<1000000; i++) testhash[i]=m_Rand.Next(); watch.Stop(); Console.WriteLine("Done in {0:F4} -- pause....", watch.Elapsed.TotalSeconds); Thread.Sleep(4000); Console.WriteLine("Adding elements into dictionary..."); watch = Stopwatch.StartNew(); for(int i=0; i<1000000; i++) testdict[i]=m_Rand.Next(); watch.Stop(); Console.WriteLine("Done in {0:F4} -- pause....", watch.Elapsed.TotalSeconds); Thread.Sleep(4000); Console.WriteLine("Finding the first free number for insertion"); Console.WriteLine("First method: ContainsKey"); watch = Stopwatch.StartNew(); int intero=0; while (testdict.ContainsKey(intero)) { intero++; } testdict.Add(intero, m_Rand.Next()); watch.Stop(); Console.WriteLine("Done in {0:F4} -- added value {1} in dictionary -- pause....", watch.Elapsed.TotalSeconds, intero); Thread.Sleep(4000); Console.WriteLine("Second method: TryGetValue"); watch = Stopwatch.StartNew(); intero=0; int result=0; while(testdict.TryGetValue(intero, out result)) { intero++; } testdict.Add(intero, m_Rand.Next()); watch.Stop(); Console.WriteLine("Done in {0:F4} -- added value {1} in dictionary -- pause....", watch.Elapsed.TotalSeconds, intero); Thread.Sleep(4000); Console.WriteLine("Test hashtable"); watch = Stopwatch.StartNew(); intero=0; while(testhash.Contains(intero)) { intero++; } testhash.Add(intero, m_Rand.Next()); watch.Stop(); Console.WriteLine("Done in {0:F4} -- added value {1} into hashtable -- pause....", watch.Elapsed.TotalSeconds, intero); Console.Write("Press any key to continue . . . "); Console.ReadKey(true); } } }

Este es un verdadero ejemplo. Tengo un servicio que, por cada "Artículo" creado, asocia un número progresivo, este número, cada vez que crea un nuevo artículo, debe encontrarse libre, si elimina un Artículo, el número gratuito se convierte en gratis, por supuesto, esto no está optimizado, ya que tengo una var estática que almacena en caché el número actual, pero en caso de que termine todos los números, puede volver a comenzar de 0 a UInt32.MaxValue

Prueba ejecutada:
Añadiendo elementos a la tabla hash ...
Hecho en 0,5908 - pausa ....
Añadiendo elementos al diccionario ...
Hecho en 0,2679 - pausa ....
Encontrar el primer número gratuito para la inserción.
Primer método: ContainsKey
Hecho en 0,0561 - valor añadido 1000000 en el diccionario - pausa ....
Segundo método: TryGetValue
Hecho en 0,0643 - valor añadido 1000001 en el diccionario - pausa ....
Hashtable de prueba
Hecho en 0,3015 - valor agregado 1000000 en tabla hash - pausa ....
Pulse cualquier tecla para continuar . .

Si algunos de ustedes preguntan si ContainsKeys podría tener una ventaja, incluso he intentado invertir la clave TryGetValue with Contains, el resultado es el mismo.

Entonces, para mí, con una consideración final, todo depende de cómo se comporte el programa.

Todas las respuestas hasta el momento, aunque buenas, pierden un punto vital.

Los métodos en las clases de una API (por ejemplo, el marco .NET) forman parte de una definición de interfaz (no una interfaz C # o VB, sino una interfaz en el significado de informática).

Como tal, generalmente es incorrecto preguntar si llamar a un método de este tipo es más rápido, a menos que la velocidad sea parte de la definición de la interfaz formal (que no lo es en este caso).

Tradicionalmente, este tipo de acceso directo (que combina la búsqueda y la recuperación) es más eficiente independientemente del idioma, la infraestructura, el sistema operativo, la plataforma o la arquitectura de la máquina. También es más legible, porque expresa su intento explícitamente, en lugar de implicarlo (desde la estructura de su código).

Así que la respuesta (de un antiguo truco) es definitivamente ''Sí'' (TryGetValue es preferible a una combinación de ContainsKey y Item [Get] para recuperar un valor de un Diccionario).

Si crees que esto suena extraño, piénsalo así: incluso si las implementaciones actuales de TryGetValue, ContainsKey y Item [Get] no producen ninguna diferencia de velocidad, puedes asumir que es probable que una implementación futura (por ejemplo, .NET v5) lo hará (TryGetValue será más rápido). Piense en la vida útil de su software.

Además, es interesante observar que las tecnologías de definición de interfaz modernas típicas aún rara vez proporcionan algún medio para definir formalmente las restricciones de tiempo. Tal vez .NET v5?

Un punto de referencia rápido muestra que TryGetValue tiene una ligera ventaja:

static void Main() { var d = new Dictionary<string, string> {{"a", "b"}}; var start = DateTime.Now; for (int i = 0; i != 10000000; i++) { string x; if (!d.TryGetValue("a", out x)) throw new ApplicationException("Oops"); if (d.TryGetValue("b", out x)) throw new ApplicationException("Oops"); } Console.WriteLine(DateTime.Now-start); start = DateTime.Now; for (int i = 0; i != 10000000; i++) { string x; if (d.ContainsKey("a")) { x = d["a"]; } else { x = default(string); } if (d.ContainsKey("b")) { x = d["b"]; } else { x = default(string); } } }

Esto produce

00:00:00.7600000 00:00:01.0610000

haciendo que el objeto ContainsKey + Item un 40% más lento, suponiendo una combinación uniforme de éxitos y fallos.

Además, cuando cambio el programa para siempre fallar (es decir, buscar siempre "b" ), las dos versiones se vuelven igualmente rápidas:

00:00:00.2850000 00:00:00.2720000

Cuando lo hago "todos los hits", sin embargo, TryGetValue sigue siendo un claro ganador:

00:00:00.4930000 00:00:00.8110000

TryGetValue será más rápido.

ContainsKey utiliza la misma comprobación que TryGetValue , que se refiere internamente a la ubicación de entrada real. La propiedad Item tiene en realidad una funcionalidad de código casi idéntica a la de TryGetValue , excepto que lanzará una excepción en lugar de devolver false.

El uso de ContainsKey seguido del elemento básicamente duplica la funcionalidad de búsqueda, que es la mayor parte del cálculo en este caso.