
How to count duplicate elements in a Ruby array

I have a sorted array:

[
  'FATAL <error title="Request timed out.">',
  'FATAL <error title="Request timed out.">',
  'FATAL <error title="There is insufficient system memory to run this query.">'
]

I would like to get something like this, although it does not have to be a hash:

[
  {:error => 'FATAL <error title="Request timed out.">', :count => 2},
  {:error => 'FATAL <error title="There is insufficient system memory to run this query.">', :count => 1}
]

How about the following?

things = [1, 2, 2, 3, 3, 3, 4]
things.uniq.map { |t| [t, things.count(t)] }.to_h

Somehow it feels cleaner and more descriptive of what we are actually trying to do.

I suspect it would also perform better on large collections than the answers that iterate over every value.

Benchmark results:

a = (1...1000000).map { rand(100) }

                       user     system      total        real
inject             7.670000   0.010000   7.680000 (  7.985289)
array count        0.040000   0.000000   0.040000 (  0.036650)
each_with_object   0.210000   0.000000   0.210000 (  0.214731)
group_by           0.220000   0.000000   0.220000 (  0.218581)

So it is considerably faster.
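
The answer does not show the benchmarked code, so the following is only a sketch of how such a comparison could be set up with the standard `Benchmark` module. The method names, the smaller sample size, and the exact form of the "inject" variant are assumptions, not the original script.

```ruby
require 'benchmark'

# Smaller than the 1,000,000 elements quoted above so the sketch finishes quickly.
SAMPLE = Array.new(20_000) { rand(100) }

# Assumed form of the "inject" variant: one select pass and one merge per distinct value.
def count_with_inject(list)
  list.uniq.inject({}) { |acc, x| acc.merge(x => list.select { |y| y == x }.size) }
end

# One Array#count pass per distinct value.
def count_with_array_count(list)
  list.uniq.map { |x| [x, list.count(x)] }.to_h
end

# Single pass, incrementing a counter hash that defaults to 0.
def count_with_each_with_object(list)
  list.each_with_object(Hash.new(0)) { |x, counts| counts[x] += 1 }
end

# Single pass to group equal values, then take the size of each group.
def count_with_group_by(list)
  list.group_by(&:itself).map { |k, v| [k, v.size] }.to_h
end

Benchmark.bm(18) do |bm|
  bm.report('inject')           { count_with_inject(SAMPLE) }
  bm.report('array count')      { count_with_array_count(SAMPLE) }
  bm.report('each_with_object') { count_with_each_with_object(SAMPLE) }
  bm.report('group_by')         { count_with_group_by(SAMPLE) }
end
```

Note that the `uniq` plus `count` variant makes one full pass over the array per distinct value. It does well here because the sample has only 100 distinct values; with many distinct values the single-pass variants would likely come out ahead.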

Here is the sample array:

a=["aa","bb","cc","bb","bb","cc"]

Select all the unique keys.
For each key, accumulate its matching elements in a hash to get something like this: {'bb' => ['bb', 'bb']}

res = a.uniq.inject({}) { |accu, uni| accu.merge({ uni => a.select { |i| i == uni } }) }
# => {"aa"=>["aa"], "bb"=>["bb", "bb", "bb"], "cc"=>["cc", "cc"]}

Now you can do things like:

res['aa'].size

The following code prints what you asked for. I will let you decide how to actually use it to build the hash you are looking for:

# sample array
a = ["aa", "bb", "cc", "bb", "bb", "cc"]

# make the hash default to 0 so that += will work correctly
b = Hash.new(0)

# iterate over the array, counting duplicate entries
a.each do |v|
  b[v] += 1
end

b.each do |k, v|
  puts "#{k} appears #{v} times"
end

Note: I noticed that you said the array is already sorted. The code above does not require sorted input. Exploiting that property may produce faster code.
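
As one possible way to exploit the sorted property mentioned in the note, equal elements can be counted as consecutive runs. This is a sketch assuming Ruby 2.3 or later (for `Enumerable#chunk_while`); the method name `count_sorted_runs` is made up for illustration.

```ruby
# Counts runs of adjacent equal elements.
# Assumes equal elements sit next to each other, which holds for a sorted array.
def count_sorted_runs(sorted)
  sorted
    .chunk_while { |left, right| left == right }  # split into runs of equal values
    .map { |run| { :error => run.first, :count => run.size } }
end

errors = [
  'FATAL <error title="Request timed out.">',
  'FATAL <error title="Request timed out.">',
  'FATAL <error title="There is insufficient system memory to run this query.">'
]

count_sorted_runs(errors).each do |entry|
  puts "#{entry[:count]}: #{entry[:error]}"
end
```

If the input is not sorted, the same value can show up in more than one run, so this only gives correct totals when equal elements are adjacent.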

Simple implementation:

(errors_hash = {}).default = 0
array_of_errors.each { |error| errors_hash[error] += 1 }

Personally, I would do it this way:

# myprogram.rb
a = [
  'FATAL <error title="Request timed out.">',
  'FATAL <error title="Request timed out.">',
  'FATAL <error title="There is insufficient system memory to run this query.">'
]
puts a

Then run the program and pipe its output to uniq -c. Because uniq only collapses adjacent identical lines, this relies on the array already being sorted:

ruby myprogram.rb | uniq -c

Output:

2 FATAL <error title="Request timed out.">
1 FATAL <error title="There is insufficient system memory to run this query.">

You can do this very succinctly (in one line) using inject:

a = [
  'FATAL <error title="Request timed out.">',
  'FATAL <error title="Request timed out.">',
  'FATAL <error title="There is insufficient ...">'
]
b = a.inject(Hash.new(0)) { |h, i| h[i] += 1; h }
b.to_a.each { |error, count| puts "#{count}: #{error}" }

It will produce the following (the line order may differ: hashes preserve insertion order from Ruby 1.9 onward, so there the "2:" line comes first):

1: FATAL <error title="There is insufficient ...">
2: FATAL <error title="Request timed out.">

If you have an array like this:

words = ["aa","bb","cc","bb","bb","cc"]

and you need to count its duplicate elements, a one-line solution is:

result = words.each_with_object(Hash.new(0)) { |word,counts| counts[word] += 1 }
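
The one-liner above yields a hash of counts. If the array-of-hashes shape from the question is wanted instead, one extra `map` is enough. A minimal sketch, where the variable name `report` is only illustrative:

```ruby
words = ["aa", "bb", "cc", "bb", "bb", "cc"]

# Same counting step as above: a hash that defaults to 0 collects the totals.
result = words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }

# Reshape each key/count pair into the {:error => ..., :count => ...} form from the question.
report = result.map { |word, count| { :error => word, :count => count } }

report.each { |entry| puts "#{entry[:error]} appears #{entry[:count]} times" }
```

On Ruby 1.9 and later the entries come out in order of first appearance, because hashes preserve insertion order.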

A different approach from the previous answers, using Enumerable#group_by.

[1, 2, 2, 3, 3, 3, 4].group_by(&:itself).map { |k, v| [k, v.count] }.to_h
# {1=>1, 2=>2, 3=>3, 4=>1}

Breaking that down into its individual method calls:

a = [1, 2, 2, 3, 3, 3, 4]
a = a.group_by(&:itself)          # {1=>[1], 2=>[2, 2], 3=>[3, 3, 3], 4=>[4]}
a = a.map { |k, v| [k, v.count] } # [[1, 1], [2, 2], [3, 3], [4, 1]]
a = a.to_h                        # {1=>1, 2=>2, 3=>3, 4=>1}

Enumerable#group_by was added in Ruby 1.8.7. Object#itself, used here, requires Ruby 2.2 or later.
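
On newer Ruby versions the same result can be written more compactly. This sketch assumes Ruby 2.4 or later for `Hash#transform_values` and Ruby 2.7 or later for `Enumerable#tally`; neither appears in the answers above, and the variable names are illustrative.

```ruby
numbers = [1, 2, 2, 3, 3, 3, 4]

# Ruby 2.4+: group equal values, then replace each group with its size.
by_group = numbers.group_by(&:itself).transform_values(&:size)

# Ruby 2.7+: tally counts occurrences directly.
by_tally = numbers.tally

puts by_group == by_tally
```

Both expressions return a hash mapping each distinct element to its number of occurrences.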

a = [1, 1, 1, 2, 2, 3]
a.uniq.inject([]) { |r, i| r << { :error => i, :count => a.select { |b| b == i }.size } }
# => [{:count=>3, :error=>1}, {:count=>2, :error=>2}, {:count=>1, :error=>3}]