Counter is not working in reducer code

I am working on a big Hadoop project, and there is a small KPI where I have to write only the top 10 values to the reducer output. To meet this requirement, I used a counter and break out of the loop when the counter equals 11, but the reducer still writes all the values to HDFS.

This is pretty simple Java code, but I am stuck :(

For testing, I created a standalone class (a Java application) that does the same thing, and it works there; I wonder why it does not work in the reducer code.

Could someone please help and point out what I am missing?

MAP - REDUCE CODE

    package comparableTest;

    import java.io.IOException;
    import java.nio.ByteBuffer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.IntWritable.Comparator;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparator;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.Mapper.Context;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
    import org.apache.hadoop.util.GenericOptionsParser;

    public class ValueSortExp2 {

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration(true);
            String arguments[] = new GenericOptionsParser(conf, args).getRemainingArgs();

            Job job = new Job(conf, "Test commond");
            job.setJarByClass(ValueSortExp2.class);

            // Setup MapReduce
            job.setMapperClass(MapTask2.class);
            job.setReducerClass(ReduceTask2.class);
            job.setNumReduceTasks(1);

            // Specify key / value
            job.setMapOutputKeyClass(IntWritable.class);
            job.setMapOutputValueClass(Text.class);
            job.setOutputKeyClass(IntWritable.class);
            job.setOutputValueClass(Text.class);
            job.setSortComparatorClass(IntComparator2.class);

            // Input
            FileInputFormat.addInputPath(job, new Path(arguments[0]));
            job.setInputFormatClass(TextInputFormat.class);

            // Output
            FileOutputFormat.setOutputPath(job, new Path(arguments[1]));
            job.setOutputFormatClass(TextOutputFormat.class);

            int code = job.waitForCompletion(true) ? 0 : 1;
            System.exit(code);
        }

        public static class IntComparator2 extends WritableComparator {

            public IntComparator2() {
                super(IntWritable.class);
            }

            @Override
            public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
                Integer v1 = ByteBuffer.wrap(b1, s1, l1).getInt();
                Integer v2 = ByteBuffer.wrap(b2, s2, l2).getInt();
                return v1.compareTo(v2) * (-1);
            }
        }

        public static class MapTask2 extends Mapper<LongWritable, Text, IntWritable, Text> {

            public void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String tokens[] = value.toString().split("\\t");
                // int empId = Integer.parseInt(tokens[0]) ;
                int count = Integer.parseInt(tokens[2]);
                context.write(new IntWritable(count), new Text(value));
            }
        }

        public static class ReduceTask2 extends Reducer<IntWritable, Text, IntWritable, Text> {

            int cnt = 0;

            public void reduce(IntWritable key, Iterable<Text> list, Context context)
                    throws java.io.IOException, InterruptedException {
                for (Text value : list) {
                    cnt++;
                    if (cnt == 11) {
                        break;
                    }
                    context.write(new IntWritable(cnt), value);
                }
            }
        }
    }

SIMPLE JAVA CODE WORKS FINE

    package comparableTest;

    import java.io.IOException;
    import java.util.ArrayList;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer.Context;

    public class TestData {

        //static int cnt=0;

        public static void main(String args[]) throws IOException, InterruptedException {
            ArrayList<String> list = new ArrayList<String>() {{
                add("A");
                add("B");
                add("C");
                add("D");
            }};
            reduce(list);
        }

        public static void reduce(Iterable<String> list)
                throws java.io.IOException, InterruptedException {
            int cnt = 0;
            for (String value : list) {
                cnt++;
                if (cnt == 3) {
                    break;
                }
                System.out.println(value);
            }
        }
    }

Sample data: the header is only for information; the actual data starts on the second line

    ID    NAME                                            COUNT   (need to show the top 10, descending)
    1     Toy Story (1995)                                2077
    10    GoldenEye (1995)                                888
    100   City Hall (1996)                                128
    1000  Curdled (1996)                                  20
    1001  Associate, The (L'Associe) (1982)               0
    1002  Ed's Next Move (1996)                           8
    1003  Extreme Measures (1996)                         121
    1004  Glimmer Man, The (1996)                         101
    1005  D3: The Mighty Ducks (1996)                     142
    1006  Chamber, The (1996)                             78
    1007  Apple Dumpling Gang, The (1975)                 232
    1008  Davy Crockett, King of the Wild Frontier (1955) 97
    1009  Escape to Witch Mountain (1975)                 291
    101   Bottle Rocket (1996)                            253
    1010  Love Bug, The (1969)                            242
    1011  Herbie Rides Again (1974)                       135
    1012  Old Yeller (1957)                               301
    1013  Parent Trap, The (1961)                         258
    1014  Pollyanna (1960)                                136
    1015  Homeward Bound: The Incredible Journey (1993)   234
    1016  Shaggy Dog, The (1959)                          156
    1017  Swiss Family Robinson (1960)                    276
    1018  That Darn Cat! (1965)                           123
    1019  20,000 Leagues Under the Sea (1954)             575
    102   Mr. Wrong (1996)                                60
    1020  Cool Runnings (1993)                            392
    1021  Angels in the Outfield (1994)                   247
    1022  Cinderella (1950)                               577
    1023  Winnie the Pooh and the Blustery Day (1968)     221
    1024  Three Caballeros, The (1945)                    126
    1025  Sword in the Stone, The (1963)                  293
    1026  So Dear to My Heart (1949)                      8
    1027  Robin Hood: Prince of Thieves (1991)            344
    1028  Mary Poppins (1964)                             1011
    1029  Dumbo (1941)                                    568
    103   Unforgettable (1996)                            33
    1030  Pete's Dragon (1977)                            323
    1031  Bedknobs and Broomsticks (1971)                 319

If you move int cnt=0; inside the reduce method (as the first statement of that method), you will get the first 10 values for each key (I guess this is what you want).

Otherwise, as it is now, your counter keeps increasing across keys, so you skip only the 11th value overall (plus any remaining values of that same key, regardless of which key it is) and continue with the 12th.
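The difference between the two counter placements can be reproduced without Hadoop. The sketch below is a plain-Java simulation, not Hadoop code: `CounterScopeDemo` and all of its methods are hypothetical names, and each `reduceWith...` call stands in for one call to `reduce` for one key. It assumes, as in Hadoop, that the same reducer object receives one `reduce` call per key, which is why a field keeps its value between calls.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation (no Hadoop): each reduceWith* call stands for one key's group.
class CounterScopeDemo {
    // Field counter: keeps its value between calls, like `int cnt=0;` in ReduceTask2.
    private int cnt = 0;
    private final List<String> written = new ArrayList<>();

    // Same loop as in the question: one counter shared by every key.
    void reduceWithFieldCounter(List<String> values) {
        for (String value : values) {
            cnt++;
            if (cnt == 11) {
                break; // ends this call only; the next call continues at cnt == 12
            }
            written.add(value);
        }
    }

    // Counter declared inside the method: it restarts at 0 for every key.
    void reduceWithLocalCounter(List<String> values) {
        int localCnt = 0;
        for (String value : values) {
            localCnt++;
            if (localCnt == 11) {
                break;
            }
            written.add(value);
        }
    }

    // Hypothetical helper: builds `keys` groups holding one value each (v1, v2, ...).
    static List<List<String>> singleValueGroups(int keys) {
        List<List<String>> groups = new ArrayList<>();
        for (int i = 1; i <= keys; i++) {
            List<String> group = new ArrayList<>();
            group.add("v" + i);
            groups.add(group);
        }
        return groups;
    }

    static List<String> runFieldCounter(List<List<String>> groups) {
        CounterScopeDemo demo = new CounterScopeDemo();
        for (List<String> group : groups) {
            demo.reduceWithFieldCounter(group);
        }
        return demo.written;
    }

    static List<String> runLocalCounter(List<List<String>> groups) {
        CounterScopeDemo demo = new CounterScopeDemo();
        for (List<String> group : groups) {
            demo.reduceWithLocalCounter(group);
        }
        return demo.written;
    }

    public static void main(String[] args) {
        List<List<String>> groups = singleValueGroups(15);
        System.out.println("field counter: " + runFieldCounter(groups));
        System.out.println("local counter: " + runLocalCounter(groups));
    }
}
```

With 15 single-value keys, the field-counter run writes 14 values (everything except `v11`), while the local-counter run writes all 15, because no single key has more than ten values. This also suggests why the standalone test in the question looked fine: it calls `reduce` only once, so the counter's scope never matters.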

If you want to print only 10 values in total (regardless of key), leave the cnt initialization where it is and change your condition to if (cnt > 10) ... However, this is not good practice, so you may need to reconsider your algorithm. (Assuming you do not want 10 arbitrary values, how do you know which key will be processed first in a distributed environment, when you have more than one reducer and a hash partitioner?)
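The `cnt > 10` variant can be sketched the same way. This is again a plain-Java simulation rather than Hadoop code, and `GlobalLimitDemo`, `reduce`, and `run` are hypothetical names. It only yields the ten largest counts under the assumptions already present in the question's job setup: a single reducer (`setNumReduceTasks(1)`) and keys arriving in descending order (the inverted `IntComparator2`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation (no Hadoop) of the `cnt > 10` variant.
class GlobalLimitDemo {
    // Stays a field on purpose: it counts values across all keys.
    private int cnt = 0;
    private final List<String> written = new ArrayList<>();

    // Stand-in for reduce(): called once per key, in sorted key order.
    void reduce(List<String> values) {
        for (String value : values) {
            cnt++;
            if (cnt > 10) {
                break; // true for the 11th value and for every later one
            }
            written.add(value);
        }
    }

    static List<String> run(List<List<String>> groups) {
        GlobalLimitDemo demo = new GlobalLimitDemo();
        for (List<String> group : groups) {
            demo.reduce(group);
        }
        return demo.written;
    }

    public static void main(String[] args) {
        List<List<String>> groups = new ArrayList<>();
        for (int i = 1; i <= 15; i++) {
            groups.add(Arrays.asList("v" + i));
        }
        System.out.println(run(groups)); // prints the first ten values, v1 through v10
    }
}
```

Unlike `cnt == 11`, the `cnt > 10` test stays true after the tenth value, so every later key is skipped as well. With more than one reducer, each reducer would apply the limit to its own partition, which is the concern raised in the parenthetical above.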