Data replication error in Hadoop

I am setting up a single-node Hadoop cluster on my machine following Michael Noll's tutorial, and I have run into a data replication error.

Here is the full error message:

> hadoop@laptop:~/hadoop$ bin/hadoop dfs -copyFromLocal tmp/testfiles testfiles
>
> 12/05/04 16:18:41 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/testfiles/testfiles/file1.txt could only be replicated to 0 nodes, instead of 1
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>     at org.apache.hadoop.ipc.Client.call(Client.java:740)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>     at $Proxy0.addBlock(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>     at $Proxy0.addBlock(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>
> 12/05/04 16:18:41 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
> 12/05/04 16:18:41 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hadoop/testfiles/testfiles/file1.txt" - Aborting...
> copyFromLocal: java.io.IOException: File /user/hadoop/testfiles/testfiles/file1.txt could only be replicated to 0 nodes, instead of 1
> 12/05/04 16:18:41 ERROR hdfs.DFSClient: Exception closing file /user/hadoop/testfiles/testfiles/file1.txt : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/testfiles/testfiles/file1.txt could only be replicated to 0 nodes, instead of 1
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/testfiles/testfiles/file1.txt could only be replicated to 0 nodes, instead of 1
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>     at org.apache.hadoop.ipc.Client.call(Client.java:740)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>     at $Proxy0.addBlock(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>     at $Proxy0.addBlock(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

Also, when I run:

bin/stop-all.sh

It says that the datanode has not been started and therefore cannot be stopped. However, the output of jps shows that the datanode is present.

I tried formatting the namenode and changing the owner permissions, but neither seems to work. I hope I have not left out any relevant information.

Thanks in advance.

Although this is already solved, I am adding this for future readers. Cody's advice to inspect the startup of the namenode and datanode was useful, and further investigation led me to delete the hadoop-store/dfs directory. Doing that resolved the error for me.

I removed the extra properties from hdfs-site.xml and the problem went away. Hadoop needs to improve its error messages. I had tried every one of the solutions above and none of them worked.

In my case I had to delete:

/tmp/hadoop-<user-name>

then format and start again using:

sbin/start-dfs.sh

sbin/start-yarn.sh
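The sequence described above could be sketched roughly as follows. The function only prints the commands rather than running them, since the first step deletes everything Hadoop keeps under /tmp. The function name `print_reset_plan`, the default install path `/usr/local/hadoop`, and the reading of "format" as `bin/hdfs namenode -format` (Hadoop 2.x layout) are assumptions, not details from the answer.

```shell
# Sketch only: prints the reset sequence instead of executing it.
# $1 = user name whose /tmp/hadoop-<user-name> directory would be removed
# $2 = Hadoop install directory (assumed default: /usr/local/hadoop)
print_reset_plan() (
  user="$1"
  hadoop_home="${2:-/usr/local/hadoop}"
  echo "rm -rf /tmp/hadoop-${user}"                 # wipes local HDFS state
  echo "${hadoop_home}/bin/hdfs namenode -format"   # assumed meaning of "format"
  echo "${hadoop_home}/sbin/start-dfs.sh"
  echo "${hadoop_home}/sbin/start-yarn.sh"
)
```

Calling `print_reset_plan "$(whoami)"` lists the four commands so they can be reviewed before being run by hand.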

In my case, I had configured the values of dfs.name.dir and dfs.data.dir incorrectly. The correct format is:

<property>
  <name>dfs.name.dir</name>
  <value>/path/to/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/path/to/data</value>
</property>

The solution that worked for me was to start the namenode and the datanode one at a time instead of together with bin/start-all.sh. With this approach, any error is clearly visible if there is a problem configuring the datanodes on the network. Many posts also suggest that the namenode needs some time to start, so it should be given a moment before the datanodes are started. In my case I also had a mismatch between the namenode and datanode IDs, and I had to change the datanode IDs to match the namenode's ID.

The step-by-step procedure is:

Start the namenode with bin/hadoop namenode . Check for errors, if any.
Start the datanodes with bin/hadoop datanode . Check for errors, if any.
Now start the task tracker and job tracker using bin/start-mapred.sh
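For the ID mismatch mentioned above, a quick comparison can be sketched like this. It assumes the IDs in question are the `namespaceID=` entries that Hadoop 1.x writes to `current/VERSION` inside each storage directory; the function names `namespace_id` and `same_namespace` are made up for illustration.

```shell
# Print the namespaceID recorded in a storage directory's VERSION file.
# $1 is a dfs.name.dir or dfs.data.dir path; fails if the file is missing.
namespace_id() (
  version_file="$1/current/VERSION"
  [ -f "$version_file" ] || exit 1
  sed -n 's/^namespaceID=//p' "$version_file"
)

# Succeed only when both directories report the same, non-empty ID.
# $1 = name directory, $2 = data directory
same_namespace() (
  name_id=$(namespace_id "$1") || exit 1
  data_id=$(namespace_id "$2") || exit 1
  [ -n "$name_id" ] && [ "$name_id" = "$data_id" ]
)
```

For example, `same_namespace /path/to/name /path/to/data || echo "namespaceID mismatch"` would flag the situation before the datanode is started.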

I ran into the same problem. When I looked at localhost:50070, every property in the cluster summary showed 0 except "DFS Used% 100". This usually happens because of mistakes in the three *-site.xml files under HADOOP_INSTALL/conf or in the hosts file.

In my case, the cause was that the hostname could not be resolved. I fixed the problem simply by adding "IP_Address hostname" to /etc/hosts .
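One way to check for the missing entry is a small helper along these lines. It is only a sketch: the name `host_in_hosts_file` is hypothetical, and it inspects the hosts file alone, not DNS or any other resolver source.

```shell
# Succeed if the hosts file ($2, default /etc/hosts) contains a
# non-comment line that lists host name $1 after an address.
host_in_hosts_file() (
  awk -v h="$1" '
    /^[[:space:]]*#/ { next }          # skip whole-line comments
    {
      for (i = 2; i <= NF; i++) {      # field 1 is the IP address
        if ($i ~ /^#/) break           # stop at a trailing comment
        if ($i == h) found = 1
      }
    }
    END { exit (found ? 0 : 1) }
  ' "${2:-/etc/hosts}"
)
```

Running `host_in_hosts_file "$(hostname)" || echo "add an entry to /etc/hosts"` shows whether the machine's own name is listed.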

Look at your namenode (probably http://localhost:50070 ) and see how many datanodes it says you have.

If it is 0, then either your datanode is not running or it is not configured to connect to the namenode.

If it is 1, check how much free space it says there is in the DFS. It may be that the datanode has nowhere to write data (the data directory does not exist, or it lacks write permissions).
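The same count can also be read from the command line with `hadoop dfsadmin -report`. The helper below is a rough sketch that assumes the Hadoop 1.x report format, with a line such as `Datanodes available: 1 (1 total, 0 dead)`; other versions word this line differently, and the function name `live_datanodes` is made up here.

```shell
# Read `hadoop dfsadmin -report` output on stdin and print the number of
# live datanodes. Assumes a line of the form:
#   Datanodes available: 1 (1 total, 0 dead)
live_datanodes() {
  sed -n 's/^Datanodes available: \([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

Typical use would be `bin/hadoop dfsadmin -report | live_datanodes`. A result of 0 corresponds to the first case above, and the report's capacity figures help with the second.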

I had the same problem. I looked at the datanode logs and found a warning saying that dfs.data.dir had incorrect permissions, so I changed them and everything worked, which is a bit odd.

Specifically, my "dfs.data.dir" was set to "/home/hadoop/hd_tmp", and the error I got was:

... ...
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /home/hadoop/hd_tmp/dfs/data, expected: rwxr-xr-x, while actual: rwxrwxr-x
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.
... ...

So I simply ran these commands:

Stopped all the daemons with "bin/stop-all.sh"
Changed the directory permissions with "chmod -R 755 /home/hadoop/hd_tmp"
Reformatted the namenode with "bin/hadoop namenode -format"
Restarted the daemons with "bin/start-all.sh"
And voilà, the datanode was up and running! (I verified it with the "jps" command, which showed a process named DataNode.)

And then everything worked fine.
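To catch this kind of permission mismatch before starting the daemons, a check along the following lines could be used. It is a sketch with hypothetical helper names (`perm_bits`, `has_expected_mode`). The default of 755 matches the `rwxr-xr-x` the log above expected, and the script tries GNU `stat -c` first, then BSD `stat -f`.

```shell
# Print the octal permission bits of a path
# (GNU stat first, BSD/macOS stat as a fallback).
perm_bits() {
  stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Succeed when the path has exactly the expected mode (default 755,
# i.e. rwxr-xr-x as the datanode log above requires).
has_expected_mode() {
  [ "$(perm_bits "$1")" = "${2:-755}" ]
}
```

For instance, `has_expected_mode /home/hadoop/hd_tmp/dfs/data || echo "fix permissions first"` would have flagged the rwxrwxr-x directory from the log.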