How do I render Android's YUV-NV21 camera image as the background in libgdx with OpenGL ES 2.0 in real time?


Unlike with Android, I'm relatively new to GL/libgdx. The task I need to solve, namely rendering the Android camera's YUV-NV21 preview image as the screen background inside libgdx in real time, is multi-faceted. Here are the main concerns:

- The Android camera's preview image is only guaranteed to be in the YUV-NV21 space (and in the similar YV12 space, where the U and V channels are not interleaved but grouped). Assuming that most modern devices will provide an implicit RGB conversion is VERY wrong; for example, the recent Samsung Note 10.1 2014 edition only provides the YUV formats. Since nothing can be drawn to the screen in OpenGL unless it is in RGB, the color space must somehow be converted.
- The example in the libgdx documentation (Integrating libgdx and the device camera) uses an Android surface view that sits below everything to draw the image on with GLES 1.1. Since the beginning of March 2014, OpenGL ES 1.x support has been removed from libgdx because it is obsolete and nearly all devices support GLES 2.0. If you try the same sample with GLES 2.0, the 3D objects you draw over the image will be half-transparent. Since the surface behind has nothing to do with GL, this cannot really be controlled. Disabling BLENDING/TRANSLUCENCY does not work. Therefore, the image must be rendered purely in GL.
- This has to be done in real time, so the color space conversion must be VERY fast. A software conversion using Android bitmaps will probably be too slow.
- As a side feature, the camera image must be accessible from the Android code in order to perform tasks other than drawing it on the screen, for example sending it to a native image processor through JNI.

The question is: how is this task done properly and as fast as possible?

The short answer is to load the camera image channels (Y, UV) into textures and draw these textures onto a mesh using a custom fragment shader that does the color space conversion for us. Since this shader runs on the GPU, it will be much faster than the CPU and certainly much faster than Java code. Since this mesh is part of GL, any other 3D shapes or sprites can be safely drawn over or under it.

I solved the problem starting from this answer: https://.com/a/17615696/1525238 . I understood the general method from the following link: How to use camera view with OpenGL ES. It is written for Bada, but the principles are the same. The conversion formulas there were a bit odd, so I replaced them with the ones in the Wikipedia article YUV Conversion to/from RGB.

The following are the steps leading to the solution:

YUV-NV21 explanation

Live images from the Android camera are preview images. The default color space (and one of the two guaranteed color spaces) for the camera preview is YUV-NV21. Explanations of this format are very scattered, so I'll explain it here briefly:

The image data consists of (width x height) x 3/2 bytes. The first width x height bytes are the Y channel, 1 brightness byte for each pixel. The following (width/2) x (height/2) x 2 = width x height / 2 bytes are the UV plane. Each two consecutive bytes are the V, U chroma bytes (in that order, according to the NV21 specification) for 2 x 2 = 4 original pixels. In other words, the UV plane is (width/2) x (height/2) pixels in size and is downsampled by a factor of 2 in each dimension. In addition, the U and V chroma bytes are interleaved.
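The layout described above can be sketched as plain index arithmetic. This is only an illustration, assuming an even width and height and no row padding; the class name `Nv21Layout` and its methods are hypothetical helpers, not part of Android or libgdx.

```java
public final class Nv21Layout {
    private Nv21Layout() {}

    /** Total size in bytes of an NV21 frame: width*height Y bytes plus width*height/2 VU bytes. */
    public static int frameSize(int width, int height) {
        return width * height * 3 / 2;
    }

    /** Index of the Y (brightness) byte of pixel (x, y). */
    public static int yIndex(int width, int x, int y) {
        return y * width + x;
    }

    /** Index of the V byte shared by the 2x2 block that contains pixel (x, y). */
    public static int vIndex(int width, int height, int x, int y) {
        // The VU plane starts right after the Y plane. Each chroma row holds
        // width/2 (V, U) pairs, i.e. width bytes, and covers two image rows.
        return width * height + (y / 2) * width + (x / 2) * 2;
    }

    /** Index of the U byte, which directly follows V in NV21. */
    public static int uIndex(int width, int height, int x, int y) {
        return vIndex(width, height, x, y) + 1;
    }
}
```

For a 1280x720 preview this gives a frame of 1,382,400 bytes, of which the first 921,600 are the Y plane.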

Here is a very nice image that explains YUV-NV12; NV21 is the same with the U and V bytes flipped:

How to convert this format to RGB?

As stated in the question, this conversion would take too long to be live if done inside the Android code. Luckily, it can be done inside a GL shader, which runs on the GPU. This allows it to run VERY fast.

The general idea is to pass the channels of our image as textures to the shader and render them in a way that does the RGB conversion. For this, we first have to copy the channels of our image into buffers that can be passed to textures:

    byte[] image;
    ByteBuffer yBuffer, uvBuffer;

    ...

    yBuffer.put(image, 0, width*height);
    yBuffer.position(0);

    uvBuffer.put(image, width*height, width*height/2);
    uvBuffer.position(0);
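As a self-contained sketch of this copy step, the snippet below wraps the two buffers in a small holder class. The name `Nv21Planes` and its fields are hypothetical; the sketch assumes a tightly packed NV21 array and reuses the same direct buffers for every frame.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public final class Nv21Planes {
    public final ByteBuffer y;   // width*height luminance bytes
    public final ByteBuffer vu;  // width*height/2 interleaved V,U bytes
    private final int ySize;

    public Nv21Planes(int width, int height) {
        ySize = width * height;
        // Direct buffers live outside the JVM heap, so GL can read them without another copy
        y = ByteBuffer.allocateDirect(ySize).order(ByteOrder.nativeOrder());
        vu = ByteBuffer.allocateDirect(ySize / 2).order(ByteOrder.nativeOrder());
    }

    /** Copies one NV21 frame into the two plane buffers and rewinds them for reading. */
    public void load(byte[] nv21) {
        if (nv21.length < ySize * 3 / 2) {
            throw new IllegalArgumentException("NV21 frame is too small");
        }
        y.clear();
        y.put(nv21, 0, ySize);
        y.flip();

        vu.clear();
        vu.put(nv21, ySize, ySize / 2);
        vu.flip();
    }
}
```

Calling `clear()` before each `put` and `flip()` after it leaves both buffers positioned at 0 with the full plane readable, so the same object can be loaded again on the next frame.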

Then, we pass these buffers to actual GL textures:

    /*
     * Prepare the Y channel texture
     */

    //Set texture slot 0 as active and bind our texture object to it
    Gdx.gl.glActiveTexture(GL20.GL_TEXTURE0);
    yTexture.bind();

    //Y texture is (width*height) in size and each pixel is one byte;
    //by setting GL_LUMINANCE, OpenGL puts this byte into R,G and B
    //components of the texture
    Gdx.gl.glTexImage2D(GL20.GL_TEXTURE_2D, 0, GL20.GL_LUMINANCE,
        width, height, 0, GL20.GL_LUMINANCE, GL20.GL_UNSIGNED_BYTE, yBuffer);

    //Use linear interpolation when magnifying/minifying the texture to
    //areas larger/smaller than the texture size
    Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_MIN_FILTER, GL20.GL_LINEAR);
    Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_MAG_FILTER, GL20.GL_LINEAR);
    Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_WRAP_S, GL20.GL_CLAMP_TO_EDGE);
    Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_WRAP_T, GL20.GL_CLAMP_TO_EDGE);

    /*
     * Prepare the UV channel texture
     */

    //Set texture slot 1 as active and bind our texture object to it
    Gdx.gl.glActiveTexture(GL20.GL_TEXTURE1);
    uvTexture.bind();

    //UV texture is (width/2*height/2) in size (downsampled by 2 in
    //both dimensions, each pixel corresponds to 4 pixels of the Y channel)
    //and each pixel is two bytes. By setting GL_LUMINANCE_ALPHA, OpenGL
    //puts the first byte (V) into the R,G and B components of the texture
    //and the second byte (U) into the A component of the texture. That's
    //why we find U and V at A and R respectively in the fragment shader code.
    //Note that we could have also found V at G or B as well.
    Gdx.gl.glTexImage2D(GL20.GL_TEXTURE_2D, 0, GL20.GL_LUMINANCE_ALPHA,
        width/2, height/2, 0, GL20.GL_LUMINANCE_ALPHA, GL20.GL_UNSIGNED_BYTE, uvBuffer);

    //Use linear interpolation when magnifying/minifying the texture to
    //areas larger/smaller than the texture size
    Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_MIN_FILTER, GL20.GL_LINEAR);
    Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_MAG_FILTER, GL20.GL_LINEAR);
    Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_WRAP_S, GL20.GL_CLAMP_TO_EDGE);
    Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_WRAP_T, GL20.GL_CLAMP_TO_EDGE);

Next, we render the mesh we prepared earlier (it covers the entire screen). The shader will take care of rendering the bound textures onto the mesh:

    shader.begin();

    //Set the uniform y_texture object to the texture at slot 0
    shader.setUniformi("y_texture", 0);

    //Set the uniform uv_texture object to the texture at slot 1
    shader.setUniformi("uv_texture", 1);

    mesh.render(shader, GL20.GL_TRIANGLES);
    shader.end();

Finally, the shader takes over the task of rendering our textures onto the mesh. The fragment shader that performs the actual conversion looks like the following:

    String fragmentShader =
        "#ifdef GL_ES\n" +
        "precision highp float;\n" +
        "#endif\n" +

        "varying vec2 v_texCoord;\n" +
        "uniform sampler2D y_texture;\n" +
        "uniform sampler2D uv_texture;\n" +

        "void main (void){\n" +
        "   float r, g, b, y, u, v;\n" +

        //We had put the Y values of each pixel to the R,G,B components by
        //GL_LUMINANCE, that's why we're pulling it from the R component,
        //we could also use G or B
        "   y = texture2D(y_texture, v_texCoord).r;\n" +

        //We had put the U and V values of each pixel to the A and R,G,B
        //components of the texture respectively using GL_LUMINANCE_ALPHA.
        //Since U,V bytes are interleaved in the texture, this is probably
        //the fastest way to use them in the shader
        "   u = texture2D(uv_texture, v_texCoord).a - 0.5;\n" +
        "   v = texture2D(uv_texture, v_texCoord).r - 0.5;\n" +

        //The numbers are just YUV to RGB conversion constants
        "   r = y + 1.13983*v;\n" +
        "   g = y - 0.39465*u - 0.58060*v;\n" +
        "   b = y + 2.03211*u;\n" +

        //We finally set the RGB color of our pixel
        "   gl_FragColor = vec4(r, g, b, 1.0);\n" +
        "}\n";

Note that we access the Y and UV textures using the same coordinate variable v_texCoord. This works because v_texCoord is normalized to between 0.0 and 1.0, which scales from one end of the texture to the other, as opposed to actual texture pixel coordinates. This is one of the nicest features of shaders.
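To sanity-check the shader output for a few pixels, the same arithmetic can be mirrored on the CPU. The sketch below is not part of the original solution: `YuvToRgb` is a hypothetical helper that applies the shader's constants to one pixel and packs the result as ARGB. The explicit clamp to [0, 1] stands in for the clamping that happens when the shader writes to a normal 8-bit framebuffer.

```java
public final class YuvToRgb {
    private YuvToRgb() {}

    /** Converts one pixel from 8-bit Y, U, V values to a packed 0xAARRGGBB int. */
    public static int toArgb(int yByte, int uByte, int vByte) {
        // Texture sampling normalizes each byte to [0, 1]; chroma is then centered on 0
        float y = (yByte & 0xFF) / 255f;
        float u = (uByte & 0xFF) / 255f - 0.5f;
        float v = (vByte & 0xFF) / 255f - 0.5f;

        // Same constants as in the fragment shader
        float r = y + 1.13983f * v;
        float g = y - 0.39465f * u - 0.58060f * v;
        float b = y + 2.03211f * u;

        return 0xFF000000 | (clamp(r) << 16) | (clamp(g) << 8) | clamp(b);
    }

    /** Clamps a normalized channel to [0, 1] and scales it to 0..255. */
    private static int clamp(float c) {
        return Math.round(Math.max(0f, Math.min(1f, c)) * 255f);
    }
}
```

Running a handful of known pixels through this method and comparing with a screenshot is a cheap way to catch a swapped U/V order, which shows up as swapped red and blue tints.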

The full source code

Since libgdx is cross-platform, we need an object that can be extended differently on different platforms to handle the device camera and the rendering. For example, you might want to bypass the YUV-RGB shader conversion altogether if you can get the hardware to provide you with RGB images. For this reason, we need a device camera controller interface that will be implemented by each platform:

    public interface PlatformDependentCameraController {

        void init();

        void renderBackground();

        void destroy();
    }

The Android version of this interface is as follows (the live camera image is assumed to be 1280x720 pixels):

    import android.hardware.Camera;

    import com.badlogic.gdx.Gdx;
    import com.badlogic.gdx.graphics.GL20;
    import com.badlogic.gdx.graphics.Mesh;
    import com.badlogic.gdx.graphics.Pixmap.Format;
    import com.badlogic.gdx.graphics.Texture;
    import com.badlogic.gdx.graphics.VertexAttribute;
    import com.badlogic.gdx.graphics.VertexAttributes.Usage;
    import com.badlogic.gdx.graphics.glutils.ShaderProgram;

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    public class AndroidDependentCameraController implements PlatformDependentCameraController, Camera.PreviewCallback {

        private static byte[] image; //The image buffer that will hold the camera image when preview callback arrives

        private Camera camera; //The camera object

        //The Y and UV buffers that will pass our image channel data to the textures
        private ByteBuffer yBuffer;
        private ByteBuffer uvBuffer;

        ShaderProgram shader; //Our shader
        Texture yTexture; //Our Y texture
        Texture uvTexture; //Our UV texture
        Mesh mesh; //Our mesh that we will draw the texture on

        public AndroidDependentCameraController(){

            //Our YUV image is 12 bits per pixel
            image = new byte[1280*720/8*12];
        }

        @Override
        public void init(){

            /*
             * Initialize the OpenGL/libgdx stuff
             */

            //Do not enforce power of two texture sizes
            Texture.setEnforcePotImages(false);

            //Allocate textures
            yTexture = new Texture(1280,720,Format.Intensity); //A 8-bit per pixel format
            uvTexture = new Texture(1280/2,720/2,Format.LuminanceAlpha); //A 16-bit per pixel format

            //Allocate buffers on the native memory space, not inside the JVM heap
            yBuffer = ByteBuffer.allocateDirect(1280*720);
            uvBuffer = ByteBuffer.allocateDirect(1280*720/2); //We have (width/2*height/2) pixels, each pixel is 2 bytes
            yBuffer.order(ByteOrder.nativeOrder());
            uvBuffer.order(ByteOrder.nativeOrder());

            //Our vertex shader code; nothing special
            String vertexShader =
                "attribute vec4 a_position;                         \n" +
                "attribute vec2 a_texCoord;                         \n" +
                "varying vec2 v_texCoord;                           \n" +

                "void main(){                                       \n" +
                "   gl_Position = a_position;                       \n" +
                "   v_texCoord = a_texCoord;                        \n" +
                "}                                                  \n";

            //Our fragment shader code; takes Y,U,V values for each pixel and calculates R,G,B colors,
            //Effectively making YUV to RGB conversion
            String fragmentShader =
                "#ifdef GL_ES                                       \n" +
                "precision highp float;                             \n" +
                "#endif                                             \n" +

                "varying vec2 v_texCoord;                           \n" +
                "uniform sampler2D y_texture;                       \n" +
                "uniform sampler2D uv_texture;                      \n" +

                "void main (void){                                  \n" +
                "   float r, g, b, y, u, v;                         \n" +

                //We had put the Y values of each pixel to the R,G,B components by GL_LUMINANCE,
                //that's why we're pulling it from the R component, we could also use G or B
                "   y = texture2D(y_texture, v_texCoord).r;         \n" +

                //We had put the U and V values of each pixel to the A and R,G,B components of the
                //texture respectively using GL_LUMINANCE_ALPHA. Since U,V bytes are interleaved
                //in the texture, this is probably the fastest way to use them in the shader
                "   u = texture2D(uv_texture, v_texCoord).a - 0.5;  \n" +
                "   v = texture2D(uv_texture, v_texCoord).r - 0.5;  \n" +

                //The numbers are just YUV to RGB conversion constants
                "   r = y + 1.13983*v;                              \n" +
                "   g = y - 0.39465*u - 0.58060*v;                  \n" +
                "   b = y + 2.03211*u;                              \n" +

                //We finally set the RGB color of our pixel
                "   gl_FragColor = vec4(r, g, b, 1.0);              \n" +
                "}                                                  \n";

            //Create and compile our shader
            shader = new ShaderProgram(vertexShader, fragmentShader);

            //Create our mesh that we will draw on, it has 4 vertices corresponding to the 4 corners of the screen
            mesh = new Mesh(true, 4, 6,
                new VertexAttribute(Usage.Position, 2, "a_position"),
                new VertexAttribute(Usage.TextureCoordinates, 2, "a_texCoord"));

            //The vertices include the screen coordinates (between -1.0 and 1.0) and texture coordinates (between 0.0 and 1.0)
            float[] vertices = {
                -1.0f,  1.0f,   // Position 0
                 0.0f,  0.0f,   // TexCoord 0
                -1.0f, -1.0f,   // Position 1
                 0.0f,  1.0f,   // TexCoord 1
                 1.0f, -1.0f,   // Position 2
                 1.0f,  1.0f,   // TexCoord 2
                 1.0f,  1.0f,   // Position 3
                 1.0f,  0.0f    // TexCoord 3
            };

            //The indices come in trios of vertex indices that describe the triangles of our mesh
            short[] indices = {0, 1, 2, 0, 2, 3};

            //Set vertices and indices to our mesh
            mesh.setVertices(vertices);
            mesh.setIndices(indices);

            /*
             * Initialize the Android camera
             */
            camera = Camera.open(0);

            //We set the buffer ourselves that will be used to hold the preview image
            camera.setPreviewCallbackWithBuffer(this);

            //Set the camera parameters
            Camera.Parameters params = camera.getParameters();
            params.setFocusMode(Camera.Parameters.FOCUS_MODE_CONTINUOUS_VIDEO);
            params.setPreviewSize(1280,720);
            camera.setParameters(params);

            //Start the preview
            camera.startPreview();

            //Set the first buffer, the preview doesn't start unless we set the buffers
            camera.addCallbackBuffer(image);
        }

        @Override
        public void onPreviewFrame(byte[] data, Camera camera) {

            //Send the buffer reference to the next preview so that a new buffer is not allocated and we use the same space
            camera.addCallbackBuffer(image);
        }

        @Override
        public void renderBackground() {

            /*
             * Because of Java's limitations, we can't reference the middle of an array and
             * we must copy the channels in our byte array into buffers before setting them to textures
             */

            //Copy the Y channel of the image into its buffer, the first (width*height) bytes are the Y channel
            yBuffer.put(image, 0, 1280*720);
            yBuffer.position(0);

            //Copy the UV channels of the image into their buffer, the following (width*height/2) bytes are the UV channel; the U and V bytes are interleaved
            uvBuffer.put(image, 1280*720, 1280*720/2);
            uvBuffer.position(0);

            /*
             * Prepare the Y channel texture
             */

            //Set texture slot 0 as active and bind our texture object to it
            Gdx.gl.glActiveTexture(GL20.GL_TEXTURE0);
            yTexture.bind();

            //Y texture is (width*height) in size and each pixel is one byte; by setting GL_LUMINANCE, OpenGL puts this byte into R,G and B components of the texture
            Gdx.gl.glTexImage2D(GL20.GL_TEXTURE_2D, 0, GL20.GL_LUMINANCE, 1280, 720, 0, GL20.GL_LUMINANCE, GL20.GL_UNSIGNED_BYTE, yBuffer);

            //Use linear interpolation when magnifying/minifying the texture to areas larger/smaller than the texture size
            Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_MIN_FILTER, GL20.GL_LINEAR);
            Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_MAG_FILTER, GL20.GL_LINEAR);
            Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_WRAP_S, GL20.GL_CLAMP_TO_EDGE);
            Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_WRAP_T, GL20.GL_CLAMP_TO_EDGE);

            /*
             * Prepare the UV channel texture
             */

            //Set texture slot 1 as active and bind our texture object to it
            Gdx.gl.glActiveTexture(GL20.GL_TEXTURE1);
            uvTexture.bind();

            //UV texture is (width/2*height/2) in size (downsampled by 2 in both dimensions, each pixel corresponds to 4 pixels of the Y channel)
            //and each pixel is two bytes. By setting GL_LUMINANCE_ALPHA, OpenGL puts the first byte (V) into the R,G and B components of the texture
            //and the second byte (U) into the A component of the texture. That's why we find U and V at A and R respectively in the fragment shader code.
            //Note that we could have also found V at G or B as well.
            Gdx.gl.glTexImage2D(GL20.GL_TEXTURE_2D, 0, GL20.GL_LUMINANCE_ALPHA, 1280/2, 720/2, 0, GL20.GL_LUMINANCE_ALPHA, GL20.GL_UNSIGNED_BYTE, uvBuffer);

            //Use linear interpolation when magnifying/minifying the texture to areas larger/smaller than the texture size
            Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_MIN_FILTER, GL20.GL_LINEAR);
            Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_MAG_FILTER, GL20.GL_LINEAR);
            Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_WRAP_S, GL20.GL_CLAMP_TO_EDGE);
            Gdx.gl.glTexParameterf(GL20.GL_TEXTURE_2D, GL20.GL_TEXTURE_WRAP_T, GL20.GL_CLAMP_TO_EDGE);

            /*
             * Draw the textures onto a mesh using our shader
             */

            shader.begin();

            //Set the uniform y_texture object to the texture at slot 0
            shader.setUniformi("y_texture", 0);

            //Set the uniform uv_texture object to the texture at slot 1
            shader.setUniformi("uv_texture", 1);

            //Render our mesh using the shader, which in turn will use our textures to render their content on the mesh
            mesh.render(shader, GL20.GL_TRIANGLES);
            shader.end();
        }

        @Override
        public void destroy() {
            camera.stopPreview();
            camera.setPreviewCallbackWithBuffer(null);
            camera.release();
        }
    }

The main application part just ensures that init() is called once at the beginning, renderBackground() is called every render cycle, and destroy() is called once at the end:

    public class YourApplication implements ApplicationListener {

        private final PlatformDependentCameraController deviceCameraControl;

        public YourApplication(PlatformDependentCameraController cameraControl) {
            this.deviceCameraControl = cameraControl;
        }

        @Override
        public void create() {
            deviceCameraControl.init();
        }

        @Override
        public void render() {
            Gdx.gl.glViewport(0, 0, Gdx.graphics.getWidth(), Gdx.graphics.getHeight());
            Gdx.gl.glClear(GL20.GL_COLOR_BUFFER_BIT | GL20.GL_DEPTH_BUFFER_BIT);

            //Render the background that is the live camera image
            deviceCameraControl.renderBackground();

            /*
             * Render anything here (sprites/models etc.) that you want to go on top of the camera image
             */
        }

        @Override
        public void dispose() {
            deviceCameraControl.destroy();
        }

        @Override
        public void resize(int width, int height) {
        }

        @Override
        public void pause() {
        }

        @Override
        public void resume() {
        }
    }

The only other Android-specific part is the following extremely short main Android code. It simply creates a new Android-specific device camera controller and passes it to the main libgdx object:

    public class MainActivity extends AndroidApplication {

        @Override
        public void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);

            AndroidApplicationConfiguration cfg = new AndroidApplicationConfiguration();
            cfg.useGL20 = true; //This line is obsolete in the newest libgdx version
            cfg.a = 8;
            cfg.b = 8;
            cfg.g = 8;
            cfg.r = 8;

            PlatformDependentCameraController cameraControl = new AndroidDependentCameraController();
            initialize(new YourApplication(cameraControl), cfg);

            graphics.getView().setKeepScreenOn(true);
        }
    }

How fast is it?

I tested this routine on two devices. While the measurements are not constant across frames, a general profile can be observed:

Samsung Galaxy Note II LTE (GT-N7105): has an ARM Mali-400 MP4 GPU.
- Rendering one frame takes around 5-6 ms, with occasional jumps to around 15 ms every couple of seconds
- The actual rendering line ( mesh.render(shader, GL20.GL_TRIANGLES); ) consistently takes 0-1 ms
- Creation and binding of both textures consistently take 1-3 ms in total
- ByteBuffer copies generally take 1-3 ms in total but occasionally jump to around 7 ms, probably due to the image buffer being moved around in the JVM heap
Samsung Galaxy Note 10.1 2014 (SM-P600): has an ARM Mali-T628 GPU.
- Rendering one frame takes around 2-4 ms, with rare jumps to around 6-10 ms
- The actual rendering line ( mesh.render(shader, GL20.GL_TRIANGLES); ) consistently takes 0-1 ms
- Creation and binding of both textures take 1-3 ms in total but jump to around 6-9 ms every couple of seconds
- ByteBuffer copies generally take 0-2 ms in total but very rarely jump to around 6 ms
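The answer does not say how these figures were collected. One simple way to gather comparable per-stage numbers is a wall-clock timer around each step of renderBackground(); the `StageTimer` class below is a hypothetical sketch of that idea, not the method actually used above.

```java
public final class StageTimer {
    private long start;

    /** Marks the beginning of a stage. */
    public void begin() {
        start = System.nanoTime();
    }

    /** Returns the milliseconds elapsed since the last call to begin(). */
    public double endMillis() {
        return (System.nanoTime() - start) / 1_000_000.0;
    }
}
```

A call to begin() before the ByteBuffer copies and endMillis() right after them gives the copy time for that frame. Keep in mind that such a timer only measures CPU time spent inside the GL calls; the driver may defer the actual GPU work, so the numbers for the texture upload and draw steps are lower bounds.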

Please don't hesitate to share if you think these profiles can be made faster with some other method. I hope this little tutorial helped.

For the fastest and most optimized way, just use the common GL extension:

    //Fragment Shader
    #extension GL_OES_EGL_image_external : require
    uniform samplerExternalOES u_Texture;

Then in Java:

    surfaceTexture = new SurfaceTexture(textureIDs[0]);
    try {
        someCamera.setPreviewTexture(surfaceTexture);
    } catch (IOException t) {
        Log.e(TAG, "Cannot set preview texture target!");
    }

    someCamera.startPreview();

    //Declared at class level
    private static final int GL_TEXTURE_EXTERNAL_OES = 0x8D65;

In the Java GL thread:

    GLES20.glActiveTexture(GLES20.GL_TEXTURE0);
    GLES20.glBindTexture(GL_TEXTURE_EXTERNAL_OES, textureIDs[0]);
    GLES20.glUniform1i(uTextureHandle, 0);

The color conversion is already done for you. You can do whatever you want right in the fragment shader.
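The shader snippet above only shows the extension line and the sampler declaration. A complete minimal fragment shader could look like the sketch below, kept as a Java string in the same style as the first answer. The class name `ExternalOesShader` is hypothetical, and the varying `v_texCoord` is an assumption borrowed from the first answer's vertex shader. Note also that `SurfaceTexture.updateTexImage()` has to be called on the GL thread to latch each new camera frame before drawing.

```java
public final class ExternalOesShader {
    private ExternalOesShader() {}

    /** Minimal fragment shader that samples the camera frame from an external OES texture. */
    public static final String FRAGMENT =
        // The #extension directive must come before any other shader code
        "#extension GL_OES_EGL_image_external : require\n" +
        "precision mediump float;\n" +
        "varying vec2 v_texCoord;\n" +            // assumed name, set by the vertex shader
        "uniform samplerExternalOES u_Texture;\n" +
        "void main() {\n" +
        // The sampled value is already RGB; no manual YUV conversion is needed
        "    gl_FragColor = texture2D(u_Texture, v_texCoord);\n" +
        "}\n";
}
```

The string can be handed to whatever shader-compilation path the application already uses; any extra per-pixel processing goes inside main() after the texture2D call.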

Overall, this is mostly a non-libgdx solution, since it is platform dependent. You can initialize the platform-dependent parts in the wrapper and then pass them to the libgdx Activity.

I hope this saves you some time in your research.