c++ - findhomography - 3D reconstruction from 2 images without information about the camera


The procedure looks fine to me.
As far as I know, in image-based 3D modelling the cameras are either explicitly calibrated or implicitly calibrated. You don't want to calibrate the camera explicitly, but you will end up using that information anyway. Matching pairs of corresponding points is definitely a widely used approach.

I'm new to this field and I'm trying to model a simple 3D scene from 2D images, and I have no information about the cameras. I know that there are 3 options:

I have two images and I know my camera model (intrinsics), which I load from an XML file, for example: loadXMLFromFile() => stereoRectify() => reprojectImageTo3D()
I don't have them, but I can calibrate my camera => stereoCalibrate() => stereoRectify() => reprojectImageTo3D()
I can't calibrate the camera (this is my case, because I don't have the camera that took the 2 images). Then I need to find pairs of keypoints in both images, with SURF or SIFT for example (I can actually use any blob detector), then compute the descriptors of these keypoints, then match the keypoints of the right image and the left image according to their descriptors, and then find the fundamental matrix from them. The processing is much harder and would look like this:
1. detect keypoints (SURF, SIFT) =>
2. extract descriptors (SURF, SIFT) =>
3. compare and match descriptors (BruteForce, Flann based approaches) =>
4. find the fundamental matrix ( findFundamentalMat() ) from these pairs =>
5. stereoRectifyUncalibrated() =>
6. reprojectImageTo3D()
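
Step 4 rests on the epipolar constraint: for a correct match, a point p1 in the first image and its partner p2 in the second image satisfy p2^T * F * p1 = 0 (in homogeneous pixel coordinates, with the point order used by findFundamentalMat()). A minimal sketch of that check, using only the standard library, might look like the following. Mat3, Pt and epipolarResidual are hypothetical names introduced here for illustration; they are not OpenCV types or functions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for this sketch (not OpenCV types).
struct Mat3 { double m[3][3]; };
struct Pt { double x, y; };

// Algebraic epipolar residual p2^T * F * p1, with both points lifted to
// homogeneous coordinates (x, y, 1). It is zero for a perfect match and
// grows as the match drifts away from the epipolar line.
double epipolarResidual(const Mat3& F, const Pt& p1, const Pt& p2) {
    const double a[3] = {p1.x, p1.y, 1.0};
    const double b[3] = {p2.x, p2.y, 1.0};
    double r = 0.0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r += b[i] * F.m[i][j] * a[j];
    return r;
}
```

With the fundamental matrix of a purely horizontal camera shift, two points on the same image row give a residual of zero, while a vertical offset between them gives a non-zero value. Such a residual can be used to discard bad matches before rectification.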

I'm using the last approach and my questions are:

1) Is it right?

2) If it is right, I have a doubt about the last step, stereoRectifyUncalibrated() => reprojectImageTo3D(). The signature of reprojectImageTo3D() is:

    void reprojectImageTo3D(InputArray disparity, OutputArray _3dImage, InputArray Q, bool handleMissingValues=false, int ddepth=-1 )

and in my code I call it as:

    cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true)

Parameters:

disparity - Input single-channel 8-bit unsigned, 16-bit signed, 32-bit signed or 32-bit floating-point disparity image.
_3dImage - Output 3-channel floating-point image of the same size as disparity. Each element of _3dImage(x,y) contains the 3D coordinates of the point (x,y) computed from the disparity map.
Q - 4x4 perspective transformation matrix that can be obtained with stereoRectify().
handleMissingValues - Indicates whether the function should handle missing values (i.e. points where the disparity was not computed). If handleMissingValues=true, pixels with the minimal disparity that corresponds to the outliers (see StereoBM::operator() ) are transformed to 3D points with a very large Z value (currently set to 10000).
ddepth - The optional output array depth. If it is -1, the output image will have CV_32F depth. ddepth can also be set to CV_16S, CV_32S or CV_32F.
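
A note on the disparity input: when StereoBM writes a CV_16S map, the values are fixed-point disparities scaled by 16 (4 fractional bits), so they should be divided by 16 before being reprojected. An 8-bit image that was rescaled for display, such as imgDisparity8U, no longer holds disparities in pixels. The sketch below shows the conversion on a plain array, using only the standard library. toPixelDisparity is a hypothetical helper name; with OpenCV the same scaling can be done with Mat::convertTo() and a scale factor of 1.0 / 16.0.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Convert fixed-point disparities (stored as 16 * disparity) into
// floating-point disparities measured in pixels.
std::vector<float> toPixelDisparity(const std::vector<std::int16_t>& raw) {
    std::vector<float> out;
    out.reserve(raw.size());
    for (std::int16_t v : raw)
        out.push_back(static_cast<float>(v) / 16.0f);
    return out;
}
```

For example, a raw value of 40 corresponds to a disparity of 2.5 pixels.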

How can I get the Q matrix? Is it possible to obtain the Q matrix from F, H1 and H2, or in some other way?

3) Is there another way to obtain the xyz coordinates without calibrating the cameras?

My code is:

    #include <opencv2/core/core.hpp>
    #include <opencv2/calib3d/calib3d.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/contrib/contrib.hpp>
    #include <opencv2/features2d/features2d.hpp>
    // Note: when building against OpenCV 2.4, SIFT/SURF are declared in
    // <opencv2/nonfree/features2d.hpp> and BruteForceMatcher in
    // <opencv2/legacy/legacy.hpp>.
    #include <stdio.h>
    #include <iostream>
    #include <vector>
    #include <conio.h>
    #include <opencv/cv.h>
    #include <opencv/cxcore.h>
    #include <opencv/cvaux.h>

    using namespace cv;
    using namespace std;

    int main(int argc, char *argv[]){
        // Two image paths are expected on the command line
        if (argc < 3)
            return 1;

        // Read the images
        Mat imgLeft = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
        Mat imgRight = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );

        // check
        if (!imgLeft.data || !imgRight.data)
            return 0;

        // 1] find pair keypoints on both images (SURF, SIFT):::::::::::::::::::::::::::::
        // vector of keypoints
        std::vector<cv::KeyPoint> keypointsLeft;
        std::vector<cv::KeyPoint> keypointsRight;

        // Construct the SIFT feature detector object
        cv::SiftFeatureDetector sift(
            0.01, // feature threshold
            10);  // threshold to reduce sensitivity to lines

        // Detection of the SIFT features
        sift.detect(imgLeft,keypointsLeft);
        sift.detect(imgRight,keypointsRight);

        std::cout << "Number of SIFT points (1): " << keypointsLeft.size() << std::endl;
        std::cout << "Number of SIFT points (2): " << keypointsRight.size() << std::endl;

        // 2] compute descriptors of these keypoints (SURF,SIFT) ::::::::::::::::::::::::::
        // Construction of the SURF descriptor extractor
        // (SURF descriptors are computed here on the SIFT keypoints)
        cv::SurfDescriptorExtractor surfDesc;

        // Extraction of the SURF descriptors
        cv::Mat descriptorsLeft, descriptorsRight;
        surfDesc.compute(imgLeft,keypointsLeft,descriptorsLeft);
        surfDesc.compute(imgRight,keypointsRight,descriptorsRight);

        std::cout << "descriptor matrix size: " << descriptorsLeft.rows << " by " << descriptorsLeft.cols << std::endl;

        // 3] matching keypoints from image right and image left according to their
        //    descriptors (BruteForce, Flann based approaches)
        // Construction of the matcher
        cv::BruteForceMatcher<cv::L2<float> > matcher;

        // Match the two image descriptors
        std::vector<cv::DMatch> matches;
        matcher.match(descriptorsLeft,descriptorsRight, matches);

        std::cout << "Number of matched points: " << matches.size() << std::endl;

        // 4] find the fundamental mat ::::::::::::::::::::::::::::::::::::::::::::::::::::
        // Convert 1 vector of keypoints into
        // 2 vectors of Point2f for compute F matrix
        // with cv::findFundamentalMat() function
        std::vector<int> pointIndexesLeft;
        std::vector<int> pointIndexesRight;
        for (std::vector<cv::DMatch>::const_iterator it= matches.begin(); it!= matches.end(); ++it) {
            // Get the indexes of the selected matched keypoints
            pointIndexesLeft.push_back(it->queryIdx);
            pointIndexesRight.push_back(it->trainIdx);
        }

        // Convert keypoints into Point2f
        std::vector<cv::Point2f> selPointsLeft, selPointsRight;
        cv::KeyPoint::convert(keypointsLeft,selPointsLeft,pointIndexesLeft);
        cv::KeyPoint::convert(keypointsRight,selPointsRight,pointIndexesRight);

        /* check by drawing the points
        std::vector<cv::Point2f>::const_iterator it= selPointsLeft.begin();
        while (it!=selPointsLeft.end()) {
            // draw a circle at each corner location
            cv::circle(imgLeft,*it,3,cv::Scalar(255,255,255),2);
            ++it;
        }

        it= selPointsRight.begin();
        while (it!=selPointsRight.end()) {
            // draw a circle at each corner location
            cv::circle(imgRight,*it,3,cv::Scalar(255,255,255),2);
            ++it;
        } */

        // Compute F matrix from n>=8 matches
        cv::Mat fundemental= cv::findFundamentalMat(
            cv::Mat(selPointsLeft),  // points in first image
            cv::Mat(selPointsRight), // points in second image
            CV_FM_RANSAC);           // RANSAC method

        std::cout << "F-Matrix size= " << fundemental.rows << "," << fundemental.cols << std::endl;

        /* draw the left points corresponding epipolar lines in right image
        std::vector<cv::Vec3f> linesLeft;
        cv::computeCorrespondEpilines(
            cv::Mat(selPointsLeft), // image points
            1,                      // in image 1 (can also be 2)
            fundemental,            // F matrix
            linesLeft);             // vector of epipolar lines

        // for all epipolar lines
        for (vector<cv::Vec3f>::const_iterator it= linesLeft.begin(); it!=linesLeft.end(); ++it) {
            // draw the epipolar line between first and last column
            cv::line(imgRight,cv::Point(0,-(*it)[2]/(*it)[1]),cv::Point(imgRight.cols,-((*it)[2]+(*it)[0]*imgRight.cols)/(*it)[1]),cv::Scalar(255,255,255));
        }

        // draw the right points corresponding epipolar lines in left image
        std::vector<cv::Vec3f> linesRight;
        cv::computeCorrespondEpilines(cv::Mat(selPointsRight),2,fundemental,linesRight);
        for (vector<cv::Vec3f>::const_iterator it= linesRight.begin(); it!=linesRight.end(); ++it) {
            // draw the epipolar line between first and last column
            cv::line(imgLeft,cv::Point(0,-(*it)[2]/(*it)[1]), cv::Point(imgLeft.cols,-((*it)[2]+(*it)[0]*imgLeft.cols)/(*it)[1]), cv::Scalar(255,255,255));
        }

        // Display the images with points and epipolar lines
        cv::namedWindow("Right Image Epilines");
        cv::imshow("Right Image Epilines",imgRight);
        cv::namedWindow("Left Image Epilines");
        cv::imshow("Left Image Epilines",imgLeft);
        */

        // 5] stereoRectifyUncalibrated()::::::::::::::::::::::::::::::::::::::::::::::::::
        // H1, H2 - The output rectification homography matrices (3x3) for the
        // first and for the second images. The points are passed in the same
        // order (left, right) that was used to estimate F.
        cv::Mat H1, H2;
        cv::stereoRectifyUncalibrated(selPointsLeft, selPointsRight, fundemental, imgLeft.size(), H1, H2);

        // create the image in which we will save our disparities
        Mat imgDisparity16S = Mat( imgLeft.rows, imgLeft.cols, CV_16S );
        Mat imgDisparity8U = Mat( imgLeft.rows, imgLeft.cols, CV_8UC1 );

        // Call the constructor for StereoBM
        int ndisparities = 16*5; // < Range of disparity >
        int SADWindowSize = 5;   // < Size of the block window > Must be odd. Is the
                                 // size of averaging window used to match pixel
                                 // blocks (larger values mean better robustness to
                                 // noise, but yield blurry disparity maps)

        StereoBM sbm( StereoBM::BASIC_PRESET, ndisparities, SADWindowSize );

        // Calculate the disparity image
        // NOTE: imgLeft and imgRight are not warped with H1 and H2 here, so the
        // block matcher runs on the unrectified images.
        sbm( imgLeft, imgRight, imgDisparity16S, CV_16S );

        // Check its extreme values
        double minVal;
        double maxVal;
        minMaxLoc( imgDisparity16S, &minVal, &maxVal );
        printf("Min disp: %f Max value: %f \n", minVal, maxVal);

        // Display it as a CV_8UC1 image
        imgDisparity16S.convertTo( imgDisparity8U, CV_8UC1, 255/(maxVal - minVal));
        namedWindow( "windowDisparity", CV_WINDOW_NORMAL );
        imshow( "windowDisparity", imgDisparity8U );

        // 6] reprojectImageTo3D() :::::::::::::::::::::::::::::::::::::::::::::::::::::
        //Mat xyz;
        //cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true);
        //How can I get the Q matrix? Is it possible to obtain the Q matrix from
        //F, H1 and H2 or in another way?
        //Is there another way to obtain the xyz coordinates?

        cv::waitKey();
        return 0;
    }

I think you need to use stereoRectify to rectify your images and get Q. This function needs two parameters (R and T): the rotation and the translation between the two cameras. You can compute those parameters using solvePnP. That function needs some real 3D coordinates of a known object, together with the corresponding 2D points in the images.

stereoRectifyUncalibrated simply computes a planar perspective transformation, not a rectification transformation in object space. This planar transformation has to be converted into an object-space transformation in order to extract the Q matrix, and I think some of the camera calibration parameters are required for that (such as the camera intrinsics). There may be ongoing research on this topic.
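
To make the dependence on calibration concrete: the Q documented for stereoRectify() is built from the rectified focal length f, the principal point (cx, cy) and the baseline component Tx, none of which is contained in H1 or H2. The following is a minimal sketch using only the standard library. It assumes both rectified cameras share the same principal point, and Q4, Vec3, makeQ and reproject are hypothetical names used for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for this sketch (not OpenCV types).
struct Vec3 { double x, y, z; };
struct Q4 { double m[4][4]; };

// Build Q from intrinsics and baseline, assuming identical principal
// points in both rectified views (so the last entry of row 3 is 0).
// Tx is the x component of the translation between the cameras.
Q4 makeQ(double f, double cx, double cy, double Tx) {
    assert(Tx != 0.0);
    Q4 Q = {{{1, 0, 0, -cx},
             {0, 1, 0, -cy},
             {0, 0, 0, f},
             {0, 0, -1.0 / Tx, 0}}};
    return Q;
}

// Apply [X Y Z W]^T = Q * [x y d 1]^T and dehomogenise, which is the
// per-pixel operation described for reprojectImageTo3D().
Vec3 reproject(const Q4& Q, double x, double y, double d) {
    const double in[4] = {x, y, d, 1.0};
    double out[4] = {0.0, 0.0, 0.0, 0.0};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            out[i] += Q.m[i][j] * in[j];
    return Vec3{out[0] / out[3], out[1] / out[3], out[2] / out[3]};
}
```

With made-up values f = 500, (cx, cy) = (320, 240) and Tx = -0.1, a pixel at (420, 240) with a disparity of 25 maps to a depth of f * 0.1 / 25 = 2 in the units of the baseline. Without f, cx, cy and Tx there is nothing to fill the matrix with, which is why F, H1 and H2 alone are not enough.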

You may have to add some steps that estimate the camera intrinsics and extract the relative orientation of the cameras for your flow to work properly. I think the camera calibration parameters are vital for extracting a proper 3D structure of the scene, unless an active illumination method is used.
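
One common route from the fundamental matrix to the relative orientation, once intrinsic matrices K1 and K2 are known or estimated, is the essential matrix E = K2^T * F * K1; decomposing E then yields the rotation and the translation direction (the translation is only recovered up to scale). The sketch below, using only the standard library, covers just the first step. Mat3 and the function names are hypothetical, and K1 and K2 are assumptions, since they are exactly what is missing in the question.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3x3 matrix type for this sketch (not an OpenCV type).
struct Mat3 { double m[3][3]; };

Mat3 multiply(const Mat3& A, const Mat3& B) {
    Mat3 C = {};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                C.m[i][j] += A.m[i][k] * B.m[k][j];
    return C;
}

Mat3 transpose(const Mat3& A) {
    Mat3 T = {};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            T.m[j][i] = A.m[i][j];
    return T;
}

// E = K2^T * F * K1, for F defined by p2^T * F * p1 = 0 in pixel
// coordinates; K1 and K2 are the intrinsic matrices of the first and
// second camera.
Mat3 essentialFromFundamental(const Mat3& F, const Mat3& K1, const Mat3& K2) {
    return multiply(multiply(transpose(K2), F), K1);
}
```

In OpenCV 3 and later, cv::recoverPose() can perform the decomposition of E into R and t; earlier versions require doing the SVD-based decomposition by hand.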

Solutions based on bundle block adjustment are also required to refine all the estimated values into more accurate ones.
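
Bundle adjustment refines the camera parameters and the 3D points together by minimising the sum of squared reprojection errors over all observations. As a rough illustration of the quantity being minimised, here is a sketch of a single error term for a simple pinhole camera without lens distortion, using only the standard library. The Camera layout and all names are hypothetical and chosen for this example only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for this sketch (not OpenCV types).
struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };

// Pinhole camera: rotation R, translation t, focal length f (pixels)
// and principal point (cx, cy). Lens distortion is ignored.
struct Camera {
    Mat3 R;
    Vec3 t;
    double f, cx, cy;
};

// Squared distance between the observed pixel (u, v) and the projection
// of the 3D point X into the camera. Bundle adjustment sums this term
// over every observation and minimises the total.
double squaredReprojectionError(const Camera& cam, const Vec3& X, double u, double v) {
    const double p[3] = {X.x, X.y, X.z};
    const double t[3] = {cam.t.x, cam.t.y, cam.t.z};
    double c[3];
    for (int i = 0; i < 3; ++i)
        c[i] = cam.R.m[i][0] * p[0] + cam.R.m[i][1] * p[1] + cam.R.m[i][2] * p[2] + t[i];
    assert(c[2] != 0.0);  // the point must not lie in the camera plane
    const double pu = cam.f * c[0] / c[2] + cam.cx;
    const double pv = cam.f * c[1] / c[2] + cam.cy;
    return (pu - u) * (pu - u) + (pv - v) * (pv - v);
}
```

A non-linear least-squares solver then adjusts R, t, the intrinsics and the 3D points until the summed error stops decreasing.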