
| Numbers | [TeX:] $$I_{1}$$ | [TeX:] $$I_{2}$$ | [TeX:] $$I_{3}$$ | [TeX:] $$I_{4}$$ | [TeX:] $$I_{5}$$ | [TeX:] $$I_{6}$$ |
|---|---|---|---|---|---|---|
| [TeX:] $$y_{1}$$ | 45.00 | 45.00 | 45.00 | 45.00 | 45.00 | 45.00 |
| [TeX:] $$y_{2}$$ | 50.35 | 51.05 | 51.59 | 49.78 | 48.74 | 47.91 |
| [TeX:] $$y_{3}$$ | 56.40 | 55.36 | 64.33 | 60.08 | 61.59 | 60.99 |
| [TeX:] $$y_{4}$$ | 66.34 | 72.48 | 61.24 | 65.99 | 69.91 | 73.62 |
| [TeX:] $$y_{5}$$ | 70.03 | 65.47 | 81.26 | 67.28 | 82.31 | 77.22 |
| [TeX:] $$y_{6}$$ | 94.91 | 78.90 | 90.26 | 83.54 | 81.74 | 99.12 |
| [TeX:] $$y_{7}$$ | 79.09 | 99.68 | 97.33 | 100.45 | 104.27 | 103.76 |
| [TeX:] $$y_{8}$$ | 114.10 | 107.99 | 106.91 | 100.23 | 127.06 | 109.91 |
| [TeX:] $$y_{9}$$ | 122.64 | 119.63 | 116.49 | 120.33 | 124.32 | 118.93 |
| [TeX:] $$y_{10}$$ | 117.24 | 137.79 | 129.57 | 130.12 | 136.04 | 110.40 |
| [TeX:] $$y_{11}$$ | 137.94 | 119.36 | 123.13 | 139.19 | 123.29 | 137.63 |
| [TeX:] $$y_{12}$$ | 153.43 | 154.18 | 155.12 | 162.88 | 151.05 | 152.64 |
| [TeX:] $$y_{13}$$ | 147.57 | 157.29 | 194.69 | 150.60 | 163.18 | 183.68 |
| [TeX:] $$y_{14}$$ | 154.58 | 171.54 | 181.73 | 172.29 | 162.24 | 195.40 |
| [TeX:] $$y_{15}$$ | 172.33 | 192.72 | 189.07 | 192.20 | 190.60 | 192.52 |
| [TeX:] $$y_{16}$$ | 226.15 | 207.61 | 193.33 | 212.60 | 219.43 | 207.77 |
| [TeX:] $$y_{17}$$ | 239.35 | 194.01 | 218.59 | 193.19 | 225.88 | 199.82 |
| [TeX:] $$y_{18}$$ | 231.79 | 237.05 | 215.06 | 230.12 | 236.40 | 210.73 |
| [TeX:] $$y_{19}$$ | 255.18 | 255.28 | 239.22 | 245.36 | 246.67 | 258.52 |
| [TeX:] $$y_{20}$$ | 276.53 | 225.38 | 251.38 | 269.27 | 283.88 | 278.13 |
| [TeX:] $$y_{21}$$ | 264.24 | 242.15 | 276.81 | 278.70 | 294.44 | 282.98 |
| [TeX:] $$y_{22}$$ | 252.99 | 269.33 | 297.21 | 312.88 | 316.30 | 287.61 |

According to Pearson correlation analysis, there is a highly significant linear correlation between the length of each cell in the new checkerboard and the distance from the ith row of corners in the traditional checkerboard to the camera [TeX:] $$(p<0.01),$$ with a correlation coefficient r of 0.975. The least squares method is used to fit [TeX:] $$f(x),$$ whose derivative is [TeX:] $$f^{\prime}(x)=0.262.$$

Therefore, when the checkerboard’s first row contains [TeX:] $$d \times d \mathrm{~mm}$$ cells, the cell width of the remaining rows is fixed, and the length difference [TeX:] $$\Delta d$$ between two adjacent rows is [TeX:] $$0.262 \times d \mathrm{~mm}.$$ The new checkerboard is shown in Fig. 3. Corners of this checkerboard can be extracted accurately, and the influence of the perspective transformation is reasonably avoided.
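As a rough illustration (not the authors' code), the correlation check, the least-squares fit, and the resulting row lengths can be sketched in Python. The first-row length d = 45 mm and the use of the row index as a stand-in for the distance regressor are assumptions:

```python
import numpy as np
from scipy import stats

# Cell lengths (mm) of rows y1..y22, group I1 of the table above.
y = np.array([45.00, 50.35, 56.40, 66.34, 70.03, 94.91, 79.09, 114.10,
              122.64, 117.24, 137.94, 153.43, 147.57, 154.58, 172.33,
              226.15, 239.35, 231.79, 255.18, 276.53, 264.24, 252.99])
rows = np.arange(1, 23)  # row index, standing in for camera distance

r, p = stats.pearsonr(rows, y)            # linearity check (paper reports r = 0.975)
slope, _ = np.polyfit(rows, y / y[0], 1)  # per-row growth of normalized cell length

# Row lengths of the new checkerboard: first row d, each subsequent row
# longer than the previous one by the fixed increment 0.262 * d.
d = 45.0  # mm, assumed first-row cell length
new_lengths = d * (1 + 0.262 * np.arange(22))
print(f"r = {r:.3f}, p = {p:.1e}, fitted slope = {slope:.3f}")
```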

While taking photos on the horizontal ground, due to perspective transformations, traditional corner detection algorithms, such as Harris and Stephens [27] and Shi [28], lack robustness. These methods also fail to detect corners when the smartphone rotates by a large angle. Therefore, we optimize Geiger’s corner detection method [29] to extract sub-pixel corners. The implementation process of the corner detection algorithm is shown in Fig. 4.

The algorithm does not require the size of the cells or the checkerboard when detecting corners, and it is robust enough to extract corners from images with high distortion. The corner extraction results are shown in Table 2.

Table 2.

| ID | Initial corner | Sub-pixel corner | Distance (mm) |
|---|---|---|---|
| 1 | (158, 3454) | (158.429, 3453.30) | 895 |
| 2 | (242, 3377) | (242.818, 3377.01) | 958 |
| 3 | (332, 3295) | (331.134, 3294.45) | 1034 |
| 4 | (418, 3216) | (418.893, 3215.49) | 1123 |
| 5 | (506, 3135) | (506.277, 3135.20) | 1186 |
| 6 | (590, 3057) | (589.468, 3057.80) | 1288 |
| 7 | (668, 2986) | (667.982, 2985.12) | 1403 |
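The optimized detector itself is not reproduced in the text, so the sketch below only illustrates the sub-pixel refinement stage using OpenCV's standard `cv2.cornerSubPix` (a stand-in, not the authors' implementation); the image path, the use of `goodFeaturesToTrack` for coarse candidates, and the refinement settings are all placeholders:

```python
import cv2
import numpy as np

gray = cv2.imread("checkerboard.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Coarse candidates comparable to the "Initial corner" column of Table 2;
# goodFeaturesToTrack stands in for the optimized detector of Fig. 4.
corners = cv2.goodFeaturesToTrack(gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=30)

# Iterative sub-pixel refinement ("Sub-pixel corner" column of Table 2).
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 40, 1e-3)
refined = cv2.cornerSubPix(gray, np.float32(corners), (11, 11), (-1, -1), criteria)
print(refined.reshape(-1, 2))  # e.g. (158.429, 3453.30) for corner 1
```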

Three smartphones (Xiaomi, Huawei, and iPhone) are selected to analyze the relationship among the actual imaging angle [TeX:] $$\alpha$$ of an object point, the ordinate v of the corresponding image point, and the rotation angle [TeX:] $$\beta$$ of the camera. The camera rotation angles are set to [TeX:] $$-10^{\circ}, 0^{\circ}, 10^{\circ}, 20^{\circ}, 30^{\circ},$$ respectively. The corner detection algorithm described in Section 3 is used to extract pixels, and SPSS version 22 is used for regression analysis on these data. The results are shown in Fig. 5: Fig. 5(a) shows the relationship between ordinate pixels and actual imaging angles for three different models of smartphones when [TeX:] $$\beta=10^{\circ};$$ Fig. 5(b) shows the relationship between ordinate pixel values and imaging angles for different camera rotation angles.

As can be seen from Fig. 5, the actual imaging angle of an object point decreases as the ordinate pixel of the corresponding image point increases, and the exact relationship between ordinate pixel and actual imaging angle differs across rotation angles and smartphones. Additionally, given the same abscissas, the ordinates of the image points are linearly related to their actual imaging angles, with [TeX:] $$p<0.01$$ and correlation coefficient [TeX:] $$r \geq 0.99.$$
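The reported linearity is easy to check numerically. A minimal sketch using scipy on the group [TeX:] $$I_{1}$$ measurements that appear later in Table 4:

```python
from scipy import stats

# Ordinate pixels v and actual imaging angles alpha (deg), group I1 of Table 4.
v = [4075.79, 3840.65, 3646.92, 3490.0, 3364.75, 3257.5]
alpha = [60.81, 64.38, 67.21, 69.51, 71.39, 72.97]

fit = stats.linregress(v, alpha)
print(f"alpha ~= {fit.slope:.5f} * v + {fit.intercept:.2f}  (r = {fit.rvalue:.4f})")
# The slope is negative: alpha decreases as v increases, as stated above.
```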

In photogrammetry, to determine the projection transition between the coordinate systems in the pinhole model, it is necessary to use camera parameters to construct a projective geometric model. We combine Zhang’s calibration method [30] with a camera calibration model that includes a nonlinear distortion term to calibrate the smartphone camera, which corrects nonlinear distortions and acquires the camera intrinsic parameters.

According to the pinhole camera imaging principle, image points satisfy the following relationship between the image plane coordinate system and the pixel coordinate system:

[TeX:] $$u=\frac{x}{d_{x}}+u_{0}, \quad v=\frac{y}{d_{y}}+v_{0}$$

where [TeX:] $$d_{x}, d_{y}(\text { unit: } \mathrm{mm})$$ denote the length and width of a pixel on the image plane, respectively. Since a pixel projected on the image plane is a rectangle, its length and width are not necessarily consistent, so [TeX:] $$d_{x}$$ is not equal to [TeX:] $$d_{y}.$$ [TeX:] $$\left(u_{0}, v_{0}\right)$$ denotes the origin o of the image plane coordinate system in the pixel coordinate system. In the camera coordinate system, a point [TeX:] $$P_{c}\left(X_{c}, Y_{c}, Z_{c}\right)$$ is projected onto the image coordinate system at [TeX:] $$(x, y, f).$$ The image plane is perpendicular to the optical axis, and the distance from the origin to the image plane is f. According to the principle of similar triangles, we get:

[TeX:] $$x=\frac{f X_{c}}{Z_{c}}, \quad y=\frac{f Y_{c}}{Z_{c}}$$

The transformation from the world coordinate system [TeX:] $$P_{W}\left(X_{W}, Y_{W}, Z_{W}\right)$$ to the camera coordinate system [TeX:] $$P_{c}$$ is a rigid body motion consisting of a rotation and a translation, so from the world coordinate system to the camera coordinate system:

[TeX:] $$P_{c}=\boldsymbol{R} P_{W}+\boldsymbol{T}$$

Combining Eqs. (6) to (8), the relationship among the coordinate systems can be expressed in homogeneous coordinates and matrix form as:

[TeX:] $$Z_{c}\left[\begin{array}{l}u \\ v \\ 1\end{array}\right]=\boldsymbol{M}_{\mathrm{int}} \boldsymbol{M}_{\mathrm{ext}}\left[\begin{array}{c}X_{W} \\ Y_{W} \\ Z_{W} \\ 1\end{array}\right]$$

where [TeX:] $$\boldsymbol{M}_{\mathrm{int}}$$ denotes the camera intrinsic parameter matrix and [TeX:] $$\boldsymbol{M}_{\mathrm{ext}}$$ denotes the extrinsic parameter matrix. The camera extrinsic parameters comprise the rotation matrix R and the translation vector T.
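A minimal numeric sketch of this projection (the MI 3 intrinsics from Section 5 are used; the extrinsics and the world point are placeholder values, not data from the paper):

```python
import numpy as np

fx, fy = 3486.5637, 3497.4652           # focal length in pixel units (f/dx, f/dy)
u0, v0 = 1569.0383, 2107.98988          # principal point
M_int = np.array([[fx, 0., u0],
                  [0., fy, v0],
                  [0., 0., 1.]])

R = np.eye(3)                           # rotation matrix (placeholder: identity)
T = np.zeros((3, 1))                    # translation vector (placeholder: zero)
M_ext = np.hstack([R, T])               # 3x4 extrinsic matrix [R | T]

P_w = np.array([100., 50., 2000., 1.])  # homogeneous world point, mm (assumed)
uvw = M_int @ M_ext @ P_w               # Z_c * (u, v, 1)
u, v = uvw[:2] / uvw[2]
print(u, v)
```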

For different smartphone models and camera rotation angles, the image points’ ordinates and the actual imaging angles of the corresponding object points are strongly negatively linearly related. Thus, we get:

[TeX:] $$\alpha=k v+b$$

The constant coefficients k and b are related to the camera rotation angle [TeX:] $$\beta.$$ The camera projection geometric model is shown in Fig. 6. As can be seen from Fig. 6, when an object point is projected at the bottom of the image, [TeX:] $$\alpha$$ takes the minimum value [TeX:] $$90^{\circ}-\theta-\beta,$$ while v takes the effective number of pixels in the column coordinates of the image sensor. Then, we have:

When [TeX:] $$\alpha_{\min }+2 \theta>90^{\circ},$$ the field of view (FOV) of the camera extends above the horizontal line; the projection geometry of the shot is shown in Fig. 6(a), [TeX:] $$\alpha$$ takes the maximum value [TeX:] $$90^{\circ},$$ and v approaches [TeX:] $$v_{0}-f \tan \beta.$$ If the camera rotates counterclockwise, [TeX:] $$\alpha$$ and v take the same values. Additionally, when [TeX:] $$\alpha_{\min }+2 \theta<90^{\circ},$$ the FOV is below the horizon; the projection geometry of the shot is shown in Fig. 6(b), the maximum actual imaging angle is [TeX:] $$\alpha_{\max }=90^{\circ}-\beta+\theta, \text { and } v=0.$$ Therefore, substituting into formula (10) results in:

According to the construction principle of the pinhole camera, the tangent of [TeX:] $$\theta$$ is equal to half the length of the camera CMOS or CCD image sensor [TeX:] $$L_{\mathrm{CMOS}}$$ divided by the camera focal length f. The physical units are converted into pixel units to calculate [TeX:] $$\theta$$:

Therefore, combining (5)~(8), [TeX:] $$F(\alpha, \beta)$$ can be obtained:

The imaging principle of a smartphone camera lens is pinhole imaging, in which the object point, the image point, and the camera optical center are collinear. However, because of manufacturing errors, the camera is not an ideal linear model, which leads to nonlinear distortion of the image; [TeX:] $$\delta$$ in Eq. (14) is the corresponding distortion parameter.

Then, the depth extraction model can be established by combining formulas (14) and (1):
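The combined formula is elided in this copy, but the model can be sketched from the boundary conditions above, under two stated assumptions: the distortion term [TeX:] $$\delta$$ is set to zero, and the ground geometry D = h·tan α used in experiment 1 below relates the imaging angle to depth. `imaging_angle` and `depth` are hypothetical helper names; with the MI 3 parameters of Section 5, this sketch reproduces the calculated angles of Table 4:

```python
import math

def imaging_angle(v, beta, fy, n_rows):
    """alpha = k*v + b, with k and b fixed by the boundary conditions:
    alpha = 90 - theta - beta at the image bottom (v = n_rows) and
    alpha = 90 + theta - beta at the image top (v = 0), where
    tan(theta) = (n_rows / 2) / fy in pixel units. Distortion delta omitted."""
    theta = math.degrees(math.atan((n_rows / 2) / fy))
    return (90 + theta - beta) - (2 * theta / n_rows) * v

def depth(v, beta, fy, n_rows, h):
    """Depth D of the ground point from camera height h: D = h * tan(alpha)."""
    return h * math.tan(math.radians(imaging_angle(v, beta, fy, n_rows)))
```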

Based on the depth of the target object derived above, we also need to calculate the vertical distance [TeX:] $$T_{x}$$ from the target object to the optical axis. Fig. 7 is a schematic diagram of the camera stereo imaging system, where point P denotes the camera position and line segment AB is parallel to the image plane. Let the coordinates of A be (X, Y, Z) in the camera coordinate system, and the coordinates of point B be [TeX:] $$\left(X+T_{x}, Y, Z\right).$$ Points A and B are projected onto the image plane at [TeX:] $$A^{\prime}\left(x_{l}, y_{l}\right) \text{ and } B^{\prime}\left(x_{r}, y_{r}\right).$$ According to formula (7):

Combining Eqs. (6) and (16), the horizontal parallax d of the two points [TeX:] $$A^{\prime} \text{ and } B^{\prime}$$ (which share the same Y and depth) can be expressed as:

Therefore, given the camera focal length f, the image center point [TeX:] $$\left(u_{0}, v_{0}\right),$$ the physical size [TeX:] $$d_{x}$$ of each pixel along the x-axis of the image plane, and the depth of the target object, the vertical distance [TeX:] $$T_{x}$$ from the target object to the optical axis can be calculated:

According to formula (3), we can obtain the distance L from the target object to the projection point of the camera:
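The two formulas are likewise elided in this copy. A sketch of both steps under the pinhole relations above (`lateral_offset` and `ground_distance` are hypothetical names; fx = f/dx is the focal length in pixel units):

```python
import math

def lateral_offset(u, u0, fx, D):
    """Vertical distance T_x from the target to the optical axis:
    from x = f*X/Z and u = x/dx + u0, T_x = (u - u0) * D / fx."""
    return (u - u0) * D / fx

def ground_distance(D, t_x):
    """Distance L from the target to the camera's ground projection point,
    by the Pythagorean theorem: L = sqrt(D**2 + T_x**2)."""
    return math.hypot(D, t_x)
```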

To verify the feasibility and accuracy of the passive ranging method, we conducted experiments using a Xiaomi 3 (MI 3) smartphone. Java combined with C++ was used to write a passive ranging application for smartphones. After the application was written and debugged according to the above method, the accuracy of the depth extraction model and of passive ranging were verified separately in the laboratory and in the natural environment.

The intrinsic parameters of the camera are [TeX:] $$f_{x}=3486.5637, u_{0}=1569.0383, f_{y}=3497.4652, v_{0}=2107.98988,$$ and the image resolution is [TeX:] $$3120 \times 4208.$$ Substituting these parameters into the model gives the camera-specific depth extraction model:
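The printed camera-specific formula is elided in this copy; plugging the parameters into the `imaging_angle`/`depth` sketch from Section 4 reproduces the calculated values of Table 4 up to rounding (group [TeX:] $$I_{1}$$: β = 0°, h = 305 mm; the 4208 ordinate pixels follow from the 3120 × 4208 resolution):

```python
fy, n_rows, h = 3497.4652, 4208, 305.0  # MI 3 parameters; experiment 1, group I1

for v in (4075.79, 3840.65, 3646.92):
    a = imaging_angle(v, 0.0, fy, n_rows)
    d = depth(v, 0.0, fy, n_rows, h)
    print(f"v = {v:7.2f}  alpha' = {a:5.2f} deg  D' = {d:6.1f} mm")
# Table 4, group I1 lists alpha' = 60.92, 64.39, 67.26 deg
# and D' = 548.50, 636.52, 727.71 mm for these three ordinates.
```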

Table 4.

| Group | Ordinate pixel v | Actual imaging angle [TeX:] $$\alpha\left(^{\circ}\right)$$ | Calculated imaging angle [TeX:] $$\alpha^{\prime}\left(^{\circ}\right)$$ | Angle error [TeX:] $$\left(^{\circ}\right)$$ | True depth D (mm) | Calculated depth [TeX:] $$D^{\prime}$$ (mm) | Depth error (mm) | Relative error (%) |
|---|---|---|---|---|---|---|---|---|
| [TeX:] $$I_{1}$$ | 4075.79 | 60.81 | 60.92 | 0.11 | 546 | 548.50 | 2.50 | 0.458 |
| | 3840.65 | 64.38 | 64.39 | 0.01 | 636 | 636.52 | 0.52 | 0.082 |
| | 3646.92 | 67.21 | 67.26 | 0.05 | 726 | 727.71 | 1.71 | 0.236 |
| | 3490 | 69.51 | 69.58 | 0.07 | 816 | 819.21 | 3.21 | 0.393 |
| | 3364.75 | 71.39 | 71.43 | 0.04 | 906 | 907.85 | 1.85 | 0.205 |
| | 3257.5 | 72.97 | 73.01 | 0.04 | 996 | 998.52 | 2.52 | 0.253 |
| [TeX:] $$I_{2}$$ | 3144.6 | 74.61 | 74.68 | 0.07 | 1035 | 1040.36 | 5.36 | 0.517 |
| | 3079.58 | 75.58 | 75.64 | 0.06 | 1108 | 1113.23 | 5.23 | 0.472 |
| | 3013.13 | 76.58 | 76.62 | 0.04 | 1194 | 1198.16 | 4.16 | 0.348 |
| | 2948.42 | 77.57 | 77.581 | 0.011 | 1293 | 1294.21 | 1.21 | 0.093 |
| | 2885.83 | 78.22 | 78.50 | 0.28 | 1366 | 1400.82 | 34.82 | 2.549 |
| | 2827.4 | 79.09 | 79.36 | 0.27 | 1478 | 1517.03 | 39.03 | 2.640 |
| | 2772.96 | 79.92 | 80.17 | 0.25 | 1603 | 1644.84 | 41.84 | 2.610 |
| | 2722.87 | 80.71 | 80.91 | 0.2 | 1741 | 1781.31 | 40.31 | 2.315 |
| | 2676.69 | 81.44 | 81.59 | 0.15 | 1892 | 1927.69 | 35.69 | 1.886 |
| | 2635.39 | 82.11 | 82.20 | 0.09 | 2056 | 2080.55 | 24.55 | 1.194 |
| | 2597.41 | 82.73 | 82.76 | 0.03 | 2233 | 2243.41 | 10.41 | 0.466 |
| | 2562.52 | 83.30 | 83.28 | 0.02 | 2423 | 2418.80 | 4.20 | 0.173 |
| | 2530.99 | 83.80 | 83.75 | 0.05 | 2626 | 2602.32 | 23.68 | 0.902 |

In experiment 1, the camera rotation angle [TeX:] $$\beta$$ was [TeX:] $$0^{\circ}.$$ In group [TeX:] $$I_{1},$$ the height of the camera [TeX:] $$h_{1}$$ was 305 mm; in group [TeX:] $$I_{2},$$ the height [TeX:] $$h_{2}$$ was 285 mm. The corner pixels were extracted, and their actual imaging angles and depths were calculated from the depth extraction model and the ordinate pixels. The experimental data are shown in Table 4. The true depth was measured with a tape. The actual imaging angle of a corner can be calculated from its tangent, which equals the actual depth divided by the camera height. The relative error was obtained by dividing the absolute error (the difference between the calculated depth and the true depth) by the true depth.

From Table 4, we can conclude that the relative error of the depth calculated by the depth extraction model does not exceed 3%, and the average relative error is 0.93% when the distance ranges from 0.5 to 2.6 m. The errors of the extracted depth may be related to many factors, such as the accuracy of the image processing algorithm and different lighting conditions. In addition, due to the nonlinear distortion of the camera lens, the closer the target object is to the optical axis of the camera, the smaller the image distortion error and the more accurate the measurement, and vice versa. Nevertheless, Table 4 shows that within a certain range the measurement error is random and acceptable for our subsequent tree DBH and height measurements.

In experiment 2, the camera rotation angles of experimental groups [TeX:] $$I_{1}, I_{2}, I_{3}, I_{4}, I_{5} \text { were }-10^{\circ}, 0^{\circ}, 10^{\circ}, 20^{\circ}, \text { and } 30^{\circ},$$ respectively, and the height of the camera [TeX:] $$h_{1}$$ was 408 mm. We also calculated the root mean square of the relative error (rRMS) of the depth D, the vertical distance [TeX:] $$T_{x},$$ and the distance L under different camera rotation angles. The experimental data are shown in Table 5.
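A minimal sketch of the rRMS statistic, illustrated with the group [TeX:] $$I_{1}$$ depths of Table 4 (`rrms` is a hypothetical helper name):

```python
import numpy as np

def rrms(true_vals, calc_vals):
    """Root mean square of the relative errors (calc - true) / true."""
    t = np.asarray(true_vals, dtype=float)
    c = np.asarray(calc_vals, dtype=float)
    return float(np.sqrt(np.mean(((c - t) / t) ** 2)))

print(rrms([546, 636, 726, 816, 906, 996],
           [548.50, 636.52, 727.71, 819.21, 907.85, 998.52]))
```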

Table 5.

| Rotation angle | rRMS of D | rRMS of [TeX:] $$T_{x}$$ | rRMS of L |
|---|---|---|---|
| [TeX:] $$-10^{\circ}$$ | 0.0319 | 0.0362 | 0.0323 |
| [TeX:] $$0^{\circ}$$ | 0.0179 | 0.0207 | 0.0176 |
| [TeX:] $$10^{\circ}$$ | 0.0199 | 0.0280 | 0.0205 |
| [TeX:] $$20^{\circ}$$ | 0.0280 | 0.0331 | 0.0291 |
| [TeX:] $$30^{\circ}$$ | 0.1241 | 0.1351 | 0.1253 |

The experimental results show that when the camera rotated counterclockwise, the relative error RMS of the depth D, the vertical distance [TeX:] $$T_{x},$$ and the distance L was relatively large; otherwise it was smaller. This is because once the smartphone camera is rotated counterclockwise, the imaged ground region moves away from the centre of the image and closer to the bottom, where the nonlinear distortion is larger. Rotating the smartphone clockwise when collecting images therefore helps improve measurement accuracy. The measurement error was also affected by the camera height, the accuracy of the camera intrinsic parameters, and other factors.

To verify the accuracy of the passive ranging method in natural scenes, we took five images with the smartphone camera, each containing three target objects. In experiment 3, the camera rotation angle was [TeX:] $$0^{\circ}$$ and the height of the camera h was 1285 mm. The experimental data are shown in Table 6. The results show that the relative errors of this method were no more than 6%, and its average relative error was 1.71% within a range of 3–10 m. Sheng et al. [31] developed an underwater binocular vision ranging system with an average relative error of 2.34%, and Zou and Yuan [32] achieved a relative error of less than 10% for passive ranging based on monocular vision. Therefore, compared with other passive ranging methods based on machine vision, this method has relatively high measurement accuracy. It is not as accurate as the method reported by Huang et al. [25] (relative error less than 3%); however, unlike that method, ours does not need to fit a linear relation for every camera model, camera rotation angle, or camera height.

The accuracy of this passive ranging method may be directly determined by the accuracy of the depth extraction model and of the [TeX:] $$T_{x}$$ measurement.

Table 6.

| Group | True distance L (mm) | Pixel | Calculated distance [TeX:] $$L^{\prime}$$ (mm) | Absolute error (mm) | Relative error (%) |
|---|---|---|---|---|---|
| [TeX:] $$I_{1}$$ | 2609 | (3921.23, 1338) | 2576.28 | 32.72 | 1.25 |
| | 4977 | (3116.51, 2215.3) | 4949.51 | 27.49 | 0.55 |
| | 6000 | (2947.6, 1008.67) | 5956.71 | 43.29 | 0.72 |
| [TeX:] $$I_{2}$$ | 6000 | (2947.6, 1008.67) | 5956.71 | 43.29 | 0.72 |
| | 4010 | (3339.2, 1472.43) | 3946.42 | 63.58 | 1.59 |
| | 10320 | (2584.7, 580.25) | 10425.44 | 516.58 | 5.01 |
| [TeX:] $$I_{3}$$ | 5002 | (3112, 1097.92) | 4933.8 | 68.20 | 1.36 |
| | 7720 | (2761.7, 1502.4) | 7590.05 | 129.95 | 1.68 |
| | 8000 | (3762.32, 789.55) | 7768.54 | 231.46 | 2.89 |
| [TeX:] $$I_{4}$$ | 3617 | (3477.88, 1049.93) | 3556.47 | 60.53 | 1.67 |
| | 5214 | (3056, 1473.02) | 5191.46 | 22.54 | 0.43 |
| | 7000 | (2849.52, 692.23) | 6885.23 | 114.78 | 1.64 |
| [TeX:] $$I_{5}$$ | 3215 | (3614.7, 591.6) | 3292.41 | 77.41 | 2.41 |
| | 4500 | (3214.54, 1489.1) | 4417.75 | 82.25 | 1.83 |
| | 5207 | (3057.1, 896) | 5278.95 | 71.95 | 1.38 |

In this paper, we present a depth extraction model and a passive ranging method based on a smartphone monocular vision system. First, we use an optimized corner extraction algorithm to detect and extract the sub-pixel corners of a checkerboard with fixed width and increasing length, and investigate the linear relationship between the actual imaging angle of an object point and the ordinate pixel of the corresponding image point under different camera rotation angles. It is verified that, given the same abscissas, the ordinates of the image points are linearly related to their actual imaging angles [TeX:] $$(p<0.01, r \geq 0.99).$$ Therefore, by assuming a linear function and substituting the actual imaging angles and ordinate pixels of the special conjugate points (maximum and minimum values) into the linear relationship function, we establish a depth extraction model suitable for a variety of smartphones. Moreover, an improved camera calibration model with a nonlinear distortion term is used to obtain the distortion parameters and intrinsic parameters of the camera, and the intrinsic parameters are used to calculate the depth of the target object. According to the principle of the camera stereo imaging system, we calculate the vertical distance from the target object to the camera optical axis and obtain the distance by the Pythagorean theorem. To verify the accuracy of the model, we conduct two sets of experiments covering close and long-distance ranging in the laboratory and the natural environment. The experimental results show that the average relative error of depth measurement is 0.937% within 0.5–2.6 m, and the relative error of distance measurement is 1.71% within 3–10 m. Therefore, this method measures distance with high accuracy.

Compared with other passive ranging methods, this method uses a smartphone to measure distance and extract depth, which is convenient, portable, and easy to use in daily practice. It requires no large calibration site and avoids errors caused by data fitting. In addition, the intrinsic parameters need to be obtained by camera calibration only once; afterwards, the distance from the target object to the camera can be calculated from a single image, without any calibration objects or objects of known dimensions placed in the measuring scene. However, when the photographed target is far away from the camera, perspective transformations reduce the detection accuracy of its contour, which may also affect the measurement accuracy. To solve this problem, our next step is to use deep learning methods to detect and extract more precise object contours. Moreover, this technique can serve as the basis for measuring an object's height and width, so in future work we will also use it to measure the 3D information of objects.


**Xin-mei Wu** received her B.S. degree in GIS from Anhui Science and Technology University in 2015. She is currently a master's degree candidate at Zhejiang Agriculture and Forestry University in Zhejiang, China. Her main research covers machine vision and close-range photogrammetry.

**Fang-li Guan** was born in 1992 in Zhejiang Province, China. He received his master's degree from Zhejiang Agriculture and Forestry University in 2018. He is currently a Ph.D. candidate at the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China. His main research covers computer vision and smart navigation for pedestrians.

**Ai-jun Xu** was born in 1976 in Anhui Province, China. He received his Ph.D. degree in photogrammetry and remote sensing from Wuhan University in 2007. He is currently a Professor in the School of Information Engineering, Zhejiang Agriculture and Forestry University, Hangzhou, China. His current research interests include computer application technology and the application of GIS in agricultural informatization.

- 1 B. Hou, B. Khanal, A. Alansary, S. McDonagh, A. Davidson, M. Rutherford, J. V. Hajnal, D. Rueckert, B. Glocker, and B. Kainz, "3-D reconstruction in canonical co-ordinate space from arbitrarily oriented 2-D images," *IEEE Transactions on Medical Imaging*, vol. 37, no. 8, pp. 1737-1750, 2018. doi: 10.1109/TMI.2018.2798801
- 2 M. Waechter, M. Beljan, S. Fuhrmann, N. Moehrle, J. Kopf, and M. Goesele, "Virtual rephotography: novel view prediction error for 3D reconstruction," *ACM Transactions on Graphics*, vol. 36, no. 1, 2017. doi: 10.1145/3072959.3126787
- 3 Y. I. Abdel-Aziz, H. M. Karara, and M. Hauck, "Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry," *Photogrammetric Engineering & Remote Sensing*, vol. 81, no. 2, pp. 103-107, 2015.
- 4 D. L. McKay, M. R. Wohlers, C. K. Chuang, J. S. Draper, and J. Walker, "Airborne validation of an IR passive TBM ranging sensor," in *Proceedings of SPIE 3698: Infrared Technology and Applications XXV*. Bellingham, WA: International Society for Optics and Photonics, 1999, pp. 491-500.
- 5 R. C. Bradshaw, D. P. Schmidt, J. R. Rogers, K. F. Kelton, and R. W. Hyers, "Machine vision for high-precision volume measurement applied to levitated containerless material processing," *Review of Scientific Instruments*, vol. 76, no. 12, 2015.
- 6 M. Aki, T. Rojanaarpa, K. Nakano, Y. Suda, N. Takasuka, T. Isogai, and T. Kawai, "Road surface recognition using laser radar for automatic platooning," *IEEE Transactions on Intelligent Transportation Systems*, vol. 17, no. 10, pp. 2800-2810, 2016. doi: 10.1109/TITS.2016.2528892
- 7 R. Bajcsy, "Active perception vs. passive perception," in *Proceedings of the 3rd Workshop on Computer Vision: Representation and Control*, Bellaire, MI, 1985, pp. 55-59.
- 8 H. Zhang, H. Wei, H. Yang, and Y. Li, "Active laser ranging with frequency transfer using frequency comb," *Applied Physics Letters*, vol. 108, no. 18, 2016.
- 9 J. Yang, C. Yang, J. Liu, N. Zhu, L. Yu, and Y. Liu, "Visual passive ranging system based on target feature size," *Optics and Precision Engineering*, vol. 26, no. 1, pp. 245-252, 2018.
- 10 F. Lin, X. Dong, B. M. Chen, K. Y. Lum, and T. H. Lee, "A robust real-time embedded vision system on an unmanned rotorcraft for ground target following," *IEEE Transactions on Industrial Electronics*, vol. 59, no. 2, pp. 1038-1049, 2011. doi: 10.1109/TIE.2011.2161248
- 11 J. Sun, G. Sun, P. Ma, T. Dong, and Y. Yang, "Laser target localization based on symmetrical wavelet denoising and asymmetric Gauss fitting," *Chinese Journal of Lasers*, vol. 44, no. 6, 2017.
- 12 R. Y. Takimoto, M. S. G. Tsuzuki, R. Vogelaar, T. de Castro Martins, A. K. Sato, Y. Iwao, T. Gotoh, and S. Kagei, "3D reconstruction and multiple point cloud registration using a low precision RGB-D sensor," *Mechatronics*, vol. 35, pp. 11-22, 2016.
- 13 J. Shi, Y. Li, G. Qi, and A. Sheng, "Machine vision based passive tracking algorithm with intermittent observations," *Journal of Huazhong University of Science and Technology (Natural Science Edition)*, vol. 45, no. 6, pp. 33-37, 2017.
- 14 C. Xu, D. Huang, and F. Kong, "Small UAV passive target localization approach and accuracy analysis," *Chinese Journal of Scientific Instrument*, vol. 36, no. 5, pp. 1115-1122, 2015.
- 15 J. Mei, D. Zhang, and Y. Ding, "Monocular vision for pose estimation in space based on cone projection," *Optical Engineering*, vol. 56, no. 10, 2017.
- 16 R. Szeliski, *Computer Vision: Algorithms and Applications*. New York, NY: Springer, 2010.
- 17 A. Ming, T. Wu, J. Ma, F. Sun, and Y. Zhou, "Monocular depth-ordering reasoning with occlusion edge detection and couple layers inference," *IEEE Intelligent Systems*, vol. 31, no. 2, pp. 54-65, 2015. doi: 10.1109/MIS.2015.94
- 18 E. Alexander, Q. Guo, S. Koppal, S. J. Gortler, and T. Zickler, "Focal flow: velocity and depth from differential defocus through motion," *International Journal of Computer Vision*, vol. 126, pp. 1062-1083, 2018. doi: 10.1007/s11263-017-1051-5
- 19 C. S. Royden, D. Parsons, and J. Travatello, "The effect of monocular depth cues on the detection of moving objects by moving observers," *Vision Research*, vol. 124, pp. 7-14, 2016.
- 20 T. Liu, Y. Mo, G. Xu, X. Dai, X. Zhu, and J. Lu, "Depth estimation of monocular video using non-parametric fusion of multiple cues," *Journal of Southeast University (Natural Science Edition)*, vol. 45, no. 5, pp. 834-839, 2015.
- 21 Y. Seo, A. Heyden, and R. Cipolla, "A linear iterative method for auto-calibration using the DAC equation," in *Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition*, Kauai, HI, 2001.
- 22 G. Wu and Z. Tang, "Distance measurement in visual navigation of monocular autonomous robots," *Jiqiren (Robot)*, vol. 32, no. 6, pp. 828-832, 2010.
- 23 J. Heikkila and O. Silven, "A four-step camera calibration procedure with implicit image correction," in *Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition*, San Juan, Puerto Rico, 1997, pp. 1106-1112.
- 24 C. Wu, C. Lin, and C. Lee, "Applying a functional neurofuzzy network to real-time lane detection and front-vehicle distance measurement," *IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)*, vol. 42, no. 4, pp. 577-589, 2012. doi: 10.1109/TSMCC.2011.2166067
- 25 X. Huang, F. Gao, G. Xu, N. Ding, and L. Xing, "Depth information extraction of on-board monocular vision based on a single vertical target image," *Journal of Beijing University of Aeronautics and Astronautics*, vol. 41, no. 4, pp. 649-655, 2015.
- 26 D. Scaramuzza, A. Martinelli, and R. Siegwart, "A toolbox for easily calibrating omnidirectional cameras," in *Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems*, Beijing, China, 2006, pp. 5695-5701.
- 27 C. Harris and M. Stephens, "A combined corner and edge detector," in *Proceedings of the Alvey Vision Conference*, Manchester, UK, 1988, pp. 147-151.
- 28 J. Shi and C. Tomasi, "Good features to track," in *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition*, Seattle, WA, 1994, pp. 593-600.
- 29 A. Geiger, F. Moosmann, O. Car, and B. Schuster, "Automatic camera and range sensor calibration using a single shot," in *Proceedings of the 2012 IEEE International Conference on Robotics and Automation*, Saint Paul, MN, 2012, pp. 3936-3943.
- 30 Z. Zhang, "Flexible camera calibration by viewing a plane from unknown orientations," in *Proceedings of the 7th IEEE International Conference on Computer Vision*, Kerkyra, Greece, 1999, pp. 666-673.
- 31 M. Sheng, H. Zhou, H. Huang, and H. Qin, "Study on an underwater binocular vision ranging method," *Journal of Huazhong University of Science and Technology (Natural Science Edition)*, vol. 46, no. 8, pp. 93-98, 2018.
- 32 B. Zou and Y. Yuan, "High precision distance measurement based on monocular vision for intelligent traffic," *Journal of Transportation Systems Engineering and Information Technology*, vol. 18, no. 4, pp. 46-53, 60, 2018.