Plotting 3D Coordinates Using Microsoft Kinect

Rebecca Cooper and Nick Westveer
Department of Computer Science
Appalachian State University Research Experience for Teachers Program
National Science Foundation

Abstract – The goal of this research is to use the Microsoft Kinect for Windows along with the SkeletonBasics-WPF software to determine the Cartesian coordinates of the eight corner points of a rectangular prism. The experimental results were compared to the known coordinates and lengths. A procedure was developed to collect, analyze and report data for these points. Volumes were calculated for each rectangular prism and compared to measured volumes to determine accuracy.

Keywords – Kinect, IR imaging, coordinates, volume, SkeletonBasics-WPF

I. INTRODUCTION

Accuracy and precision in measurement are important for many fields of human endeavor. New technologies for measuring distances and spatial relationships help with engineering, design, production and architecture. These technologies can also be used for entertainment purposes and other consumer electronics needs. The Microsoft Kinect Sensor for Windows and the associated SkeletonBasics-WPF program create coordinate data points for an object. These coordinates were checked for accuracy by a variety of methods throughout the research. SkeletonBasics-WPF uses the infrared (IR) sensing components of the Kinect to determine the Cartesian coordinates of a human being measuring an object. As part of the research, a procedure was developed for measuring the coordinates of a rectangular prism with the Kinect. The data from the Kinect was compared to data measured by other means to statistically determine the accuracy and range of the Kinect and the software.

II. INFORMATION ABOUT THE KINECT

The Kinect Sensor bar, Figure 1, has multiple sensors for gathering information. The arrangement of the sensors within the Kinect Sensor bar can be seen in Figure 2. The Kinect gathers information from an RGB video camera in the center, an IR projector and camera on either side of the RGB camera, and a series of microphones. Each portion of the sensor bar serves a unique purpose in gathering information for the Kinect. The IR projector and camera are primarily used for depth information, while the RGB video camera provides images which can be used in color or converted to grayscale. [1]

Figure 1 - Kinect Sensor Bar Exterior View
Figure 2 - Kinect Sensor Bar Interior View

The microphone array picks up sounds from various points within the room and keeps track of where those sounds are coming from. This is a very helpful feature when the Kinect is being used for a multi-player game. This research focuses on the IR projector and camera. The IR dots emitted from the projector can be seen using a night-vision camera as in Figure 3. The Kinect sends out the IR dot array from the IR emitter, and the IR camera then reads back the reflection of these dots to capture depth data. The IR camera reads 30 frames per second at 16 bits per pixel with a resolution of 640x480. This allows the depth to be calculated accurately to at least 1 mm. The Microsoft sensor specifications state that the viewing angle is 43° vertical and 57° horizontal. [2, 3]

Figure 3 - Kinect IR dot spectrum as shown by a night vision camera
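The published viewing angles above determine how large an area the sensor can see at a given depth. The short Python sketch below estimates that coverage under a simple pinhole-camera assumption; the function name and the 2.0 m example distance are illustrative and not part of the original study.

    import math

    # Published Kinect viewing angles (Section II): 57 degrees horizontal, 43 degrees vertical.
    H_ANGLE_DEG = 57.0
    V_ANGLE_DEG = 43.0

    def field_of_view_size(z_m, h_angle_deg=H_ANGLE_DEG, v_angle_deg=V_ANGLE_DEG):
        """Approximate (width, height) in meters covered by the sensor at depth z_m,
        treating the Kinect as a simple pinhole camera."""
        width = 2.0 * z_m * math.tan(math.radians(h_angle_deg / 2.0))
        height = 2.0 * z_m * math.tan(math.radians(v_angle_deg / 2.0))
        return width, height

    # Example: coverage at the 2.0 m baseline distance used later in this research.
    w, h = field_of_view_size(2.0)
    print("Coverage at 2.0 m: %.2f m wide x %.2f m tall" % (w, h))

At 2.0 m this works out to roughly 2.2 m of horizontal and 1.6 m of vertical coverage, which is consistent with the range observations in Section IV.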
III. INFORMATION ABOUT THE SOFTWARE

The version of the SkeletonBasics-WPF software used for this research was developed by Luke Rice, a computer science graduate student at Appalachian State University. Data collection started with the full skeleton program, which provides 20 data points. Our research was narrowed to only one of the twenty points, the right hand. The program creates the Cartesian coordinate points by sensing body position and estimating the correct position for the 20 points it needs to create the skeleton. As the research progressed, some areas of weakness were found in the program that could be revised to streamline data collection. The latest version of the SkeletonBasics-WPF program has radio buttons for a choice of right or left hand and only outputs the data for that particular hand; the right-hand radio button is the default setting. The updated version of the software displays real-time Cartesian coordinates on the screen so the data collector can easily position themselves and the object at the correct coordinate location. The new software also removed an extra image that was not needed for the research and centered the skeleton in the screen to resolve range issues. Figure 4 shows a comparison of the original program to the revised version.

Figure 4 - SkeletonBasics-WPF. Left: Original Version. Right: Updated Version

In both versions of the software, not all 20 skeleton points have to be present on the data collector's skeleton. It is imperative, however, that the hand being used, as well as the corresponding wrist and elbow, are clearly distinguished with green dots. When the program registered these points clearly, the hand position of the generated skeleton was at the base of the ring finger for an open palm and at the first knuckle of the ring finger for a closed fist.

IV. RANGE AND ACCURACY WITHIN THE RANGE

Using the SkeletonBasics-WPF program, data was obtained for the Cartesian coordinates of the 20 points on the skeletal structure. To make accurate measurements with the Kinect, the acceptable range of Cartesian coordinate values had to be established. There are two ways to find the range of coordinates for the IR grid of the Kinect Sensor. Our experimental method used the SkeletonBasics-WPF program. Measurements were taken from the right- and left-hand data at various Z distances from the Kinect. At each Z point the right or left hand was placed as far toward the edge of the screen as the software would allow. With the left hand, a red line appeared as the subject left the edge of the visualization screen; for the right hand, however, a red line never appeared. This data was used to form the first right/left grid of range. Because no consistent data was produced on the right side, a change to the software needed to be made. The skeleton was not centered in the original version of the program, which was causing the inconsistent data. In the newer version of SkeletonBasics-WPF the skeleton was centered, and new data was collected for the right-hand range. With this data the angles were found to be nearly the same on both sides: 30° on the right and 29° on the left, as seen in Figure 5.

Figure 5 - MS Excel graph showing X coordinate range at various Z values

The Y values for the grid were more complicated to find experimentally. There seemed to be interference from the floor, causing inconsistent values for the negative Y range. A process similar to the one used for the X range was used to find the maximum Y range at various distances Z. This angle was experimentally determined to be 23°. To further interpret the data, image processing was used to find the angles of the IR dot pattern emitted by the Kinect. Using ImageJ to analyze the IR dot pattern, both the positive and negative Y ranges were found to have an angle of 25°. The same process was used to confirm the data for the X range, with an angle of 30° in both the right and left directions. The Kinect for XBOX cannot detect objects closer than 80 cm. The Kinect for Windows does better and can detect objects beyond 40 cm. The maximum Z distance for either version of the Kinect is 4.00 m [1]. The data collected shows that the Kinect loses accuracy beyond 3.50 m.
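The horizontal range analysis behind Figure 5 amounts to fitting a half-angle to pairs of Z distance and farthest reachable X coordinate. The sketch below shows one minimal way to recover that angle; the (Z, max X) pairs are hypothetical stand-ins for the measured data, which is not reproduced here.

    import math

    # Hypothetical (Z, max X) pairs in meters, standing in for the kind of data
    # plotted in Figure 5; the actual measurements are not reproduced here.
    samples = [(1.5, 0.86), (2.0, 1.16), (2.5, 1.44), (3.0, 1.73)]

    # Each pair gives one estimate of the horizontal half-angle: angle = atan(max_x / z).
    angles_deg = [math.degrees(math.atan(x / z)) for z, x in samples]
    half_angle = sum(angles_deg) / len(angles_deg)

    print("Half-angle estimates:", ["%.1f" % a for a in angles_deg])
    print("Mean horizontal half-angle: %.1f degrees" % half_angle)

With data like the above, the estimates cluster near 30°, matching the experimentally determined right-hand range angle.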
V. PROCEDURE FOR MEASURING OBJECTS USING THE SKELETON PROGRAM

An experimental setup was needed to create consistent data. The Kinect Sensor bar was placed on a level surface on the same plane as the object being measured. This minimized any error associated with the angles from the sensor to the object being measured. The Kinect Explorer-WPF program from the SDK Developer Toolkit was run to calibrate the vertical tilt of the sensor. Other objects the sensor recognized as skeletal were removed from the field of view to avoid interference with the data.

For the baseline data set, the Cartesian coordinates of an object were measured two meters from the Kinect Sensor bar. The same object was then measured at various distances between 1.5 and 3.0 meters from the sensor. These placements were verified by measuring the object's distance to the Kinect Sensor bar using a meter stick, measuring tape and level. This method was also used to verify the X and Y coordinates of the objects and the Kinect sensor. The data given by the SkeletonBasics-WPF program was generated in meters, allowing quick verification of the Kinect Sensor bar coordinate data.

To verify the reproducibility of the method, the data collector and computer operator switched roles throughout the data collection process. The computer operator started and stopped the programs being used, named the data sets, and manipulated the data in MS Excel. When the SkeletonBasics-WPF program was opened by the computer operator, the data collector positioned their body in front of the sensor bar until it recognized the body frame and generated a skeleton.

Different techniques were used to measure the Cartesian coordinates of an object. Each corner of a rectangular prism was measured by the following: full hands on the corners of the object, one finger on each corner, a ruler at each corner, a level at each corner, a metal rod from a ring stand at each corner, and a wooden dowel rod at each corner. The coordinate point the program generated for the hand while using these different techniques was inconsistent from trial to trial. Experimentally, the best reproducibility was obtained using a wooden dowel rod one centimeter in diameter with a marked hand position. It was important to keep the hand position higher than the object; otherwise the hand was obscured by the object, causing the data to be inaccurate. The SkeletonBasics-WPF program worked best when there was only one skeleton in the data collection window. At times the program would recognize a background object such as a chair or table as a second skeleton, and the data would come from its perceived hand. Objects with greater mass, or objects secured to the table, did not move during the measuring process, producing more consistent results.
VI. MEASURING A RECTANGULAR PRISM USING THE KINECT

Using the Cartesian coordinate data, the length, width, height and volume of rectangular prisms could be calculated. To ensure consistency between trials, eight data points were labeled, one at each corner of the object. These data points were labeled as snapshots 0-7 by the SkeletonBasics-WPF program. All of the data collected followed the pattern shown in Figure 6. The side of the prism facing the sensor bar was always labeled with the upper right corner as position zero.

Figure 6 - Box 2 with corners labeled for data entry positions

VII. MANIPULATION OF DATA USING EXCEL

To ease manipulation of the data produced, a spreadsheet was created in MS Excel. The spreadsheet pulls out only the points needed and then calculates the distances between them in order to form the edges of the rectangular prism. This was especially important when using the original SkeletonBasics-WPF program, which included all 20 points from the skeleton rather than just the right hand output by the updated version.

The SkeletonBasics-WPF program exports the collected data as a .csv file, which can be manipulated in the same fashion as an .xlsx file. With an .xlsx file, data can be saved with functions, graphs and charts, so it is important to reformat the data from .csv to .xlsx. The primary focus of the data collection was recreating a rectangular prism object, so it was important to have the spreadsheet calculate the length, width and height of each of the boxes used for measurement. For this research it was also important to compare the volume from the Kinect data with the measured volume. The spreadsheet allowed data to be transferred directly from the collection sheet to a formula sheet where calculations were automated. These automated calculations were completed for the length, width, height, volume, and averages. Calculations for percentage error were also run from this data. During data analysis, MS Excel was used to calculate standard deviations and create bell curves to test the accuracy of the results.
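The spreadsheet's edge and volume calculations reduce to Euclidean distances between corner coordinates. The Python sketch below illustrates the same computation for the snapshot 0-7 labeling described in Section VI; the specific corner-index pairings and the sample coordinates are assumptions for illustration, since Figure 6 only fixes position zero as the upper right corner of the face toward the sensor.

    import math

    def dist(p, q):
        """Euclidean distance between two (x, y, z) points in meters."""
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    def box_dimensions(corners):
        """Estimate length, height and depth (in cm) from the eight corner points.

        `corners` maps snapshot index 0-7 to an (x, y, z) tuple in meters.
        The index pairs below assume 0-3 lie on the face toward the sensor and
        4-7 on the far face; adjust them to match the actual data-entry pattern."""
        length = dist(corners[0], corners[1]) * 100.0   # across the front face
        height = dist(corners[0], corners[3]) * 100.0   # down the front face
        depth  = dist(corners[0], corners[4]) * 100.0   # front face to back face
        return length, height, depth

    # Hypothetical coordinates (meters) for a roughly 20 x 15 x 16 cm box at Z of about 2 m.
    corners = {
        0: (0.10, 0.15, 2.00), 1: (-0.10, 0.15, 2.00),
        2: (-0.10, 0.00, 2.00), 3: (0.10, 0.00, 2.00),
        4: (0.10, 0.15, 2.16), 5: (-0.10, 0.15, 2.16),
        6: (-0.10, 0.00, 2.16), 7: (0.10, 0.00, 2.16),
    }
    length, height, depth = box_dimensions(corners)
    print("Volume: %.0f cm^3" % (length * height * depth))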
VIII. RESULTS

Data collected during this experiment was generated from the eight corner point coordinates of a rectangular prism. Data was collected with the rectangular prisms at various distances Z from the Kinect and at various angles right, left, up and down from the center point of the Kinect. The first results are for the rectangular prism at a distance of Z = 2.0 m, centered at Y = 0 and X = 0. Using the coordinate points for each corner, the length, width and height were found. Each of these distances was used to calculate the volume of Box 1, giving 60 values for the volume. The mean for this data was 4766 cm³ with a standard deviation of 888 cm³. Of the 60 volume calculations, only seven fell outside of three standard deviations.

Because of these experimental outliers, the data was analyzed to find which measurements were most and least accurate. Several different methods were used to find the sources of error in the measurements. One method was to obtain the actual Cartesian coordinates of the rectangular prism and compare those to the values supplied by the Kinect. The R² value when comparing the X value from the Kinect to the actual X value was 0.9976, which is nearly a perfect relationship. The X coordinate was the most accurate, followed by the Y coordinate and then the Z coordinate, as shown in Table 1. The same experiment was conducted at a Z value of 1 m to compare the accuracy at 1 m versus 2 m. The 1 m data was less accurate than the 2 m data, as seen in Table 2.

Table 1 - Comparison of Cartesian Coordinates at 2 m

  X from Kinect   Actual X   Y from Kinect   Actual Y   Z from Kinect   Actual Z
      -0.037610       0         0.170134      0.175        2.028115      2.002
      -0.034080       0         0.172685      0.186        2.171083      2.147
      -0.023340       0        -0.007110      0            2.010822      2
      -0.022570       0         0.000754      0            2.153872      2.138
       0.185914       0.21      0.178663      0.188        2.010463      2.021
       0.188375       0.21      0.179755      0.19         2.153027      2.157
       0.196345       0.21      0.009353      0            2.002436      2.011
       0.189977       0.21      0.013758      0            2.149777      2.15

  R² value for X: 0.9976    R² value for Y: 0.9951    R² value for Z: 0.9653

Table 2 - Comparison of Cartesian Coordinates at 1 m

  X from Kinect   Actual X   Y from Kinect   Actual Y   Z from Kinect   Actual Z
       0.072403       0         0.332292      0.099        0.927183      0.94
       0.061276       0         0.353811      0.098        1.08623       1.0747
       0.072607       0         0.152707     -0.0712       0.935985      0.9357
       0.044526       0         0.163073     -0.076        1.080635      1.0765
       0.226747       0.198     0.309478      0.0812       1.00139       0.9738
       0.224796       0.1981    0.304645      0.0663       1.135454      1.109
       0.281194       0.198     0.18036      -0.0612       0.949457      0.9731
       0.277533       0.1979    0.186741     -0.0509       1.073438      1.1079

  R² value for X: 0.9548    R² value for Y: 0.9879    R² value for Z: 0.9226

From the comparison of measured versus actual coordinates, it was determined that the Kinect was least accurate in the Z direction. The Kinect was also more accurate for all coordinates at Z = 2 m than at Z = 1 m. This can be verified by looking at the standard deviations for each Cartesian coordinate. For the Z value, all of the data collected for the set of coordinates at 2 m fell within one standard deviation, as seen in Figure 8.

Figure 8 - Standard deviation for Z values of 2 m with various X and Y coordinates
Figure 9 - Standard deviation for X values at various distances Z
Figure 10 - Standard deviation for Y values

To verify the volume data, the length, depth and height calculations were compared; the most error was found in depth and the least in length, which is supported by the data above. Similar data was collected for two other rectangular prisms of different dimensions using the same procedure and data manipulation in MS Excel.
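The comparisons in this section can be reproduced outside Excel. The NumPy sketch below is a minimal version, assuming the reported R² values are squared Pearson correlations (what Excel's RSQ returns); it takes the X columns of Table 1 and the Box 1 volume figures quoted in the text as inputs, and includes a helper mirroring the three-standard-deviation screening described above.

    import numpy as np

    # Kinect X vs. tape-measured X from Table 1 (2 m data set).
    kinect_x = np.array([-0.037610, -0.034080, -0.023340, -0.022570,
                          0.185914,  0.188375,  0.196345,  0.189977])
    actual_x = np.array([0.0, 0.0, 0.0, 0.0, 0.21, 0.21, 0.21, 0.21])

    # Squared Pearson correlation, the quantity Excel's RSQ returns;
    # this reproduces the R^2 of 0.9976 reported for X in Table 1.
    r_squared = np.corrcoef(kinect_x, actual_x)[0, 1] ** 2
    print("R^2 for X at 2 m: %.4f" % r_squared)

    def filter_outliers(volumes_cm3, n_sigma=3.0):
        """Drop volume estimates more than n_sigma standard deviations from the mean,
        mirroring the screening applied to the 60 Box 1 volumes (not demonstrated here)."""
        v = np.asarray(volumes_cm3, dtype=float)
        mean, std = v.mean(), v.std(ddof=1)
        return v[np.abs(v - mean) <= n_sigma * std]

    def percent_error(experimental, measured):
        """Signed percent error of the Kinect volume relative to the ruler-measured volume."""
        return 100.0 * (experimental - measured) / measured

    # Box 1: Kinect mean volume 4766 cm^3 vs. 4951 cm^3 measured with a ruler -> about -3.7%.
    print("Box 1 percent error: %.1f%%" % percent_error(4766.0, 4951.0))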
IX. CONCLUSIONS

The Kinect for Windows with the SkeletonBasics-WPF program was able to generate data points for rectangular prism objects. The accuracy of Box 1 with one standard deviation was 4766 ± 888 cm³. After the outlying data was removed, the new accuracy with one standard deviation was 4884 ± 738 cm³. Human error was most likely the cause of the outlying data points. Looking at the images created by the program, the hand point was obscured by the object being measured for several of the outlying points. There was also outlying data caused by multiple skeletons being recognized by the Kinect sensor. Box 2 had an accuracy measure of 6677 ± 580 cm³ before removing outliers. For Box 2 there was only one outlier; after removing this data point the new accuracy measure was 6634 ± 527 cm³. For Box 3 the accuracy was 10013 ± 750 cm³ before removing any outlying data. There was again only one outlying data point; after it was removed the accuracy became 9972 ± 720 cm³.

As a further test of accuracy, the Kinect experimental values for volume were compared to the volume measured using a ruler. For Box 1 the measured volume was 4951 cm³; compared to the average volume from the experimental data, the error was -3.7%. For Box 2 the measured volume was 6732 cm³, and the error against the average experimental volume was -0.82%. For Box 3 the measured volume was 9798 cm³, which upon comparison gave an error of 2.2%.

Future work could be done to improve several aspects of this research. One area that needs improvement is reproducibility. It was difficult to gather consistent data for the length, width and height because of the procedure necessary to measure each corner point. The corners at the back of the box were the most difficult to measure and gave the most inconsistent results. Another aspect not studied in this research is finding coordinates on objects that are not rectangular prisms.

ACKNOWLEDGEMENTS

This research was funded by the National Science Foundation through the RET program, Award Number 1301089. We would like to acknowledge the Department of Computer Science at Appalachian State University for hosting this grant opportunity. We would like to thank Dr. Rahman Tashakkori and Dr. R. Mitchell Parry for their guidance and instruction through this research process. We are also grateful to Luke Rice and Bahar Akhtar for developing the SkeletonBasics-WPF program that we used for this research and for helping us with the general setup of the experiments. This opportunity would not have been possible without the support of the administration at both Watauga and West Wilkes High Schools.

REFERENCES

[1] Using Kinect for Windows with XNA, 1.1 ed., University of Hull, United Kingdom, 2012, pp. 6-9.
[2] P. Henry et al., "RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments," University of Washington, Department of Computer Science & Engineering, Seattle, WA, 2012.
[3] Microsoft Corporation. (2013, July 10). Kinect Sensor [Online]. Available: ...