These are MATLAB-6 compatible *.mat files (read in via the load() function). Each MAT file represents a single frame from the robot log files and contains the raw RGB image as well as the disparity information (so you can do your own stereo processing if desired). Also included in the MAT file is a uint8 ''mask'' of the image giving a pixelwise labeling: 0 means ground plane, 1 means obstacle, and 2 means ''this pixel was not labeled by a human''. Unlabeled areas have meaning; they may be regions for which the terrain class was hard to tell (even with context), or they may be ''don't cares'' (e.g., sky).
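To make the label convention concrete, here is a minimal Python sketch that counts the pixels of each class in a mask and lists the human-labeled positions. It assumes the mask has already been read from the MAT file and converted to a plain list of rows; the names `count_labels`, `labeled_coordinates`, and `toy_mask` are hypothetical and are not part of the dataset.

```python
# Label values as described above: 0 = ground plane, 1 = obstacle,
# 2 = not labeled by a human.
GROUND, OBSTACLE, UNLABELED = 0, 1, 2


def count_labels(mask):
    """Count pixels per label in a 2-D mask given as a list of rows."""
    counts = {GROUND: 0, OBSTACLE: 0, UNLABELED: 0}
    for row in mask:
        for value in row:
            counts[value] += 1
    return counts


def labeled_coordinates(mask):
    """Return (row, col) for every pixel a human labeled (value 0 or 1)."""
    return [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value != UNLABELED
    ]


# Toy 2x3 mask for illustration only; it is not taken from the dataset.
toy_mask = [
    [2, 2, 1],
    [0, 0, 1],
]

print(count_labels(toy_mask))
print(labeled_coordinates(toy_mask))
```

Only the positions returned by `labeled_coordinates` should take part in any evaluation; pixels with value 2 carry no ground truth.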
A binary image output by a classification algorithm, or even a classification image with continuous values (e.g., on [0,1] or [-1,+1]), can be compared directly to the human-labeled binary image (see formulas in my IROS paper, which I'll be sending out shortly). Performance on the dataset is then the weighted mean over all frames in the dataset, where the weight for a given image I is the number of human-labeled test points in that image.
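The weighting can be sketched in Python as follows. This is not the formula from the IROS paper: the per-frame score used here (fraction of labeled pixels classified correctly after thresholding) is an assumption chosen only to illustrate the weighted mean, and the names `frame_score`, `dataset_score`, and `threshold` are hypothetical. Predictions and masks are assumed to be plain lists of rows of equal size.

```python
UNLABELED = 2  # mask value for pixels with no human label


def frame_score(prediction, mask, threshold=0.5):
    """Return (fraction correct, number of labeled pixels) for one frame.

    Assumption: a prediction >= threshold counts as obstacle (1),
    anything below counts as ground plane (0).
    """
    correct = 0
    labeled = 0
    for pred_row, mask_row in zip(prediction, mask):
        for p, m in zip(pred_row, mask_row):
            if m == UNLABELED:
                continue  # unlabeled pixels are skipped entirely
            labeled += 1
            predicted_label = 1 if p >= threshold else 0
            if predicted_label == m:
                correct += 1
    if labeled == 0:
        return 0.0, 0
    return correct / labeled, labeled


def dataset_score(frames, threshold=0.5):
    """Weighted mean of per-frame scores; weights are labeled-pixel counts."""
    weighted_sum = 0.0
    total_weight = 0
    for prediction, mask in frames:
        score, weight = frame_score(prediction, mask, threshold)
        weighted_sum += score * weight
        total_weight += weight
    return weighted_sum / total_weight if total_weight else float("nan")


# Toy frames for illustration only; they are not taken from the dataset.
frame_a = ([[0.9, 0.1], [0.2, 0.8]], [[1, 0], [2, 1]])  # 3 labeled, 3 correct
frame_b = ([[0.7, 0.3]], [[0, 2]])                      # 1 labeled, 0 correct
print(dataset_score([frame_a, frame_b]))  # (1.0 * 3 + 0.0 * 1) / 4 = 0.75
```

For outputs on [-1,+1], a threshold of 0 would play the same role as 0.5 does here. Weighting by labeled-pixel count means a frame with few labeled pixels cannot dominate the overall result.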
These datasets are taken from actual LAGR log data.