Why grayscale an image

[Figure: performance of each descriptor type on each dataset; see text for detailed analysis. A: performance of each descriptor type on the AR Face dataset. C: performance of each descriptor type on the Barnard et al. dataset. D: performance of each descriptor type on Caltech 101.]

The Aleix and Robert (AR) dataset [20] is a large face dataset containing over 4,000 color face images under varying lighting, expression, and disguise conditions.

In our experiments we omit images with disguises and changes in expression, leaving eight neutral-expression images per person (see [20] for example images). In each cross-validation run, we randomly choose one training image per person and six testing images. Because there are large changes in brightness, the performance of methods that are not robust to these changes could be dramatically impaired. Chance performance is one over the number of individuals. Our results on the AR dataset are provided in panel A of the figure above.

For SIFT, SURF, and GB, there is a large performance gap between the best and worst methods, consistent with our hypothesis that the choice of grayscale algorithm is especially important when the number of training instances is small and there is a large amount of brightness variation. Gleam performs well for all four descriptors, while Value performs poorly for all of them. SIFT performs best of the four descriptor types on this dataset, which exhibits large, mostly uniform changes in illumination conditions.

For the CUReT texture dataset we use only the predominantly front-facing images, randomly choosing 3 training images and 7 test images per category. Chance performance on CUReT is one over the number of texture categories. The Barnard et al. dataset contains objects photographed in 11 different illumination conditions while the pose of each object is simultaneously varied. We chose this dataset because it is the only object dataset that exhibits a variety of systematic changes in lighting color, which we hypothesized would influence many of the grayscale representations.

We train on 2 images per object and test on the remaining 9. Chance accuracy is one over the number of objects. Two or more methods work well for each descriptor type, but only some of them are consistent across descriptors.

Because it is rotation invariant, SURF achieves greater accuracy than the other descriptors on this dataset. The popular Caltech 101 dataset [23] consists of images found using Google image search, drawn from 101 object categories with at least 31 images in each.

We adopt the standard Caltech 101 evaluation scheme: we train on 15 randomly chosen images per category and test on 30 other randomly chosen images per category, unless fewer than 30 images are available, in which case all of the remaining images are used. Chance performance on Caltech 101 is roughly one percent (one over the number of categories). Our Caltech 101 results are provided in panel D of the figure above. Several methods work well, but only Gleam performs well for all four descriptors.

While the choice of grayscale algorithm is significant for Caltech 101, it has a less dramatic effect than on the other datasets. This is likely because Caltech 101 exhibits less brightness variation and we use a comparatively larger training set. SIFT substantially exceeds the performance of the other descriptors. The greatest mean per-class accuracy on Caltech 101 was achieved using Luster.

For comparison, [25] reported results using grayscale SIFT descriptors that had been sparse coded in a hierarchical spatial pyramid matching system. The mean rank of each grayscale algorithm, marginalized over the datasets and descriptor types, is shown in the figure below. The simplest methods perform best, with Gleam achieving the best mean rank, although it is not significantly different from Intensity.

Value performs most poorly. Methods incorporating gamma correction are generally superior to their counterparts that omit it (e.g., Gleam compared to Intensity).

[Figure: the y-axis is the mean rank of each grayscale method when the results are combined across the datasets and descriptor types. Gleam and Intensity exhibit the best mean rank and the most robust performance.]
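
As a rough illustration of how such a marginalized mean rank can be computed, here is a minimal sketch; the accuracy values are random placeholders and the method names are only a subset drawn from the text, not the paper's actual data:

```python
import numpy as np

methods = ["Intensity", "Gleam", "Luminance", "Value", "Luster"]

# accuracy[d, k, m]: accuracy of grayscale method m with descriptor k on dataset d.
# Placeholder values; the real numbers would come from the cross-validation experiments.
rng = np.random.default_rng(0)
accuracy = rng.random((4, 4, len(methods)))  # 4 datasets x 4 descriptors x methods

# Rank the methods within each dataset/descriptor combination (1 = most accurate),
# then marginalize by averaging those ranks over all combinations.
order = np.argsort(-accuracy, axis=2)                  # method indices, best first
ranks = np.empty_like(order)
np.put_along_axis(ranks, order,
                  np.arange(1, len(methods) + 1)[None, None, :], axis=2)
mean_rank = ranks.reshape(-1, len(methods)).mean(axis=0)

# Lower mean rank = better here; the figure above may plot the opposite convention.
for name, r in sorted(zip(methods, mean_rank), key=lambda t: t[1]):
    print(f"{name}: mean rank {r:.2f}")
```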

Our results indicate that each descriptor type is sensitive to the choice of grayscale algorithm. To analyze the magnitude of this effect, we computed the coefficient of variation (CV) of each method's performance across the grayscale algorithms. These results are shown in the figure below. For all of the descriptors, the choice of grayscale algorithm mattered the least for Caltech 101, probably because of its greater number of training instances and lack of illumination variability.
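
For reference, the coefficient of variation is simply the standard deviation of the accuracies divided by their mean. A minimal sketch of the computation, with placeholder numbers rather than the paper's results:

```python
import numpy as np

# Accuracies of one descriptor type (e.g., SIFT) across the grayscale methods
# on a single dataset; the values are placeholders.
accs = np.array([0.61, 0.64, 0.58, 0.47, 0.62])

cv = accs.std(ddof=1) / accs.mean()  # coefficient of variation (sample std / mean)
print(f"CV = {cv:.3f}")              # larger CV = more sensitive to the grayscale choice
```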

[Figure: the y-axis is the coefficient of variation of the accuracy of each descriptor type, computed across all of the grayscale methods.]

All of the methods are sensitive to the choice of grayscale algorithm, but LBP is the least sensitive in general. The choice of grayscale algorithm mattered the least for Caltech 101 and the most for the AR Face dataset. Our objective was to determine whether the method used to convert from color to grayscale matters, and we can definitively say that it does influence performance.

For all datasets there was a significant gap between the best- and worst-performing methods. Our results indicate that the method used to convert to grayscale should be clearly described in all publications, which is not always the case in image recognition research.

For object and face recognition, Gleam is almost always the top performer. For texture recognition, Luminance is among the best choices. Although color descriptors are sometimes extracted in the HSV colorspace, our results suggest that replacing Value with Gleam is advisable.
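
To make the comparison concrete, here is a minimal sketch of the two conversions as they are commonly defined: Value as the per-pixel maximum of the channels (as in HSV) and Gleam as the mean of gamma-corrected channels. The 1/2.2 exponent is an assumed, typical display gamma, not necessarily the exact formulation used in the experiments:

```python
import numpy as np

def value_gray(rgb):
    """HSV Value: the per-pixel maximum of the R, G, and B channels."""
    return rgb.max(axis=-1)

def gleam_gray(rgb, gamma=1 / 2.2):
    """Gleam: the mean of the gamma-corrected channels."""
    return (rgb ** gamma).mean(axis=-1)

# rgb is assumed to be a float image in [0, 1] with shape (height, width, 3).
rgb = np.random.rand(4, 4, 3)
print(value_gray(rgb).shape, gleam_gray(rgb).shape)  # -> (4, 4) (4, 4)
```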

In general, we observed little benefit from using a method based on human brightness perception; the only potential exception was textures. Emulating the way humans perceive certain colors as brighter than others appears to be of limited benefit for grayscale image recognition. However, methods that incorporate a form of gamma correction (e.g., Gleam) generally outperformed their counterparts that do not. Developing a pre-processing algorithm specifically designed for edge-based and gradient-based descriptors is an interesting future direction.

One way to achieve this is to learn a transformation from color to grayscale that is robust to changes in brightness, perhaps by allowing the gamma value to vary per color channel.

There is no reason to assume that the single value used in the standard gamma correction function is ideal for recognition. Alternatively, it may be advisable for the transformation weights to vary depending on the global or local statistics of each particular image.
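
A minimal sketch of what such a parameterized transformation might look like, with one mixing weight and one gamma exponent per channel; the function name and default parameter values are illustrative assumptions, not a method proposed in the text:

```python
import numpy as np

def parametric_gray(rgb, weights=(1/3, 1/3, 1/3), gammas=(1/2.2, 1/2.2, 1/2.2)):
    """Weighted sum of per-channel gamma-corrected values.

    weights -- one mixing weight per channel (need not be the usual luminance weights)
    gammas  -- one gamma exponent per channel, allowed to differ across channels
    """
    w = np.asarray(weights, dtype=float)
    g = np.asarray(gammas, dtype=float)
    return ((rgb ** g) * (w / w.sum())).sum(axis=-1)
```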

In both cases it is challenging to optimize the weights explicitly for recognition, since doing so would require re-extracting the descriptors after every change. As long as the number of parameters remains relatively small, they could feasibly be optimized per dataset using cross-validation or a meta-heuristic.
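
Continuing the sketch above, a per-dataset search over a small set of candidate gamma values could look roughly like the following. `extract_descriptors` and `evaluate_classifier` are hypothetical placeholders for the descriptor pipeline and the cross-validated classifier, which would indeed have to be re-run for every candidate:

```python
import numpy as np

def score_gamma(images, labels, gamma, n_folds=5):
    """Convert every image with one shared gamma, then score it with cross-validation."""
    gray = [parametric_gray(img, gammas=(gamma, gamma, gamma)) for img in images]
    feats = extract_descriptors(gray)            # hypothetical descriptor extraction
    return evaluate_classifier(feats, labels,    # hypothetical cross-validated accuracy
                               n_folds=n_folds)

def pick_gamma(images, labels, candidates=(0.3, 0.45, 1 / 2.2, 0.6, 1.0)):
    """Grid-search the candidate exponents and return the best-scoring one."""
    scores = [score_gamma(images, labels, g) for g in candidates]
    return candidates[int(np.argmax(scores))]
```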

An alternative is to learn a mapping from color images to descriptors directly. There has been some success with this approach [26], [27], but it has not been widely adopted because these learned transformations tend to be considerably slower than engineered methods such as SIFT.

The choice had the largest impact for datasets in which only a limited amount of training data was used and illumination conditions were highly variable. We identified a method, Gleam, that was consistently superior for face and object recognition. Similarly, for the problem of texture recognition, a pair of top performers emerged. It is now incumbent upon researchers in the computer vision community to report the conversion method they use in each paper, as this seemingly innocuous choice can significantly influence results.

In image recognition it is often assumed that the method used to convert color images to grayscale has little impact on recognition performance. To get a grayscale image, the color information is collapsed at each pixel, leaving only a luminance value; the image therefore becomes a pattern of light and dark areas devoid of color, essentially a black-and-white image.
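
As a minimal illustration of that collapse from three channels to one, here is a simple weighted-sum luminance conversion using the common ITU-R BT.601 weights; other conventions (and most libraries) use slightly different weights or gamma handling:

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (height, width, 3) RGB image to a single luminance channel."""
    weights = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma weights
    return (rgb * weights).sum(axis=-1)        # one light/dark value per pixel
```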

Most digital imaging software applications, even the most basic ones, can convert an image to grayscale. This also matters when printing, since grayscale output consumes only black ink, whereas color printing consumes the three process colors (cyan, magenta, and yellow) as well as black.

I disagree with the implication that grayscale images are always better than color images; it depends on the technique and the overall goal of the processing. For example, if you wanted to count the bananas in an image of a fruit bowl, it is much easier to segment them when you have a color image!

Many images have to be in grayscale because of the measuring device used to obtain them. Think of an electron microscope: it measures the strength of an electron beam at various points in space. An atomic force microscope (AFM) measures the amount of resonance vibration at various points across the topology of a sample.

In both cases, these tools return a single value, an intensity, so they are implicitly creating a grayscale image. Image processing techniques based on brightness can often be applied adequately to this overall-brightness (grayscale) signal; however, there are many instances where having a color image is an advantage.

Binary might be too simple and unable to represent the character of the picture, while color might be too much and slow down processing. Before starting image processing, whether on grayscale or color images, it is better to focus on the application at hand; otherwise, choosing one of them arbitrarily can create accuracy problems in the results. For example, if I want to process an image of a waste bin, I prefer grayscale over color, because in the bin image I only want to detect the shape of the bin using optimized edge detection.

I do not care about the color of the image, but I do want to see the rectangular shape of the bin correctly.
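
A minimal sketch of the grayscale-plus-edge-detection approach described in this answer, using scipy's Sobel filters; the simple channel average stands in for whatever grayscale conversion is preferred:

```python
import numpy as np
from scipy import ndimage

def edge_magnitude(rgb):
    """Convert to grayscale, then compute the Sobel gradient magnitude."""
    gray = rgb.mean(axis=-1)          # simple channel average (Intensity-style conversion)
    gx = ndimage.sobel(gray, axis=1)  # horizontal gradient
    gy = ndimage.sobel(gray, axis=0)  # vertical gradient
    return np.hypot(gx, gy)

# Thresholding the magnitude gives a binary edge map in which a roughly
# rectangular bin outline can then be searched for.
```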

This may be a stupid question, but after reading and searching a lot about image processing, every example I see works on grayscale images. I understand that grayscale images use just one color channel, which normally needs only 8 bits per pixel to be represented, and so on. I am not sure I have explained my doubt clearly; I hope someone can answer me. Thank you very much.

Even if you don't grayscale the image, all image processing requires some form of collapsing the data, since even with modern computers processing raw images is nigh impossible. Even our brains, which one could say are computers built mainly for such tasks, compress and ignore a lot of visual input to make sight possible.


