Optimized dithering through a (slow) semi-exhaustive search of permutations within local neighborhoods. An implementation is available in the workshop folder of the GEGL sources, as a grayscale op that can also be used to create a color dither.
Note: this page relies on 1:1, 2:1 or 3:1 pixel sizes; you might want to zoom in or out by a few steps to hit a scale where the examples look their best.
To speed things up, the search is pre-seeded with roughly the right mix of pixels using a fast positional dither. Multiple iterations follow; in each one, every pixel receives the best improvement available from a set of mutations. Mutations happen in the 2x2 neighborhood formed by the target pixel and its neighbors to the right and down. Either whole pixels are swapped, or a value is changed by just one quantization step.
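The loop described above can be sketched for the 1-bit case as follows. This is a minimal illustration, not the GEGL implementation: the error metric (squared difference of box sums between output and input), the Bayer seed and all function names are assumptions made for the example.

```python
# Illustrative 1-bit sketch; error metric and all names are assumptions.
BAYER4 = [[0, 8, 2, 10], [12, 4, 14, 6], [3, 11, 1, 9], [15, 7, 13, 5]]

def seed(img):
    """Fast positional (ordered) dither of values in 0..1 to 0/1."""
    return [[1 if v > (BAYER4[y % 4][x % 4] + 0.5) / 16 else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(img)]

def window_error(img, out, cx, cy, r):
    """Squared difference between the box sums of out and img at (cx, cy)."""
    h, w = len(img), len(img[0])
    s = 0.0
    for y in range(max(0, cy - r), min(h, cy + r + 1)):
        for x in range(max(0, cx - r), min(w, cx + r + 1)):
            s += out[y][x] - img[y][x]
    return s * s

def region_error(img, out, x, y, r):
    """Summed error of every window that overlaps the 2x2 block at (x, y)."""
    h, w = len(img), len(img[0])
    return sum(window_error(img, out, cx, cy, r)
               for cy in range(max(0, y - r), min(h, y + r + 2))
               for cx in range(max(0, x - r), min(w, x + r + 2)))

def total_error(img, out, r=1):
    """Summed error of all windows in the image."""
    return sum(window_error(img, out, cx, cy, r)
               for cy in range(len(img)) for cx in range(len(img[0])))

def mutations(out, x, y):
    """Candidate changes for pixel (x, y) as lists of (px, py, new_value)."""
    h, w = len(out), len(out[0])
    v = out[y][x]
    found = [[(x, y, 1 - v)]]  # one quantization step on the target pixel
    for nx, ny in ((x + 1, y), (x, y + 1), (x + 1, y + 1)):
        if nx < w and ny < h and out[ny][nx] != v:
            found.append([(x, y, out[ny][nx]), (nx, ny, v)])  # swap pixels
    return found

def shuffle_search(img, iterations=4, r=1):
    """Seed with an ordered dither, then greedily apply the best mutation."""
    out = seed(img)
    for _ in range(iterations):
        for y in range(len(img)):
            for x in range(len(img[0])):
                best, best_change = region_error(img, out, x, y, r), None
                for change in mutations(out, x, y):
                    old = [(px, py, out[py][px]) for px, py, _ in change]
                    for px, py, nv in change:
                        out[py][px] = nv
                    err = region_error(img, out, x, y, r)
                    for px, py, ov in old:  # undo the trial mutation
                        out[py][px] = ov
                    if err < best - 1e-12:
                        best, best_change = err, change
                if best_change:
                    for px, py, nv in best_change:
                        out[py][px] = nv
    return out

# Horizontal gradient from black to white, 16x16 pixels.
img = [[x / 15 for x in range(16)] for _ in range(16)]
result = shuffle_search(img)
```

Because a mutation is accepted only when it lowers the error of every window that can see the changed pixels, the total error never increases from the seed. A perceptually weighted filter would be a natural replacement for the box sum used here.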
To enable chunked processing, the implementation in GEGL first computes the values of a 2px wide border around the processed rectangle, using the input neighborhood. This head start for the borders yields dithering results that tile with each other. (For further reproducibility, a grid rather than a border could be computed and excluded from inner processing.)
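As a rough illustration of the border-first idea, the sketch below only partitions a chunk into a border ring and an interior; it does no dithering. The function name is hypothetical, and placing the 2px ring just inside the rectangle is an assumption made for the example.

```python
def split_border(x0, y0, width, height, border=2):
    """Partition a rectangle into border-ring and interior coordinates.

    Assumption: the ring lies just inside the rectangle. It would be
    processed first, from input data only, and then held fixed.
    """
    ring, inner = [], []
    for y in range(y0, y0 + height):
        for x in range(x0, x0 + width):
            # Distance in pixels to the nearest edge of the rectangle.
            edge = min(x - x0, y - y0,
                       x0 + width - 1 - x, y0 + height - 1 - y)
            (ring if edge < border else inner).append((x, y))
    return ring, inner

ring, inner = split_border(0, 0, 8, 8)
```

Since the ring depends only on the input and not on neighboring chunks' results, adjacent chunks can be processed independently and still meet cleanly.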
In the following set of images, more high-frequency detail is maintained at each step, going from a blue-noise mask, through Floyd-Steinberg error diffusion, and finally to the direct binary optimization of shuffle search.
3x3x3 color cube, 64 colors. For this conversion, shuffle search bites a bit too much off the low end of each grayscale band; both the blue-noise and Floyd-Steinberg implementations suffer from an incorrect gamma for mixing within the dithering zones between two colors.
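The gamma issue mentioned here can be made concrete with a small sketch. It uses the standard sRGB transfer functions; the helper `dither_ratio` is a hypothetical name introduced for illustration, not part of any of the implementations compared above.

```python
def srgb_to_linear(c):
    """Decode an sRGB-encoded value in 0..1 to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(l):
    """Encode a linear-light value in 0..1 as sRGB."""
    return l * 12.92 if l <= 0.0031308 else 1.055 * l ** (1 / 2.4) - 0.055

def dither_ratio(target, lo, hi):
    """Fraction of `hi` pixels so a lo/hi mix averages to `target`.

    All three arguments are sRGB-encoded values in 0..1; the mix is
    computed in linear light, which is how pixels blend when viewed.
    """
    t, a, b = (srgb_to_linear(v) for v in (target, lo, hi))
    return (t - a) / (b - a)

# Mid-gray 128/255 dithered between black and white.
ratio = dither_ratio(128 / 255, 0.0, 1.0)
naive = 128 / 255  # ratio obtained when mixing on encoded values
```

Mixing in linear light calls for roughly 22% white pixels here, while mixing directly on the sRGB-encoded values would use about 50%, which renders the dithered zone visibly too bright.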
6x6x6 color cube, 216 web-safe colors
The following images are from Wikipedia, apart from the shuffle-search one. Their gamma differs from that of the Davids at the top of the page, as error diffusion and similar is done with sRGB gamma rather than on linear data.