Adaptive Prediction Tree Coding

Home

Ghost Notes

The Kiss I'd Kill For

Pictures

Essays

Sonnets

APT

CLIP

SDFEAS

Blog

About


APT Version 1.0 (Released March 2004)

John Robinson

What is APT?

APT stands for Adaptive Prediction Trees. It is a general-purpose image coding method for lossless and lossy compression. It codes monochrome and colour images with bit depths from 1 to 16 bits per component. Its performance on all image types is often close to that of the best alternative.

APT uses principal components analysis, a multicomponent image pyramid, shape-adaptive non-linear predictions, adaptive runlength and Huffman coding. It also supports coding with colour palettes where appropriate. No part of APT is patented (but see the note below on the palette ordering method in the provided source code). APT is inherently progressive, and a straightforward modification of the decoder to write directly to an on-screen picture buffer allows simple, progressive, image recovery.

APT supersedes Binary Tree Predictive Coding (BTPC). The final version of BTPC was version 5, released March 2003. APT's spatial predictor, inter-component predictor, quantizer and lossless coder are all different to those in BTPC. APT uses hex-trees rather than binary trees.

How does APT compare with the alternatives?

Here is a thumbnail of a photographic image (original size 512 x 512):

Thumbnail

Here is a graph comparing the coding of the original photo by APT and standard alternatives.

PSNR/bpp graph

Note that the x axis of the graph (bits per pixel) is logarithmic. The vertical axis (Peak Signal to Noise Ratio) is in dB. Higher and to the left is better so this graph shows JPEG 2000 marginally beating APT over a wide range of data rates. The lossless data rates for APT, PNG and JPEG-LS are shown as vertical lines (they correspond to infinite PSNR so using vertical lines is just a way to show them on the same graph). The leftmost line is best, so APT marginally beats JPEG-LS. (The actual values for lossless compression are 12.2 bits per pixel for APT, 12.4 bpp for JPEG-LS and 13.6 for PNG.)

This graphical image (original size 40 x 610):

Thumbnail

is coded losslessly by APT at 1.40 bpp, by PNG at 1.92 bpp and by JPEG-LS at 5.46 bpp. The picture was originally a palettized GIF at 7.24 bpp. Lossy coding is shown here:

PSNR/bpp graph

Again the lossless results for APT, PNG and JPEG-LS are included as vertical lines, showing that both JPEG and JPEG 2000 must code below 30 dB PSNR to achieve a similar data rate to APT's lossless performance.

Results comparing APT with competitors on many other images are given at the bottom of this page. In summary:

When applied losslessly, APT has good performance on both natural images and graphics. On average it matches JPEG-LS on photographs and does much better on graphics. It beats a typical one-shot PNG compressor almost always, but when many possible options for PNG are tried during encoding (e.g. by pngcrush) APT loses to PNG about 30% of the time. In tests APT is rarely more than 5% worse than the best of JPEG-LS, PNG and GIF but it is sometimes much better than all three. For example, the graph image distributed with the APT source code compresses to 0.096bpp with APT, 0.14bpp with PNG and 0.47bpp with both GIF and JPEG-LS.

As a lossy compressor for photographic images, APT is always superior to JPEG and sometimes superior to JPEG 2000. More often it is slightly inferior to JPEG 2000 for photos, but the subjective difference is very small at the compression ratios used in practice. At very low data rates two artifacts of APT become visible -- smudging and speckling. These may be less acceptable subjectively than the "blocky" artifact of DCT and wavelet compression. The two expanded versions of a small part of lenna below were both coded at 0.2 bpp (more compression than is usual). The JPEG 2000 case, on the left, used unequal colour weighting to maximize subjective quality. (In objective comparisons I test JPEG 2000 with equal colour weighting to maximize PSNR.) APT does not provide a choice of colour weightings so the image shown on the right is the same as used in objective comparisons.

Expanded part of decoded Lenna image Expanded part of decoded Lenna image
Expanded part of decoded Lenna image after full picture was
compressed by JPEG 2000 to 0.2 bpp. PSNR = 29.8 dB
Expanded part of decoded Lenna image after full picture was
compressed by APT 1.0 to 0.2 bpp. PSNR = 29.3 dB

For lossy compression of graphics, APT is much superior to both JPEG and JPEG 2000. For other imaging modalities (e.g. MRI, SAR) APT performs close to JPEG 2000.

APT's lossless coding of bilevel images is usually inferior to CCITT Group IV coding (i.e. READ coding). The worst measured performance for APT against standards has been for coding high-resolution document images (e.g. faxes) where it uses up to twice as many bits as Group IV. However, for detailed bilevel images (e.g. the contours of regions in two-component video coding, or a transparency bitmap) it outperforms Group IV. It far outperforms PNG, JPEG-LS, etc. for bilevel images.

APT compresses images with component depths greater than 8 bits. This makes it suitable for 12-bit and 16-bit medical and remote sensing images. I have not done a systematic evaluation of APT's relative performance on such images for lack of suitable comparison coders, but performance is better than independent coding of 8-bit planes by any competing method. However, at its core, APT is still an 8-bit method, and the released version does not include quantization tests to see whether a presented 16-bit image is really just a scaled 8-bit original. In future work I hope to improve APT's 16-bit performance by converting to 16-bit resolution throughout. This should also lead to better handling of large palettes and allow lossless coding with invertible colour transforms.

In the freely-available software implementation of APT I have made some effort to make decoding fast, but have left several slow functions in the encoder. As a result encoding can take up to ten times as long as decoding. APT averages about the same decode speed as Kakuda, a speed-optimized software implementation of JPEG 2000, for images of equivalent quality. APT is much faster than any known version of JPEG-LS.

Getting and using the code

C++ source code is available for APT and may be freely used.

Note: The palette ordering method implemented in the file palettize.h of the source code may be covered by a patent. APT itself does not define palette ordering as this is a matter for the encoder only. I anticipate that later versions of the APT source code will implement a patent-free palettizer in the encoder. This will not affect the APT format (i.e. the palette order can be changed and the decoder will work unaltered). Note that palettes are only used for lossless coding of images with a small number of colours.

The tarred and gzipped package apt1.0.tar.gz contains:

  • source code
  • makefile
  • executables for SuSE 8.2
  • console application executables for Windows (.exe files)

The executables are:

  • capt - Encoder
  • dapt - Decoder
  • isapt - Compressed file inspector: gives basic information
  • psnr - Program to calculate psnr between two pnm pictures
  • bpp - Program to calculate bpp for a coded version of a picture

The makefile contains instructions to make all of these.

Usage:

  • capt input_pic output_file [quality]
  • dapt input_file output_pic

The optional quality parameter is a number between 0 and 100. Default is 80. Quality 100 is lossless.

capt reads PBM, PGM and PPM raw format files (magic number "P4", "P5" or "P6"). dapt writes PGM and PPM files only.

I have included two pictures in the distribution. To test your compiled versions of capt and dapt work correctly, you can do:

  • capt airplane.ppm coded
  • dapt coded outpic
  • bpp airplane.ppm coded [which should print: 0.819702]
  • psnr airplane.ppm outpic [which should print: 35.5444]
  • dapt graph.apt graph.ppm

If you then view graph.ppm, you will see the performance curves for coding of airplane.ppm. airplane.ppm is a photographic image, so the curve for APT is slightly below that of JPEG 2000. graph.ppm, however, is graphical, and is an example of an image that compresses significantly better under APT than its competitors.

Old material

BTPC was described in J A Robinson, "Efficient General-Purpose Image Compression with Binary Tree Predictive Coding", IEEE Transactions on Image Processing, Vol 6, No 4, April 1997, pp 601-607. The web pages at http://www.elec.york.ac.uk/visual/jar11/btpc/btpc.html explain the principles of the method as well as provide results for BTPC version 4.1. I recently (March 2004) learned that BTPC 4.1 is still being used on mobile phones because of its very fast encode and decode speeds, so here is the Unix/Linux archive and here is a zipfile including Windows binaries. Some of the C++ in version 4.1 is obsolete, so you may have to tweak it to get it to compile these days. The BTPC version 5 webpages are still online in a similar format to these APT pages.

APT comparisons with standard algorithms

Comparison methods:

  • JPEG-LS (LOCO-1) -- The Hewlett Packard locoe/locod implementation. Experiments used all three sample, line and plane interleaving options, and the one that compressed the most was chosen for each image. Sometimes this was not the option recommended for a particular image type.
  • PNG encoding was done with the ImageMagick convert tool followed by pngcrush which chooses the best of several PNG encoding methods. This selection is time consuming during encoding but does not affect decode time.
  • JPEG 2000 -- The Kakadu V3.4 implementation. Equal weighting of colour channels was used to maximize PSNR.
  • Baseline JPEG -- The Independent JPEG Group's implementation. This rescales input images to [0,255], so images were pre- and post-processed to ensure fair PSNR results.

Airplane (Photo)

Baboon (Famous Detailed Photo)

BBC Logo (Graphic)

Screen shot (Graphic)

Old Logo (Graphic)

Diary page (Large bilevel image)

Fruits (Photo)

Frymire (Artwork)

Graph (Graphic!)

SHDTV Test Image (Mixed)

Lenna (Famous Photo)

Moonlight (Splashscreen)

MRI (Medical)

X-Ray (Medical)

Peppers (Famous Photo)

512x512 Sailboat (Photo)

Same picture, but smaller (Photo)

Same picture, even smaller (Photo)

Another sailing picture (Photo)

SAR (Remote sensing)

Serrano (Artwork)

Traceroute (Map graphic)

Tulips (Photo)

Watch (Photorealistic graphic)

Water (Photo)