A few blocks are using 15-bit colour directly. I was really hoping it would reveal a more technical or logical grouping.
With a good palette algorithm they could've displayed the whole image with a single 256-colour palette... and it makes me think: the JPEG screenshot above (highly compressed, though) is a lot smaller than the compressed MIM file. They could've packed things a bit better if they had used the DCT decompression hardware for field screens as well.
Speed/size tradeoff. The tile data is tiny compared to the video data on the disk, and they needed to DMA it to the framebuffer quickly. The PSX CPU had no direct access to VRAM, and the transfer is slow. On top of that, the tile has to sit somewhere, and it's bad enough that it has to be placed three times in VRAM (texture cache and the two framebuffers).
Keep in mind that the PSX can display multiple bit depths on the screen at the same time. You can have an 8-bit tile next to a 4-bit one next to a 15-bit one. That's your memory saver there. The size on the disk is irrelevant compared to how much VRAM you need.
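
To put rough numbers on that, here is a small sketch of the VRAM cost per tile at each depth. The 16x16 tile size and the function names are my own assumptions for illustration, not something taken from the MIM format. It only relies on the fact that 4-bit and 8-bit texels are packed into 16-bit VRAM words, that a 15-bit texel takes a full word, and that each palette entry is one 16-bit word.

```python
# Rough VRAM cost of one tile at each PSX texture depth.
# Assumption: 16x16 tiles; names below are illustrative only.

# Storage bits per texel (15-bit colour occupies a full 16-bit word).
BITS_PER_TEXEL = {"4-bit": 4, "8-bit": 8, "15-bit": 16}

# Palette (CLUT) entries per mode; direct colour needs no palette.
CLUT_ENTRIES = {"4-bit": 16, "8-bit": 256, "15-bit": 0}


def tile_vram_bytes(width, height, mode):
    """Bytes of VRAM used by the texel data of one tile (palette excluded)."""
    return width * height * BITS_PER_TEXEL[mode] // 8


def clut_bytes(mode):
    """Bytes used by one palette; each entry is a 16-bit colour."""
    return CLUT_ENTRIES[mode] * 2


for mode in BITS_PER_TEXEL:
    print(mode, tile_vram_bytes(16, 16, mode), "bytes of texels,",
          clut_bytes(mode), "bytes of palette")
```

So a 4-bit tile takes a quarter of the texel space of a 15-bit one, and a palette can be shared by every tile that uses it, so its cost gets spread out.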