Can't you use the current remade high-res art from Team Avalanche, jusete and jmp to train the tool?
Good training data for a tool like this is data that closely resembles the data you'll be scaling. The FF9 stuff works so well because the high-res images are literally the source images for the in-game backgrounds. As such, ESRGAN can learn exactly how those images were scaled down, complete with dithering, limited palettes, etc., and thus learn how to undo that.
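To make the idea of "the high-res images are the source" concrete, here is a minimal sketch of how a matched training pair could be produced. It is not the actual FF9 pipeline: the box filter, the evenly spaced grayscale palette, and the names `box_downscale`, `quantize`, and `make_training_pair` are all assumptions for illustration, and images are plain lists of rows so the example has no dependencies.

```python
def box_downscale(img, factor):
    """Average each factor x factor block of a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h - h % factor, factor):
        row = []
        for x in range(0, w - w % factor, factor):
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def quantize(img, levels):
    """Snap 0-255 values to `levels` evenly spaced shades (a crude limited palette)."""
    step = 255 / (levels - 1)
    return [[int(round(round(v / step) * step)) for v in row] for row in img]


def make_training_pair(hr, factor=4, levels=4):
    """Hypothetical helper: derive the low-res image from the high-res source,
    so the pair reflects one known degradation (downscale, then palette reduction)."""
    lr = quantize(box_downscale(hr, factor), levels)
    return lr, hr


# Tiny 4x4 "high-res" image, reduced 2x to a 4-shade palette.
hr = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [100, 100, 200, 200],
    [100, 100, 200, 200],
]
lr, _ = make_training_pair(hr, factor=2, levels=4)
print(lr)
```

The point is only that the low-res half is computed from the high-res half, so whatever the network learns to undo is the same process that produced the images you want to upscale.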
Team Avalanche (TA) backgrounds, though gorgeous, are often fairly heavily reimagined. Things that might seem innocuous artistically make a big difference on the technical side of things. A different wood texture here, a different cloth material there, reworked lighting... If you downscale those, you don't get the original in-game backgrounds. And ESRGAN doesn't truly understand concepts like "wood", "cloth", etc. It is looking at frequent patterns, color variations, and so on, and those would be substantially different on the TA stuff.
So you have two options. You could use TA backgrounds for both the high- and low-res training images, which gives you potentially okay but not really helpful training data, with similar problems to the Manga109 training (albeit in the opposite direction). Or you could use the TA backgrounds for high-res and the original in-game backgrounds for low-res, which gives you awful, useless training data that isn't grounded in how scaling actually works. Neither is really desirable.
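One rough way to see why the second option is ungrounded is to downscale the high-res candidate and measure how far it lands from the real low-res image. The sketch below assumes a simple box-filter downscale and a mean absolute difference score; the function names and the toy pixel values are made up for illustration and say nothing about actual TA or in-game data.

```python
def downscale(img, factor):
    """Box-filter downscale of a grayscale image stored as a list of rows."""
    out = []
    for y in range(0, len(img) - len(img) % factor, factor):
        row = []
        for x in range(0, len(img[0]) - len(img[0]) % factor, factor):
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def mean_abs_diff(a, b):
    """Average per-pixel absolute difference between two same-sized images."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def pair_mismatch(hr, lr, factor):
    """Hypothetical score: 0 means lr is exactly what downscaling hr gives."""
    return mean_abs_diff(downscale(hr, factor), lr)


# Toy data: `source_hr` stands in for a true source render,
# `reimagined_hr` for a repainted version with different textures and lighting.
source_hr = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [100, 100, 200, 200],
    [100, 100, 200, 200],
]
reimagined_hr = [
    [50, 50, 255, 255],
    [50, 50, 255, 255],
    [100, 100, 120, 120],
    [100, 100, 120, 120],
]
in_game_lr = downscale(source_hr, 2)

print(pair_mismatch(source_hr, in_game_lr, 2))      # matched pair
print(pair_mismatch(reimagined_hr, in_game_lr, 2))  # mismatched pair
```

A matched pair scores zero here, while the repainted one does not, which is the sense in which a TA high-res / in-game low-res pair asks the network to learn a mapping that no scaling process ever produced.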