Idea #1: Language-agnostic binary data schema
So... most of our projects involve consuming, editing, and writing binary data like Final Fantasy game asset files.
This community has been awesome at bringing people together to reverse engineer the file formats, publish their findings on the wiki, and share source code.
So now, it would be super nice if we could go just one step further and create actual schema definitions in a structured format. "That way, we could auto-generate documentation. And most importantly, I wouldn't have to write code to parse the data, because it could just be parsed automatically by my program, which by the way is written in... uh, my preferred programming language..."
Yeah. Exactly. That's why we don't have schema files. Everyone has different needs, prefers different programming languages, etc.
But still... just imagine the possibilities if we could agree on a language-agnostic schema. This is a common problem and there are solutions for it. For example, perhaps we could use one of these:
https://en.wikipedia.org/wiki/Data_exchange#Data_exchange_languages
Even if the language-agnostic schema isn't super-friendly to your preferred programming language, it would not matter if there were utilities to generate documentation for you to read and understand it in a user-friendly way, and if there were utilities to translate the schema to data structures in your preferred programming language. Right?
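To make that a bit more concrete, here is a rough sketch of what "one schema, several generated outputs" could look like, using Python only as an example language. Everything in it is an assumption for illustration: the schema layout, the `ExampleHeader` record, its field names, and the helper names `parse_record` and `describe` are all made up and do not describe any real game file or any existing schema standard.

```python
import struct

# Hypothetical schema, written as plain data so it could just as well live in
# JSON or YAML. The record and its fields are invented for this example.
EXAMPLE_SCHEMA = {
    "name": "ExampleHeader",
    "endian": "<",  # little-endian, no padding (struct module notation)
    "fields": [
        {"name": "magic", "type": "4s"},       # 4 raw bytes
        {"name": "version", "type": "H"},      # unsigned 16-bit
        {"name": "entry_count", "type": "H"},  # unsigned 16-bit
        {"name": "data_offset", "type": "I"},  # unsigned 32-bit
    ],
}


def parse_record(schema, data, offset=0):
    """Parse one record from `data` using only what the schema says."""
    fmt = schema["endian"] + "".join(f["type"] for f in schema["fields"])
    values = struct.unpack_from(fmt, data, offset)
    names = [f["name"] for f in schema["fields"]]
    return dict(zip(names, values))


def describe(schema):
    """Generate a plain-text field table from the same schema."""
    lines = [schema["name"]]
    position = 0
    for field in schema["fields"]:
        size = struct.calcsize(schema["endian"] + field["type"])
        lines.append(f"  0x{position:02X}  {field['name']}  ({size} bytes)")
        position += size
    return "\n".join(lines)


if __name__ == "__main__":
    sample = struct.pack("<4sHHI", b"DEMO", 1, 3, 12)
    print(parse_record(EXAMPLE_SCHEMA, sample))
    print(describe(EXAMPLE_SCHEMA))
```

The point is only that the parser and the documentation both come from the same data, so neither one has to be written by hand per format. A real schema would also need things this sketch ignores, like arrays, nested records, and fields whose size depends on earlier fields.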
Well, for new projects, yeah, that might be fantastic. But... for existing projects... well, suppose we did come up with such a plan and even developed the tools to generate docs and code in everyone's preferred language. Would we expect the authors of existing tools to refactor their code to use the new auto-generated code? Probably not.
But if you are such an author, then you probably have some good insight into what kind of features you would want such auto-generated code to have in your preferred programming language. So perhaps you would be interested in contributing ideas, or even code, for translating a universal schema into "code in my preferred language".
And long-term, imagine the possibilities with future tools!
I'm sure this community has talked about this before numerous times, so my main question is, do we have a forum topic dedicated to this? If not, can we create one? And if there are any individual threads where this has been discussed at length, can you point to them?
Idea #2: Auto-reverse-engineering tool
It would be cool to have a tool to auto-reverse-engineer new file formats. The tool could use a repository of known file formats / characteristics (like "magic numbers", other patterns to look for, etc.). It could even be interactive and show different interpretations to the user and let the user guide the tool to choose the right interpretation. Finally, it could output schemas for the user. What format should the schema be in, though? See idea #1 above. (See?! That's another reason we need it!)
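As a very small sketch of the first two pieces (a lookup against known magic numbers, plus showing the user several readings of the same bytes), something like the following might be a starting point. The signatures listed are the well-known ones for PNG, ZIP, RIFF, and gzip. The table itself and the function names `guess_formats` and `interpretations` are hypothetical stand-ins for a real shared repository and a real interactive tool.

```python
# Tiny stand-in for a shared repository of known file signatures.
KNOWN_MAGICS = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"RIFF", "RIFF container (WAV, AVI, ...)"),
    (b"\x1f\x8b", "gzip stream"),
]


def guess_formats(data):
    """Return the labels of every known signature that `data` starts with."""
    return [label for magic, label in KNOWN_MAGICS if data.startswith(magic)]


def interpretations(data, offset=0):
    """Show a few competing readings of the 4 bytes at `offset`.

    An interactive tool could present these side by side and let the user
    pick whichever one looks plausible.
    """
    chunk = data[offset:offset + 4]
    return {
        "u32_le": int.from_bytes(chunk, "little"),
        "u32_be": int.from_bytes(chunk, "big"),
        "ascii": chunk.decode("ascii", errors="replace"),
    }


if __name__ == "__main__":
    blob = b"RIFF" + (36).to_bytes(4, "little") + b"WAVE"
    print(guess_formats(blob))
    print(interpretations(blob, offset=4))
```

Once the user has confirmed enough fields this way, the tool could write them out in whatever schema format comes out of idea #1.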