Kaetemi

To content | To menu | To search

Tag - closedsource

Entries feed - Comments feed

Friday 17 August 2012

3ds Max File Format (Part 1: The outer file format; OLE2)

The 3ds Max file format, not too much documentation to be found about it. There are some hints here and there about how it's built up, but there exists no central documentation on it.

Right now we are in the following situation. A few thousand of max files, created by a very old version of max (3.x), containing path references to textures and other max files that have been renamed and relocated or which simply no longer exist. Yes, we have a maxscript that can go through them all, and that manages to fix a large number of paths. However, there are a lot of paths that are stored as part as fields in plugins and material scripts that don't get noticed, and the performance of opening and closing this number of files from 3ds Max directly is horrible. The obvious solution? Figure out how we can read and save the max file with modified contents, without having to understand all of the actual data it contains. Fortunately, this is actually possible without too much work.

Some research online brings up the following blog post, relating to a change in the max file format in version 2010, which would make it easier to update asset paths: http://www.the-area.com/blogs/chris/reading_and_modifying_asset_file_paths_in_the_3ds_max_file. That's nice and all, but it's only from version 2010 on, and it very likely won't contain any assets referenced by path by old plugins and such.

So, starting at the beginning. The blog post I referred to above nicely hints us to the OLE structured file format. Since there exist a wide range of implementations for that, we can pretty much skip that, and accept that it's basically a filesystem in a file, so it's a file containing multiple file streams. A reliable open source implementation of this container format can be found in libgsf. When scanning a fairly recent max file, using the command gsf list, we can find the following streams inside this file:

f         52 VideoPostQueue
f     147230 Scene
f        366 FileAssetMetaData2
f       2198 DllDirectory
f      29605 Config
f       3438 ClassDirectory3
f        691 ClassData
f      29576 SummaryInformation
f       2320 DocumentSummaryInformation

The FileAssetMetaData2 is new in 3ds Max 2010.

One step further, we can start examining the contents of these streams. And it's usually easiest to start off with one of the more simple ones. VideoPostQueue seems small enough to figure out the overall logic of the file format, hoping that the rest is serialized in a similar way. Using the command gsf dump we can get a hex output of one of the streams, and using a simple text editor we can find how it's structured. Binary formats often contain 32 bit length values, which are usually easy to spot in small files, since they'll contain a large number of 00 values. It's basically a matter of finding possible 32bit length integers, and matching them together with various fixed length fields and other typical binary file contents, until something programatically logical turns up. Here's a manually parsed VideoPostQueue storage stream:

[
        50 00 (id: 0x0050)
        0a 00 00 00 (size: 10 - 6 = 4)
        [
                01 00 00 00 (value: 1)
        ]
]
[
        60 00 (id: 0x0060)
        2a 00 00 80 (size: 42 - 6 = 36) (note: negative bit = container)
        [
                10 00 (id: 0x0010)
                1e 00 00 00 (size: 30 - 6 = 24)
                [
                        07 00 00 00 (value: 7)
                        01 00 00 00 (value: 1)
                        00 00 00 00
                        00 00 00 00
                        20 12 00 00 (value: 4610)
                        00 00 00 00
                ]
                20 00 (id: 0x0020)
                06 00 00 00 (size: 6 - 6 = 0)
        ]
]

The storage streams in the max container file contain a fairly simple chunk based file format (and in fact similar in format to the fairly well known .3ds file format). Being based on chunks is what allows 3ds Max to open a file for which certain plugins are missing. It's basically a tree structured format where every entry has an identifier and a size, so when an identifier is unknown, or when it's contents are incompatible, it can simply be kept as is or discarded. The only exceptions in the file that don't use this structure are SummaryInformation and DocumentSummaryInformation, which are supposedly in a standard Windows format, and the new FileAssetMetaData2 section is formatted differently as well unfortunately.

In this format, the chunk header consists of a 2-byte unsigned integer which is the identifier, and a 4-byte unsigned integer, where the 31 least significant bits are the size and the msb is a flag that helpfully lets us know if the chunk itself contains more chunks, and thus is a container, or not. For very large files, where 31 bits is insufficient for the size, the entire size field is set to 0, and the header increases with an additional 64-bit unsigned integer field which is similarly structured as the 32-bit size field. The size field includes the size of the header.

       0 | 0f 20 (id)
                 00 00 00 00 (size missing)
                             17 fe 01 00 00 00 00 80 (size in 64 bits)

With this information it is possible to read a max file, modify the binary contents of chunks (most of them are fairly basic of format), and we should be able to re-save the max file with our modified data. The DllDirectory section, for example, parsed programatically starts like this:

CStorageContainer - items: 20
        [0x21C0] CStorageValue - bytes: 4
        786432216
        [0x2038] CStorageContainer - items: 2
                [0x2039] CStorageUCString - length: 39
                Viewport Manager for DirectX (Autodesk)
                [0x2037] CStorageUCString - length: 19
                ViewportManager.gup
        [0x2038] CStorageContainer - items: 2
                [0x2039] CStorageUCString - length: 49
                mental ray: Material Custom Attributes (Autodesk)
                [0x2037] CStorageUCString - length: 21
                mrMaterialAttribs.gup
        [0x2038] CStorageContainer - items: 2
                [0x2039] CStorageUCString - length: 37
                Custom Attribute Container (Autodesk)
                [0x2037] CStorageUCString - length: 23
                CustAttribContainer.dlo
...

Of course, it would be interesting if we could go further, and directly manipulate the parameters of our own plugins and scripts from our own tools back into the max files so that everything is centrally stored without any duplicate source data in the way. And that's exactly what I'll be doing next.

Sunday 22 August 2010

Why avoid closed silver bullet game engines, why not? A full page of Unity bashing

This is another short excerpt of text that was originally written as part of my final internship report in June. It is here aimed towards Unity, as this was the silver bullet of the year, but a lot of this can generally be applied to pretty much any closed silver bullet engine.

I have now worked with Unity for three projects. A small game at the global game jam this year, an online multiplayer board game prototype, and a first person shooter.

If you want to make generic re-hashed physics games, where the player is a character walking around a world, and you don't want to do anything technologically innovative, then Unity is for you, and then you should use it. For me, it is a pure waste of my time.

Initially, I had given it a chance, because it seemed interesting, and there was quite a lot of attention paid to it recently. While using it for the first time, though, I got the impression that the overly designed architecture, which Unity forces you to use, leads to some very sloppy coding practices (public variables, singletons, and such), and others who I talked to found this as well.

Another issue that quickly turned up with Unity was the fact that it's not possible to debug. When Unity crashes with a fatal error, and it really does that in quite a few situations, you can lose a very valuable amount of time on figuring out just where it's going wrong. Sure, with Unity it's possible to try after every single line of code if it works, because the compilation takes literally no time. After a while you start spending more than half of your time on just launching the game just to see if that last line of code doesn't crash Unity.

I expect from an engine that it only provides me with highly optimized implementations of fundamentally important techniques that take a long time to write. And that I can choose to bypass or modify as I wish. Unity does not give me this. Instead, it gives me a truckload of half working garbage made from a bunch of libraries they licensed from elsewhere which they crudely stitched together to form one static unified blob. In comparison, the most friendly engine that I have used is XNA, simply because it does all the boring stuff for you, but gives you complete freedom in the area of graphical programming. Unity is making me think of ditching C# altogether, just so I can avoid it.

But one of the major issues, really, is vendor lock-in. Mostly any script written for Unity, can not be relevantly used in any game that does not run under Unity, thanks to the Unity specific interfaces that are necessary for scripts to interact with each other in Unity. On the other hand, code written in C++ for one project, with one engine, can easily be retransformed for use in another project without major changes, even when using a different engine, because a proper engine does not force me to use poor design patterns. And a game written in C/C++ using OpenGL and OpenAL runs potentially on everything, because I can make it to, while a game written in Unity will only run on the platforms that Unity decides to support. Unity will not run on Linux, because they said so.

It also does not allow me to experiment with new technologies or techniques. It is not practical to, for example, make use of OpenCL from within Unity, as we do not truly have direct access, which makes it impractical to share data between Unity rendering and OpenCL without needlessly copying data back and forth between the mainboard ram and the gpu ram.

Then there are countless bugs and design flaws, especially in the networking and sound interfaces of the Unity engine, that they have known about for a long time already, as can be proven by various posts on the Unity forums, which are easier to fix by writing an own engine, than by working around them.

In the end though, if I constantly have to hear, only from people who actually don't even make games with Unity themselves, how good Unity is, that "everything" will be fixed in "the next version", and that I should just use it, then something is clearly wrong. It's more viral marketing hype than useful technology.

It's potentially a nice level editor, though.

But the thing is, a lot of people do prefer to work with a commercial all-in-one solution, such as Flash, Unity, or whatever the next silver bullet is. They just want to get their ideas out there, without having to bother with technical details, and they'll just hop on to the next bandwagon whenever it passes by. The upside of this is that they will keep following the current technological advances. The downside is that they'll be lagging behind the current technologies for as far as their engine of choice goes.

I'm not in favor of writing your own engine from scratch either. The thing that you need to be capable of doing, is to rush forward, ahead of the commonly known techniques, and extend what already exists. There is no use in rewriting again and again what we already have, it is more valuable to build upon that. To make a car analogy here; don't reinvent the wheel, make something that's not a wheel that can do it's function of transportation much better than a wheel, and mount it on an existing car.

So, in my opinion as a programmer, the best choice is to start out from a game engine which gives you full access to the source code, whether that be a commercially licensed or open source licensed engine does not matter. What matters is that you should have the ability to fix what is wrong, without wasting more time than necessary. And extend that with your own unique ideas and experiments.