Kaetemi


Tag - max_file_format


Saturday 25 August 2012

3ds Max File Format (Part 6: We get signal)

Let's see what we can do now.

INode *node = scene.container()->scene()->rootNode()->find(ucstring("TR_HOF_civil01_gilet"));
nlassert(node);
exportObj("tr_hof_civil01_gilet.obj", node->getReference(1)->getReference(1));

Main screen turn on!!

Plain easy, right?

Thursday 23 August 2012

3ds Max File Format (Part 5: How it all links together; ReferenceMaker, INode)

At this point, you should start to familiarize yourself a bit with the publicly available 3ds Max API documentation. The contents of the file map practically 1:1 with how the system is built up internally. Most important is the inheritance of the classes, as we need to be aware of all the parent classes, and preferably structure our parser classes in a similar way.

As a reminder, check out the output that we got out of the file in the last part. Turn on some good music, and scroll away. The file starts with a ParamBlock2, and ends with a Scene, which is interesting, isn't it? ParamBlock2 is one of the lowest classes in power, while Scene stands basically at the top of the entire system. It means that the deepest structures, or rather the structures on which everything depends, are serialized out first, followed by the higher classes, which are very likely to refer to them in some way. Chances are high that the second class that is serialized directly refers to the first one; and, as you can easily spot, the class with index 1 has two values that equal 0.

        1 0x0001: (SceneClassUnknown: ViewportManager, (0x5a06293c, 0x30420c1e), ViewportManager.gup) [3] { 
                0 0x2034: (StorageRaw) { 
                        Size: 8
                        String: ........ 
                        Hex: 00 00 00 00 ff ff ff ff } 
                1 0x204b: (StorageRaw) { 
                        Size: 1
                        String: . 
                        Hex: 2e } 
                2 0x1000: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 00 00 00 00 
                        Int: 0 
                        Float: 0 } } 

Unfortunately, 0 is not much of a trustworthy value (which reminds me that you should search for Float: 1 values appearing in the scene, those tend to reveal a lot). But we do have at least two potential chunks, 0x2034 or 0x1000, that may refer to the first class. Moving on to the next class, at index 2 there's another ParamBlock2, which is unlikely to refer to something else, and at index 3 we find another class which happens to have a reference to index 2 in one of its values, 0x2034, which, as examining more data will show, is no coincidence.

        3 0x0002: (SceneClassUnknown: mental ray: material custom attribute, (0x218ab459, 0x25dc8980), mrMaterialAttribs.gup) [2] { 
                0 0x2034: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 02 00 00 00 
                        Int: 2 
                        Float: 2.8026e-45 } 
                1 0x204b: (StorageRaw) { 
                        Size: 1
                        String: . 
                        Hex: 2e } } 

As we saw in the previous part, the first chunk that could be found in many of the class instances in the scene, was related to AppData, which is implemented in the Animatable class. The chunks containing the references appear either first, or after this AppData chunk. They are handled by the ReferenceMaker class, a subclass of Animatable, which is one of the base classes that all of the scene classes derive from. Given this, we can fairly assume that the chunks are written in order of inheritance. This is fairly logical. And it also allows us to guess more accurately at which level chunks are implemented that we have not encountered before.
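To make the "chunks are written in order of inheritance" idea a bit more tangible, here is a minimal sketch of parser classes that mirror the SDK hierarchy. The class names, the `Level` enum and the `inInheritanceOrder` helper are made up for illustration, and assigning 0x0960 and 0x0962 to the Node level is only an assumption based on where they show up in the dump; this is not how the actual parser is written.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Inheritance levels, base class first. Illustrative only.
enum Level { LevelUnknown = -1, LevelAnimatable = 0, LevelReferenceMaker = 1, LevelNode = 2 };

struct AnimatableParser {
    virtual ~AnimatableParser() {}
    // Returns the level that claims a chunk id, or LevelUnknown.
    virtual Level level(uint16_t id) const {
        return id == 0x2150 ? LevelAnimatable : LevelUnknown; // AppData container
    }
};

struct ReferenceMakerParser : AnimatableParser {
    Level level(uint16_t id) const override {
        if (id == 0x2034 || id == 0x2035) return LevelReferenceMaker; // references
        return AnimatableParser::level(id);
    }
};

struct NodeParser : ReferenceMakerParser {
    Level level(uint16_t id) const override {
        if (id == 0x0960 || id == 0x0962) return LevelNode; // parent + flags, name (assumed)
        return ReferenceMakerParser::level(id);
    }
};

// Checks that known chunks never jump back to a lower base class.
// Chunks nobody claims are skipped, since we cannot place them yet.
bool inInheritanceOrder(const AnimatableParser &parser, const std::vector<uint16_t> &ids) {
    int highest = LevelAnimatable;
    for (uint16_t id : ids) {
        Level l = parser.level(id);
        if (l == LevelUnknown) continue;
        if (l < highest) return false;
        highest = l;
    }
    return true;
}
```

Running a check like this over every class instance in a file would be one way to test the ordering guess, and any unclaimed chunk sitting between two known ones narrows down which level it might belong to.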

The references, however, are not related to the more user-centric scene graph. Consultation of the documentation will clarify that the scene graph itself is stored in Node classes, which then reference the objects that they represent in the scene, and that the objects normally have no need for knowledge of their position or rank in the scene. In the file, we can find two node entries. After improving our parser to better show the references, these appear as follows.

        380 0x0012: (NodeUnknown: RootNode, (0x00000002, 0x00000000), Builtin) [0] { 
                References 0x2035: (StorageArray) { 
                        0: 16 } 
                0x204B Equals 0x2E (46): (CStorageValue) { 46 } 
                Orphan[0] 0x0120: (StorageRaw) { 
                        Size: 0
                        String:  
                        Hex: } } 

and

        389 0x001a: (NodeImpl: Node, (0x00000001, 0x00000000), Builtin) [0] { 
                AppData: (AppData) [83] PARSED { 
... etc ... }
                References 0x2035: (StorageArray) { 
                        0: 16
                        1: 0
                        2: 384
                        3: 1
                        4: 387
                        5: 3
                        6: 134
                        7: 6
                        8: 388 } 
                0x204B Equals 0x2E (46): (CStorageValue) { 46 } 
                Orphan[0] 0x09ce: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: bc 02 00 00 
                        Int: 700 
                        Float: 9.80909e-43 } 
                Orphan[1] 0x0960: (StorageRaw) { 
                        Size: 8
                        String: |....... 
                        Hex: 7c 01 00 00 00 08 00 00 } 
                Orphan[2] 0x0962: (StorageRaw) { 
                        Size: 40
                        String: G.E._.A.c.c._.M.i.k.o.t.o.B.a.n.i.e.r.e. } 
                Orphan[3] 0x09ba: (StorageRaw) { 
                        Size: 0
                        String:  
                        Hex: } 
... etc ....

The RootNode appears before the Node, which is opposite to how the Scene stores the classes. This would mean that a node holds a reference to its parent somewhere. It does not seem to be in the references block, though. The index of the parent, 380, would be found in the hex print as 7c 01. Easily spotted. The first four bytes of 0x0960 refer to the parent index; the four bytes after that suspiciously look like flags of some sort. The chunk before it, 0x09ce, is constant across the file and differs between file versions, so I would expect it to be a version number. The chunk with the name after that speaks for itself, and the empty one doesn't seem too interesting right now, because it's empty. Empty chunks tend to contain strings or arrays; that's all you can guess.
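As a rough sketch of that reading of the 0x0960 chunk: the struct and function names below are hypothetical, and treating the second integer as flags is only the guess made above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical view of the 8-byte 0x0960 chunk of a Node.
struct NodeParentChunk {
    uint32_t parentIndex; // scene index of the parent node
    uint32_t flags;       // guessed meaning, not confirmed
};

// Reads a 32-bit little-endian unsigned integer.
static uint32_t readU32LE(const uint8_t *p) {
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
           (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

NodeParentChunk parseNodeParentChunk(const std::vector<uint8_t> &data) {
    assert(data.size() >= 8);
    NodeParentChunk chunk;
    chunk.parentIndex = readU32LE(data.data());
    chunk.flags = readU32LE(data.data() + 4);
    return chunk;
}
```

Fed the bytes 7c 01 00 00 00 08 00 00 from the dump, this gives a parent index of 380, which is the RootNode, and a second value of 0x800.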

But, this node is just some metadata, and it does not actually contain the mesh. It must point to the mesh somehow. In the file there's an Editable Poly at index 387.

387 0x0018: (SceneClassUnknown: Editable Poly, (0x1bf8338d, 0x192f6098), EPoly.dlo) [12] { 

Which happens to be referred to at index 4 in the 0x2035 block. The chunk refers to more, so we look up those as well.

384 0x0016: (SceneClassUnknown: Position/Rotation/Scale, (0x00002005, 0x00000000), Builtin) [5] { 
134 0x000d: (SceneClassUnknown: NeL Material, (0x64c75fec, 0x222b9eb9), Script) [8] { 
388 0x0019: (SceneClassUnknown: Base Layer, (0x7e9858fe, 0x1dba1df0), Builtin) [9] { 

There are other values in this 0x2035 chunk, but they are not references. Two different chunks are used for storing references to other classes, 0x2034 and 0x2035; a class has either one of them, never both. The 0x2034 block is simply an array of all references directly, where -1 indices map to a NULL pointer. Here, however, we see the block 0x2035 in use. Cross-referencing with different files reveals that the object appears after the number 1, a controller after 0, a material after 3, and a layer after 6. It would appear that this chunk stores both the reference's index and the class's index as a map instead, preceded by some other integer value that probably contains flags of some sort.
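The two reference layouts could be decoded roughly like this. It is a sketch under the assumptions stated above: 0x2034 is a flat array of 32-bit scene indices with -1 for NULL, and 0x2035 is one leading integer followed by (reference slot, scene index) pairs. The function and struct names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Reads a 32-bit little-endian signed integer.
static int32_t readI32LE(const uint8_t *p) {
    uint32_t v = uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
                 (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
    return static_cast<int32_t>(v);
}

// 0x2034: slot i holds the scene index of reference i, -1 meaning NULL.
std::vector<int32_t> parseReferences2034(const std::vector<uint8_t> &data) {
    assert(data.size() % 4 == 0);
    std::vector<int32_t> refs;
    for (std::size_t i = 0; i < data.size(); i += 4)
        refs.push_back(readI32LE(data.data() + i));
    return refs;
}

// 0x2035: a leading integer (flags, presumably), then slot/index pairs.
struct References2035 {
    int32_t header;
    std::map<int32_t, int32_t> refs; // reference slot -> scene index
};

References2035 parseReferences2035(const std::vector<uint8_t> &data) {
    assert(data.size() >= 4 && (data.size() - 4) % 8 == 0);
    References2035 result;
    result.header = readI32LE(data.data());
    for (std::size_t i = 4; i < data.size(); i += 8)
        result.refs[readI32LE(data.data() + i)] = readI32LE(data.data() + i + 4);
    return result;
}
```

For the Node above, this would yield a header of 16 and the map 0 → 384 (controller), 1 → 387 (object), 3 → 134 (material), 6 → 388 (layer).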

At this point, we can parse the scene graph. That's very nice, but we're not getting any actual 3D data yet, which, for curiosity's sake, would be totally fun to get out of all this, right? Even though my initial plan was just to parse the file to change some filepath strings inside there.

Tuesday 21 August 2012

3ds Max File Format (Part 4: The first useful data; Scene, AppData, Animatable)

The most interesting part of this file is, evidently, the Scene. Opening it up in the chunk parser, it begins as follows, and goes on for a few tens of thousands of lines:

0 0x2012: (StorageContainer) [452] { 
        0 0x0000: (StorageContainer) [6] { 
                0 0x000b: (StorageRaw) { 
                        Size: 24
                        String: <).Z..B0`......!........ 
                        Hex: 3c 29 06 5a 1e 0c 42 30 60 11 00 00 00 00 00 21 e0 2e 04 00 01 00 00 00 } 
                1 0x000e: (StorageRaw) { 
                        Size: 19
                        String: ..............@.... 
                        Hex: 00 00 04 00 00 00 00 00 82 00 00 00 00 00 40 00 00 00 00 } 
                2 0x000e: (StorageRaw) { 
                        Size: 19
                        String: ..............@.... 
                        Hex: 01 00 01 00 00 00 00 00 82 00 00 00 00 00 40 00 00 00 00 } 

Inside the Scene stream, we can find a single chunk container. In this case the id of this chunk is 0x2012, and is apparently related to which version, 2010, the file was saved with. This container contains a large number of objects, 452 in this case, with ids that are all very low in value. So, we go look for something we recognize.

While scrolling through the file, I found a short string that I recognized, "no fx", which is part of the NeL node properties plugin. It appears in the file under the following section:

        389 0x001a: (StorageContainer) [25] { 
                0 0x2150: (StorageContainer) [84] { 
                        0 0x0100: (StorageRaw) { 
                                Size: 4
                                String: S... 
                                Hex: 53 00 00 00 
                                Int: 83 
                                Float: 1.16308e-43 } 
...
                        18 0x0110: (StorageContainer) [2] { 
                                0 0x0120: (StorageRaw) { 
                                        Size: 20
                                        String: XH...u.. ....'...... 
                                        Hex: 58 48 d6 04 1d 75 d1 16 20 10 00 00 2e 27 0c 05 09 00 00 00 } 
                                1 0x0130: (StorageRaw) { 
                                        Size: 9
                                        String: no sound. } } 
                        19 0x0110: (StorageContainer) [2] { 
                                0 0x0120: (StorageRaw) { 
                                        Size: 20
                                        String: XH...u.. .../'...... 
                                        Hex: 58 48 d6 04 1d 75 d1 16 20 10 00 00 2f 27 0c 05 06 00 00 00 } 
                                1 0x0130: (StorageRaw) { 
                                        Size: 6
                                        String: no fx. } } 
                        20 0x0110: (StorageContainer) [2] { 
                                0 0x0120: (StorageRaw) { 
                                        Size: 20
                                        String: XH...u.. ........... 
                                        Hex: 58 48 d6 04 1d 75 d1 16 20 10 00 00 b2 07 00 00 01 00 00 00 } 
                                1 0x0130: (StorageRaw) { 
                                        Size: 1
                                        String: . 
                                        Hex: 00 } } 
...

The 0x2150 chunk seems to exclusively contain 0x0110 entries, preceded by one 0x0100 header entry stating the number of 0x0110 entries. Every entry has a 0x0130 binary blob which is the value of some property, and a 0x0120 binary blob which would likely be some sort of identifier header of this value. Now we should find out how our own NeL plugin code puts this value in there, to see if we can make sense of this header.

CExportNel::getScriptAppData(pNode, NEL3D_APPDATA_ENV_FX, "no fx");

This is the line of code that reads the environment fx property from a node. The value of NEL3D_APPDATA_ENV_FX is defined as:

#define NEL3D_APPDATA_ENV_FX (84682543)

Which should appear as 2f 27 0c 05 in the hex output. And it can indeed be found at byte index 12, followed by another integer 6 which happens to match with the length of the value in the value chunk. There's that redundancy showing up again. Looking deeper into the function to figure out the rest of the bytes, we can find this in the getScriptAppData function:

AppDataChunk *ap=node->GetAppDataChunk (MAXSCRIPT_UTILITY_CLASS_ID, UTILITY_CLASS_ID, id);

Where the class id MAXSCRIPT_UTILITY_CLASS_ID matches with the first 8 bytes of the header, and the super class id UTILITY_CLASS_ID matches with the following 4 bytes. In short:

58 48 d6 04 1d 75 d1 16 // MAXSCRIPT_UTILITY_CLASS_ID
20 10 00 00 // UTILITY_CLASS_ID
2f 27 0c 05 // NEL3D_APPDATA_ENV_FX
06 00 00 00 // SIZE

We can now be quite sure that the 0x2150 block contains all of the AppData entries. According to the public SDK documentation, the AppData functionality is part of the Animatable base class. Which means that the chunk which contains this block is either an instance of this Animatable class, or more likely a subclass of it. It is of course very likely that the class is somehow identified by the chunk id, which in this case is 0x001a, or 26 in decimal.
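The 20-byte 0x0120 header can be read along these lines. This is just a sketch of the layout worked out above; the struct and field names are my own and not taken from the SDK.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical view of the 0x0120 header in front of each AppData value.
struct AppDataHeader {
    uint32_t classIdA;     // first half of the ClassId
    uint32_t classIdB;     // second half of the ClassId
    uint32_t superClassId;
    uint32_t subId;        // e.g. NEL3D_APPDATA_ENV_FX
    uint32_t size;         // repeats the size of the 0x0130 value chunk
};

// Reads a 32-bit little-endian unsigned integer.
static uint32_t readU32LE(const uint8_t *p) {
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
           (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

AppDataHeader parseAppDataHeader(const std::vector<uint8_t> &data) {
    assert(data.size() == 20);
    AppDataHeader h;
    h.classIdA = readU32LE(data.data());
    h.classIdB = readU32LE(data.data() + 4);
    h.superClassId = readU32LE(data.data() + 8);
    h.subId = readU32LE(data.data() + 12);
    h.size = readU32LE(data.data() + 16);
    return h;
}
```

With the "no fx" header bytes, `subId` comes out as 84682543 and `size` as 6, matching the define and the length of the value chunk.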

Now, the reason why you shouldn't have skipped part 1 and part 2 of this series is that the data in the DllDirectory and ClassDirectory3 streams is needed to make sense of the Scene stream. Similar to how the ClassDirectory3 references a dll in the DllDirectory by index, the Scene here references a class by index in the ClassDirectory3. This allows us to get hold of the ClassId, needed to instantiate the class from the associated class description instance, or to provide a fallback mechanism when the class is part of a missing plugin, so the user can be given the name and description of what is missing from their installation.

At index 26 of the ClassDirectory3 in this file, I can find the following entry:

Entries[26]: (ClassEntry) [2] PARSED { 
        Header: (ClassEntryHeader) { 
                DllIndex: -1
                ClassId: (0x00000001, 0x00000000)
                SuperClassId: 1 } 
        Name: Node}

Which is the builtin class Node, which probably implements the INode interface that ends up deriving from the Animatable base class.

To confirm this theory, I go the other way, and look for a class I know about.

Entries[13]: (ClassEntry) [2] PARSED { 
        Header: (ClassEntryHeader) { 
                DllIndex: -2
                ClassId: (0x64c75fec, 0x222b9eb9)
                SuperClassId: 3072 } 
        Name: NeL Material} 

At index 13 I find the NeL material script class, which would then be found as a chunk in the file with id 0x000d. And there exists one.

        134 0x000d: (StorageContainer) [8] { 
                0 0x2034: (StorageRaw) { 
                        Size: 40
                        String: |.......~............................... 
                        Hex: 7c 00 00 00 7d 00 00 00 7e 00 00 00 7f 00 00 00 80 00 00 00 81 00 00 00 82 00 00 00 83 00 00 00 84 00 00 00 85 00 00 00 } 
                1 0x204b: (StorageRaw) { 
                        Size: 1
                        String: . 
                        Hex: 2e } 
                2 0x21b0: (StorageContainer) [1] { 
                        0 0x1020: (StorageRaw) { 
                                Size: 4
                                String: ^... 
                                Hex: 5e 00 00 00 
                                Int: 94 
                                Float: 1.31722e-43 } } 
                3 0x0010: (StorageRaw) { 
                        Size: 8
                        String: ........ 
                        Hex: 0e 00 00 00 09 00 00 00 } 
                4 0x4001: (StorageRaw) { 
                        Size: 50
                        String: M.T.R.L._.G.E._.A.c.c._.M.i.k.o.t.o.B.a.n.i.e.r.e. } 
                5 0x4003: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 0a 02 00 00 
                        Int: 522 
                        Float: 7.31478e-43 } 
                6 0x4020: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 00 00 00 00 
                        Int: 0 
                        Float: 0 } 
                7 0x4030: (StorageRaw) { 
                        Size: 16
                        String: ...............? 
                        Hex: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 3f } } 

Which, as entry 0x4001 helpfully lets us know by the UTF-16 name MTRL_GE_Acc_MikotoBaniere, is indeed a NeL Material.
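Put together, resolving a Scene chunk id to a class could look something like the sketch below. The structs are simplified stand-ins for the directory entries, and reading DllIndex -1 as "Builtin" and -2 as "Script" is inferred from the two entries shown here rather than from any documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified, hypothetical views of the two directory entry types.
struct DllEntry {
    std::string description;
    std::string name;
};

struct ClassEntry {
    int32_t dllIndex; // >= 0: index into DllDirectory; negative: special (inferred)
    uint32_t classIdA;
    uint32_t classIdB;
    std::string name;
};

// The chunk id of an object in the Scene is an index into ClassDirectory3.
std::string describeChunk(uint16_t chunkId,
                          const std::vector<ClassEntry> &classDir,
                          const std::vector<DllEntry> &dllDir) {
    if (chunkId >= classDir.size()) return "<no class entry>";
    const ClassEntry &entry = classDir[chunkId];
    std::string origin = "<unknown dll>";
    if (entry.dllIndex == -1) origin = "Builtin";      // as seen for Node
    else if (entry.dllIndex == -2) origin = "Script";  // as seen for NeL Material
    else if (entry.dllIndex >= 0 &&
             static_cast<std::size_t>(entry.dllIndex) < dllDir.size())
        origin = dllDir[entry.dllIndex].name;
    return entry.name + " (" + origin + ")";
}
```

When a plugin is missing, the same lookup still gives a class name and a dll name to report to the user, which is the fallback described earlier.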

So, using this knowledge, we can parse the scene container in a smarter way. Here I provide for you the output at this point.

http://dl.kaetemi.be/blog/maxfile_dump_4.txt

Now that we have figured out which chunks are contained in the Scene streams, and we know how and where some of the data inside it is formatted, how does it all link together? That will be the next subject.

Sunday 19 August 2012

3ds Max File Format (Part 3: The department of redundancy department; Config)

Now we'll have a look at the Config stream. It begins as follows, and goes on forever with various integer fields and other binary blobs.

(StorageContainer) [15] { 
0 0x2090: (StorageRaw) { 
        Size: 4
        String: .... 
        Hex: 00 00 00 00 } 
1 0x20e0: (StorageContainer) [4] { 
        0 0x0100: (StorageRaw) { 
                Size: 12
                String: .. A........ 
                Hex: 00 00 20 41 0a 00 00 00 01 00 00 00 } 
        1 0x0400: (StorageRaw) { 
                Size: 8
                String: ........ 
                Hex: 07 00 00 00 01 00 00 00 } 

As most of the contents seem fairly different from each other, it's best to step back and look at the main container from a distance.

(StorageContainer) [15] {
0 0x2090: (StorageRaw)
1 0x20e0: (StorageContainer) [4]
2 0x20a0: (StorageContainer) [2]
3 0x20a5: (StorageContainer) [2]
4 0x20a6: (StorageContainer) [1]
5 0x2190: (StorageContainer) [2]
6 0x20b0: (StorageContainer) [10]
7 0x2130: (StorageContainer) [3]
8 0x2080: (StorageContainer) [213]
9 0x20d0: (StorageContainer) [9]
10 0x2160: (StorageContainer) [5]
11 0x21a0: (StorageContainer) [82]
12 0x2180: (StorageContainer) [1]
13 0x2007: (StorageContainer) [1]
14 0x2008: (StorageContainer) [3] }

Each id at this level seems to be unique, so we can assume that each of these containers has a specific set of information in it. Comparing files from different Max versions, some have fewer and some have more of these identifiers, but their contents remain pretty much the same.

One container in this file particularly interests me, as it contains strings related to the NeL Material, and thus will likely be necessary to parse the Scene format where this is stored. More specifically, chunk 0x2180 contains stuff like the following:

9 0x0007: (StorageContainer) [3] { 
        0 0x0060: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 02 00 00 00 } 
        1 0x0006: (StorageRaw) { 
                Size: 17
                String: ....bForceZWrite. 
                Hex: 0d 00 00 00 62 46 6f 72 63 65 5a 57 72 69 74 65 00 } 
        2 0x0007: (StorageContainer) [7] { 
                0 0x0060: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 06 00 00 00 } 
                1 0x0006: (StorageRaw) { 
                        Size: 9
                        String: ....type. 
                        Hex: 05 00 00 00 74 79 70 65 00 } 
                2 0x0006: (StorageRaw) { 
                        Size: 12
                        String: ....boolean. 
                        Hex: 08 00 00 00 62 6f 6f 6c 65 61 6e 00 } 
...

The block itself begins like:

12 0x2180: (StorageContainer) [1] { 
        0 0x0040: (StorageContainer) [2] { 
                0 0x0050: (StorageRaw) { 
                        Size: 12
                        String: ....._.d..+" 
                        Hex: 00 0c 00 00 ec 5f c7 64 b9 9e 2b 22 } 
                1 0x0007: (StorageContainer) [10] { 
                        0 0x0060: (StorageRaw) { 
                                Size: 4
                                String: .... 
                                Hex: 09 00 00 00 } 
                        1 0x0007: (StorageContainer) [15] { 
                                0 0x0060: (StorageRaw) { 
                                        Size: 4
                                        String: .... 
                                        Hex: 0e 00 00 00 } 
                                1 0x0006: (StorageRaw) { 
                                        Size: 9
                                        String: ....nlbp. 
                                        Hex: 05 00 00 00 6e 6c 62 70 00 } 
...

A max file that also uses the NeL Multi Bitmap script has two 0x0040 entries in the 0x2180 container. We'll call the 0x2180 block ConfigScript, and the 0x0040 container ConfigScriptEntry from now on, as these seem to be related to how script parameters will be stored in the file. What's also interesting is that all the chunks with id 0x0007 in this entire block are containers, and all the 0x0060 blocks are integers. The 0x0050 block is the header block for the ConfigScriptEntry, and contains the same SuperClassID and ClassID from the NeL Material script as seen previously.
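Decoding that 12-byte 0x0050 header is short enough to sketch. The struct name is hypothetical; the field order (SuperClassID first, then the two halves of the ClassID) is simply what the bytes in the dump suggest.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical view of the 0x0050 header of a ConfigScriptEntry.
struct ConfigScriptHeader {
    uint32_t superClassId;
    uint32_t classIdA;
    uint32_t classIdB;
};

// Reads a 32-bit little-endian unsigned integer.
static uint32_t readU32LE(const uint8_t *p) {
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
           (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

ConfigScriptHeader parseConfigScriptHeader(const std::vector<uint8_t> &data) {
    assert(data.size() == 12);
    ConfigScriptHeader h;
    h.superClassId = readU32LE(data.data());
    h.classIdA = readU32LE(data.data() + 4);
    h.classIdB = readU32LE(data.data() + 8);
    return h;
}
```

For the bytes above this gives SuperClassID 3072 (0x0c00) and ClassID (0x64c75fec, 0x222b9eb9).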

Here are a few 0x0007 blocks from a specific depth inside the tree structure:

2 0x0007: (StorageContainer) [5] { 
        0 0x0060: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 04 00 00 00 } 
        1 0x0006: (StorageRaw) { 
                Size: 9
                String: ....type. 
                Hex: 05 00 00 00 74 79 70 65 00 } 
        2 0x0006: (StorageRaw) { 
                Size: 12
                String: ....boolean. 
                Hex: 08 00 00 00 62 6f 6f 6c 65 61 6e 00 } 
        3 0x0006: (StorageRaw) { 
                Size: 12
                String: ....default. 
                Hex: 08 00 00 00 64 65 66 61 75 6c 74 00 } 
        4 0x0001: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 00 00 00 00 } } } 
2 0x0007: (StorageContainer) [7] { 
        0 0x0060: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 06 00 00 00 } 
        1 0x0006: (StorageRaw) { 
                Size: 9
                String: ....type. 
                Hex: 05 00 00 00 74 79 70 65 00 } 
        2 0x0006: (StorageRaw) { 
                Size: 10
                String: ....float. 
                Hex: 06 00 00 00 66 6c 6f 61 74 00 } 
        3 0x0006: (StorageRaw) { 
                Size: 12
                String: ....default. 
                Hex: 08 00 00 00 64 65 66 61 75 6c 74 00 } 
        4 0x0004: (StorageRaw) { 
                Size: 4
                String: ..#< 
                Hex: 0a d7 23 3c } 
        5 0x0006: (StorageRaw) { 
                Size: 7
                String: ....ui. 
                Hex: 03 00 00 00 75 69 00 } 
        6 0x0007: (StorageContainer) [2] { 
                0 0x0060: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 01 00 00 00 } 
                1 0x0006: (StorageRaw) { 
                        Size: 17
                        String: ....cfBumpUSpeed. 
                        Hex: 0d 00 00 00 63 66 42 75 6d 70 55 53 70 65 65 64 00 } } } } 
2 0x0007: (StorageContainer) [5] { 
        0 0x0060: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 04 00 00 00 } 
        1 0x0006: (StorageRaw) { 
                Size: 9
                String: ....type. 
                Hex: 05 00 00 00 74 79 70 65 00 } 
        2 0x0006: (StorageRaw) { 
                Size: 12
                String: ....integer. 
                Hex: 08 00 00 00 69 6e 74 65 67 65 72 00 } 
        3 0x0006: (StorageRaw) { 
                Size: 12
                String: ....default. 
                Hex: 08 00 00 00 64 65 66 61 75 6c 74 00 } 
        4 0x0003: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 01 00 00 00 } } } 

If you have an understanding of the file format at this point, and try to understand the contents of these blocks, you should notice how ridiculous this looks.

It's like making an xml file that goes

<name>Name</name><value>Number</value>
<name>Value</name><value>2</value>

instead of

<Number>2</Number>

thanks to abstraction layers being piled up on each other.

The 0x0060 entry contains an integer which states the number of chunks that follow in the container, and is thus basically the size header of an array. Chunks with id 0x0006 recognizably contain strings, prefixed with their size, which is already known by the chunk header, and followed by an unnecessary null value byte suffix. It gets even sillier. The blocks shown above are actually not arrays, but tables of two columns stored in an array stored in the chunk tree structure. The first value in the array is the name column, and the second value the value column.
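In code, the string layout and the two-column pairing might be handled as in the sketch below. The function names are made up, and for simplicity the table helper assumes every value has already been turned into a string and that the 0x0060 count entry has been dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Reads a 32-bit little-endian unsigned integer.
static uint32_t readU32LE(const uint8_t *p) {
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
           (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// 0x0006: 32-bit length (counting the trailing NUL), characters, NUL byte.
std::string parseMetaString(const std::vector<uint8_t> &data) {
    assert(data.size() >= 5);
    uint32_t length = readU32LE(data.data());
    assert(length >= 1 && data.size() == 4 + static_cast<std::size_t>(length));
    return std::string(reinterpret_cast<const char *>(data.data() + 4), length - 1);
}

// Turns the flat array (name, value, name, value, ...) into table rows.
std::vector<std::pair<std::string, std::string> >
toTable(const std::vector<std::string> &values) {
    assert(values.size() % 2 == 0);
    std::vector<std::pair<std::string, std::string> > rows;
    for (std::size_t i = 0; i < values.size(); i += 2)
        rows.push_back(std::make_pair(values[i], values[i + 1]));
    return rows;
}
```

The bytes 05 00 00 00 74 79 70 65 00 decode to `type`, and the four values of the first block shown above pair up into the rows type/boolean and default/0.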

These blocks are child chunks of containers, containing a chunk with a name of a data field, and describe the format of this data field. It is fairly straightforward, the value of the name 'type' is the type, the 'default' is the default, and so on. For the default value, and this actually goes for the entire format of the ConfigScript block, the id of the chunk is directly related to the type field, and defines the actual low level storage of the field. The type field is very helpful in finding out the meanings of these chunk ids.

0x0001 is a boolean stored as 4 bytes, 0x0002 does not appear in my file, 0x0003 is a 32 bit possibly signed integer, 0x0004 is a float, 0x0005 is a string in the same format as the 0x0006 internal strings, 0x0007 is the previously covered array-in-a-container, 0x0008 is a color stored as a vector of 3 floats with value 255.f being the maximum. With this information, the file can be made much more readable. Here's a short excerpt (pun intended) from inside the ConfigScript block.

2 0x0007: (ConfigScriptMetaContainer) [42] { 
        0 0x0060: (CStorageValue) { 41 } 
        1 0x0006: (ConfigScriptMetaString) { main } 
        2 0x0003: (CStorageValue) { 1 } 
        3 0x0003: (CStorageValue) { 2 } 
        4 0x0007: (ConfigScriptMetaContainer) [3] { 
                0 0x0060: (CStorageValue) { 2 } 
                1 0x0006: (ConfigScriptMetaString) { rollout } 
                2 0x0006: (ConfigScriptMetaString) { NelParams } } 
        5 0x0007: (ConfigScriptMetaContainer) [3] { 
                0 0x0060: (CStorageValue) { 2 } 
                1 0x0006: (ConfigScriptMetaString) { bLightMap } 
                2 0x0007: (ConfigScriptMetaContainer) [5] { 
                        0 0x0060: (CStorageValue) { 4 } 
                        1 0x0006: (ConfigScriptMetaString) { type } 
                        2 0x0006: (ConfigScriptMetaString) { boolean } 
                        3 0x0006: (ConfigScriptMetaString) { default } 
                        4 0x0001: (CStorageValue) { 0 } } } 
        6 0x0007: (ConfigScriptMetaContainer) [3] { 
                0 0x0060: (CStorageValue) { 2 } 
                1 0x0006: (ConfigScriptMetaString) { bUnlighted } 
                2 0x0007: (ConfigScriptMetaContainer) [7] { 
                        0 0x0060: (CStorageValue) { 6 } 
                        1 0x0006: (ConfigScriptMetaString) { type } 
                        2 0x0006: (ConfigScriptMetaString) { boolean } 
                        3 0x0006: (ConfigScriptMetaString) { default } 
                        4 0x0001: (CStorageValue) { 0 } 
                        5 0x0006: (ConfigScriptMetaString) { ui } 
                        6 0x0007: (ConfigScriptMetaContainer) [2] { 
                                0 0x0060: (CStorageValue) { 1 } 
                                1 0x0006: (ConfigScriptMetaString) { cbUnlighted } } } } 
        7 0x0007: (ConfigScriptMetaContainer) [3] { 
                0 0x0060: (CStorageValue) { 2 } 
                1 0x0006: (ConfigScriptMetaString) { bStainedGlassWindow } 
                2 0x0007: (ConfigScriptMetaContainer) [7] { 
                        0 0x0060: (CStorageValue) { 6 } 
                        1 0x0006: (ConfigScriptMetaString) { type } 
                        2 0x0006: (ConfigScriptMetaString) { boolean } 
                        3 0x0006: (ConfigScriptMetaString) { default } 
                        4 0x0001: (CStorageValue) { 0 } 
                        5 0x0006: (ConfigScriptMetaString) { ui } 
                        6 0x0007: (ConfigScriptMetaContainer) [2] { 
                                0 0x0060: (CStorageValue) { 1 } 
                                1 0x0006: (ConfigScriptMetaString) { cbStainedGlassWindow } } } } 

Most other blocks in this file seem to contain value sets where the type is fixed to the id, and the id is basically the name of the config value. Right now, there doesn't seem to be anything in there that interests me, so I won't bother with them too much, but here's an example of one simplified anyways.

2 0x20a0: (Config20a0) [2] { 
        0 0x0100: (CStorageValue) { 1 } 
        1 0x0110: (Config20a0Entry) [25] { 
                0 0x0100: (CStorageValue) { 220 } 
                1 0x0110: (CStorageValue) { 0 } 
                2 0x0120: (CStorageValue) { 1 } 
                3 0x0130: (CStorageValue) { 0 } 
                4 0x0140: (CStorageValue) { 0 } 
                5 0x0150: (CStorageValue) { 0 } 
                6 0x0160: (CStorageValue) { 1 } 
                7 0x0161: (CStorageValue) { 1 } 
                8 0x0170: (CStorageValue) { 1 } 
                9 0x0180: (CStorageValue) { 0 } 
                10 0x0190: (CStorageValue) { 0 } 
                11 0x0200: (CStorageValue) { 0 } 
                12 0x0210: (CStorageValue) { 0 } 
                13 0x0220: (CStorageValue) { 994352038 } 
                14 0x0230: (CStorageValue) { 1041059807 } 
                15 0x0240: (CStorageValue) { 266338296 } 
                16 0x0250: (CStorageValue) { 131008 } 
                17 0x0260: (CStorageValue) { 0 } 
                18 0x0270: (CStorageValue) { 1 } 
                19 0x0280: (CStorageValue) { 0 } 
                20 0x0310: (CStorageValue) { 0 } 
                21 0x0290: (CStorageValue) {  } 
                22 0x0390: (CStorageValue) { default } 
                23 0x0300: (StorageContainer) [1] { 
                        0 0x0100: (StorageRaw) { 
                                Size: 4
                                String: .... 
                                Hex: 00 00 00 00 } } 
                24 0x0330: (StorageRaw) { 
                        Size: 16
                        String: ................ } } } 

Quite boring, right?

Next up is the long awaited Scene.

3ds Max File Format (Part 2: The first inner structures; DllDirectory, ClassDirectory3)

Now that we understand the outer structure of the file, it's time to look closer at what's inside. The DllDirectory stream looks like a good starting point. After cleaning up a whole bunch of code to make it easier to plug in specialized handling code, a nice and readable output of this structure shows up as follows:

DllDirectory
(StorageContainer) [20] { 
0 0x21c0: (StorageRaw) { 
        Size: 4
        String: .... 
        Hex: d8 00 e0 2e } 
1 0x2038: (StorageContainer) [2] { 
        0 0x2039: (StorageRaw) { 
                Size: 78
                String: V.i.e.w.p.o.r.t. .M.a.n.a.g.e.r. .f.o.r. .D.i.r.e.c.t.X. .(.A.u.t.o.d.e.s.k.). } 
        1 0x2037: (StorageRaw) { 
                Size: 38
                String: V.i.e.w.p.o.r.t.M.a.n.a.g.e.r...g.u.p. } } 
2 0x2038: (StorageContainer) [2] { 
        0 0x2039: (StorageRaw) { 
                Size: 98
                String: m.e.n.t.a.l. .r.a.y.:. .M.a.t.e.r.i.a.l. .C.u.s.t.o.m. .A.t.t.r.i.b.u.t.e.s. .(.A.u.t.o.d.e.s.k.). } 
        1 0x2037: (StorageRaw) { 
                Size: 42
                String: m.r.M.a.t.e.r.i.a.l.A.t.t.r.i.b.s...g.u.p. } } 
...
19 0x2038: (StorageContainer) [2] { 
        0 0x2039: (StorageRaw) { 
                Size: 54
                String: B.i.p.e.d. .C.o.n.t.r.o.l.l.e.r. .(.A.u.t.o.d.e.s.k.). } 
        1 0x2037: (StorageRaw) { 
                Size: 18
                String: b.i.p.e.d...d.l.c. } } } 

Thanks to the chunks, most of this is self-explanatory and fairly easy to handle. The 0x21c0 chunk contains 4 bytes and seems to be a header chunk for the DllDirectory, so we can call it DllHeader or something similar. This chunk is found in a version 2010 file, but doesn't seem to exist in files from version 3, so it's probably not crucial to handle, and we can ignore its contents until we need it for something. The rest of the chunks in this container all have id 0x2038 and are the entries in this list, so they are called DllEntry. Inside each of these, there is a 0x2039 chunk containing a description, and a 0x2037 chunk containing a name, both in UTF-16.
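The string payloads are easy to decode by hand. Here is a minimal sketch, assuming the payload is plain UTF-16LE with no length prefix and no terminator, which is what the sizes in the dump suggest (18 bytes for the 9 characters of biped.dlc). The function name decodeUtf16le is made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode a raw chunk payload as UTF-16LE (assumption: no length prefix and
// no terminator, as the 0x2039 and 0x2037 payloads above appear to be).
// Returns an empty string when the byte count is odd.
std::u16string decodeUtf16le(const std::vector<std::uint8_t> &bytes)
{
    std::u16string result;
    if (bytes.size() % 2 != 0)
        return result;
    result.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i < bytes.size(); i += 2)
    {
        // The low byte comes first, followed by the high byte.
        result.push_back(static_cast<char16_t>(bytes[i] | (bytes[i + 1] << 8)));
    }
    return result;
}
```

Surrogate pairs pass through untouched, since the result stays in UTF-16; converting to another encoding would be a separate step.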

The meaning of a chunk identifier depends on its parent container chunk, so the code has to be structured the same way. Each container type is set up with a handler that creates the class responsible for a chunk with a given identifier. Parsing this block with some smarter code produces output that looks as follows:

(DllDirectory) [20] { 
0 0x21c0: (CStorageValue) { 786432216 } 
1 0x2038: (DllEntry) [2] { 
        0 0x2039: (CStorageValue) { Viewport Manager for DirectX (Autodesk) } 
        1 0x2037: (CStorageValue) { ViewportManager.gup } } 
2 0x2038: (DllEntry) [2] { 
        0 0x2039: (CStorageValue) { mental ray: Material Custom Attributes (Autodesk) } 
        1 0x2037: (CStorageValue) { mrMaterialAttribs.gup } } 
...
19 0x2038: (DllEntry) [2] { 
        0 0x2039: (CStorageValue) { Biped Controller (Autodesk) } 
        1 0x2037: (CStorageValue) { biped.dlc } } } 
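The idea that an identifier only means something within its parent container can be sketched as a lookup keyed on both. This is only an illustration and not the actual parser code: ContainerKind and chunkName are hypothetical, and the names DllDescription, DllName and ClassName are invented here (DllHeader, DllEntry, ClassEntry and ClassDirectoryHeader are the names used in this post).

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical container kinds, named after the dumps in this post.
enum class ContainerKind { DllDirectory, DllEntry, ClassDirectory3, ClassEntry };

// Resolve a chunk id to a name, given the kind of container it sits in.
std::string chunkName(ContainerKind parent, std::uint16_t id)
{
    static const std::map<std::pair<ContainerKind, std::uint16_t>, std::string> table = {
        { { ContainerKind::DllDirectory, 0x21c0 }, "DllHeader" },
        { { ContainerKind::DllDirectory, 0x2038 }, "DllEntry" },
        { { ContainerKind::DllEntry, 0x2039 }, "DllDescription" }, // invented name
        { { ContainerKind::DllEntry, 0x2037 }, "DllName" },        // invented name
        { { ContainerKind::ClassDirectory3, 0x2040 }, "ClassEntry" },
        { { ContainerKind::ClassEntry, 0x2060 }, "ClassDirectoryHeader" },
        { { ContainerKind::ClassEntry, 0x2042 }, "ClassName" },    // invented name
    };
    auto it = table.find({ parent, id });
    // Anything not registered for this parent stays unknown and can be kept raw.
    return it != table.end() ? it->second : std::string("Unknown");
}
```

An identifier that is known in one container, such as 0x2037 inside a DllEntry, resolves to Unknown when it is looked up under a different parent.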

Next, we do a similar thing for the ClassDirectory3 stream.

ClassDirectory3
(StorageContainer) [57] { 
0 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ................ 
                Hex: ff ff ff ff 82 00 00 00 00 00 00 00 82 00 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 22
                String: P.a.r.a.m.B.l.o.c.k.2. } } 
1 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ....<).Z..B0`... 
                Hex: 00 00 00 00 3c 29 06 5a 1e 0c 42 30 60 11 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 30
                String: V.i.e.w.p.o.r.t.M.a.n.a.g.e.r. } } 
...
8 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ................ 
                Hex: 03 00 00 00 02 00 00 00 00 00 00 00 00 0c 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 16
                String: S.t.a.n.d.a.r.d. } } 
...
13 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ....._.d..+".... 
                Hex: fe ff ff ff ec 5f c7 64 b9 9e 2b 22 00 0c 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 24
                String: N.e.L. .M.a.t.e.r.i.a.l. } } 
...
56 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ...."".......... 
                Hex: ff ff ff ff 22 22 00 00 00 00 00 00 00 01 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 10
                String: S.c.e.n.e. } } } 

This container does not seem to have a header chunk, but again simply contains a whole bunch of entries of id 0x2040, each containing a binary blob with id 0x2060 and a UTF-16 string with id 0x2042 that has a description. There's a block in here with some data that I can recognize and reference from our own code. The NeL Material, which is a MAXScript, has a class id of (0x64c75fec, 0x222b9eb9), which matches the middle 8 bytes of the 16-byte blob (the values are stored little-endian, so read each group of four bytes backwards). The last four bytes in the blob match the last four bytes in the Standard (material) class entry, and appear to be the SuperClassID. The first four bytes appear to be a signed integer, given that the values are filled out with either ff ff ff or 00 00 00, with not much in between. For the NeL Material, which is a script, this value is -2; cross-referencing with other max files that contain scripted classes reveals the same. Builtin types, such as Scene, have this number as -1. Classes that come from plugins, such as ViewportManager, have a value of zero or higher. Even closer inspection reveals that this value matches the index of the associated dll in the DllDirectory, ViewportManager being part of ViewportManager.gup, and Standard being part of mtl.dlt. It can be expected that the indices of the classes in this list will be needed later on as well. A smarter parsing output looks as follows:

(ClassDirectory3) [57] { 
0 0x2040: (ClassEntry) [2] { 
        0 0x2060: (ClassDirectoryHeader) { 
                DllIndex: -1
                ClassID: (0x00000000, 0x00000082)
                SuperClassID: 130 } 
        1 0x2042: (CStorageValue) { ParamBlock2 } } 
1 0x2040: (ClassEntry) [2] { 
        0 0x2060: (ClassDirectoryHeader) { 
                DllIndex: 0
                ClassID: (0x30420c1e, 0x5a06293c)
                SuperClassID: 4448 } 
        1 0x2042: (CStorageValue) { ViewportManager } } 
...
56 0x2040: (ClassEntry) [2] { 
        0 0x2060: (ClassDirectoryHeader) { 
                DllIndex: -1
                ClassID: (0x00000000, 0x00002222)
                SuperClassID: 256 } 
        1 0x2042: (CStorageValue) { Scene } } } 
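Decoding the 16-byte 0x2060 blob as described above could look roughly like this. It is a sketch under the layout deduced in this post (signed dll index, two class id parts, superclass id, all little-endian). The struct and function names are made up, and the two class id parts are reported in the order they are stored in the blob.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical struct mirroring the layout deduced above.
struct ClassHeader
{
    std::int32_t dllIndex;       // -1 builtin, -2 script, otherwise index into DllDirectory
    std::uint32_t classIdA;      // first stored part of the class id
    std::uint32_t classIdB;      // second stored part of the class id
    std::uint32_t superClassId;
};

// Read one little-endian 32-bit value from four bytes.
static std::uint32_t readLe32(const std::uint8_t *p)
{
    return static_cast<std::uint32_t>(p[0])
        | (static_cast<std::uint32_t>(p[1]) << 8)
        | (static_cast<std::uint32_t>(p[2]) << 16)
        | (static_cast<std::uint32_t>(p[3]) << 24);
}

ClassHeader parseClassHeader(const std::array<std::uint8_t, 16> &blob)
{
    ClassHeader header;
    // Reinterpret the first field as signed (assumes two's complement).
    header.dllIndex = static_cast<std::int32_t>(readLe32(blob.data()));
    header.classIdA = readLe32(blob.data() + 4);
    header.classIdB = readLe32(blob.data() + 8);
    header.superClassId = readLe32(blob.data() + 12);
    return header;
}
```

Fed with the NeL Material blob from the raw dump, this gives a dll index of -2 and the class id (0x64c75fec, 0x222b9eb9).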

The ClassData stream is very similar, and seems to contain a global data storage for classes, or something along those lines. It doesn't seem to have anything in it that interests me or seems crucial at this point, so I won't bother with it too much for now. It's fairly self-explanatory.

(ClassData) [7] { 
0 0x2100: (ClassDataEntry) [2] { 
        0 0x2110: (ClassDataHeader) { 
                ClassID: (0xbe7c7e52, 0x87d987f4)
                SuperClassID: 16 } 
        1 0x2120: (StorageRaw) { 
                Size: 0
                String:  
                Hex: } } 
...
4 0x2100: (ClassDataEntry) [2] { 
        0 0x2110: (ClassDataHeader) { 
                ClassID: (0x33b673a4, 0x44b50d1e)
                SuperClassID: 4128 } 
        1 0x2120: (StorageContainer) [14] { 
                0 0x0190: (StorageRaw) { 
                        Size: 48
                        String: ...................=...=......{@.z.B.......@...= 
                        Hex: 00 00 00 00 00 00 00 00 1f 1c c1 c3 01 00 00 00 cd cc cc 3d cd cc cc 3d 00 00 00 00 cf f7 7b 40 e1 7a 1d 42 01 00 00 00 00 00 a0 40 cd cc cc 3d } 
                1 0x019c: (StorageRaw) { 
                        Size: 72
                        String: ...................?...@......{@.z.B.......@...=...=.@.E...=..........HC 
                        Hex: 00 00 00 00 00 00 00 00 1f 1c c1 c3 01 00 00 00 00 00 80 3f 00 00 a0 40 00 00 00 00 cf f7 7b 40 e1 7a 1d 42 01 00 00 00 00 00 a0 40 cd cc cc 3d cd cc cc 3d 00 40 9c 45 cd cc cc 3d 01 00 00 00 01 00 00 00 00 00 48 43 } 
...

So far, this was easy. After this comes the real stuff.

Friday 17 August 2012

3ds Max File Format (Part 1: The outer file format; OLE2)

There is not much documentation to be found about the 3ds Max file format. There are some hints here and there about how it's built up, but no central documentation on it exists.

Right now we are in the following situation: a few thousand max files, created by a very old version of max (3.x), containing path references to textures and other max files that have been renamed and relocated, or which simply no longer exist. Yes, we have a maxscript that can go through them all, and that manages to fix a large number of paths. However, a lot of paths are stored in fields of plugins and material scripts, where they don't get noticed, and the performance of opening and closing this number of files from 3ds Max directly is horrible. The obvious solution? Figure out how we can read and save the max file with modified contents, without having to understand all of the actual data it contains. Fortunately, this is actually possible without too much work.

Some research online brings up the following blog post, relating to a change in the max file format in version 2010, which would make it easier to update asset paths: http://www.the-area.com/blogs/chris/reading_and_modifying_asset_file_paths_in_the_3ds_max_file. That's nice and all, but it's only from version 2010 on, and it very likely won't contain any assets referenced by path by old plugins and such.

So, starting at the beginning. The blog post I referred to above nicely points us to the OLE structured file format. Since there exists a wide range of implementations for that, we can pretty much skip that part, and accept that it's basically a filesystem in a file, that is, a file containing multiple file streams. A reliable open source implementation of this container format can be found in libgsf. When scanning a fairly recent max file using the command gsf list, we find the following streams inside:

f         52 VideoPostQueue
f     147230 Scene
f        366 FileAssetMetaData2
f       2198 DllDirectory
f      29605 Config
f       3438 ClassDirectory3
f        691 ClassData
f      29576 SummaryInformation
f       2320 DocumentSummaryInformation

The FileAssetMetaData2 is new in 3ds Max 2010.

One step further, we can start examining the contents of these streams, and it's usually easiest to start off with one of the simpler ones. VideoPostQueue seems small enough to figure out the overall logic of the file format, hoping that the rest is serialized in a similar way. Using the command gsf dump we can get a hex output of one of the streams, and using a simple text editor we can find how it's structured. Binary formats often contain 32-bit length values, which are usually easy to spot in small files, since they'll contain a large number of 00 values. It's basically a matter of finding possible 32-bit length integers, and matching them together with various fixed length fields and other typical binary file contents, until something programmatically logical turns up. Here's a manually parsed VideoPostQueue storage stream:

[
        50 00 (id: 0x0050)
        0a 00 00 00 (size: 10 - 6 = 4)
        [
                01 00 00 00 (value: 1)
        ]
]
[
        60 00 (id: 0x0060)
        2a 00 00 80 (size: 42 - 6 = 36) (note: negative bit = container)
        [
                10 00 (id: 0x0010)
                1e 00 00 00 (size: 30 - 6 = 24)
                [
                        07 00 00 00 (value: 7)
                        01 00 00 00 (value: 1)
                        00 00 00 00
                        00 00 00 00
                        20 12 00 00 (value: 4640)
                        00 00 00 00
                ]
                20 00 (id: 0x0020)
                06 00 00 00 (size: 6 - 6 = 0)
        ]
]

The storage streams in the max container file contain a fairly simple chunk based file format (in fact similar to the fairly well known .3ds file format). Being based on chunks is what allows 3ds Max to open a file for which certain plugins are missing. It's basically a tree structured format where every entry has an identifier and a size, so when an identifier is unknown, or when its contents are incompatible, it can simply be kept as is or discarded. The only exceptions in the file that don't use this structure are SummaryInformation and DocumentSummaryInformation, which are supposedly in a standard Windows format, and the new FileAssetMetaData2 section, which unfortunately is formatted differently as well.

In this format, the chunk header consists of a 2-byte unsigned integer, which is the identifier, and a 4-byte unsigned integer, where the 31 least significant bits are the size and the most significant bit is a flag that helpfully lets us know whether the chunk itself contains more chunks, and thus is a container, or not. For very large files, where 31 bits is insufficient for the size, the entire size field is set to 0, and the header is extended with an additional 64-bit unsigned integer field, which is structured like the 32-bit size field. The size field includes the size of the header.

       0 | 0f 20 (id)
                 00 00 00 00 (size missing)
                             17 fe 01 00 00 00 00 80 (size in 64 bits)
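Reading such a header can be sketched as follows. This is a minimal example built on the description above, not the actual parser. The names ChunkHeader, readLe and readChunkHeader are made up here, and the reading of the 64-bit field (top bit as container flag, remaining 63 bits as size) is an assumption based on it being structured like the 32-bit one.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for one parsed chunk header.
struct ChunkHeader
{
    std::uint16_t id;
    bool isContainer;
    std::uint64_t size;      // total chunk size as stored in the header
    std::size_t headerSize;  // 6, or 14 when the 64-bit size field is present
};

// Read `count` little-endian bytes starting at data[pos].
static std::uint64_t readLe(const std::vector<std::uint8_t> &data, std::size_t pos, std::size_t count)
{
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < count; ++i)
        value |= static_cast<std::uint64_t>(data[pos + i]) << (8 * i);
    return value;
}

// Returns false when the buffer is too short to hold the header.
bool readChunkHeader(const std::vector<std::uint8_t> &data, std::size_t pos, ChunkHeader &out)
{
    if (data.size() < pos || data.size() - pos < 6)
        return false;
    out.id = static_cast<std::uint16_t>(readLe(data, pos, 2));
    const std::uint32_t size32 = static_cast<std::uint32_t>(readLe(data, pos + 2, 4));
    if (size32 != 0)
    {
        // Regular header: the top bit is the container flag, the rest is the size.
        out.isContainer = (size32 & 0x80000000u) != 0;
        out.size = size32 & 0x7fffffffu;
        out.headerSize = 6;
        return true;
    }
    // A size field of 0 means a 64-bit size field follows.
    if (data.size() - pos < 14)
        return false;
    const std::uint64_t size64 = readLe(data, pos + 6, 8);
    out.isContainer = (size64 >> 63) != 0;
    out.size = size64 & 0x7fffffffffffffffull;
    out.headerSize = 14;
    return true;
}
```

For the 0x0060 chunk of the VideoPostQueue stream this yields a container of size 42, and for the 64-bit example it yields id 0x200f, flagged as a container, with size 0x1fe17.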

With this information it is possible to read a max file, modify the binary contents of chunks (most of them are fairly basic in format), and we should be able to re-save the max file with our modified data. The DllDirectory section, for example, parsed programmatically starts like this:

CStorageContainer - items: 20
        [0x21C0] CStorageValue - bytes: 4
        786432216
        [0x2038] CStorageContainer - items: 2
                [0x2039] CStorageUCString - length: 39
                Viewport Manager for DirectX (Autodesk)
                [0x2037] CStorageUCString - length: 19
                ViewportManager.gup
        [0x2038] CStorageContainer - items: 2
                [0x2039] CStorageUCString - length: 49
                mental ray: Material Custom Attributes (Autodesk)
                [0x2037] CStorageUCString - length: 21
                mrMaterialAttribs.gup
        [0x2038] CStorageContainer - items: 2
                [0x2039] CStorageUCString - length: 37
                Custom Attribute Container (Autodesk)
                [0x2037] CStorageUCString - length: 23
                CustAttribContainer.dlo
...
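Writing a modified tree back out mostly comes down to recomputing the sizes from the leaves up. Here is a minimal sketch of that idea, assuming only the 6-byte header form is needed (no 64-bit sizes). The Chunk struct and serializeChunk are hypothetical, and whether 3ds Max accepts a stream rewritten this way is not something this sketch verifies.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical in-memory chunk: a leaf with raw bytes, or a container of chunks.
struct Chunk
{
    std::uint16_t id = 0;
    bool isContainer = false;
    std::vector<std::uint8_t> payload;  // used when isContainer is false
    std::vector<Chunk> children;        // used when isContainer is true
};

// Append the low `count` bytes of `value` in little-endian order.
static void appendLe(std::vector<std::uint8_t> &out, std::uint64_t value, int count)
{
    for (int i = 0; i < count; ++i)
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
}

// Serialize a chunk and everything below it, recomputing every size field.
// Assumption: the total size fits in 31 bits, so the 6-byte header suffices.
std::vector<std::uint8_t> serializeChunk(const Chunk &chunk)
{
    std::vector<std::uint8_t> body;
    if (chunk.isContainer)
    {
        for (const Chunk &child : chunk.children)
        {
            const std::vector<std::uint8_t> bytes = serializeChunk(child);
            body.insert(body.end(), bytes.begin(), bytes.end());
        }
    }
    else
    {
        body = chunk.payload;
    }
    // The stored size includes the 6 header bytes; the top bit marks a container.
    std::uint32_t size = static_cast<std::uint32_t>(body.size() + 6);
    if (chunk.isContainer)
        size |= 0x80000000u;
    std::vector<std::uint8_t> out;
    appendLe(out, chunk.id, 2);
    appendLe(out, size, 4);
    out.insert(out.end(), body.begin(), body.end());
    return out;
}
```

Rebuilding the 0x0060 container of the VideoPostQueue stream from a 24-byte 0x0010 leaf and an empty 0x0020 leaf gives back 42 bytes starting with 60 00 2a 00 00 80, the same header as in the manual parse.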

Of course, it would be interesting if we could go further, and directly manipulate the parameters of our own plugins and scripts from our own tools back into the max files so that everything is centrally stored without any duplicate source data in the way. And that's exactly what I'll be doing next.