Kaetemi


Wednesday 3 April 2013

Blog moved

Now at http://www.kaetemi.be/wp/.

Saturday 25 August 2012

3ds Max File Format (Part 6: We get signal)

Let's see what we can do now.

INode *node = scene.container()->scene()->rootNode()->find(ucstring("TR_HOF_civil01_gilet"));
nlassert(node);
exportObj("tr_hof_civil01_gilet.obj", node->getReference(1)->getReference(1));

Main screen turn on!!

Plain easy, right?

Thursday 23 August 2012

3ds Max File Format (Part 5: How it all links together; ReferenceMaker, INode)

At this point, you should start to familiarize yourself a bit with the publicly available 3ds Max API documentation. The contents of the file map practically 1:1 with how the system is built up internally. Most important is the inheritance of the classes, as we need to be aware of all the parent classes, and preferably structure our parser classes in a similar way.

As a reminder, check out the output that we got out of the file in the last part. Turn on some good music, and scroll away. The file starts with a ParamBlock2, and ends with a Scene, which is interesting, isn't it? ParamBlock2 is one of the lowest classes in power, while Scene stands basically at the top of the entire system. It means that the deepest structures, or rather the structures on which everything depends, are serialized out first, followed by the higher classes, which are very likely to refer to them in some way. Chances are high that the second class that is serialized directly refers to the first one; and, as you can easily spot, the class with index 1 has two values that equal 0.

        1 0x0001: (SceneClassUnknown: ViewportManager, (0x5a06293c, 0x30420c1e), ViewportManager.gup) [3] { 
                0 0x2034: (StorageRaw) { 
                        Size: 8
                        String: ........ 
                        Hex: 00 00 00 00 ff ff ff ff } 
                1 0x204b: (StorageRaw) { 
                        Size: 1
                        String: . 
                        Hex: 2e } 
                2 0x1000: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 00 00 00 00 
                        Int: 0 
                        Float: 0 } } 

Unfortunately, 0 is not much of a trustworthy value (which reminds me that you should search for Float: 1 values appearing in the scene, those tend to reveal a lot). But we do have at least two potential chunks, 0x2034 or 0x1000, that may possibly refer to the first class. Moving on to the next class, at index 2 there's another ParamBlock2, which is unlikely to refer to something else, and at index 3 we can find another class which happens to have a reference to index 2 in one of its values, 0x2034, which, as will become clear from examining more data, is no coincidence.

        3 0x0002: (SceneClassUnknown: mental ray: material custom attribute, (0x218ab459, 0x25dc8980), mrMaterialAttribs.gup) [2] { 
                0 0x2034: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 02 00 00 00 
                        Int: 2 
                        Float: 2.8026e-45 } 
                1 0x204b: (StorageRaw) { 
                        Size: 1
                        String: . 
                        Hex: 2e } } 

As we saw in the previous part, the first chunk that could be found in many of the class instances in the scene was related to AppData, which is implemented in the Animatable class. The chunks containing the references appear either first, or after this AppData chunk. They are handled by the ReferenceMaker class, a subclass of Animatable, which is one of the base classes that all of the scene classes derive from. Given this, we can reasonably assume that the chunks are written in order of inheritance, which is logical enough. It also allows us to guess more accurately at which level chunks that we have not encountered before are implemented.

The references, however, are not related to the more user-centric scene graph. Consultation of the documentation will clarify that the scene graph itself is stored in Node classes, which then reference the objects that they represent in the scene, and that the objects normally have no need for knowledge of their position or rank in the scene. In the file, we can find two node entries. After improving our parser to better show the references, these appear as follows.

        380 0x0012: (NodeUnknown: RootNode, (0x00000002, 0x00000000), Builtin) [0] { 
                References 0x2035: (StorageArray) { 
                        0: 16 } 
                0x204B Equals 0x2E (46): (CStorageValue) { 46 } 
                Orphan[0] 0x0120: (StorageRaw) { 
                        Size: 0
                        String:  
                        Hex: } } 

and

        389 0x001a: (NodeImpl: Node, (0x00000001, 0x00000000), Builtin) [0] { 
                AppData: (AppData) [83] PARSED { 
... etc ... }
                References 0x2035: (StorageArray) { 
                        0: 16
                        1: 0
                        2: 384
                        3: 1
                        4: 387
                        5: 3
                        6: 134
                        7: 6
                        8: 388 } 
                0x204B Equals 0x2E (46): (CStorageValue) { 46 } 
                Orphan[0] 0x09ce: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: bc 02 00 00 
                        Int: 700 
                        Float: 9.80909e-43 } 
                Orphan[1] 0x0960: (StorageRaw) { 
                        Size: 8
                        String: |....... 
                        Hex: 7c 01 00 00 00 08 00 00 } 
                Orphan[2] 0x0962: (StorageRaw) { 
                        Size: 40
                        String: G.E._.A.c.c._.M.i.k.o.t.o.B.a.n.i.e.r.e. } 
                Orphan[3] 0x09ba: (StorageRaw) { 
                        Size: 0
                        String:  
                        Hex: } 
... etc ....

The RootNode appears before the Node, which is the opposite of how the Scene stores the classes. This would mean that a node holds a reference to its parent somewhere. It does not seem to be in the references block, though. The index of the parent, 380, would be found in the hex print as 7c 01. Easily spotted. The first four bytes of 0x0960 refer to the parent index, and the four bytes after that look suspiciously like flags of some sort. The chunk before it, 0x09ce, is constant across the file but differs between file versions, so I would expect it to be a version number. The chunk with the name after that speaks for itself, and the empty one doesn't seem too interesting right now, because it's empty. Empty chunks tend to contain strings or arrays, that's all you can guess.
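As a rough sketch of that guess, the 8 bytes of the 0x0960 chunk can be split into two little-endian integers. The names NodeParentInfo and parseNodeParent are made up for this example and come from neither the 3ds Max SDK nor NeL, and treating the second integer as flags is only an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical structure for illustration only.
struct NodeParentInfo {
    uint32_t parentIndex; // index of the parent node inside the Scene container
    uint32_t flags;       // assumed to be flags; the actual meaning is unknown
};

// Read a little-endian unsigned 32-bit integer at the given byte offset.
uint32_t readU32LE(const std::vector<uint8_t> &data, std::size_t offset)
{
    assert(offset + 4 <= data.size());
    return static_cast<uint32_t>(data[offset])
         | (static_cast<uint32_t>(data[offset + 1]) << 8)
         | (static_cast<uint32_t>(data[offset + 2]) << 16)
         | (static_cast<uint32_t>(data[offset + 3]) << 24);
}

// Decode the 8-byte payload of a 0x0960 chunk as (parent index, flags).
NodeParentInfo parseNodeParent(const std::vector<uint8_t> &payload)
{
    assert(payload.size() == 8);
    NodeParentInfo info;
    info.parentIndex = readU32LE(payload, 0);
    info.flags = readU32LE(payload, 4);
    return info;
}
```

Fed with the bytes 7c 01 00 00 00 08 00 00 from the dump, this gives parent index 380, which is the RootNode, and a second value of 0x800.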

But, this node is just some metadata, and it does not actually contain the mesh. It must point to the mesh somehow. In the file there's an Editable Poly at index 387.

387 0x0018: (SceneClassUnknown: Editable Poly, (0x1bf8338d, 0x192f6098), EPoly.dlo) [12] { 

Which happens to be referred to at index 4 in the 0x2035 block. The chunk refers to more, so we look up those as well.

384 0x0016: (SceneClassUnknown: Position/Rotation/Scale, (0x00002005, 0x00000000), Builtin) [5] { 
134 0x000d: (SceneClassUnknown: NeL Material, (0x64c75fec, 0x222b9eb9), Script) [8] { 
388 0x0019: (SceneClassUnknown: Base Layer, (0x7e9858fe, 0x1dba1df0), Builtin) [9] { 

There are other values in this 0x2035 chunk, but they are not references. Two different chunks are used for storing references to other classes, 0x2034 and 0x2035, either one of them and never both of them in the same class. The 0x2034 block is simply an array of all references directly, where -1 indices map to a NULL pointer. Here, however, we see the block 0x2035 in use. Cross-referencing with different files reveals that the object appears after the number 1, a controller after 0, a material after 3, and a layer after 6. It would appear that this chunk instead stores each reference's index together with the class's index, as a map, preceded by some other integer value that probably contains flags of some sort.
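Both layouts are simple enough to sketch. The code below follows my reading of the dumps and nothing more: 0x2034 as a flat array of signed 32-bit indices, and 0x2035 as one leading integer followed by (reference index, scene index) pairs. The function and struct names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Read a little-endian signed 32-bit integer at the given byte offset.
int32_t readS32LE(const std::vector<uint8_t> &data, std::size_t offset)
{
    assert(offset + 4 <= data.size());
    uint32_t v = static_cast<uint32_t>(data[offset])
               | (static_cast<uint32_t>(data[offset + 1]) << 8)
               | (static_cast<uint32_t>(data[offset + 2]) << 16)
               | (static_cast<uint32_t>(data[offset + 3]) << 24);
    return static_cast<int32_t>(v);
}

// 0x2034: slot i holds the scene index of reference i; -1 stands for NULL.
std::vector<int32_t> parseReferences2034(const std::vector<uint8_t> &payload)
{
    assert(payload.size() % 4 == 0);
    std::vector<int32_t> refs;
    for (std::size_t i = 0; i < payload.size(); i += 4)
        refs.push_back(readS32LE(payload, i));
    return refs;
}

// Hypothetical structure for illustration only.
struct ReferenceMap2035 {
    int32_t header;                  // leading integer, assumed to be flags
    std::map<int32_t, int32_t> refs; // reference index -> scene index
};

// 0x2035: one leading integer, then (reference index, scene index) pairs.
ReferenceMap2035 parseReferences2035(const std::vector<uint8_t> &payload)
{
    assert(payload.size() >= 4 && (payload.size() - 4) % 8 == 0);
    ReferenceMap2035 result;
    result.header = readS32LE(payload, 0);
    for (std::size_t i = 4; i < payload.size(); i += 8)
        result.refs[readS32LE(payload, i)] = readS32LE(payload, i + 4);
    return result;
}
```

For the Node at index 389 this yields the header value 16 and the map 0 to 384 (controller), 1 to 387 (object), 3 to 134 (material), 6 to 388 (layer).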

At this point, we can parse the scene graph. That's very nice, but we're not getting any actual 3D data yet, which, for curiosity's sake, would be totally fun to get out of all this, right? Even though my initial plan was just to parse the file to change some filepath strings inside there.

Tuesday 21 August 2012

3ds Max File Format (Part 4: The first useful data; Scene, AppData, Animatable)

The most interesting part of this file is, evidently, the Scene. Opening it up in the chunk parser, it begins as follows, and goes on for a few tens of thousands of lines:

0 0x2012: (StorageContainer) [452] { 
        0 0x0000: (StorageContainer) [6] { 
                0 0x000b: (StorageRaw) { 
                        Size: 24
                        String: <).Z..B0`......!........ 
                        Hex: 3c 29 06 5a 1e 0c 42 30 60 11 00 00 00 00 00 21 e0 2e 04 00 01 00 00 00 } 
                1 0x000e: (StorageRaw) { 
                        Size: 19
                        String: ..............@.... 
                        Hex: 00 00 04 00 00 00 00 00 82 00 00 00 00 00 40 00 00 00 00 } 
                2 0x000e: (StorageRaw) { 
                        Size: 19
                        String: ..............@.... 
                        Hex: 01 00 01 00 00 00 00 00 82 00 00 00 00 00 40 00 00 00 00 } 

Inside the Scene stream, we can find a single chunk container. In this case the id of this chunk is 0x2012, and is apparently related to which version, 2010, the file was saved with. This container contains a large number of objects, 452 in this case, with ids that are all very low in value. So, we go look for something we recognize.

While scrolling through the file, I found a short string that I recognized, "no fx", which is part of the NeL node properties plugin. It appears in the file under the following section:

        389 0x001a: (StorageContainer) [25] { 
                0 0x2150: (StorageContainer) [84] { 
                        0 0x0100: (StorageRaw) { 
                                Size: 4
                                String: S... 
                                Hex: 53 00 00 00 
                                Int: 83 
                                Float: 1.16308e-43 } 
...
                        18 0x0110: (StorageContainer) [2] { 
                                0 0x0120: (StorageRaw) { 
                                        Size: 20
                                        String: XH...u.. ....'...... 
                                        Hex: 58 48 d6 04 1d 75 d1 16 20 10 00 00 2e 27 0c 05 09 00 00 00 } 
                                1 0x0130: (StorageRaw) { 
                                        Size: 9
                                        String: no sound. } } 
                        19 0x0110: (StorageContainer) [2] { 
                                0 0x0120: (StorageRaw) { 
                                        Size: 20
                                        String: XH...u.. .../'...... 
                                        Hex: 58 48 d6 04 1d 75 d1 16 20 10 00 00 2f 27 0c 05 06 00 00 00 } 
                                1 0x0130: (StorageRaw) { 
                                        Size: 6
                                        String: no fx. } } 
                        20 0x0110: (StorageContainer) [2] { 
                                0 0x0120: (StorageRaw) { 
                                        Size: 20
                                        String: XH...u.. ........... 
                                        Hex: 58 48 d6 04 1d 75 d1 16 20 10 00 00 b2 07 00 00 01 00 00 00 } 
                                1 0x0130: (StorageRaw) { 
                                        Size: 1
                                        String: . 
                                        Hex: 00 } } 
...

The 0x2150 chunk seems to exclusively contain 0x0110 entries, preceded by one 0x0100 header entry stating the number of 0x0110 entries. Every entry has a 0x0130 binary blob which is the value of some property, and a 0x0120 binary blob which would likely be some sort of identifier header of this value. Now we should find out how our own NeL plugin code puts this value in there, to see if we can make sense of this header.

CExportNel::getScriptAppData(pNode, NEL3D_APPDATA_ENV_FX, "no fx");

This is the line of code that reads the environment fx property from a node. The value of NEL3D_APPDATA_ENV_FX is defined as:

#define NEL3D_APPDATA_ENV_FX (84682543)

Which should appear as 2f 27 0c 05 in the hex output, since the value is 0x050c272f stored in little-endian byte order. And it can indeed be found at byte index 12, followed by another integer 6 which happens to match the length of the value in the value chunk. There's that redundancy showing up again. Looking deeper into the function to figure out the rest of the bytes, we can find this in the getScriptAppData function:

AppDataChunk *ap=node->GetAppDataChunk (MAXSCRIPT_UTILITY_CLASS_ID, UTILITY_CLASS_ID, id);

Where the class id MAXSCRIPT_UTILITY_CLASS_ID matches with the first 8 bytes of the header, and the super class id UTILITY_CLASS_ID matches with the following 4 bytes. In short:

58 48 d6 04 1d 75 d1 16 // MAXSCRIPT_UTILITY_CLASS_ID
20 10 00 00 // UTILITY_CLASS_ID
2f 27 0c 05 // NEL3D_APPDATA_ENV_FX
06 00 00 00 // SIZE
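Written out as code, the 20-byte 0x0120 header could be read like this. It's only a sketch of the layout deduced above; AppDataHeader and parseAppDataHeader are names I made up here, not SDK or NeL types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical structure mirroring the byte breakdown of the 0x0120 chunk.
struct AppDataHeader {
    uint32_t classIdA;     // first half of the class id
    uint32_t classIdB;     // second half of the class id
    uint32_t superClassId; // super class id
    uint32_t subId;        // the id passed by the plugin, e.g. NEL3D_APPDATA_ENV_FX
    uint32_t size;         // size in bytes of the value in the 0x0130 chunk
};

// Read a little-endian unsigned 32-bit integer at the given byte offset.
uint32_t readU32LE(const std::vector<uint8_t> &data, std::size_t offset)
{
    assert(offset + 4 <= data.size());
    return static_cast<uint32_t>(data[offset])
         | (static_cast<uint32_t>(data[offset + 1]) << 8)
         | (static_cast<uint32_t>(data[offset + 2]) << 16)
         | (static_cast<uint32_t>(data[offset + 3]) << 24);
}

// Decode the 20-byte payload of a 0x0120 chunk.
AppDataHeader parseAppDataHeader(const std::vector<uint8_t> &payload)
{
    assert(payload.size() == 20);
    AppDataHeader header;
    header.classIdA = readU32LE(payload, 0);
    header.classIdB = readU32LE(payload, 4);
    header.superClassId = readU32LE(payload, 8);
    header.subId = readU32LE(payload, 12);
    header.size = readU32LE(payload, 16);
    return header;
}
```

For the "no fx" entry, subId comes out as 84682543 and size as 6, which can then be checked against the actual size of the neighbouring 0x0130 chunk.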

We can now be quite sure that the 0x2150 block contains all of the AppData entries. According to the public SDK documentation, the AppData functionality is part of the Animatable base class. Which means that the chunk which contains this block is either an instance of this Animatable class, or more likely a subclass of it. It is of course very likely that the class is somehow identified by the chunk id, which in this case is 0x001a, or 26 in decimal.

Now, the reason why you shouldn't have skipped part 1 and part 2 of this series is that the data in the DllDirectory and ClassDirectory3 streams is needed to make sense of the Scene stream. Similar to how the ClassDirectory3 references a dll in the DllDirectory by index, the Scene is here referencing a class by index in the ClassDirectory3. This allows us to get hold of the ClassId, needed to instantiate the class from the associated class description instance, or to provide a fallback mechanism when the class is part of a missing plugin, and be able to provide the user with the name and description of what is missing from their installation.

At index 26 of the ClassDirectory3 in this file, I can find the following entry:

Entries[26]: (ClassEntry) [2] PARSED { 
        Header: (ClassEntryHeader) { 
                DllIndex: -1
                ClassId: (0x00000001, 0x00000000)
                SuperClassId: 1 } 
        Name: Node}

Which is the builtin class Node, which probably implements the INode interface that ends up deriving from the Animatable base class.

To confirm this theory, I go the other way, and look for a class I know about.

Entries[13]: (ClassEntry) [2] PARSED { 
        Header: (ClassEntryHeader) { 
                DllIndex: -2
                ClassId: (0x64c75fec, 0x222b9eb9)
                SuperClassId: 3072 } 
        Name: NeL Material} 

At index 13 I find the NeL material script class, which would then be found as a chunk in the file with id 0x000d. And there exists one.

        134 0x000d: (StorageContainer) [8] { 
                0 0x2034: (StorageRaw) { 
                        Size: 40
                        String: |.......~............................... 
                        Hex: 7c 00 00 00 7d 00 00 00 7e 00 00 00 7f 00 00 00 80 00 00 00 81 00 00 00 82 00 00 00 83 00 00 00 84 00 00 00 85 00 00 00 } 
                1 0x204b: (StorageRaw) { 
                        Size: 1
                        String: . 
                        Hex: 2e } 
                2 0x21b0: (StorageContainer) [1] { 
                        0 0x1020: (StorageRaw) { 
                                Size: 4
                                String: ^... 
                                Hex: 5e 00 00 00 
                                Int: 94 
                                Float: 1.31722e-43 } } 
                3 0x0010: (StorageRaw) { 
                        Size: 8
                        String: ........ 
                        Hex: 0e 00 00 00 09 00 00 00 } 
                4 0x4001: (StorageRaw) { 
                        Size: 50
                        String: M.T.R.L._.G.E._.A.c.c._.M.i.k.o.t.o.B.a.n.i.e.r.e. } 
                5 0x4003: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 0a 02 00 00 
                        Int: 522 
                        Float: 7.31478e-43 } 
                6 0x4020: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 00 00 00 00 
                        Int: 0 
                        Float: 0 } 
                7 0x4030: (StorageRaw) { 
                        Size: 16
                        String: ...............? 
                        Hex: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 3f } } 

Which, as entry 0x4001 helpfully lets us know by the UTF-16 name MTRL_GE_Acc_MikotoBaniere, is indeed a NeL Material.
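The lookup used in both directions here comes down to treating the chunk id as an index into ClassDirectory3. A minimal sketch, assuming the directory has already been parsed into a vector; ClassEntry below only loosely mirrors the dump output and findClassForChunk is an invented name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory form of a ClassDirectory3 entry.
struct ClassEntry {
    int32_t dllIndex;      // index into DllDirectory; -1 and -2 show up for builtin and script classes
    uint32_t classIdA;
    uint32_t classIdB;
    uint32_t superClassId;
    std::string name;
};

// A chunk id inside the Scene container is an index into ClassDirectory3.
// Returns nullptr when the id does not refer to an entry.
const ClassEntry *findClassForChunk(const std::vector<ClassEntry> &directory,
                                    uint16_t chunkId)
{
    if (chunkId >= directory.size())
        return nullptr;
    return &directory[chunkId];
}
```

With the directory of this file loaded, chunk id 0x001a resolves to Node and 0x000d to NeL Material. A nullptr result is where a fallback for unknown classes would hook in.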

So, using this knowledge, we can parse the scene container in a smarter way. Here I provide for you the output at this point.

http://dl.kaetemi.be/blog/maxfile_dump_4.txt

Now that we have figured out which chunks are contained in the Scene streams, and we know how and where some of the data inside it is formatted, how does it all link together? That will be the next subject.

Sunday 19 August 2012

3ds Max File Format (Part 3: The department of redundancy department; Config)

Now we'll have a look at the Config stream. It begins as follows, and goes on forever with various integer fields and other binary blobs.

(StorageContainer) [15] { 
0 0x2090: (StorageRaw) { 
        Size: 4
        String: .... 
        Hex: 00 00 00 00 } 
1 0x20e0: (StorageContainer) [4] { 
        0 0x0100: (StorageRaw) { 
                Size: 12
                String: .. A........ 
                Hex: 00 00 20 41 0a 00 00 00 01 00 00 00 } 
        1 0x0400: (StorageRaw) { 
                Size: 8
                String: ........ 
                Hex: 07 00 00 00 01 00 00 00 } 

As most of the contents seem fairly different from each other, it's best to look at the main container from a distance.

(StorageContainer) [15] {
0 0x2090: (StorageRaw)
1 0x20e0: (StorageContainer) [4]
2 0x20a0: (StorageContainer) [2]
3 0x20a5: (StorageContainer) [2]
4 0x20a6: (StorageContainer) [1]
5 0x2190: (StorageContainer) [2]
6 0x20b0: (StorageContainer) [10]
7 0x2130: (StorageContainer) [3]
8 0x2080: (StorageContainer) [213]
9 0x20d0: (StorageContainer) [9]
10 0x2160: (StorageContainer) [5]
11 0x21a0: (StorageContainer) [82]
12 0x2180: (StorageContainer) [1]
13 0x2007: (StorageContainer) [1]
14 0x2008: (StorageContainer) [3] }

Each id seems to be unique, so we can assume that each of these containers has a specific set of information in it. Comparing files from different Max versions, some have fewer and some have more of these identifiers, but their contents remain pretty much the same.

One container in this file particularly interests me, as it contains strings related to the NeL Material, and thus will likely be necessary to parse the Scene format where this is stored. More specifically, chunk 0x2180 contains stuff like the following:

9 0x0007: (StorageContainer) [3] { 
        0 0x0060: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 02 00 00 00 } 
        1 0x0006: (StorageRaw) { 
                Size: 17
                String: ....bForceZWrite. 
                Hex: 0d 00 00 00 62 46 6f 72 63 65 5a 57 72 69 74 65 00 } 
        2 0x0007: (StorageContainer) [7] { 
                0 0x0060: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 06 00 00 00 } 
                1 0x0006: (StorageRaw) { 
                        Size: 9
                        String: ....type. 
                        Hex: 05 00 00 00 74 79 70 65 00 } 
                2 0x0006: (StorageRaw) { 
                        Size: 12
                        String: ....boolean. 
                        Hex: 08 00 00 00 62 6f 6f 6c 65 61 6e 00 } 
...

The block itself begins like:

12 0x2180: (StorageContainer) [1] { 
        0 0x0040: (StorageContainer) [2] { 
                0 0x0050: (StorageRaw) { 
                        Size: 12
                        String: ....._.d..+" 
                        Hex: 00 0c 00 00 ec 5f c7 64 b9 9e 2b 22 } 
                1 0x0007: (StorageContainer) [10] { 
                        0 0x0060: (StorageRaw) { 
                                Size: 4
                                String: .... 
                                Hex: 09 00 00 00 } 
                        1 0x0007: (StorageContainer) [15] { 
                                0 0x0060: (StorageRaw) { 
                                        Size: 4
                                        String: .... 
                                        Hex: 0e 00 00 00 } 
                                1 0x0006: (StorageRaw) { 
                                        Size: 9
                                        String: ....nlbp. 
                                        Hex: 05 00 00 00 6e 6c 62 70 00 } 
...

A max file that has the NeL Multi Bitmap script used in it as well has two 0x0040 entries in the 0x2180 container. We'll call the 0x2180 block ConfigScript, and the 0x0040 container ConfigScriptEntry from now on, as these seem to be related to how script parameters will be stored in the file. What's also interesting is that all the chunks with id 0x0007 in this entire block are containers, and all the 0x0060 blocks are integers. The 0x0050 block is the header block for the ConfigScriptEntry, and contains the same SuperClassID and ClassID from the NeL Material script as seen previously.
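To check that claim about the 0x0050 header, the 12 bytes can be read as three little-endian integers, super class id first. This is a sketch with made-up names (ConfigScriptHeader, parseConfigScriptHeader), and the field order is my reading of the hex, not something documented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical structure for the 0x0050 header of a ConfigScriptEntry.
struct ConfigScriptHeader {
    uint32_t superClassId;
    uint32_t classIdA;
    uint32_t classIdB;
};

// Read a little-endian unsigned 32-bit integer at the given byte offset.
uint32_t readU32LE(const std::vector<uint8_t> &data, std::size_t offset)
{
    assert(offset + 4 <= data.size());
    return static_cast<uint32_t>(data[offset])
         | (static_cast<uint32_t>(data[offset + 1]) << 8)
         | (static_cast<uint32_t>(data[offset + 2]) << 16)
         | (static_cast<uint32_t>(data[offset + 3]) << 24);
}

// Decode the 12-byte payload of a 0x0050 chunk.
ConfigScriptHeader parseConfigScriptHeader(const std::vector<uint8_t> &payload)
{
    assert(payload.size() == 12);
    ConfigScriptHeader header;
    header.superClassId = readU32LE(payload, 0);
    header.classIdA = readU32LE(payload, 4);
    header.classIdB = readU32LE(payload, 8);
    return header;
}
```

The bytes 00 0c 00 00 ec 5f c7 64 b9 9e 2b 22 decode to super class id 3072 and class id (0x64c75fec, 0x222b9eb9), the same values as the NeL Material entry in ClassDirectory3.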

Here are a few 0x0007 blocks from a specific depth inside the tree structure:

2 0x0007: (StorageContainer) [5] { 
        0 0x0060: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 04 00 00 00 } 
        1 0x0006: (StorageRaw) { 
                Size: 9
                String: ....type. 
                Hex: 05 00 00 00 74 79 70 65 00 } 
        2 0x0006: (StorageRaw) { 
                Size: 12
                String: ....boolean. 
                Hex: 08 00 00 00 62 6f 6f 6c 65 61 6e 00 } 
        3 0x0006: (StorageRaw) { 
                Size: 12
                String: ....default. 
                Hex: 08 00 00 00 64 65 66 61 75 6c 74 00 } 
        4 0x0001: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 00 00 00 00 } } } 
2 0x0007: (StorageContainer) [7] { 
        0 0x0060: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 06 00 00 00 } 
        1 0x0006: (StorageRaw) { 
                Size: 9
                String: ....type. 
                Hex: 05 00 00 00 74 79 70 65 00 } 
        2 0x0006: (StorageRaw) { 
                Size: 10
                String: ....float. 
                Hex: 06 00 00 00 66 6c 6f 61 74 00 } 
        3 0x0006: (StorageRaw) { 
                Size: 12
                String: ....default. 
                Hex: 08 00 00 00 64 65 66 61 75 6c 74 00 } 
        4 0x0004: (StorageRaw) { 
                Size: 4
                String: ..#< 
                Hex: 0a d7 23 3c } 
        5 0x0006: (StorageRaw) { 
                Size: 7
                String: ....ui. 
                Hex: 03 00 00 00 75 69 00 } 
        6 0x0007: (StorageContainer) [2] { 
                0 0x0060: (StorageRaw) { 
                        Size: 4
                        String: .... 
                        Hex: 01 00 00 00 } 
                1 0x0006: (StorageRaw) { 
                        Size: 17
                        String: ....cfBumpUSpeed. 
                        Hex: 0d 00 00 00 63 66 42 75 6d 70 55 53 70 65 65 64 00 } } } } 
2 0x0007: (StorageContainer) [5] { 
        0 0x0060: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 04 00 00 00 } 
        1 0x0006: (StorageRaw) { 
                Size: 9
                String: ....type. 
                Hex: 05 00 00 00 74 79 70 65 00 } 
        2 0x0006: (StorageRaw) { 
                Size: 12
                String: ....integer. 
                Hex: 08 00 00 00 69 6e 74 65 67 65 72 00 } 
        3 0x0006: (StorageRaw) { 
                Size: 12
                String: ....default. 
                Hex: 08 00 00 00 64 65 66 61 75 6c 74 00 } 
        4 0x0003: (StorageRaw) { 
                Size: 4
                String: .... 
                Hex: 01 00 00 00 } } } 

If you have an understanding of the file format at this point, and try to understand the contents of these blocks, you should notice how ridiculous this looks.

It's like making an xml file that goes

<name>Name</name><value>Number</value>
<name>Value</name><value>2</value>

instead of

<Number>2</Number>

thanks to abstraction layers being piled up on each other.

The 0x0060 entry contains an integer which states the number of chunks that follow in the container, and is thus basically the size header of an array. Chunks with id 0x0006 recognizably contain strings, prefixed with their size, which is already known by the chunk header, and followed by an unnecessary null value byte suffix. It gets even sillier. The blocks shown above are actually not arrays, but tables of two columns stored in an array stored in the chunk tree structure. The first value in the array is the name column, and the second value the value column.
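Reading one of those 0x0006 strings then looks roughly like the sketch below. It checks both redundant pieces, the size prefix and the trailing null byte, and throws if either disagrees with the chunk size. parseConfigScriptString is a name invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Decode the payload of a 0x0006 chunk: a 32-bit little-endian length
// (which counts the trailing null byte), the characters, then the null byte.
std::string parseConfigScriptString(const std::vector<uint8_t> &payload)
{
    if (payload.size() < 5)
        throw std::runtime_error("0x0006 payload too short");
    uint32_t length = static_cast<uint32_t>(payload[0])
                    | (static_cast<uint32_t>(payload[1]) << 8)
                    | (static_cast<uint32_t>(payload[2]) << 16)
                    | (static_cast<uint32_t>(payload[3]) << 24);
    // The prefix repeats what the chunk header already told us.
    if (length != payload.size() - 4)
        throw std::runtime_error("0x0006 size prefix does not match chunk size");
    if (payload.back() != 0)
        throw std::runtime_error("0x0006 string is not null terminated");
    // Skip the 4-byte prefix and drop the terminator.
    return std::string(payload.begin() + 4, payload.end() - 1);
}
```

The payload 05 00 00 00 74 79 70 65 00 comes out as "type".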

These blocks are child chunks of containers that also hold a chunk with the name of a data field, and they describe the format of that data field. It is fairly straightforward: the value of the name 'type' is the type, the 'default' is the default, and so on. For the default value, and this actually goes for the entire format of the ConfigScript block, the id of the chunk is directly related to the type field, and defines the actual low level storage of the field. The type field is very helpful in finding out the meanings of these chunk ids.

0x0001 is a boolean stored as 4 bytes, 0x0002 does not appear in my file, 0x0003 is a 32-bit, possibly signed, integer, 0x0004 is a float, 0x0005 is a string in the same format as the 0x0006 internal strings, 0x0007 is the previously covered array-in-a-container, and 0x0008 is a color stored as a vector of 3 floats with value 255.f being the maximum. With this information, the file can be made much more readable. Here's a short excerpt (pun intended) from inside the ConfigScript block.

2 0x0007: (ConfigScriptMetaContainer) [42] { 
        0 0x0060: (CStorageValue) { 41 } 
        1 0x0006: (ConfigScriptMetaString) { main } 
        2 0x0003: (CStorageValue) { 1 } 
        3 0x0003: (CStorageValue) { 2 } 
        4 0x0007: (ConfigScriptMetaContainer) [3] { 
                0 0x0060: (CStorageValue) { 2 } 
                1 0x0006: (ConfigScriptMetaString) { rollout } 
                2 0x0006: (ConfigScriptMetaString) { NelParams } } 
        5 0x0007: (ConfigScriptMetaContainer) [3] { 
                0 0x0060: (CStorageValue) { 2 } 
                1 0x0006: (ConfigScriptMetaString) { bLightMap } 
                2 0x0007: (ConfigScriptMetaContainer) [5] { 
                        0 0x0060: (CStorageValue) { 4 } 
                        1 0x0006: (ConfigScriptMetaString) { type } 
                        2 0x0006: (ConfigScriptMetaString) { boolean } 
                        3 0x0006: (ConfigScriptMetaString) { default } 
                        4 0x0001: (CStorageValue) { 0 } } } 
        6 0x0007: (ConfigScriptMetaContainer) [3] { 
                0 0x0060: (CStorageValue) { 2 } 
                1 0x0006: (ConfigScriptMetaString) { bUnlighted } 
                2 0x0007: (ConfigScriptMetaContainer) [7] { 
                        0 0x0060: (CStorageValue) { 6 } 
                        1 0x0006: (ConfigScriptMetaString) { type } 
                        2 0x0006: (ConfigScriptMetaString) { boolean } 
                        3 0x0006: (ConfigScriptMetaString) { default } 
                        4 0x0001: (CStorageValue) { 0 } 
                        5 0x0006: (ConfigScriptMetaString) { ui } 
                        6 0x0007: (ConfigScriptMetaContainer) [2] { 
                                0 0x0060: (CStorageValue) { 1 } 
                                1 0x0006: (ConfigScriptMetaString) { cbUnlighted } } } } 
        7 0x0007: (ConfigScriptMetaContainer) [3] { 
                0 0x0060: (CStorageValue) { 2 } 
                1 0x0006: (ConfigScriptMetaString) { bStainedGlassWindow } 
                2 0x0007: (ConfigScriptMetaContainer) [7] { 
                        0 0x0060: (CStorageValue) { 6 } 
                        1 0x0006: (ConfigScriptMetaString) { type } 
                        2 0x0006: (ConfigScriptMetaString) { boolean } 
                        3 0x0006: (ConfigScriptMetaString) { default } 
                        4 0x0001: (CStorageValue) { 0 } 
                        5 0x0006: (ConfigScriptMetaString) { ui } 
                        6 0x0007: (ConfigScriptMetaContainer) [2] { 
                                0 0x0060: (CStorageValue) { 1 } 
                                1 0x0006: (ConfigScriptMetaString) { cbStainedGlassWindow } } } } 
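The chunk ids inside ConfigScript map to storage types, which a parser can encode as a small table. The sketch below lists only the ids I have actually seen in my file, and the names in it are my own labels, not anything official. It also decodes a 0x0004 float as an example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Map a ConfigScript chunk id to a readable type label (labels are my own).
std::string configScriptTypeName(uint16_t id)
{
    switch (id) {
    case 0x0001: return "boolean";   // stored as 4 bytes
    case 0x0003: return "integer";   // 32 bit, possibly signed
    case 0x0004: return "float";
    case 0x0005: return "string";    // same layout as 0x0006
    case 0x0006: return "internal string";
    case 0x0007: return "container"; // array-in-a-container
    case 0x0008: return "color";     // 3 floats, 255.f is the maximum
    case 0x0060: return "count";     // number of chunks that follow
    default:     return "unknown";   // 0x0002 does not appear in my file
    }
}

// Decode the 4-byte payload of a 0x0004 chunk as a little-endian float.
float decodeConfigScriptFloat(const std::vector<uint8_t> &payload)
{
    static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit floats");
    assert(payload.size() == 4);
    uint32_t bits = static_cast<uint32_t>(payload[0])
                  | (static_cast<uint32_t>(payload[1]) << 8)
                  | (static_cast<uint32_t>(payload[2]) << 16)
                  | (static_cast<uint32_t>(payload[3]) << 24);
    float value;
    std::memcpy(&value, &bits, sizeof(value)); // reinterpret the bit pattern
    return value;
}
```

The default value 0a d7 23 3c seen earlier in a float parameter decodes to roughly 0.01.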

Most other blocks in this file seem to contain value sets where the type is fixed to the id, and the id is basically the name of the config value. Right now, there doesn't seem to be anything in there that interests me, so I won't bother with them too much, but here's an example of one simplified anyways.

2 0x20a0: (Config20a0) [2] { 
        0 0x0100: (CStorageValue) { 1 } 
        1 0x0110: (Config20a0Entry) [25] { 
                0 0x0100: (CStorageValue) { 220 } 
                1 0x0110: (CStorageValue) { 0 } 
                2 0x0120: (CStorageValue) { 1 } 
                3 0x0130: (CStorageValue) { 0 } 
                4 0x0140: (CStorageValue) { 0 } 
                5 0x0150: (CStorageValue) { 0 } 
                6 0x0160: (CStorageValue) { 1 } 
                7 0x0161: (CStorageValue) { 1 } 
                8 0x0170: (CStorageValue) { 1 } 
                9 0x0180: (CStorageValue) { 0 } 
                10 0x0190: (CStorageValue) { 0 } 
                11 0x0200: (CStorageValue) { 0 } 
                12 0x0210: (CStorageValue) { 0 } 
                13 0x0220: (CStorageValue) { 994352038 } 
                14 0x0230: (CStorageValue) { 1041059807 } 
                15 0x0240: (CStorageValue) { 266338296 } 
                16 0x0250: (CStorageValue) { 131008 } 
                17 0x0260: (CStorageValue) { 0 } 
                18 0x0270: (CStorageValue) { 1 } 
                19 0x0280: (CStorageValue) { 0 } 
                20 0x0310: (CStorageValue) { 0 } 
                21 0x0290: (CStorageValue) {  } 
                22 0x0390: (CStorageValue) { default } 
                23 0x0300: (StorageContainer) [1] { 
                        0 0x0100: (StorageRaw) { 
                                Size: 4
                                String: .... 
                                Hex: 00 00 00 00 } } 
                24 0x0330: (StorageRaw) { 
                        Size: 16
                        String: ................ } } } 

Quite boring, right?

Next up is the long awaited Scene.

3ds Max File Format (Part 2: The first inner structures; DllDirectory, ClassDirectory3)

Now that we understand the outer structure of the file, it's time to look closer to what's inside. The DllDirectory stream looks like a good starting point. After cleaning up a whole bunch of code to make it easier to plug in specialized handling code, a nice and readable output of this structure shows up as follows:

DllDirectory
(StorageContainer) [20] { 
0 0x21c0: (StorageRaw) { 
        Size: 4
        String: .... 
        Hex: d8 00 e0 2e } 
1 0x2038: (StorageContainer) [2] { 
        0 0x2039: (StorageRaw) { 
                Size: 78
                String: V.i.e.w.p.o.r.t. .M.a.n.a.g.e.r. .f.o.r. .D.i.r.e.c.t.X. .(.A.u.t.o.d.e.s.k.). } 
        1 0x2037: (StorageRaw) { 
                Size: 38
                String: V.i.e.w.p.o.r.t.M.a.n.a.g.e.r...g.u.p. } } 
2 0x2038: (StorageContainer) [2] { 
        0 0x2039: (StorageRaw) { 
                Size: 98
                String: m.e.n.t.a.l. .r.a.y.:. .M.a.t.e.r.i.a.l. .C.u.s.t.o.m. .A.t.t.r.i.b.u.t.e.s. .(.A.u.t.o.d.e.s.k.). } 
        1 0x2037: (StorageRaw) { 
                Size: 42
                String: m.r.M.a.t.e.r.i.a.l.A.t.t.r.i.b.s...g.u.p. } } 
...
19 0x2038: (StorageContainer) [2] { 
        0 0x2039: (StorageRaw) { 
                Size: 54
                String: B.i.p.e.d. .C.o.n.t.r.o.l.l.e.r. .(.A.u.t.o.d.e.s.k.). } 
        1 0x2037: (StorageRaw) { 
                Size: 18
                String: b.i.p.e.d...d.l.c. } } } 

Thanks to the chunks, most of this is self-explanatory, and fairly easy to handle. The 0x21c0 chunk seems to be a header chunk for the DllDirectory, so we can call it DllHeader or something similar; it contains 4 bytes. This chunk is found in a version 2010 file, but doesn't seem to exist in files from version 3, so it's probably not crucial to handle, and we can ignore its contents until we need it for something. The rest of the chunks in this container all have id 0x2038 and are the entries in this list, so they are called DllEntry. Inside each of these, there is a 0x2039 chunk containing a description, and a 0x2037 chunk containing a name, both in UTF-16.
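
To illustrate the UTF-16 part, here is a minimal sketch of turning the raw payload of a 0x2039 or 0x2037 chunk into a string. The helper names decodeUtf16le and toAsciiForDisplay are made up for this example and are not the functions used in the actual parser. It assumes little-endian code units without a terminator, which is what the sizes in the dump above suggest (18 bytes for the 9 characters of biped.dlc).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode the raw payload of a string chunk as UTF-16LE.
// Hypothetical helper, not part of the original parser.
std::u16string decodeUtf16le(const std::vector<std::uint8_t> &raw)
{
    assert(raw.size() % 2 == 0); // two bytes per UTF-16 code unit
    std::u16string out;
    out.reserve(raw.size() / 2);
    for (std::size_t i = 0; i + 1 < raw.size(); i += 2)
        out.push_back(static_cast<char16_t>(raw[i] | (raw[i + 1] << 8)));
    return out;
}

// Narrow to ASCII for debug output; anything outside ASCII becomes '?'.
std::string toAsciiForDisplay(const std::u16string &s)
{
    std::string out;
    for (char16_t c : s)
        out.push_back(c < 0x80 ? static_cast<char>(c) : '?');
    return out;
}
```

Keeping the decoded value as UTF-16 internally and only narrowing it for display means the string can be written back to the file without loss.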

The meaning of chunk identifiers depends on the parent container chunk, so we have to code it like that as well. Each container type is set up with a handler to create the class that handles a chunk of a specified identifier. Parsing this block with some smarter code results in a data reading that looks as follows:

(DllDirectory) [20] { 
0 0x21c0: (CStorageValue) { 786432216 } 
1 0x2038: (DllEntry) [2] { 
        0 0x2039: (CStorageValue) { Viewport Manager for DirectX (Autodesk) } 
        1 0x2037: (CStorageValue) { ViewportManager.gup } } 
2 0x2038: (DllEntry) [2] { 
        0 0x2039: (CStorageValue) { mental ray: Material Custom Attributes (Autodesk) } 
        1 0x2037: (CStorageValue) { mrMaterialAttribs.gup } } 
...
19 0x2038: (DllEntry) [2] { 
        0 0x2039: (CStorageValue) { Biped Controller (Autodesk) } 
        1 0x2037: (CStorageValue) { biped.dlc } } } 
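
The idea that a container decides which class handles each child id can be sketched roughly like this. This is a simplified illustration and not the real parser code: the Chunk base class and the createChild function are assumptions of this example, and only the class names shown in the dump above are taken from the actual output.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Base class: an unrecognized chunk is kept as raw bytes.
struct Chunk
{
    virtual ~Chunk() = default;
    virtual std::string className() const { return "StorageRaw"; }
    // Containers override this to pick the class for a child id.
    virtual std::unique_ptr<Chunk> createChild(std::uint16_t) const
    {
        return std::unique_ptr<Chunk>(new Chunk());
    }
};

// A typed value (integer or string) instead of a raw blob.
struct Value : Chunk
{
    std::string className() const override { return "CStorageValue"; }
};

struct DllEntry : Chunk
{
    std::string className() const override { return "DllEntry"; }
    std::unique_ptr<Chunk> createChild(std::uint16_t id) const override
    {
        // 0x2039 = description, 0x2037 = file name, but only inside a DllEntry.
        if (id == 0x2039 || id == 0x2037)
            return std::unique_ptr<Chunk>(new Value());
        return Chunk::createChild(id);
    }
};

struct DllDirectory : Chunk
{
    std::string className() const override { return "DllDirectory"; }
    std::unique_ptr<Chunk> createChild(std::uint16_t id) const override
    {
        if (id == 0x21c0) return std::unique_ptr<Chunk>(new Value());    // header
        if (id == 0x2038) return std::unique_ptr<Chunk>(new DllEntry()); // entry
        return Chunk::createChild(id);
    }
};
```

Because the lookup lives in the parent, the same id can mean something else in another container, and anything unknown simply falls back to a raw chunk.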

Next, we do a similar thing for the ClassDirectory3 stream.

ClassDirectory3
(StorageContainer) [57] { 
0 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ................ 
                Hex: ff ff ff ff 82 00 00 00 00 00 00 00 82 00 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 22
                String: P.a.r.a.m.B.l.o.c.k.2. } } 
1 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ....<).Z..B0`... 
                Hex: 00 00 00 00 3c 29 06 5a 1e 0c 42 30 60 11 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 30
                String: V.i.e.w.p.o.r.t.M.a.n.a.g.e.r. } } 
...
8 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ................ 
                Hex: 03 00 00 00 02 00 00 00 00 00 00 00 00 0c 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 16
                String: S.t.a.n.d.a.r.d. } } 
...
13 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ....._.d..+".... 
                Hex: fe ff ff ff ec 5f c7 64 b9 9e 2b 22 00 0c 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 24
                String: N.e.L. .M.a.t.e.r.i.a.l. } } 
...
56 0x2040: (StorageContainer) [2] { 
        0 0x2060: (StorageRaw) { 
                Size: 16
                String: ...."".......... 
                Hex: ff ff ff ff 22 22 00 00 00 00 00 00 00 01 00 00 } 
        1 0x2042: (StorageRaw) { 
                Size: 10
                String: S.c.e.n.e. } } } 

This container does not seem to have a header chunk; it simply contains a whole bunch of entries of id 0x2040, each holding a binary blob with id 0x2060 and a UTF-16 string with id 0x2042 that has a description. There's a block in here with some data that I can recognize and reference from our own code. The NeL Material, which is a MAXScript, has a class id of (0x64c75fec, 0x222b9eb9), which matches the middle 8 bytes of the 16-byte blob (read each group of four bytes backwards, since the values are stored little-endian). The last four bytes in the blob match the last four bytes in the Standard (material) class entry, and appear to be the SuperClassID. The first four bytes appear to be a signed integer, given that there are both ff ff ff and 00 00 00 values without much in between. For the NeL Material, which is a script, this value is -2, and cross-referencing with other max files that contain scripted classes reveals the same. Builtin types, such as Scene, have this number as -1. Classes that come from plugins, such as ViewportManager, have a value of zero or higher. Even closer inspection reveals that this value matches the index of the associated dll in the DllDirectory, ViewportManager being part of ViewportManager.gup, and Standard being part of mtl.dlt. It can be expected that the indices of the classes in this list will be needed later on as well. A smarter parsing output looks as follows:

(ClassDirectory3) [57] { 
0 0x2040: (ClassEntry) [2] { 
        0 0x2060: (ClassDirectoryHeader) { 
                DllIndex: -1
                ClassID: (0x00000000, 0x00000082)
                SuperClassID: 130 } 
        1 0x2042: (CStorageValue) { ParamBlock2 } } 
1 0x2040: (ClassEntry) [2] { 
        0 0x2060: (ClassDirectoryHeader) { 
                DllIndex: 0
                ClassID: (0x30420c1e, 0x5a06293c)
                SuperClassID: 4448 } 
        1 0x2042: (CStorageValue) { ViewportManager } } 
...
56 0x2040: (ClassEntry) [2] { 
        0 0x2060: (ClassDirectoryHeader) { 
                DllIndex: -1
                ClassID: (0x00000000, 0x00002222)
                SuperClassID: 256 } 
        1 0x2042: (CStorageValue) { Scene } } } 
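
As a rough sketch of the layout described above, the 16-byte 0x2060 blob can be decoded as four little-endian 32-bit fields. The struct and function names are assumptions for this example; only the field meanings (dll index, class id, super class id) come from the analysis above. The two class id halves are kept in the order they are stored in the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

struct ClassDirectoryHeader
{
    std::int32_t dllIndex;      // -1 builtin, -2 script, >= 0 index of the dll entry
    std::uint32_t classIdA;     // first stored half of the class id
    std::uint32_t classIdB;     // second stored half of the class id
    std::uint32_t superClassId;
};

static std::uint32_t readU32le(const std::uint8_t *p)
{
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8)
         | (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

ClassDirectoryHeader parseClassDirectoryHeader(const std::uint8_t *blob, std::size_t size)
{
    assert(size == 16); // the blob is always 16 bytes in the files seen so far
    (void)size;
    ClassDirectoryHeader h;
    h.dllIndex = static_cast<std::int32_t>(readU32le(blob));
    h.classIdA = readU32le(blob + 4);
    h.classIdB = readU32le(blob + 8);
    h.superClassId = readU32le(blob + 12);
    return h;
}

// Human readable origin of a class, following the observations above.
std::string classOrigin(const ClassDirectoryHeader &h)
{
    if (h.dllIndex == -1) return "builtin";
    if (h.dllIndex == -2) return "script";
    if (h.dllIndex >= 0) return "dll #" + std::to_string(h.dllIndex);
    return "unknown";
}
```

Feeding it the NeL Material blob from the dump gives a dll index of -2 and the class id (0x64c75fec, 0x222b9eb9).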

The ClassData stream is very similar, and seems to contain a global data storage for classes, or something in that style. It doesn't seem to have anything in it that interests me or seems crucial at this point, so I won't bother with it too much for now. It's fairly self-explanatory.

(ClassData) [7] { 
0 0x2100: (ClassDataEntry) [2] { 
        0 0x2110: (ClassDataHeader) { 
                ClassID: (0xbe7c7e52, 0x87d987f4)
                SuperClassID: 16 } 
        1 0x2120: (StorageRaw) { 
                Size: 0
                String:  
                Hex: } } 
...
4 0x2100: (ClassDataEntry) [2] { 
        0 0x2110: (ClassDataHeader) { 
                ClassID: (0x33b673a4, 0x44b50d1e)
                SuperClassID: 4128 } 
        1 0x2120: (StorageContainer) [14] { 
                0 0x0190: (StorageRaw) { 
                        Size: 48
                        String: ...................=...=......{@.z.B.......@...= 
                        Hex: 00 00 00 00 00 00 00 00 1f 1c c1 c3 01 00 00 00 cd cc cc 3d cd cc cc 3d 00 00 00 00 cf f7 7b 40 e1 7a 1d 42 01 00 00 00 00 00 a0 40 cd cc cc 3d } 
                1 0x019c: (StorageRaw) { 
                        Size: 72
                        String: ...................?...@......{@.z.B.......@...=...=.@.E...=..........HC 
                        Hex: 00 00 00 00 00 00 00 00 1f 1c c1 c3 01 00 00 00 00 00 80 3f 00 00 a0 40 00 00 00 00 cf f7 7b 40 e1 7a 1d 42 01 00 00 00 00 00 a0 40 cd cc cc 3d cd cc cc 3d 00 40 9c 45 cd cc cc 3d 01 00 00 00 01 00 00 00 00 00 48 43 } 
...

So far, this was easy. After this comes the real stuff.

Friday 17 August 2012

3ds Max File Format (Part 1: The outer file format; OLE2)

The 3ds Max file format: there is not much documentation to be found about it. There are some hints here and there about how it's built up, but no central documentation on it exists.

Right now we are in the following situation. A few thousand max files, created by a very old version of max (3.x), containing path references to textures and other max files that have been renamed and relocated or which simply no longer exist. Yes, we have a maxscript that can go through them all, and that manages to fix a large number of paths. However, there are a lot of paths that are stored as part of fields in plugins and material scripts that don't get noticed, and the performance of opening and closing this number of files from 3ds Max directly is horrible. The obvious solution? Figure out how we can read and save the max file with modified contents, without having to understand all of the actual data it contains. Fortunately, this is actually possible without too much work.

Some research online brings up the following blog post, relating to a change in the max file format in version 2010, which would make it easier to update asset paths: http://www.the-area.com/blogs/chris/reading_and_modifying_asset_file_paths_in_the_3ds_max_file. That's nice and all, but it's only from version 2010 on, and it very likely won't contain any assets referenced by path by old plugins and such.

So, starting at the beginning. The blog post I referred to above nicely points us to the OLE structured file format. Since there exists a wide range of implementations for that, we can pretty much skip that part, and accept that it's basically a filesystem in a file, so it's a file containing multiple file streams. A reliable open source implementation of this container format can be found in libgsf. When scanning a fairly recent max file, using the command gsf list, we can find the following streams inside this file:

f         52 VideoPostQueue
f     147230 Scene
f        366 FileAssetMetaData2
f       2198 DllDirectory
f      29605 Config
f       3438 ClassDirectory3
f        691 ClassData
f      29576 SummaryInformation
f       2320 DocumentSummaryInformation

The FileAssetMetaData2 is new in 3ds Max 2010.

One step further, we can start examining the contents of these streams. And it's usually easiest to start off with one of the simpler ones. VideoPostQueue seems small enough to figure out the overall logic of the file format, hoping that the rest is serialized in a similar way. Using the command gsf dump we can get a hex output of one of the streams, and using a simple text editor we can find how it's structured. Binary formats often contain 32-bit length values, which are usually easy to spot in small files, since they'll contain a large number of 00 values. It's basically a matter of finding possible 32-bit length integers, and matching them together with various fixed length fields and other typical binary file contents, until something programmatically logical turns up. Here's a manually parsed VideoPostQueue storage stream:

[
        50 00 (id: 0x0050)
        0a 00 00 00 (size: 10 - 6 = 4)
        [
                01 00 00 00 (value: 1)
        ]
]
[
        60 00 (id: 0x0060)
        2a 00 00 80 (size: 42 - 6 = 36) (note: negative bit = container)
        [
                10 00 (id: 0x0010)
                1e 00 00 00 (size: 30 - 6 = 24)
                [
                        07 00 00 00 (value: 7)
                        01 00 00 00 (value: 1)
                        00 00 00 00
                        00 00 00 00
                        20 12 00 00 (value: 4640)
                        00 00 00 00
                ]
                20 00 (id: 0x0020)
                06 00 00 00 (size: 6 - 6 = 0)
        ]
]
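
The manual parse above can be reproduced with a small recursive walk. This is only a sketch under the assumptions visible in the hex dump: a 2-byte id, a 4-byte size that includes the 6 header bytes, and the top bit of the size marking a container. The function name listChunks and the output format are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Walk the chunks in data[begin, end) and append one line per chunk to out.
// Only the plain 6-byte header is handled in this sketch.
void listChunks(const std::vector<std::uint8_t> &data, std::size_t begin, std::size_t end,
                int depth, std::vector<std::string> &out)
{
    std::size_t pos = begin;
    while (pos < end)
    {
        assert(end - pos >= 6); // room for id + size
        const std::uint8_t *p = &data[pos];
        unsigned id = p[0] | (p[1] << 8);
        std::uint32_t field = std::uint32_t(p[2]) | (std::uint32_t(p[3]) << 8)
                            | (std::uint32_t(p[4]) << 16) | (std::uint32_t(p[5]) << 24);
        bool container = (field & 0x80000000u) != 0;
        std::size_t size = field & 0x7fffffffu; // includes the header itself
        assert(size >= 6 && size <= end - pos);

        char line[96];
        std::snprintf(line, sizeof(line), "%*s0x%04x %s payload=%u", depth * 2, "", id,
                      container ? "container" : "raw", static_cast<unsigned>(size - 6));
        out.push_back(line);

        if (container) // the payload of a container is just more chunks
            listChunks(data, pos + 6, pos + size, depth + 1, out);
        pos += size;
    }
}
```

Run on the 52 bytes of the VideoPostQueue stream, this should list the 0x0050 value, the 0x0060 container, and the 0x0010 and 0x0020 chunks nested inside it.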

The storage streams in the max container file contain a fairly simple chunk based file format (in fact similar to the fairly well known .3ds file format). Being based on chunks is what allows 3ds Max to open a file for which certain plugins are missing. It's basically a tree structured format where every entry has an identifier and a size, so when an identifier is unknown, or when its contents are incompatible, it can simply be kept as is or discarded. The only exceptions in the file that don't use this structure are SummaryInformation and DocumentSummaryInformation, which are supposedly in a standard Windows format, and the new FileAssetMetaData2 section, which unfortunately is formatted differently as well.

In this format, the chunk header consists of a 2-byte unsigned integer which is the identifier, and a 4-byte unsigned integer, where the 31 least significant bits are the size and the msb is a flag that helpfully lets us know if the chunk itself contains more chunks, and thus is a container, or not. For very large files, where 31 bits is insufficient for the size, the entire size field is set to 0, and the header increases with an additional 64-bit unsigned integer field which is similarly structured as the 32-bit size field. The size field includes the size of the header.

       0 | 0f 20 (id)
                 00 00 00 00 (size missing)
                             17 fe 01 00 00 00 00 80 (size in 64 bits)
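
A minimal sketch of reading such a header, including the 64-bit fallback, could look like this. The ChunkHeader struct and the readChunkHeader function are names invented for this example. It assumes little-endian fields and that the top bit of the 64-bit field is the container flag, as the example bytes above suggest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct ChunkHeader
{
    std::uint16_t id;
    bool container;         // true when the payload is a list of chunks
    std::uint64_t size;     // size of the whole chunk, header included
    std::size_t headerSize; // 6 normally, 14 with the 64-bit size field
};

// Returns false when fewer bytes are available than the header needs.
bool readChunkHeader(const std::uint8_t *p, std::size_t avail, ChunkHeader &h)
{
    assert(p != nullptr);
    if (avail < 6)
        return false;
    h.id = static_cast<std::uint16_t>(p[0] | (p[1] << 8));
    std::uint32_t size32 = std::uint32_t(p[2]) | (std::uint32_t(p[3]) << 8)
                         | (std::uint32_t(p[4]) << 16) | (std::uint32_t(p[5]) << 24);
    if (size32 != 0)
    {
        h.container = (size32 & 0x80000000u) != 0;
        h.size = size32 & 0x7fffffffu;
        h.headerSize = 6;
        return true;
    }
    // A zero 32-bit field means the real size follows as 64 bits.
    if (avail < 14)
        return false;
    std::uint64_t size64 = 0;
    for (int i = 0; i < 8; ++i)
        size64 |= std::uint64_t(p[6 + i]) << (8 * i);
    h.container = (size64 >> 63) != 0;
    h.size = size64 & 0x7fffffffffffffffull;
    h.headerSize = 14;
    return true;
}
```

The payload length is then size minus headerSize, whichever of the two header forms was used.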

With this information it is possible to read a max file, modify the binary contents of chunks (most of them have a fairly basic format), and we should be able to re-save the max file with our modified data. The DllDirectory section, for example, parsed programmatically starts like this:

CStorageContainer - items: 20
        [0x21C0] CStorageValue - bytes: 4
        786432216
        [0x2038] CStorageContainer - items: 2
                [0x2039] CStorageUCString - length: 39
                Viewport Manager for DirectX (Autodesk)
                [0x2037] CStorageUCString - length: 19
                ViewportManager.gup
        [0x2038] CStorageContainer - items: 2
                [0x2039] CStorageUCString - length: 49
                mental ray: Material Custom Attributes (Autodesk)
                [0x2037] CStorageUCString - length: 21
                mrMaterialAttribs.gup
        [0x2038] CStorageContainer - items: 2
                [0x2039] CStorageUCString - length: 37
                Custom Attribute Container (Autodesk)
                [0x2037] CStorageUCString - length: 23
                CustAttribContainer.dlo
...
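
Writing the data back mostly comes down to recomputing the size fields, since changing a path string changes the size of every container around it. Here is a rough sketch of that step. The Chunk struct and appendChunk are hypothetical names for this example, and it only writes the plain 6-byte header, so it does not cover chunks that need the 64-bit size field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Chunk
{
    std::uint16_t id;
    bool container;
    std::vector<std::uint8_t> data; // payload of a plain chunk
    std::vector<Chunk> children;    // payload of a container chunk
};

// Append the serialized chunk to out, recomputing all size fields.
void appendChunk(const Chunk &c, std::vector<std::uint8_t> &out)
{
    const std::size_t start = out.size();
    out.push_back(static_cast<std::uint8_t>(c.id & 0xff));
    out.push_back(static_cast<std::uint8_t>(c.id >> 8));
    out.insert(out.end(), 4, 0); // size placeholder, patched below

    if (c.container)
    {
        for (const Chunk &child : c.children)
            appendChunk(child, out);
    }
    else
    {
        out.insert(out.end(), c.data.begin(), c.data.end());
    }

    const std::size_t size = out.size() - start; // header included
    assert(size < 0x80000000u); // larger chunks need the 64-bit header form
    std::uint32_t field = static_cast<std::uint32_t>(size);
    if (c.container)
        field |= 0x80000000u;
    for (int i = 0; i < 4; ++i)
        out[start + 2 + i] = static_cast<std::uint8_t>((field >> (8 * i)) & 0xff);
}
```

Chunks that are not understood can be carried along as plain data and written back byte for byte, which is what makes editing a file without knowing all of its contents feasible.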

Of course, it would be interesting if we could go further, and directly manipulate the parameters of our own plugins and scripts from our own tools back into the max files so that everything is centrally stored without any duplicate source data in the way. And that's exactly what I'll be doing next.

Sunday 22 August 2010

Why avoid closed silver bullet game engines, why not? A full page of Unity bashing

This is another short excerpt of text that was originally written as part of my final internship report in June. It is here aimed towards Unity, as this was the silver bullet of the year, but a lot of this can generally be applied to pretty much any closed silver bullet engine.

I have now worked with Unity for three projects. A small game at the global game jam this year, an online multiplayer board game prototype, and a first person shooter.

If you want to make generic re-hashed physics games, where the player is a character walking around a world, and you don't want to do anything technologically innovative, then Unity is for you, and then you should use it. For me, it is a pure waste of my time.

Initially, I had given it a chance, because it seemed interesting, and there was quite a lot of attention paid to it recently. While using it for the first time, though, I got the impression that the overly designed architecture, which Unity forces you to use, leads to some very sloppy coding practices (public variables, singletons, and such), and others who I talked to found this as well.

Another issue that quickly turned up with Unity was the fact that it's not possible to debug. When Unity crashes with a fatal error, and it really does that in quite a few situations, you can lose a very valuable amount of time on figuring out just where it's going wrong. Sure, with Unity it's possible to try after every single line of code if it works, because the compilation takes literally no time. After a while you start spending more than half of your time on just launching the game just to see if that last line of code doesn't crash Unity.

I expect from an engine that it only provides me with highly optimized implementations of fundamentally important techniques that take a long time to write. And that I can choose to bypass or modify as I wish. Unity does not give me this. Instead, it gives me a truckload of half working garbage made from a bunch of libraries they licensed from elsewhere which they crudely stitched together to form one static unified blob. In comparison, the most friendly engine that I have used is XNA, simply because it does all the boring stuff for you, but gives you complete freedom in the area of graphical programming. Unity is making me think of ditching C# altogether, just so I can avoid it.

But one of the major issues, really, is vendor lock-in. Hardly any script written for Unity can be meaningfully reused in a game that does not run under Unity, thanks to the Unity specific interfaces that are necessary for scripts to interact with each other in Unity. On the other hand, code written in C++ for one project, with one engine, can easily be adapted for use in another project without major changes, even when using a different engine, because a proper engine does not force me to use poor design patterns. And a game written in C/C++ using OpenGL and OpenAL runs potentially on everything, because I can make it do so, while a game written in Unity will only run on the platforms that Unity decides to support. Unity will not run on Linux, because they said so.

It also does not allow me to experiment with new technologies or techniques. It is not practical to, for example, make use of OpenCL from within Unity, as we do not truly have direct access, which makes it impractical to share data between Unity rendering and OpenCL without needlessly copying data back and forth between the mainboard ram and the gpu ram.

Then there are countless bugs and design flaws, especially in the networking and sound interfaces of the Unity engine, that they have known about for a long time already, as can be proven by various posts on the Unity forums, and which are easier to fix by writing your own engine than by working around them.

In the end though, if I constantly have to hear, only from people who actually don't even make games with Unity themselves, how good Unity is, that "everything" will be fixed in "the next version", and that I should just use it, then something is clearly wrong. It's more viral marketing hype than useful technology.

It's potentially a nice level editor, though.

But the thing is, a lot of people do prefer to work with a commercial all-in-one solution, such as Flash, Unity, or whatever the next silver bullet is. They just want to get their ideas out there, without having to bother with technical details, and they'll just hop on to the next bandwagon whenever it passes by. The upside of this is that they will keep following the current technological advances. The downside is that they'll be lagging behind the current technologies by as much as their engine of choice does.

I'm not in favor of writing your own engine from scratch either. The thing that you need to be capable of doing is to rush forward, ahead of the commonly known techniques, and extend what already exists. There is no use in rewriting again and again what we already have; it is more valuable to build upon that. To make a car analogy here: don't reinvent the wheel, make something that's not a wheel that can do its function of transportation much better than a wheel, and mount it on an existing car.

So, in my opinion as a programmer, the best choice is to start out from a game engine which gives you full access to the source code, whether that be a commercially licensed or open source licensed engine does not matter. What matters is that you should have the ability to fix what is wrong, without wasting more time than necessary. And extend that with your own unique ideas and experiments.

On the value of innovative techniques in games

This is a short excerpt of text that was originally written as part of my final internship report in June.

I prefer to look into new ways and new techniques of doing something, instead of just going with what everyone has already done a thousand times before. This does not mean that I do everything from scratch, it means that I also see a place for the lesser known techniques, that still require a certain level of skill to implement and work with, in addition to looking for new ways of doing something myself.

And when thinking of new techniques, you will eventually find obscure blogs online where some random unknown people have thought of a similar or just exactly the same technique as well, and some of them have succeeded, but never really got their techniques well known, and some of them will have failed, but when one person fails it does not mean everyone else will fail.

Sure, it takes less time to just implement what is already known, and you'll have faster results. But there'll be nothing new, no reason why anyone would be interested in the game, nothing that makes it unique. A lot of unique gameplay possibilities come from simple technical tricks that are rarely done, such as the portals in Portal, and not from endless brainstorming sessions, that are wasting the available production time. If you're only willing to implement what already exists without considering any fundamental changes, then you're essentially just going to ship nothing but shovelware.

For example, any halfway competent run-of-the-mill programmer can implement a somewhat optimized heightmap landscape system, since it is very easy to understand, and there are literally hundreds of tutorials for it, but they will rarely think beyond these commonplace techniques. They had enough training to follow instructions, but they do not have the skills to do something for themselves. Generally they will reject unknown techniques as being wrong, simply because almost nobody uses them, and rarely will they consider implementing any unusual techniques that may actually be more suited for the application.

Unless you're willing to ship shovelware, unless you find quantity more important than quality, there is only limited room for people without any skillful insight in a game development company.

Pretty much all web browser games are shovelware.

Nonetheless, there is a very large business in creating shovelware. All the poor quality products and services on the market can only use games of equally poor quality in their marketing campaign. Using a game that goes beyond the current standards as a means of advertisement, for whatever rubbish products or services are released on the market, does not serve its advertising purpose well, as the end consumer would find the product to be very poor compared to the quality of the marketing campaign. On the other hand, for a product or service with a sufficiently high level of quality, throwing an interesting innovative game into its marketing campaign could prove to be very successful.

Sunday 25 October 2009

SSE2 memcpy

SSE2 provides load and store instructions that perform faster on aligned memory. By copying the first and last bytes of an unaligned memory destination using the conventional unaligned functionality, and copying everything in between as aligned, it is possible to make use of this performance improvement on large unaligned memory blocks as well.
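
The idea can be sketched as follows. This is not the function that was measured, only an illustration of the technique with several assumptions: it aligns on the destination only and always loads the source unaligned, it hands blocks below 128 bytes to the builtin memcpy, and it falls back to the builtin memcpy entirely when the compiler does not target SSE2. The name sse2MemcpySketch is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

// Same contract as memcpy: the two blocks must not overlap.
void *sse2MemcpySketch(void *dst, const void *src, std::size_t size)
{
    assert(dst != nullptr && src != nullptr);
#ifdef SKETCH_HAVE_SSE2
    if (size >= 128) // assumed threshold; small blocks go to the builtin memcpy
    {
        unsigned char *d = static_cast<unsigned char *>(dst);
        const unsigned char *s = static_cast<const unsigned char *>(src);

        // Head: one unaligned 16-byte copy, then skip to the next 16-byte
        // boundary of the destination (this skips 1 to 16 bytes).
        _mm_storeu_si128(reinterpret_cast<__m128i *>(d),
                         _mm_loadu_si128(reinterpret_cast<const __m128i *>(s)));
        std::size_t head = 16 - (reinterpret_cast<std::uintptr_t>(d) & 15);
        d += head;
        s += head;
        std::size_t left = size - head;

        // Body: aligned stores to the destination, unaligned loads.
        while (left >= 16)
        {
            _mm_store_si128(reinterpret_cast<__m128i *>(d),
                            _mm_loadu_si128(reinterpret_cast<const __m128i *>(s)));
            d += 16;
            s += 16;
            left -= 16;
        }

        // Tail: one unaligned 16-byte copy that ends exactly at the last
        // byte, rewriting a few bytes that were already copied.
        if (left > 0)
        {
            _mm_storeu_si128(reinterpret_cast<__m128i *>(d + left - 16),
                             _mm_loadu_si128(reinterpret_cast<const __m128i *>(s + left - 16)));
        }
        return dst;
    }
#endif
    return std::memcpy(dst, src, size);
}
```

Overlapping the head and tail copies with the aligned body avoids a byte-by-byte loop at either end, at the cost of writing a few bytes twice.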

In this graph the green lines are the conventional memcpy available in Microsoft Visual Studio 2008, the red lines are the SSE memcpy function available in Nevrax NeL, and the blue lines are the custom SSE2 function. The bright colored lines are the performance on aligned memory blocks, while the dark colored lines are tested on differently unaligned blocks of memory. Horizontally the copy function is tested on different sizes of memory; on the vertical axis the copy speed is displayed in MB/s.

As you can see, NeL's SSE memcpy performs very well on aligned memory, but gives horrible performance on unaligned memory, as it does not take the alignment of the memory blocks into account. The builtin memcpy function is fastest of all at copying blocks below 128 bytes, but also reaches its speed limit there. The SSE2 memcpy takes larger sizes to get to its maximum performance, but peaks above NeL's aligned SSE memcpy even for unaligned memory blocks.

Code is available below, ask before using. SSE2 memcpy
