Kaetemi

Sunday 25 October 2009

SSE2 memcpy

SSE2 provides instructions that copy memory faster when it is aligned on a 16-byte boundary. By copying the first and last bytes of an unaligned destination block with the conventional unaligned functionality, and copying everything in between as aligned, it is possible to get this performance improvement on large unaligned memory blocks as well.
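The head/middle/tail split described above could look roughly like the following. This is only a minimal sketch and not the function measured in this post: the name `sse2_memcpy_sketch`, the 64-byte cutoff for small blocks, and the choice to align on the destination while reading the source with unaligned loads are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <emmintrin.h> // SSE2 intrinsics

// Hypothetical helper for illustration; not the code from this post.
void *sse2_memcpy_sketch(void *dst, const void *src, std::size_t size)
{
    unsigned char *d = static_cast<unsigned char *>(dst);
    const unsigned char *s = static_cast<const unsigned char *>(src);

    // Assumed cutoff: small blocks go straight to the builtin memcpy.
    if (size < 64)
        return std::memcpy(dst, src, size);

    // Head: the bytes before the next 16-byte boundary of the destination.
    std::size_t head = (16 - (reinterpret_cast<std::uintptr_t>(d) & 15)) & 15;
    std::memcpy(d, s, head);
    d += head;
    s += head;
    size -= head;
    assert((reinterpret_cast<std::uintptr_t>(d) & 15) == 0);

    // Middle: 16 bytes at a time with aligned stores. The source may still
    // be unaligned relative to the destination, so it uses unaligned loads.
    std::size_t blocks = size / 16;
    for (std::size_t i = 0; i < blocks; ++i)
    {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(s));
        _mm_store_si128(reinterpret_cast<__m128i *>(d), v);
        s += 16;
        d += 16;
    }

    // Tail: whatever is left after the last full 16-byte block.
    std::memcpy(d, s, size & 15);
    return dst;
}
```

Aligning on the destination rather than the source is a design choice of this sketch; a tuned version would likely also unroll the middle loop and special-case the situation where source and destination share the same alignment.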

In this graph the green lines are the conventional memcpy available in Microsoft Visual Studio 2008, the red lines are the SSE memcpy function available in Nevrax NeL, and the blue lines are the custom SSE2 function. The bright colored lines show the performance on aligned memory blocks, while the dark colored lines are tested on differently unaligned blocks of memory. The horizontal axis shows the size of the memory block being copied, and the vertical axis shows the copy speed in MB/s.

As you can see, NeL's SSE memcpy performs very well on aligned memory, but gives horrible performance on unaligned memory, as it does not take the alignment of the memory blocks into account. The builtin memcpy function is the fastest of all at copying blocks below 128 bytes, but also reaches its speed limit there. The SSE2 memcpy needs larger sizes to reach its maximum performance, but peaks above NeL's aligned SSE memcpy even for unaligned memory blocks.

Code is available below; ask before using.

Saturday 25 April 2009

Snowballs - April 2009. Status

As you might have noticed, I haven't posted any follow-ups to my first blog post on Snowballs yet. I had some ideas planned to talk about, but there's been other work, like the NeLSound engine, that had higher priority. Since Snowballs doesn't have sound yet, you can see this as work on Snowballs as well, if you wish. I have also been working on getting the content pipeline up and running. About half of the old broken script mess has been converted to Python scripts by now, so it's possible to pull a whole new landscape through the content pipeline. There's some stuff outside NeL that I'm working on too, which takes up quite a bit of time. But more importantly, there's this game I'm *cough*secretly*cough* working on, which I'm using NeL for. This has been producing some pretty nice code that might sometime be modified for use in Snowballs. It's got some useful stuff in there, like a login screen that uses CEGUI, with a bunch of changes to the login server to get some more information on the shards in the window.

Friday 5 September 2008

NeLSound XAudio2 Driver (Update)

The project to run NeLSound on top of Microsoft's new XAudio2 API is already nearing its final stages of development. The driver currently implements all basic functionality required by NeLSound, as well as some of the environment effects (commonly known as EAX), and has support for the OGG Vorbis music format. I am currently working on adding support for the ADPCM sample buffer format used by NeL to the driver, and I will soon work on the rest of the environment effects implementation as well. Also, the new owners of NeL have made available a new official website for NeL over at http://dev.ryzom.com/, and you can browse the code of the new sound driver at http://dev.ryzom.com/repositories/browse/nel/nel/src/sound/driver/xaudio2.

Sunday 24 August 2008

NeLSound XAudio2 Driver

I have recently started working on an XAudio2 driver for NeLSound, which will allow OpenNeL to make use of Microsoft's new XAudio2 sound library (included in the latest DirectX SDK), in addition to the three sound libraries already supported by OpenNeL (FMod, DSound and OpenAL). The project does not implement all functionality yet; the rest of the features will be implemented in the coming weeks, but it already runs the sound_sources sample included with OpenNeL without problems. The code can be downloaded from http://nel.svn.sourceforge.net/viewvc/nel/trunk/nel/src/sound/driver/xaudio2/, and is released, like all other available OpenNeL code, under version 2 of the GNU General Public License.