Tuesday, July 12, 2005

AMD Alleges Intel Compilers Create Slower AMD Code

AMD Alleges Intel Compilers Create Slower AMD Code: "I noticed this problem back in January of 2004, with Intel C 8.0, and went through heck over nine months with Intel's customer support to get it fixed until I eventually had to abandon their compiler.

On any non-Intel processor, it specifically included an alternate code path for 'memcpy' that actually used 'rep movsb' to copy one byte at a time, instead of (for example) 'rep movsd' to copy a doubleword at a time (or MMX instructions to copy quadwords). This was probably the most brain-dead memcpy I'd ever seen, and was around 4X slower than even a typical naive assembly memcpy:

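push ecx        ; save the byte count
shr ecx, 2      ; ecx = number of whole doublewords
rep movsd       ; copy 4 bytes at a time from [esi] to [edi]
pop ecx         ; restore the byte count
and ecx, 3      ; ecx = 0 to 3 leftover bytes
rep movsb       ; copy the remaining bytes one at a time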
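
For readers who do not read x86 assembly, the same two-phase idea can be sketched in C: copy the bulk of the buffer four bytes at a time, then finish the last zero to three bytes individually. This is only an illustration of the technique in the listing, not code from the original comment; the function name naive_copy is hypothetical, and the fixed-size memcpy calls are there only to load and store a 32-bit word without alignment problems.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original post): mirrors the assembly
   listing by copying whole 4-byte words first, then the leftover bytes. */
void *naive_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t words = n >> 2;  /* like: shr ecx, 2 */
    size_t tail = n & 3;    /* like: and ecx, 3 */

    while (words--) {       /* like: rep movsd */
        uint32_t w;
        memcpy(&w, s, sizeof w);  /* fixed-size copy: safe unaligned load */
        memcpy(d, &w, sizeof w);  /* fixed-size copy: safe unaligned store */
        s += 4;
        d += 4;
    }
    while (tail--)          /* like: rep movsb */
        *d++ = *s++;
    return dst;
}
```

Like the assembly, this sketch assumes the source and destination do not overlap.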

They responded with completely ridiculous answers, such as:

'Our 8.0 memcpy was indeed optimized for a Pentium(r) 4 Processor, when we reworked this routine we used the simplest, most robust, and straightforward implementation for older processors so that we didn't need the extra code to check for alignment, length, overlap, and other conditions.'

BS. I went and added the following line to the beginning of my source code:

extern "C" int __intel_cpu_indicator;

then I added:

__intel_cpu_indicator = -512;

to the 'main' function.

This forced Intel C to use the 'Pentium 4' memcpy regardless of which processor is in the machine. I tested their special 'Pentium 4' memcpy thoroughly in all kinds of situations, and it turns out that it worked perfectly fine on an AMD Athlon and a Pentium III. I pointed this out to them.
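
The general mechanism being exploited here, a global indicator variable that selects a code path at run time, can be modeled with a toy example. The sketch below is not Intel's runtime and makes no claim about how __intel_cpu_indicator is actually interpreted; every name in it (toy_cpu_indicator, toy_memcpy, TOY_CPU_FAST, and so on) is hypothetical. It only shows why overwriting such a variable before any copy happens forces the faster path on every processor.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical stand-ins, NOT Intel runtime symbols. */
enum { TOY_CPU_GENERIC = 0, TOY_CPU_FAST = 1 };

int toy_cpu_indicator = TOY_CPU_GENERIC; /* a real runtime would set this from a CPU check */
int toy_last_path = -1;                  /* records which path ran, for demonstration only */

/* Slow path: one byte per iteration. */
static void copy_bytes(unsigned char *d, const unsigned char *s, size_t n)
{
    while (n--)
        *d++ = *s++;
}

/* Faster path: four bytes per iteration, then the leftover bytes. */
static void copy_words(unsigned char *d, const unsigned char *s, size_t n)
{
    for (; n >= 4; n -= 4, s += 4, d += 4) {
        uint32_t w;
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
    }
    copy_bytes(d, s, n);
}

/* Dispatcher: picks a path based on the global indicator. */
void *toy_memcpy(void *dst, const void *src, size_t n)
{
    toy_last_path = toy_cpu_indicator;
    if (toy_cpu_indicator == TOY_CPU_FAST)
        copy_words(dst, src, n);
    else
        copy_bytes(dst, src, n);
    return dst;
}
```

In this toy model, assigning toy_cpu_indicator = TOY_CPU_FAST at the start of the program plays the role of the assignment in 'main' described above: both paths produce the same bytes, and only the indicator decides which one runs.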

I received the following response:

'The fast memcpy is over 2000 lines of hand coded assembly, with lots of special cases where different code paths are chosen based on relative alignment of the source and destination. ... If the performance of memcpy/memset only are improved for Pentium III will that satisfy you?'

I answered 'No,' saying that I needed support for AMD processors as well. I also gave them a copy of my own memcpy routine that was 50% faster than theirs--and just used MMX. They closed the support issue and did nothing to resolve it.

I switched back to Visual C++."


Edward A. Villarreal. Powered by Blogger.
