Labels

3G (1) 8600GT (1) AI (4) amazon (1) API (1) apple (3) apple mail (1) atlassian (1) audio (1) bambo (1) Bamboo (1) bloat (1) boost (1) bugbear (1) C++ (5) calling conventions (1) cdecl (1) chromecast (1) CI (1) compiler (1) continuous integration (1) coursera (1) custom domain (1) debugging (1) deltanine (1) diagnosis (1) diy (5) DLL (1) dns (1) don't be evil (1) ec2 (1) education (1) electronics (1) express checkout (1) fail (6) fink (1) firewire (1) free hosting (1) GAE (1) google (1) Google App Engine (4) H170 (1) hackerx (1) hackintosh (1) Haskell (3) homebrew (2) i1394 (1) icloud (2) iOS 9 (1) ipad2 (2) jobhunting (2) lag (1) letsencrypt (2) libjpeg (1) linux (1) mac (2) mbcs (1) mechanic (1) memory (1) MFC (3) Microsoft (1) migration (1) ML (1) mobile (1) movi (1) MSBuild (1) music (1) naked domain (1) NLP (2) o2 sensor (1) obd (1) Optiplex960 (1) osx (1) outlook express (1) payments (1) paypal (1) photos (2) PIL (1) Project Euler (1) projectmix (1) python (2) raspberrypi (3) recruitment (1) renwal (1) skylake (1) soundcloud (1) ssl (2) stdcall (1) stripe (1) subaru (2) supermemo (1) supermemo anki java (1) sync (2) Telstra (1) tests (1) thunderbird (1) udacity (1) unicode (1) Uniform Cost Search (1) university (1) upgrade (2) vodafail (1) vodafone (1) VS2010 (1) vs2013 (1) VS6.0 (1) weather (1) win (1) Win32 (1) Z170 (1)

Friday 18 October 2013

More than I ever wanted to know about Win32 calling conventions...

Just spent the last day trying to debug a weird bug - one of those works in Debug builds but crashes in Release builds issues in a C++ MFC application. After tweaking some compiler settings I discovered it wouldn't crash if I disabled optimizations in release mode, but rather than doing that I thought I would dig down to find the root cause.
After much setting of breakpoints and attaching the debugger I found it was crashing in a call to CString::Format() - but only subsequent to a call to a function in 3rd party DLL that was used for encryption. The origins of this DLL were long ago lost in the mists of time so unfortunately there was no documentation.
The function was called by dynamically loading the DLL and calling ::GetProcAddress() to obtain the function pointer, which had the following signature:

typedef long ( __stdcall* LPFNDLLENCRYPT)(const char *, const char *, char *);

__stdcall
is used to call most of the Win32 apis. This calling convention can generate smaller code, as the called function performs the stack cleanup code.

__cdecl calling convention relies on the caller to cleanup the stack frame, and since functions are typically called from more than one place in the code, the additional stack cleanup code bloats the code.

On a hunch, I changed the function signature to:

typedef long ( __cdecl* LPFNDLLENCRYPT)(const char *, const char *, char *);

and lo and behold - it fixed the bug!

It seems the DLL was using __cdecl calling convention and expected the caller to cleanup the stack frame, but since the function pointer incorrectly declared it as __stdcall the compiler was not generating the stack cleanup code. This left the stack in a corrupted state which caused the next call to the CString::Format function in the MFC DLL to fail. Presumably the Debug build was more resilient with respect to stack corruption.

Here's a really useful article on calling conventions on CodeProject which explains calling conventions in more detail than you would ever want to know.... unless your debugging weird bugs due to calling conventions!

I love C++!

No comments:

Post a Comment