Thursday, June 23, 2011

"Cross-platform Java"

If you think about it, "cross-platform Java" is mostly a Unix/Linux phenomenon. A lot of the Java API and its behavior is copied from or influenced by Unix/Linux/POSIX. Truly cross-platform Java code ("write once, run everywhere") mostly runs on Unix web servers. It is virtually non-existent on desktop and server Windows (which, by the way, accounts for 90% of all desktops, 70% of netbooks, and around 30% of servers). Java is widely used on mobile devices, but there it is hardly cross-platform: mobile devices are so resource-constrained that Sun had to invent separate reduced editions of Java (Java ME etc.), and different phone models run Java so differently that developers had to produce a different version of the same application for each model! (At least that was true a couple of years ago; I don't know how it is today, with smarter phones. I'm phone-agnostic.)

Both Java and C# (through Mono) support a wide range of processor architectures, with C# lacking support for some rarely used ones. All in all, x86 and x64 are still dominant nowadays, and both frameworks support them equally well.

So if you think about it, the Java guys behave the same way the Windows guys do: they live in their little world and believe their little world is everything while in fact they're bound to a specific platform few people outside that platform care about.

The .NET API is based on WinAPI? The Java API is based on Unix/POSIX.
.NET isn't popular on Linux? Java isn't popular on Windows.
.NET isn't installed by default on Linux? Java isn't installed by default on Windows.
et cetera, et cetera.



All in all, in practice .NET isn't much less "cross-platform" than Java, in my opinion. They're two sides of the same coin. The fact that the JVM has an official implementation for Windows doesn't mean much.

BTW, Java HotSpot contains A LOT of assembly code, which makes it very difficult to port to other processor architectures. If I were aiming to be cross-platform, I would use as much C as possible in the first place. Wikipedia: "Ports of HotSpot (OpenJDK's Virtual Machine) are difficult because the code contains a lot of assembly in addition to the C++ core." They even started a project, zero-assembler Hotspot, to make the "cross-platform" JDK adequately portable.

Wednesday, June 22, 2011

Reinventing the wheel with the circular reference resolver.

Today I have implemented a "circular reference resolver". In other words, it is a garbage collector, a very simple one.

My little framework uses reference counting (+ smart pointers) to manage memory. However, sometimes I need circular references. It is usually a parent-child relation. In my case, a parent contains a cached child object that again points to the parent.

A naive reference counting implementation can't deal with circular references at all. No matter what it does, an isolated island of objects that reference each other will stay alive forever.

I also can't use plain "weak" references, i.e. references that don't actually retain their target. Imagine the following scenario: a material holds a hard reference to a state, while the state holds a weak (non-retaining) reference to the material.

    TMaterial* mat = TMaterial::Load("stone.mat"); // material refCount = 1
    TMaterialState* matState = mat->CreateState(); // state refCount = 1
    entity->SetMaterialState(matState);            // state refCount = 2
    matState->Unref();                             // state refCount = 1
    mat->Unref();                                  // material refCount = 0

Since the state references the material only weakly, the material may eventually get released: the system thinks no one references it (it doesn't track "weak" references), so it is free to remove it. Then, when the state tries to access the material's data, we get a crash :(
We also can't make them both reference each other in a hard manner, because in that case the circular reference would prevent us from ever releasing the group.

TCircularReferenceResolver deals with it very simply: it holds a big "parent => children" map which is enumerated at a regular interval (explicitly, via TCircularReferenceResolver::Collect()). When a circular reference group is created, it must be explicitly added to the resolver (somewhere in the constructor) via ::AddParentChildRelation. A parent must explicitly release its children in its destructor; children, however, must never try to unreference their parent and only keep a weak reference to it (this prevents infinite recursion).

TCircularReferenceResolver itself retains hard references to all such objects. So once user code has forgotten about these objects, their reference counts are guaranteed to be 1 for the parent (retained by the big map) and 2 for each child (one reference from the big map and one from the parent).

If at least one object in such a group (parent => children) has more retained references than that (more than 1 for the parent or more than 2 for a child), the whole group is reachable. If the parent has exactly 1 reference and its children have exactly 2 references each, the group is considered unreachable, and we can release it at once by simply unreferencing the parent, which recursively releases all children.
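The counting rule above can be sketched in a few dozen lines. This is only an illustration under assumptions, not the actual TCircularReferenceResolver: the RefCounted base, the CycleResolver class, and the Material/State pair are hypothetical stand-ins, the counters are not atomic, and C++11 is assumed.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical minimal ref-counted base (the framework's real base class is not shown in the post).
class RefCounted {
public:
    RefCounted() : m_refs(1) { ++LiveCount(); }   // a new object starts with one reference
    void Ref() { ++m_refs; }
    void Unref() { if (--m_refs == 0) delete this; }
    int RefCount() const { return m_refs; }
    static int& LiveCount() { static int n = 0; return n; }  // for demonstration only
protected:
    virtual ~RefCounted() { --LiveCount(); }
private:
    int m_refs;  // not atomic in this sketch
};

// Hypothetical resolver: holds hard references to every registered parent and child.
class CycleResolver {
public:
    void AddParentChildRelation(RefCounted* parent, RefCounted* child) {
        std::vector<RefCounted*>& kids = m_groups[parent];
        if (kids.empty()) parent->Ref();  // one resolver reference per parent
        child->Ref();                     // one resolver reference per child
        kids.push_back(child);
    }

    // Releases every group whose only remaining references are the internal ones.
    std::size_t Collect() {
        std::size_t collected = 0;
        for (GroupMap::iterator it = m_groups.begin(); it != m_groups.end(); ) {
            if (!IsUnreachable(it->first, it->second)) { ++it; continue; }
            RefCounted* parent = it->first;
            std::vector<RefCounted*> kids = it->second;
            it = m_groups.erase(it);
            for (std::size_t i = 0; i < kids.size(); ++i)
                kids[i]->Unref();         // drop the resolver's reference: 2 -> 1
            parent->Unref();              // 1 -> 0: the parent's destructor releases the children
            ++collected;
        }
        return collected;
    }

private:
    typedef std::map<RefCounted*, std::vector<RefCounted*> > GroupMap;

    static bool IsUnreachable(const RefCounted* parent, const std::vector<RefCounted*>& kids) {
        if (parent->RefCount() != 1) return false;        // someone else still holds the parent
        for (std::size_t i = 0; i < kids.size(); ++i)
            if (kids[i]->RefCount() != 2) return false;   // someone else still holds a child
        return true;
    }

    GroupMap m_groups;
};

// Hypothetical parent/child pair that reference each other.
struct State : RefCounted {
    explicit State(RefCounted* owner) : owner(owner) {}
    RefCounted* owner;  // weak back-pointer: never Ref'd or Unref'd
};

struct Material : RefCounted {
    explicit Material(CycleResolver& resolver) : state(new State(this)) {
        resolver.AddParentChildRelation(this, state);
    }
    State* state;
protected:
    ~Material() { state->Unref(); }  // the parent releases its child, never the other way round
};
```

In this sketch a freshly created Material has a count of 2 (its creator plus the resolver). Once the creator calls Unref(), the group sits at exactly 1 and 2, and the next Collect() releases it.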

We can also have multiple resolvers, and a resolver can be made current for a specific thread.

Here is an example:

    TStaticMaterial::TStaticMaterial(TTexture* texture) // ctor
    {
        TCircularReferenceResolver* resolver = TCircularReferenceResolver::ForCurrentThread(); // this resolver is registered by the canvas
        if(!resolver)
            DA_THROW_WITH_MSG(EC_INVALID_STATE, "No circular reference resolver registered for the current thread. Probably called outside the Canvas thread.");
        m_texture = texture;
        texture->Ref();
        m_cachedState = new TStaticMaterialState(this); // they reference each other
        resolver->AddParentChildRelation(this, m_cachedState); // explicitly adds to the resolver
    }

I also have a canvas->m_contentMgr->CollectCircularReferences(curTicks); call in the rendering thread, which triggers a collection every 5 seconds in the debug build and every 15 seconds in the release build (it's not like Java's collector, which traces the whole heap, so that's a decent interval).
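A tick-based trigger like this could look roughly as follows. This is a sketch under assumptions: the CollectionTimer class and DefaultIntervalMs() are hypothetical names, ticks are assumed to be milliseconds, and the debug/release split is assumed to follow the standard NDEBUG macro.

```cpp
#include <stdint.h>

// Hypothetical helper: picks the collection interval by build type.
inline uint64_t DefaultIntervalMs() {
#ifdef NDEBUG
    return 15000;  // release build: collect every 15 seconds
#else
    return 5000;   // debug build: collect every 5 seconds
#endif
}

// Hypothetical helper: decides whether enough time has passed since the last collection.
class CollectionTimer {
public:
    explicit CollectionTimer(uint64_t intervalMs)
        : m_intervalMs(intervalMs), m_lastMs(0) {}

    // Returns true (and restarts the interval) once intervalMs ticks have elapsed.
    bool ShouldCollect(uint64_t nowMs) {
        if (nowMs - m_lastMs < m_intervalMs)
            return false;
        m_lastMs = nowMs;
        return true;
    }

private:
    uint64_t m_intervalMs;
    uint64_t m_lastMs;
};
```

The render loop would then call the resolver's Collect() only when ShouldCollect(curTicks) returns true, so the map is not walked on every frame.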

Then if I write something stupid like this:
    // Loads the material (which is TStaticMaterial).
    TMaterial* garbage = canvas->ContentManager()->LoadMaterial(TString::FromUtf8("default.dmat"), false);
    // And immediately removes from the cache.
    canvas->ContentManager()->RemoveMaterial(TString::FromUtf8("default.dmat"));
    // And also removes the last reference.
    garbage->Unref();

the log will print in 5 seconds:

[INFO!Main] Garbage collected 1 circular reference group(s).

Makes me feel so cozy when I see it :)

The resolver probably isn't fully thread-safe, though: it doesn't stop the world. I didn't worry about it much, because 1) resolvers are supposed to be per-thread, 2) I have only 2 working threads in the current project, and 3) ref/unref are atomic. Anyway, if we urgently need to synchronize it, the ::SyncRoot() mutex is there to be used.

Monday, June 20, 2011

MinGW bugs

Yesterday I encountered my second MinGW bug.

The whole freaking day was spent debugging and figuring out what was going on.
Here is a snippet. Yes, it catches 'int', because I tried to reduce my problem to the simplest form (and I learned that it crashes with any type).

    // inside a newly created Win32 thread
    virtual void OnStart()
    {
        /* some code */

        try
        {
            resource->CreateAsyncHandle(); // <-- this throws
        }
        catch(int i)
        {
            printf("Upper level.\n"); // <-- doesn't print, isn't caught, crashes
            exit(1);
        }

        // Some code
    }


This is the function that throws the exception:
    void TAsyncProxyMaterial::CreateAsyncHandle() // virtual function btw
    {
        try
        {
            throw 2; // test code to simplify, in the actual code it throws a TException
        }
        catch(int i)
        {
            printf("Lower level.\n");

            throw;
        }

        // ... some simple code that deals with member fields
    }

(The original code didn't rethrow; the rethrow is there merely so that I could set breakpoints, and it crashes either way.)

Ideally, it should print
"Lower level.
Upper level."

exiting with error code 1.

In fact, it prints only "Lower level." and crashes with "This application has requested the Runtime to terminate it in an unusual way." (crashed with both MinGW 3.4.5 and MinGW 4.4.something)

DebugDiag didn't confirm there was a segfault. gdb's breakpoints were ignored when set to printf("Upper level.\n")

I also tried static builds, the -mthreads flag, and many other magic things.

Also, for debugging purposes, I removed all stack-allocated objects, to make sure the crash wasn't caused by an exception thrown from one of the destructors.

Weird.

So I applied a quick and ugly workaround (still to be redesigned):
    // noinline is placed in front of the declaration, where GCC accepts it on a function definition.
    __attribute__((noinline)) bool createAsyncHandle_workaroundMinGWBug(IAsyncResource* res, TException* out_ex)
    {
        bool exceptionOccured = false;

        try
        {
            res->CreateAsyncHandle();
        }
        catch(TException& e)
        {
            exceptionOccured = true;
            *out_ex = e; // hand a copy of the exception back to the caller
        }

        return exceptionOccured;
    }


    virtual void OnStart()
    {
        // ...

        TException mingwBug_e(EC_OK, 0);
        bool errorOccured = createAsyncHandle_workaroundMinGWBug(resource, &mingwBug_e);

        // ...
    }

Now it catches the exception without problems inside the nested createAsyncHandle_workaroundMinGWBug function. Everything works as expected.


And my first MinGW bug was...
I ran into the other one recently and spent around an hour debugging it. I was getting very random segfaults. Then I narrowed it down to a thread reading an incorrect thread-local value. Let's check out their official site (I used the GCC 4.4 port): "New features since the previous release: ... Thread local storage support: The __thread keyword is honoured."


This is a half-truth. Yes, mingw-4.4, unlike mingw-3.4.5, no longer complains when it sees the __thread keyword. However, it doesn't make the variable thread-local either: it actually remains a shared static variable that all threads write to, corrupting memory and logic.


So I still have to wrap it with WinAPI's Tls* family of functions, as in MinGW 3.4.5.
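Such a wrapper might look like the sketch below. The ThreadLocalSlot class is a hypothetical name, not the one from my framework; the Windows branch uses TlsAlloc/TlsSetValue/TlsGetValue/TlsFree, and a pthread branch is included only so that the same sketch compiles elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
#include <pthread.h>
#endif
#include <stdexcept>

// Hypothetical wrapper: one pointer-sized value per thread.
class ThreadLocalSlot {
public:
    ThreadLocalSlot() {
#ifdef _WIN32
        m_index = TlsAlloc();
        if (m_index == TLS_OUT_OF_INDEXES)
            throw std::runtime_error("TlsAlloc failed");
#else
        if (pthread_key_create(&m_key, 0) != 0)
            throw std::runtime_error("pthread_key_create failed");
#endif
    }

    ~ThreadLocalSlot() {
#ifdef _WIN32
        TlsFree(m_index);
#else
        pthread_key_delete(m_key);
#endif
    }

    // Stores a value visible only to the calling thread.
    void Set(void* value) {
#ifdef _WIN32
        if (!TlsSetValue(m_index, value))
            throw std::runtime_error("TlsSetValue failed");
#else
        if (pthread_setspecific(m_key, value) != 0)
            throw std::runtime_error("pthread_setspecific failed");
#endif
    }

    // Returns the calling thread's value, or a null pointer if it never set one.
    void* Get() const {
#ifdef _WIN32
        return TlsGetValue(m_index);
#else
        return pthread_getspecific(m_key);
#endif
    }

private:
    ThreadLocalSlot(const ThreadLocalSlot&);            // non-copyable
    ThreadLocalSlot& operator=(const ThreadLocalSlot&);

#ifdef _WIN32
    DWORD m_index;
#else
    pthread_key_t m_key;
#endif
};
```

The slot stores only a raw pointer, so the code that sets the value is still responsible for freeing whatever it points to when the thread exits.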


Such cases.

Sunday, June 19, 2011

C++/C# interop

If you have a rooted hierarchy and have to deal with C++/C# interop without using C++/CLI, you may want to use my approach.

1. Remember that C++ pointers are tricky.

C# defines IntPtr, which simply means "native-sized integer". In C++, however, things are more complicated. A pointer to any dynamically allocated object fits into an IntPtr, but for a class with multiple inheritance the pointer value may, depending on the implementation, change with the type it is currently cast to. For example, if a TMyObject* that inherits from both TObject and IInterface is cast to IInterface*, calling TObject's methods on that pointer from the C# side results in undefined behavior, probably a crash. The same goes for classes with virtual functions -- the underlying pointer may differ when cast to different classes in the hierarchy. Calling a method with the wrong pointer will segfault or invoke the wrong method, which is very difficult to debug from C#.

To circumvent this limitation, always cast objects to the root class when exposing pointers to the C# side. Then, when you receive a pointer back, cast it from the root class to the target class, which you know from the context. For example:

    EExceptionCode INTEROP_DECL SceneAddChild(TObject* self, TObject* ent)
    {
        BARRIER_BEGIN
            ((TScene*)self)->AddChild((TEntity*)ent);
        BARRIER_END
    }

This can cost some speed if the downcasts are checked with dynamic_cast (the C-style casts in the snippet above behave like static_cast: they adjust the pointer but do no runtime check), though the managed-to-unmanaged transition is slow anyway. On the other hand, you get a lot of benefits: 1) consistent pointer handling across the managed-unmanaged boundary (fully type-safe if you use dynamic_cast), 2) code that works across compilers and platforms, 3) less glue code (you don't need to write TParentClass_DoIt, TChildClass_DoIt, TGrandChildClass_DoIt method wrappers; TParentClass_DoIt will be sufficient).
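To see why the root-class rule matters, here is a small sketch of the pointer shift under multiple inheritance. The class names come from the text above, but their bodies and the ExportHandle/ImportHandle helpers are hypothetical and exist only for this illustration.

```cpp
// Root class of the hierarchy (body assumed for the sketch).
struct TObject {
    explicit TObject(int id) : id(id) {}
    virtual ~TObject() {}
    int id;
};

// A second, unrelated polymorphic base.
struct IInterface {
    virtual ~IInterface() {}
    virtual int Value() const = 0;
};

struct TMyObject : TObject, IInterface {
    explicit TMyObject(int id) : TObject(id) {}
    virtual int Value() const { return id; }
};

// Hypothetical helper: what the glue layer hands to C# is always the root pointer.
inline void* ExportHandle(TMyObject* obj) {
    return static_cast<TObject*>(obj);
}

// Hypothetical helper: on the way back, go through the root class first, then down.
inline TMyObject* ImportHandle(void* handle) {
    return static_cast<TMyObject*>(static_cast<TObject*>(handle));
}
```

With common compilers the IInterface subobject sits at a non-zero offset inside TMyObject, so static_cast<IInterface*>(obj) yields a different address than the root pointer. Interpreting that address as a TObject* is exactly the mistake the rule prevents.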

2. Don't let managed exceptions propagate through native code and vice versa.

It usually ends badly. C++ exceptions know nothing about the C# exception handling mechanism (unless SEH is used, but you never know); they will skip all managed frames altogether and just crash the whole app.

Ignoring them is bad too, so we need to make the two sides communicate somehow. For the C++/glue side, I implemented two simple macros:

    #define BARRIER_BEGIN EExceptionCode e_code = EC_OK; try {
    #define BARRIER_END } catch(TException& e) { laste = e; e_code = e.Code(); } return e_code;

Surround all glue code with them, even if you're sure it won't throw (to be safe and to keep the API consistent). You have to change your signatures so that exception codes are returned via the usual return mechanism, while actual results, if any, are returned via a pointer passed as the last argument. laste in the macro is a thread-local variable that holds the exception object we caught, so that we can refer to it later from the C# side.
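Put together, the native side might look like the sketch below. Only the two macros are taken from the post as-is; the TException body, the enum values, the CounterDecrement function, and the native signature of GetLastExceptionMessage are assumptions made for the illustration, and C++11 thread_local is assumed to work on the toolchain (on MinGW, per the previous post, a Tls*-based slot would be needed instead).

```cpp
#include <string>

// Assumed subset of the exception codes; the numeric values are made up.
enum EExceptionCode { EC_OK = 0, EC_INVALID_STATE = 1 };

// Assumed minimal shape of the framework's exception class.
class TException {
public:
    TException(EExceptionCode code, const char* msg)
        : m_code(code), m_msg(msg ? msg : "") {}
    EExceptionCode Code() const { return m_code; }
    const char* Message() const { return m_msg.c_str(); }
private:
    EExceptionCode m_code;
    std::string m_msg;
};

// Last exception caught at the barrier, one per thread.
static thread_local TException laste(EC_OK, "");

#define BARRIER_BEGIN EExceptionCode e_code = EC_OK; try {
#define BARRIER_END } catch(TException& e) { laste = e; e_code = e.Code(); } return e_code;

// Hypothetical glue function: the error code is the return value,
// the actual result goes through the pointer argument.
extern "C" EExceptionCode CounterDecrement(int* counter)
{
    BARRIER_BEGIN
        if (*counter <= 0)
            throw TException(EC_INVALID_STATE, "Counter is already zero.");
        --*counter;
    BARRIER_END
}

// Lets the managed side fetch the message of the last exception on this thread.
extern "C" const char* GetLastExceptionMessage()
{
    return laste.Message();
}
```

Note that the barrier, as written in the post, catches only TException. Anything else (std::bad_alloc, for instance) would still escape into managed code, so adding a catch(...) branch that maps to a generic error code is worth considering.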

Then the C# side would have something similar to this:
    // Maps a native error code to the closest CLR exception type.
    static Exception ConvertNativeExceptionToCLRException(EExceptionCode code, string msg)
    {
        switch(code)
        {
            case EExceptionCode.EC_OK:
                return null;

            case EExceptionCode.EC_NOT_IMPLEMENTED:
                return new NotImplementedException(msg);

            case EExceptionCode.EC_INVALID_STATE:
                return new InvalidOperationException(msg);

            case EExceptionCode.EC_OUT_OF_RANGE:
                // The single-string constructor takes a parameter name, not a message.
                return new ArgumentOutOfRangeException(null, msg);

            case EExceptionCode.EC_FILE_NOT_FOUND:
                return new System.IO.FileNotFoundException(msg);

            case EExceptionCode.EC_PLATFORM_DEPENDENT:
                return new PlatformDependentException(msg);

            case EExceptionCode.EC_MALFORMED_STRING:
                return new MalformedStringException();

            case EExceptionCode.EC_CUSTOM:
                return new CustomException();

            case EExceptionCode.EC_ILLEGAL_ARGUMENT:
                return new ArgumentException(msg);

            default:
                return new Exception(String.Format("Unknown exception '{0}'.", msg));
        }
    }

    // Throws the CLR exception that corresponds to a native error code, if any.
    static void ThrowFor(EExceptionCode e)
    {
        string msg = "";

        if(e != EExceptionCode.EC_OK)
        {
            IntPtr lastMsg = GetLastExceptionMessage();
            if(lastMsg != IntPtr.Zero)
                msg = Marshal.PtrToStringAnsi(lastMsg);
        }

        Exception clrException = ConvertNativeExceptionToCLRException(e, msg);
        if(clrException != null)
            throw clrException;
    }


And then a usual P/Invoke call should have the following pattern:

    IntPtr canvshndl;
    Native.ThrowFor(Native.CanvasCreate(out canvshndl));

This will seamlessly integrate C++ and .NET exceptions.

Another caveat is managed code that throws an exception inside a marshaled callback while it is being called from native code. This is bad too. A managed exception knows nothing about native frames and C++ destructors; it will just skip them all until it reaches a managed frame, leaving us with memory leaks. On the C# side, there's no other way than to either swallow exceptions with an empty try...catch block, or just log the error, or, if you have some sort of queue, postpone the exception handling (probably by rethrowing the exception) until later, outside native code.

3. Don't forget compilers are different.

Explicitly specify calling conventions. Use your own explicit types instead of bools and enums in the glue code interface (for example, `typedef int` for both cases). Different compilers (sometimes even minor versions of the same compiler) implement these types differently, while C# expects something definite.
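A minimal sketch of what this can look like on the native side. The definition of INTEROP_DECL shown here is an assumption (the macro appears in the post, but its definition does not), and TInteropBool, TInteropEnum, and EntityIsVisible are hypothetical names.

```cpp
#include <stdint.h>

// Assumed definition: pin the calling convention on Windows, do nothing elsewhere.
#if defined(_WIN32)
#define INTEROP_DECL __cdecl
#else
#define INTEROP_DECL
#endif

// Fixed-width stand-ins for bool and enum in the glue interface.
typedef int32_t TInteropBool;  // 0 = false, anything else = true
typedef int32_t TInteropEnum;  // carries enum values as plain 32-bit integers

static_assert(sizeof(TInteropBool) == 4, "C# side declares this as a 32-bit int");
static_assert(sizeof(TInteropEnum) == 4, "C# side declares this as a 32-bit int");

// Hypothetical exported function. A matching C# declaration would be roughly:
//   [DllImport("engine", CallingConvention = CallingConvention.Cdecl)]
//   static extern int EntityIsVisible(int flags);
// where "engine" is a made-up library name.
extern "C" TInteropBool INTEROP_DECL EntityIsVisible(int32_t flags)
{
    return (flags & 1) ? 1 : 0;
}
```

With this layout, both sides agree on a 32-bit integer no matter how a given compiler sizes bool or the underlying type of an enum.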