.NET Internals and Code Injection

Introduction
What is .NET Code Injection?
How does .NET Code Injection work?
.NET Internals (Part 1: JIT)
The .NET Assembly Loader
JIT Hooking Example
The .NET Code Ejector
Code Ejection Demonstration
.NET Internals (Part 2: MethodDesc)
.NET Internals (Part 3: IEE, Internal Calls, etc.)
Other Injection/Ejection Approaches
Conclusions

Introduction

This article is the natural culmination of the earlier work on the Rebel.NET application and the first in a series of two articles about the .NET framework internals and the protections available for .NET assemblies. The next article will be about .NET native compiling. Since the inner workings of the JIT haven't been analyzed yet, .NET protections are still quite naive. This will change rapidly as soon as the reverse engineering community focuses its attention on this technology. These two articles aim to raise awareness of the current state of .NET protections and of what is possible but hasn't been done yet. In particular, the current article about .NET code injection represents, let's say, the present, whereas the next one about .NET native compiling represents the future. What I'm presenting in these two articles is new at the time of writing, but I expect it to become obsolete in less than a year. That is to be expected, as I'm taking the first steps away from current .NET protections toward better ones. But this article isn't really about protections: exploring the .NET framework internals can be useful for many purposes. So, talking about protections is just a means to an end.

What is .NET Code Injection?

.NET code injection is the "strong" brother of .NET packers (which unpack the entire assembly in memory). What .NET code injectors do is hook the JIT: when the MSIL code of a method is requested, they intercept the request and provide the real MSIL instead of the MSIL contained in the assembly, which, most of the time, is just a ret. By injecting one method (or nearly one) at a time, the MSIL code remains concealed. Even if one manages to dump the code, one cannot expect the protection to have left the necessary space for the real MSIL code in the .NET assembly, although many commercial protections do so. Rebuilding the assembly from scratch is the universally valid way to proceed. This, of course, is not a problem with Rebel.NET.

It should be obvious to the reader that .NET code injectors aren't reliable protections. They just play hide and seek with the reverser. But, as many software producers are putting their intellectual property in the hands of such protections, it is necessary to analyze them thoroughly.

How does .NET Code Injection work?

One thing should be clear from the beginning: there isn't only one method to inject MSIL. Thus, to remove this kind of protection you have to evaluate the specific case.

A very clean approach, though not yet used, would be to inject the MSIL through the .NET profiling API. There's a very in-depth article about the .NET profiling API by Aleksandr Mikunov on MSDN. Anyway, as I already said, this approach isn't used by .NET protections. I call this approach clean simply because it uses the API provided by the framework itself. Thus, it'll work on every .NET framework no matter what. .NET protections, by contrast, usually hook the JIT, and although this may work just as well, it is not guaranteed to.

The .NET framework's JIT is contained in the mscorjit.dll module. To identify the part of the JIT being hooked by the .NET protection there's a very simple and effective way: dumping the mscorjit.dll module from the protection's process and comparing it to the original module on disk. I wrote a little CFF Explorer Script that compares a PE section while excluding the IAT from the comparison. This is especially useful when comparing the .text section of two executables.

-Download the Section Comparer script

It was a ten-minute job and it is extremely useful for identifying the type of hook applied to the JIT. Let's look at a possible output of this script:

Comparision between section 0 of
C:\...\mscorjit.dll
and
C:\...\mscorjit_dumped.dll

Differences found at:

RVA1     RVA2

000460A0 000460A0
000460A1 000460A1
000460A2 000460A2
000460A3 000460A3

Number of differences found: 4
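
The script itself is written for CFF Explorer, but the comparison it performs can be sketched in a few lines of C++. This is only an illustration under assumptions: the two section images are taken to be already loaded into byte buffers of the same section, and the names diffSection and RvaRange are hypothetical, not part of the script.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical half-open RVA range [begin, end) to skip, e.g. the IAT.
struct RvaRange {
    std::uint32_t begin;
    std::uint32_t end;
};

// Returns the RVAs at which the two section images differ.
// sectionRva is the RVA of the first byte of both buffers.
std::vector<std::uint32_t> diffSection(const std::vector<std::uint8_t>& original,
                                       const std::vector<std::uint8_t>& dumped,
                                       std::uint32_t sectionRva,
                                       RvaRange skip)
{
    assert(skip.begin <= skip.end);
    std::vector<std::uint32_t> diffs;
    const std::size_t n = original.size() < dumped.size() ? original.size()
                                                          : dumped.size();
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint32_t rva = sectionRva + static_cast<std::uint32_t>(i);
        if (rva >= skip.begin && rva < skip.end)
            continue;                 // the loader rewrites the IAT, so ignore it
        if (original[i] != dumped[i])
            diffs.push_back(rva);
    }
    return diffs;
}

// Prints the result in the same two-column shape as the listing above.
void printDiffs(const std::vector<std::uint32_t>& diffs)
{
    std::printf("RVA1     RVA2\n\n");
    for (std::uint32_t rva : diffs)
        std::printf("%08X %08X\n", static_cast<unsigned>(rva),
                    static_cast<unsigned>(rva));
    std::printf("\nNumber of differences found: %zu\n", diffs.size());
}
```

Four consecutive differing bytes, as in the output above, are the typical signature of a single patched 32-bit pointer.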

By looking at the patched dword through a disassembler, it is possible to understand the kind of hook:

.text:790A60A0 ??_7CILJit@@6B@ dd offset ?compileMethod@CILJit@@EAG?AW4CorJitResult@@PAVICorJitInfo@@PAUCORINFO_METHOD_INFO@@IPAPAEPAK@Z
.text:790A60A0      ; DATA XREF: getJit()+Eo
.text:790A60A0      ; CILJit::compileMethod(ICorJitInfo *,CORINFO_METHOD_INFO *,uint,uchar * *,ulong *)
.text:790A60A4      dd offset ?clearCache@CILJit@@EAGXXZ ; CILJit::clearCache(void)
.text:790A60A8      dd offset ?isCacheCleanupRequired@CILJit@@EAGHXZ ; CILJit::isCacheCleanupRequired(void)

The patched dword is the address of the compileMethod method of the CILJit class, stored as the first entry of the class's vftable. This brings us to the next section.

.NET Internals (Part 1: JIT)

In this section I'll present the compileMethod function as a way to gain complete control of the JIT. Code injectors don't need that much control, and what they do is actually very simple.

As mscorjit's debug symbols already indicated, the compileMethod function takes a pointer to the CORINFO_METHOD_INFO structure among other parameters. To explore the JIT internals in depth I'll rely upon the Microsoft Rotor Project, which basically is a smaller open source version of the .NET framework. I think almost everyone knows of the existence of this project, but only a limited number of people know how much one can take from its internals and use in the context of the official framework. And I'm not talking about those who create code injectors, because they don't need that much knowledge of the JIT's inner workings to achieve what they do.

Let's take a look at the CORINFO_METHOD_INFO structure:

struct CORINFO_METHOD_INFO
{
    CORINFO_METHOD_HANDLE       ftn;
    CORINFO_MODULE_HANDLE       scope;
    BYTE *                      ILCode;
    unsigned                    ILCodeSize;
    unsigned short              maxStack;
    unsigned short              EHcount;
    CorInfoOptions              options;
    CORINFO_SIG_INFO            args;
    CORINFO_SIG_INFO            locals;
};

The only thing code injectors have to do is provide a valid MSIL pointer and size through two members of this structure: ILCode and ILCodeSize. Code injectors rely on the ILCode pointer to know which method is being requested. In fact, this pointer addresses the original MSIL code inside the .NET assembly. Many code injectors don't even need to know which method is being requested, as ILCode points to data which only needs to be decrypted.
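
As a rough sketch of that substitution step (not the code of any actual protection), the following C++ uses a reduced stand-in for CORINFO_METHOD_INFO and a hypothetical table of real method bodies keyed by the address of the placeholder IL. The names MethodInfoSketch, RealBodies and substituteIL are assumptions made for illustration. A real hook would perform this swap and then forward the call to the original compileMethod.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Reduced stand-in for CORINFO_METHOD_INFO: only the two members that matter here.
struct MethodInfoSketch {
    std::uint8_t* ILCode;
    unsigned      ILCodeSize;
};

// Hypothetical store of the real (already decrypted) method bodies, keyed by
// the address of the placeholder IL inside the mapped assembly.
using RealBodies =
    std::unordered_map<const std::uint8_t*, std::vector<std::uint8_t>>;

// Swaps in the real IL if the placeholder address is known.
// Returns true when a substitution took place.
bool substituteIL(MethodInfoSketch& info, RealBodies& bodies)
{
    assert(info.ILCode != nullptr);
    auto it = bodies.find(info.ILCode);
    if (it == bodies.end())
        return false;                 // not a protected method: leave it untouched
    info.ILCode     = it->second.data();
    info.ILCodeSize = static_cast<unsigned>(it->second.size());
    return true;
}
```

Keying the table on the ILCode address mirrors the observation above that the pointer alone is enough to tell which method is being requested.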

The pointer to the vftable which contains the address to compileMethod is retrieved through the only API exported by mscorjit: getJit.

extern "C"
ICorJitCompiler* __stdcall getJit()
{
    static char FJitBuff[sizeof(FJitCompiler)];
    if (ILJitter == 0)
    {
        // no need to check for out of memory, since caller checks for return value of NULL
        ILJitter = new(FJitBuff) FJitCompiler();
        _ASSERTE(ILJitter != NULL);
    }
    return(ILJitter);
}
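
To make the hooking step concrete, here is a portable simulation of replacing the first vftable entry of the object returned by getJit. It is a sketch under assumptions: the object layout is modelled by hand with a plain table of function pointers, the compileMethod signature is reduced to three parameters, and the names JitObjectSketch, hookedCompile and installHook are hypothetical. In the real module the table sits in a read-only page, so page protection would have to be changed (for example with VirtualProtect on Windows) before writing to it.

```cpp
#include <cassert>
#include <cstdint>

// Simplified compileMethod-like signature (hypothetical; the real one takes
// five parameters plus the implicit this pointer).
using CompileFn = int (*)(void* self, const std::uint8_t* il, unsigned ilSize);

// Hand-built object layout: the first pointer-sized field addresses a table
// of function pointers, mimicking an object with virtual methods.
struct JitObjectSketch {
    CompileFn* vftable;
};

static CompileFn g_originalCompile = nullptr;  // saved original slot 0
static int       g_hookCalls       = 0;        // how often the hook ran

// The replacement: inspect or alter the arguments, then forward to the original.
int hookedCompile(void* self, const std::uint8_t* il, unsigned ilSize)
{
    ++g_hookCalls;
    return g_originalCompile(self, il, ilSize);
}

// Replaces slot 0 of the table and returns the previous entry.
CompileFn installHook(JitObjectSketch* jit, CompileFn replacement)
{
    assert(jit != nullptr && jit->vftable != nullptr);
    CompileFn previous = jit->vftable[0];
    jit->vftable[0] = replacement;    // the real table must be made writable first
    return previous;
}
```

A caller would store the return value of installHook in g_originalCompile, so that the hook can always forward to the genuine JIT.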

And this is about all that code injectors need to know to do their job. But we'll go further. The FJitCompiler class looks like this:

class FJitCompiler : public ICorJitCompiler
{
public:

    /* the jitting function */
    CorJitResult __stdcall compileMethod (
            ICorJitInfo*            comp,               /* IN */
            CORINFO_METHOD_INFO*    info,               /* IN */
            unsigned                flags,              /* IN */
            BYTE **                 nativeEntry,        /* OUT */
            ULONG  *                nativeSizeOfCode    /* OUT */
            );
   
    /* notification from VM to clear caches */
    void __stdcall clearCache();
    BOOL __stdcall isCacheCleanupRequired();

    static BOOL Init();
    static void Terminate();

private:
    /* grab and remember the jitInterface helper addresses that we need at runtime */
    BOOL GetJitHelpers(ICorJitInfo* jitInfo);
};

ICorJitCompiler is only an interface, so we don't have to discuss it. compileMethod is the first member of the class, of course. The idea that I could gain complete control of the JIT hit me pretty fast. It's a bit difficult to explain in words, but in about ten minutes I noticed that the correspondences between the disassembled mscorjit code and the Rotor code were just too many. So, I decided to include the header files necessary to use the ICorJitInfo class from the Rotor project. Actually, only two header files were necessary to use this class: corinfo.h and corjit.h. All include files can be found in the path "sscli20\clr\src\inc" of the Rotor project, while the JIT code is in "sscli20\clr\src\fjit". Here's a brief summary of what the main files in the JIT path contain:

fjit.cpp The actual JIT. It's a huge file of code since it contains all the code to convert MSIL to native code, although the native code is defined externally.
fjitcompiler.cpp Contains the getJit and compileMethod functions. It's the interface provided by mscorjit to communicate with the JIT.

Basically, the ICorJitCompiler is the interface accessed by the Execution Engine to convert MSIL to native code. One of the arguments of compileMethod is ICorJitInfo which is a class used by the JIT to call back to the Execution Engine in order to retrieve the information it needs. Having complete access to the ICorJitInfo class doesn't open up the whole framework for us. But it's a very good start. Let's have a look at what this class can do. Here's the base declaration:

/*********************************************************************************
* a ICorJitInfo is the main interface that the JIT uses to call back to the EE and
* get information
*********************************************************************************/

class ICorJitInfo : public virtual ICorDynamicInfo
{
public:
    // return memory manager that the JIT can use to allocate a regular memory
    virtual IEEMemoryManager* __stdcall getMemoryManager() = 0;

    // get a block of memory for the code, readonly data, and read-write data
    virtual void __stdcall allocMem (
            ULONG               hotCodeSize,    /* IN */
            ULONG               coldCodeSize,   /* IN */
            ULONG               roDataSize,     /* IN */
            ULONG               rwDataSize,     /* IN */
            ULONG               xcptnsCount,    /* IN */
            CorJitAllocMemFlag  flag,           /* IN */
            void **             hotCodeBlock,   /* OUT */
            void **             coldCodeBlock,  /* OUT */
            void **             roDataBlock,    /* OUT */
            void **             rwDataBlock     /* OUT */
            ) = 0;


        // Get a block of memory needed for the code manager information,
        // (the info for enumerating the GC pointers while crawling the
        // stack frame).
        // Note that allocMem must be called first
    virtual void * __stdcall allocGCInfo (
            ULONG                    size        /* IN */
            ) = 0;

    virtual void * __stdcall getEHInfo(
            ) = 0;

    virtual void __stdcall yieldExecution() = 0;

   // indicate how many exception handlers blocks are to be returned
   // this is guarenteed to be called before any 'setEHinfo' call.
   // Note that allocMem must be called before this method can be called
    virtual void __stdcall setEHcount (
            unsigned           cEH    /* IN */
            ) = 0;

    // set the values for one particular exception handler block
    //
    // Handler regions should be lexically contiguous.
    // This is because FinallyIsUnwinding() uses lexicality to
    // determine if a "finally" clause is executing
    virtual void __stdcall setEHinfo (
            unsigned           EHnumber,   /* IN */
            const CORINFO_EH_CLAUSE *clause      /* IN */
            ) = 0;

    // Level 1 -> fatalError, Level 2 -> Error, Level 3 -> Warning
    // Level 4 means happens 10 times in a run, level 5 means 100, level 6 means 1000 ...
    // returns non-zero if the logging succeeded
    virtual BOOL __cdecl logMsg(unsigned level, const char* fmt, va_list args) = 0;

    // do an assert. will return true if the code should retry (DebugBreak)
    // returns false, if the assert should be igored.
    virtual int __stdcall doAssert(const char* szFile, int iLine, const char* szExpr) = 0;

    struct ProfileBuffer
    {
        ULONG bbOffset;
        ULONG bbCount;
    };

    // allocate a basic block profile buffer where execution counts will be stored
    // for jitted basic blocks.
    virtual HRESULT __stdcall allocBBProfileBuffer (
            ULONG                 size,
            ProfileBuffer **      profileBuffer
            ) = 0;

    // get profile information to be used for optimizing the current method. The format
    // of the buffer is the same as the format the JIT passes to allocBBProfileBuffer.
    virtual HRESULT __stdcall getBBProfileData(
            CORINFO_METHOD_HANDLE ftnHnd,
            ULONG *               size,
            ProfileBuffer **      profileBuffer,
            ULONG *               numRuns
            ) = 0;


};

This doesn't seem like much, but the ICorJitInfo class inherits from ICorDynamicInfo:

/*****************************************************************************
* ICorDynamicInfo contains EE interface methods which return values that may
* change from invocation to invocation. They cannot be embedded in persisted
* data; they must be requeried each time the EE is run.
*****************************************************************************/


class ICorDynamicInfo : public virtual ICorStaticInfo
{
public:

    //
    // These methods return values to the JIT which are not constant
    // from session to session.
    //
    // These methods take an extra parameter : void **ppIndirection.
    // If a JIT supports generation of prejit code (install-o-jit), it
    // must pass a non-null value for this parameter, and check the
    // resulting value. If *ppIndirection is NULL, code should be
    // generated normally. If non-null, then the value of
    // *ppIndirection is an address in the cookie table, and the code
    // generator needs to generate an indirection through the table to
    // get the resulting value. In this case, the return result of the
    // function must NOT be directly embedded in the generated code.
    //
    // Note that if a JIT does not support prejit code generation, it
    // may ignore the extra parameter & pass the default of NULL - the
    // prejit ICorDynamicInfo implementation will see this & generate
    // an error if the jitter is used in a prejit scenario.
    //

    // Return details about EE internal data structures

    virtual DWORD __stdcall getThreadTLSIndex(
                    void                  **ppIndirection = NULL
                    ) = 0;

    virtual const void * __stdcall getInlinedCallFrameVptr(
                    void                  **ppIndirection = NULL
                    ) = 0;

    virtual LONG * __stdcall getAddrOfCaptureThreadGlobal(
                    void                  **ppIndirection = NULL
                    ) = 0;

    virtual SIZE_T* __stdcall       getAddrModuleDomainID(CORINFO_MODULE_HANDLE   module) = 0;

    // return the native entry point to an EE helper (see CorInfoHelpFunc)
    virtual void* __stdcall getHelperFtn (
                    CorInfoHelpFunc         ftnNum,
                    void                  **ppIndirection = NULL,
                    InfoAccessModule       *pAccessModule = NULL
                    ) = 0;

    // return a callable address of the function (native code). This function
    // may return a different value (depending on whether the method has
    // been JITed or not. pAccessType is an in-out parameter. The JIT
    // specifies what level of indirection it desires, and the EE sets it
    // to what it can provide (which may not be the same).
    virtual void __stdcall getFunctionEntryPoint(
                              CORINFO_METHOD_HANDLE   ftn,                 /* IN */
                              InfoAccessType          requestedAccessType, /* IN */
                              CORINFO_CONST_LOOKUP *  pResult,             /* OUT */
                              CORINFO_ACCESS_FLAGS    accessFlags = CORINFO_ACCESS_ANY) = 0;

    // return a directly callable address. This can be used similarly to the
    // value returned by getFunctionEntryPoint() except that it is
    // guaranteed to be the same for a given function.
    // pAccessType is an in-out parameter. The JIT
    // specifies what level of indirection it desires, and the EE sets it
    // to what it can provide (which may not be the same).
    virtual void __stdcall getFunctionFixedEntryPointInfo(
                              CORINFO_MODULE_HANDLE  scopeHnd,
                              unsigned               metaTOK,
                              CORINFO_CONTEXT_HANDLE context,
                              CORINFO_LOOKUP *       pResult) = 0;

    // get the syncronization handle that is passed to monXstatic function
    virtual void* __stdcall getMethodSync(
                    CORINFO_METHOD_HANDLE               ftn,
                    void                  **ppIndirection = NULL
                    ) = 0;

    // These entry points must be called if a handle is being embedded in
    // the code to be passed to a JIT helper function. (as opposed to just
    // being passed back into the ICorInfo interface.)

    // a module handle may not always be available. A call to embedModuleHandle should always
    // be preceeded by a call to canEmbedModuleHandleForHelper. A dynamicMethod does not have a module
    virtual bool __stdcall canEmbedModuleHandleForHelper(
                    CORINFO_MODULE_HANDLE   handle
                    ) = 0;

    virtual CORINFO_MODULE_HANDLE __stdcall embedModuleHandle(
                    CORINFO_MODULE_HANDLE   handle,
                    void                  **ppIndirection = NULL
                    ) = 0;

    virtual CORINFO_CLASS_HANDLE __stdcall embedClassHandle(
                    CORINFO_CLASS_HANDLE    handle,
                    void                  **ppIndirection = NULL
                    ) = 0;

    virtual CORINFO_METHOD_HANDLE __stdcall embedMethodHandle(
                    CORINFO_METHOD_HANDLE   handle,
                    void                  **ppIndirection = NULL
                    ) = 0;

    virtual CORINFO_FIELD_HANDLE __stdcall embedFieldHandle(
                    CORINFO_FIELD_HANDLE    handle,
                    void                  **ppIndirection = NULL
                    ) = 0;

    // Given a module scope (module), a method handle (context) and
    // a metadata token (metaTOK), fetch the handle
    // (type, field or method) associated with the token.
    // If this is not possible at compile-time (because the current method's
    // code is shared and the token contains generic parameters)
    // then indicate how the handle should be looked up at run-time.
    //
    // Type tokens can be combined with CORINFO_ANNOT_MASK flags
    // to obtain array type handles. These are typically required by the 'newarr'
    // instruction which takes a token for the *element* type of the array.
    //
    // Similarly method tokens can be combined with CORINFO_ANNOT_MASK flags
    // method entry points. These are typically required by the 'call' and 'ldftn'
    // instructions.
    //
    // Byrefs or System.Void should only occur in method and local signatures, which
    // are accessed using ICorClassInfo and ICorClassInfo.getChildType. ldtoken is one
    // exception from this rule. allowAllTypes should be set to true only for ldtoken only!
    //
    virtual void __stdcall embedGenericHandle(
                        CORINFO_MODULE_HANDLE   module,
                        unsigned                metaTOK,
                        CORINFO_CONTEXT_HANDLE  context,
                        CorInfoTokenKind        tokenKind,
                        CORINFO_GENERICHANDLE_RESULT *pResult) = 0;

    // Return information used to locate the exact enclosing type of the current method.
    // Used only to invoke .cctor method from code shared across generic instantiations
    // !needsRuntimeLookup statically known (enclosing type of method itself)
    // needsRuntimeLookup:
    // CORINFO_LOOKUP_THISOBJ use vtable pointer of 'this' param
    // CORINFO_LOOKUP_CLASSPARAM use vtable hidden param
    // CORINFO_LOOKUP_METHODPARAM use enclosing type of method-desc hidden param
    virtual CORINFO_LOOKUP_KIND __stdcall getLocationOfThisType(
                    CORINFO_METHOD_HANDLE context
                    ) = 0;

    // return the unmanaged target *if method has already been prelinked.*
    virtual void* __stdcall getPInvokeUnmanagedTarget(
                    CORINFO_METHOD_HANDLE   method,
                    void                  **ppIndirection = NULL
                    ) = 0;

    // return address of fixup area for late-bound PInvoke calls.
    virtual void* __stdcall getAddressOfPInvokeFixup(
                    CORINFO_METHOD_HANDLE   method,
                    void                  **ppIndirection = NULL
                    ) = 0;

    // Generate a cookie based on the signature that would needs to be passed
    // to CORINFO_HELP_PINVOKE_CALLI
    virtual LPVOID GetCookieForPInvokeCalliSig(
            CORINFO_SIG_INFO* szMetaSig,
            void           ** ppIndirection = NULL
            ) = 0;

    // Gets a handle that is checked to see if the current method is
    // included in "JustMyCode"
    virtual CORINFO_JUST_MY_CODE_HANDLE __stdcall getJustMyCodeHandle(
                    CORINFO_METHOD_HANDLE       method,
                    CORINFO_JUST_MY_CODE_HANDLE**ppIndirection = NULL
                    ) = 0;

    // Gets a method handle that can be used to correlate profiling data.
    // This is the IP of a native method, or the address of the descriptor struct
    // for IL. Always guaranteed to be unique per process, and not to move. */
    virtual void __stdcall GetProfilingHandle(
                    CORINFO_METHOD_HANDLE      method,
                    BOOL                      *pbHookFunction,
                    void                     **pEEHandle,
                    void                     **pProfilerHandle,
                    BOOL                      *pbIndirectedHandles
                    ) = 0;

    // returns the offset into the interface table
    virtual unsigned __stdcall getInterfaceTableOffset (
                    CORINFO_CLASS_HANDLE    cls,
                    void                  **ppIndirection = NULL
                    ) = 0;

    //return the address of a pointer to a callable stub that will do the virtual or interface call
    //
    // When inlining methodBeingCompiledHnd should be the originating caller in a sequence of nested
    // inlines, e.g. it is used to determine if the code being generated is domain
    // neutral or not.

    virtual void __stdcall getCallInfo(
                        CORINFO_METHOD_HANDLE   methodBeingCompiledHnd,
                        CORINFO_MODULE_HANDLE   tokenScope,
                        unsigned                methodToken,
                        unsigned                constraintToken, // the type token from a preceding constraint.
                                        // prefix instruction (if any)
                        CORINFO_CONTEXT_HANDLE  tokenContext,
                        CORINFO_CALLINFO_FLAGS  flags,
                        CORINFO_CALL_INFO *pResult) = 0;

    // Returns TRUE if the Class Domain ID is the RID of the class (currently true for every class
    // except reflection emitted classes and generics)
    virtual BOOL __stdcall isRIDClassDomainID(CORINFO_CLASS_HANDLE cls) = 0;

    // returns the class's domain ID for accessing shared statics
    virtual unsigned __stdcall getClassDomainID (
                    CORINFO_CLASS_HANDLE    cls,
                    void                  **ppIndirection = NULL
                    ) = 0;


    virtual size_t __stdcall getModuleDomainID  (
                    CORINFO_MODULE_HANDLE    module,
                    void                  **ppIndirection = NULL
                    ) = 0;

    // return the data's address (for static fields only)
    virtual void* __stdcall getFieldAddress(
                    CORINFO_FIELD_HANDLE    field,
                    void                  **ppIndirection = NULL
                    ) = 0;

    // registers a vararg sig & returns a VM cookie for it (which can contain other stuff)
    virtual CORINFO_VARARGS_HANDLE __stdcall getVarArgsHandle(
                    CORINFO_SIG_INFO       *pSig,
                    void                  **ppIndirection = NULL
                    ) = 0;

    // Allocate a string literal on the heap and return a handle to it
    virtual InfoAccessType __stdcall constructStringLiteral(
                    CORINFO_MODULE_HANDLE   module,
                    mdToken                 metaTok,
                    void                  **ppInfo
                    ) = 0;

    // (static fields only) given that 'field' refers to thread local store,
    // return the ID (TLS index), which is used to find the begining of the
    // TLS data area for the particular DLL 'field' is associated with.
    virtual DWORD __stdcall getFieldThreadLocalStoreID (
                    CORINFO_FIELD_HANDLE    field,
                    void                  **ppIndirection = NULL
                    ) = 0;

    // returns the class typedesc given a methodTok (needed for arrays since
    // they share a common method table, so we can't use getMethodClass)
    virtual CORINFO_CLASS_HANDLE __stdcall findMethodClass(
                    CORINFO_MODULE_HANDLE   module,
                    mdToken                 methodTok,
                    CORINFO_METHOD_HANDLE   context
                    ) = 0;

    // Sets another object to intercept calls to "self"
    virtual void __stdcall setOverride(
                ICorDynamicInfo             *pOverride
                ) = 0;

    // Adds an active dependency from the context method's module to the given module
    virtual void __stdcall addActiveDependency(
               CORINFO_MODULE_HANDLE       moduleFrom,
               CORINFO_MODULE_HANDLE       moduleTo
                ) = 0;

    virtual CORINFO_METHOD_HANDLE __stdcall GetDelegateCtor(
            CORINFO_METHOD_HANDLE  methHnd,
            CORINFO_CLASS_HANDLE   clsHnd,
            CORINFO_METHOD_HANDLE  targetMethodHnd,
            DelegateCtorArgs *     pCtorData
            ) = 0;

    virtual void __stdcall MethodCompileComplete(
                CORINFO_METHOD_HANDLE methHnd
                ) = 0;
};

Now this already seems much more interesting. But we aren't done yet, as the ICorDynamicInfo class inherits from ICorStaticInfo, which in turn inherits from many classes:

class ICorStaticInfo : public virtual ICorMethodInfo, public virtual ICorModuleInfo,
                       public virtual ICorClassInfo,  public virtual ICorFieldInfo,
                       public virtual ICorDebugInfo,  public virtual ICorArgInfo,
                       public virtual ICorLinkInfo,   public virtual ICorErrorInfo
{
public:
    // Return details about EE internal data structures
    virtual void __stdcall getEEInfo(
                CORINFO_EE_INFO            *pEEInfoOut
                ) = 0;
};
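
The practical consequence of this inheritance chain is that the single ICorJitInfo pointer passed to compileMethod reaches every method declared in any of the base interfaces. The following self-contained mock illustrates the idea. It is only a sketch: the *Mock types are heavily reduced, handles are replaced by plain ints, and FakeEE is a hypothetical stand-in for the Execution Engine's implementation. The method names are borrowed from the Rotor declarations, with getMethodName being one of the ICorMethodInfo methods.

```cpp
#include <cassert>
#include <string>

// Heavily reduced mocks of the interface chain (handles simplified to int).
struct MethodInfoMock {
    virtual const char* getMethodName(int methodHandle) = 0;
    virtual ~MethodInfoMock() = default;
};
struct StaticInfoMock : public virtual MethodInfoMock {
    virtual void getEEInfo(int* pEEInfoOut) = 0;
};
struct DynamicInfoMock : public virtual StaticInfoMock {
    virtual void* getFieldAddress(int fieldHandle) = 0;
};
struct JitInfoMock : public virtual DynamicInfoMock {
    virtual void yieldExecution() = 0;
};

// Hypothetical stand-in for the Execution Engine side of the interface.
struct FakeEE final : JitInfoMock {
    int yields = 0;
    const char* getMethodName(int h) override { return h == 1 ? "Main" : "Unknown"; }
    void getEEInfo(int* out) override { *out = 0; }
    void* getFieldAddress(int) override { return nullptr; }
    void yieldExecution() override { ++yields; }
};

// What a hook could do with the one pointer it receives: call a method that
// is declared three levels up the hierarchy.
std::string describe(JitInfoMock* comp, int methodHandle)
{
    assert(comp != nullptr);
    return std::string("compiling ") + comp->getMethodName(methodHandle);
}
```

Because the bases are inherited virtually, the compiler resolves the call through the most-derived object, so no casts are needed on the caller's side.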

Let's look at just one of them (ICorMethodInfo):

class ICorMethodInfo
{
public:
    // this function is for debugging only. It returns the method name
    // and if 'moduleName' is non-null, it sets it to something that will
    // says which method (a class name, or a module name)
    virtual const char* __stdcall getMethodName (
            CORINFO_METHOD_HANDLE       ftn,        /* IN */
            const char                **moduleName  /* OUT */
            ) = 0;

    // this function is for debugging only. It returns a value that
    // is will always be the same for a given method. It is used
    // to implement the 'jitRange' functionality
    virtual unsigned __stdcall getMethodHash (
            CORINFO_METHOD_HANDLE       ftn         /* IN */
            ) = 0;

    // return flags (defined above, CORINFO_FLG_PUBLIC ...)
    // The callerHnd can be either the methodBeingCompiled or the immediate
    // caller of an inlined function.
    virtual DWORD __stdcall getMethodAttribs (
            CORINFO_METHOD_HANDLE       calleeHnd,        /* IN */
            CORINFO_METHOD_HANDLE       callerHnd     /* IN */
            ) = 0;

    // sets private JIT flags, which can be, retrieved using getAttrib.
    virtual void __stdcall setMethodAttribs (
            CORINFO_METHOD_HANDLE       ftn,        /* IN */
            CorInfoMethodRuntimeFlags   attribs     /* IN */
            ) = 0;

    // Given a method descriptor ftnHnd, extract signature information into sigInfo
    //
    // 'memberParent' is typically only set when verifying. It should be the
    // result of calling getMemberParent.
    virtual void __stdcall getMethodSig (
             CORINFO_METHOD_HANDLE      ftn,        /* IN */
             CORINFO_SIG_INFO          *sig,        /* OUT */
             CORINFO_CLASS_HANDLE      memberParent = NULL /* IN */
             ) = 0;

    /*********************************************************************
     * Note the following methods can only be used on functions known
     * to be IL. This includes the method being compiled and any method
     * that 'getMethodInfo' returns true for
     *********************************************************************/


    // return information about a method private to the implementation
    // returns false if method is not IL, or is otherwise unavailable.
    // This method is used to fetch data needed to inline functions
    virtual bool __stdcall getMethodInfo (
            CORINFO_METHOD_HANDLE   ftn,            /* IN */
            CORINFO_METHOD_INFO*    info            /* OUT */
            ) = 0;

    // Decides if you have any limitations for inlining. If everything's OK, it will return
    // INLINE_PASS and will fill out pRestrictions with a mask of restrictions the caller of this
    // function must respect. If caller passes pRestrictions = NULL, if there are any restrictions
    // INLINE_FAIL will be returned
    //
    //
    // The inlined method need not be verified

    virtual CorInfoInline __stdcall canInline (
            CORINFO_METHOD_HANDLE       callerHnd,                  /* IN */
            CORINFO_METHOD_HANDLE       calleeHnd,                  /* IN */
            DWORD*                      pRestrictions              /* OUT */
            ) = 0;


    // Returns false if the call is across assemblies thus we cannot tailcall

    virtual bool __stdcall canTailCall (
            CORINFO_METHOD_HANDLE   callerHnd,      /* IN */
            CORINFO_METHOD_HANDLE   calleeHnd,      /* IN */
            bool fIsTailPrefix                      /* IN */
            ) = 0;

    // Returns false if precompiled code must ensure that
    // the EE's DoPrestub function gets run before the
    // code for the method is used, i.e. if it returns false
    // then an indirect call must be made.
    //
    // Returning true does not guaratee that a direct call can be made:
    // there can be other reasons why the entry point cannot be embedded.
    //

    virtual bool __stdcall canSkipMethodPreparation (
            CORINFO_METHOD_HANDLE   callerHnd,      /* IN */
            CORINFO_METHOD_HANDLE   calleeHnd,      /* IN */
            bool                    fCheckCode,     /* IN */
            CorInfoIndirectCallReason *pReason = NULL,
            CORINFO_ACCESS_FLAGS    accessFlags = CORINFO_ACCESS_ANY) = 0;

    // Returns true if a direct call can be made via the method entry point
    //
    virtual bool __stdcall canCallDirectViaEntryPointThunk (
            CORINFO_METHOD_HANDLE   calleeHnd,      /* IN */
            void **                 pEntryPoint     /* OUT */
            ) = 0;

    // get individual exception handler
    virtual void __stdcall getEHinfo(
            CORINFO_METHOD_HANDLE ftn,              /* IN */
            unsigned          EHnumber,             /* IN */
            CORINFO_EH_CLAUSE* clause               /* OUT */
            ) = 0;

    // return class it belongs to
    virtual CORINFO_CLASS_HANDLE __stdcall getMethodClass (
            CORINFO_METHOD_HANDLE       method
            ) = 0;

    // return module it belongs to
    virtual CORINFO_MODULE_HANDLE __stdcall getMethodModule (
            CORINFO_METHOD_HANDLE       method
            ) = 0;

    // This function returns the offset of the specified method in the
    // vtable of its owning class or interface.
    virtual unsigned __stdcall getMethodVTableOffset (
            CORINFO_METHOD_HANDLE       method
            ) = 0;

    // If a method's attributes have (getMethodAttribs) CORINFO_FLG_INTRINSIC set,
    // getIntrinsicID() returns the intrinsic ID.
    virtual CorInfoIntrinsics __stdcall getIntrinsicID(
            CORINFO_METHOD_HANDLE       method
            ) = 0;

    // return the unmanaged calling convention for a PInvoke
    virtual CorInfoUnmanagedCallConv __stdcall getUnmanagedCallConv(
            CORINFO_METHOD_HANDLE       method
            ) = 0;

    // return if any marshaling is required for PInvoke methods. Note that
    // method == 0 => calli. The call site sig is only needed for the varargs or calli case
    virtual BOOL __stdcall pInvokeMarshalingRequired(
            CORINFO_METHOD_HANDLE       method,
            CORINFO_SIG_INFO*           callSiteSig
            ) = 0;

    // Check Visibility rules.
    // For Protected (family access) members, type of the instance is also
    // considered when checking visibility rules.
    virtual BOOL __stdcall canAccessMethod(
            CORINFO_METHOD_HANDLE       context,
            CORINFO_CLASS_HANDLE        parent,
            CORINFO_METHOD_HANDLE       target,
            CORINFO_CLASS_HANDLE        instance
            ) = 0;

    // Check constraints on method type arguments (only).
    // The parent class should be checked separately using satisfiesClassConstraints(parent).
    virtual BOOL __stdcall satisfiesMethodConstraints(
            CORINFO_CLASS_HANDLE        parent, // the exact parent of the method
            CORINFO_METHOD_HANDLE       method
            ) = 0;

    // Given a delegate target class, a target method parent class, a target method,
    // a delegate class, a scope, the target method ref, and the delegate constructor member ref
    // check if the method signature is compatible with the Invoke method of the delegate
    // (under the typical instantiation of any free type variables in the memberref signatures).
    // NB: arguments 2-4 could be inferred from 5-7, but are assumed to be available, and thus passed in for efficiency.
    virtual BOOL __stdcall isCompatibleDelegate(
            CORINFO_CLASS_HANDLE        objCls,           /* type of the delegate target, if any */
            CORINFO_CLASS_HANDLE        methodParentCls,  /* exact parent of the target method, if any */
            CORINFO_METHOD_HANDLE       method,           /* (representative) target method, if any */
            CORINFO_CLASS_HANDLE        delegateCls,      /* exact type of the delegate */
            CORINFO_MODULE_HANDLE       moduleHnd,        /* scope of the following refs */
            unsigned        methodMemberRef,              /* memberref of the target method */
            unsigned        delegateConstructorMemberRef  /* memberref of the delegate constructor */
            ) = 0;


    // Indicates if the method is an instance of the generic
    // method that passes (or has passed) verification
    virtual CorInfoInstantiationVerification __stdcall isInstantiationOfVerifiedGeneric (
            CORINFO_METHOD_HANDLE   method /* IN */
            ) = 0;

    // Loads the constraints on a typical method definition, detecting cycles;
    // for use in verification.
    virtual void __stdcall initConstraintsForVerification(
            CORINFO_METHOD_HANDLE   method, /* IN */
            BOOL *pfHasCircularClassConstraints, /* OUT */
            BOOL *pfHasCircularMethodConstraint /* OUT */
            ) = 0;

    // Returns enum whether the method does not require verification
    // Also see ICorModuleInfo::canSkipVerification
    virtual CorInfoCanSkipVerificationResult __stdcall canSkipMethodVerification (
            CORINFO_METHOD_HANDLE       ftnHandle,     /* IN */
            BOOL                        fQuickCheckOnly
            ) = 0;

    // Determines whether a callout is allowed.
    virtual CorInfoIsCallAllowedResult __stdcall isCallAllowed (
            CORINFO_METHOD_HANDLE       callerHnd,                  // IN
            CORINFO_METHOD_HANDLE       calleeHnd,                  // IN
            CORINFO_CALL_ALLOWED_INFO * CallAllowedInfo             // OUT
            ) = 0;

    // load and restore the method
    virtual void __stdcall methodMustBeLoadedBeforeCodeIsRun(
            CORINFO_METHOD_HANDLE       method
            ) = 0;

    virtual CORINFO_METHOD_HANDLE __stdcall mapMethodDeclToMethodImpl(
            CORINFO_METHOD_HANDLE       method
            ) = 0;

    // Returns the global cookie for the /GS unsafe buffer checks
    // The cookie might be a constant value (JIT), or a handle to memory location (Ngen)
    virtual void __stdcall getGSCookie(
            GSCookie * pCookieVal,                     // OUT
            GSCookie ** ppCookieVal                    // OUT
            ) = 0;
};

As you can see, the ICorMethodInfo class contains many methods which take one or more CORINFO_METHOD_HANDLE values as parameters, just as ICorModuleInfo has methods which take CORINFO_MODULE_HANDLE parameters. These handles deserve a closer look, because all of the JIT's inner workings rely on them. Here's their declaration:

// Cookie types consumed by the code generator (these are opaque values
// not inspected by the code generator):

typedef struct CORINFO_ASSEMBLY_STRUCT_*    CORINFO_ASSEMBLY_HANDLE;
typedef struct CORINFO_MODULE_STRUCT_*      CORINFO_MODULE_HANDLE;
typedef struct CORINFO_DEPENDENCY_STRUCT_*  CORINFO_DEPENDENCY_HANDLE;
typedef struct CORINFO_CLASS_STRUCT_*       CORINFO_CLASS_HANDLE;
typedef struct CORINFO_METHOD_STRUCT_*      CORINFO_METHOD_HANDLE;
typedef struct CORINFO_FIELD_STRUCT_*       CORINFO_FIELD_HANDLE;
// represents a list of argument types
typedef struct CORINFO_ARG_LIST_STRUCT_*    CORINFO_ARG_LIST_HANDLE;   
// represents the whole list
typedef struct CORINFO_SIG_STRUCT_*         CORINFO_SIG_HANDLE;        
typedef struct CORINFO_JUST_MY_CODE_HANDLE_*CORINFO_JUST_MY_CODE_HANDLE;
// a handle guaranteed to be unique per process
typedef struct CORINFO_PROFILING_STRUCT_*   CORINFO_PROFILING_HANDLE;  
typedef DWORD*                              CORINFO_SHAREDMODULEID_HANDLE;
// a generic handle (could be any of the above)
typedef struct CORINFO_GENERIC_STRUCT_*     CORINFO_GENERIC_HANDLE;   

The structures are not defined in the code: as the comment states, they're "opaque". Actually, these handles are just pointers, but we'll see that later. What should be understood now is that the methods of the JIT use them to identify things. I pasted all the declarations above to give the reader an idea of the kind of power over the JIT given by the two includes I mentioned earlier. Let's have a look, for instance, at the first method of ICorMethodInfo:

virtual const char* __stdcall getMethodName (
            CORINFO_METHOD_HANDLE       ftn,        /* IN */
            const char                **moduleName  /* OUT */
            ) = 0;

This function retrieves the method's name and, through the moduleName output parameter, the name of its class. I'll use it to give a basic example of how to hook the JIT and retrieve basic information from it. But first, I have to introduce another thing: the .NET assembly loader.

The .NET Assembly Loader

As we need to hook the JIT before the victim assembly is jitted, the best way to do this is to load the victim assembly from another assembly which, in the meantime, has already hooked the JIT. I call this the loader. This method works only when the protection isn't wrapping the assembly into a native executable. In that case, you might consider adding the hook DLL to the import table of the native exe, or creating the victim process in a suspended state, injecting the DLL and then resuming execution.

There are several ways to load an assembly into the current address space. Unfortunately, the most common ones do not work in all cases. For example, a common way to do this is to use the Assembly.Load/LoadFrom functions. This approach will cause the application to crash if the hosted assembly needs its own appdomain.

This is the very simple approach I'm using in this article:

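using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

namespace rbloader
{
    static class Program
    {
        // Exported by the native hook DLL; installs the compileMethod hook.
        [DllImport("rbcoree.dll", CallingConvention=CallingConvention.Cdecl)]
        static extern void HookJIT();

        /// <summary>
        /// The main entry point for the application.
        /// </summary>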

        [STAThread]
        static void Main(string[] args)
        {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);

            OpenFileDialog openFileDialog = new OpenFileDialog();

            openFileDialog.Filter = "Exe Files (*.exe)|*.exe|Dll Files (*.dll)|*.dll|All Files (*.*)|*.*";
            if (openFileDialog.ShowDialog() == DialogResult.OK)
            {
                HookJIT();
             
                AppDomain ad = AppDomain.CreateDomain("subAppDomain");
                ad.ExecuteAssembly(openFileDialog.FileName);
            }


        }
    }
}

Keep in mind that with certain protections this loading method won't work. You have to evaluate each case (since it depends on how the injector is implemented) and find a way to hook the JIT before the protection does.

Also, the first executed assembly decides which .NET framework platform is used, so the loader should be compiled accordingly. In the Visual Studio options you can choose the platform on which an assembly has to run. If you choose a 64-bit platform, the PE of the assembly will be a 64-bit PE, which can run only on the platform it was meant for, whereas 32-bit PEs can run on every platform. To force a 32-bit PE to use the x86 framework, a flag has to be set in the .NET Directory's Flags field.

The code I wrote for this article is 64-bit compatible, but I only tested it on x86. Thus, the loader has the 32-bit flag set. To use the loader in a 64-bit context, you have to unset this flag.
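As far as CorHdr.h declares it, the flag in question is COMIMAGE_FLAGS_32BITREQUIRED (value 2). What follows is a minimal, portable sketch of testing and toggling it on a Flags value. The constant and helper names are local stand-ins chosen for this example, not SDK identifiers, and locating the field inside a real PE file is left out.

```cpp
#include <cassert>
#include <cstdint>

// Values as declared in CorHdr.h (COMIMAGE_FLAGS_*), redefined under local
// names so that the sketch builds without the Windows SDK.
constexpr std::uint32_t kFlagILOnly        = 0x00000001; // COMIMAGE_FLAGS_ILONLY
constexpr std::uint32_t kFlag32BitRequired = 0x00000002; // COMIMAGE_FLAGS_32BITREQUIRED

// True if the Flags value pins the assembly to the x86 framework.
inline bool requires32Bit(std::uint32_t corFlags) {
    return (corFlags & kFlag32BitRequired) != 0;
}

// Returns a copy of the Flags value with the 32-bit flag set.
inline std::uint32_t set32BitRequired(std::uint32_t corFlags) {
    return corFlags | kFlag32BitRequired;
}

// Returns a copy of the Flags value with the 32-bit flag cleared.
inline std::uint32_t clear32BitRequired(std::uint32_t corFlags) {
    return corFlags & ~kFlag32BitRequired;
}
```

For instance, clear32BitRequired(0x3) yields 0x1, that is, an IL-only image without the 32-bit requirement.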

JIT Hooking Example

What I'm going to present here is a little example of how to hook the JIT and retrieve information from it: it shows the class and method name of each jitted method and of each method this method calls. This is a good time to introduce the fact that the compileMethod function not only jits the method it's supposed to, but also all the methods called by that method, all the methods called by those methods, etc. This is obviously so, as a call can't be jitted if the called method hasn't already been jitted along with its submethods. Even if this is pretty obvious, it is worth keeping in mind. To disassemble the method's MSIL, I use my DisasMSIL engine.
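The hook itself boils down to overwriting the first slot of a table of function pointers while keeping the old value, so that requests can be inspected and then forwarded. The sketch below shows only that idea in portable form. FakeJit, FakeJitTable and the function names are hypothetical stand-ins, and the table is an ordinary writable struct, whereas the real table lives in the JIT module and has to be made writable first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the JIT object: its first pointer-sized field points to a
// table of function pointers, and slot 0 plays the role of compileMethod.
struct FakeJit;
using CompileFn = int (*)(FakeJit *self, const char *methodName);

struct FakeJitTable { CompileFn compileMethod; };
struct FakeJit      { FakeJitTable *table; };

std::vector<std::string> g_log;        // what the hook observed
CompileFn g_original = nullptr;        // saved original slot

// Pretend compiler: does nothing and reports success.
int realCompile(FakeJit *, const char *) { return 0; }

// Replacement for slot 0: inspect the request, then forward it.
int hookedCompile(FakeJit *self, const char *methodName) {
    g_log.push_back(methodName);
    return g_original(self, methodName);
}

// Swap slot 0, keeping the old pointer. Returns false if already hooked.
bool installHook(FakeJit &jit) {
    if (jit.table->compileMethod == &hookedCompile) return false;
    g_original = jit.table->compileMethod;
    jit.table->compileMethod = &hookedCompile;
    return true;
}
```

After installHook, every call made through jit.table->compileMethod lands in hookedCompile first. The guard against double installation corresponds to the bHooked flag in the listing that follows.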

- Download Call Displayer

#include "stdafx.h"
#include <tchar.h>
#include <CorHdr.h>
#include "corinfo.h"
#include "corjit.h"
#include "DisasMSIL.h"

HINSTANCE hInstance;

extern "C" __declspec(dllexport) void HookJIT();

VOID DisplayMethodAndCalls(ICorJitInfo *comp,
                     CORINFO_METHOD_INFO *info);


BOOL APIENTRY DllMain( HMODULE hModule,
                  DWORD  dwReason,
                  LPVOID lpReserved
                 )
{
   hInstance = (HINSTANCE) hModule;

   HookJIT();

   return TRUE;
}


//
// Hook JIT's compileMethod
//

BOOL bHooked = FALSE;

ULONG_PTR *(__stdcall *p_getJit)();
typedef int (__stdcall *compileMethod_def)(ULONG_PTR classthis, ICorJitInfo *comp,
                                 CORINFO_METHOD_INFO *info, unsigned flags,        
                                 BYTE **nativeEntry, ULONG  *nativeSizeOfCode);
struct JIT
{
   compileMethod_def compileMethod;
};

compileMethod_def compileMethod;

int __stdcall my_compileMethod(ULONG_PTR classthis, ICorJitInfo *comp,
                        CORINFO_METHOD_INFO *info, unsigned flags,        
                        BYTE **nativeEntry, ULONG  *nativeSizeOfCode);

extern "C" __declspec(dllexport) void HookJIT()
{
   if (bHooked) return;

   LoadLibrary(_T("mscoree.dll"));

   HMODULE hJitMod = LoadLibrary(_T("mscorjit.dll"));

   if (!hJitMod)
      return;

   p_getJit = (ULONG_PTR *(__stdcall *)()) GetProcAddress(hJitMod, "getJit");

   if (p_getJit)
   {
      JIT *pJit = (JIT *) *((ULONG_PTR *) p_getJit());

      if (pJit)
      {
         DWORD OldProtect;
         VirtualProtect(pJit, sizeof (ULONG_PTR), PAGE_READWRITE, &OldProtect);
         compileMethod =  pJit->compileMethod;
         pJit->compileMethod = &my_compileMethod;
         VirtualProtect(pJit, sizeof (ULONG_PTR), OldProtect, &OldProtect);
         bHooked = TRUE;
      }
   }
}

//
// hooked compileMethod
//
/*__declspec (naked) */
int __stdcall my_compileMethod(ULONG_PTR classthis, ICorJitInfo *comp,
                        CORINFO_METHOD_INFO *info, unsigned flags,        
                        BYTE **nativeEntry, ULONG  *nativeSizeOfCode)
{
   // in case somebody hooks us (x86 only)
#ifdef _M_IX86
   __asm
   {
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
   }
#endif

   // call original method
   // I'm not using the naked + jmp approach to avoid x64 incompatibilities
   int nRet = compileMethod(classthis, comp, info, flags, nativeEntry, nativeSizeOfCode);

   //
   // Displays the current method and its calls
   //

   DisplayMethodAndCalls(comp, info);

   return nRet;
}

VOID DisplayMethodAndCalls(ICorJitInfo *comp,
                     CORINFO_METHOD_INFO *info)
{
   const char *szMethodName = NULL;
   const char *szClassName = NULL;

   szMethodName = comp->getMethodName(info->ftn, &szClassName);

   char CurMethod[200];

   sprintf_s(CurMethod, 200, "%s::%s", szClassName, szMethodName);

   char Calls[0x1000];

   strcpy_s(Calls, 0x1000, "Methods called:\r\n\r\n");

   //
   // retrieve calls
   //

#define MAX_INSTR      100

   ILOPCODE_STRUCT ilopar[MAX_INSTR];

   DISASMSIL_OFFSET CodeBase = 0;

   BYTE *pCur = info->ILCode;
   UINT nSize = info->ILCodeSize;

   UINT nDisasmedInstr;

   while (DisasMSIL(pCur, nSize, CodeBase, ilopar, MAX_INSTR,
      &nDisasmedInstr))
   {
      //
      // check the instructions for calls
      //

      for (UINT x = 0; x < nDisasmedInstr; x++)
      {
         if (info->ILCode[ilopar[x].Offset] == ILOPCODE_CALL)
         {
            DWORD dwToken = *((DWORD *) &info->ILCode[ilopar[x].Offset + 1]);

            CORINFO_METHOD_HANDLE hCallHandle =
               comp->findMethod(info->scope, dwToken, info->ftn);

            szMethodName = comp->getMethodName(hCallHandle, &szClassName);

            strcat_s(Calls, 0x1000, szClassName);
            strcat_s(Calls, 0x1000, "::");
            strcat_s(Calls, 0x1000, szMethodName);
            strcat_s(Calls, 0x1000, "\r\n");
         }
      }

      //
      // end loop?
      //

      if (nDisasmedInstr < MAX_INSTR) break;

      //
      // next instructions
      //

      DISASMSIL_OFFSET next = ilopar[nDisasmedInstr - 1].Offset - CodeBase;
      next += ilopar[nDisasmedInstr - 1].Size;

      pCur += next;
      nSize -= next;
      CodeBase += next;
   }

   //
   // Show MessageBox
   //

   MessageBoxA(0, Calls, CurMethod, MB_ICONINFORMATION);
}

The findMethod function has some limitations which I will talk about later. The output of this little example is a series of message boxes informing the user about the method currently being jitted and its calls.

As you can see, I didn't include the callvirt and calli opcodes in the search. This was only for the sake of simplicity; there's no other reason. If you want to create a complete logger, you have to consider those opcodes as well.
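A minimal sketch of what covering the additional opcodes could look like is given below. It is independent of DisasMSIL: the CallSite struct, the function names and the instrOffsets parameter (the start offset of every decoded instruction, as any disassembler would report them) are assumptions made for this example. The opcode values are the single-byte encodings from ECMA-335. Note that the token following calli refers to a stand-alone signature rather than to a method, so it can't be resolved the same way as the other two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Single-byte MSIL opcodes followed by a 4-byte metadata token (ECMA-335).
constexpr std::uint8_t kOpCall     = 0x28;
constexpr std::uint8_t kOpCalli    = 0x29; // token is a signature, not a method
constexpr std::uint8_t kOpCallvirt = 0x6F;

struct CallSite {
    std::size_t   offset; // offset of the opcode inside the IL body
    std::uint8_t  opcode;
    std::uint32_t token;
};

// Read a little-endian 32-bit value without relying on alignment.
inline std::uint32_t readU32(const std::uint8_t *p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Only instruction starts are examined: scanning raw bytes would misfire
// on operand bytes that happen to equal one of the opcodes.
std::vector<CallSite> collectCalls(const std::vector<std::uint8_t> &il,
                                   const std::vector<std::size_t> &instrOffsets) {
    std::vector<CallSite> out;
    for (std::size_t off : instrOffsets) {
        if (off + 5 > il.size()) continue;   // opcode + token must fit
        const std::uint8_t op = il[off];
        if (op == kOpCall || op == kOpCalli || op == kOpCallvirt)
            out.push_back({off, op, readU32(&il[off + 1])});
    }
    return out;
}
```

Each returned entry carries the opcode, so a logger can decide per call site whether the token may be handed to findMethod.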

The .NET Code Ejector

When writing a dumper, or better a code ejector, for a code-injecting protection, one has to choose how to collect the original MSIL code. There are two ways to proceed: "stealth" or "brute". By stealth I mean dumping the MSIL only of those methods which have been jitted during execution. If you proceed that way, you'll need to retrieve, along with the MSIL code, the token of the method, in order to rebuild the assembly afterwards with Rebel.NET. There's a very useful function to retrieve the token from a CORINFO_METHOD_HANDLE:

comp->getMethodDefFromMethod(info->ftn)

The stealth method always works, but it has the disadvantage that one can't be sure that all the methods in an assembly will be jitted. However, it is worth mentioning, since in some cases the goal is to dump just a few methods in order to decompile and analyze them.
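A minimal sketch of the bookkeeping the stealth approach needs is given below. It relies on the metadata token layout from ECMA-335 (high byte is the table, low 24 bits are the row id, and table 0x06 is MethodDef). The StealthDump class and its member names are hypothetical, and the IL bytes are copied because a protection may wipe or re-encrypt its buffer once the method has been compiled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Metadata token layout (ECMA-335): high byte = table, low 24 bits = row id.
constexpr std::uint32_t kTableMethodDef = 0x06;

inline std::uint32_t tokenTable(std::uint32_t token) { return token >> 24; }
inline std::uint32_t tokenRid(std::uint32_t token)   { return token & 0x00FFFFFFu; }

// Hypothetical store for the "stealth" approach: IL bodies keyed by token.
class StealthDump {
public:
    // Copies the IL bytes. Returns false for tokens that are not MethodDef
    // tokens and for empty bodies.
    bool record(std::uint32_t token, const std::uint8_t *il, std::size_t size) {
        if (tokenTable(token) != kTableMethodDef || il == nullptr || size == 0)
            return false;
        bodies_[token].assign(il, il + size); // a later jit replaces the copy
        return true;
    }

    std::size_t count() const { return bodies_.size(); }

    // Returns the stored body, or nullptr if the method was never jitted.
    const std::vector<std::uint8_t> *find(std::uint32_t token) const {
        auto it = bodies_.find(token);
        return it == bodies_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::uint32_t, std::vector<std::uint8_t>> bodies_;
};
```

In a hook, record would be called with the token obtained from getMethodDefFromMethod together with ILCode and ILCodeSize. Methods for which find returns nullptr at the end of the run are exactly the ones the stealth approach missed.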

The other way to eject the MSIL code I called "brute". By brute I mean forcing the protection to decrypt every method in a .NET assembly at once. To do this, collect the CORINFO_METHOD_INFO data (or at least ILCode and ILCodeSize) for each method, then retrieve compileMethod through the getJit function. By now compileMethod has already been hooked by the protection, so calling it causes the MSIL data to be decrypted. When the protection's compileMethod has decrypted the MSIL code, it will call the code ejector's compileMethod, which recognizes from its parameters that this is the code ejection process and doesn't call the real compileMethod function.
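The chain of hooks described above can be modelled in a few lines. The sketch below is a portable simulation under stated assumptions, not the ejector's real code: MethodInfo and JitInfo stand in for CORINFO_METHOD_INFO and ICorJitInfo, the "protection" is a toy that XORs each byte with 0x5A, and a null comp pointer is the marker that identifies an ejection request.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct MethodInfo {                 // stand-in for CORINFO_METHOD_INFO
    std::vector<std::uint8_t> il;   // plays the role of ILCode / ILCodeSize
};
struct JitInfo {};                  // stand-in for ICorJitInfo

using CompileFn = int (*)(JitInfo *comp, MethodInfo *info);

int g_realCompiles = 0;                            // calls reaching the JIT
std::vector<std::vector<std::uint8_t>> g_ejected;  // decrypted bodies

// Innermost function: the real compiler.
int realCompile(JitInfo *, MethodInfo *) { ++g_realCompiles; return 0; }

// Ejector hook, installed first. A null 'comp' marks an ejection request:
// the body is recorded and the real compiler is never reached.
int ejectorCompile(JitInfo *comp, MethodInfo *info) {
    if (comp == nullptr) {
        g_ejected.push_back(info->il);
        return 0;
    }
    return realCompile(comp, info);
}

// Toy protection hook, installed later: "decrypts" by XOR, then chains on.
int protectionCompile(JitInfo *comp, MethodInfo *info) {
    MethodInfo plain = *info;
    for (auto &b : plain.il) b ^= 0x5A;
    return ejectorCompile(comp, &plain);
}

// Brute pass: push every method through the outermost hook with comp == null.
void ejectAll(CompileFn outermost, std::vector<MethodInfo> &methods) {
    for (auto &m : methods) outermost(nullptr, &m);
}
```

Calling ejectAll(&protectionCompile, methods) leaves the decrypted bodies in g_ejected while g_realCompiles stays at zero, which is the behaviour wanted from the ejector: the protection does the decrypting, and nothing is actually compiled.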

The code ejector I wrote also features a little dumper of the jitted assemblies. This is quite useful, since most times you have to dump the protected assembly. This can also be achieved with a generic .NET unpacker.

The dialog of the code ejector contains a "Generate Rebel File" button. When this button is pressed, a rebel report file is requested as input. Basically, one first needs to dump (if necessary) the assembly to rebuild. Then, use Rebel.NET to create a report file from that assembly. Only the methods should be included in the report file.

This report file will be used during the code ejection process to calculate the number of methods and to retrieve the MSIL code address and size. Actually, the JIT can be used to retrieve this information, but there's a problem. If you noticed, findMethod takes three arguments, the last of which is a CORINFO_METHOD_HANDLE called hContext. This handle is used to check the context of the request: if a method tries to access another method without being entitled to, the framework will show an error and terminate the process. The main goal is to obtain a valid CORINFO_METHOD_HANDLE for every method token in the assembly. As I said earlier, these kinds of handles are just pointers. Let's have a look at the memory pointed to by a CORINFO_METHOD_HANDLE:

03B8E1F0                            01 00 00 3B 74 01 00 00           ...;t...
03B8E200   02 00 01 08 0D 00 00 00  03 00 02 08 75 01 00 00   ............u...
03B8E210   04 00 03 39 76 01 20 00  05 00 04 08 77 01 00 00   ...9v. .....w...

The first word seems to represent the method's number, and the next method follows after 8 bytes. So, in theory, it might even be possible to calculate the right CORINFO_METHOD_HANDLE, given a method's token. After having retrieved the CORINFO_METHOD_HANDLE, it would be only a matter of calling getMethodInfo to get the CORINFO_METHOD_INFO structure. Note: I'll discuss later what the CORINFO_METHOD_HANDLE really represents: this topic needs a paragraph of its own.
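The observation can be checked against the bytes of the dump itself. The sketch below reads the leading 16-bit word of every 8-byte entry and, purely as speculation, computes where the entry for a given method number would sit if the 8-byte stride held for the whole chunk. The function names are made up for this example, and nothing is assumed about the meaning of the remaining bytes of each entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kEntrySize = 8;   // stride observed in the dump above

// Returns the leading 16-bit little-endian word of each 8-byte entry.
std::vector<std::uint16_t> leadingWords(const std::vector<std::uint8_t> &mem) {
    std::vector<std::uint16_t> out;
    for (std::size_t off = 0; off + kEntrySize <= mem.size(); off += kEntrySize)
        out.push_back(static_cast<std::uint16_t>(mem[off] | (mem[off + 1] << 8)));
    return out;
}

// Speculative: address of the entry for method number n (1-based), assuming
// every entry in the chunk really is 8 bytes wide.
std::uintptr_t guessEntryAddress(std::uintptr_t firstEntry, std::uint16_t n) {
    assert(n >= 1);
    return firstEntry + static_cast<std::uintptr_t>(n - 1) * kEntrySize;
}

// The 40 bytes shown in the dump, starting at 0x03B8E1F8.
const std::vector<std::uint8_t> kDump = {
    0x01, 0x00, 0x00, 0x3B, 0x74, 0x01, 0x00, 0x00,
    0x02, 0x00, 0x01, 0x08, 0x0D, 0x00, 0x00, 0x00,
    0x03, 0x00, 0x02, 0x08, 0x75, 0x01, 0x00, 0x00,
    0x04, 0x00, 0x03, 0x39, 0x76, 0x01, 0x20, 0x00,
    0x05, 0x00, 0x04, 0x08, 0x77, 0x01, 0x00, 0x00,
};
```

Applied to kDump, leadingWords returns 1, 2, 3, 4, 5, and guessEntryAddress(0x03B8E1F8, 3) gives 0x03B8E208, which is where the entry starting with 03 00 sits in the dump.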

But, as said, this is not the way I proceeded. I used a Rebel.NET report file to retrieve the necessary data. This approach could obviously be changed. What follows is the code of the .NET code ejector.

- Download the Code Ejector

#include "stdafx.h"
#include <CommCtrl.h>
#include <CommDlg.h>
#include <tlhelp32.h>
#include <tchar.h>
#include <CorHdr.h>
#include "corinfo.h"
#include "corjit.h"
#include "RebelDotNET.h"
#include "resource.h"

#ifndef PAGE_SIZE
#define PAGE_SIZE 0x1000
#endif

#define IS_FLAG(Value, Flag) (((Value) & (Flag)) == (Flag))

HINSTANCE hInstance;

extern "C" __declspec(dllexport) void HookJIT();

VOID ListThread();


BOOL APIENTRY DllMain( HMODULE hModule,
                       DWORD  dwReason,
                       LPVOID lpReserved
                )
{
   hInstance = (HINSTANCE) hModule;

   HookJIT();

   if (dwReason == DLL_PROCESS_ATTACH)
   {
      CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE) ListThread,
         NULL, 0, NULL);
   }

   return TRUE;
}

// unimportant
extern "C" __declspec(dllexport) int __stdcall _CorExeMain(void)
{
   return 0;
}


//
// Hook JIT's compileMethod
//

BOOL bHooked = FALSE;

ULONG_PTR *(__stdcall *p_getJit)();
typedef int (__stdcall *compileMethod_def)(ULONG_PTR classthis, ICorJitInfo *comp,
                                 CORINFO_METHOD_INFO *info, unsigned flags,        
                                 BYTE **nativeEntry, ULONG  *nativeSizeOfCode);
struct JIT
{
   compileMethod_def compileMethod;
};

compileMethod_def compileMethod;

int __stdcall my_compileMethod(ULONG_PTR classthis, ICorJitInfo *comp,
                        CORINFO_METHOD_INFO *info, unsigned flags,        
                        BYTE **nativeEntry, ULONG  *nativeSizeOfCode);

extern "C" __declspec(dllexport) void HookJIT()
{
   if (bHooked) return;

   LoadLibrary(_T("mscoree.dll"));

   HMODULE hJitMod = LoadLibrary(_T("mscorjit.dll"));

   if (!hJitMod)
      return;

   p_getJit = (ULONG_PTR *(__stdcall *)()) GetProcAddress(hJitMod, "getJit");

   if (p_getJit)
   {
      JIT *pJit = (JIT *) *((ULONG_PTR *) p_getJit());

      if (pJit)
      {
         DWORD OldProtect;
         VirtualProtect(pJit, sizeof (ULONG_PTR), PAGE_READWRITE, &OldProtect);
         compileMethod =  pJit->compileMethod;
         pJit->compileMethod = &my_compileMethod;
         VirtualProtect(pJit, sizeof (ULONG_PTR), OldProtect, &OldProtect);
         bHooked = TRUE;
      }
   }
}

//
// Logging
//

struct AssemblyInfo
{
   CORINFO_MODULE_HANDLE hCorModule;

   WCHAR AssemblyName[MAX_PATH];

   VOID *ImgBase;
   UINT ImgSize;

   BOOL bIdentified;

   HANDLE hRebReport;

   BOOL bDump;
   TCHAR DumpFileName[MAX_PATH];

} LoggedAssemblies[100];

UINT NumberOfLoggedAssemblies = 0;

VOID LogAssembly(ICorJitInfo *comp, CORINFO_METHOD_INFO *info);

DWORD GetTokenFromMethodHandle(ICorJitInfo *comp, CORINFO_METHOD_INFO *info);

VOID AddMethod(CORINFO_METHOD_INFO *mi);
BOOL CreateRebFile(AssemblyInfo *ai);
//
// hooked compileMethod
//
/*__declspec (naked) */
int __stdcall my_compileMethod(ULONG_PTR classthis, ICorJitInfo *comp,
                        CORINFO_METHOD_INFO *info, unsigned flags,        
                        BYTE **nativeEntry, ULONG  *nativeSizeOfCode)
{
   // in case somebody hooks us (x86 only)
#ifdef _M_IX86
   __asm
   {
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
      nop
   }
#endif

   //
   // check if it's the dump process
   //

   if (comp == NULL)
   {
      AddMethod(info);
      return 0;
   }

   LogAssembly(comp, info);

   // call original method
   // I'm not using the naked + jmp approach to avoid x64 incompatibilities
   int nRet = compileMethod(classthis, comp, info, flags, nativeEntry, nativeSizeOfCode);
   
   return nRet;
}

//
// convert an address to its module ImgBase and Name (if possible)
//

VOID AddressToModuleInfo(VOID *pAddress, WCHAR *AssemblyName,
                   VOID **pImgBase, UINT *pImgSize,
                   BOOL *pbIdentified)
{
   DWORD dwPID = GetCurrentProcessId();
   HANDLE hModuleSnap = INVALID_HANDLE_VALUE;
   MODULEENTRY32 me32;

   static BOOL bFirstUnkAsm = TRUE;

   hModuleSnap = CreateToolhelp32Snapshot( TH32CS_SNAPMODULE, dwPID );

   if (hModuleSnap == INVALID_HANDLE_VALUE)
      return;

   me32.dwSize = sizeof (MODULEENTRY32);

   if (!Module32First(hModuleSnap, &me32 ))
   {
      CloseHandle(hModuleSnap);   
      return;
   }

   do
   {
      if (((ULONG_PTR) pAddress) > ((ULONG_PTR) me32.modBaseAddr) &&
         ((ULONG_PTR) pAddress) < (((ULONG_PTR) me32.modBaseAddr) +
         me32.modBaseSize))
      {
         if (pImgBase) *pImgBase = (VOID *) me32.modBaseAddr;
         if (pImgSize) *pImgSize = me32.modBaseSize;
         wcscpy_s(AssemblyName, MAX_PATH, me32.szExePath);
         if (pbIdentified) *pbIdentified = TRUE;
         CloseHandle(hModuleSnap); // release the snapshot on the early return too
         return;
      }

   } while (Module32Next(hModuleSnap, &me32));

   CloseHandle(hModuleSnap);

   if (pbIdentified) *pbIdentified = FALSE;

   MEMORY_BASIC_INFORMATION mbi = { 0 };
   VirtualQuery(pAddress, &mbi, sizeof (MEMORY_BASIC_INFORMATION));

   if (pImgBase) *pImgBase = mbi.AllocationBase;

   DWORD ImgSize = 0;

   __try
   {
      IMAGE_DOS_HEADER *pDosHeader = (IMAGE_DOS_HEADER *)
         mbi.AllocationBase;

      if (pDosHeader->e_magic == IMAGE_DOS_SIGNATURE)
      {
         IMAGE_NT_HEADERS *pNtHeaders = (IMAGE_NT_HEADERS *)
            (pDosHeader->e_lfanew + (ULONG_PTR) pDosHeader);

         if (pNtHeaders->Signature == IMAGE_NT_SIGNATURE)
         {
            ImgSize = pNtHeaders->OptionalHeader.SizeOfImage;
         }
      }
   }
   __except (EXCEPTION_EXECUTE_HANDLER)
   {
      goto endinfo;
   }

endinfo:

   if (pImgSize) *pImgSize = ImgSize;

   if (bFirstUnkAsm)
   {
      wsprintfW(AssemblyName, L"Base: %p - Size: %08X - Primary Assembly",
         mbi.AllocationBase, ImgSize);
      bFirstUnkAsm = FALSE;
   }
   else
   {
      wsprintfW(AssemblyName, L"Base: %p - Size: %08X - unidentified",
         mbi.AllocationBase, ImgSize);
   }
   
}

VOID LogAssembly(ICorJitInfo *comp, CORINFO_METHOD_INFO *info)
{
   // already in the list?
   for (UINT x = 0; x < NumberOfLoggedAssemblies; x++)
   {
      if (LoggedAssemblies[x].hCorModule == info->scope)
         return;
   }

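   //
   // Add assembly to the logged list (the array has a fixed capacity)
   //

   if (NumberOfLoggedAssemblies >=
      sizeof (LoggedAssemblies) / sizeof (LoggedAssemblies[0]))
      return;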

   AddressToModuleInfo(info->ILCode,
      LoggedAssemblies[NumberOfLoggedAssemblies].AssemblyName,
      &LoggedAssemblies[NumberOfLoggedAssemblies].ImgBase,
      &LoggedAssemblies[NumberOfLoggedAssemblies].ImgSize,
      &LoggedAssemblies[NumberOfLoggedAssemblies].bIdentified);

   LoggedAssemblies[NumberOfLoggedAssemblies].hCorModule = info->scope;
   LoggedAssemblies[NumberOfLoggedAssemblies].bDump = FALSE;

   NumberOfLoggedAssemblies++;
}

//
// Listing
//

LRESULT CALLBACK ListDlgProc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam);

VOID ListThread()
{
   Sleep(2000);

   InitCommonControls();

   DialogBox(hInstance, MAKEINTRESOURCE(IDD_ASMLIST), NULL, (DLGPROC) ListDlgProc);
}

LRESULT CALLBACK ListDlgProc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
   switch (uMsg)
   {

   case WM_INITDIALOG:
      {
         HWND hList = GetDlgItem(hDlg, LST_ASMS);

         LV_COLUMN lvc;

         ZeroMemory(&lvc, sizeof (LV_COLUMN));

         lvc.mask = LVCF_FMT | LVCF_WIDTH | LVCF_TEXT | LVCF_SUBITEM;
         lvc.fmt = LVCFMT_LEFT;

         lvc.cx = 500;
         lvc.pszText = _T("Assembly Path");
         ListView_InsertColumn(hList, 0, &lvc);

         SendMessage(hList, LVM_SETEXTENDEDLISTVIEWSTYLE,
            LVS_EX_FULLROWSELECT | LVS_EX_INFOTIP,
            LVS_EX_FULLROWSELECT | LVS_EX_INFOTIP);


         SendMessage(hDlg, WM_COMMAND, IDC_REFRESH, 0);

         break;
      }

   case WM_CLOSE:
      {
         EndDialog(hDlg, 0);
         break;
      }

   case WM_COMMAND:
      {
         switch (LOWORD(wParam))
         {

         case IDC_REFRESH:
            {
               HWND hList = GetDlgItem(hDlg, LST_ASMS);

               ListView_DeleteAllItems(hList);

               LV_ITEM lvi;

               ZeroMemory(&lvi, sizeof (LV_ITEM));

               lvi.mask = LVIF_TEXT | LVIF_STATE | LVIF_PARAM;

               for (UINT x = 0; x < NumberOfLoggedAssemblies; x++)
               {
                  lvi.lParam = (LPARAM) LoggedAssemblies[x].hCorModule;
                  lvi.pszText = LoggedAssemblies[x].AssemblyName;
                  ListView_InsertItem(hList, &lvi);
               }

               break;
            }

         case IDC_DUMPASM:
            {
               HWND hList = GetDlgItem(hDlg, LST_ASMS);

               int nSel = ListView_GetNextItem(hList, -1, LVNI_SELECTED);

               if (nSel == -1) break;

               OPENFILENAME SaveFileName;

               TCHAR DumpFileName[MAX_PATH];

               ZeroMemory(DumpFileName, MAX_PATH * sizeof (TCHAR));

               ZeroMemory(&SaveFileName, sizeof (OPENFILENAME));

               SaveFileName.lStructSize = sizeof (OPENFILENAME);
               SaveFileName.hwndOwner = hDlg;
               SaveFileName.lpstrFilter = _T("All Files (*.*)\0*.*\0");
               SaveFileName.lpstrFile = DumpFileName;
               SaveFileName.nMaxFile = MAX_PATH;
               SaveFileName.lpstrTitle = _T("Save Assembly As...");

               if (!GetSaveFileName(&SaveFileName))
                  break;

               LV_ITEM lvi;
               ZeroMemory(&lvi, sizeof (LV_ITEM));

               lvi.mask = LVIF_PARAM;
               lvi.iItem = nSel;

               ListView_GetItem(hList, &lvi);

               for (UINT x = 0; x < NumberOfLoggedAssemblies; x++)
               {
                  if (LoggedAssemblies[x].hCorModule ==
                     (CORINFO_MODULE_HANDLE) lvi.lParam)
                  {
                     HANDLE hFile = CreateFile(DumpFileName, GENERIC_WRITE,
                        FILE_SHARE_READ, NULL, CREATE_ALWAYS, 0, NULL);

                     if (hFile == INVALID_HANDLE_VALUE)
                        break;

                     DWORD dwOldProtect;

                     VirtualProtect(LoggedAssemblies[x].ImgBase,
                        LoggedAssemblies[x].ImgSize, PAGE_EXECUTE_READ,
                        &dwOldProtect);

                     for (UINT nPage = 0;
                        nPage < (LoggedAssemblies[x].ImgSize / PAGE_SIZE);
                        nPage++)
                     {
                        DWORD BW;

                        __try
                        {
                           VOID *pPage = (VOID *) ((nPage * PAGE_SIZE) +
                              (ULONG_PTR) LoggedAssemblies[x].ImgBase);

                           WriteFile(hFile, pPage, PAGE_SIZE, &BW, NULL);
                        }

                        __except (EXCEPTION_EXECUTE_HANDLER)
                        {
                           SetFilePointer(hFile, PAGE_SIZE, NULL, FILE_CURRENT);
                           SetEndOfFile(hFile);
                        }
                     }
                     
                     CloseHandle(hFile);
                     
                     MessageBox(hDlg, _T("Assembly successfully dumped."),
                        _T("Dumped"), MB_ICONINFORMATION);
                  }
               }

               break;
            }

         case IDC_REBFILE:
            {
               HWND hList = GetDlgItem(hDlg, LST_ASMS);

               int nSel = ListView_GetNextItem(hList, -1, LVNI_SELECTED);

               if (nSel == -1) break;

               LV_ITEM lvi;
               ZeroMemory(&lvi, sizeof (LV_ITEM));

               lvi.mask = LVIF_PARAM;
               lvi.iItem = nSel;

               ListView_GetItem(hList, &lvi);

               for (UINT x = 0; x < NumberOfLoggedAssemblies; x++)
               {
                  if (LoggedAssemblies[x].hCorModule ==
                     (CORINFO_MODULE_HANDLE) lvi.lParam)
                  {
                     OPENFILENAME OpenFileName;

                     TCHAR ReportFileName[MAX_PATH];

                     ZeroMemory(ReportFileName, MAX_PATH * sizeof (TCHAR));

                     ZeroMemory(&OpenFileName, sizeof (OPENFILENAME));

                     OpenFileName.lStructSize = sizeof (OPENFILENAME);
                     OpenFileName.hwndOwner = hDlg;
                     OpenFileName.lpstrFilter = _T("Report Rebel File (*.rebel)\0*.rebel\0");
                     OpenFileName.lpstrFile = ReportFileName;
                     OpenFileName.nMaxFile = MAX_PATH;
                     OpenFileName.lpstrTitle = _T("Select a Report Rebel File...");

                     if (!GetOpenFileName(&OpenFileName))
                        break;

                     LoggedAssemblies[x].hRebReport = CreateFile(ReportFileName,
                        GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);

                     if (LoggedAssemblies[x].hRebReport == INVALID_HANDLE_VALUE)
                        break;

                     OPENFILENAME SaveFileName;

                     ZeroMemory(LoggedAssemblies[x].DumpFileName, MAX_PATH * sizeof (TCHAR));

                     ZeroMemory(&SaveFileName, sizeof (OPENFILENAME));

                     SaveFileName.lStructSize = sizeof (OPENFILENAME);
                     SaveFileName.hwndOwner = hDlg;
                     SaveFileName.lpstrFilter = _T("Rebel File (*.rebel)\0*.rebel\0");
                     SaveFileName.lpstrFile = LoggedAssemblies[x].DumpFileName;
                     SaveFileName.nMaxFile = MAX_PATH;
                     SaveFileName.lpstrTitle = _T("Save Rebel File As...");
                     SaveFileName.lpstrDefExt = _T("rebel");

                     if (!GetSaveFileName(&SaveFileName))
                     {
                        CloseHandle(LoggedAssemblies[x].hRebReport);
                        break;
                     }

                     // dump
                     CreateRebFile(&LoggedAssemblies[x]);

                     break;
                  }
               }

               break;
            }
         }

         break;
      }
   }

   return FALSE;
}

//
// Dumping
//

DWORD RvaToOffset(VOID *pBase, DWORD Rva)
{
   __try
   {
      DWORD Offset = Rva, Limit;

      IMAGE_DOS_HEADER *pDosHeader = (IMAGE_DOS_HEADER *) pBase;

      if (pDosHeader->e_magic != IMAGE_DOS_SIGNATURE)
         return 0;

      IMAGE_NT_HEADERS *pNtHeaders = (IMAGE_NT_HEADERS *) (
         pDosHeader->e_lfanew + (ULONG_PTR) pDosHeader);

      if (pNtHeaders->Signature != IMAGE_NT_SIGNATURE)
         return 0;

      IMAGE_SECTION_HEADER *Img = IMAGE_FIRST_SECTION(pNtHeaders);

      if (Rva < Img->PointerToRawData)
         return Rva;

      for (WORD i = 0; i < pNtHeaders->FileHeader.NumberOfSections; i++)
      {
         if (Img[i].SizeOfRawData)
            Limit = Img[i].SizeOfRawData;
         else
            Limit = Img[i].Misc.VirtualSize;

         if (Rva >= Img[i].VirtualAddress &&
            Rva < (Img[i].VirtualAddress + Limit))
         {
            if (Img[i].PointerToRawData != 0)
            {
               Offset -= Img[i].VirtualAddress;
               Offset += Img[i].PointerToRawData;
            }

            return Offset;
         }
      }
   }
   __except (EXCEPTION_EXECUTE_HANDLER)
   {
      return 0;
   }

   return 0;
}

UINT GetMethodSize(REBEL_METHOD *rbMethod)
{
   UINT nMethodSize = sizeof (REBEL_METHOD);

   if (!IS_FLAG(rbMethod->Mask, REBEL_METHOD_MASK_NAMEOFFSET))
      nMethodSize += rbMethod->NameOffsetOrSize;

   if (!IS_FLAG(rbMethod->Mask, REBEL_METHOD_MASK_SIGOFFSET))
      nMethodSize += rbMethod->SignatureOffsetOrSize;

   if (!IS_FLAG(rbMethod->Mask, REBEL_METHOD_MASK_LOCVARSIGOFFSET))
      nMethodSize += rbMethod->LocalVarSigOffsetOrSize;

   nMethodSize += rbMethod->CodeSize;
   nMethodSize += rbMethod->ExtraSectionsSize;

   return nMethodSize;
}

static HANDLE hRebuildDump = INVALID_HANDLE_VALUE;

VOID AddMethod(CORINFO_METHOD_INFO *mi)
{
   REBEL_METHOD rbMethod;

   ZeroMemory(&rbMethod, sizeof (REBEL_METHOD));

   rbMethod.Token = mi->locals.token;
   rbMethod.CodeSize = mi->ILCodeSize;
   
   DWORD BRW;

   SetFilePointer(hRebuildDump, 0, NULL, FILE_END);

   WriteFile(hRebuildDump, &rbMethod, sizeof (REBEL_METHOD), &BRW, NULL);
   WriteFile(hRebuildDump, mi->ILCode, mi->ILCodeSize, &BRW, NULL);

   //
   // Increase number of methods
   //

   SetFilePointer(hRebuildDump, 0, NULL, FILE_BEGIN);

   REBEL_NET_BASE rbBase;

   ReadFile(hRebuildDump, &rbBase, sizeof (REBEL_NET_BASE), &BRW, NULL);

   rbBase.NumberOfMethods++;

   SetFilePointer(hRebuildDump, 0, NULL, FILE_BEGIN);

   WriteFile(hRebuildDump, &rbBase, sizeof (REBEL_NET_BASE), &BRW, NULL);
}

BOOL CreateRebFile(AssemblyInfo *ai)
{
   DWORD BRW;

   hRebuildDump = CreateFile(ai->DumpFileName, GENERIC_READ |
      GENERIC_WRITE, FILE_SHARE_READ, NULL,
      CREATE_ALWAYS, 0, NULL);

   if (hRebuildDump == INVALID_HANDLE_VALUE)
      return FALSE;

   REBEL_NET_BASE rbBase;

   ZeroMemory(&rbBase, sizeof (REBEL_NET_BASE));

   rbBase.Signature = REBEL_NET_SIGNATURE;
   rbBase.MethodsOffset = sizeof (REBEL_NET_BASE);

   WriteFile(hRebuildDump, &rbBase, sizeof (REBEL_NET_BASE), &BRW, NULL);

   //
   // Get the new JIT compileMethod which by now should be
   // hooked by the code protection
   //

   HMODULE hJitMod = LoadLibrary(_T("mscorjit.dll"));

   p_getJit = (ULONG_PTR *(__stdcall *)()) GetProcAddress(hJitMod, "getJit");

   if (!p_getJit)
   {
      CloseHandle(hRebuildDump);
      hRebuildDump = INVALID_HANDLE_VALUE;
      return FALSE;
   }

   JIT *pJit = (JIT *) *((ULONG_PTR *) p_getJit());

   //

   REBEL_NET_BASE repBase;

   if (!ReadFile(ai->hRebReport, &repBase, sizeof (REBEL_NET_BASE), &BRW, NULL))
   {
      CloseHandle(hRebuildDump);
      hRebuildDump = INVALID_HANDLE_VALUE;
      return FALSE;
   }

   UINT CurMethodOffset = repBase.MethodsOffset;

   for (UINT x = 0; x < repBase.NumberOfMethods; x++)
   {
      SetFilePointer(ai->hRebReport, CurMethodOffset, NULL, FILE_BEGIN);

      REBEL_METHOD rbRepMethod;

      ReadFile(ai->hRebReport, &rbRepMethod, sizeof (REBEL_METHOD), &BRW, NULL);

      //
      // Calculate current method's code location
      // Use RvaToOffset only when the module wasn't mapped
      //

      BYTE *pMethodCode;

      if (ai->bIdentified)
      {
         pMethodCode = (BYTE *) (rbRepMethod.RVA + (ULONG_PTR) ai->ImgBase);
      }
      else
      {
         DWORD Offset = RvaToOffset(ai->ImgBase, rbRepMethod.RVA);

         if (Offset == 0) continue;

         pMethodCode = (BYTE *) (Offset + (ULONG_PTR) ai->ImgBase);
      }

      //
      // we should check the validity of the memory
      // pointed to by pMethodCode
      //

      // TODO

      //
      // skips the method header
      //

      BYTE HeaderFormat = *pMethodCode;

      HeaderFormat &= 3;

      if (HeaderFormat == 2)         // Tiny = 2 (CorILMethod_TinyFormat)
         pMethodCode++;
      else                     // Fat = 3 (CorILMethod_FatFormat)
         pMethodCode += (sizeof (DWORD) * 3);

      //
      // Do the fake compileMethod request
      //

      CORINFO_METHOD_INFO mi = { 0 };

      mi.ILCode = pMethodCode;
      mi.ILCodeSize = rbRepMethod.CodeSize;
      mi.scope = ai->hCorModule;
      // use this to pass the token to our AddMethod
      mi.locals.token = rbRepMethod.Token;

      pJit->compileMethod((ULONG_PTR) pJit, NULL, &mi, 0, NULL, NULL);

      //
      // next method
      //

      CurMethodOffset += GetMethodSize(&rbRepMethod);
   }

   //
   // Close file and notify the user
   //

   CloseHandle(hRebuildDump);
   hRebuildDump = INVALID_HANDLE_VALUE;

   MessageBox(0, _T("Assembly code successfully dumped."), _T("JIT Dumper"),
      MB_ICONINFORMATION);

   return TRUE;
}

I apologize if the code is a bit messy: no time at all was put into designing it in the first place, and I had to rewrite several parts of it because I changed approach three times. I hope you're able to understand it nonetheless.
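As a rough illustration of the two address computations CreateRebFile relies on, here is a portable sketch that works on plain values and byte buffers instead of a mapped module. The Section struct is a simplified, hypothetical stand-in for IMAGE_SECTION_HEADER, and the names rvaToFileOffset and parseIlHeader are mine, not part of the dumper. Unlike the listing, which assumes a fat header is always 12 bytes, the sketch reads the header size from the header itself (the upper four bits of the first 16-bit word, counted in DWORDs).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for IMAGE_SECTION_HEADER.
struct Section {
    uint32_t virtualAddress;   // RVA at which the section is mapped
    uint32_t virtualSize;      // size of the section in memory
    uint32_t pointerToRawData; // offset of the section in the file
    uint32_t sizeOfRawData;    // size of the section in the file
};

// Same idea as RvaToOffset: find the section containing the RVA and
// rebase it onto the section's file position. Returns 0 if no section
// contains the RVA.
uint32_t rvaToFileOffset(const std::vector<Section>& sections, uint32_t rva) {
    // RVAs that fall before the first section's raw data map 1:1 (headers)
    if (!sections.empty() && rva < sections[0].pointerToRawData)
        return rva;
    for (const Section& s : sections) {
        uint32_t limit = s.sizeOfRawData ? s.sizeOfRawData : s.virtualSize;
        if (rva >= s.virtualAddress && rva < s.virtualAddress + limit) {
            if (s.pointerToRawData == 0)
                return rva;
            return rva - s.virtualAddress + s.pointerToRawData;
        }
    }
    return 0;
}

struct IlHeader {
    uint32_t headerSize; // bytes to skip to reach the first opcode
    uint32_t codeSize;   // size of the IL body in bytes
};

// Parses a tiny or fat method header. Returns false if the format bits
// are neither tiny (2) nor fat (3), or if the buffer is too short.
bool parseIlHeader(const uint8_t* p, size_t avail, IlHeader& out) {
    if (avail < 1)
        return false;
    uint8_t format = p[0] & 3;
    if (format == 2) {                      // CorILMethod_TinyFormat
        out.headerSize = 1;
        out.codeSize = p[0] >> 2;           // upper six bits hold the size
        return true;
    }
    if (format == 3 && avail >= 12) {       // CorILMethod_FatFormat
        uint16_t flagsAndSize = uint16_t(p[0] | (p[1] << 8));
        out.headerSize = uint32_t(flagsAndSize >> 12) * 4; // size in DWORDs
        out.codeSize = uint32_t(p[4]) | (uint32_t(p[5]) << 8) |
                       (uint32_t(p[6]) << 16) | (uint32_t(p[7]) << 24);
        return true;
    }
    return false;
}
```

With an assumed section mapped at RVA 0x2000 and stored at file offset 0x1000, rvaToFileOffset maps RVA 0x2050 to file offset 0x1050, and a fat header places the first opcode 12 bytes after that.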

Code Ejection Demonstration

Of course, a demonstration is necessary. As I do not intend to violate the license of commercial products, the victim of this paragraph will be rendari's "cryxenet 2 unpackme", a little .NET crackme which uses code injection by hooking the compileMethod function. The crackme comes with 3 files: an unpackme.exe, a native.dll and a cryxed.dll. When started, it shows a form which asks for a name / serial and checks them when the user presses the "Check" button. The crackme counts as solved when the serial check displays the message box for a valid name / serial. Thanks to rendari for making this demonstration possible.

Actually, it's not necessary to analyze the crackme in order to solve it. But I'll do it anyway, as it might give an idea of how a basic code injector may work.

If one tries to decompile unpackme.exe's main function with Reflector, it shows an error. So, let's look at the MSIL code:

.method public static void  main() cil managed
// SIG: 00 00 01
{
  .entrypoint
  .custom instance void [mscorlib]System.STAThreadAttribute::.ctor() = ( 01 00 00 00 )
  // Method begins at RVA 0x2050
  // Code size       1483 (0x5cb)
  .maxstack  4
  .locals init (int64 V_0,
           class [mscorlib]System.IO.FileInfo V_1,
           class [mscorlib]System.Reflection.Assembly V_2,
           object[] V_3,
           string[] V_4,
           int64 V_5,
           int64 V_6,
           int64 V_7,
           int64 V_8,
           int64 V_9,
           class [mscorlib]System.IO.FileStream V_10,
           native int V_11,
           uint8[] V_12,
           class Project1.Program/obfuscation8 V_13,
           uint8[] V_14,
           int64 V_15,
           int64 V_16,
           int64 V_17)
  IL_0000:  /* 00   |                  */ nop
  IL_0001:  /* 1F   | 63               */ ldc.i4.s   99
  IL_0003:  /* 6A   |                  */ conv.i8
  IL_0004:  /* 13   | 05               */ stloc.s    V_5
  IL_0006:  /* 1F   | 4B               */ ldc.i4.s   75
  IL_0008:  /* 6A   |                  */ conv.i8
  IL_0009:  /* 13   | 06               */ stloc.s    V_6
  IL_000b:  /* 20   | 8D030000         */ ldc.i4     0x38d
  IL_0010:  /* 6A   |                  */ conv.i8
  IL_0011:  /* 13   | 07               */ stloc.s    V_7
  IL_0013:  /* 28   | (06)000002       */ call       int64 Project1.Program::IsDebuggerPresent()
  IL_0018:  /* 26   |                  */ pop
  IL_0019:  /* 11   | 06               */ ldloc.s    V_6
  IL_001b:  /* 2B   | 20               */ br.s       IL_003d
  IL_001d:  /* D6   |                  */ add.ovf
  IL_001e:  /* 13   | 05               */ stloc.s    V_5
  IL_0020:  /* 11   | 07               */ ldloc.s    V_7
  IL_0022:  /* 1D   |                  */ ldc.i4.7
  IL_0023:  /* 6A   |                  */ conv.i8
  IL_0024:  /* DA   |                  */ sub.ovf
  IL_0025:  /* 13   | 07               */ stloc.s    V_7
  IL_0027:  /* 11   | 07               */ ldloc.s    V_7
  IL_0029:  /* 11   | 05               */ ldloc.s    V_5
  IL_002b:  /* D6   |                  */ add.ovf
  IL_002c:  /* 13   | 06               */ stloc.s    V_6
  IL_002e:  /* 11   | 07               */ ldloc.s    V_7
  IL_0030:  /* 1D   |                  */ ldc.i4.7
  IL_0031:  /* 6A   |                  */ conv.i8
  IL_0032:  /* D6   |                  */ add.ovf
  IL_0033:  /* 13   | 07               */ stloc.s    V_7
  IL_0035:  /* 28   | (06)000002       */ call       int64 Project1.Program::IsDebuggerPresent()
  IL_003a:  /* 26   |                  */ pop
  IL_003b:  /* 11   | 06               */ ldloc.s    V_6

  IL_003d:  /* 11   | 07               */ ldloc.s    V_7
  IL_003f:  /* D6   |                  */ add.ovf
  IL_0040:  /* 13   | 05               */ stloc.s    V_5
  IL_0042:  /* 11   | 07               */ ldloc.s    V_7
  IL_0044:  /* 1D   |                  */ ldc.i4.7
  IL_0045:  /* 6A   |                  */ conv.i8
  IL_0046:  /* DA   |                  */ sub.ovf
  IL_0047:  /* 13   | 07               */ stloc.s    V_7
  IL_0049:  /* 11   | 07               */ ldloc.s    V_7
  IL_004b:  /* 11   | 05               */ ldloc.s    V_5
  IL_004d:  /* D6   |                  */ add.ovf
  IL_004e:  /* 13   | 06               */ stloc.s    V_6
  IL_0050:  /* 11   | 07               */ ldloc.s    V_7
  IL_0052:  /* 1D   |                  */ ldc.i4.7
  IL_0053:  /* 6A   |                  */ conv.i8

I'm not going to paste all of the code, since there is far too much of it. As we can notice, there are plenty of nops in the code, but that doesn't matter: a decompiler simply ignores them. Since I have developed an obfuscator in the past, I know what makes a decompiler crash. Thus, I started looking for jumps. I noticed one at the beginning of the code:

  IL_001b:  /* 2B   | 20               */ br.s       IL_003d
  IL_001d:  /* D6   |                  */ add.ovf
  IL_001e:  /* 13   | 05               */ stloc.s    V_5
  IL_0020:  /* 11   | 07               */ ldloc.s    V_7
  IL_0022:  /* 1D   |                  */ ldc.i4.7
  IL_0023:  /* 6A   |                  */ conv.i8
  IL_0024:  /* DA   |                  */ sub.ovf
  IL_0025:  /* 13   | 07               */ stloc.s    V_7
  IL_0027:  /* 11   | 07               */ ldloc.s    V_7
  IL_0029:  /* 11   | 05               */ ldloc.s    V_5
  IL_002b:  /* D6   |                  */ add.ovf
  IL_002c:  /* 13   | 06               */ stloc.s    V_6
  IL_002e:  /* 11   | 07               */ ldloc.s    V_7
  IL_0030:  /* 1D   |                  */ ldc.i4.7
  IL_0031:  /* 6A   |                  */ conv.i8
  IL_0032:  /* D6   |                  */ add.ovf
  IL_0033:  /* 13   | 07               */ stloc.s    V_7
  IL_0035:  /* 28   | (06)000002       */ call       int64 Project1.Program::IsDebuggerPresent()
  IL_003a:  /* 26   |                  */ pop
  IL_003b:  /* 11   | 06               */ ldloc.s    V_6

  IL_003d:  /* 11   | 07               */ ldloc.s    V_7

The jump is unconditional. The opcodes between offset 0x1B and 0x3D will never be executed: I checked all the code after that, and there's absolutely no reference to these opcodes. So, I nopped them (jump included) with the CFF Explorer and then decompiled again with Reflector:

[STAThread]
public static void main()
{
    long num6;
    long num2 = 0x63L;
    long num3 = 0x4bL;
    long num4 = 0x38dL;
    IsDebuggerPresent();
    num2 = num3 + num4;
    num4 -= 7L;
    num3 = num4 + num2;
    num4 += 7L;
    IsDebuggerPresent();
    num2 = num3 + num4;
    num4 -= 7L;
    num3 = num4 + num2;
    num4 += 7L;
    num2 = num3 + num4;
    num4 -= 7L;

It works, but now we're facing a code jungle. I only pasted a few instructions, since the jungle pattern is very simple. As you can see, it just repeats:

x = y + z;
z -= 7; / z += 7

I could have just removed all the jungle with Notepad, but a little CFF Explorer script to do the job seemed a much more elegant solution to me. You can identify the pattern from these two jungle examples:

  IL_003d:  /* 11   | 07               */ ldloc.s    V_7
  IL_003f:  /* D6   |                  */ add.ovf
  IL_0040:  /* 13   | 05               */ stloc.s    V_5
  IL_0042:  /* 11   | 07               */ ldloc.s    V_7
  IL_0044:  /* 1D   |                  */ ldc.i4.7
  IL_0045:  /* 6A   |                  */ conv.i8
  IL_0046:  /* DA   |                  */ sub.ovf
  IL_0047:  /* 13   | 07               */ stloc.s    V_7

  IL_0049:  /* 11   | 07               */ ldloc.s    V_7
  IL_004b:  /* 11   | 05               */ ldloc.s    V_5
  IL_004d:  /* D6   |                  */ add.ovf
  IL_004e:  /* 13   | 06               */ stloc.s    V_6
  IL_0050:  /* 11   | 07               */ ldloc.s    V_7
  IL_0052:  /* 1D   |                  */ ldc.i4.7
  IL_0053:  /* 6A   |                  */ conv.i8
  IL_0054:  /* D6   |                  */ add.ovf
  IL_0055:  /* 13   | 07               */ stloc.s    V_7

The instructions can change a bit, but it's still very simple to find a pattern. Here's the CFF Explorer script to de-jungle the code:

filename = GetOpenFile()

if filename == null then
   return
end

hFile = OpenFile(filename)

if hFile == null then
   return
end

-- nop the initial assignment of the variables
-- and also the jump that causes the decompiler
-- to crash

FillBytes(hFile, 0x105C, 0x49, 0)

jungle = { 0x11, ND, ND, ND, ND, ND, ND, 0x11, ND, 0x1D,
      0x6A, ND, 0x13, 0x07 }

Offset = SearchBytes(hFile, 0x105D, jungle)

while Offset != null do

   -- stop once the match lies past the end of the method body

   if Offset > 0x1050 + 1483 then
      break
   end

   -- nop jungle

   FillBytes(hFile, Offset , #jungle, 0)

   Offset = SearchBytes(hFile, Offset + 1, jungle)
end

if SaveFile(hFile) == true then
   MsgBox("dejungled")
end
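For readers who don't use the CFF Explorer, the search-and-nop loop of the script can be approximated in C++ as follows. This is only a sketch under my own assumptions: ND is modelled as -1, the functions findPattern and nopJungle are hypothetical names, and the code works on an in-memory byte vector rather than on a file handle. Unlike the script, it requires the whole match to lie inside the given range.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int ND = -1; // wildcard: matches any byte (mirrors ND in the script)

// Returns the first index >= start at which pattern matches, or -1.
long findPattern(const std::vector<uint8_t>& data,
                 const std::vector<int>& pattern, size_t start) {
    if (pattern.empty() || data.size() < pattern.size())
        return -1;
    for (size_t i = start; i + pattern.size() <= data.size(); ++i) {
        bool match = true;
        for (size_t j = 0; j < pattern.size(); ++j) {
            if (pattern[j] != ND && data[i + j] != uint8_t(pattern[j])) {
                match = false;
                break;
            }
        }
        if (match)
            return long(i);
    }
    return -1;
}

// Overwrites every match that lies completely inside [begin, end) with
// nop (0x00) and returns the number of matches that were patched.
size_t nopJungle(std::vector<uint8_t>& code, const std::vector<int>& pattern,
                 size_t begin, size_t end) {
    size_t count = 0;
    long pos = findPattern(code, pattern, begin);
    while (pos >= 0 && size_t(pos) + pattern.size() <= end) {
        std::fill(code.begin() + pos, code.begin() + pos + long(pattern.size()),
                  uint8_t(0x00));
        ++count;
        pos = findPattern(code, pattern, size_t(pos) + 1);
    }
    return count;
}
```

Using the same 14-entry pattern as the script, the second jungle example shown earlier (the bytes starting at IL_0049) is matched as a whole and replaced with nops.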

And now it's possible to decompile (and read) the code:

public static void main()
{
    long num6;
    IsDebuggerPresent();
    FileStream stream = new FileInfo("native.dll").OpenRead();
    long length = stream.Length;
    byte[] array = new byte[((int) length) + 1];
    stream.Read(array, 0, (int) length);
    stream.Close();
    long num7 = length;
    for (num6 = 0L; num6 <= num7; num6 += 1L)
    {
        array[(int) num6] = (byte) (array[(int) num6] ^ 0x37);
    }
    IntPtr destination = new IntPtr();
    destination = Marshal.AllocCoTaskMem((int) length);
    Marshal.Copy(array, 0, destination, (int) length);
    obfuscation8 delegateForFunctionPointer = (obfuscation8)
        Marshal.GetDelegateForFunctionPointer(destination, typeof(obfuscation8));
    IsDebuggerPresent();
    long num = new long();
    num = delegateForFunctionPointer();
    IsDebuggerPresent();
    stream = new FileInfo("cryxed.dll").OpenRead();
    length = stream.Length;
    byte[] buffer2 = new byte[((int) length) + 1];
    IsDebuggerPresent();
    stream.Read(buffer2, 0, (int) length);
    stream.Close();
    IsDebuggerPresent();
    long num8 = length;
    for (num6 = 0L; num6 <= num8; num6 += 1L)
    {
        buffer2[(int) num6] = (byte) (buffer2[(int) num6] ^ 0x37);
    }
    IsDebuggerPresent();
    Assembly assembly = Assembly.Load(buffer2);
    IsDebuggerPresent();
    object[] parameters = new object[1];
    string[] strArray = new string[] { "" };
    parameters[0] = strArray;
    assembly.EntryPoint.Invoke(null, parameters);
    IsDebuggerPresent();
}

This part of the code decrypts native.dll (with a XOR) and turns it into a callable native function:

    FileStream stream = new FileInfo("native.dll").OpenRead();
    long length = stream.Length;
    byte[] array = new byte[((int) length) + 1];
    stream.Read(array, 0, (int) length);
    stream.Close();
    long num7 = length;
    for (num6 = 0L; num6 <= num7; num6 += 1L)
    {
        array[(int) num6] = (byte) (array[(int) num6] ^ 0x37);
    }
    IntPtr destination = new IntPtr();
    destination = Marshal.AllocCoTaskMem((int) length);
    Marshal.Copy(array, 0, destination, (int) length);
    obfuscation8 delegateForFunctionPointer = (obfuscation8)
        Marshal.GetDelegateForFunctionPointer(destination, typeof(obfuscation8));

GetDelegateForFunctionPointer makes it possible to call a native function by passing the function's pointer. To view the code of the function, just open the CFF Explorer, go to Hex Editor, right click on the hex view and press Select All. Then right click again and click on Modify. Put the byte 0x37 in the value box and press ok. You now have the decrypted file which can be disassembled.
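The same decryption can also be done programmatically. The following is a minimal sketch, assuming the whole file is XORed byte by byte with the single key 0x37, as the decompiled loop suggests; the function name xorBytes is hypothetical and reading the file into the vector is left out.

```cpp
#include <cstdint>
#include <vector>

// XORs every byte with the key. Since XOR is its own inverse, the same
// function both encrypts and decrypts.
std::vector<uint8_t> xorBytes(const std::vector<uint8_t>& in,
                              uint8_t key = 0x37) {
    std::vector<uint8_t> out(in);
    for (uint8_t& b : out)
        b ^= key;
    return out;
}
```

Applying xorBytes to the contents of native.dll should give the same bytes as the hex editor procedure described above, and applying it a second time restores the encrypted file.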

Exactly the same approach is used to decrypt cryxed.dll, which is the protected .NET assembly. So, in order to obtain the assembly to rebuild, it's not necessary to dump it from memory: it can be obtained following the simple decryption approach just explained. It should be noted that the crackme uses the Assembly.Load approach which I have addressed earlier in the .NET loader paragraph. To sum up, the main function of the crackme hooks the JIT, then loads the protected assembly and invokes its entry point.

The first thing to do is to create a Rebel.NET report file out of either the decrypted cryxed.dll or the dumped assembly. The protected assembly is the unidentified one.

If the rebel report file has already been created, then it is possible to generate the rebuilding rebel file by clicking on "Generate Rebel File". If the ejection process succeeded, a message box informs the user about the success of the operation.

After having successfully created the rebuilding rebel file, a simple rebuilding with Rebel.NET will generate a fully decompilable / runnable assembly.

Now that we have the clean assembly, we can disassemble it. The UnpackMe.Form1 class contains three button events. Here's the first button event:

.method private instance void  Button1_Click(object sender,
                                             class [mscorlib]System.EventArgs e) cil managed
{
  // Code size       95 (0x5f)
  .maxstack  3
  .locals init (string V_0,
           string V_1,
           string V_2,
           string V_3,
           bool V_4)
  IL_0000:  nop
  IL_0001:  ldarg.0
  IL_0002:  callvirt   instance class [System.Windows.Forms]System.Windows.Forms.TextBox UnpackMe.Form1::get_TextBox1()
  IL_0007:  callvirt   instance string [System.Windows.Forms]System.Windows.Forms.TextBox::get_Text()
  IL_000c:  stloc.1
  IL_000d:  ldarg.0
  IL_000e:  callvirt   instance class [System.Windows.Forms]System.Windows.Forms.TextBox UnpackMe.Form1::get_TextBox2()
  IL_0013:  callvirt   instance string [System.Windows.Forms]System.Windows.Forms.TextBox::get_Text()
  IL_0018:  stloc.3
  IL_0019:  ldstr      "Such a naiive serial routine :D"
  IL_001e:  stloc.0
  IL_001f:  ldarg.0
  IL_0020:  ldloc.1
  IL_0021:  ldloc.0
  IL_0022:  callvirt   instance string UnpackMe.Form1::TripleDESEncode(string,
                                                                       string)
  IL_0027:  stloc.2
  IL_0028:  ldarg.0
  IL_0029:  callvirt   instance class [System.Windows.Forms]System.Windows.Forms.TextBox UnpackMe.Form1::get_TextBox2()
  IL_002e:  callvirt   instance string [System.Windows.Forms]System.Windows.Forms.TextBox::get_Text()
  IL_0033:  ldloc.2
  IL_0034:  ldc.i4.0
  IL_0035:  call       int32 [Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.Operators::CompareString(string,
                                                                                                                    string,
                                                                                                                    bool)
  IL_003a:  ldc.i4.0
  IL_003b:  ceq
  IL_003d:  stloc.s    V_4
  IL_003f:  ldloc.s    V_4
  IL_0041:  brfalse.s  IL_0050
  IL_0043:  ldstr      "Good work! Now go and post a solution or suggestio"
  + "ns so that I can improve the protector =)"
  IL_0048:  call       valuetype [System.Windows.Forms]System.Windows.Forms.DialogResult [System.Windows.Forms]System.Windows.Forms.MessageBox::Show(string)
  IL_004d:  pop
  IL_004e:  br.s       IL_005c
  IL_0050:  nop
  IL_0051:  ldstr      "Invalid Serial. Pls don't hack me :'("
  IL_0056:  call       valuetype [System.Windows.Forms]System.Windows.Forms.DialogResult [System.Windows.Forms]System.Windows.Forms.MessageBox::Show(string)
  IL_005b:  pop
  IL_005c:  nop
  IL_005d:  nop
  IL_005e:  ret
} // end of method Form1::Button1_Click

This is obviously the serial check routine. To overcome it and display the valid serial message box, it is only necessary to invert the branch at IL_0041, that is, to turn the brfalse.s (opcode 0x2C) into a brtrue.s (opcode 0x2D). This can easily be accomplished with the CFF Explorer.

Now, the crackme can be considered as solved, as it always shows the right message (except of course if you insert the valid name and serial, but solving a 3DES encryption just by taking a guess shouldn't be expected). Of course, the solved crackme along with its original files are available to download.

- Download Crackme + Solution

The crackme, of course, could have been solved in a different manner. But this goes beyond the scope of this paragraph which was a code ejection demonstration.
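The branch inversion used above amounts to a one-byte patch, which can be sketched as follows. The code operates on a buffer that is assumed to hold the IL body of Button1_Click (without the method header); the function names are hypothetical. The opcode values 0x2C (brfalse.s) and 0x2D (brtrue.s) are the standard short-form encodings, and a short branch target is the offset of the next instruction plus a signed 8-bit displacement.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint8_t BRFALSE_S = 0x2C;
constexpr uint8_t BRTRUE_S  = 0x2D;

// Target of a two-byte short branch located at offset 'at': the offset
// of the following instruction plus the signed 8-bit displacement.
uint32_t shortBranchTarget(const std::vector<uint8_t>& il, size_t at) {
    int8_t disp = static_cast<int8_t>(il[at + 1]);
    return uint32_t(int64_t(at) + 2 + disp);
}

// Swaps brfalse.s <-> brtrue.s at offset 'at'. Returns false if the
// byte at that offset is neither opcode or the offset is out of range.
bool invertShortBranch(std::vector<uint8_t>& il, size_t at) {
    if (at + 1 >= il.size())
        return false;
    if (il[at] == BRFALSE_S) { il[at] = BRTRUE_S;  return true; }
    if (il[at] == BRTRUE_S)  { il[at] = BRFALSE_S; return true; }
    return false;
}
```

For the listing above, the branch at IL_0041 carries the displacement 0x0D, so its target is 0x41 + 2 + 0x0D = 0x50, which matches the IL_0050 label. After the patch, the "Invalid Serial" path is taken only when the serial is actually valid.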

.NET Internals (Part 2: MethodDesc)

I said earlier that I was going to address the real meaning of the CORINFO_METHOD_HANDLE. So, that's what I'm doing in this paragraph.

I first became aware of the meaning of this pointer when I came across this code in jitinterface.cpp:

CHECK CheckContext(CORINFO_MODULE_HANDLE scopeHnd, CORINFO_CONTEXT_HANDLE context)
{
    CHECK_MSG(scopeHnd != NULL, "Illegal null scope");
    CHECK_MSG(((size_t) context & ~CORINFO_CONTEXTFLAGS_MASK) != NULL, "Illegal null context");
    if (((size_t) context & CORINFO_CONTEXTFLAGS_MASK) == CORINFO_CONTEXTFLAGS_CLASS)
    {
        TypeHandle handle((CORINFO_CLASS_HANDLE) ((size_t) context & ~CORINFO_CONTEXTFLAGS_MASK));
        CHECK_MSG(handle.GetModule() == GetModule(scopeHnd), "Inconsistent scope and context");
    }
    else
    {
        MethodDesc* handle = (MethodDesc*) ((size_t) context & ~CORINFO_CONTEXTFLAGS_MASK);
        CHECK_MSG(handle->GetModule() == GetModule(scopeHnd), "Inconsistent scope and context");
    }

    CHECK_OK;
}

Never mind the fact that a CORINFO_CONTEXT_HANDLE is the second argument of the function. The code which calls CheckContext passes a CORINFO_METHOD_HANDLE as context.
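What CheckContext does with the context argument is a classic tagged pointer decode: the low bit tells whether the handle is a class or a method, and the remaining bits are the pointer itself. The sketch below illustrates only that idea. The constant values are my assumption of what CorInfoContextFlags contains (method = 0, class = 1, mask = 1), and FakeMethodDesc and decodeContext are hypothetical names.

```cpp
#include <cstdint>

// Assumed values mirroring CorInfoContextFlags.
constexpr uintptr_t CONTEXTFLAGS_METHOD = 0x00;
constexpr uintptr_t CONTEXTFLAGS_CLASS  = 0x01;
constexpr uintptr_t CONTEXTFLAGS_MASK   = 0x01;

// Hypothetical placeholder for the object the handle points to.
struct FakeMethodDesc {
    int id;
};

struct DecodedContext {
    bool  isClass; // true: class handle, false: method handle
    void* pointer; // the handle with the flag bits stripped
};

// Splits a context handle into its flag and its pointer part, the way
// CheckContext does before casting to TypeHandle or MethodDesc*.
DecodedContext decodeContext(uintptr_t context) {
    DecodedContext d;
    d.isClass = (context & CONTEXTFLAGS_MASK) == CONTEXTFLAGS_CLASS;
    d.pointer = reinterpret_cast<void*>(context & ~CONTEXTFLAGS_MASK);
    return d;
}
```

This only works because the objects are aligned, so the low bit of a real pointer is always zero and is free to carry the flag.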

What can be concluded is that a CORINFO_METHOD_HANDLE is simply a pointer to a MethodDesc class. The MethodDesc class is one of the most important parts of the framework as it provides an incredible amount of information. The declaration of this class is inside the "clr\src\vm\method.hpp" file.

// The size of this structure needs to be a multiple of 8-bytes
//
// The following members insure that the size of this structure obeys this rule
//
// m_pDebugAlignPad
// m_dwAlign2
//
// If the layout of this struct changes, these may need to be revisited
// to make sure the size is a multiple of 8-bytes.
//
// @GENERICS:
// Method descriptors for methods belonging to instantiated types may be shared between compatible instantiations
// Hence for reflection and elsewhere where exact types are important it's necessary to pair a method desc
// with the exact owning type handle.
//
// See genmeth.cpp for details of instantiated generic method descriptors.
class MethodDesc
{
    friend class EEClass;
    friend class MethodTableBuilder;
    friend class ArrayClass;
    friend class NDirect;
    friend class InstantiatedMethodDesc;
    friend class MDEnums;
    friend class MethodImpl;
    friend class CheckAsmOffsets;
    friend class ClrDataAccess;
    friend class ZapMonitor;

    friend class MethodDescCallSite;

public:

    [...]


    inline BOOL HasStableEntryPoint()
    {
        LEAF_CONTRACT;
        return (m_bFlags2 & enum_flag2_HasStableEntryPoint) != 0;
    }

    inline TADDR GetStableEntryPoint()
    {
        WRAPPER_CONTRACT;
        _ASSERTE(HasStableEntryPoint());
        return *GetAddrOfSlotUnchecked();
    }

    BOOL SetStableEntryPointInterlocked(TADDR addr);

    BOOL HasTemporaryEntryPoint();
    TADDR GetTemporaryEntryPoint();

    void SetTemporaryEntryPoint(BaseDomain *pDomain, AllocMemTracker *pamTracker);

    inline BOOL HasPrecode()
    {
        LEAF_CONTRACT;
        return (m_bFlags2 & enum_flag2_HasPrecode) != 0;
    }

    inline void SetHasPrecode()
    {
        LEAF_CONTRACT;
        m_bFlags2 |= (enum_flag2_HasPrecode | enum_flag2_HasStableEntryPoint);
    }

    inline void ResetHasPrecode()
    {
        LEAF_CONTRACT;
        m_bFlags2 &= ~enum_flag2_HasPrecode;
        m_bFlags2 |= enum_flag2_HasStableEntryPoint;
    }

    inline Precode* GetPrecode()
    {
        LEAF_CONTRACT;
        PRECONDITION(HasPrecode());
        Precode* pPrecode = Precode::GetPrecodeFromEntryPoint(GetStableEntryPoint());
        PREFIX_ASSUME(pPrecode != NULL);
        return pPrecode;
    }

    inline BOOL MayHavePrecode()
    {
        WRAPPER_CONTRACT;
        return !MayHaveNativeCode() || PrestubMayInsertStub() || RequiresPrestub();
    }

    void InterlockedUpdateFlags2(BYTE bMask, BOOL fSet);

    Precode* GetOrCreatePrecode();



    inline BYTE* GetCallablePreStubAddr()
    {
        WRAPPER_CONTRACT;


        return HasStableEntryPoint() ? (BYTE*)GetStableEntryPoint() : (BYTE*)GetTemporaryEntryPoint();

    }

    // return the address of the stub
    static inline MethodDesc* GetMethodDescFromStubAddr(TADDR addr, BOOL fSpeculative = FALSE);

    DWORD GetAttrs();

    DWORD GetImplAttrs();

    // This function can lie if a method impl was used to implement
    // more than one method on this class. Use GetName(int) to indicate
    // which slot you are interested in.
    // See the TypeString class for better control over name formatting.
    LPCUTF8 GetName();

    LPCUTF8 GetName(USHORT slot);

    FORCEINLINE LPCUTF8 GetNameOnNonArrayClass()
    {
        WRAPPER_CONTRACT;
        return (GetMDImport()->GetNameOfMethodDef(GetMemberDef()));
    }

    COUNT_T GetStableHash();

    // Non-zero for InstantiatedMethodDescs
    DWORD GetNumGenericMethodArgs();

    // Return the number of class type parameters that are in scope for this method
    DWORD GetNumGenericClassArgs()
    {
        WRAPPER_CONTRACT;
        return GetMethodTable()->GetNumGenericArgs();
    }

    BOOL IsGenericMethodDefinition();

    // True if the declaring type or instantiation of method (if any) contains formal generic type parameters
    BOOL ContainsGenericVariables();

    Module* GetDefiningModuleForOpenMethod();

    // True if this has a class or method instantiation that is anything other than
    BOOL HasNonObjectClassOrMethodInstantiation();


    // True if and only if this is a method desriptor for :
    // 1. a non-generic method or a generic method at its typical method instantiation
    // 2. in a non-generic class or a typical instantiation of a generic class
    // This method can be called on a non-restored method desc
    BOOL IsTypicalMethodDefinition();

The size of this class declaration is impressive because of the number of methods it contains. I could only paste a very small part, just to give the reader an idea. The comments above the class declaration recall the data pointed to by a CORINFO_METHOD_HANDLE, which was also 8-byte aligned.

This is what can be found at the end of the MethodDesc class declaration:

    //================================================================
    // The actual data stored in a MethodDesc follows.

protected:
    UINT16      m_wTokenRemainder;
    BYTE        m_chunkIndex;

    enum {
        // enum_flag2_HasPrecode implies that enum_flag2_HasStableEntryPoint is set.
        enum_flag2_HasStableEntryPoint      = 0x01,   // The method entrypoint is stable (either precode or actual code)
        enum_flag2_HasPrecode               = 0x02,   // Precode has been allocated for this method

        enum_flag2_IsUnboxingStub           = 0x04,
        enum_flag2_MayHaveNativeCode        = 0x08,   // May have jitted code, ngened code or fcall entrypoint.
    };
    BYTE        m_bFlags2;

    // The slot number of this MethodDesc in the vtable array.
    WORD           m_wSlotNumber;

    // Flags.
    WORD           m_wFlags;

And this data exactly matches my previous intuition. We can now treat every CORINFO_METHOD_HANDLE as a pointer to a MethodDesc class. Of course, including the whole MethodDesc class would be rather painful given its complexity. But one could write one's own simplified version of the MethodDesc class: all that is necessary is to include the members I pasted above, which add up to a class size that is a multiple of 8 bytes.
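A simplified version along those lines might look like the sketch below. It is not the framework's declaration: SimpleMethodDesc and asMethodDesc are hypothetical names, only the data members and flags pasted above are reproduced, and it assumes that nothing (such as a vtable pointer) precedes these fields in memory. The layout is tied to the framework version examined here and should be verified before relying on it.

```cpp
#include <cstdint>

// Simplified, data-only view of a MethodDesc, using the members shown
// in the declaration above. Assumption: no hidden fields precede them.
struct SimpleMethodDesc {
    uint16_t m_wTokenRemainder;
    uint8_t  m_chunkIndex;
    uint8_t  m_bFlags2;
    uint16_t m_wSlotNumber;   // slot number in the vtable array
    uint16_t m_wFlags;

    enum : uint8_t {
        enum_flag2_HasStableEntryPoint = 0x01,
        enum_flag2_HasPrecode          = 0x02,
        enum_flag2_IsUnboxingStub      = 0x04,
        enum_flag2_MayHaveNativeCode   = 0x08,
    };

    bool HasStableEntryPoint() const {
        return (m_bFlags2 & enum_flag2_HasStableEntryPoint) != 0;
    }
    bool HasPrecode() const {
        return (m_bFlags2 & enum_flag2_HasPrecode) != 0;
    }
    bool MayHaveNativeCode() const {
        return (m_bFlags2 & enum_flag2_MayHaveNativeCode) != 0;
    }
};

// The members add up to exactly 8 bytes, as the source comments require.
static_assert(sizeof(SimpleMethodDesc) == 8,
              "SimpleMethodDesc must be 8 bytes");

// Reinterprets a method handle (for example the CORINFO_METHOD_HANDLE
// seen by a compileMethod hook) as the simplified view.
inline const SimpleMethodDesc* asMethodDesc(const void* methodHandle) {
    return static_cast<const SimpleMethodDesc*>(methodHandle);
}
```

Inside a compileMethod hook one could then, for instance, read asMethodDesc(handle)->m_wSlotNumber or test HasPrecode(), without pulling in any of the framework headers.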

The MethodDesc class is useful for many purposes and its use is rather safe, since it is not supposed to change any time soon. And even if it does, its data members (excluding the methods) are rather few, so I guess it won't be difficult to keep a simplified MethodDesc class working.

I would have provided an example of how to use the MethodDesc class myself, but the article is already rather big as I write this and, although it's too late to keep it short, I'm still hoping to keep it readable. In fact, the journey into the .NET framework internals is not yet concluded and some things still have to be discussed.

.NET Internals (Part 3: IEE, Internal Calls, etc.)

Apart from the JIT, there are other very interesting parts of the .NET framework which should be discussed. Of course, I can't discuss them all in these two articles; I'm just trying to give the reader an idea of how easily they can be explored.

Some things have to be said about the execution engine, even though the interface which can be easily retrieved from mscorwks is not very interesting. In this paragraph I'm also addressing things which seem to be useful but really aren't.

The mscorwks.dll module exports a function named IEE which could intrigue a reverser. However, the internals of this API are rather disappointing:

// This is the instance that exposes interfaces out to all the other DLLs of the CLR
// so they can use our services for TLS, synchronization, memory allocation, etc.
static BYTE g_CEEInstance[sizeof(CExecutionEngine)];
static IExecutionEngine * g_pCEE = NULL;

PTLS_CALLBACK_FUNCTION CExecutionEngine::Callbacks[MAX_PREDEFINED_TLS_SLOT];

extern "C" IExecutionEngine * __stdcall IEE()
{
    LEAF_CONTRACT;




    if ( !g_pCEE )
    {
        // Create a local copy on the stack and then copy it over to the static instance.
        // This avoids race conditions caused by multiple initializations of vtable in the constructor
       CExecutionEngine local;
       memcpy(&g_CEEInstance, &local, sizeof(CExecutionEngine));

       g_pCEE = (IExecutionEngine *)(CExecutionEngine*)&g_CEEInstance;
    }
    //END_ENTRYPOINT_VOIDRET;

    return g_pCEE;
}
 

As can be seen from the comments, this function only offers an interface for TLS, synchronization and memory allocation. In fact, this is the declaration of the returned class:

// We have an internal class that can be used to expose EE functionality to other CLR
// DLLs, via the deliberately obscure IEE DLL exports from the shim and the EE
class CExecutionEngine : public IExecutionEngine, public IEEMemoryManager
{
    //***************************************************************************
    // public API:
    //***************************************************************************
public:

    // Notification of a DLL_THREAD_DETACH or a Thread Terminate.
    static void ThreadDetaching(void **pTlsData);

    // Delete on TLS block
    static void DeleteTLS(void **pTlsData);

    // Fiber switch notifications
    static void SwitchIn();
    static void SwitchOut();

    static void **CheckThreadState(DWORD slot, BOOL force = TRUE);
    static void **CheckThreadStateNoCreate(DWORD slot);

    // Setup FLS simulation block, including ClrDebugState and StressLog.
    static void SetupTLSForThread(Thread *pThread);

    static DWORD GetTlsIndex () {return TlsIndex;}

    static BOOL HasDetachedTlsInfo();

    static void CleanupDetachedTlsInfo();

    static void DetachTlsInfo(void **pTlsData);

    //***************************************************************************
    // private implementation:
    //***************************************************************************
private:

    // The debugger needs access to the TlsIndex so that we can read it from OOP.
    friend class EEDbgInterfaceImpl;

    SVAL_DECL (DWORD, TlsIndex);

    static PTLS_CALLBACK_FUNCTION Callbacks[MAX_PREDEFINED_TLS_SLOT];


    //***************************************************************************
    // IUnknown methods
    //***************************************************************************

    HRESULT STDMETHODCALLTYPE QueryInterface(
            REFIID id,
            void **pInterface);

    ULONG STDMETHODCALLTYPE AddRef();

    ULONG STDMETHODCALLTYPE Release();

    //***************************************************************************
    // IExecutionEngine methods for TLS
    //***************************************************************************

    // Associate a callback for cleanup with a TLS slot
    VOID  STDMETHODCALLTYPE TLS_AssociateCallback(
            DWORD slot,
            PTLS_CALLBACK_FUNCTION callback);

    // May be called once to get the master TLS block slot for fast Get/Set operations
    DWORD STDMETHODCALLTYPE TLS_GetMasterSlotIndex();

    // Get the value at a slot
    LPVOID STDMETHODCALLTYPE TLS_GetValue(DWORD slot);

    // Get the value at a slot, return FALSE if TLS info block doesn't exist
    BOOL STDMETHODCALLTYPE TLS_CheckValue(DWORD slot, LPVOID * pValue);

    // Set the value at a slot
    VOID STDMETHODCALLTYPE TLS_SetValue(DWORD slot, LPVOID pData);

    // Free TLS memory block and make callback
    VOID STDMETHODCALLTYPE TLS_ThreadDetaching();
   
    //***************************************************************************
    // IExecutionEngine methods for locking
    //***************************************************************************

    CRITSEC_COOKIE STDMETHODCALLTYPE CreateLock(LPCSTR szTag, LPCSTR level, CrstFlags flags);

    void STDMETHODCALLTYPE DestroyLock(CRITSEC_COOKIE lock);

    void STDMETHODCALLTYPE AcquireLock(CRITSEC_COOKIE lock);

    void STDMETHODCALLTYPE ReleaseLock(CRITSEC_COOKIE lock);

    EVENT_COOKIE STDMETHODCALLTYPE CreateAutoEvent(BOOL bInitialState);
    EVENT_COOKIE STDMETHODCALLTYPE CreateManualEvent(BOOL bInitialState);
    void STDMETHODCALLTYPE CloseEvent(EVENT_COOKIE event);
    BOOL STDMETHODCALLTYPE ClrSetEvent(EVENT_COOKIE event);
    BOOL STDMETHODCALLTYPE ClrResetEvent(EVENT_COOKIE event);
    DWORD STDMETHODCALLTYPE WaitForEvent(EVENT_COOKIE event, DWORD dwMilliseconds, BOOL bAlertable);
    DWORD STDMETHODCALLTYPE WaitForSingleObject(HANDLE handle, DWORD dwMilliseconds);

    SEMAPHORE_COOKIE STDMETHODCALLTYPE ClrCreateSemaphore(DWORD dwInitial, DWORD dwMax);
    void STDMETHODCALLTYPE ClrCloseSemaphore(SEMAPHORE_COOKIE semaphore);
    DWORD STDMETHODCALLTYPE ClrWaitForSemaphore(SEMAPHORE_COOKIE semaphore, DWORD dwMilliseconds, BOOL bAlertable);
    BOOL STDMETHODCALLTYPE ClrReleaseSemaphore(SEMAPHORE_COOKIE semaphore, LONG lReleaseCount, LONG *lpPreviousCount);

    MUTEX_COOKIE STDMETHODCALLTYPE ClrCreateMutex(LPSECURITY_ATTRIBUTES lpMutexAttributes,
                                                  BOOL bInitialOwner,
                                                  LPCTSTR lpName);
    void STDMETHODCALLTYPE ClrCloseMutex(MUTEX_COOKIE mutex);
    BOOL STDMETHODCALLTYPE ClrReleaseMutex(MUTEX_COOKIE mutex);
    DWORD STDMETHODCALLTYPE ClrWaitForMutex(MUTEX_COOKIE mutex,
                                            DWORD dwMilliseconds,
                                            BOOL bAlertable);

    DWORD STDMETHODCALLTYPE ClrSleepEx(DWORD dwMilliseconds, BOOL bAlertable);

    BOOL STDMETHODCALLTYPE ClrAllocationDisallowed();

    void STDMETHODCALLTYPE GetLastThrownObjectExceptionFromThread(void **ppvException);

    //***************************************************************************
    // IEEMemoryManager methods for locking
    //***************************************************************************
    LPVOID STDMETHODCALLTYPE ClrVirtualAlloc(LPVOID lpAddress, SIZE_T dwSize, DWORD flAllocationType, DWORD flProtect);
    BOOL STDMETHODCALLTYPE ClrVirtualFree(LPVOID lpAddress, SIZE_T dwSize, DWORD dwFreeType);
    SIZE_T STDMETHODCALLTYPE ClrVirtualQuery(LPCVOID lpAddress, PMEMORY_BASIC_INFORMATION lpBuffer, SIZE_T dwLength);
    BOOL STDMETHODCALLTYPE ClrVirtualProtect(LPVOID lpAddress, SIZE_T dwSize, DWORD flNewProtect, PDWORD lpflOldProtect);
    HANDLE STDMETHODCALLTYPE ClrGetProcessHeap();
    HANDLE STDMETHODCALLTYPE ClrHeapCreate(DWORD flOptions, SIZE_T dwInitialSize, SIZE_T dwMaximumSize);
    BOOL STDMETHODCALLTYPE ClrHeapDestroy(HANDLE hHeap);
    LPVOID STDMETHODCALLTYPE ClrHeapAlloc(HANDLE hHeap, DWORD dwFlags, SIZE_T dwBytes);
    BOOL STDMETHODCALLTYPE ClrHeapFree(HANDLE hHeap, DWORD dwFlags, LPVOID lpMem);
    BOOL STDMETHODCALLTYPE ClrHeapValidate(HANDLE hHeap, DWORD dwFlags, LPCVOID lpMem);
    HANDLE STDMETHODCALLTYPE ClrGetProcessExecutableHeap();
   
};

IExecutionEngine and IEEMemoryManager are just interfaces. Thus, no additional functionality is provided by the IEE interface.

The framework also exports two functions which may catch the reverser's attention: GetRealProcAddress and GetCLRFunction. Unfortunately, they're both useless. GetRealProcAddress is only a call to LoadLibrary("mscorwks.dll") followed by a GetProcAddress:

extern "C"
STDAPI GetRealProcAddress(LPCSTR pwszProcName, VOID** ppv)
{
    if(!ppv)
    {
        return E_POINTER;
    }

    HMODULE hLib = GetLibrary(LIB_mscorwks);
    if(hLib == NULL)
    {
        return HRESULT_FROM_GetLastError();
    }

    *ppv = (void*) GetProcAddress(hLib,pwszProcName);
    if(*ppv == NULL)
    {
        return HRESULT_FROM_GetLastError();
    }
    return S_OK;
}

And GetCLRFunction can only retrieve the address of three functions and won't accept any other.

.text:79EA0C1B ; int __stdcall GetCLRFunction(char *)
.text:79EA0C1B public ?GetCLRFunction@@YGPAXPBD@Z
.text:79EA0C1B ?GetCLRFunction@@YGPAXPBD@Z proc near
.text:79EA0C1B
.text:79EA0C1B [...]
.text:79EA0C1B
.text:79EA0C44 mov esi, [ebp+arg_0]
.text:79EA0C47 push offset aClrloadlibrary ; "CLRLoadLibraryEx"
.text:79EA0C4C push esi ; char *
.text:79EA0C4D call _strcmp
.text:79EA0C52 test eax, eax
.text:79EA0C54 pop ecx
.text:79EA0C55 pop ecx
.text:79EA0C56 jz loc_79EEF97B
.text:79EA0C5C push offset aClrfreelibrary ; "CLRFreeLibrary"
.text:79EA0C61 push esi ; char *
.text:79EA0C62 call _strcmp
.text:79EA0C67 test eax, eax
.text:79EA0C69 pop ecx
.text:79EA0C6A pop ecx
.text:79EA0C6B jz loc_7A0D7B8F
.text:79EA0C71 push offset aEeheapallocinp ; "EEHeapAllocInProcessHeap"
.text:79EA0C76 push esi ; char *
.text:79EA0C77 call _strcmp
.text:79EA0C7C test eax, eax
.text:79EA0C7E pop ecx
.text:79EA0C7F pop ecx
.text:79EA0C80 jnz loc_79ED7512
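Read as C++, the listing above amounts to roughly the following sketch. The three string constants come from the disassembly; everything else is an assumption made for illustration: the function name GetCLRFunctionSketch, the three stub targets standing in for the real internal functions, and the nullptr returned on the failure path, which is not visible in the listing.

```cpp
#include <cassert>
#include <cstring>

using GenericFn = void (*)();

// Hypothetical stand-ins for the three internal functions whose
// addresses the real GetCLRFunction hands out.
static void StubLoadLibraryEx() {}
static void StubFreeLibrary() {}
static void StubHeapAllocInProcessHeap() {}

// Sketch of the control flow seen in the disassembly: the requested
// name is compared against three fixed strings, one after the other.
GenericFn GetCLRFunctionSketch(const char *name)
{
    assert(name != nullptr);

    if (std::strcmp(name, "CLRLoadLibraryEx") == 0)
        return &StubLoadLibraryEx;

    if (std::strcmp(name, "CLRFreeLibrary") == 0)
        return &StubFreeLibrary;

    if (std::strcmp(name, "EEHeapAllocInProcessHeap") == 0)
        return &StubHeapAllocInProcessHeap;

    // Assumption: any other name is rejected. What the real function
    // does at this point is not shown in the listing.
    return nullptr;
}
```

The point of the sketch is only that the lookup is a closed list: no name outside those three strings can ever be resolved through this export.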

I had to disassemble the function, because GetCLRFunction is not available in the Rotor project. Now that I got those two out of the way, I can talk about an interesting topic: internal calls.

Internal calls are methods implemented natively by the framework which can be called from managed code, although only in a very limited way, as we'll see later.

Such functions are defined in "clr\src\vm\ecall.cpp" in this way:

FCFuncStart(gExceptionFuncs)
    FCFuncElement("GetClassName", ExceptionNative::GetClassName)
    FCFuncElement("IsImmutableAgileException", ExceptionNative::IsImmutableAgileException)
    FCFuncElement("_InternalGetMethod", SystemNative::CaptureStackTraceMethod)
    FCFuncElement("nIsTransient", ExceptionNative::IsTransient)
    FCFuncElement("GetMessageFromNativeResources", ExceptionNative::GetMessageFromNativeResources)
FCFuncEnd()

FCFuncStart(gSafeHandleFuncs)
    FCFuncElement("InternalDispose", SafeHandle::DisposeNative)
    FCFuncElement("InternalFinalize", SafeHandle::Finalize)
    FCFuncElement("SetHandleAsInvalid", SafeHandle::SetHandleAsInvalid)
    FCFuncElement("DangerousAddRef", SafeHandle::DangerousAddRef)
    FCFuncElement("DangerousRelease", SafeHandle::DangerousRelease)
FCFuncEnd()

FCFuncStart(gCriticalHandleFuncs)
    FCFuncElement("FireCustomerDebugProbe", CriticalHandle::FireCustomerDebugProbe)
FCFuncEnd()

FCFuncStart(gPathFuncs)
FCFuncEnd()

FCFuncStart(gFusionWrapFuncs)
    FCFuncElement("GetNextAssembly",  FusionWrap::GetNextAssembly)
    FCFuncElement("GetDisplayName",  FusionWrap::GetDisplayName)
    FCFuncElement("ReleaseFusionHandle",  FusionWrap::ReleaseFusionHandle)
FCFuncEnd()

// etc.
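Conceptually, each FCFuncStart/FCFuncEnd block is a table pairing a managed-side name with a native function. The following is a minimal sketch of that idea, not the real macros from the CLR sources, which are more involved: ECFuncSketch, FindECall, the macro bodies, the table name and the SafeHandleSketch functions are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>

using NativeFn = void (*)();

// Hypothetical table entry: managed-side name plus native implementation.
struct ECFuncSketch
{
    const char *managedName;
    NativeFn    nativeImpl;
};

// Simplified, hypothetical versions of the table macros. Each table
// ends with a null entry so that it can be walked without a length.
#define FCFuncStart(table)        static const ECFuncSketch table[] = {
#define FCFuncElement(name, impl) { name, &impl },
#define FCFuncEnd()               { nullptr, nullptr } };

// Stand-ins for native implementations such as SafeHandle::DisposeNative.
namespace SafeHandleSketch
{
    void DisposeNative() {}
    void SetHandleAsInvalid() {}
}

FCFuncStart(gSafeHandleFuncsSketch)
    FCFuncElement("InternalDispose", SafeHandleSketch::DisposeNative)
    FCFuncElement("SetHandleAsInvalid", SafeHandleSketch::SetHandleAsInvalid)
FCFuncEnd()

// Resolves a managed-side name to its native function by walking the
// table up to the terminating null entry.
NativeFn FindECall(const ECFuncSketch *table, const char *managedName)
{
    assert(table != nullptr && managedName != nullptr);

    for (const ECFuncSketch *entry = table; entry->managedName != nullptr; ++entry)
    {
        if (std::strcmp(entry->managedName, managedName) == 0)
            return entry->nativeImpl;
    }
    return nullptr; // name not present in this table
}
```

With such a table, FindECall(gSafeHandleFuncsSketch, "InternalDispose") yields the address of the native stand-in, while a name that is not listed yields nullptr.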

The first argument of FCFuncElement specifies the name of the function in the managed context, whereas the second one specifies the location of the function. The syntax to access one of these ecalls (I suppose it stands for engine calls) is the following:

[MethodImpl(MethodImplOptions.InternalCall)]
internal extern type ECallMethodName();

In order to use MethodImpl, one has to include the System.Runtime.CompilerServices namespace. The problem is that, even though you can declare such a call in your project, when you try to actually call one of these internal calls, the framework refuses and delivers an error message instead.

These functions are, in fact, wrapped by the framework. Of course, I didn't introduce internal calls just to take note of that. The interesting part is the interaction between managed code and internal calls. Let's take for instance this ecall:

FCIMPL2(MethodBody *, RuntimeMethodHandle::GetMethodBody, MethodDesc **ppMethod, EnregisteredTypeHandle enregDeclaringTypeHandle)

// MethodBody * RuntimeMethodHandle::GetMethodBody(MethodDesc **, EnregisteredTypeHandle)

The _GetMethodBody internal call takes a pointer to a MethodDesc pointer as its first parameter. The first managed wrapping of this function happens in mscorlib ("clr\src\bcl\system\runtimehandles.cs"):

        [MethodImpl(MethodImplOptions.InternalCall)]
        internal extern MethodBody _GetMethodBody(IntPtr declaringType);
        internal MethodBody GetMethodBody(RuntimeTypeHandle declaringType)
        {
            return _GetMethodBody(declaringType.Value);
        }

The first parameter disappears and becomes implicit. The class which contains this method also defines the implicit parameter at the beginning:

[Serializable()]
[System.Runtime.InteropServices.ComVisible(true)]
    public unsafe struct RuntimeMethodHandle : ISerializable
    {
        internal static RuntimeMethodHandle EmptyHandle { get { return new RuntimeMethodHandle(null); } }
       
        private IntPtr m_ptr;

The m_ptr field is private, so it can't normally be accessed from the outside. But maybe there's another way to obtain an equivalent value...

        // ISerializable interface
        private RuntimeMethodHandle(SerializationInfo info, StreamingContext context)
        {
            if(info == null)
                throw new ArgumentNullException("info");
           
            MethodInfo m =(RuntimeMethodInfo)info.GetValue("MethodObj", typeof(RuntimeMethodInfo));

            m_ptr = m.MethodHandle.Value;
           
            if(m_ptr.ToPointer() == null)
                throw new SerializationException(Environment.GetResourceString("Serialization_InsufficientState"));
        }

MethodHandle.Value is a public property. Thus, we can obtain the same value contained in m_ptr through the MethodInfo class. And m_ptr is just a pointer to a MethodDesc class, also known as CORINFO_METHOD_HANDLE. So, in order to obtain a MethodDesc pointer through managed code one can write this kind of code:

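// requires: using System.Reflection;
// event handlers are usually private, so non-public methods are searched too
MethodInfo mi = typeof(Form1).GetMethod("button1_Click",
    BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
// displays the MethodDesc pointer in hexadecimal
MessageBox.Show(mi.MethodHandle.Value.ToString("X"));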

The point I wanted to make is that it's possible to access part of the .NET internals from managed code as well. Looking at the interaction between managed code and ecalls is one good way to discover some interesting things.

Other Injection/Ejection Approaches

Digging into the .NET framework internals opens up many new possibilities. For instance, hooking one of the MSIL related methods in the MethodDesc class could be an alternative way of code injection. The truth is that there isn't just one way to inject code, just like there isn't only one way to eject MSIL code. In fact, code ejection can go much further than code injection. In this article I presented a very simple, non-intrusive solution to retrieve the original MSIL of an assembly, but if one wants to become serious about code ejection, one could consider using a modified version of the Rotor (or Mono) project to retrieve the original MSIL. Or, to keep it simpler, modifying the official .NET framework, though not legal, might be a valid option. In either case, a code injector simply can't protect the original MSIL when the code ejection process is brought that far. There's nothing such a protection can do when the code ejector is the framework itself. That's why I said from the beginning that code injection protections are weak: they can hide the code only as long as the reverser doesn't decide to become serious about retrieving the MSIL code.

Conclusions

As I've never read a book nor an article about the CLR infrastructure, what has been presented in this article are the .NET internals from the perspective of a reverser. Having the (almost complete) source code of the .NET framework made things very easy and the days of research (development included) spent to write this article can be counted on a hand with only two fingers. It has been a much bigger effort writing the article. An effort which can only be compared to the pain one endures from actually reading it. The next article of this kind will be about .NET native compiling. It'll surely be less boring as I don't have to re-explain the basics of .NET internals already covered in this article.

Daniel Pistelli