The Cluster Computing Virtualization in Windows

This article was released on www.codeproject.com.

Description: How to establish cluster computing virtualization on a Windows system.

Overview

Virtualization techniques make it possible to evaluate code without access to the target machine. For instance, a Windows executable file can run on a Linux system by handling its import table and simulating the Windows components, as the WineHQ project does [1]. In other cases, x86 instructions run on another architecture such as Alpha by emulating them, as in the Bochs project [2]. The purpose of this article is not full virtualization. It explains a simple method that can serve as the base structure for a Message Passing Interface (MPI) environment in Visual C++. It shows how to load several executable files into a running process and run each of them in its own thread. The loaded programs communicate with each other through two functions that follow the MPI protocol. The approach is presented here for Portable Executable (PE) files, but the same technique could also be implemented for the Executable and Linkable Format (ELF) files of Linux systems.

Tracing cluster computing code raises several problems. The first is the cost of purchasing a large number of machines. The second is the difficulty of tracing code in a distributed system. One alternative is to use a Virtual Server and Virtual Machines: the Virtual Server emulates the network communication, and the Virtual Machines emulate the system and the machine. However, this approach has its own problems in addition to the delay of emulating the code and the network. This is why programs have been developed to virtualize cluster programming according to the MPI protocol. MPICH2 [3] is a well-known example. Most programs of this kind were written primarily for UNIX systems. Some of them have Windows versions, but these have not been adopted on Windows as widely as on Linux, because they port the UNIX source through some threading or emulation layer. For these reasons, the author presents a simple technique for implementing cluster computing virtualization on Windows systems. He claims that this method is much faster and more convenient than previous methods. It uses the Visual C++ compiler, which speeds up the evaluation of MPI code on Windows systems.

1. Introduction

The LoadLibrary function is the normal way to load a module into a process. It maps a given module only once; it will not load the same module a second time. Our intention is to load several modules into one process and let them communicate with each other through the MPI protocol. Note that each of these modules is an ordinary portable executable file. A portable executable file contains headers and sections. The headers describe the position and size of the information in the file. We can use the headers to map the sections, but that is not all that has to be done. We must also correct particular data at particular addresses by using the import table and the relocation table. We call these steps the fix-up of the import table and the fix-up of the relocation table. A more thorough virtualization technique would recompile the code dynamically, but we have chosen the easiest way, the fix-up of the relocation table. Relying on the relocation table lets us bypass all the work of building a code translator and a recompiling engine. For experts, it would be worthwhile to combine this project with the VX32 project [4], an excellent project that could complete this one. The only reason I did not use it is that I did not want to deal with the difficulties of a code recompiling engine. My intention is to establish an MPI environment on a Windows system in a short time.

    I also want to build an environment that emulates the communication between parallel machines. One choice is to use pipes to connect separate processes to each other. This is the simple way, but we would lose control over the cluster computing program as a whole. It is hard to trace a pipe, and hard to build a debugger that traces it. Our intention is to build a complete environment that can serve as a foundation for tracing the cluster computing program. Be aware that tracing one process with multiple threads is much easier than tracing a job made of multiple processes on a Windows system. This is why, in this project, a single process hosts the virtual processes as separate threads. It also means that we can build debugger tools to trace the cluster program in a future project.

2. Prime Insight

    A PE file consists of headers and sections, analogous to an ELF file. The headers hold the information needed to map and initialize an executable file during process creation, FIG. 1.

FIG. 1. Portable Executable file format [5].
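As a minimal sketch of how the headers can be located in a file that has been read into memory, the following portable C++ reads the few fields needed to find the section table. The type `PeLayout` and the functions `readAt` and `locatePeHeaders` are hypothetical names chosen for this illustration, not part of the project's code. The offsets used (0x3C for the offset of the NT headers, a 20-byte file header after the 4-byte signature) follow the published PE layout, and the sketch assumes a little-endian host such as x86.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical summary of the header positions; names are for this sketch only.
struct PeLayout {
    std::uint32_t ntHeadersOffset;    // file offset of the "PE\0\0" signature
    std::uint16_t numberOfSections;   // taken from the file header
    std::uint32_t sectionTableOffset; // first section header, after the optional header
};

// Bounds-checked read of a little-endian value (assumes a little-endian host).
template <typename T>
static bool readAt(const std::vector<std::uint8_t>& file, std::size_t offset, T& out) {
    if (offset > file.size() || file.size() - offset < sizeof(T)) return false;
    std::memcpy(&out, file.data() + offset, sizeof(T));
    return true;
}

// Returns false when the buffer does not look like a PE file.
bool locatePeHeaders(const std::vector<std::uint8_t>& file, PeLayout& layout) {
    std::uint16_t mz = 0;
    if (!readAt(file, 0, mz) || mz != 0x5A4D) return false;            // "MZ"

    std::uint32_t ntOffset = 0;
    if (!readAt(file, 0x3C, ntOffset)) return false;                   // offset of NT headers

    std::uint32_t signature = 0;
    if (!readAt(file, ntOffset, signature) || signature != 0x00004550) // "PE\0\0"
        return false;

    // File header follows the signature: NumberOfSections at +2,
    // SizeOfOptionalHeader at +16, total length 20 bytes.
    std::uint16_t sections = 0, optionalSize = 0;
    if (!readAt(file, std::size_t(ntOffset) + 4 + 2, sections)) return false;
    if (!readAt(file, std::size_t(ntOffset) + 4 + 16, optionalSize)) return false;

    layout.ntHeadersOffset = ntOffset;
    layout.numberOfSections = sections;
    layout.sectionTableOffset = ntOffset + 4 + 20 + optionalSize;
    return true;
}
```

A caller would read the executable into a `std::vector<std::uint8_t>`, call `locatePeHeaders`, and then walk `numberOfSections` section headers starting at `sectionTableOffset`.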

    The PE structure has been explained thoroughly in other articles; Pietrek's article is the best one [6]. To simulate process creation, it is necessary to know how Windows carries it out. The next sections describe both process creation and module loading. This will help us build a similar implementation that loads a foreign PE file into a process.

2.1 Create Process

    When a process is created through CreateProcess or a similar process creation function, Windows maps the file image into virtual memory. Inside CreateProcess, ZwCreateSection is the function that maps the PE image into virtual memory [7]. It creates a section object, which ZwCreateProcess uses to map the image. ZwCreateProcess allocates a new kernel process block and initializes it for the new process. It also fills in the process environment block (PEB). In the next step, a primary thread is created and the registers in its context are initialized. Some special addresses are fixed up according to the current image base or the currently loaded modules. These include the import table, the relocation table, and others. At that point, the process and its main thread are ready to be reported to the Windows subsystem. CsrClientCallServer is the function that carries out this task by sending a message to the client/server runtime process (Csrss). Some papers call this process the Windows Subsystem. Finally, ZwResumeThread resumes the primary thread and the process starts running, FIG. 2.

FIG. 2. The stages of process creation [8].

2.2 Load Library

     LoadLibrary loads a module into a process in a different manner. First, it uses LdrpCheckForLoadedDll to check whether the module is already loaded. This function walks through the process environment block, and LoadLibrary does not continue when it finds that the module already exists in memory. This is what prevents us from using LoadLibrary in our virtualization technique. If the module is not found, LdrpMapDll maps it into virtual memory. This function places the module at the address given by its image base. If the requested memory is occupied, it allocates another region and fixes up the relocation table according to the new image base. When there is no relocation table and the requested space is not free, the routine cannot continue and fails. Next, LdrpWalkImportDescriptor, an NT internal function, fixes up the import table. It loads the imported modules and initializes the import table addresses by using two other internal functions: LdrpLoadImportModule and LdrpSnapIAT. The first is similar to GetModuleHandle or LoadLibrary. The second is comparable to GetProcAddress. The routine does not continue when it cannot load one of the imported modules. After that, LdrpUpdateLoadCount increases the number of loaded modules and adds the module information to the process through the process environment block [9]. LdrpRunInitializeRoutines initializes the thread local storage (TLS) when a TLS directory is present. Finally, DllMain is called with the reason DLL_PROCESS_ATTACH [10].

3. Load Foreign PE in Process

    To load a portable executable file into a process in this unusual manner, we need to emulate some particular steps of CreateProcess and LoadLibrary. Since we are not loading a module in the usual sense, it is not necessary to register it as a new module in the process environment block. Furthermore, thread local storage plays no role in our approach. This leaves us with a routine that maps the PE file and fixes up the relocation and import tables.

3.1 Map Headers and Sections

    The section headers give the size and location of each section to be mapped into virtual memory, FIG. 3. We map the sections into memory by using the section headers, much as ZwCreateSection or LdrpMapDll does, and emulate these mapping functions in our own code.

FIG. 3. Mapping PE file into memory [6].
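The mapping step can be sketched as follows, under the assumption that the file is already in memory and that the section headers have been reduced to the four fields used here. `SectionInfo` and `mapImage` are hypothetical names for this illustration, and a plain `std::vector` stands in for the virtual memory that a real loader would allocate.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, reduced view of a section header; the real header has more fields.
struct SectionInfo {
    std::uint32_t virtualAddress;   // RVA where the section starts in memory
    std::uint32_t virtualSize;      // size of the section in memory
    std::uint32_t pointerToRawData; // file offset of the section data
    std::uint32_t sizeOfRawData;    // number of bytes stored in the file
};

// Build an in-memory image: copy the headers, then copy each section to its RVA.
// Returns an empty vector if any range falls outside the file or the image.
std::vector<std::uint8_t> mapImage(const std::vector<std::uint8_t>& file,
                                   std::uint32_t sizeOfHeaders,
                                   std::uint32_t sizeOfImage,
                                   const std::vector<SectionInfo>& sections) {
    if (sizeOfHeaders > file.size() || sizeOfHeaders > sizeOfImage) return {};

    std::vector<std::uint8_t> image(sizeOfImage, 0); // zero-filled, like fresh pages
    std::copy(file.begin(), file.begin() + sizeOfHeaders, image.begin());

    for (const SectionInfo& s : sections) {
        // Copy no more than the in-memory size; the remainder stays zero.
        // A virtual size of zero is treated as "use the raw size".
        std::uint32_t n = s.virtualSize == 0
                              ? s.sizeOfRawData
                              : std::min(s.sizeOfRawData, s.virtualSize);
        if (s.pointerToRawData > file.size() || n > file.size() - s.pointerToRawData) return {};
        if (s.virtualAddress > image.size() || n > image.size() - s.virtualAddress) return {};
        std::copy(file.begin() + s.pointerToRawData,
                  file.begin() + s.pointerToRawData + n,
                  image.begin() + s.virtualAddress);
    }
    return image;
}
```

The address of the returned buffer plays the part of the current image base in the fix-up steps that follow. Code mapped this way cannot be executed from an ordinary heap buffer; on Windows the buffer would have to come from an allocation with execute permission.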

3.2 Fix up Relocation Table

    Some data in the code section image must be relocated by using the current image base. When a PE file is created, the code section image is laid out for the preferred image base. The addresses are therefore correct when the PE file is loaded at its default image base. Otherwise, they must be fixed up according to the current image base. The relocation table is the part that is added to the PE file when it is linked with the following linker option:

/FIXED:No

The relocation table lets us correct the code section image according to the current image base. It consists of packages, each made up of a header followed by relocation data. The header holds the base address and the size of the package. The data in memory is corrected by the difference between the current image base and the original image base. For instance, if the image base is 0x1000000:

0101251B   MOV DWORD PTR DS:[1015018],EAX

When the image base is 0x400000, this instruction will be:

0041251B   MOV DWORD PTR DS:[0415018],EAX

The base relocations are a list of locations in the image where a delta value needs to be added to the memory content. The relevant values are corrected according to the delta:

delta = current_ImageBase - image_nt_headers->OptionalHeader.ImageBase

Each 16-bit entry in the package data encodes a type in its upper 4 bits and an offset in its lower 12 bits:

type offset
03 00 00 00

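The type selects the relocation technique. In the package below every entry has type 3 (IMAGE_REL_BASED_HIGHLOW), which adds the full 32-bit delta to the value at the given offset.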

The offset and the package base address are used together to locate the image data to correct. For instance, the following package has a base address of 0x1000 and a size of 0x184.

008E1000: 00001000 00000184 30163000 30403028
008E1010: 30683054 308C3080 30AC309C 30D830CC
...

It represents the addresses in the following manner:

0x1000 + 0x0000 = 0x1000
0x1000 + 0x0016 = 0x1016
0x1000 + 0x0028 = 0x1028
0x1000 + 0x0040 = 0x1040
0x1000 + 0x0054 = 0x1054
...
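The decoding of a package shown above can be sketched in portable C++. `RelocEntry` and `decodePackage` are hypothetical names for this illustration. The sketch assumes that the raw relocation data is already in a byte buffer, that the package size includes the 8-byte header, and that the host is little-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct RelocEntry {
    std::uint8_t  type; // upper 4 bits of the 16-bit entry
    std::uint32_t rva;  // package base address + 12-bit offset
};

// Decode the package that starts at byte position `pos` of the relocation data.
// Layout: 4-byte base address, 4-byte package size, then 16-bit entries.
// Returns an empty list when the package does not fit in the buffer.
std::vector<RelocEntry> decodePackage(const std::vector<std::uint8_t>& data, std::size_t pos) {
    std::vector<RelocEntry> entries;
    if (pos > data.size() || data.size() - pos < 8) return entries;

    std::uint32_t base = 0, size = 0;
    std::memcpy(&base, data.data() + pos, 4);
    std::memcpy(&size, data.data() + pos + 4, 4);
    if (size < 8 || size > data.size() - pos) return entries;

    for (std::size_t i = 8; i + 2 <= size; i += 2) {
        std::uint16_t entry = 0;
        std::memcpy(&entry, data.data() + pos + i, 2);
        entries.push_back(RelocEntry{static_cast<std::uint8_t>(entry >> 12),
                                     base + (entry & 0x0FFFu)});
    }
    return entries;
}
```

For the package in the dump, the first entries 0x3000, 0x3016, and 0x3028 decode to type 3 with the addresses 0x1000, 0x1016, and 0x1028. The next package starts `size` bytes after `pos`, so a caller can step through the whole table by advancing `pos` by each package size.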

Each entry of the package is applied in the following way [11]:

mem[ current_ImageBase + 0x1000 ] = 
   mem[ current_ImageBase + 0x1000 ] + delta_ImageBase ;
   
mem[ current_ImageBase + 0x1016 ] = 
   mem[ current_ImageBase + 0x1016 ] + delta_ImageBase ;
   
mem[ current_ImageBase + 0x1028 ] = 
   mem[ current_ImageBase + 0x1028 ] + delta_ImageBase ;
   
mem[ current_ImageBase + 0x1040 ] = 
   mem[ current_ImageBase + 0x1040 ] + delta_ImageBase ;
   
mem[ current_ImageBase + 0x1054 ] = 
  mem[ current_ImageBase + 0x1054 ] + delta_ImageBase ;

Therefore, we can correct the code image by using the relocation table.
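The assignments above translate almost directly into code. In this sketch, `applyRelocations` is a hypothetical helper name, the mapped image is a byte buffer, and the list of addresses is assumed to hold only entries of type 3. The delta is kept as an unsigned 32-bit value so that a negative difference wraps around correctly.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Add `delta` to the 32-bit value stored at each RVA of a mapped image.
// Stops and returns false at the first RVA that does not fit in the image;
// entries before it have already been corrected.
bool applyRelocations(std::vector<std::uint8_t>& image,
                      const std::vector<std::uint32_t>& rvas,
                      std::uint32_t delta) {
    for (std::uint32_t rva : rvas) {
        if (rva > image.size() || image.size() - rva < 4) return false;
        std::uint32_t value = 0;
        std::memcpy(&value, image.data() + rva, 4); // safe for unaligned addresses
        value += delta;                              // wraps modulo 2^32
        std::memcpy(image.data() + rva, &value, 4);
    }
    return true;
}
```

With the image bases from the earlier example, delta is 0x400000 - 0x1000000, which is 0xFF400000 as an unsigned 32-bit value, and the stored address 0x1015018 becomes 0x415018.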

3.3 Fix up Import Table

    A process runs by calling functions from other loaded modules. The PE file contains a table that lists the imported modules and the imported functions. This table is called the import table, and it is initialized during process creation or module loading. The import table gives us the packages we need to load the modules and resolve the functions, FIG. 4.

    Each package has an import descriptor that includes the first thunk and the original first thunk. The first thunk points to the location of the thunks that will be initialized. The original first thunk refers to the first position of the thunks that describe the hint data and the function names. When the original first thunk does not exist, the first thunk points to the hint data and function names instead. The import table can be fixed up by using LoadLibrary and GetProcAddress.

FIG. 4. Import Table [12].
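The control flow of the import fix-up can be sketched on a simplified model. This is not the on-disk layout: `ImportPackage` is a hypothetical structure that holds the names already extracted from the import table, and the two callbacks stand in for LoadLibrary and GetProcAddress so that the sketch stays portable. On Windows the callbacks would simply forward to those two functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical in-memory model of one import package (descriptor).
struct ImportPackage {
    std::string moduleName;                 // for example "kernel32.dll"
    std::vector<std::string> functionNames; // names found through the original first thunk
    std::vector<std::uintptr_t> thunks;     // array behind the first thunk, filled in here
};

// Stand-ins for LoadLibrary and GetProcAddress; both return 0 on failure.
using ModuleLoader   = std::function<std::uintptr_t(const std::string&)>;
using SymbolResolver = std::function<std::uintptr_t(std::uintptr_t, const std::string&)>;

// Load every imported module and store the address of every imported function.
// Returns false as soon as a module or a function cannot be resolved.
bool fixUpImports(std::vector<ImportPackage>& packages,
                  const ModuleLoader& loadModule,
                  const SymbolResolver& resolve) {
    for (ImportPackage& package : packages) {
        std::uintptr_t module = loadModule(package.moduleName);
        if (module == 0) return false;

        package.thunks.assign(package.functionNames.size(), 0);
        for (std::size_t i = 0; i < package.functionNames.size(); ++i) {
            std::uintptr_t address = resolve(module, package.functionNames[i]);
            if (address == 0) return false;
            package.thunks[i] = address; // the loaded code calls through this slot
        }
    }
    return true;
}
```

Keeping the loader and resolver as parameters also makes it possible to redirect selected imports, for example to hand the loaded programs our own communication functions instead of addresses from a real module.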

4. Communication

    We establish a virtual network for the communication. This virtual network transfers message packages between the threads, that is, the virtual processes. To support communication between threads, we propose a header for each virtual process that holds the thread ID, the image base, and the virtual process ID. This header supports the communication in the virtual network: we send and receive each package by using the header of a virtual process. A message contains the sender process ID, the receiver process ID, the message size, and the message data.

    Message scheduling requires an appropriate database, one that stores the sender and receiver IDs, the message size, and the message data. This database helps deliver the messages to the target processes. The Msg_Send, Msg_Recv, and Msg_Ready functions in our code carry out the message passing, FIG. 5.

    In the next step, we define the virtual network communication functions according to the Message Passing Interface (MPI) protocol. Messages are sent and received with MPI_Send and MPI_Recv. MPI_Init constructs the communication object and MPI_Finalize destroys it. We can also provide the MPI functions that report the rank of a process and the number of processes.

FIG. 5. Virtual Processes communication.
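A message database of the kind described above can be sketched with the standard C++ threading primitives. The class `MessageStore` and its methods `send`, `ready`, and `receive` are hypothetical names that play the roles of Msg_Send, Msg_Ready, and Msg_Recv; the actual signatures in the project's code are not shown in this article, so this is an illustration under that assumption and not the project's implementation.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Message record as described in the text: sender, receiver, size, data.
struct Message {
    int sender;
    int receiver;
    std::vector<std::uint8_t> data; // data.size() is the message size
};

// Hypothetical message store shared by all virtual processes (threads).
class MessageStore {
public:
    // Queue a copy of the buffer for the receiver and wake waiting threads.
    void send(int sender, int receiver, const void* buffer, std::size_t size) {
        const std::uint8_t* bytes = static_cast<const std::uint8_t*>(buffer);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(Message{sender, receiver,
                                     std::vector<std::uint8_t>(bytes, bytes + size)});
        }
        changed_.notify_all();
    }

    // True if a message from `sender` to `receiver` is waiting.
    bool ready(int sender, int receiver) {
        std::lock_guard<std::mutex> lock(mutex_);
        return find(sender, receiver) != queue_.end();
    }

    // Block until a matching message arrives, then remove and return it.
    Message receive(int sender, int receiver) {
        std::unique_lock<std::mutex> lock(mutex_);
        std::deque<Message>::iterator it;
        changed_.wait(lock, [&] {
            it = find(sender, receiver);
            return it != queue_.end();
        });
        Message message = std::move(*it);
        queue_.erase(it);
        return message;
    }

private:
    // Oldest matching message first, so messages between one pair stay in order.
    std::deque<Message>::iterator find(int sender, int receiver) {
        for (auto it = queue_.begin(); it != queue_.end(); ++it)
            if (it->sender == sender && it->receiver == receiver) return it;
        return queue_.end();
    }

    std::mutex mutex_;
    std::condition_variable changed_;
    std::deque<Message> queue_;
};
```

Functions in the style of MPI_Send and MPI_Recv could then be thin wrappers that look up the virtual process ID of the calling thread in its header and forward to `send` and `receive` on one shared store.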

5. Conclusion

    Virtualization techniques are used extensively in many applications: running a guest operating system on a host operating system, executing code written for another processor architecture, and virtualizing distributed machines on a single machine. These applications make the virtual machine monitor a promising technology for the future of operating systems. Virtualization saves us from spending our budget on a special operating system or a particular machine just to run code for another architecture. We can already test the executable file format of one system on another system, so it is no longer necessary to purchase the new system. Furthermore, distributed virtual machines provide an economical means of evaluating our cluster programs on a single machine.

References

  1. Julliard, A., Paun, D., Leo Puoti, I., Newman, J., Meissner, M., Vincent, B., "Wine is Not an Emulator", Source Forge Project, (2000).
  2. Denney, B., Bothamy, C., Becker, D., Ruppert, V., Alexander, G., "Bochs: The Open Source IA-32 Emulation Project", Source Forge Project, (2000).
  3. "MPICH2 is an implementation of the Message-Passing Interface", (2005).
  4. Ford, B., "The VX32 Virtual Extension Environment", Massachusetts Institute of Technology, (2005).
  5. "Microsoft Portable Executable and Common Object File Format Specification", Microsoft Corporation, Revision 6.0, Feb. (1999).
  6. Pietrek, M., "An In-Depth Look into the Win32 Portable Executable File Format", part 1/2, MSDN Magazine, Feb. (2002).
  7. Nebbett, G., "Windows® NT®/2000 Native API reference", MTP, (2000), ISBN 1-57870-199-6.
  8. Russinovich, M. E., Solomon, D. A., "Microsoft® Windows® Internals", 4th Edition: Microsoft Windows Server & XP & 2000, Microsoft Press, Dec. (2004), ISBN 0-73561-917-4.
  9. Osterlund, R., "What Goes On Inside Windows 2000: Solving the Mysteries of the Loader", MSDN Magazine, Mar. (2002).
  10. Pietrek, M., "Under the Hood", Microsoft Systems Journal, Sep. (1999).
  11. Danehkar, A., "Inject your code to a Portable Executable file", The Code Project, Dec. (2005).
  12. Danehkar, A., "Injective Code inside Import Table", The Code Project, Jun. (2006).

By: Ashkbiz Danehkar