Overview

Brief overview of Wine's architecture...

Wine Overview

With the fundamental architecture of Wine stabilizing, and people starting to think that we might soon be ready to actually release this thing, it may be time to take a look at how Wine actually works and operates.

Foreword

Wine is often used as a recursive acronym, standing for "Wine Is Not an Emulator". Sometimes it is also expanded as "Windows Emulator". In a way, both meanings are correct, only seen from different perspectives. The first meaning says that Wine is not a virtual machine: it does not emulate a CPU, and you are not supposed to install Windows or any Windows device drivers on top of it. Rather, Wine is an implementation of the Windows API, and can be used as a library to port Windows applications to Unix. The second meaning, obviously, is that to Windows binaries (.exe files), Wine does look like Windows, and emulates its behaviour and quirks rather closely.

"Emulator"

The "Emulator" perspective should not be taken to mean that Wine is a typical inefficient emulation layer that can't be anything but slow. Faithfulness to the badly designed Windows API may of course impose a minor overhead in some cases, but this is balanced out by the higher efficiency of the Unix platforms Wine runs on, and other possible abstraction libraries (like Motif, GTK+, CORBA, etc.) have a runtime overhead typically comparable to Wine's.

Executables

Wine's main task is to run Windows executables under non-Windows operating systems. It supports different types of executables:

  DOS executable. These are the oldest programs, using the DOS format (either .com or .exe, the latter also being called MZ).
  Windows NE executable, also called 16 bit. These were the native processes run by Windows 2.x and 3.x. NE stands for New Executable.
  Windows PE executable. These programs were introduced in Windows 95 (and became the native format for all later Windows versions), even if 16 bit applications were still supported. PE stands for Portable Executable, in the sense that the format of the executable (as a file) is independent of the CPU (even if the content of the file - the code - is CPU dependent).
  WineLib executable. These are applications written using the Windows API, but compiled as a Unix executable. Wine provides the tools to create such executables.

Let's quickly review the main differences between the supported executables:

Wine executables

Multitasking
  DOS (.COM or .EXE): Only one application at a time (except for TSR).
  Win16 (NE): Cooperative.
  Win32 (PE): Preemptive.
  WineLib: Preemptive.

Address space
  DOS (.COM or .EXE): One MB of memory, where each application is loaded and unloaded.
  Win16 (NE): All 16 bit applications share a single address space, protected mode.
  Win32 (PE): Each application has its own address space. Requires MMU support from CPU.
  WineLib: Each application has its own address space. Requires MMU support from CPU.

Windows API
  DOS (.COM or .EXE): No Windows API but the DOS API (like Int 21h traps).
  Win16 (NE): Will call the 16 bit Windows API.
  Win32 (PE): Will call the 32 bit Windows API.
  WineLib: Will call the 32 bit Windows API, and possibly also the Unix APIs.

Code (CPU level)
  DOS (.COM or .EXE): Only available on x86 in real mode. Code and data are in segmented forms, with 16 bit offsets. Processor is in real mode.
  Win16 (NE): Only available on IA-32 architectures. Code and data are in segmented forms, with 16 bit offsets (hence the 16 bit name). Processor is in protected mode.
  Win32 (PE): Available (with NT) on several CPUs, including IA-32. On this CPU, uses a flat memory model with 32 bit offsets (hence the 32 bit name).
  WineLib: Flat model, with 32 bit addresses.

Multi-threading
  DOS (.COM or .EXE): Not available.
  Win16 (NE): Not available.
  Win32 (PE): Available.
  WineLib: Available, but must use the Win32 APIs for threading and synchronization, not the Unix ones.
Wine deals with these differences in multitasking and address space by launching a separate Wine process (which is in fact a Unix process) for each Win32 process, but not for Win16 tasks. Win16 tasks (as well as DOS programs) are run as different intersynchronized Unix threads in the same dedicated Wine process; this Wine process is commonly known as a WOW process (Windows on Windows), referring to a similar mechanism used by Windows NT.

Synchronization between the Win16 tasks running in the WOW process is normally done through the Win16 mutex: whenever one of them is running, it holds the Win16 mutex, keeping the others from running. When the task wishes to let the other tasks run, its thread releases the Win16 mutex, and one of the waiting threads will then acquire it and let its own task run.
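The mutex hand-off described above can be illustrated with a small Python sketch. It is not Wine's code: the Win16Mutex class, its counters and run_tasks are hypothetical names, and a plain threading.Lock stands in for the Win16 mutex. Each task runs in its own thread but only executes while holding the lock, so at most one task runs at any time.

```python
import threading


class Win16Mutex:
    """Hypothetical stand-in for the Win16 mutex shared by all tasks."""

    def __init__(self):
        self._lock = threading.Lock()
        self.running = 0        # tasks currently executing 16-bit code
        self.max_running = 0    # highest value `running` ever reached
        self.trace = []         # (task name, step) in execution order

    def run_slice(self, name, step):
        # A task may only execute while it holds the mutex.
        with self._lock:
            self.running += 1
            self.max_running = max(self.max_running, self.running)
            self.trace.append((name, step))
            self.running -= 1
        # Leaving the `with` block releases the mutex; this is the
        # cooperative yield that lets a waiting task run.


def run_tasks(task_specs):
    """Run each (name, steps) task in its own thread, as in a WOW process."""
    mutex = Win16Mutex()

    def task(name, steps):
        for step in range(steps):
            mutex.run_slice(name, step)

    threads = [threading.Thread(target=task, args=spec) for spec in task_specs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mutex
```

Calling run_tasks([("A", 3), ("B", 2)]) returns an object whose max_running is 1. The interleaving of A and B in trace may differ from run to run, which matches the text: whichever waiting thread acquires the mutex gets to run next.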
Standard Windows Architectures

Windows 9x architecture

The Windows architecture (Win 9x way) looks like this:

+---------------------+        \
|     Windows EXE     |         } application
+---------------------+        /

+---------+ +---------+        \
| Windows | | Windows |         \ application & system DLLs
|   DLL   | |   DLL   |         /
+---------+ +---------+        /

+---------+ +---------+      \
|  GDI32  | | USER32  |       \
|   DLL   | |   DLL   |        \
+---------+ +---------+         } core system DLLs
+---------------------+        /
|    Kernel32 DLL     |       /
+---------------------+      /

+---------------------+        \
|    Win9x kernel     |         } kernel space
+---------------------+        /

+---------------------+        \
|  Windows low-level  |         \ drivers (kernel space)
|       drivers       |         /
+---------------------+        /

Windows NT architecture

The Windows architecture (Windows NT way) looks like the following drawing. Note the new DLL (NTDLL) which allows implementing different subsystems (as win32); kernel32 in NT architecture implements the Win32 subsystem on top of NTDLL.

+---------------------+                       \
|     Windows EXE     |                        } application
+---------------------+                       /

+---------+ +---------+                       \
| Windows | | Windows |                        \ application & system DLLs
|   DLL   | |   DLL   |                        /
+---------+ +---------+                       /

+---------+ +---------+     +-----------+   \
|  GDI32  | | USER32  |     |           |    \
|   DLL   | |   DLL   |     |           |     \
+---------+ +---------+     |           |      \ core system DLLs
+---------------------+     |           |      / (on the left side)
|    Kernel32 DLL     |     | Subsystem |     /
|  (Win32 subsystem)  |     |Posix, OS/2|    /
+---------------------+     +-----------+   /

+---------------------------------------+
|               NTDLL.DLL               |
+---------------------------------------+

+---------------------------------------+     \
|               NT kernel               |      } NT kernel (kernel space)
+---------------------------------------+     /

+---------------------------------------+     \
|       Windows low-level drivers       |      } drivers (kernel space)
+---------------------------------------+     /

Note also (not depicted in schema above) that the 16 bit applications are supported in a specific subsystem.
Some basic differences between the Win9x and the NT architectures include:

  Several subsystems (Win32, Posix...) can be run on NT, but not on Win 9x.
  Win 9x roots its architecture in 16 bit systems, while NT is truly a 32 bit system.
  The driver models and interfaces in Win 9x and NT are different (even if Microsoft tried to bridge the gap with some support of WDM drivers in Win 98 and above).

Wine architecture

Global picture

Wine's implementation is closer to the Windows NT architecture, even if several subsystems are not implemented yet (remember also that 16 bit support is implemented in a 32-bit Windows EXE, not as a subsystem). Here's the overall picture:

+---------------------+                                  \
|     Windows EXE     |                                   } application
+---------------------+                                  /

+---------+ +---------+                                  \
| Windows | | Windows |                                   \ application & system DLLs
|   DLL   | |   DLL   |                                   /
+---------+ +---------+                                  /

+---------+ +---------+     +-----------+  +--------+  \
|  GDI32  | | USER32  |     |           |  |        |   \
|   DLL   | |   DLL   |     |           |  |  Wine  |    \
+---------+ +---------+     |           |  | Server |     \ core system DLLs
+---------------------+     |           |  |        |     / (on the left side)
|    Kernel32 DLL     |     | Subsystem |  | NT-like|    /
|  (Win32 subsystem)  |     |Posix, OS/2|  | Kernel |   /
+---------------------+     +-----------+  |        |  /
                                           |        |
+---------------------------------------+  |        |
|                 NTDLL                 |  |        |
+---------------------------------------+  +--------+

+---------------------------------------+               \
|    Wine executable (wine-?thread)     |                } unix executable
+---------------------------------------+               /

+---------------------------------------------------+   \
|                   Wine drivers                    |    } Wine specific DLLs
+---------------------------------------------------+   /

+------------+    +------------+     +--------------+   \
|    libc    |    |   libX11   |     |  other libs  |    } unix shared libraries
+------------+    +------------+     +--------------+   /  (user space)

+---------------------------------------------------+   \
|       Unix kernel (Linux,*BSD,Solaris,OS/X)       |    } (Unix) kernel space
+---------------------------------------------------+   /

+---------------------------------------------------+   \
|                Unix device drivers                |    } Unix drivers (kernel space)
+---------------------------------------------------+   /

Wine must at least completely replace the "Big Three" DLLs (KERNEL/KERNEL32, GDI/GDI32, and USER/USER32), which all other DLLs are layered on top of. But since Wine is (for various reasons) leaning towards the NT way of implementing things, NTDLL is another core DLL to be implemented in Wine, and many KERNEL32 and ADVAPI32 features will be implemented through NTDLL.

As of today, no real subsystem (apart from the Win32 one) has been implemented in Wine.

The Wine server provides the backbone for the implementation of the core DLLs. It mainly implements inter-process synchronization and object sharing. It can be seen, from a functional point of view, as an NT kernel (even if the APIs and protocols used between Wine's DLLs and the Wine server are Wine specific).

Wine uses the Unix drivers to access the various hardware pieces on the box. However, in some cases, Wine will provide a driver (in the Windows sense) for a physical hardware device. This driver will be a proxy to the Unix driver (this is the case, for example, for the graphical part with X11 or SDL drivers, audio with OSS or ALSA drivers...).

All DLLs provided by Wine try to stick as much as possible to the APIs exported on the Windows platforms. In the rare cases where this does not hold (Wine DLLs export some Wine specific APIs), the differences have been properly documented. Usually, those APIs are prefixed with __wine.

Let's now review all of those components in greater detail.

The Wine server

The Wine server is among the most confusing concepts in Wine. What is its function in Wine? Well, to be brief, it provides Inter-Process Communication (IPC), synchronization, and process/thread management.
When the Wine server launches, it creates a Unix socket for the current host, named after (see below) your home directory's .wine subdirectory (or wherever the WINEPREFIX environment variable points to). All Wine processes launched later connect to the Wine server using this socket. (If a Wine server was not already running, the first Wine process will start up the Wine server in auto-terminate mode, i.e. the Wine server will then terminate itself once the last Wine process has terminated.)

In earlier versions of Wine the master socket mentioned above was actually created in the configuration directory: either your home directory's .wine subdirectory or wherever the WINEPREFIX environment variable points. Since that might not be possible, the socket is now created within the /tmp directory with a name that reflects the configuration directory. This means that there can actually be several separate copies of the Wine server running, one per combination of user and configuration directory. Note that you should not have several users using the same configuration directory at the same time; they will have different copies of the Wine server running and this could well lead to problems with the registry information that they are sharing.

Every thread in each Wine process has its own request buffer, which is shared with the Wine server. When a thread needs to synchronize or communicate with any other thread or process, it fills out its request buffer, then writes a command code through the socket. The Wine server handles the command as appropriate, while the client thread waits for a reply. In some cases, like with the various WaitFor??? synchronization primitives, the server handles it by marking the client thread as waiting and does not send it a reply before the wait condition has been satisfied.
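The deferred reply used for waits can be modelled in a few lines of Python. This is only an in-memory sketch under stated assumptions, not the real protocol: ToyServer, the "wait" and "signal" command names and the reply strings are all hypothetical, there is no socket or request buffer, and a signaled object is assumed to stay signaled.

```python
class ToyServer:
    """Hypothetical model of a server that may defer its replies."""

    def __init__(self):
        self.signaled = set()   # objects whose wait condition holds
        self.waiters = {}       # object name -> list of waiting thread ids
        self.replies = []       # (thread id, reply) in the order sent

    def request(self, thread_id, command, obj):
        if command == "signal":
            self.signaled.add(obj)
            self.replies.append((thread_id, "ok"))
            # The wait condition is now satisfied: only at this point
            # do the waiting client threads get their replies.
            for waiter in self.waiters.pop(obj, []):
                self.replies.append((waiter, "wait satisfied"))
        elif command == "wait":
            if obj in self.signaled:
                self.replies.append((thread_id, "wait satisfied"))
            else:
                # No reply is sent yet; the client thread stays blocked.
                self.waiters.setdefault(obj, []).append(thread_id)
        else:
            self.replies.append((thread_id, "unknown command"))
```

If thread 1 sends ("wait", "event") first, replies stays empty. Once thread 2 sends ("signal", "event"), the server answers thread 2 and then releases thread 1.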
The Wine server itself is a single and separate Unix process and does not have its own threading. Instead, it is built on top of a large poll() loop that alerts the Wine server whenever anything happens, such as a client having sent a command, or a wait condition having been satisfied. There is thus no danger of race conditions inside the Wine server itself, and it is often called upon to do operations that look completely atomic to its clients.

Because the Wine server needs to manage processes, threads, shared handles, synchronization, and any related issues, all the clients' Win32 objects are also managed by the Wine server, and the clients must send requests to the Wine server whenever they need to know any Win32 object handle's associated Unix file descriptor (in which case the Wine server duplicates the file descriptor, transmits it back to the client, and leaves it to the client to close the duplicate when the client has finished with it).

Wine builtin DLLs: about Relays, Thunks, and DLL descriptors

This section mainly applies to builtin DLLs (DLLs provided by Wine). The handling of native vs. builtin DLLs is detailed in its own section.

Loading a Windows binary into memory isn't that hard by itself. The hard part is all those various DLLs and entry points it imports and expects to be there and function as expected; this is, obviously, what the entire Wine implementation is all about. Wine contains a range of DLL implementations. You can find the DLL implementations in the dlls/ directory.

Each DLL (at least, the 32 bit version, see below) is implemented in a Unix shared library. The file name of this shared library is the module name of the DLL with a .dll.so suffix (or .drv.so or any other relevant extension depending on the DLL type). This shared library contains the code itself for the DLL, as well as some more information, such as the DLL resources and a Wine specific DLL descriptor.
The DLL descriptor, when the DLL is instantiated, is used to create an in-memory PE header, which will provide access to various information about the DLL, including but not limited to its entry point, its resources, its sections, its debug information...

The DLL descriptor and entry point table are generated by the winebuild tool (previously just named build), taking DLL specification files with the extension .spec as input. Resources (after compilation by wrc) or message tables (after compilation by wmc) are also added to the descriptor by winebuild.

Once an application module wants to import a DLL, Wine will look:

  through its list of registered DLLs (in fact, both the already loaded DLLs, and the already loaded shared libraries which have registered a DLL descriptor). Since the DLL descriptor is automatically registered when the shared library is loaded (remember, the registration call is put inside a shared library constructor), using the PRELOAD environment variable when running a Wine process can force the registration of some DLL descriptors.
  If it's not registered, Wine will look for it on disk, building the shared library name from the DLL module name. The directories searched are specified by the WINEDLLPATH environment variable.
  Failing that, it will look for a real Windows .DLL file to use (and look through its imports, etc.), using the loading mechanism for native DLLs.

After the DLL has been identified (assuming it's a builtin one), it's mapped into memory using a dlopen() call. Note that Wine doesn't use the shared library mechanisms for resolving and/or importing functions between two shared libraries (for two DLLs). The shared library is only used for providing a way to load a piece of code on demand. This piece of code, thanks to the DLL descriptor, will provide the same type of information a native DLL would. Wine can then use the same code for native and builtin DLLs to handle imports/exports.
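The lookup order described above (registered descriptors first, then the directories from WINEDLLPATH, then a native DLL) can be sketched as follows. This is a simplified illustration, not Wine's loader: find_dll and its parameters are hypothetical, the module name is assumed to already carry its extension, lowercasing it is an assumption, and the search for the native file itself is left out.

```python
import os


def find_dll(module, registered, dll_path, exists=os.path.exists):
    """Return (kind, location) following the three-step lookup order.

    module     -- DLL module file name, e.g. "kernel32.dll"
    registered -- dict of already registered descriptors, keyed by name
    dll_path   -- value of WINEDLLPATH (directories joined by os.pathsep)
    exists     -- file existence check, replaceable for testing
    """
    name = module.lower()

    # 1. A descriptor that is already registered wins.
    if name in registered:
        return ("registered", registered[name])

    # 2. Otherwise build the shared library name and search the disk.
    for directory in dll_path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name + ".so")
        if exists(candidate):
            return ("builtin", candidate)

    # 3. Failing that, fall back to a real Windows DLL.
    return ("native", module)
```

The exists parameter is only there so the search can be exercised without touching the file system.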
Wine also relies on the dynamic loading features of the Unix shared libraries to relocate the DLLs if needed (the same DLL can be loaded at different addresses in two different processes, and even in two consecutive runs of the same executable if the order of loading the DLLs differs).

The DLL descriptor is registered in the Wine realm using some tricks. The winebuild tool, while creating the code for the DLL descriptor, also creates a constructor that will be called when the shared library is loaded into memory. This constructor will actually register the descriptor with the Wine DLL loader. Hence, before the dlopen call returns, the DLL descriptor will be known and registered. This also helps to deal with the cases where there are still dependencies (at the ELF shared lib level, not at the embedded DLL level) between different shared libraries: the embedded DLLs will be properly registered, and even loaded (from a Windows point of view).

Since Wine is 32-bit code itself, and if the compiler supports Windows' calling convention, stdcall (gcc does), Wine can resolve imports into Win32 code by substituting the addresses of the Wine handlers directly without any thunking layer in between. This eliminates the overhead most people associate with "emulation", and is what the applications expect anyway.

However, if the user specified WINEDEBUG=+relay, a thunk layer is inserted between the application imports and the Wine handlers (actually the export table of the DLL is modified, and a thunk is inserted in the table). This layer is known as "relay" because all it does is print out the arguments/return values (by using the argument lists in the DLL descriptor's entry point table), then pass the call on, but it's invaluable for debugging misbehaving calls into Wine code.
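The idea of swapping export table entries for logging thunks can be shown with a short Python sketch. It only mirrors the principle: relay, build_export_table and the log line format are hypothetical and do not reproduce Wine's actual relay output or its machine-level thunks.

```python
import functools


def relay(dll, log):
    """Wrap an entry point so that its calls and returns are logged."""
    def decorate(func):
        @functools.wraps(func)
        def thunk(*args):
            shown = ",".join(map(repr, args))
            log.append(f"Call {dll}.{func.__name__}({shown})")
            ret = func(*args)               # pass the call on unchanged
            log.append(f"Ret  {dll}.{func.__name__}() retval={ret!r}")
            return ret
        return thunk
    return decorate


def build_export_table(dll, handlers, debug_relay=False, log=None):
    """Return name -> callable, inserting thunks only when relay is on."""
    if not debug_relay:
        # Normal case: imports resolve straight to the handlers.
        return dict(handlers)
    return {name: relay(dll, log)(func) for name, func in handlers.items()}
```

With debug_relay left off, the table holds the handlers themselves, so a call costs nothing extra. With it on, every call goes through a thunk that records the arguments and the return value before and after forwarding.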
A similar mechanism also exists between Windows DLLs: Wine can optionally insert thunk layers between them, by using WINEDEBUG=+snoop, but since no DLL descriptor information exists for non-Wine DLLs, this is less reliable and may lead to crashes.

For Win16 code, there is no way around thunking: Wine needs to relay between 16-bit and 32-bit code. These thunks switch between the app's 16-bit stack and Wine's 32-bit stack, copy and convert arguments as appropriate (an int is 16 bits in 16-bit code and 32 bits in 32-bit code; pointers are segmented in 16-bit code (and also near or far) but are 32 bit linear values in 32-bit code), and handle the Win16 mutex. Some finer control over the conversion can be obtained; see the winebuild reference manual for the details. Suffice it to say that the kind of intricate stack content juggling this results in is not exactly suitable study material for beginners.

A DLL descriptor is also created for every 16 bit DLL. However, this DLL is normally paired with a 32 bit DLL. Either it's the 16 bit counterpart of the 32 bit DLL (KRNL386.EXE for KERNEL32, USER for USER32...), or a 16 bit DLL directly linked to a 32 bit DLL (like SYSTEM for KERNEL32, or DDEML for USER32). In those cases, the 16 bit descriptor(s) is (are) inserted in the same shared library as the corresponding 32 bit DLL. Wine will also create symbolic links between kernel32.dll.so and system.dll.so so that loading of either kernel32.dll or system.dll will end up on the same shared library.

Memory management

Every Win32 process in Wine has its own dedicated native process on the host system, and therefore its own address space. This section explores the layout of the Windows address space and how it is emulated.

Firstly, a quick recap of how virtual memory works. Physical memory in RAM chips is split into frames, and the memory that each process sees is split into pages. Each process has its own 4 gigabytes of address space (4 gigs being the maximum space addressable with a 32 bit pointer).
Pages can be mapped or unmapped: attempts to access an unmapped page cause an EXCEPTION_ACCESS_VIOLATION which has the easily recognizable code of 0xC0000005. Any page can be mapped to any frame, therefore you can have multiple addresses which actually "contain" the same memory. Pages can also be mapped to things like files or swap space, in which case accessing that page will cause a disk access to read the contents into a free frame.

Initial layout (in Windows)

When a Win32 process starts, it does not have a clear address space to use as it pleases. Many pages are already mapped by the operating system. In particular, the EXE file itself and any DLLs it needs are mapped into memory, and space has been reserved for the stack and a couple of heaps (zones used to allocate memory to the app from). Some of these things need to be at a fixed address, and others can be placed anywhere.

The EXE file itself is usually mapped at address 0x400000 and up: indeed, most EXEs have their relocation records stripped which means they must be loaded at their base address and cannot be loaded at any other address.

DLLs are internally much the same as EXE files but they have relocation records, which means that they can be mapped at any address in the address space. Remember we are not dealing with physical memory here, but rather virtual memory which is different for each process. Therefore OLEAUT32.DLL may be loaded at one address in one process, and a totally different one in another. Ensuring all the functions loaded into memory can find each other is the job of the Windows dynamic linker, which is a part of NTDLL.

So, we have the EXE and its DLLs mapped into memory. Two other very important regions also exist: the stack and the process heap. The process heap is simply the equivalent of the libc malloc arena on UNIX: it's a region of memory managed by the OS which malloc/HeapAlloc partitions and hands out to the application.
Windows applications can create several heaps but the process heap always exists.

Windows 9x also implements another kind of heap: the shared heap. The shared heap is unusual in that anything allocated from it will be visible in every other process.

Comparison

So far we've assumed the entire 4 gigs of address space is available for the application. In fact that's not so: only the lower 2 gigs are available. On Windows NT, the upper 2 gigs are used by the operating system and hold the kernel (from 0x80000000). Why is the kernel mapped into every address space? Mostly for performance: while it's possible to give the kernel its own address space too (this is what Ingo Molnar's 4G/4G VM split patch does for Linux), it requires that every system call into the kernel switches address space. As that is a fairly expensive operation (it requires flushing the translation lookaside buffers etc.) and syscalls are made frequently, it's best avoided by keeping the kernel mapped at a constant position in every process's address space.

Basically, the comparison of memory mappings looks as follows:

Memory layout (Windows and Wine)

Address              Windows 9x   Windows NT   Linux
00000000-7fffffff    User         User         User
80000000-bfffffff    Shared       Kernel       User
c0000000-ffffffff    Kernel       Kernel       Kernel
On Windows 9x, in fact, only the upper gigabyte (0xC0000000 and up) is used by the kernel; the region from 2 to 3 gigs is a shared area used for loading system DLLs and for file mappings. The bottom 2 gigs on both NT and 9x are available for the program's memory allocation and stack.
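As a worked example of the split described above, the following Python sketch looks up which kind of region a 32 bit address falls into. The names REGIONS and region_owner are hypothetical, and the Windows NT column follows the prose (the upper 2 gigs belong to the operating system).

```python
GIB = 1 << 30  # one gigabyte

# (start, end exclusive, Windows 9x, Windows NT), following the prose above
REGIONS = [
    (0 * GIB, 2 * GIB, "user", "user"),
    (2 * GIB, 3 * GIB, "shared", "kernel"),
    (3 * GIB, 4 * GIB, "kernel", "kernel"),
]


def region_owner(address, system):
    """Return the owner of `address` for system "win9x" or "nt"."""
    column = {"win9x": 2, "nt": 3}[system]
    for row in REGIONS:
        if row[0] <= address < row[1]:
            return row[column]
    raise ValueError("address outside the 32 bit address space")
```

For instance, the usual EXE base 0x400000 is user memory on both systems, while 0x80000000 is the start of the shared area on Windows 9x and of the kernel half on Windows NT.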
Implementation

Wine (with a bit of black magic) is able to map all items at the correct locations as depicted above.

Wine also implements the shared heap so native win9x DLLs can be used. This heap is always created at the SYSTEM_HEAP_BASE address, or 0x80000000, and defaults to 16 megabytes in size.

There are a few other magic locations. The bottom 64k of memory is deliberately left unmapped to catch null pointer dereferences. The region from 64k to 1mb+64k is reserved for DOS compatibility and contains various DOS data structures. Finally, the address space also contains mappings for the Wine binary itself, any native libraries Wine is using, the glibc malloc arena and so on.
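The magic locations listed above amount to a few address ranges, which the following Python sketch spells out. The function name describe and the returned labels are hypothetical; the numbers (64k guard, DOS area up to 1mb+64k, a 16 megabyte shared heap at 0x80000000) are taken from the text, with the default heap size assumed.

```python
KB = 1024
MB = 1024 * KB

SHARED_HEAP_BASE = 0x80000000   # SYSTEM_HEAP_BASE in the text
SHARED_HEAP_SIZE = 16 * MB      # default size, per the text


def describe(address):
    """Name the special region an address falls into, if any."""
    if address < 64 * KB:
        return "unmapped (null pointer guard)"
    if address < 1 * MB + 64 * KB:
        return "DOS area"
    if SHARED_HEAP_BASE <= address < SHARED_HEAP_BASE + SHARED_HEAP_SIZE:
        return "shared heap"
    return "other"
```

A near-null pointer such as 0x10 lands in the unmapped guard area, which is why dereferencing it faults instead of silently reading DOS data.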
Processes

Let's take a closer look at the way Wine loads and runs processes in memory.

Starting a process from command line

When starting a Wine process from the command line (we'll get later on to the differences between NE, PE and Winelib executables), there are a couple of things Wine needs to do first. A first executable is run to check the threading model of the underlying OS and will start the real Wine loader corresponding to the chosen threading model.

Then Wine grabs a few elements from the Unix world: the environment, the program arguments. Then ntdll.dll.so is loaded into memory using the standard shared library dynamic loader. When loaded, NTDLL will first create a decent Windows environment:

  create a PEB and a TEB
  set up the connection to the Wine server, launching the Wine server if none is running
  create the process heap

Then Kernel32 is loaded (but now using the Windows dynamic loading capabilities) and a Wine specific entry point, __wine_kernel_init, is called. This function will actually handle all the logic of the process loading and execution, and will never return from its call.

__wine_kernel_init will undergo the following tasks:

  initialization of program arguments from Unix program arguments
  lookup of executable in the file system
  If the file is not found, then an error is printed and the Wine loader stops.
  We'll cover the non-PE file type later on, so assume for now it's a PE file. The PE module is loaded in memory using the Windows shared library mechanism. Note that the dependencies on the module are not resolved at this point.
  A new stack is created, whose size is given in the PE header, and this stack is made the one of the running thread (which is still the only one in the process). The stack used at startup will no longer be used.
  With this new stack, ntdll.LdrInitializeThunk is called, which performs the remaining initialization parts, including resolving all the DLL imports on the PE module, and doing the init of the TLS slots.
  Control can now be passed to the EntryPoint of the PE module, which will let the executable run.

Creating a child process from a running process

The steps used are closely linked to what is done in the previous case. There are however a few points to look at a bit more closely. The inner implementation creates the child process using the fork() and exec() calls. This means that we don't need to check again for the threading model; we can use what the parent (or the grand-parent process...) started from the command line has found.

The Win32 process creation allows passing a lot of information between the parent and the child. This includes object handles, window titles, console parameters, environment strings... Wine makes use of both the standard Unix inheritance mechanisms (for the environment, for example) and the Wine server (to pass from parent to child a chunk of data containing the relevant information).

The previously described loading mechanism will check in the Wine server if such a chunk exists, and, if so, will perform the relevant initialization.

Some further synchronization is also put in place: a parent will wait until the child has started, or has failed. The Wine server is also used to perform those tasks.

Starting a Winelib process

Before going into the gory details, let's first go back to what a Winelib application is. It can be either a regular Unix executable, or a more specific Wine beast. This latter form in fact creates two files for a given executable (say foo.exe). The first one, named foo, will be a symbolic link to the Wine loader (wine). The second one, named foo.exe.so, is the equivalent of the .dll.so files we've already described for DLLs.
As in Windows, an executable is, among other things, a module with its import and export information, like any DLL, so it makes sense that Wine uses the same mechanisms for loading native executables and DLLs.

When starting a Winelib application from the command line (say with foo arg1 arg2), the Unix shell will execute foo as a Unix executable. Since this is in fact the Wine loader, Wine will fire up. However, Wine will notice that it hasn't been started as wine but as foo, and hence will try to load (using the Unix shared library mechanism) the second file, foo.exe.so. Wine will recognize a 32 bit module (with its descriptor) embedded in the shared library, and once the shared library is loaded, it will follow the same path as when loading a standard native PE executable.

Wine needs to implement this second form of executable in order to maintain the order of initialization of some elements in the executable. One particular issue is when dealing with global C++ objects. In a standard Unix executable, the calls to the constructors of such objects are stored in a specific section of the executable (.init, to name it). All constructors in this section are called before the main function is called. Creating a Wine executable using the first form mentioned above would let those constructors be called before Wine gets a chance to initialize itself. So, any constructor using a Windows API would fail, because the Wine infrastructure isn't in place. The use of the second form for Winelib executables ensures that we do the initialization using the following steps:

  initialize the Wine infrastructure
  load the executable into memory
  handle the import sections for the executable
  call the global object constructors (if any). They now can properly call the Windows APIs
  call the executable entry point

The attentive reader will have noted that the resolution of imports for the executable is done, as for a DLL, when the executable/DLL descriptor is registered.
However, this is also done by adding a specific constructor in the .init section. For the scheme described above to function properly, this constructor must be the first constructor to be called, before all the other constructors generated by the executable itself. The Wine build chain takes care of that, and also of generating the executable/DLL descriptor for the Winelib executable.

Starting a NE (Win16) process

When Wine is requested to run a NE (Win16) process, it will in fact hand over the execution of it to a specific executable, winevdm. VDM stands for Virtual DOS Machine. This winevdm will in fact set up the correct 16 bit environment to run the executable. Any new 16 bit process created by this executable (or its children) will run in the same winevdm instance. Within one instance, several functionalities will be provided to those 16 bit processes, including cooperative multitasking, sharing the same address space, and managing the selectors for the 16 bit segments needed for code, data and stack.

Note that several winevdm instances can run in the same Wine session, but the functionalities described above are only shared within a given instance, not among all the instances. winevdm is built as a Winelib application, and hence has access to any facility a 32 bit application has.

The behaviour we just described also applies to DOS executables, which are handled the same way by winevdm.

Wine drivers

Wine will not allow running native Windows drivers under Unix. This is mainly because (look at the generic architecture schemas) Wine doesn't implement the kernel features of Windows (kernel here really means the kernel, not the KERNEL32 DLL), but rather sets up a proxy layer on top of the Unix kernel to provide the NTDLL and KERNEL32 features. This means that Wine doesn't provide the inner infrastructure to run native drivers, either from the Win9x family or from the NT family.
In other words, Wine will only be able to provide access to a specific device if and only if 1/ this device is supported in Unix (there is a Unix driver to talk to it), and 2/ Wine has implemented the proxy code to make the glue between the API of a Windows driver and the Unix interface of the Unix driver.

Wine does, however, try to have the various DLLs that need to access devices do so through the standard Windows APIs for device drivers in user space. This is for example the case for the multimedia drivers, where Wine loads Wine builtin DLLs to talk to the OSS interface, or the ALSA interface. Those DLLs implement the same interface as any user space audio driver in Windows.
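The proxy idea can be pictured with a small Python sketch: one Windows-style entry point, with the real work handed to whichever Unix backend is plugged in. Everything here is hypothetical (the UnixBackend and ProxyAudioDriver classes, the message codes and the return codes); it does not reproduce the actual Windows driver messages or the OSS/ALSA interfaces.

```python
class UnixBackend:
    """Hypothetical stand-in for a Unix audio interface (OSS, ALSA...)."""

    def __init__(self, name):
        self.name = name
        self.frames = []        # data handed to the Unix side

    def write(self, data):
        self.frames.append(bytes(data))
        return len(data)


class ProxyAudioDriver:
    """Exposes one Windows-style entry point, forwards to the backend."""

    OPEN, WRITE, CLOSE = range(3)               # hypothetical messages
    OK, NOT_SUPPORTED, NOT_OPEN = range(3)      # hypothetical results

    def __init__(self, backend):
        self.backend = backend
        self.is_open = False

    def driver_message(self, message, payload=b""):
        if message == self.OPEN:
            self.is_open = True
            return self.OK
        if message == self.CLOSE:
            self.is_open = False
            return self.OK
        if message == self.WRITE:
            if not self.is_open:
                return self.NOT_OPEN
            self.backend.write(payload)         # the actual glue
            return self.OK
        return self.NOT_SUPPORTED
```

Code calling driver_message does not need to know which backend is behind it, which is the point of implementing the same interface as a user space driver.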
Module Overview

NTDLL Module

NTDLL provides most of the services you'd expect from a kernel. Process and thread management are part of them (even if process management is still mainly done in KERNEL32, unlike NT). A Windows process runs as a Unix process, and a Windows thread runs as a Unix thread. Wine also provides fibers (the Windows name for co-routines).

Most of the Windows memory handling (Heap, Global and Local functions, virtual memory...) is easily mapped onto its Unix equivalents. Note that NTDLL doesn't know about 16 bit memory, which is only handled in KERNEL32/KRNL386.EXE (and also the DOS routines).

File management

Wine uses some configuration in order to map Windows filenames (either defined with drive letters, or as UNC names) to Unix filenames. Wine also uses some incantation so that most file related APIs can also take full Unix names. This is handy when passing filenames on the command line.

File handles can be waitable objects, as Windows defines them.

Asynchronous I/O is implemented on file handles by queueing pseudo APCs. They are not real APCs in the sense that they have the same priority as the threads in the considered process (while APCs on NT normally have a higher priority). These APCs get called when invoking the Wine server (which should lead to correct behavior when the program ends up waiting on some object - waiting always implies calling the Wine server).

FIXME: this should be enhanced and updated to latest work on FS.

Synchronization

Most of the synchronization (between threads or processes) is done in the Wine server, which handles both the waiting operation (on a single object or a set of objects) and the signaling of objects.

Module (DLL) loading

Wine is able to load any NE and PE module. In all cases, the module's binary code is directly executed by the processor.

Device management

Wine allows the use of a wide variety of devices:
Communication ports are mapped to Unix communication ports (if they have sufficient permissions).
Parallel ports are mapped to Unix parallel ports (if they have sufficient permissions).
CDROM: the Windows device I/O control calls are mapped onto Unix ioctl().

Some Win9x VxDs are supported, by rewriting some of their internal behavior. But this support is limited. Programs that are portable to Windows NT shouldn't need them. Wine will not support native VxDs.

KERNEL Module

FIXME: Needs some content...

GDI Module

X Window System interface

The X libraries used to implement X clients (such as Wine) do not work properly if multiple threads access the same display concurrently. It is possible to compile the X libraries to perform their own synchronization (initiated by calling XInitThreads()). However, Wine does not use this approach. Instead Wine performs its own synchronization using the wine_tsx11_lock() / wine_tsx11_unlock() functions. This locking protects library access with a critical section, and also arranges things so that X libraries compiled without reentrancy support (e.g. with a global errno variable) will work with Wine.

In the past, all calls to X used to go through a wrapper called TSX...() (for "Thread Safe X ..."). While it is still being used in the code, it is inefficient, as the lock is potentially acquired and released unnecessarily. New code should explicitly acquire the lock.

USER Module

USER implements the windowing and messaging subsystems. It also contains code for common controls and for other miscellaneous stuff (rectangles, clipboard, WNet, etc). Wine USER code is located in the windows/, controls/, and misc/ directories.

Windowing subsystem

windows/win.c
windows/winpos.c

Windows are arranged into a parent/child hierarchy with one common ancestor for all windows (the desktop window).
Each window structure contains a pointer to the immediate ancestor (parent window if the WS_CHILD style bit is set), a pointer to the sibling (returned by GetWindow(..., GW_HWNDNEXT)), a pointer to the owner window (set only for a popup window if it was created with a valid hwndParent parameter), and a pointer to the first child window (GetWindow(.., GW_CHILD)). All popup and non-child windows are therefore placed in the first level of this hierarchy and their ancestor link (wnd->parent) points to the desktop window.

   Desktop window                      - root window
    |     \      `-.
    |      \        `-.
   popup -> wnd1  ->  wnd2             - top level windows
    |         \   `-.      `-.
    |          \     `-.      `-.
   child1    child2 -> child3  child4  - child windows

Horizontal arrows denote the sibling relationship, vertical lines the ancestor/child relationship. To summarize, all windows with the same immediate ancestor are sibling windows, and all windows which do not have the desktop as their immediate ancestor are child windows. Popup windows behave as topmost top-level windows unless they are owned. In this case the only requirement is that they must precede their owners in the top-level sibling list (they are not topmost). Child windows are confined to the client area of their parent windows (the client area is where a window gets to do its own drawing; the non-client area consists of caption, menu, borders, intrinsic scrollbars, and minimize/maximize/close/help buttons).

Another fairly important concept is z-order. It is derived from the ancestor/child hierarchy and is used to determine the "above/below" relationship. For instance, in the example above, the z-order is child1->popup->child2->child3->wnd1->child4->wnd2->desktop. The current active window ("foreground window" in Win32) is moved to the front of the z-order unless its top-level ancestor owns popup windows.

All these issues are dealt with (or supposed to be) in windows/winpos.c, with SetWindowPos() being the primary interface to the window manager.
Wine specifics: in default and managed mode each top-level window gets its own X counterpart, with the desktop window being basically a fake stub. In desktop mode, however, only the desktop window has an X window associated with it. Also, SetWindowPos() should eventually be implemented via Begin/End/DeferWindowPos() calls and not the other way around.

Visible region, clipping region and update region

windows/dce.c
windows/winpos.c
windows/painting.c

    ________________________
   |_________               |  A and B are child windows of C
   |    A    |______        |
   |         |      |       |
   |---------'      |       |
   |   |      B     |       |
   |   |            |       |
   |   `------------'       |
   |                   C    |
   `------------------------'

The visible region determines which part of the window is not obscured by other windows. If a window has the WS_CLIPCHILDREN style then all areas below its children are considered invisible. Similarly, if the WS_CLIPSIBLINGS bit is in effect then all areas obscured by its siblings are invisible. Child windows are always clipped by the boundaries of their parent windows.

B has a WS_CLIPSIBLINGS style:

   .          ______
   :         |      |
   |   ,-----'      |
   |   |      B     | - visible region of B
   |   |            |
   :   `------------'

When the program requests a display context (DC) for a window it can specify an optional clipping region that further restricts the area where the graphics output can appear. This area is calculated as an intersection of the visible region and the clipping region.

Program asked for a DC with a clipping region:

          ______
      ,--|--.   |     .    ,--.
   ,--+--'  |   |     :   _:  |
   |  |   B |   |  => |  |    | - DC region where the painting will
   |  |     |   |     |  |    |   be visible
   `--|-----|---'     :  `----'
      `-----'

When the window manager detects that some part of the window became visible, it adds this area to the update region of this window and then generates WM_ERASEBKGND and WM_PAINT messages. In addition, a WM_NCPAINT message is sent when the uncovered area intersects a non-client part of the window. The application must reply to the WM_PAINT message by calling the BeginPaint()/EndPaint() pair of functions.
BeginPaint() returns a DC that uses the accumulated update region as a clipping region. This operation cleans up the invalidated area, and the window will not receive another WM_PAINT until the window manager creates a new update region.

A was moved to the left:

    ________________________       ...          / C update region
   |______                  |     :      .___ /
   |  A   |_________        |  => |   ...|___|..
   |      |         |       |     |   :  |   |
   |------'         |       |     |   :  '---'
   |   |      B     |       |     |   :      \
   |   |            |       |     :            \
   |   `------------'       |                    B update region
   |                   C    |
   `------------------------'

Windows maintains a display context cache consisting of entries that include the DC itself, the window to which it belongs, and an optional clipping region (the visible region is stored in the DC itself). When an API call changes the state of the window tree, the window manager has to go through the DC cache to recalculate visible regions for entries whose windows were involved in the operation.

DC entries (DCE) can be either private to the window, or private to the window class, or shared between all windows (Windows 3.1 limits the number of shared DCEs to 5).

Messaging subsystem

windows/queue.c
windows/message.c

Each Windows task/thread has its own message queue - this is where it gets messages from. Messages can be:

    generated on the fly (WM_PAINT, WM_NCPAINT, WM_TIMER)
    created by the system (hardware messages)
    posted by other tasks/threads (PostMessage)
    sent by other tasks/threads (SendMessage)

Message priority: first the system looks for sent messages, then for posted messages, then for hardware messages, then it checks if the queue has the "dirty window" bit set, and, finally, it checks for expired timers. See windows/message.c.

Of all these different types of messages, only posted messages go directly into the private message queue.
System messages (even in Win95) are first collected in the system message queue, and then they either sit there until Get/PeekMessage gets to process them or, as in Win95, if the system queue is getting clobbered, a special thread ("raw input thread") assigns them to the private queues. Sent messages are queued separately and the sender sleeps until it gets a reply. Special messages are generated on the fly depending on the window/queue state. If the window update region is not empty, the system sets the QS_PAINT bit in the owning queue and eventually this window receives a WM_PAINT message (WM_NCPAINT too if the update region intersects with the non-client area). A timer event is raised when one of the queue timers expires. Depending on the timer parameters, DispatchMessage either calls the callback function or the window procedure. If there are no messages pending, the task/thread sleeps until messages appear.

There are several tricky moments (open for discussion):

System message order has to be honored and messages should be processed within the correct task/thread context. Therefore, when Get/PeekMessage encounters an unassigned system message and this message appears not to be for the current task/thread, it should either skip it (or get rid of it by moving it into the private message queue of the target task/thread - Win95, AFAIK) and look further, or roll back and then yield until this message gets processed when the system switches to the correct context (Win16). In the first case we lose correct message ordering; in the second case we have the infamous synchronous system message queue. Here is a post to one of the OS/2 newsgroups that I found to be relevant:
by David Charlap " Here's the problem in a nutshell, and there is no good solution. Every possible solution creates a different problem. With a windowing system, events can go to many different windows. Most are sent by applications or by the OS when things relating to that window happen (like repainting, timers, etc.) Mouse input events go to the window you click on (unless some window captures the mouse). So far, no problem. Whenever an event happens, you put a message on the target window's message queue. Every process has a message queue. If the process queue fills up, the messages back up onto the system queue. This is the first cause of apps hanging the GUI. If an app doesn't handle messages and they back up into the system queue, other apps can't get any more messages. The reason is that the next message in line can't go anywhere, and the system won't skip over it. This can be fixed by making apps have bigger private message queues. The SIQ fix does this. PMQSIZE does this for systems without the SIQ fix. Applications can also request large queues on their own. Another source of the problem, however, happens when you include keyboard events. When you press a key, there's no easy way to know what window the keystroke message should be delivered to. Most windowing systems use a concept known as "focus". The window with focus gets all incoming keyboard messages. Focus can be changed from window to window by apps or by users clicking on windows. This is the second source of the problem. Suppose window A has focus. You click on window B and start typing before the window gets focus. Where should the keystrokes go? On the one hand, they should go to A until the focus actually changes to B. On the other hand, you probably want the keystrokes to go to B, since you clicked there first. OS/2's solution is that when a focus-changing event happens (like clicking on a window), OS/2 holds all messages in the system queue until the focus change actually happens. 
This way, subsequent keystrokes go to the window you clicked on, even if it takes a while for that window to get focus. The downside is that if the window takes a real long time to get focus (maybe it's not handling events, or maybe the window losing focus isn't handling events), everything backs up in the system queue and the system appears hung. There are a few solutions to this problem. One is to make focus policy asynchronous. That is, focus changing has absolutely nothing to do with the keyboard. If you click on a window and start typing before the focus actually changes, the keystrokes go to the first window until focus changes, then they go to the second. This is what X-windows does. Another is what NT does. When focus changes, keyboard events are held in the system message queue, but other events are allowed through. This is "asynchronous" because the messages in the system queue are delivered to the application queues in a different order from that with which they were posted. If a bad app won't handle the "lose focus" message, it's of no consequence - the app receiving focus will get its "gain focus" message, and the keystrokes will go to it. The NT solution also takes care of the application queue filling up problem. Since the system delivers messages asynchronously, messages waiting in the system queue will just sit there and the rest of the messages will be delivered to their apps. The OS/2 SIQ solution is this: When a focus-changing event happens, in addition to blocking further messages from the application queues, a timer is started. When the timer goes off, if the focus change has not yet happened, the bad app has its focus taken away and all messages targeted at that window are skipped. When the bad app finally handles the focus change message, OS/2 will detect this and stop skipping its messages. As for the pros and cons: The X-windows solution is probably the easiest. 
The problem is that users generally don't like having to wait for the focus to change before they start typing. On many occasions, you can type and the characters end up in the wrong window because something (usually heavy system load) is preventing the focus change from happening in a timely manner. The NT solution seems pretty nice, but making the system message queue asynchronous can cause similar problems to the X-windows problem. Since messages can be delivered out of order, programs must not assume that two messages posted in a particular order will be delivered in that same order. This can break legacy apps, but since Win32 always had an asynchronous queue, it is fair to simply tell app designers "don't do that". It's harder to tell app designers something like that on OS/2 - they'll complain "you changed the rules and our apps are breaking." The OS/2 solution's problem is that nothing happens until you try to change window focus, and then wait for the timeout. Until then, the bad app is not detected and nothing is done."
Intertask/interthread SendMessage. The system has to inform the target queue about the forthcoming message, then it has to carry out the context switch and wait until the result is available. Win16 stores the necessary parameters in the queue structure and then calls the DirectedYield() function. However, in Win32 there could be several messages pending, sent by preemptively executing threads, and in this case SendMessage has to build some sort of message queue for sent messages. Another issue is what to do with messages sent to the sender when it is blocked inside its own SendMessage.
Wine/Windows DLLs

This document mainly deals with the status of current DLL support by Wine. The Wine ini file currently supports settings to change the load order of DLLs. The load order depends on several issues, which result in different settings for various DLLs.

Pros of Native DLLs

Native DLLs of course guarantee 100% compatibility for the routines they implement. For example, using the native USER DLL would maintain a virtually perfect and Windows 95-like look for window borders, dialog controls, and so on. Using the built-in Wine version of this library, on the other hand, would produce a display that does not precisely mimic that of Windows 95. Such subtle differences can appear in other important DLLs, such as the common controls library COMMCTRL or the common dialogs library COMMDLG, when built-in Wine DLLs outrank other types in load order.

More significant, less aesthetically-oriented problems can result if the built-in Wine version of the SHELL DLL is loaded before the native version of this library. SHELL contains routines such as those used by installer utilities to create desktop shortcuts. Some installers might fail when using Wine's built-in SHELL.

Cons of Native DLLs

Not every application performs better under native DLLs. If a library tries to access features of the rest of the system that are not fully implemented in Wine, the native DLL might work much worse than the corresponding built-in one, if at all. For example, the native Windows GDI library must be paired with a Windows display driver, which of course is not present under Intel Unix and Wine.

Finally, built-in Wine DLLs occasionally implement more features than the corresponding native Windows DLLs. Probably the most important example of such behavior is the integration of Wine with X provided by Wine's built-in USER DLL.
Should the native Windows USER library take load-order precedence, such features as the ability to use the clipboard or drag-and-drop between Wine windows and X windows will be lost.

Deciding Between Native and Built-In DLLs

Clearly, there is no one rule-of-thumb regarding which load-order to use. So, you must become familiar with what specific DLLs do and which other DLLs or features a given library interacts with, and use this information to make a case-by-case decision.

Load Order for DLLs

Using the DLL sections from the wine configuration file, the load order can be tweaked to a high degree. In general it is advised not to change the settings of the configuration file. The default configuration specifies the right load order for the most important DLLs.

The default load order follows this algorithm: for all DLLs which have a fully-functional Wine implementation, or where the native DLL is known not to work, the built-in library will be loaded first. In all other cases, the native DLL takes load-order precedence.

The DefaultLoadOrder setting from the [DllDefaults] section specifies, for all DLLs, which version to try first. See the manpage for an explanation of the arguments.

The [DllOverrides] section deals with DLLs that need a different-from-default treatment.

The [DllPairs] section is for DLLs that must be loaded in pairs. In general, these are DLLs for either 16-bit or 32-bit applications. In most cases in Windows, the 32-bit version cannot be used without its 16-bit counterpart. For Wine, it is customary that the 16-bit implementations rely on the 32-bit implementations and cast the results back to 16-bit arguments. Changing anything in this section is bound to result in errors.

For the future, the Wine implementation of Windows DLLs seems to be heading towards unifying the 16 and 32 bit DLLs wherever possible, resulting in larger DLLs. They are stored in the dlls/ subdirectory using the 32-bit name.
Understanding What DLLs Do

The following list briefly describes each of the DLLs commonly found in Windows whose load order may be modified during the configuration and compilation of Wine. (See also ./DEVELOPER-HINTS or the dlls/ subdirectory to see which DLLs are currently being rewritten for Wine.)

ADVAPI32.DLL: 32-bit application advanced programming interfaces like crypto, systeminfo, security and event logging
AVIFILE.DLL: 32-bit application programming interfaces for the Audio Video Interleave (AVI) Windows-specific Microsoft audio-video standard
COMMCTRL.DLL: 16-bit common controls
COMCTL32.DLL: 32-bit common controls
COMDLG32.DLL: 32-bit common dialogs
COMMDLG.DLL: 16-bit common dialogs
COMPOBJ.DLL: OLE 16- and 32-bit compatibility libraries
CRTDLL.DLL: Microsoft C runtime
DCIMAN.DLL: 16-bit display controls
DCIMAN32.DLL: 32-bit display controls
DDEML.DLL: DDE messaging
D3D*.DLL: DirectX/Direct3D drawing libraries
DDRAW.DLL: DirectX drawing libraries
DINPUT.DLL: DirectX input libraries
DISPLAY.DLL: Display libraries
DPLAY.DLL, DPLAYX.DLL: DirectX DirectPlay (network play) libraries
DSOUND.DLL: DirectX audio libraries
GDI.DLL: 16-bit graphics driver interface
GDI32.DLL: 32-bit graphics driver interface
IMAGEHLP.DLL: 32-bit image helper libraries (for PE executables)
IMM32.DLL: 32-bit IMM API
IMGUTIL.DLL:
KERNEL32.DLL: 32-bit kernel DLL
KEYBOARD.DLL: Keyboard drivers
LZ32.DLL: 32-bit Lempel-Ziv or LZ file compression used by the installshield installers (???).
LZEXPAND.DLL: LZ file expansion; needed for Windows Setup
MMSYSTEM.DLL: Core of the Windows multimedia system
MOUSE.DLL: Mouse drivers
MPR.DLL: 32-bit Windows network interface
MSACM.DLL: Core of the Audio Compression Manager (ACM) system
MSACM32.DLL: Core of the 32-bit ACM system
MSNET32.DLL: 32-bit network APIs
MSVFW32.DLL: 32-bit Windows video system
MSVIDEO.DLL: 16-bit Windows video system
OLE2.DLL: OLE 2.0 libraries
OLE32.DLL: 32-bit OLE 2.0 components
OLE2CONV.DLL: Import filter for graphics files
OLE2DISP.DLL, OLE2NLS.DLL: OLE 2.1 16- and 32-bit interoperability
OLE2PROX.DLL: Proxy server for OLE 2.0
OLE2THK.DLL: Thunking for OLE 2.0
OLEAUT32.DLL: 32-bit OLE 2.0 automation
OLECLI.DLL: 16-bit OLE client
OLECLI32.DLL: 32-bit OLE client
OLEDLG.DLL: OLE 2.0 user interface support
OLESVR.DLL: 16-bit OLE server libraries
OLESVR32.DLL: 32-bit OLE server libraries
PSAPI.DLL: Process Status API libraries
RASAPI16.DLL: 16-bit Remote Access Services libraries
RASAPI32.DLL: 32-bit Remote Access Services libraries
SHELL.DLL: 16-bit Windows shell used by Setup
SHELL32.DLL: 32-bit Windows shell (COM object?)
TAPI/TAPI32/TAPIADDR: Telephone API (for modems)
W32SKRNL: Win32s Kernel ? (not in use for Win95 and up!)
WIN32S16.DLL: Application compatibility for Win32s
WIN87EM.DLL: 80387 math-emulation libraries
WINASPI.DLL: Advanced SCSI Peripheral Interface or ASPI libraries
WINDEBUG.DLL: Windows debugger
WINMM.DLL: Libraries for multimedia thunking
WING.DLL: Libraries required to "draw" graphics
WINSOCK.DLL: Sockets APIs
WINSPOOL.DLL: Print spooler libraries
WNASPI32.DLL: 32-bit ASPI libraries
WSOCK32.DLL: 32-bit sockets APIs
The Wine initialization process

Wine has a rather complex startup procedure, so unlike many programs the best place to begin exploring the code-base is not in fact at the main() function but instead at some of the more straightforward DLLs that exist on the periphery, such as MSI, the widget library (in USER and COMCTL32) etc. The purpose of this section is to document and explain how Wine starts up, from the moment the user runs "wine myprogram.exe" to the point at which myprogram gets control.

First Steps

The actual wine binary that the user runs does not do very much; in fact it is only responsible for checking the threading model in use (NPTL vs LinuxThreads) and then invoking a new binary which performs the next stage in the startup sequence. See the threading chapter for more information on this check and why it's necessary. You can find this code in loader/glibc.c. The result of this check is an exec of either wine-pthread or wine-kthread, potentially (on Linux) via the preloader. We need to use separate binaries here because overriding the native pthreads library requires us to exploit a property of ELF symbol fixup semantics: it's not possible to do this without starting a new process.

The Wine preloader is found in loader/preloader.c, and is required in order to impose a Win32 style address space layout upon the newly created Win32 process. The details of what this does are covered in the address space layout chapter. The preloader is a statically linked ELF binary which is passed the name of the actual Wine binary to run (either wine-kthread or wine-pthread) along with the arguments the user passed in from the command line. The preloader is an unusual program: it does not have a main() function.
In standard ELF applications, the entry point is actually at a symbol named _start: this is provided by the standard gcc infrastructure and normally jumps to __libc_start_main, which initializes glibc before passing control to the main function as defined by the programmer.

The preloader takes control directly from the entry point for a few reasons. Firstly, it is required that glibc is not initialized twice: the result of such behaviour is undefined and subject to change without notice. Secondly, it's possible that as part of initializing glibc, the address space layout could be changed - for instance, any call to malloc will initialize a heap arena which modifies the VM mappings. Finally, glibc does not return to _start at any point, so by reusing it we avoid the need to recreate the ELF bootstrap stack (env, argv, auxiliary array etc).

The preloader is responsible for two things: protecting important regions of the address space so the dynamic linker does not map shared libraries into them, and, once that is done, loading the real Wine binary off disk, linking it and starting it up. Normally all this is done automatically by glibc and the kernel, but as we intercepted this process by using a static binary it's up to us to restart the process. The bulk of the code in the preloader is about loading wine-[pk]thread and ld-linux.so.2 off disk, linking them together, then starting the dynamic linking process.

One of the last things the preloader does before jumping into the dynamic linker is scan the symbol table of the loaded Wine binary and set the value of a global variable directly: this is a more efficient way of passing information to the main Wine program than flattening the data structures into an environment variable or command line parameter and then unpacking it on the other side, but it achieves pretty much the same thing. The global variable that is set points to the preload descriptor table, which contains the VMA regions protected by the preloader.
This allows Wine to unmap them once the dynamic linker has been run, leaving gaps we can initialize properly later on.

Starting the emulator

The process of starting up the emulator itself is mostly one of chaining through various initializer functions defined in the core libraries and DLLs: libwine, then NTDLL, then kernel32.

Both the wine-pthread and wine-kthread binaries share a common main function, defined in loader/main.c, so no matter which binary is selected after the preloader has run we start here. This passes the information provided by the preloader into libwine and then calls wine_init, defined in libs/wine/loader.c. This is where the emulation really starts: wine_init can, with the correct preparation, be called from programs other than the wine loader itself.

wine_init does some very basic setup tasks such as initializing the debugging infrastructure and yet more address space manipulation (see the information on the 4G/4G VM split in the address space chapter), before loading NTDLL - the core of both Wine and the Windows NT series - and jumping to the __wine_process_init function defined in dlls/ntdll/loader.c.

This function is responsible for initializing the primary Win32 environment. In thread_init(), it sets up the TEB, the wineserver connection for the main thread and the process heap. See the threading chapter for more information on this.

Finally, it loads and jumps to __wine_kernel_init in kernel32.dll: this is defined in dlls/kernel32/process.c. This is where the bulk of the work is done. The kernel32 initialization code retrieves the startup info for the process from the server, initializes the registry, sets up the drive mapping system and locale data, then begins loading the requested application itself. Each process has a STARTUPINFO block that can be passed into CreateProcess specifying various things like how the first window should be displayed: this is sent to the new process via the wineserver.
After determining the type of file given to Wine by the user (a Win32 EXE file, a Win16 EXE, a Winelib app etc), the program is loaded into memory (which may involve loading and initializing other DLLs, the bulk of Wine's startup code), before control reaches the end of __wine_kernel_init. This function ends with the new process stack being initialized, and start_process being called on the new stack. Nearly there!

The final element of initializing Wine is starting the newly loaded program itself. start_process sets up the SEH backstop handler, calls LdrInitializeThunk which performs the last part of the process initialization (such as performing relocations and calling the DllMains with PROCESS_ATTACH), grabs the entry point of the executable and then on this line:

    ExitProcess( entry( peb ) );

... jumps to the entry point of the program. At this point the user's program is running and the API provided by Wine is ready to be used. When entry returns, the ExitProcess API will be used to initiate a graceful shutdown.