<chapter id="implementation">
  <title>Low-level Implementation</title>
  <para>Details of Wine's Low-level Implementation...</para>

  <sect1 id="config-keyboard">
    <title>Keyboard</title>

    <para>
      Wine needs to know about your keyboard layout. Many
      applications read scancodes directly instead of just taking
      the characters returned by the X server, so Wine needs a
      mapping from X keys to the scancodes these programs expect.
    </para>
    <para>
      On startup, Wine tries to recognize the active X layout by
      checking whether it matches any of the defined tables. If it
      does, nothing more is needed. If not, you need to define it.
    </para>
    <para>
      To do this, open the file
      <filename>dlls/x11drv/keyboard.c</filename> and take a look
      at the existing tables. Make a backup copy of it, especially
      if you don't use CVS.
    </para>
    <para>
      What you need to do is find out which scancode each key
      needs to generate. Find it in the
      <function>main_key_scan</function> table, which looks like
      this:
    </para>
    <programlisting>
static const int main_key_scan[MAIN_LEN] =
{
/* this is my (102-key) keyboard layout, sorry if it doesn't quite match yours */
   0x29,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0A,0x0B,0x0C,0x0D,
   0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1A,0x1B,
   0x1E,0x1F,0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x2B,
   0x2C,0x2D,0x2E,0x2F,0x30,0x31,0x32,0x33,0x34,0x35,
   0x56 /* the 102nd key (actually to the right of l-shift) */
};
    </programlisting>
    <para>
      Next, assign each scancode the characters imprinted on the
      keycaps. This was done (sort of) for the US 101-key keyboard,
      which you can find near the top of
      <filename>keyboard.c</filename>. It also shows that if there
      is no 102nd key, you can skip that entry.
    </para>
    <para>
      However, for most international 102-key keyboards, we have
      made it easy for you. The scancode layout for these already
      pretty much matches the physical layout in
      <function>main_key_scan</function>, so all you need to do is
      go through all the keys that generate characters on your
      main keyboard (except the spacebar) and put them into an
      appropriate table. The only exception is the 102nd key,
      which is usually to the left of the first key of the last line
      (usually <keycap>Z</keycap>): it must be placed on a separate
      line after the last line.
    </para>
    <para>
      For example, my Norwegian keyboard looks like this:
    </para>
    <screen>
   §  !  "  #  ¤  %  &  /  (  )  =  ?  `  Back-
   |  1  2@ 3£ 4$ 5  6  7{ 8[ 9] 0} +  \´ space

   Tab Q  W  E  R  T  Y  U  I  O  P  Å  ^
                                        ¨~
                                                Enter
   Caps A  S  D  F  G  H  J  K  L  Ø  Æ  *
   Lock                                  '

   Sh- > Z  X  C  V  B  N  M  ;  :  _  Shift
   ift <                      ,  .  -

   Ctrl Alt       Spacebar       AltGr  Ctrl
    </screen>
    <para>
      Note the 102nd key, which is the <keycap><></keycap> key, to
      the left of <keycap>Z</keycap>. The character to the right of
      the main character is the character generated by
      <keycap>AltGr</keycap>.
    </para>
    <para>
      This keyboard is defined as follows:
    </para>
    <programlisting>
static const char main_key_NO[MAIN_LEN][4] =
{
 "|§","1!","2\"@","3#£","4¤$","5%","6&","7/{","8([","9)]","0=}","+?","\\´",
 "qQ","wW","eE","rR","tT","yY","uU","iI","oO","pP","åÅ","¨^~",
 "aA","sS","dD","fF","gG","hH","jJ","kK","lL","øØ","æÆ","'*",
 "zZ","xX","cC","vV","bB","nN","mM",",;",".:","-_",
 "<>"
};
    </programlisting>
    <para>
      Apart from the fact that " and \ need to be escaped with a
      backslash, and that the 102nd key is on a separate line, it's
      pretty straightforward.
    </para>
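    <para>
      As a rough illustration of why the layout table must list keys
      in the same order as the scancode table, the following sketch
      looks up a scancode by position. It is not Wine's actual
      detection code: <constant>DEMO_LEN</constant>,
      <function>demo_key_scan</function>,
      <function>demo_key_US</function> and
      <function>demo_scancode_for_char</function> are hypothetical
      names, and the US-style table is only an assumed example of a
      layout without a 102nd key.
    </para>
    <programlisting>
#include <string.h>

#define DEMO_LEN 48  /* hypothetical: number of entries in the table shown earlier */

/* Scancodes in physical key order, copied from the main_key_scan listing */
static const int demo_key_scan[DEMO_LEN] =
{
   0x29,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0A,0x0B,0x0C,0x0D,
   0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1A,0x1B,
   0x1E,0x1F,0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x2B,
   0x2C,0x2D,0x2E,0x2F,0x30,0x31,0x32,0x33,0x34,0x35,
   0x56
};

/* Assumed US-style layout; there is no 102nd key, so the last entry stays empty */
static const char demo_key_US[DEMO_LEN][4] =
{
 "`~","1!","2@","3#","4$","5%","6^","7&","8*","9(","0)","-_","=+",
 "qQ","wW","eE","rR","tT","yY","uU","iI","oO","pP","[{","]}",
 "aA","sS","dD","fF","gG","hH","jJ","kK","lL",";:","'\"","\\|",
 "zZ","xX","cC","vV","bB","nN","mM",",<",".>","/?"
};

/* Return the scancode of the key that produces c, or -1 if no key does */
int demo_scancode_for_char(char c)
{
    int i;

    if (c == '\0') return -1;
    for (i = 0; i < DEMO_LEN; i++)
        if (strchr(demo_key_US[i], c) != NULL)
            return demo_key_scan[i];  /* same index in both tables */
    return -1;
}
    </programlisting>
    <para>
      With these tables, <literal>demo_scancode_for_char('q')</literal>
      returns <literal>0x10</literal> and
      <literal>demo_scancode_for_char('/')</literal> returns
      <literal>0x35</literal>, the entries at the same positions in
      the scancode table.
    </para>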
    <para>
      After you have written such a table, you need to add it to the
      <function>main_key_tab[]</function> layout index table. The
      entry looks like this:
    </para>
    <programlisting>
static struct {
 WORD lang, ansi_codepage, oem_codepage;
 const char (*key)[MAIN_LEN][4];
} main_key_tab[]={
...
...
 {MAKELANGID(LANG_NORWEGIAN,SUBLANG_DEFAULT),  1252, 865, &main_key_NO},
...
    </programlisting>
    <para>
      After you have added your table, recompile Wine and test that
      it works. If it fails to detect your table, try running
    </para>
    <screen>
WINEDEBUG=+key,+keyboard wine > key.log 2>&1
    </screen>
    <para>
      and look in the resulting <filename>key.log</filename> file to
      find the error messages it gives for your layout.
    </para>
    <para>
      Note that the <constant>LANG_*</constant> and
      <constant>SUBLANG_*</constant> definitions are in
      <filename>include/winnls.h</filename>. You may need them to
      find out which number your language is assigned, so that you
      can find it in the WINEDEBUG output. The number is
      <literal>(SUBLANG * 0x400 + LANG)</literal>, so, for example,
      the combination <literal>LANG_NORWEGIAN (0x14)</literal> and
      <literal>SUBLANG_DEFAULT (0x1)</literal> gives (in hex)
      <literal>14 + 1*400 = 414</literal>. Since I'm Norwegian, I
      could look for <literal>0414</literal> in the WINEDEBUG output
      to find out why my keyboard is not detected.
    </para>
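    <para>
      The arithmetic above can be checked with a few lines of C. This
      is only a sketch: the <constant>DEMO_*</constant> constants and
      the <function>demo_*</function> functions are hypothetical
      stand-ins, and the authoritative values are the ones in
      <filename>include/winnls.h</filename>.
    </para>
    <programlisting>
#include <stdio.h>

/* Hypothetical stand-ins for the values quoted in the text */
#define DEMO_LANG_NORWEGIAN  0x14
#define DEMO_SUBLANG_DEFAULT 0x01

/* Combine the two numbers as SUBLANG * 0x400 + LANG */
unsigned int demo_langid(unsigned int lang, unsigned int sublang)
{
    return sublang * 0x400 + lang;
}

/* Write the id as four hex digits into out, which must hold at least 5 bytes */
void demo_langid_string(unsigned int id, char *out, size_t outlen)
{
    snprintf(out, outlen, "%04x", id);
}
    </programlisting>
    <para>
      Calling <literal>demo_langid(DEMO_LANG_NORWEGIAN,
      DEMO_SUBLANG_DEFAULT)</literal> gives <literal>0x414</literal>,
      which <function>demo_langid_string</function> formats as
      <literal>0414</literal>.
    </para>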
    <para>
      Once it works, submit it to the Wine project. If you use CVS,
      you just have to run
    </para>
    <screen>
cvs -z3 diff -u dlls/x11drv/keyboard.c > layout.diff
    </screen>
    <para>
      from your main Wine directory, then submit
      <filename>layout.diff</filename> to
      <email>wine-patches@winehq.org</email> along with a brief note
      of what it is.
    </para>
    <para>
      If you don't use CVS, you need to run
    </para>
    <screen>
diff -u the_backup_file_you_made dlls/x11drv/keyboard.c > layout.diff
    </screen>
    <para>
      and submit it as explained above.
    </para>
    <para>
      If you did it right, it will be included in the next Wine
      release, and all the troublesome programs (especially
      remote-control programs) and games that use scancodes will
      be happily using your keyboard layout, and you won't get those
      annoying fixme messages either.
    </para>
  </sect1>


  <sect1 id="undoc-func">
    <title>Undocumented APIs</title>

    <para>
      Some background: On the i386 class of machines, stack entries are
      usually dwords (4 bytes) in size, little-endian. The stack grows
      downward in memory. The stack pointer, maintained in the
      <literal>esp</literal> register, points to the last valid entry;
      thus, the operation of pushing a value onto the stack involves
      decrementing <literal>esp</literal> and then moving the value into
      the memory pointed to by <literal>esp</literal>
      (i.e., <literal>push p</literal> in assembly resembles
      <literal>*(--esp) = p;</literal> in C). Removing (popping)
      values off the stack is the reverse (i.e., <literal>pop p</literal>
      corresponds to <literal>p = *(esp++);</literal> in C).
    </para>

    <para>
      In the <literal>stdcall</literal> calling convention, arguments are
      pushed onto the stack right-to-left. For example, the C call
      <function>myfunction(40, 20, 70, 30);</function> is expressed in
      Intel assembly as:
      <screen>
    push 30
    push 70
    push 20
    push 40
    call myfunction
      </screen>
      The called function is responsible for removing the arguments
      off the stack. Thus, before the call to myfunction, the
      stack would look like:
      <screen>
             [local variable or temporary]
             [local variable or temporary]
              30
              70
              20
    esp ->    40
      </screen>
      After the call returns, it should look like:
      <screen>
             [local variable or temporary]
    esp ->   [local variable or temporary]
      </screen>
    </para>

    <para>
      To restore the stack to this state, the called function must know how
      many arguments to remove (which is the number of arguments it takes).
      This is a problem if the function is undocumented.
    </para>

    <para>
      One way to attempt to document the number of arguments each function
      takes is to create a wrapper around that function that detects the
      stack offset. Essentially, each wrapper assumes that the function will
      take a large number of arguments. The wrapper copies each of these
      arguments into its stack, calls the actual function, and then calculates
      the number of arguments by checking esp before and after the call.
    </para>

    <para>
      The main problem with this scheme is that the function must actually
      be called from another program. Many of these functions are seldom
      used. An attempt was made to aggressively query each function in a
      given library (<filename>ntdll.dll</filename>) by passing 64 arguments,
      all 0, to each function. Unfortunately, Windows NT quickly goes to a
      blue screen of death, even if the program is run from a
      non-administrator account.
    </para>

    <para>
      Another method that has been much more successful is to attempt to
      figure out how many arguments each function removes from the
      stack. The instruction that does this, <literal>ret hhll</literal>
      (where <symbol>hhll</symbol> is the number of bytes to remove, i.e.
      the number of arguments times 4), is encoded in memory as the bytes
      <literal>0xc2 ll hh</literal>. It is a reasonable
      assumption that few, if any, functions take more than 16 arguments;
      therefore, simply searching for
      <literal>hh == 0 && ll <= 0x40</literal> starting from the
      address of a function yields the correct number of arguments most
      of the time.
    </para>

    <para>
      Of course, this is not without errors. <literal>ret 00ll</literal>
      is not the only instruction that can have the byte sequence
      <literal>0xc2 ll 0x0</literal>; for example,
      <literal>push 0x000040c2</literal> has the byte sequence
      <literal>0x68 0xc2 0x40 0x0 0x0</literal>, which matches
      the above. Properly, the utility should look for this sequence
      only on an instruction boundary; unfortunately, finding
      instruction boundaries on an i386 requires implementing a full
      disassembler, which is quite a daunting task. Besides, the
      probability of having such a byte sequence that is not the actual
      return instruction is fairly low.
    </para>
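
    <para>
      The byte search described above might be sketched as follows.
      This is an illustration under stated assumptions rather than the
      utility that was actually used:
      <function>demo_guess_stdcall_args</function> is a hypothetical
      name, the caller is assumed to supply the function's bytes and a
      length limit, and the bound is written as
      <literal>ll <= 0x40</literal> so that a 16-argument function is
      still accepted.
    </para>
    <programlisting>
#include <stddef.h>

/*
 * Scan code[0..len) for the first byte sequence 0xc2 ll 0x00 with
 * ll <= 0x40 and return ll / 4 as the guessed argument count.
 * Returns -1 if no such sequence is found.
 */
int demo_guess_stdcall_args(const unsigned char *code, size_t len)
{
    size_t i;

    for (i = 0; i + 2 < len; i++)
    {
        if (code[i] == 0xc2 && code[i + 2] == 0x00 && code[i + 1] <= 0x40)
            return code[i + 1] / 4;
    }
    return -1;
}
    </programlisting>
    <para>
      Because the scan ignores instruction boundaries, feeding it the
      bytes of <literal>push 0x000040c2</literal> yields a guess of 16
      arguments, which is exactly the kind of false match discussed
      above.
    </para>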

    <para>
      Much more troublesome is the non-linear flow of a function. For
      example, consider the following two functions:
      <screen>
    somefunction1:
            jmp  somefunction1_impl

    somefunction2:
            ret  0004

    somefunction1_impl:
            ret  0008
      </screen>
      In this case, we would incorrectly detect both
      <function>somefunction1</function> and
      <function>somefunction2</function> as taking only a single
      argument, whereas <function>somefunction1</function> really
      takes two arguments.
    </para>

    <para>
      With these limitations in mind, it is possible to implement more stubs
      in Wine and, eventually, the functions themselves.
    </para>
  </sect1>

  <sect1 id="accel-impl">
    <title>Accelerators</title>

    <para>
      There are <emphasis>three</emphasis> differently sized
      accelerator structures exposed to the user:
    </para>
    <orderedlist>
      <listitem>
        <para>
          Accelerators in NE resources. This is also the internal
          layout of the global handle <type>HACCEL</type> (16 and
          32) in Windows 95 and Wine. They are exposed to the user
          as Win16 global handles <type>HACCEL16</type> and
          <type>HACCEL32</type> by the Win16/Win32 API.
          These are 5 bytes long, with no padding:
          <programlisting>
BYTE   fVirt;
WORD   key;
WORD   cmd;
          </programlisting>
        </para>
      </listitem>
      <listitem>
        <para>
          Accelerators in PE resources. They are exposed to the user
          only by directly accessing PE resources.
          These have a size of 8 bytes:
        </para>
        <programlisting>
BYTE   fVirt;
BYTE   pad0;
WORD   key;
WORD   cmd;
WORD   pad1;
        </programlisting>
      </listitem>
      <listitem>
        <para>
          Accelerators in the Win32 API. These are exposed to the
          user by the <function>CopyAcceleratorTable</function>
          and <function>CreateAcceleratorTable</function> functions
          in the Win32 API.
          These have a size of 6 bytes:
        </para>
        <programlisting>
BYTE   fVirt;
BYTE   pad0;
WORD   key;
WORD   cmd;
        </programlisting>
      </listitem>
    </orderedlist>

    <para>
      Why two types of accelerators in the Win32 API? We can only
      guess, but my best bet is that the Win32 resource compiler
      cannot or does not handle struct packing. Win32 <type>ACCEL</type>
      is defined using <function>#pragma pack(2)</function> for the
      compiler but without any packing for RC, so it will assume
      <function>#pragma pack(4)</function>.
    </para>
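
    <para>
      The three sizes can be reproduced with a small C sketch. The
      struct and function names below are hypothetical, fixed-width
      types stand in for <type>BYTE</type> and <type>WORD</type>, and
      the code assumes a compiler that accepts
      <literal>#pragma pack(push, 1)</literal>, as GCC, Clang and MSVC
      do.
    </para>
    <programlisting>
#include <stdint.h>

/* NE resource / HACCEL layout: 5 bytes, so byte packing is required */
#pragma pack(push, 1)
struct demo_accel_ne
{
    uint8_t  fVirt;
    uint16_t key;
    uint16_t cmd;
};
#pragma pack(pop)

/* PE resource layout: 8 bytes, with explicit padding members */
struct demo_accel_pe
{
    uint8_t  fVirt;
    uint8_t  pad0;
    uint16_t key;
    uint16_t cmd;
    uint16_t pad1;
};

/* Win32 API layout: 6 bytes */
struct demo_accel_api
{
    uint8_t  fVirt;
    uint8_t  pad0;
    uint16_t key;
    uint16_t cmd;
};

/* Copy the meaningful members of a PE entry into an API-sized entry */
struct demo_accel_api demo_pe_to_api(const struct demo_accel_pe *pe)
{
    struct demo_accel_api api;

    api.fVirt = pe->fVirt;
    api.pad0  = 0;
    api.key   = pe->key;
    api.cmd   = pe->cmd;
    return api;
}
    </programlisting>
    <para>
      Checking <literal>sizeof</literal> on the three structs gives 5,
      8 and 6, matching the list above; converting between them means
      copying <structfield>fVirt</structfield>,
      <structfield>key</structfield> and
      <structfield>cmd</structfield> member by member rather than
      copying raw bytes.
    </para>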

  </sect1>

  <sect1 id="hardware-trace">
    <title>Doing A Hardware Trace</title>

    <para>
      The primary reason to do this is to reverse engineer a
      hardware device for which you don't have documentation, but
      can get to work under Wine.
    </para>
    <para>
      This section is aimed at parallel port devices, and in
      particular parallel port scanners, which are now so cheap they
      are virtually being given away. The problem is that few
      manufacturers will release any programming information, which
      prevents drivers being written for SANE, and the traditional
      technique of using DOSemu to produce the traces does not work,
      as the scanners invariably only have drivers for Windows.
    </para>
    <para>
      Presuming that you have compiled and installed Wine, the first
      thing to do is to enable direct hardware access to your
      parallel port. To do this, edit <filename>config</filename>
      (usually in <filename>~/.wine/</filename>) and in the
      ports section add the following two lines:
    </para>
    <programlisting>
read=0x378,0x379,0x37a,0x37c,0x77a
write=0x378,0x379,0x37a,0x37c,0x77a
    </programlisting>
    <para>
      This adds the necessary access required for an SPP/PS2/EPP/ECP
      parallel port on LPT1. You will need to adjust these numbers
      accordingly if your parallel port is on LPT2 or LPT0.
    </para>
    <para>
      When starting Wine, use the following command line, where
      <literal>XXXX</literal> is the program you need to run in
      order to access your scanner, and <literal>YYYY</literal> is
      the file your trace will be stored in:
    </para>
    <programlisting>
WINEDEBUG=+io wine XXXX 2> >(sed 's/^[^:]*:io:[^ ]* //' > YYYY)
    </programlisting>
    <para>
      You will need large amounts of hard disk space (hundreds
      of megabytes if you do a full page scan) and, for reasonable
      performance, a really fast processor and lots of RAM.
    </para>
    <para>
      You will need to postprocess the output into a more manageable
      format, using the <command>shrink</command> program. First
      you need to compile the source (which is located at the end of
      this section):
      <programlisting>
cc shrink.c -o shrink
      </programlisting>
    </para>
    <para>
      Use the <command>shrink</command> program to reduce the
      physical size of the raw log as follows:
    </para>
    <programlisting>
cat log | shrink > log2
    </programlisting>
    <para>
      The trace has the basic form of
    </para>
    <programlisting>
XXXX > YY @ ZZZZ:ZZZZ
    </programlisting>
    <para>
      where <literal>XXXX</literal> is the port in hexadecimal being
      accessed, <literal>YY</literal> is the data written to (or read
      from) the port, and <literal>ZZZZ:ZZZZ</literal> is the address
      in memory of the instruction that accessed the port. The
      direction of the arrow indicates whether the data was written
      to or read from the port.
    </para>
    <programlisting>
> data was written to the port
< data was read from the port
    </programlisting>
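    <para>
      A line in this format can be picked apart with
      <function>sscanf</function>. The following is a minimal sketch,
      assuming every field is hexadecimal as in the samples in this
      section; <type>struct demo_io_access</type> and
      <function>demo_parse_io_line</function> are hypothetical names.
    </para>
    <programlisting>
#include <stdio.h>

struct demo_io_access
{
    unsigned int port;      /* XXXX: port that was accessed */
    unsigned int value;     /* YY: byte written or read */
    unsigned int seg, off;  /* ZZZZ:ZZZZ: address of the instruction */
    int is_write;           /* 1 for '>', 0 for '<' */
};

/* Parse one trace line; return 1 on success, 0 if the line does not match */
int demo_parse_io_line(const char *line, struct demo_io_access *out)
{
    char dir;

    if (sscanf(line, "%x %c %x @ %x:%x",
               &out->port, &dir, &out->value, &out->seg, &out->off) != 5)
        return 0;
    if (dir != '>' && dir != '<')
        return 0;
    out->is_write = (dir == '>');
    return 1;
}
    </programlisting>
    <para>
      Lines that do not follow the pattern, such as the repeat
      markers written by <command>shrink</command>, make the function
      return 0 so that a caller can skip them.
    </para>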
    <para>
      My basic tip for interpreting these logs is to pay close
      attention to the addresses of the I/O instructions. Their
      grouping and sometimes proximity should reveal the presence of
      subroutines in the driver. By studying how the groups vary
      you should be able to work out what each subroutine does. For
      example, consider the following section of trace from my UMAX
      Astra 600P:
    </para>
    <programlisting>
0x378 > 55 @ 0297:01ec
0x37a > 05 @ 0297:01f5
0x379 < 8f @ 0297:01fa
0x37a > 04 @ 0297:0211
0x378 > aa @ 0297:01ec
0x37a > 05 @ 0297:01f5
0x379 < 8f @ 0297:01fa
0x37a > 04 @ 0297:0211
0x378 > 00 @ 0297:01ec
0x37a > 05 @ 0297:01f5
0x379 < 8f @ 0297:01fa
0x37a > 04 @ 0297:0211
0x378 > 00 @ 0297:01ec
0x37a > 05 @ 0297:01f5
0x379 < 8f @ 0297:01fa
0x37a > 04 @ 0297:0211
0x378 > 00 @ 0297:01ec
0x37a > 05 @ 0297:01f5
0x379 < 8f @ 0297:01fa
0x37a > 04 @ 0297:0211
0x378 > 00 @ 0297:01ec
0x37a > 05 @ 0297:01f5
0x379 < 8f @ 0297:01fa
0x37a > 04 @ 0297:0211
    </programlisting>
    <para>
      As you can see, there is a repeating structure starting at
      address <literal>0297:01ec</literal> that consists of four I/O
      accesses on the parallel port. The first I/O access writes a
      changing byte to the data port, the second always writes the
      byte <literal>0x05</literal> to the control port, then a value
      that always seems to be <literal>0x8f</literal> is read from
      the status port, at which point the byte
      <literal>0x04</literal> is written to the control port. By
      studying this and other sections of the trace we can write a C
      routine that emulates this, shown below with some macros to
      make the parallel port accesses easier to read.
    </para>
    <programlisting>
#include <sys/io.h>  /* inb(), outb(); assumes Linux on x86 with glibc */

#define r_dtr(x)        inb(x)
#define r_str(x)        inb((x)+1)
#define r_ctr(x)        inb((x)+2)
#define w_dtr(x,y)      outb(y, x)
#define w_str(x,y)      outb(y, (x)+1)
#define w_ctr(x,y)      outb(y, (x)+2)

/* Seems to be sending a command byte to the scanner */
int udpp_put(int udpp_base, unsigned char command)
{
    int loop, value;

    w_dtr(udpp_base, command);
    w_ctr(udpp_base, 0x05);

    /* Poll the status port until its top bit is set, at most 10 times */
    for (loop=0; loop < 10; loop++)
        if ((value = r_str(udpp_base)) & 0x80)
        {
            w_ctr(udpp_base, 0x04);
            return value & 0xf8;
        }

    /* Top bit never set: return the last status with bit 0 set */
    return (value & 0xf8) | 0x01;
}
    </programlisting>
    <para>
      For the UMAX Astra 600P only seven such routines exist (well,
      14 really: seven for SPP and seven for EPP). Whether you
      choose to disassemble the driver at this point to verify the
      routines is your own choice. If you do, the addresses from the
      trace should help in locating them in the disassembly.
    </para>
    <para>
      You will probably then find it useful to write a script or a
      Perl or C program to analyse the log file and decode it
      further, as this can reveal higher-level groupings of the
      low-level routines. For example, the logs from my UMAX Astra
      600P, when decoded further, reveal the following (this is a
      small snippet):
    </para>
    <programlisting>
start:
put: 55 8f
put: aa 8f
put: 00 8f
put: 00 8f
put: 00 8f
put: c2 8f
wait: ff
get: af,87
wait: ff
get: af,87
end: cc
start:
put: 55 8f
put: aa 8f
put: 00 8f
put: 03 8f
put: 05 8f
put: 84 8f
wait: ff
    </programlisting>
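    <para>
      One step of such a decoder might look like the sketch below,
      which recognises the four-access pattern seen in the raw trace
      and prints it as a single <literal>put:</literal> line. It is
      only an illustration: <type>struct demo_access</type> and
      <function>demo_decode_put</function> are hypothetical names, and
      the accesses are assumed to have been parsed from the log
      already.
    </para>
    <programlisting>
#include <stdio.h>

struct demo_access
{
    unsigned int port;   /* I/O port that was accessed */
    unsigned int value;  /* byte written or read */
    int is_write;        /* 1 = written to the port, 0 = read from it */
};

/*
 * If a[0..3] is: write to the data port, write 0x05 to the control
 * port, read the status port, write 0x04 to the control port, then
 * format "put: DD SS" into out and return 1.  Otherwise return 0.
 */
int demo_decode_put(const struct demo_access a[4], unsigned int base,
                    char *out, size_t outlen)
{
    if (!a[0].is_write || a[0].port != base)
        return 0;
    if (!a[1].is_write || a[1].port != base + 2 || a[1].value != 0x05)
        return 0;
    if (a[2].is_write || a[2].port != base + 1)
        return 0;
    if (!a[3].is_write || a[3].port != base + 2 || a[3].value != 0x04)
        return 0;

    snprintf(out, outlen, "put: %02x %02x", a[0].value, a[2].value);
    return 1;
}
    </programlisting>
    <para>
      Applied with a base of <literal>0x378</literal> to the first
      four lines of the raw trace shown earlier, this produces
      <literal>put: 55 8f</literal>. The other routines would each
      need a matcher of their own.
    </para>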
    <para>
      From this it is easy to see that the <varname>put</varname>
      routine is often grouped into runs of successive calls (six
      per run in the snippet above) that send information to the
      scanner. Once these are understood it should be possible to
      process the logs further to show the higher-level routines in
      an easy-to-read format. Once the highest-level format that you
      can derive from this process is understood, you then need to
      produce a series of scans varying only one parameter between
      them, so you can discover how to set the various parameters
      for the scanner.
    </para>

    <para>
      The following is the <filename>shrink.c</filename> program:
      <programlisting>
/* Copyright David Campbell <campbell@torque.net> */
#include <stdio.h>
#include <string.h>

int main (void)
{
  char buff[256], lastline[256] = "";
  int count = 0;

  /* Test the result of fgets rather than feof, so that the final
     line is not processed a second time at end of input */
  while (fgets (buff, sizeof (buff), stdin) != NULL)
  {
    if (strcmp (buff, lastline))
    {
      if (count > 1)
        printf ("# Last line repeated %i times #\n", count);
      printf ("%s", buff);
      strcpy (lastline, buff);
      count = 1;
    }
    else count++;
  }
  /* Report a run of repeats that reaches the end of the input */
  if (count > 1)
    printf ("# Last line repeated %i times #\n", count);
  return 0;
}
      </programlisting>
    </para>
  </sect1>

</chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-parent-document:("wine-devel.sgml" "set" "book" "part" "chapter" "")
End:
-->