8

WML Bytecodes

 

This may prove to be the most interesting chapter that you’ve read so far. Many programmers today (usually referred to as coders, because all they do is write code without understanding it) are not really aware that 'what you write is not what gets stored' (WYWINWGS). It is understandable if you have never heard this term before, because it is used only by the real top-shot programmers (producing whom is the purpose of this book, and the ones that follow).

 

WYWINWGS has been around ever since English-like programming languages became popular. Who had the time to sit and explain to some dumb bunny that ThisNewSuperFunctionThatYouDefined would take too many characters to store, so it was just shortened to a couple of bytes - BAh! And whenever it was used again, the program would look up a chart and figure out that you wanted something called BAh!

 

What we look at here are the byte codes of a few of the programs that we have written, and what they actually translate to.

 

No, you are not expected to write programs this way, but understanding how a program is stored will give you a better understanding of how the program works.

 

In our first example, we have not tried anything too fancy. All we want to do is introduce you to what identifies the program file. Every saved file has a header that denotes the type of file that is stored. It identifies the program that created it and hence the type of information that follows the header.

 

We have referred to the specifications available on the wapforum site: www.wapforum.org.

 

w1.wml

<?xml version="1.0"?>

<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">

<wml>

<card>

</card>

</wml>

 

Compile this file and check the size of the .wmlc file. The dir command will report 7 bytes. Using Hexshop, one of the well-known hex editors, we found that the bytes in w1.wmlc were as follows.

 

01 04  6A  00 7F 27 01

 

01

A binary WML document contains elements. Each element may have zero or more attributes, and it may also have content of its own. For example, card has id, title and many more attributes, and it encloses other tags. A p tag can have align and mode attributes and can contain text. All wmlc files begin with a version number. Version 1.1 is encoded as 0x01: the version byte holds the major version minus 1 in the upper four bits and the minor version, as is, in the lower four bits.

 

Ver 1.1 = 0x01

Nibble     (1-1) = 0      1
Bits       0  0  0  0     0  0  0  1
Weights    8  4  2  1     8  4  2  1

 

Ver 2.7 = 0x17

Nibble     (2-1) = 1      7
Bits       0  0  0  1     0  1  1  1
Weights    8  4  2  1     8  4  2  1
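The packing shown in the two tables above can be tried out with a short sketch. The function names encode_version and decode_version are our own, chosen only for illustration.

```python
def encode_version(major: int, minor: int) -> int:
    """Pack (major - 1) into the upper four bits and minor into the lower four."""
    if not (1 <= major <= 16 and 0 <= minor <= 15):
        raise ValueError("major must be 1..16 and minor 0..15")
    return ((major - 1) << 4) | minor


def decode_version(byte: int) -> tuple[int, int]:
    """Unpack a version byte back into (major, minor)."""
    return (byte >> 4) + 1, byte & 0x0F


print(f"{encode_version(1, 1):02X}")  # 01
print(f"{encode_version(2, 7):02X}")  # 17
print(decode_version(0x17))           # (2, 7)
```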

 

04

The next byte represents the document public identifier. 4 is the value given to “-//WAPFORUM//DTD WML 1.1//EN”.

 

6A

A binary XML document contains a representation of the XML document character encoding. The default charset is UTF-8, i.e. 6A. A value of zero indicates an unknown document encoding.

 

00

A binary XML/WML document must include a string table immediately after the charset. The table starts with its length in bytes, and that count excludes the length byte itself. If the length is zero, as it is here, the table is empty and no strings follow it.
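The four header fields described so far can be read with a few lines of Python. This is only a sketch under one simplifying assumption: the public identifier, the charset and the string table length are each taken to fit in a single byte, which holds for every file in this chapter. The names WbxmlHeader and parse_header are hypothetical.

```python
from typing import NamedTuple


class WbxmlHeader(NamedTuple):
    version: tuple[int, int]  # (major, minor)
    public_id: int            # 4 stands for the WML 1.1 DTD
    charset: int              # 0x6A stands for UTF-8, 0 for unknown
    string_table: bytes       # empty when the length byte is 00
    body_offset: int          # index of the first tag byte


def parse_header(data: bytes) -> WbxmlHeader:
    version_byte, public_id, charset, table_len = data[0], data[1], data[2], data[3]
    # Assumption: each field is a single byte, so its top bit must be clear.
    for value in (public_id, charset, table_len):
        if value & 0x80:
            raise ValueError("multi-byte header fields are not handled in this sketch")
    version = ((version_byte >> 4) + 1, version_byte & 0x0F)
    table = data[4:4 + table_len]
    return WbxmlHeader(version, public_id, charset, table, 4 + table_len)


header = parse_header(bytes.fromhex("01 04 6A 00 7F 27 01"))
print(header)
```

The body_offset field tells us where the tag bytes begin; for the seven bytes above it is 4.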

 

Tags are stored as tokens, and the tokens are split into a set of overlapping code spaces. Each code space is further split into a series of 256 code pages.

 

Within the tag byte :

The 7th bit indicates whether attributes follow the tag code. If the bit is 0, the tag has no attributes. If it is 1, the tag is followed by one or more attributes.

 

The 6th bit indicates whether the element has content. If it is 0, there is no content and no end tag either. If it is 1, the tag is followed by content and is terminated by the end tag.

 

Bits 5-0 hold the tag code.

 

Bit        7            6          5  4  3  2  1  0
Meaning    Attribute    Content    Tag code
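The layout above translates directly into three bit masks. The sketch below uses a hypothetical helper, split_tag_byte, and tries it on the two tag bytes from our seven-byte file.

```python
def split_tag_byte(byte: int) -> tuple[bool, bool, int]:
    """Return (has_attributes, has_content, tag_code) for one tag byte."""
    has_attributes = bool(byte & 0x80)  # bit 7
    has_content = bool(byte & 0x40)     # bit 6
    tag_code = byte & 0x3F              # bits 5-0
    return has_attributes, has_content, tag_code


for tag_byte in (0x7F, 0x27):
    attributes, content, code = split_tag_byte(tag_byte)
    print(f"{tag_byte:02X}: attributes={attributes} content={content} tag={code:02X}")
```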

 

The bytecodes

 

01       Version 1.1

04       DTD type

6A      Utf8 string

00       String table

7F        3F        wml

 

Field      Attr  Cont  3         f
Bits       0     1     1  1      1  1  1  1
Weights    8     4     2  1      8  4  2  1

 

27         27         card - there is no end tag for card as the 6th bit is 0

 

Field      Attr  Cont  2         7
Bits       0     0     1  0      0  1  1  1
Weights    8     4     2  1      8  4  2  1

 

01       end of wml

 

Field      Attr  Cont  0         1
Bits       0     0     0  0      0  0  0  1
Weights    8     4     2  1      8  4  2  1
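To check our reading of the seven bytes, we can put them back together from their parts. This is a sketch rather than a real compiler; the constant names are ours, and the values are the ones read off the listing above.

```python
VERSION_1_1 = 0x01
PUBLIC_ID_WML_1_1 = 0x04
CHARSET_UTF8 = 0x6A
END = 0x01
TAG_WML = 0x3F
TAG_CARD = 0x27
CONTENT_BIT = 0x40


def build_w1() -> bytes:
    """Rebuild the bytes of w1.wmlc: an empty card inside wml."""
    return bytes([
        VERSION_1_1,
        PUBLIC_ID_WML_1_1,
        CHARSET_UTF8,
        0x00,                   # empty string table
        TAG_WML | CONTENT_BIT,  # 7F: wml has content
        TAG_CARD,               # 27: card is empty, so it gets no end byte
        END,                    # closes wml
    ])


print(build_w1().hex(" ").upper())  # 01 04 6A 00 7F 27 01
```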

 

Let's take the next example, where card has an attribute. Card contains <p> as its content, and the p element encloses the text bye.

 

w2.wml

<?xml version="1.0"?>

<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">

<wml>

<card title="hi">

<p>

bye

</p>

</card>

</wml>

 

01         Version 1.1

04         DTD type

6A        Utf8

00         No strings

7F        3f  wml

 

Field      Attr  Cont  3         f
Bits       0     1     1  1      1  1  1  1
Weights    8     4     2  1      8  4  2  1

 

E7        27   card has attributes and encloses p. 

 

Field      Attr  Cont  2         7
Bits       1     1     1  0      0  1  1  1
Weights    8     4     2  1      8  4  2  1

 

36         title

03         string

68         h

69         i

00         0

01         end - title

60         20    p

 

Field      Attr  Cont  2         0
Bits       0     1     1  0      0  0  0  0
Weights    8     4     2  1      8  4  2  1

 

03         string

20         space

62         b

79         y

65         e

20         space

00         0

01         end - p

01         end - card

01         end - wml

 

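Every element that has content is closed by its own end byte, 01. The same 01 byte also closes an attribute list, which is why one follows the title string. Also notice the string byte 03: it indicates that string data follows inline. Every such string is terminated by a 0 byte; the spaces around bye are part of the text itself.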
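Reading such a string back is straightforward, as the following sketch shows on the p content above. The function name read_inline_string is hypothetical; STR_I is the name the WBXML specification gives to the 03 token.

```python
STR_I = 0x03  # "string" in our listings


def read_inline_string(data: bytes, pos: int) -> tuple[str, int]:
    """Decode the inline string starting at pos; return (text, next position)."""
    if data[pos] != STR_I:
        raise ValueError("no string token at this position")
    end = data.index(0x00, pos + 1)  # the 0 byte that terminates the string
    return data[pos + 1:end].decode("utf-8"), end + 1


p_content = bytes.fromhex("03 20 62 79 65 20 00 01")
text, next_pos = read_inline_string(p_content, 0)
print(repr(text), next_pos)  # ' bye ' 7
```

The returned position points at the 01 that ends p.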

 

Here we have given two lines of text within <p>. As the bytes show, they are stored as a single string.

 

w3.wml

<?xml version="1.0"?>

<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">

<wml>

<card title="hi">

<p>

bye

good

</p>

</card>

</wml>

 

01         Version 1.1

04         DTD type

6A        UTF-8

00         string

7F        3F        wml

E7        27         card

36         title

03         string

68         h

69         i

00         0

01         end       title

60         20         p

03         string

20         space

62         b

79         y

65         e

20         space

67         g

6F        o

6F        o

64         d

20         space

00         null

01         end p

01         end card

01        end wml
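Judging only from the bytes above, the compiler appears to replace each run of whitespace, line breaks included, with a single space. The sketch below reproduces the string bytes of w3 on that assumption; other compilers may treat whitespace differently, and normalize_text is a name of our own.

```python
import re


def normalize_text(source_text: str) -> str:
    """Collapse every run of whitespace into one space (assumed behaviour)."""
    return re.sub(r"\s+", " ", source_text)


# The text between <p> and </p> in w3.wml, line breaks included.
encoded = normalize_text("\nbye\ngood\n").encode("utf-8") + b"\x00"
print(encoded.hex(" ").upper())  # 20 62 79 65 20 67 6F 6F 64 20 00
```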

 

w4.wml

<?xml version="1.0"?>

<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">

<wml>

<card title="hi">

<p>

<b>bye</b> <br/>

good

</p>

</card>

</wml>

 

The byte codes for the wml file are as follows. We have introduced b for the bold tag and br for line break.

 

01         version no

04         dtd type

6A        utf8     

00         string

7F        3F        wml

E7        27         card

36         title

03         string

68         h

69         i

00         0

01         end       title

60         20         p

 

Field      Attr  Cont  2         0
Bits       0     1     1  0      0  0  0  0
Weights    8     4     2  1      8  4  2  1

 

64        24  b

 

Field      Attr  Cont  2         4
Bits       0     1     1  0      0  1  0  0
Weights    8     4     2  1      8  4  2  1

03         string

62         b

79         y

65         e

00         0

01         end - b

26         br

03         string

20         space

67         g

6F        o

6F        o

64         d

20         space

00         0

01         end       p

01         end       card

01         end       wml

 

In the next program, we have replaced b with i. This is the only change made here.

 

w5.wml

<?xml version="1.0"?>

<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">

<wml>

<card title="hi">

<p>

<i>bye</i> <br/>

good

</p>

</card>

</wml>

 

01

04

6A

00

7F        3F        wml

E7        27         card

36         title

03         string   

68         h

69         i

00         0

01         end       title

60         20         p

6D        2D        i

 

Field      Attr  Cont  2         D
Bits       0     1     1  0      1  1  0  1
Weights    8     4     2  1      8  4  2  1

 

03         string   

62         b

79         y

65         e

00         0

01         end       i

26         br

03         string

20         space

67         g

6F        o

6F        o

64         d

20         space

00         0

01         end       p

01         end card

01         end wml

 

The following file shows you the bytecodes for u which stands for underline.

 

w6.wml

<?xml version="1.0"?>

<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">

<wml>

<card title="hi">

<p>

<u>bye</u> <br/>

good

</p>

</card>

</wml>

 

01         major , minor version

04         dtd type

6A        utf8

00         strings

7F        3F        wml

E7        27         card

36         title

03         string

68         h

69         i

00         0

01         end title

60         20    - p

7D        3D   - u

 

Field      Attr  Cont  3         D
Bits       0     1     1  1      1  1  0  1
Weights    8     4     2  1      8  4  2  1

 

03         string

62         b

79         y

65         e

00         0

01         end u

26         br

03         string

20         space

67         g

6F        o

6F        o

64         d

20         space

00         0

01         end p

01         end card

01         end wml

 

Similarly, if you replace the u tag with the other tags, the codeword changes accordingly.

 

em   69 - 29

strong 79 - 39

small   78 - 38

 

The actual code for em is 29 and not 69. As we have seen before, the content bit goes on if the tag contains further content. Hence 29 becomes 69.
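The arithmetic is a single OR with 0x40, the value of the content bit. A small sketch, using a hypothetical helper with_content and the codes listed above:

```python
CONTENT_BIT = 0x40  # bit 6 of the tag byte

BASE_CODES = {"u": 0x3D, "em": 0x29, "strong": 0x39, "small": 0x38}


def with_content(tag_code: int) -> int:
    """Return the byte stored when the element encloses content."""
    return tag_code | CONTENT_BIT


for name, code in BASE_CODES.items():
    print(f"{name:<7}{code:02X} becomes {with_content(code):02X}")
```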

 

w13.wml

<?xml version="1.0"?>

<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">

<wml>

<card title="hi">

<p align="center">

hi

</p>

</card>

</wml>

 

01         major, minor version

04         dtd type

6A        utf8

00         string

7F        3F        wml

E7        27         card

36         title

03         string

68         h

69         i

00         0

01         end title

E0        20         p

 

Field      Attr  Cont  2         0
Bits       1     1     1  0      0  0  0  0
Weights    8     4     2  1      8  4  2  1

 

07         align=center

01         end       align

03         string

20         space

68         h

69         i

20         space

00         0

01         end       p

01         end       card

01         end       wml
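Everything covered in this chapter fits into a small disassembler. The sketch below is ours and deliberately incomplete: its tables hold only the tokens that appear in our listings, it assumes a one-byte string table length, and it ignores code pages and every other token. It annotates the bytes of w13.wmlc in the style used above.

```python
# Only the tokens seen in this chapter; a real decoder needs the full tables.
TAGS = {0x3F: "wml", 0x27: "card", 0x20: "p", 0x24: "b", 0x2D: "i",
        0x3D: "u", 0x26: "br", 0x29: "em", 0x39: "strong", 0x38: "small"}
ATTRIBUTES = {0x36: "title", 0x07: "align=center"}
END, STRING = 0x01, 0x03


def disassemble(data: bytes) -> list[str]:
    """Return one annotated line per token."""
    lines = [f"{data[0]:02X} version", f"{data[1]:02X} dtd type",
             f"{data[2]:02X} charset", f"{data[3]:02X} string table length"]
    pos = 4 + data[3]          # skip the length byte and the table itself
    open_tags: list[str] = []  # elements still waiting for their end byte
    in_attributes = False
    while pos < len(data):
        byte = data[pos]
        if byte == STRING:
            end = data.index(0x00, pos + 1)
            text = data[pos + 1:end].decode("utf-8")
            lines.append(f"03 string {text!r}")
            pos = end + 1
            continue
        if byte == END and in_attributes:
            lines.append("01 end of attributes")
            in_attributes = False
        elif byte == END:
            lines.append(f"01 end {open_tags.pop()}")
        elif in_attributes:
            lines.append(f"{byte:02X} attribute {ATTRIBUTES.get(byte, '?')}")
        else:
            name = TAGS.get(byte & 0x3F, "?")
            lines.append(f"{byte:02X} tag {name}")
            if byte & 0x40:  # content bit: an end byte will follow later
                open_tags.append(name)
            in_attributes = bool(byte & 0x80)
        pos += 1
    return lines


w13 = bytes.fromhex("01 04 6A 00 7F E7 36 03 68 69 00 01"
                    " E0 07 01 03 20 68 69 20 00 01 01 01")
print("\n".join(disassemble(w13)))
```

Keeping a stack of open element names is what lets the three trailing 01 bytes be labelled end p, end card and end wml in turn.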

 

This is the last program in this series and in this section. We could have continued further but decided to stop here. You are now familiar with what Compile does. The Virtual Machine in the micro browser has to interpret these bytes and act accordingly. You can visit the WapForum site, www.wapforum.org, and download the technical specification to guide you further.