STORY.html (9428B)


      1 <html>
      2 <head>
      3 <title>History of w3m</title>
      4 </head>
      5 <body>
      6 <h1>History of w3m</h1>
      7 <div align=right>
      8 1999/2/18<br>
      9 1999/3/8 revised<br>
     10 1999/6/11 translated into English<br>
     11 Akinori Ito<br>
     12 aito@fw.ipsj.or.jp
     13 </div>
     14 <h2>Introduction</h2>
     15 W3m is a text-based pager and WWW browser.
     16 It is an application similar to the famous text-based
     17 browser <a href="http://www.lynx.browser.org/">Lynx</a>.
     18 However, w3m has several advantages over Lynx. For example,
     19 <UL>
     20 <LI>W3m can render tables.
     21 <LI>W3m can render frames (by converting them into tables).
     22 <LI>As w3m is a pager, it can read a document from standard input.
     23 (I heard that Lynx can also display a document given on standard input, like this:
     24 <pre>
     25    lynx /dev/fd/0 &gt; file
     26 </pre>
     27 Hmm, it works on Linux. )
     28 <LI>W3m is small. Its stripped binary for Sparc (compiled with
     29 gcc -O2, version beta-990217) is only 260 kbytes, while the binary size
     30 of Lynx is over 1.8 Mbytes. (Actually, lynx is 800K on my i386 system; w3m is 200K + libgc.)
     31 </UL>
     32 It is true that Lynx is an excellent browser, which has many
     33 features that w3m doesn't have. For example,
     34 <UL>
     35 <LI>Lynx can handle cookies.
     36 <LI>Lynx has many options.
     37 <LI>Lynx is multilingual. (W3m is Japanese-English bilingual)
     38 </UL>
     39 etc. It is also a great advantage that Lynx has a lot of
     40 documentation.
     41 <P>
     42 <b>I don't intend w3m to be a substitute for any other browser,
     43 including Netscape and Lynx.</b> Why did I write w3m?
     44 Because I found conventional browsers inconvenient
     45 for `taking a look' at web pages.
     46 I browse web pages in a LAN environment. When I want to take
     47 a glance at a web page, I don't want to wait for Netscape to start up.
     48 Lynx also takes a few seconds to start up (you can get lynx startup time to almost zero when you rm /etc/mailcap). On the other hand,
     49 w3m starts immediately with little load on the host machine.
     50 After looking at the information using w3m, I use another browser
     51 if I want to read the page in detail. As for me, however,
     52 w3m is enough to read most web pages.
     53 
     54 <h2>The birth of w3m</h2>
     55 <P>
     56 w3m was derived from a pager named `fm'. Fm was written before
     57 1991 (I don't remember the exact date), when the WWW was not popular.
     58 At that time, the word `browser' meant a file browser like
     59 `more' or `less'.
     60 <P>
     61 I wrote fm to debug a program for my research. To trace the status
     62 of the program, it dumped megabytes of values of variables into a file,
     63 and I debugged it by checking the dumped file. The program dumped
     64 the information for each point in time on one line, which made each dumped line
     65 several hundred characters long. When I looked at the file using `more' or
     66 `less', each line was folded into several lines and it was very hard
     67 to read. Therefore, I wrote fm, which didn't fold lines. Fm displayed
     68 one logical line as one physical line. To show the hidden
     69 part of a line, fm shifted the entire screen. As I used an 80x24 terminal at that
     70 time, fm was very useful for debugging.
     71 <P>
     72 Several years later, I got to know the WWW and began to use it.
     73 I used XMosaic and Chimera. I liked Chimera because it was light.
     74 As I was interested in the mechanism of the WWW, I learned HTML and
     75 HTTP, and I found them simpler than I expected. The earlier version
     76 of HTTP was very similar to the Gopher protocol. HTML 2.0 was
     77 simple enough to render. All I had to do seemed to be line folding
     78 and itemized display. So I made a little modification to fm
     79 and made a web browser. It was the first version of w3m.
     80 The name `w3m' was an abbreviation of the Japanese phrase `WWW wo miru',
     81 which means `see WWW'. It was inherited from `fm', which
     82 was an abbreviation of `File wo miru'. The first version of w3m
     83 was released at the beginning of 1995.
     84 
     85 <h2>Death and rebirth of w3m</h2>
     86 <p>
     87 I had used w3m as a pager to read files, e-mail and online manuals.
     88 It was a substitute for less. Sometimes I used w3m as a web browser,
     89 but there were many pages w3m couldn't display correctly, most of
     90 which used tables for page layout. Once I tried to implement a table
     91 renderer, but I gave up because it seemed too difficult for me.
     92 <P>
     93 It was 1998 when I tried to modify w3m again. There were two reasons.
     94 The first is that I had some time to do it. I was staying at Boston University
     95 as a visiting researcher at that time. The second reason is that
     96 I wanted to use tables in my personal web page.  I had been writing my research
     97 log in HTML, and I wanted to put a table in it. At first I used
     98 &lt;pre&gt;..&lt;/pre&gt; to lay out tables, but it was not cool at all.
     99 One day I used the &lt;table&gt; tag, which forced me to use Netscape to
    100 read the research log. Then I decided to implement a table renderer
    101 in w3m.
    102 <P>
    103 I didn't intend to write a perfect table renderer because the tables
    104 I used were not very complicated. However, incomplete table rendering
    105 made the display of table-layout pages horrible. I realized that
    106 it would take an almost perfect table renderer
    107 to do well at both `rendering (real) tables' and `fine display of
    108 table-layout pages.' It was a thorny path.
    109 <P>
    110 After several months, I finished a `fair' table renderer.
    111 Then I implemented forms in w3m. Finally, w3m was reborn as a
    112 practical web browser.
    113 
    114 <h2>Table rendering algorithm in w3m</h2>
    115 
    116 HTML table rendering is difficult. The tabular environment
    117 of LaTeX is not very difficult: it makes the width of a column
    118 either a specified value or the maximum width needed to hold its items.
    119 On the other hand, an HTML table renderer has to decide
    120 the width of each column so that the entire table fits into the
    121 display appropriately, and fold the contents of the table according
    122 to the column widths. A poor choice of column widths makes
    123 the table ugly. Moreover, tables can be nested, which makes the algorithm
    124 more complicated.
    125 
    126 <OL>
    127 <LI>First, calculate the maximum and minimum width of each column.
    128 The maximum width is the width required to display the column
    129 without folding the contents. Generally, it is the length of a
    130 paragraph delimited by &lt;BR&gt; or &lt;P&gt;.
    131 The minimum width is the lower limit needed to display the contents.
    132 If the column contains the word `internationalization', the minimum
    133 width will be 20. If the column contains 
    134 &lt;pre&gt;..&lt;/pre&gt;, the maximum width of the preformatted
    135 text will be the minimum width of the column.
    136 
    137 <LI>If the width of the column is specified by the WIDTH attribute,
    138 fix the column width using that value. If the specified width is
    139 smaller than the minimum width of the column, fix the column width
    140 to the minimum width.
    141 
    142 <LI>Calculate the sum of the maximum widths (or fixed widths) of
    143 the columns and check whether the sum exceeds the screen width.
    144 If it is smaller than the screen width, these values are used as
    145 the width of each column.
    146 
    147 <LI>If the sum is larger than the screen width, determine the width
    148 of each column according to the following steps.
    149 <OL>
    150 <LI>Let W be the screen width minus the sum of the widths of the
    151 fixed-width columns.
    152 <LI>Distribute W among the columns whose widths are not yet decided,
    153 in proportion to the logarithm of the maximum width of each column.
    154 <li>If the distributed width of a column is smaller than its minimum width,
    155 then fix the width of the column to the minimum width, and
    156 do the distribution again.
    157 </OL>
    158 </OL>
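<P>
The steps above can be sketched roughly as follows. This is only an illustration in Python, not the code w3m actually uses: the names <code>column_bounds</code> and <code>distribute_widths</code> and the dictionary layout are hypothetical, widths are counted with <code>len()</code> (which ignores double-width characters), and <code>log(1 + max)</code> is used instead of <code>log(max)</code> as an assumption to keep every weight positive.

```python
import math


def column_bounds(paragraphs):
    """Step 1: return (min_width, max_width) for one column.

    `paragraphs` are the text runs of the column, already split at
    <BR>/<P>. len() is an assumption; it ignores double-width characters.
    """
    max_width = max((len(p) for p in paragraphs), default=0)
    min_width = max((len(w) for p in paragraphs for w in p.split()), default=0)
    return min_width, max_width


def distribute_widths(screen_width, columns):
    """columns: list of dicts with 'min', 'max' and optional 'fixed' (WIDTH)."""
    widths = []
    for col in columns:
        fixed = col.get("fixed")
        # Step 2: a WIDTH value wins, but never goes below the minimum width.
        widths.append(None if fixed is None else max(fixed, col["min"]))

    # Step 3: if the maximum (or fixed) widths fit, use them unchanged.
    natural = [col["max"] if w is None else w for col, w in zip(columns, widths)]
    if sum(natural) <= screen_width:
        return natural

    # Step 4: share the remaining width in proportion to log(max width),
    # and redo the distribution whenever a column falls below its minimum.
    while True:
        free = [i for i, w in enumerate(widths) if w is None]
        if not free:
            return widths
        remaining = screen_width - sum(w for w in widths if w is not None)
        weights = {i: math.log(1 + columns[i]["max"]) for i in free}
        total = sum(weights.values())
        shares = {
            i: int(remaining * (weights[i] / total)) if total > 0
            else remaining // len(free)
            for i in free
        }
        too_narrow = [i for i in free if shares[i] < columns[i]["min"]]
        if not too_narrow:
            for i in free:
                widths[i] = shares[i]
            return widths
        for i in too_narrow:
            widths[i] = columns[i]["min"]
```

For example, with a 40-column screen and two columns whose (minimum, maximum) widths are (30, 31) and (5, 1000), the first pass gives the first column less than 30, so it is fixed at 30 and the second pass hands the remaining 10 to the other column.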
    159 
    160 In this process, the distributed width is proportional to the logarithm of
    161 the maximum width, but I am not sure that this heuristic is the best.
    162 It could be, for example, the square root of the maximum width.
    163 <P>
    164 The algorithm above assumes that the screen width is known.
    165 But that is not true for a nested table. According to the algorithm above,
    166 the column widths of the outer table have to be known to render
    167 the inner table, while the total width of the inner table has to
    168 be known to determine the column widths of the outer table.
    169 If the WIDTH attribute exists there is no problem. Otherwise, w3m
    170 assumes that the inner table is 0.8 times as wide as the outer
    171 table. It works fine, but if there are two tables side by side in an outer
    172 table, the width of the outer table always exceeds the screen width.
    173 To render this kind of table correctly, one has to render the table once,
    174 check the width of the outermost table, and then render the entire table again.
    175 Netscape might employ this kind of algorithm.
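<P>
The arithmetic behind the side-by-side problem can be shown in a few lines. This is a sketch of the reasoning only, assuming the outer table is as wide as the screen; the function names are hypothetical and the 0.8 factor is written as 8/10 to keep the arithmetic in integers.

```python
def guessed_inner_width(outer_width):
    """Width assumed for an inner table with no WIDTH attribute (0.8 x outer)."""
    return outer_width * 8 // 10


def outer_width_needed(screen_width, inner_tables_in_row):
    """Width the outer table needs if each inner table in a row gets the guess."""
    return guessed_inner_width(screen_width) * inner_tables_in_row


# One inner table fits on an 80-column screen; two side by side cannot.
one = outer_width_needed(80, 1)
two = outer_width_needed(80, 2)
```

With an 80-column screen, one inner table is guessed at 64 columns, while two side by side need 128, which is 1.6 times the screen width whatever the tables actually contain.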
    176 
    177 <h2>Libraries</h2>
    178 
    179 w3m uses
    180 <a href="http://reality.sgi.com/boehm/gc.html">Boehm GC</a>
    181 library. This library was written by H. Boehm and A. Demers.
    182 I could distribute w3m without this library because one can
    183 get the library separately, but I decided to include it in the
    184 w3m distribution for the convenience of those who install it.
    185 W3m doesn't use libwww.
    186 <P>
    187 Boehm GC is a garbage collector for C and C++. I began to use this
    188 library when I implemented tables, and it was great. I couldn't
    189 have implemented tables and forms without this library.
    190 <P>
    191 Versions older than beta-990304 used
    192 <a href="http://home.cern.ch/~orel/libftp/libftp/libftp.html">LIBFTP</a>
    193 because I was tired of writing code to handle the FTP protocol.
    194 But I rewrote the FTP code myself to make w3m completely free.
    195 It made w3m slightly smaller.
    196 <P>
    197 By the way, w3m doesn't use the standard UNIX regexp and curses libraries.
    198 This is because I want to handle Japanese. When I wrote fm, there were
    199 no free regexp/curses libraries that could handle Japanese. Now both kinds of library
    200 are available and they look faster than the w3m code.
    201 
    202 <h2>Future work</h2>
    203 
    204 ...Nothing. As w3m's virtues are its small size and rendering speed,
    205 adding more features might cost it these advantages. On the other hand,
    206 w3m is still known to have many bugs, and I will continue fixing them.
    207 
    208 </body>
    209 </html>