\documentclass{article} % (C) Raja Sooriamurthi 9.3.1995 % \usepackage{newcent} \usepackage{comment,psfig,fullpage,alltt,noweb} % \psdraft \def\scheme#1{{\tt #1}} % Make noweb let the code flow without odd breaks. \def\filbreak{} \def\nwdocspar{} \title {A Scheme Word Count Program \\ {\large A tutorial introduction to literate programming in Scheme} \footnotetext{\copyright\ 1995}} \author {Raja Sooriamurthi} \date {} \begin{document} \maketitle The Unix {\tt wc} program has been used a didactic example to illustrate literate programming \cite{Knuth-book:literate,Ramsey-ieee:noweb}. In continuing that tradition we present below a Scheme version of the {\tt wc} program written using [[noweb]]. % In Common Lisp a better interface would be % (wc :line :char :word "file-1" "file-2" "file-3" ) The calling convention of the program is: \scheme{(wc 'l 'c 'w "file-1" "file-2" ... ) }. Where {\tt c}, {\tt w}, and {\tt l} are counter parts of the Unix {\tt wc} options specifying whether we want a count of the number of characters (bytes), words (white space separated character sequences), or lines (newline characters). If no options are explicitly specified then it is assumed to be {\tt (c l w)}. Files are identified by strings following the options. If no files are specified then input is read from the {\tt (current-input-port)}. Output is returned as a list of the various counts. For instance \begin{alltt}\small > (wc "/etc/motd") (("/etc/motd" (chars 92) (words 13) (lines 4))) > (wc 'l 'c 'w "/etc/motd" "/etc/passwd") (("/etc/motd" (chars 92) (words 13) (lines 4)) ("/etc/passwd" (chars 1629) (words 59) (lines 28)) (total (chars 1721) (words 72) (lines 32))) > (wc) > (wc) Literate Programming can be fun ( (chars 33) (words 5) (lines 5)) \end{alltt} \section{The Idea} The {\tt wc} is a simple {\em Mealy machine\/} as given below: \begin{figure}[tbh] \centerline{\psfig{figure=wc.pstex,height=1.5in}} \caption{A Mealy machine that represents the behaviour of the {\tt wc} program.} \end{figure} The arc transitions occur on a newline (\verb+\n+), a space or tab (\verb+\s, \t+) or any other character (\verb+\o+). Associated with each arc transition we have an associated action. All of these actions increment a counter. \fbox{$l$} and \fbox{$w$} are counters for the number of lines and words in the input. There is another counter for characters \fbox{$c$} whose increment is associated with all the arcs and hence has not been explicitly shown. \section{The Program} Given the specifications above we have the outline of the program as: @ <<*>>= <> <> <> @ %def The main {\tt wc} program consists of two phases (1) processing the input option argument to determine the format of the output \footnote{This program isn't complete: it only processes the full set of options and not a particular subset.} and (2) processing the individual files. (In this code we make use of Chez Scheme's multiple values \cite{Ashley:mvls}.) @ <>= (define wc (lambda args (call-with-values <> <>))) @ %def wc To process the input we have to check if no files have been specified and if so we receive input from the \verb+(current-input-port)+ or else we process the specified files. @ <>= (lambda (options all-files) (if (null? all-files) ;; then process the (current-input-port) (call-with-values (lambda () (process-input (current-input-port))) (lambda (chars words lines) `( (chars ,chars) (words ,words) (lines ,lines)))) ;; else process the list of the input files ;; keeping track of a running total (let loop ([files all-files] [total_chars 0] [total_words 0] [total_lines 0]) (if (null? files) (if (null? (cdr all-files)) ;; there was only 1 file '() ;; else return a summary (list `(Total (chars ,total_chars) (words ,total_words) (lines ,total_lines)))) <>)))) @ %def total_chars total_words total_lines To process a single file we first open it, count the number of characters, words and lines, and then update the running total. <>= (call-with-input-file (first files) (lambda (port) (call-with-values (lambda () (process-input port)) (lambda (chars words lines) (cons `(,(first files) (chars ,chars) (words ,words) (lines ,lines)) (loop (rest files) (+ total_chars chars) (+ total_words words) (+ total_lines lines))))))) @ %def To process an input we just traverse the aforementioned two state Mealy machine. @ <>= (define process-input (lambda (port) (let loop ([chars 0] [words 0] [lines 0] [state 's0] [in (read-char port)]) (if (eof-object? in) (values chars words lines) (case state <> <>))))) @ %def process-input chars words lines state in The first state of the FSA is defined by: @ <>= [(s0) (case in [(#\newline) (loop (add1 chars) words (add1 lines) 's0 (read-char port))] [(#\space #\tab) (loop (add1 chars) words lines 's0 (read-char port))] [else (loop (add1 chars) words lines 's1 (read-char port))])] @ %def The second state of the FSA is: @ <>= [(s1) (case in [(#\newline) (loop (add1 chars) (add1 words) (add1 lines) 's0 (read-char port))] [(#\space #\tab) (loop (add1 chars) (add1 words) lines 's0 (read-char port))] [else (loop (add1 chars) words lines 's1 (read-char port))])] @ %def To determine what are the options to be used while producing the final output we split the input {\tt args} into a list of options and a list of the file names. @ <>= (lambda () (let loop ([args args] [options '()]) (cond [(null? args) (values options args)] [(string? (first args)) ;; the options have been processed (values options args)] [else (loop (rest args) (cons (first args) options))]))) @ %def Finally some simple help routines. @ <>= (define first car) (define rest cdr) @ %def first rest \medskip \begin{minipage}[t]{3in} {\bf Chunks:} \\[-10pt] \small\nowebchunks \end{minipage} \hspace*{30pt} \begin{minipage}[t]{2in} \noindent{\bf Index:} \\[-10pt] \small \nowebindex \end{minipage} \nopagebreak {\small \bibliographystyle{plain} \bibliography{abbreviations,my-ai}} \end{document}