Generated: April 2, 2002, 11:01:26
Copyright © 2002, Kurt Nørmark

Reference Manual of the Text Collection and Skipping Library

Kurt Nørmark ©, normark@cs.auc.dk, Department of Computer Science, Aalborg University, Denmark

Source file: lib/collect-skip.scm
LAML Version 17.00 (April 2, 2002) full

This library contains a number of functions which collect and skip characters in a text file. These functions may, for instance, be used to parse a file.

It is assumed that the variable ip references an input port. The assignment of ip must be done externally to this library, and after the library is loaded.

The main functions can be found in the section Collection and skipping functions below.

This library has been developed as part of an SGML Document Type Definition (DTD) parser. Internal documentation exists for the DTD parser, and it also covers some aspects of the functions in this library.

Table of Contents:
1. Look ahead buffer and queue.
2. Collection and skipping functions.
3. Useful predicates for skipping and collecting.

Alphabetic index:
advance-look-ahead: (advance-look-ahead n) Provided that there are at least n characters in the reading queue, advance next-read by n positions.
char-predicate: (char-predicate ch) Return a predicate function which matches the character ch.
collect-balanced-until: (collect-balanced-until char-pred-1 char-pred-2) This collection procedure returns a balanced collection given two char predicates.
collect-until: (collect-until p) Return the string collected from the input port ip.
collect-until-string: (collect-until-string str . inclusive) Collect characters until str is encountered.
end-of-line?: (end-of-line? ch) Is ch an end of line character?
ensure-look-ahead: (ensure-look-ahead n) Make sure that there are at least n characters in the look ahead queue.
eof?: (eof? ch) Is ch an end of file character?
is-white-space?: (is-white-space? ch) Is ch a white space character?
look-ahead-char: (look-ahead-char) Return the first character in the look ahead vector.
look-ahead-prefix: (look-ahead-prefix lgt) Return a string of lgt characters from the peeked chars in the queue.
match-look-ahead?: (match-look-ahead? str) Return whether the queue contents match the string str.
max-look-ahead: max-look-ahead The length of the cyclic look ahead buffer.
max-look-ahead-prefix: (max-look-ahead-prefix) Return the entire look ahead queue as a string.
peek-a-char: (peek-a-char) Peek a character from the input port and queue it for subsequent reading at "the peek end".
peek-chars: (peek-chars n) Peek n characters.
put-back-a-char-read-end: (put-back-a-char-read-end ch) Put ch back at the front end of the "queue" (where read-a-char operates).
put-back-a-char-write-end: (put-back-a-char-write-end ch) Put ch back at the rear end of the queue (where peek-a-char operates).
put-back-a-string: (put-back-a-string str which-end) Put str back in queue.
read-a-char: (read-a-char) Read from the look ahead buffer.
read-a-string: (read-a-string n) Read and return a string of length n.
reset-look-ahead-buffer: (reset-look-ahead-buffer) Reset the look ahead buffer.
skip-string: (skip-string str if-not-message) Assume that str is just in front of us.
skip-until-string: (skip-until-string str . inclusive) Skip characters until str is encountered.
skip-while: (skip-while p) Skip characters while p holds.

 

1.   LOOK AHEAD BUFFER AND QUEUE.
The functions in this section manipulate a look ahead queue, which sits between the input port ip and the application. Via this buffer it is possible to implement look ahead in the input port.


max-look-ahead



Form
max-look-ahead

Description
The length of the cyclic look ahead buffer. Predefined to 2000 characters.


reset-look-ahead-buffer



Form
(reset-look-ahead-buffer)

Description
Reset the look ahead buffer.


peek-a-char



Form
(peek-a-char)

Description
Peek a character from the input port and queue it for subsequent reading at "the peek end". This function always reads one character via read-char.


peek-chars



Form
(peek-chars n)

Description
Peek n characters.


read-a-char



Form
(read-a-char)

Description
Read from the look ahead buffer. Only if this buffer is empty, read from the port. Reads from "the read end" of the queue.
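
The interplay between the peek end and the read end can be pictured with a small model. This is a minimal sketch, not the library source: a plain list stands in for the cyclic buffer, the names sketch-peek-a-char, sketch-read-a-char and queue are hypothetical, and open-input-string is assumed to be available (R7RS or SRFI 6).

```scheme
;; Hypothetical model of the look ahead queue, not the LAML implementation.
(define ip (open-input-string "abc"))   ; assumes open-input-string exists
(define queue '())                      ; oldest peeked character first

;; Always read one character from the port and queue it at the peek end.
(define (sketch-peek-a-char)
  (let ((ch (read-char ip)))
    (set! queue (append queue (list ch)))
    ch))

;; Take the oldest queued character; read from the port only if the
;; queue is empty.
(define (sketch-read-a-char)
  (if (null? queue)
      (read-char ip)
      (let ((ch (car queue)))
        (set! queue (cdr queue))
        ch)))

(define p1 (sketch-peek-a-char))   ; #\a, now queued
(define p2 (sketch-peek-a-char))   ; #\b, queued behind #\a
(define r1 (sketch-read-a-char))   ; #\a again, taken from the queue
(define r2 (sketch-read-a-char))   ; #\b, taken from the queue
(define r3 (sketch-read-a-char))   ; #\c, queue is empty so the port is read
```

In this model a peeked character is never lost: it is handed out again by the next read, which is what makes look ahead possible.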


read-a-string



Form
(read-a-string n)

Description
Read and return a string of length n. Should take eof into account such that a string shorter than n can be returned.


look-ahead-prefix



Form
(look-ahead-prefix lgt)

Description
Return a string of lgt characters, taken from the peeked characters in the queue.


max-look-ahead-prefix



Form
(max-look-ahead-prefix)

Description
Return the entire look ahead queue as a string.


look-ahead-char



Form
(look-ahead-char)

Description
Return the first character in the look ahead vector. As a precondition, the look ahead queue is assumed not to be empty.


match-look-ahead?



Form
(match-look-ahead? str)

Description
Return whether the queue contents match the string str. The queue must contain (string-length str) characters before this function is called. If not, an error is issued. This is a proper function (apart from the error condition).


ensure-look-ahead



Form
(ensure-look-ahead n)

Description
Make sure that there are at least n characters in the look ahead queue.


put-back-a-char-write-end



Form
(put-back-a-char-write-end ch)

Description
Put ch back at the rear end of the queue (where peek-a-char operates).


put-back-a-char-read-end



Form
(put-back-a-char-read-end ch)

Description
Put ch back at the front end of the "queue" (where read-a-char operates).


put-back-a-string



Form
(put-back-a-string str which-end)

Description
Put str back in queue. The second parameter which-end controls whether to put back in read end or write end. Possible values 'read-end and 'write-end.


advance-look-ahead



Form
(advance-look-ahead n)

Description
Provided that there are at least n characters in the reading queue, advance next-read by n positions. The queued characters are thereby skipped. Not used in DTD parsing.


 

2.   COLLECTION AND SKIPPING FUNCTIONS.
This section contains a number of higher level collection and skipping functions. These functions use the functions from the previous section. The functions in this section are the most important of this library.


collect-until



Form
(collect-until p)

Description
Return the string collected from the input port ip. The collection stops when the predicate p holds on the character read. The last read character (the first character on which p holds) is left as the oldest character in the queue.
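
The stopping rule can be illustrated with a minimal sketch that uses the standard peek-char and read-char instead of the library's queue. The name sketch-collect-until, the explicit port parameter, and the decision to stop at end of file are assumptions made for this illustration; open-input-string is assumed to be available (R7RS or SRFI 6).

```scheme
;; Hypothetical stand-in for collect-until, not the LAML source.
;; Collects characters until p holds; the character on which p holds
;; is left unread. Stopping at end of file is an assumption.
(define (sketch-collect-until port p)
  (let loop ((acc '()))
    (let ((ch (peek-char port)))
      (if (or (eof-object? ch) (p ch))
          (list->string (reverse acc))
          (begin
            (read-char port)            ; consume the character just peeked
            (loop (cons ch acc)))))))

(define cu-port (open-input-string "name=value"))
(define collected
  (sketch-collect-until cu-port (lambda (ch) (char=? ch #\=))))  ; "name"
(define next-char (read-char cu-port))                           ; #\=
```

The second definition shows the point of the description: the terminating character is still available to the next read.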


collect-balanced-until



Form
(collect-balanced-until char-pred-1 char-pred-2)

Description
Return a balanced collection from the input port ip, given two character predicates. The collection stops when the predicate char-pred-2 holds on the character read. However, each character on which char-pred-1 holds must first be matched by a character on which char-pred-2 holds, and such a match does not terminate the collection. The last read character (the character on which char-pred-2 holds and which ends the collection) is processed by this function. As a precondition, char-pred-1 and char-pred-2 must never hold on the same character.
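
One possible reading of this description is sketched below. It is an interpretation, not the library source: the sketch assumes that a nesting depth is tracked, that the closing character which ends the collection is consumed and included in the result, and that end of file also ends the collection. The name sketch-collect-balanced and the explicit port parameter are hypothetical.

```scheme
;; Hypothetical reading of collect-balanced-until, not the LAML source.
;; open? raises the nesting depth and close? lowers it. The close that
;; brings the depth back to zero ends the collection and is included.
(define (sketch-collect-balanced port open? close?)
  (let loop ((acc '()) (depth 0))
    (let ((ch (read-char port)))
      (cond ((eof-object? ch) (list->string (reverse acc)))
            ((open? ch) (loop (cons ch acc) (+ depth 1)))
            ((and (close? ch) (<= depth 1))
             (list->string (reverse (cons ch acc))))
            ((close? ch) (loop (cons ch acc) (- depth 1)))
            (else (loop (cons ch acc) depth))))))

(define cb-port (open-input-string "(a (b) c) rest"))
(define balanced
  (sketch-collect-balanced cb-port
                           (lambda (ch) (char=? ch #\())
                           (lambda (ch) (char=? ch #\)))))   ; "(a (b) c)"
```

The inner closing parenthesis after b is matched against the inner opening parenthesis and therefore does not stop the collection.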


skip-while



Form
(skip-while p)

Description
Skip characters while p holds. The first character on which p fails is left as the oldest character in the queue. The predicate does not hold at end of file.
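
A minimal sketch of this behavior, written against the standard peek-char and read-char rather than the library's queue. The name sketch-skip-while and the explicit port parameter are hypothetical; open-input-string is assumed to be available (R7RS or SRFI 6).

```scheme
;; Hypothetical stand-in for skip-while, not the LAML source.
;; Consumes characters while p holds. End of file counts as "p fails".
(define (sketch-skip-while port p)
  (let ((ch (peek-char port)))
    (if (and (not (eof-object? ch)) (p ch))
        (begin
          (read-char port)              ; drop the character just peeked
          (sketch-skip-while port p))
        'done)))

(define sw-port (open-input-string "   <!ELEMENT"))
(sketch-skip-while sw-port char-whitespace?)
(define first-non-blank (read-char sw-port))   ; #\<
```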


skip-string



Form
(skip-string str if-not-message)

Description
Assume that str is just in front of us. Skip through it. If str is not in front of us, a fatal error occurs with if-not-message as error message.


skip-until-string



Form
(skip-until-string str . inclusive)

Description
Skip characters until str is encountered. If inclusive, also skip str. It is assumed as a precondition that the length of str is at least one.


collect-until-string



Form
(collect-until-string str . inclusive)

Description
Collect characters until str is encountered. If inclusive, also collect str. It is assumed as a precondition that the length of str is at least one.
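
The effect of the inclusive flag can be sketched on a plain string instead of a port. This is an illustration under stated assumptions, not the library source: sketch-collect-until-string and string-at? are hypothetical names, the flag is passed as an ordinary boolean rather than as an optional rest parameter, and the non-inclusive case is assumed to leave str unread. The sketch returns a pair of the collected string and the index where reading would continue.

```scheme
;; Does text contain str starting at index i?
(define (string-at? text i str)
  (let ((n (string-length str)))
    (and (<= (+ i n) (string-length text))
         (string=? (substring text i (+ i n)) str))))

;; Hypothetical model of collect-until-string, not the LAML source.
;; Returns (collected . next-index). If str never occurs, the rest of
;; the text is collected (an assumption).
(define (sketch-collect-until-string text start str inclusive)
  (let loop ((i start))
    (cond ((>= i (string-length text))
           (cons (substring text start (string-length text))
                 (string-length text)))
          ((string-at? text i str)
           (let ((end (if inclusive (+ i (string-length str)) i)))
             (cons (substring text start end) end)))
          (else (loop (+ i 1))))))

(define doc "<!-- note -->tail")
(define excl (sketch-collect-until-string doc 0 "-->" #f))  ; ("<!-- note " . 10)
(define incl (sketch-collect-until-string doc 0 "-->" #t))  ; ("<!-- note -->" . 13)
```

A skipping variant would move the index in the same way and simply discard the collected string.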


 

3.   USEFUL PREDICATES FOR SKIPPING AND COLLECTING.



is-white-space?



Form
(is-white-space? ch)

Description
Is ch a white space character?


end-of-line?



Form
(end-of-line? ch)

Description
Is ch an end of line character?


eof?



Form
(eof? ch)

Description
Is ch an end of file character?


char-predicate



Form
(char-predicate ch)

Description
Return a predicate function which matches the character ch. This is a higher order function.
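
Such a predicate generator could look as follows. This is a minimal sketch and not the library definition: the name sketch-char-predicate is hypothetical, and the use of eqv? (which also tolerates a non-character argument such as an end of file object) is an assumption.

```scheme
;; Hypothetical stand-in for char-predicate, not the LAML source.
;; Returns a one-argument predicate that holds exactly on ch.
(define (sketch-char-predicate ch)
  (lambda (c) (eqv? c ch)))

(define semicolon? (sketch-char-predicate #\;))
(define hit (semicolon? #\;))    ; #t
(define miss (semicolon? #\a))   ; #f
```

A predicate built this way can be passed directly to the collection and skipping functions, which take a predicate parameter.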


This documentation has been extracted automatically from the Scheme source file by means of the Schemedoc tool.