Abstract. This program implements an Elucidator for the Scheme programming language.
An elucidator is a programming tool which supports elucidative programming,
which in turn is a practical and modern variant of literate programming. The main functionality
of the elucidator is oriented towards generation of HTML pages in which we can present
the documentation and the program in an Internet browser.
In addition, the program generates information which the editor part of the elucidator
can use for navigation purposes. The editor part, which is implemented in Emacs Lisp,
is not described in this documentation bundle. This document describes the internal working of the Scheme Elucidator. There exists a manual which describes how to produce documentation, like the documentation you are reading on this page. A brief user guide of the Scheme elucidator (as it appears in a browser) can be navigated to via the question mark icons above. This elucidative program is reasonably complete. However, the program is under active development, and this document will in the future reflect the latest changes.
1.1 An example of an elucidator setup file.

;Style Loading
(load (string-append laml-dir "laml.scm"))

;Elucidator loading
(style "elucidator/elucidator")

; Set the directory in which this file resides.
(set-source-directory "SOME DIRECTORY")

; Set the name of this file, without extension
(set-documentation-name "ELUCIDATOR-NAME")

(make-all-indexes)

; Define the sourcefiles in this documentation bundle
(program-source
  (key "SOURCE-KEY")
  (file-location "SOURCE-PATH")
  (language "scheme")
  (group "GROUP")
)

; Define the documentation body, here in terms of a documentation-from clause
(begin-documentation)
(documentation-from "DOCUMENTATION-FILE-NAME")
(end-documentation)

Below we will discuss the implementation of the forms found in the setup file.

1.2 Organization of the setup file
The source key is meant to be a handy and unique identification of a single source file. Internally, the function program-source simply accumulates the clause in the variable program-source-list.
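As a minimal sketch, assuming the clauses arrive already parsed as lists, the accumulation could look roughly as follows (only the variable name is taken from the text above; everything else is an assumption):

(define program-source-list '())

(define (program-source . clauses)
  ; each clause is assumed to be a parsed list such as (key "SOURCE-KEY")
  (set! program-source-list (cons clauses program-source-list)))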
Following the program-source clauses we meet the documentation body, enclosed by begin-documentation and end-documentation clauses. The documentation text can be inlined between the begin and end clauses as a sequence of documentation-entry and documentation-section clauses. More typically, however, we import the documentation text from a separate file via the documentation-from clause.
We will next take a closer look at the functionality of the mentioned clauses.
1.3 Overall documentation processing forms.
The documentation-from clause parses the textual documentation and issues calls of the documentation-intro, documentation-section, and documentation-entry functions (see section 1.4). In section 4 we will describe the inner details of documentation-from.
The end-documentation function is a long function where almost "everything" is initiated. (We should consider breaking this function into several parts, if for no other reason than to improve its documentation in this description). The different parts of the function can be seen from the comments in the source program. Here we describe briefly and in overview form the interesting and most important parts of the processing done in end-documentation:
1.4 The documentation-entry and documentation-section clauses
Take a look at the manual pages of the elucidator for user level documentation of these forms.
If documentation-section and documentation-entry are used directly we need functions which implement the constituent forms, such as (title "..."), (body ...) etc. These functions (see for instance title and body) are all generated via the higher order function make-syntax-function. This function and the generated functions are all trivial.
From an internal point of view the functions documentation-section and documentation-entry are almost trivial. They basically collect information and put it into a number of useful global variables, which are used in end-documentation (see section 1.3).
In both functions we extract the id and the title, and we add these to the elements (see the assignment to document-elements). In that way this information is available in a convenient way. We also make the section numbering, and we add it to the elements too. The numbering is done by the functions section-numbering and subsection-numbering. These functions are based on two global variables, section-number and subsection-number, that hold the section and subsection numbers. The element called raw-numbering is a list of section number and subsection number. (Section n has raw numbering (n 0)). The variables are assigned in the beginning of documentation-section and documentation-entry.
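The numbering scheme can be illustrated with a small sketch; the two counters follow the description above, while the exact return values and resets are assumptions:

(define section-number 0)
(define subsection-number 0)

(define (section-numbering)
  ; a new section: advance the section counter and reset the subsection counter
  (set! section-number (+ section-number 1))
  (set! subsection-number 0)
  (number->string section-number))

(define (subsection-numbering)
  ; a new entry within the current section
  (set! subsection-number (+ subsection-number 1))
  (string-append (number->string section-number) "."
                 (number->string subsection-number)))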
1.5 File structure overview
In order to be concrete let us assume that we document the program p.scm found in the directory p-dir. We get the following directory structure:
p-dir
   p.scm
   doc
      p.laml
      p.txt
      html
         images
      internal

First notice that the Emacs editor command make-elucidator constructs all the files and directories of doc, including templates of p.laml (the setup file) and p.txt (the documentation file). The user must, however, manually make the doc directory.
By executing the Elucidator all the icons are copied into the doc/html/images directory from the software directory, see section 8.2.
The internal directory is used for files generated by the elucidator; Some of these are used for transferring information from the Elucidator to Emacs' elucidator mode.
2 Making the program pages

In this section we will study one of the central aspects of the Elucidator, namely the decoration and WEB presentation of the source programs. This is one of the language dependent parts of the Elucidator; in our case the programming language is Scheme. In section 1 we described how the source files are enumerated in the setup file via program-source clauses. As one of the many tasks of end-documentation we make the program sources via calls of the function make-source-program-file for each source file which needs processing. This function is our starting point in this section.
2.1 Getting started: the top level functions
The function make-source-program-file calls elucidate-program-source. We use the source key information to make the name of the HTML output file, the destination path, which becomes the second parameter. Apart from that, the two functions are quite similar.
The function elucidate-program-source opens the input and output files. The original source text is read from the input file, and the HTML decorated source text is written to the output file. In this function we prepare for imperative processing of the output file. Thus, instead of forming one large HTML expression which represents the output, we write the HTML output piece by piece to the output port op. The functions pre-page and post-page from the html library, together with the start-tag and end-tag functions from the html-v1 library, are used for the imperative output of the necessary tags. Now the function elucidate-program-source-1 takes over. The function elucidate-program-source-1 iterates while we have not reached the end of the input file. The function elucidate-program-form is called for (but not on) each top-level program construct (Scheme top level form) in the input. We investigate this function in the next section.
2.2 The overall program traversal and scanning.
The basic idea is to traverse the form f (tree traversal) and scan the characters on the input port simultaneously. A decorated version of the input is written to the output port op. The decoration consists of coloring, linking, and insertion of a few special icons into the program text. Notice that we do not go for any kind of pretty printing. The source file, as presented in the browser, should basically appear as written in the text editor. The necessary decoration is made possible because we can look ahead in the input via the parsed form f. We know what is in front of us...
The function elucidate-program-form basically dispatches on the type of the form f. (This is not entirely true, but as of now we will tell the story this way. In section 2.4 we will return to a lexical special case). As we see from the large conditional we handle symbols, strings, numbers, chars, booleans, and a number of list variants.
In the simple cases we call a matching function, such as match-symbol. Via one of the Lisp reading functions (which reads a very well-defined portion of the input at the current location) we read the symbol which must be ahead of us. This knowledge is due to the synchronous scanning and tree traversal. Because some symbols may be anchors of links to program definitions we look the symbol up in the defined names. If it is there, we output an HTML anchor tag with a link to the corresponding definition. If not, we just output the symbol.
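The essence of the symbol matching can be sketched as follows; this is not the real match-symbol, and it assumes that the defined names are available as a plain list of symbols and that a raw HTML anchor string is acceptable output:

(define (match-symbol-sketch ip op defined-names)
  (let* ((sym (read ip))                 ; the symbol is known to be next on ip
         (name (symbol->string sym)))
    (if (memq sym defined-names)
        (display (string-append "<a href=\"#" name "\">" name "</a>") op)
        (display name op))))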
The functions match-string, match-char, match-number, and match-boolean, are similar, and in reality trivial because there is no possible linking from these lexemes.
2.3 Traversing and scanning lists
The matching of lists, which in the Lisp world represent program constructs, is of course more complicated. As an overall concern we need to keep the traversal of the program form f and the input synchronized. Here the lexical special elements such as white space and comments cause a number of problems.
Take a look at the conditional clause which takes care of define forms in the function elucidate-program-form. (This is the case which traverses and scans a Scheme define form). As can be seen we call the function skip-white-space in order to read over white space elements in the input. Recall that such elements do not have a counterpart in the form f. As can also be seen, in skip-white-space we handle comments in a special way by means of the function skip-comment (because a comment is a rather lengthy lexical element).
The functions match-start-parenthesis and match-end-parenthesis are low level helping functions which deal with the start and end parentheses of lists.
The call of the function total-doc-navigator should be noticed; This is the function which generates links from the program to the documentation (the yellow left arrows). We have more to say about this in section 2.5.
The recursive nature of lists causes a recursive processing: Elucidate-program-form calls itself on subforms. There is one very noteworthy thing in this respect. The fourth parameter of the function holds the defined names. This parameter is, as already seen in section 2.2, used to link from applied names to Scheme definitions. There may be local name definitions in a Scheme form. These are parameters and local name binding forms. According to the usual scope rules the local name definitions overrule the global name definitions, as found in defined-names. Therefore we want to subtract the locally defined names from the defined names, when the defined names are passed to the recursive call of elucidate-program-form. This is done by the function list-difference-2, but only for parameter bindings (as located by bounded-names). In section 3.2 we will explain the function bounded-names in some detail.
We have not yet subtracted locally defined names from let bindings. This unfortunately causes some mis-bindings of applied names in our WEB presentations of programs. It would probably be rather tedious to implement the subtractions of let-defined names.
As part of the continued development of the Scheme Elucidator this problem has been remedied, see section 9.
The processing of a define form comes before the processing of other proper lists, which comes before the processing of improper lists. As usual we handle special cases before the more general cases. The processing of lists and improper lists is relatively straightforward, and we will not go into these in more detail here.
Notice however that we have not yet programmed the support of manifest vectors in the program source. We have improved the handling of bounded names. In section ??? we will describe the new handling of this.
2.4 More lexical troubles
The relevant place to look is in the beginning of elucidate-program-form; More specifically the first case in the conditional. The function quote-in-input? returns whether the input port contains a quote character in front of us. (And we check whether the similar Lisp form is a quote expression; If not a fatal error occurs). If we encounter a quote character in the input we output a similar quote on the output, and we process the quoted expression recursively.
As of now we have not made a similar support of backquotes.
2.5 Making links from the program to the documentation
The relevant function to study is total-doc-navigator which is called by elucidate-program-form, as discussed in section 2.3. The function returns a sequence of icon anchors, or an empty string. The parameters to total-doc-navigator are:
The rest of total-doc-navigator returns the icon which toggles between small and large font (if wanted), the icon which links from the definition to the cross reference entry, and following that the documentation navigators. The function doc-link makes the documentation navigation icon anchor tags.
2.6 Marking detailed places in a program
In this subsection we will explain the mechanism that allows us to mark a particular place in the program. In some sense, this runs counter to the principle of leaving the source file unaffected by our documentation needs. The marking takes place in the comments of a source program, using very minimal means.
In the first version that we implemented, the markers were entirely visual. In a later version we link from the markers (see section 5.7).
At the concrete level, a mark in a source file comment has the form

@a

where a is a one-character entity (a letter or a digit).
At the program side, in the function skip-comment-1 we recognize this pattern and output an identifying image. This is done via the function source-marker-image.
At the documentation side, we use the same notation. The function program-linking-transition, which implements the state machine that governs the documentation linking, is the relevant place to implement the markers at the documentation side. From the state normal-text we may enter a new state, inside-marker. The input character encountered in this state determines the mark. We can use the same function as above, source-marker-image, to produce the marker. This is done via a call of source-marker-glyph. The result of this function call is just used as the output string in the state machine.
2.7 Preparing the linking to program source markers.
The thing to arrange is that the source markers in the programs are tagged with anchor names.
Recall from section 2.6 that skip-comment-1 is the relevant function, because source markers are embedded in comments in a program.
In addition to the source marker itself we need to output an anchor name, of the same form as shown above.
In order to do so we need access to the name of the definition, in which we are located. This information
is not available immediately in the function skip-comment-1.
We can solve this problem in two ways: Either we pass the name of the definition through all the functions as a parameter - from elucidate-program-form to skip-comment-1. This could (and perhaps should) be done, but not right now. As always, the easiest thing to do is to make an imperative solution. We go for this solution here.
The function elucidate-program-form sets the global variable enclosing-definition-name, both for define forms and for sectional, syntactical comments. Now, in skip-comment-1, and more importantly, in the state inside-marker of the function program-linking-transition (which has taken over the work of skip-comment-1), we can easily emit an a-name tag.
2.8 Linking from source markers in the program.
We need to save some additional bookkeeping information about the documentation source marks in order to relate a program source mark to the proper documentation source mark. The necessary information is akin to the information in the list documented-name-occurences, which describes the relations between program-definition-id, documentation-id, and weak/strong relationship. Here we need a triple relation

(program-id doc-id source-mark)

represented as a list, which we save in the variable documentation-source-marker-occurences. There is one entry for each documentation source marker.
The definition of this variable is really a
documentation side preparation, see again 5.8 for details.
The place to introduce the link to the documentation source mark is skip-comment-1.
We introduce the function doc-source-marker-link, the responsibility of which is to return the documentation-linked source mark.
We pass the information, which is necessary for the function to work:
Let us now describe the inner working of doc-source-marker-link. We first find the relevant entries in documentation-source-marker-occurences: the entries that deal with the given definition (referred to by enclosing-definition-name) and the given marker char. Next we check whether there are 0, 1, or more relevant entries. In case of 0 or more than one we issue a warning. In case of 1 we return the link (an anchor tag from the source mark glyph). We also return a link in case there is more than one relevant entry, namely to the first one.
The caption of the link reports the ambiguity via the function report-ambiguous-doc-source-markers which is called in doc-source-marker-link.
3.1 The function defined-names
3.2 The function bounded-names
3.1 The function defined-names
The function defined-name extracts the defined name from a Scheme define form, which can be one of these two kinds:
(define name value)
(define (function-name par) body)
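A minimal sketch of the extraction, assuming the define form is available as a parsed list, could be:

(define (defined-name-sketch define-form)
  (let ((second (cadr define-form)))
    (if (pair? second)
        (car second)    ; (define (function-name par) body) => function-name
        second)))       ; (define name value)               => name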
3.2 The function bounded-names
There are two cases, corresponding to the two forms of definition shown above. Let us first assume that the second element is a pair (proper or improper list). We have now two possibilities:
(define (function-name p1 p2 p3) body)
(define (function-name p1 p2 . p3) body)

In both cases we want to return the list (p1 p2 p3). We use the functions proper-part and first-improper-part from the general library to extract the proper and improper part of an improper list.
If the second element of the define form is a symbol we have the following possibilities:
(define name (lambda (p1 p2 p3) body))
(define name (lambda (p1 p2 . p3) body))

Again we want to return (p1 p2 p3). In any other case we return the empty list.
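The following sketch captures the idea under the assumptions above; it flattens the possibly improper parameter list itself instead of using the real proper-part and first-improper-part helpers:

(define (parameter-list->names params)
  ; flatten a possibly improper parameter list such as (p1 p2 . p3) to (p1 p2 p3)
  (cond ((null? params) '())
        ((symbol? params) (list params))
        (else (cons (car params) (parameter-list->names (cdr params))))))

(define (bounded-names-sketch define-form)
  (let ((second (cadr define-form)))
    (cond ((pair? second)                              ; (define (f p1 p2 . p3) ...)
           (parameter-list->names (cdr second)))
          ((and (pair? (caddr define-form))            ; (define f (lambda ...) ...)
                (eq? (car (caddr define-form)) 'lambda))
           (parameter-list->names (cadr (caddr define-form))))
          (else '()))))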
4 Parsing the textual documentation

In this section we will explain the processing of the textual documentation format. Recall that this is the preferred format of documentation in an Elucidator. (It is, by the way, the source text used for the markup of the text you are reading here). The alternative is to use documentation-entry and documentation-section forms with LAML markup (see section 1.4). This alternative is more complicated, but also more powerful because all the possible LAML abstractions are available.
4.1 Introduction to the textual documentation format
.TITLE title
.AUTHOR author
.EMAIL email
.AFFILIATION affiliation
.ABSTRACT abstract
.END
-----------------------------------------------------------------------------
.SECTION section-id
.TITLE section title
.BODY
Section text
More section text
.END
-----------------------------------------------------------------------------
.ENTRY entry-id
.TITLE entry title
.BODY
Entry text
More entry text
.END
-----------------------------------------------------------------------------

The dashed lines in between sections are just for separation purposes; They play the roles of comments. section-id and entry-id are section and unit identification symbols, used for cross reference purposes. HTML markup can appear in bodies and titles. The body text usually starts at the line following body, but it may also start just after the body keyword.
The dot markup is line oriented. The dotted keywords must be at the beginning of a line, and the text after .SECTION and .ENTRY and .TITLE run to the end of the line.
By the way, this is the only reason that the dotted keywords aren't interpreted in the text above. We do not, at this level, support "escape mechanisms" which allow us to have the dotted keywords in front positions of a line. However, we support escaping of the linking characters, see 4.2.
4.2 The overall ideas
If we want to use the characters [, ], {, }, and * inside curly brackets we need to escape them with a backslash character: \[, \], etc. The implementation of the escaping mechanism is realized through the state machine, which we return to in section 5.3.
4.3 The top level functions.
A documentation-from form appears between begin-documentation and end-documentation in the setup file, see section 1.1. We have already touched on documentation-from in section 1.2 and section 1.3.
The function documentation-from calls functions which process the intro part (title, author, etc), and the remaining documentation units (sections and entries). Besides this, documentation-from is responsible for file opening and closing. The remaining functions take input from an input port, ip.
The function documentation-intro-from-port processes the introduction. It eats the necessary white space in front of it. The function accept-documentation-intro does the real extraction and parsing work (see section 4.5). The function define-documentation-intro! calls documentation-intro with the extracted constituent. In turn, this function just assigns the title, author, etc to global variables, which are used by the function documentation-contents, which we explain in section 5.1.
Similarly, the function documentation-units-from-port eats initial white space, parses a unit (section or entry), and eats the separator. The function accept-documentation-unit does the real work (see again section 4.5). The collected unit is passed to define-unit!. The function define-unit! imperatively evaluates (by means of Scheme eval) the Lisp form made by make-documentation-form. This is the function which aggregates a documentation-section or documentation-entry form from the extracted information. Notice the iterative nature of documentation-units-from-port.
4.4 Organizing the parsing process
We could go for the application of a general parser. However, this is not attractive. There is only a tiny set of syntactic constructs. And ordinary lexical analysis would not be very useful on information which is more or less free text.
We could alternatively take the text through a state machine which should collect the necessary constituents while reading the individual characters of the textual documentation. This could be done, but there would be many states, and it would be quite difficult to make and maintain such a state machine. (We use state machines in other places, also in the Elucidator - see 5.3. We could, of course, use the general template and approach from there).
We decided to make a special set of procedures which accept well-defined portions of the documentation. This approach is quite similar to recursive descent parsing, although in our case there is no recursion involved (the language is so simple that it does not invite recursive constructs). In the next section we will explain this approach.
4.5 The accept functions
The function accept-documentation-intro accepts, in turn, the title, author, email, affiliation, and abstract by means of lower level accept functions. After successful acceptance and recognition of these it returns the list of the constituents.
The function accept-documentation-unit generically accepts a section and a unit. This is also done by lower level, specialized accept functions. The similar structure of sections and units allows for a single function doing the job. The function accepts id, title, and body.
There are a number of lower level accept functions, as mentioned above. These realize the kernel of the parsing in terms of collections and skippings. Let us look at one of these, accept-doc-id, as a typical representative. The text which is accepted is one of the following:
.SECTION id
.ENTRY id

The function accept-doc-id has an important precondition established by the context: It must be called just before the appearance of a keyword .ENTRY or .SECTION. The function collect-until collects the keyword by reading until white space is encountered. We check to see whether the collected text is either the string .ENTRY or .SECTION. If not we stop the processing via doc-check, which causes a fatal error and an error message. Next we skip white space via the function skip-while, and the id is collected via collect-until. Accept-doc-id returns the list reflecting the concrete syntax: (list unit id).
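The shape of accept-doc-id can be sketched like this; the real function uses collect-until, skip-while, and doc-check, which are approximated here with local stand-ins reading from a port:

(define (accept-doc-id-sketch ip)
  ; stand-in for collect-until: collect characters until white space
  (define (collect-non-white)
    (let loop ((chars '()))
      (let ((c (peek-char ip)))
        (if (or (eof-object? c) (char-whitespace? c))
            (list->string (reverse chars))
            (begin (read-char ip) (loop (cons c chars)))))))
  ; stand-in for skip-while: skip white space
  (define (skip-white)
    (let ((c (peek-char ip)))
      (if (and (char? c) (char-whitespace? c))
          (begin (read-char ip) (skip-white)))))
  (let ((keyword (collect-non-white)))
    (if (not (or (string=? keyword ".SECTION") (string=? keyword ".ENTRY")))
        (error "accept-doc-id: expected .SECTION or .ENTRY, got" keyword))
    (skip-white)
    (list keyword (collect-non-white))))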
Most of the other accept-functions work in the same way as accept-doc-id. These are accept-doc-title, accept-doc-author, accept-doc-email, accept-doc-affiliation, and accept-doc-abstract. The function which accepts the bodies of sections and units is a little different, so we will explain it briefly.
accept-doc-body first eats white space, after which it accepts the body keyword. It finally calls accept-body-text, which in turn calls the iterative accept-body-text-1. It collects lines, again using collect-until, until it meets the .END keyword on its own line. The predicate end-unit? identifies this situation. accept-doc-body reverses the collected lines, and appends them with string-merge.
Notice that we call a function eat-eol-chars in accept-body-text-1. When it is called we have encountered an end of line. The end of line handling is tricky, because we want the program to run both on Unix and Windows. On Windows, lines are ended by CR (character 13) and LF (character 10). The function eat-eol-chars reads the LF and prepares for a "good start" on the next line (emptying the one character queue).
4.6 The collection functions
We don't know the length of the collected text. We accumulate read characters in a variable, collection-buffer, but we cannot easily determine the length of this buffer. We could handle this by allocating longer and longer strings (or more and more strings), much as we do in the function read-text-file in the file-read (and write) library, which reads text from a file and returns it as a string.
As an important observation, we collect line by line in the accepting functions. Therefore we can live with a fixed upper limit, defined by buffer-length.
Quite often we read a character which we really did not want to read. This is a classical problem when handling input. We want to put the character back, such that the next reading will re-encounter the character. Some libraries support a put-back operation and a queue of putted back characters. We only have a "one character queue", next-doc-char. The function read-next-doc-char takes the character from next-doc-char if there is one, else it reads a character from the input port. Because this is the central place where we read characters from the input, we can also administer line numbers here. If we read a CR we increase the variable doc-line-number. By means of this we can give relatively good and precise error messages in the doc-check procedure.
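A sketch of the one character queue could look as follows; the variable and function names are taken from the text above, the details are assumptions:

(define next-doc-char #f)       ; the one character put-back queue
(define doc-line-number 1)      ; line counter for error messages

(define (read-next-doc-char ip)
  (let ((ch (if next-doc-char
                (let ((c next-doc-char))
                  (set! next-doc-char #f)
                  c)
                (read-char ip))))
    (if (and (char? ch) (char=? ch #\return))   ; a CR marks a new line
        (set! doc-line-number (+ doc-line-number 1)))
    ch))

(define (put-back-doc-char ch)
  ; there is room for exactly one putted back character
  (set! next-doc-char ch))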
Let us also here mention some generally useful predicates which we in convenient ways pass to collect-until and skip-while. These identify white space (is-white-space?), end of line (end-of-line?), and similar boundary conditions.
4.7 The skipping functions

4.8 Summary of parsing process
5 Making the documentation page

In this section we describe the production of the documentation page given the variable documentation-elements (and others). Thus, the starting point is the parsed documentation page, as represented in the bunch of variables, of which the most important is documentation-elements. The parsing process was described in section 4 and summarized above in 4.8. The most serious challenge in this section is to convert curly brackets and brackets to program and documentation references.
5.1 The function documentation-contents
The function documentation-contents produces the title, author info, abstract, and the trailing blank space surrounding the documentation contents. Furthermore, string-append accumulates the contributions from the sections and entries of the documentation.
The real work is initiated by the function present-documentation-element which dispatches to present-documentation-section and present-documentation-entry. The function present-documentation-section presents the introductory section text in a color-frame together with a numbered title. Similarly, the function present-documentation-entry presents a documentation entry. Both of these functions call do-program-link-documentation in which the really interesting work is done, namely conversion of the linking brackets to HTML anchor tags. This function is described next.
In both a documentation section and a documentation entry we show links to parent and sibling sections/entries. In section 5.6 we describe how this is done.
5.2 The function do-program-link-documentation
The function do-program-link-documentation-1 works via a state machine which reads each character in the input (the body) and translates this to an appropriate output. The central function, which realizes the state machine, is program-linking-transition. It is activated on a state, an input character, a collected word, and the documentation id. The collected word serves as a collector string for the bracketed linking words; When we see the rear end of the linking word we can call the function linking-from-doc-to-prog or linking-from-doc-to-doc which insert the anchor tags. Details about these follow in section 5.4. In the next section we will describe the state machine in some detail.
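The driving loop around the transition function can be sketched as follows; the return convention assumed here (a list of next state, new collected word, and output string) is a guess, not the actual interface of program-linking-transition:

(define (do-program-link-documentation-sketch body doc-id)
  ; feed each character of the body through the transition function and
  ; concatenate the produced output strings
  (let loop ((i 0) (state 'normal-text) (collected "") (output ""))
    (if (= i (string-length body))
        output
        (let* ((result (program-linking-transition
                        state (string-ref body i) collected doc-id))
               (next-state     (car result))
               (next-collected (cadr result))
               (out            (caddr result)))
          (loop (+ i 1) next-state next-collected
                (string-append output out))))))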
5.3 The state machines which transform the documentation bodies
We have the following states:
5.4 The functions which return a link to a program unit or a documentation unit
As an important side effect of this function we accumulate the linking words in the global variable documented-name-occurences. Besides this we return an a-tag-target string.
The function linking-from-doc-to-doc is simpler. The word is looked up in the variable documentation-key-numbering-alist. This is the variable which maps documentation ids to the assigned section numbers. The section number becomes the anchor text of the a-tag-target URL. In order to handle errors (again non-fatal) we test if the collected word is a known one in the association list documentation-key-numbering-alist. If not we issue a warning, and just return the word collected-word without any linking from it.
5.5 Refined linking possibilities
{*reference}

The necessary program modifications were the following: In the function linking-from-doc-to-prog we test for the initial star in the first parameter, word. This is done by the function strong-program-link?. This function uses the variable strong-link-char in order to make it possible to change the strong link character to another character.
We distinguish weak and strong links by using different colors in the documentation frame. Strong links are red, and weak links are dark blue. We also need to extract the real linking word in case there is a leading star. This is done easily by the function linking-word-of-strong-link.
On the program side we want the left arrows to indicate whether they are involved in strong or weak program links. In order to do so we need to remember the strong/weak distinction of a given link. This has to affect the registrations done in the variable documented-name-occurences. We decide to add an extra symbol 'strong or 'weak to the association. Thus an association may now be (program-name doc-id strong/weak). Notice that it is still an association list, associating the program-name to a list of two elements.
The association list stored in documented-name-occurences is used in a variety of functions under the formal name documented-names. The only place the information in the association list is used is in the function total-doc-navigator, and further on in doc-link. (The function doc-navigator is outdated, and not used.) Here we introduce the distinction between weak and strong links by showing different left arrow icons for the two of them.
5.6 Linking between documentation sections and entries.
The documentation link banners are produced by the function section-navigation-banner, which takes the documentation elements of the section/entry as parameter. section-navigation-banner is called by present-documentation-entry and present-documentation-section.

Internally in section-navigation-banner we use the function doc-section-url to produce URLs to sections and subsections (entries). This function just traverses the variable documentation-elements, and by means of filtering it finds the relevant documentation element (a section or entry). The predicate section-subsection? is useful (and it is similar to subsections? which we discuss in section 7.6). An URL to section n.m is produced by (doc-section-url n m). Notice that n.0 denotes section n. In this respect, in the function section-navigation-banner, there are special cases when we calculate the previous (blind) links from section 1 and section i.1.

Whereas the function section-navigation-banner produces URLs, the function section-navigation-banner-1 produces the graphical appearance of the documentation link banners. In that way there is a clear division of responsibility in between the two of them.
5.7 Linking from source markers in the documentation.
We have already prepared for this in section 2.7, where we introduced anchor names of the program source marks.
From a design point of view we decide that a source marker is associated to the nearest strong relation (a red one). Only strong relations earlier than the source mark are taken into consideration.
In the function program-linking-transition we need to output an anchor tag instead of just the source marker. This is done by the function source-mark-anchor. This function is modelled after linking-from-doc-to-prog, and it is really straightforward once linking-from-doc-to-prog is understood. source-mark-anchor depends on the global variable previous-strong-program-word, which is assigned by linking-from-doc-to-prog.

We use the following naming scheme for identification of a source marker:

program.html#definition-@m

where definition is the name of a definition (a function name, typically) and m is a marker name.
5.8 Preparing the linking to the documentation source markers.
We also define a variable documentation-source-marker-occurences which relates program-ids, doc-ids and source mark characters. The variable is assigned by the procedure source-mark-register (registration of a new entry in the list).
The function to care about in order to introduce anchor names of documentation source markers is program-linking-transition. At the same place as we introduced the linking to the program (see section 5.7) we now also insert an anchor name, via use of the a-name LAML tag. The anchor name is the following:

docId-@x

where docId is the documentation id of the section/entry and x is the source mark character.
The programming challenge is here how to get access to the documentation id of the section/entry, in which we are located. We are lucky here! The function program-linking-transition carries this information as the last parameter.
In section 2.8 we will link to the anchor names (from program source markers to documentation source markers).
6 Extracting applied names.

In this section we will study the extraction of applied names from the parsed program files.
6.1 Overview
6.2 The function applied-names-multiple-sources.
6.3 Extracting applied names from a single form.
6.1 Overview
The function applied-names-multiple-sources initiates the extraction task. At the calling place in end-documentation we see that the list of parsed source forms is made by appending source-list-list-process (the list of parsed sources processed in this 'Elucidation') and the list of source forms read via read-source (corresponding to the non-processed source files).
6.2 The function applied-names-multiple-sources.
The function applied-names, and in particular its helping function applied-names-1 extracts applied name pairs from a single source list, representing a single source file. The latter function accumulates the results in the last parameter, res. We see that applied name pairs are only collected from definitions, identified by the predicate define-form?. In reality applied-names-1 iterates through the definitions of source-list, skipping the remaining top level forms. Under the definition of this-contribution we see the construction of the pairs of applied and defined names. We also see that the function applied-names-one-form extracts the applied names from a single form. This function is explained in the next section.
6.3 Extracting applied names from a single form.
For each symbol we encounter we return that symbol (i.e., a list consisting of the symbol) if the symbol is defined in the current documentation bundle. The function defining-in-batch? implements this condition.
Later in the conditional we process various special cases of lists (from most specialised to most general):
(define (f p1 p2) ...)
(define f ...)
(lambda (p1 p2) ...)
(let ((n1 v1) (n2 v2) ...) ...)

In all of these we want to skip defining name occurences (defined names, parameters, and bound let names). These are f, p1, p2, n1, and n2 above. The function let-vals returns the forms corresponding to v1 and v2 above.
The remaining cases are simple, exhaustive traversals and collection.
7.1 The cross reference index
7.2 Alphabetically organized cross reference indexes
7.3 The duplicated name index
7.4 The defined name index
7.5 Making the table of contents
7.6 Local table of contents

7.1 The cross reference index
The function end-documentation writes an HTML page with the cross reference index. Here the function present-cross-reference-index is called with the list defined-applied-names as parameter (an association list mapping names to all definitions in which they occur, sorted after the first element in the list). The creation of this list was addressed in section 6.

Besides forming the real and actual list of applied/defined names (see below in this subsection) the function present-cross-reference-index makes the outer table of the cross reference. The function takes a list of pairs as parameter; each pair (a . d) represents an applied name a which is applied in the definition of d. The list is sorted by applied name (the car position).

In this function we first "sublist" the parameter list such that all entries belonging to the same applied name become a sublist; In other words all occurences of an applied name are grouped together in a sublist. Hereby the list passed as parameter becomes one level deeper. Next we eliminate multiple applications of the same name in a single definition. This is done by (essentially) mapping remove-duplicates-by-predicate over all sublists of the list formed just above.

The rest of the task in present-cross-reference-index is presentation of the result. The left column shows an applied name; The right "fat column" presents all the definitions in which the name occurs. All the entries (an applied name and all the definitions in which it occurs) are produced by the function present-applied-sublist. The "fat column" just mentioned is made in this function. Each entry in this inner table is made by present-defined-entry.
The names which are defined but not applied in the current documentation bundle do not occur in the list defined-applied-names. It would be quite informative to include these, e.g., in order to illustrate that a given definition is not used (at all, or at least in the current bundle). Therefore we merge the lists defining-name-occurences and defined-applied-names to a list of the same data format as defined-applied-names. This takes place in the function merge-defined-and-defined-applied-lists. The pair
(name . #f)

is a legal entry in the list, meaning that the name is applied nowhere in the documentation bundle.
Symmetrically, the names which are applied but never defined would be useful in the cross reference index; such a situation may indicate an error. As of now, these do not appear. These names are probably not extracted at all. If we tried to do so, we would probably end up getting far too many names. It could be complicated to hit all the symbols which are in evaluating position, and to relate them to global definitions.
7.2 Alphabetically organized cross reference indexes
The split cross reference index facility is controlled by a boolean variable alphabetic-cross-reference-index?.
As can be seen in end-documentation, the generation of the split index is, in principle, straightforward. First we split the value of extended-defined-applied-names into alphabetical sections by means of the function split-defined-applied-names. Next we call the function make-cross-reference-index over the splitted list, thus generating a number of smaller index files. Finally, the function make-overall-cross-reference-index makes the overall index, with links to the individual small index files.

Now to some of the details, first split-defined-applied-names. It is easy to do the splitting via an application of the function sublist-by-predicate. We just need to make a predicate which identifies different front letters in two consecutive elements of the car position in defined-applied-names (called dan here, for brevity). However, there is one problem: In case there are no names with a particular starting letter we get a smaller number of sublists than letters in the alphabet. We see two solutions; we go for the second one, which works with a partial alphabet.

It is not hard to make the partial alphabet list; We just map the function first-letter-of over an appropriate list formed by another mapping over the splitted list of defined applied names.

Next we map a procedure make-cross-reference-index over the splitted name list and over the partial alphabet. The first parameter passed to this function is a list of name pairs; Each pair (a . d) represents an applied name a and a definition d in which it is applied; d may be false (#f) in case a is not applied in the current documentation bundle at all.

The procedure make-cross-reference-index produces a small index file (for names with a given initial letter). It uses the function present-cross-reference-index to present the cross reference index. Recall from section 7.1 that this function produces the table which presents the cross reference index. We also present an alphabet link array - by means of alphabetic-link-array-1 - allowing for easy navigation to other indexes from an arbitrary index.

Finally, we have to make the overall index, which just contains an alphabet navigator. We take the already existing library function alphabetic-link-array as the starting point. This function needs generalization with respect to both the linking target, the alphabet, and more. This gives a variant, called alphabetic-link-array-1.
7.3 The duplicated name index
The elucidator uses the names of definitions as identifications. This is a very simple decision (probably also too simple). In case of double definitions we cannot distinguish between two or more definitions. Needless to say, this causes problems. Therefore we also use the duplicated name index to remind the user of the Elucidator of the size of this problem (or "flaw", you could say).
Internally, the function duplicated-definitions sorts all defined names, and next we attempt to identify duplicates. Given that we have the sorted definitions f, g, g, h, i, and j (with g doubly defined) we identify duplicates by pairing the list

(f g g h i j)

with the tail of the list

(g g h i j)

giving

((f . g) (g . g) (g . h) (h . i) (i . j))

The element (g . g) represents a duplicate.
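The same idea can be sketched compactly without building the explicit pair list; this is not the real duplicated-definitions, just an illustration of the principle:

(define (duplicates-in-sorted sorted-names)
  ; report a name whenever it equals its successor in the sorted list
  (cond ((or (null? sorted-names) (null? (cdr sorted-names))) '())
        ((eq? (car sorted-names) (cadr sorted-names))
         (cons (car sorted-names) (duplicates-in-sorted (cdr sorted-names))))
        (else (duplicates-in-sorted (cdr sorted-names)))))

; (duplicates-in-sorted '(f g g h i j)) => (g)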
The function present-duplicated-definitions (called by end-documentation) presents the duplicated definitions in a straightforward way.
7.4 The defined name index

7.5 Making the table of contents
As can be seen in present-documentation-contents we present the table of contents in a two column list, made by the function two-column-list. The second parameter determines whether we show both sections and entries, or only sections (good for long documents).
The function which presents a single entry is present-documentation-content-element. We use the information in the first parameter, element, to access the kind (entry or section), the doc-id symbol, the section number, and the title. We return a string which represents a (possibly indented) anchor tag.
7.6 Local table of contents
Recall that a documentation section is made by the function present-documentation-section. In this function we find the relevant subsections by filtering the documentation-elements list with the predicate (subsections? section-number). The function subsections? generates a predicate which returns true on exactly the proper subsections of section section-number.
The local table of contents is made by the function present-documentation-subsection-element. This function is straightforward; It returns a string, in which the important substring is an anchor tag which is associated with a link to the appropriate subsection.
8 Constructing the HTML files.

In this section we will discuss the most interesting aspects of the HTML file construction, including the images on which the HTML files depend.
8.1 Some HTML details.
8.2 The icons
8.3 The program file menu and coloring schemes
8.4 The Help page

8.1 Some HTML details.
Almost all the html files are generated in end-documentation. The responsibilities are divided in three parts:
The frames as such (technically the frameset) are constructed by specialized functions such as elucidator-frame and elucidator-frame-horizontal. Here we use the functions from the html-v1 library for aggregation of the frame stuff. These are the functions which are responsible for the overall layout of the Elucidator, as presented in a browser.
8.2 The icons
If or when we introduce a new icon it must be saved in the images directory. In addition, it must be put into the list elucidator-image-files. As part of the processing in end-documentation the icons are copied from the software directory to the html/images subdirectory of the directory in which a concrete elucidator resides. This is done by means of copy-files from the general library.
This organization ensures that all relevant icons appear in any elucidator instance. The icons will physically exist in many places; But this is the price of self contained html directories, which easily can be copied and transported.
The icons appear on a number of different WWW pages. The icons, and the links behind them, are produced by the function icon-bar.
8.3 The program file menu and coloring schemes
It may be difficult to realize which program file is being presented in the program-frame to the right in an elucidator browser. Therefore we support a background color scheme of programs in the documentation bundle. In order to be general, also the documentation and index frames can have distinct colors. The colors of the frames are controlled by the variable elucidator-color-scheme, which is #f (use default-background-color) or an association list that maps group names (see section 1.2) to colors.
The function make-color-scheme is meant to be a high-level function via which to define elucidator-color-scheme. The function returns an association list given a property list as input. make-color-scheme is called from the setup file. The function color-of-group maps color groups (as used in the program-source forms) to colors, as defined by the color scheme.
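The property list to association list conversion can be sketched as follows; the real make-color-scheme may validate its input and do more:

(define (make-color-scheme-sketch . group-color-plist)
  ; ("group1" color1 "group2" color2 ...) => (("group1" . color1) ("group2" . color2) ...)
  (if (null? group-color-plist)
      '()
      (cons (cons (car group-color-plist) (cadr group-color-plist))
            (apply make-color-scheme-sketch (cddr group-color-plist)))))

; (make-color-scheme-sketch "core" "yellow" "lib" "green")
; => (("core" . "yellow") ("lib" . "green"))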
If there are many programs in the documentation bundle it will not work out nicely to have a horizontal table with all program files in the index frame. Therefore we have introduced yet another frame, the program-menu frame, which holds a menu (table) of all programs. The color scheme, discussed above, is used in that table too. The function source-file-links-for-program-menu produces the table. This function is very similar to source-file-links, which produces the horizontal table. (The main difference is the use of different table functions, table-1 and table-4).
The boolean variable separate-program-menu? controls whether to use the original horizontal table of program, or the menu frame. The function control-frame produces the control frame, if the boolean variable is false, or a column frameset consisting of the control-frame and the program-menu.
8.4 The Help page
The help page uses LAML markup, as can be expected. The elucidator generates the help page every time it is executed.
9 Handling of bounded names

This section elaborates on the problem raised in section 2.3. Let us repeat the point here:

We have not yet subtracted locally defined names from let bindings. This unfortunately causes some mis-bindings of applied names in our WEB presentations of programs. It would probably be rather tedious to implement the subtractions of let-defined names.

As we will see below, it turns out that it is relatively easy to implement a solution.
9.1 Introduction to the problem
9.2 A solution

9.1 Introduction to the problem
(define a ...)

(define c ...)

(define (f a b)
  (let ((c d) (e f))
    (x a c)))

First, binding occurences should never be linked as applied names to the top-level definitions of a and c. In the early versions of the Elucidator the binding occurence of c is linked to the top-level definition of c. This is - of course - wrong.
Second, the names a and c in the form (x a c) should not be linked to the top-level definitions of a and c. Within the body of the let clause in f, a and c refer to the a parameter and the local binding of c, of course. The early versions of the Elucidator would make wrong links here on c. However, a is identified as a parameter, and no mis-linking is done on the name a in the early version of the elucidator.
9.2 A solution
In elucidate-program-form we subtract the locally bound names from the defining names in the recursive calls of the function. The form f is here a define form, on which the function bounded-names returns the list of binding name occurences.
Our solution is now to weaken the precondition on bounded-names such that it works on any Scheme form. If we pass a name binding construct to bounded-names it will return the name bindings of the form. If we pass another construct to bounded-names it returns the empty list. The existing version of bounded-names is renamed to parameter-names, as suggested in section 3.2.
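The generalized function can be sketched like this; parameter-names is assumed to be the renamed, define-only version mentioned above, and the set of recognized binding forms is kept small on purpose:

(define (bounded-names-sketch form)
  ; return the names bound by a name binding form; the empty list for other forms
  (define (flatten-params p)
    (cond ((null? p) '())
          ((symbol? p) (list p))
          (else (cons (car p) (flatten-params (cdr p))))))
  (cond ((not (pair? form)) '())
        ((eq? (car form) 'define) (parameter-names form))
        ((eq? (car form) 'lambda) (flatten-params (cadr form)))
        ((memq (car form) '(let let* letrec))
         (let ((bindings (if (symbol? (cadr form)) (caddr form) (cadr form)))) ; allow named let
           (map car bindings)))
        (else '())))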
Now, in elucidate-program-form, we subtract the bounded names from the defining names more uniformly in the recursive calls. Concretely, this is now also done in the cases which handle lists that are not define forms. Hereby we eliminate the names which happen to be bound locally from the names which are considered as source anchors of applied-defined name links.
10.1 Problems and existing descriptions
In the Scheme Elucidator program the parsing of the Scheme program is done by the function read-source, which is called from end-documentation. The handling of comments is done via the function skip-comment, which is called exclusively by skip-white-space, which in turn is called from elucidate-program-form.
We will now discuss our ideas for improved handling of comments in the Scheme elucidator.
10.2 Ideas for improved handling of comments
Let us illustrate the idea via an example. First a small Scheme program:
;;;; This is just an example
;; The function f adds a constant
;; c to its first parameter
(define (f a c)
(+ a c) ; The result
)
;; This function just calls f
(define (g a)
(f a 5))
and here the transformation, in which the comments are lexical elements:
(comment 4 "This is just an example ")
(comment 2 "The function f adds a constant
c to its first parameter ")
(define (f a c)
(+ a c) (comment 1 "The result ")
)
(comment 2 "This function just calls f ")
(define (g a)
(f a 5))
(Actually, we use a slightly different designation for the comment,
syntactical-comment-designator, in order not to risk a name conflict wrt. the name 'comment').
The transformation illustrated above has been carried out by the function lexical-to-syntactical-comments! mentioned above. We see that the number of semicolons is represented as an integer argument of the syntactical comment form. We also see that consecutive comment lines are folded into a single syntactic comment construct.
Now the overall idea is to pre-process all Scheme source files by means of the function lexical-to-syntactical-comments. This will affect the program presentation, as realized by elucidate-program-form, which therefore needs modification.
10.3 Solution
Now, the function read-source, as called by end-documentation, should read the internal comment-transformed source file instead of the original source files. This is arranged for in the function source-file-determinator, which takes a program source file descriptor (with key, file-location, and language components). The variable comment-handling determines whether to invoke and use syntactical comments.
10.4 Extracting sectional names from comments.
A sectional comment has the form

; ::section-name:: Remaining comment

which - according to section 10.2 - is transformed to

(comment 1 "::section-name:: Remaining comment")

There may be one or more semicolons (typically, there will be three, according to our SchemeDoc conventions). However, the section name must be the first thing in the comment (following possible white space). Thus, we only look for section names in the prefix of the comment string. This is a natural decision seen from a design perspective, and it allows us to make a reasonably efficient predicate to determine whether a comment holds a section name.
At the overall level, we now want to extract section names from comments, and add these as contributions to defining-name-occurences.
We start with the predicate syntactical-comment? which recognizes a syntactical comment. This function is straightforward. Next, we need a predicate which identifies a section name comment: section-name-comment?. The input is a string, such as
"::section-name:: Remaining comment"The function first skips white space in the string, whereafter it looks for a name in double colons. We could use a function such as substring-index, but we want to be reasonable efficient wrt. determination of the existence of a section name in a comment. The programming of this function involves skip-chars-in-string and looking-at-substring? (programmed for this particular purpose) which we put in the general-lib because of their general applicability.
We use syntactical-comment? and section-name-comment? in the function defined-names-1, which has been revised to extract the section name from 'sectional comment'. The section name is considered as exactly the same as a defining name in a Scheme define form.
10.5 Look-ahead through comments for a define form

10.6 Presenting syntactical comments.
We are now ready to present the syntactical comment in elucidate-program-form. We introduce the conditional clause which captures syntactical comments before any lists (define, non-define, and pairs) are caught. The idea is to skip the comment (real skipping, without outputting anything on op), and then to pretty print the syntactical comment string with respect to source markers and section names (and more perhaps). Here we will address the skipping. The pretty printing will be discussed in section 10.8.

The procedure match-syntactical-comment-without-output is specialized, and simple. It reads through all characters of the syntactical comment on the input port ip without outputting anything on the output port op. The procedure depends on the exact form of a syntactical comment, as explained in section 10.2.

Just after calling match-syntactical-comment-without-output in elucidate-program-form we read a single char. As explained in the program comment, this eats the empty line (the newline) character after the syntactical comment. Now it is time to explain the pretty printing of the comment string of the syntactical comment.
10.7 Printing the anchor name
The key to this is the next form (nf) parameter of elucidate-program-form, as touched on in section 10.6. In the conditional case we test whether the next form, nf, is a define form. If it is, we write the anchor name on the output port, and we set the flag last-define-a-name imperatively; it remembers the fact that we have already written the anchor tag of the next definition. The value of the flag is the name of the define form. The flag is initialized in elucidate-program-source-1.

In the part of elucidate-program-form that handles define forms we now only write the anchor tag if it has not already been written together with the previous (interface) comment. Thus, the writing of the anchor tag is conditional, by the expression (not (eq? last-define-a-name (defined-name f))). After using the value of last-define-a-name we reset it to its initial value, #f.

In case we use lexical comments this will also work, because in that case last-define-a-name will always be false.
10.8 Pretty printing syntactical comments
It is worth noticing that we have already done a similar processing in the function skip-comment-1, but there we read from the input port and wrote to the output port. As mentioned above, we now have the comment in a string. This calls for a functional solution using a state machine, similar to the state machine from sections 5.2 and 5.3.
Let us look at the desired transformation. The input might be
" ::sect-name:: This is a comment with @a source mark.
It is a two line comment."
We need to identify the double colon syntax of section names, and the source marker. Source markers
only take effect if there is a white space character after them, cf. section 2.6.
A newline should be followed by lexical comment characters (one or more semicolons).
The top-level function for the current comment processing is render-syntactical-comment. It calls the function do-render-syntactical-comment, which is quite similar to do-program-link-documentation. (We could consider abstracting over the functions which realize the state machine, especially because they have been used five or more times in various places in the LAML software. However, the experience seems to be that they are all slightly different when it comes to the fine details. So I make the state machine by hand, also this time).
The function do-render-syntactical-comment in turn calls the function syntactical-comment-transition, which implements the central state machine of the comment rendering.
There will be the following states in the state machine of syntactical-comment-transition:
11.1 Background
11.2 The solution to the problem

11.1 Background
Recall that a program reference in the documentation is written as

{*reference}

The star character could be '+', '-', or empty as well. (An empty modifier defaults to '+').
We would like to be able to determine in which file to address reference. We will use the syntax
{*source-key$reference}

The source-key acts as a source key qualification. The special syntax (in which the reference part is empty, and in which the *, +, - may be omitted)

{source-key$}

generates a reference to the program as such.
The original and simple reference should still be legal. In cases where there is only one file containing a definition of reference everything works out as usual. In case two or more files contain reference we are now able to refer to a specific instance, in a specific source file.
11.2 The solution to the problem
The function qualified-program-link? returns the source key qualification (a string) if it is applied on a qualified "word" which happens to contain a source key. However, it only returns a source key if the candidate is a member of source-key-list. Recall that all the source keys of the documentation bundle are found in the list source-key-list. If the word is not qualified it returns #f (false).
The function proper-linking-word returns the proper linking word. This is the string "reference" in the example of section 11.1. The result is without modifier (such as +, -, *) and with the qualification.
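The splitting of a qualified linking word at the '$' character can be sketched as follows; the real qualified-program-link? additionally checks the candidate key against source-key-list, as described above:

(define (split-qualified-word word)
  ; return (qualification . word); the qualification is #f when no '$' occurs
  (let ((len (string-length word)))
    (let loop ((i 0))
      (cond ((= i len) (cons #f word))
            ((char=? (string-ref word i) #\$)
             (cons (substring word 0 i) (substring word (+ i 1) len)))
            (else (loop (+ i 1)))))))

; (split-qualified-word "lib$reference") => ("lib" . "reference")
; (split-qualified-word "reference")     => (#f . "reference")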
The new case in linking-from-doc-to-prog is relevant if there is more than one possible linking target, and if the reference is qualified. A warning is issued if an illegal qualification is given. In that case we link to 'the first' source key.
12 Ideas to future work on the Elucidator tool.

In this section we will describe and enumerate the ideas for future work on the Elucidator.
12.1 The ideas
This part of our work is related to version control of the documentation bundle.
Support {*source-key:program-name}
13 Problems and errors in the Elucidator

In this section we will describe the known problems and errors in the Elucidator.
13.1 The problems and errors
Solution: We make sure that store-defined-names stores the defined name as a string. The reverse function, restore-defined-names, converts the string back to a symbol. As such, the elucidator (the Scheme software) is not affected by the change. The internal files with defined names have been changed, however.