WebSyn
Summary
WebSyn is a program to assist in maintaining a web site mirror
on local disk. It updates files, using instructions embedded
in HTML comment fields, performing tasks that would otherwise
require time-consuming manual amendments to many web pages.
It is particularly aimed at giving a consistent look-and-feel
to a whole site (or areas of a site), though its general-
purpose nature may well make it also useful for other
purposes. Hmmm, well, it would wouldn't it; that's what
"general-purpose" means!
Introduction
The motivation for writing this program arose when I created a
web site before I knew what it would ultimately look like. But
how many web sites spring into being, finished, never to be
altered again? Hopefully, none of yours or mine.
I designed (wrote would be more accurate) a number of pages,
concentrating on getting drafts of the content done, with things
like border graphics, navigation buttons etc. left as a lower
priority. I expected my knowledge in that area to grow, and
that I would want to redesign the look of the site a number of
times as I took more time to work on the graphics.
I was not prepared to revisit dozens of pages, revamping the
graphic layout on each one, every time I made a site design
change. There had to be a better way. I think WebSyn is that
way.
Embedded commands
Although WebSyn is able to scan a whole site, the best way to
describe its actions is to consider just one file and discuss
larger considerations later.
WebSyn scans a file and generates a new version, according to
commands embedded within the file.
WebSyn looks for special HTML comments that contain instructions
it understands. Each of these comments has a similar form,
having a '$' character immediately after the standard <!--
that starts an HTML comment. The comments appear in pairs, to
mark a significant piece of HTML, here called a section. For
example:
Last updated <!--$( date="%Y/%m/%d" -->2000/02/28 <!--$) -->
The pair is matched, the first comment using $( and the second
comment $). Optionally the rest of the comment may contain a
list of qualifiers, of the form keyword="value".
To identify sections, a name may follow the bracket, in which
case the matching bracket must have the same name. E.g.
.. $(another_bit .. .. $)another_bit
The part between the matching comments is called the section
body.
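As an illustration of the comment syntax described above, here is a minimal
Python sketch that finds matched pairs and splits out the name, the qualifiers
and the section body. It is not WebSyn's own code. The names COMMENT,
QUALIFIER and find_sections are made up for this example, nested sections are
not handled, and it assumes a space separates the bracket (or name) from the
first qualifier, as in the example line.

```python
import re

# One WebSyn-style comment: <!--$( name key="value" ... -->
# group 1 = bracket character, group 2 = optional name, group 3 = the rest
COMMENT = re.compile(r'<!--\$([()\[\]])(\w*)\s*(.*?)\s*-->', re.DOTALL)
QUALIFIER = re.compile(r'(\w+)="([^"]*)"')

# Which closing bracket belongs to which opening bracket
CLOSER = {"(": ")", "[": "]"}


def find_sections(text):
    """Return (bracket, name, qualifiers, body) for each matched pair.

    Simplification for this sketch: the first closing comment with the
    right bracket and the same name ends the section, so nesting of
    identically named sections is not supported.
    """
    sections = []
    pos = 0
    while True:
        start = COMMENT.search(text, pos)
        if start is None:
            break
        bracket, name, rest = start.groups()
        if bracket not in CLOSER:
            pos = start.end()  # a stray closing comment, skip it
            continue
        end = None
        for m in COMMENT.finditer(text, start.end()):
            if m.group(1) == CLOSER[bracket] and m.group(2) == name:
                end = m
                break
        if end is None:
            pos = start.end()  # unmatched opener, skip it
            continue
        qualifiers = dict(QUALIFIER.findall(rest))
        body = text[start.end():end.start()]
        sections.append((bracket, name, qualifiers, body))
        pos = end.end()
    return sections
```

Called on the "Last updated" line shown earlier, this yields one unnamed
section whose only qualifier is date and whose body is the old date text.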
The following keywords are supported:
file
The value is the name of a file. The contents of the
specified file will replace the material between the
pair of comments (the section body). If the new
material contains further WebSyn commands, these will
also be interpreted.
section
Again the value is the name of a file. Instead of the
whole contents of that file being used, the file is
searched for a section with the same name as this one,
and the body found there replaces the original section
body.
date
The value is a string that can be used to generate a
date to replace the section body. [Details to follow]
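The keyword handling above could be sketched roughly as follows, in Python.
This is only an illustration under assumptions: the function name
replacement_body and the order in which keywords are checked are invented
here, the 'section' keyword is left out, and the 'date' value is treated as a
strftime format string (the To do list notes that the C strftime format is
what is currently used).

```python
from datetime import datetime
from pathlib import Path


def replacement_body(qualifiers, old_body, now):
    """Work out the new section body from the qualifiers of one section.

    qualifiers: dict of keyword -> value taken from the opening comment
    old_body:   the existing text between the pair of comments
    now:        the datetime to use for the 'date' keyword
    """
    if "file" in qualifiers:
        # The whole contents of the named file replace the body.
        # (Re-scanning the new text for further commands is not shown.)
        return Path(qualifiers["file"]).read_text()
    if "date" in qualifiers:
        # Assumption: the value is a strftime-style format string.
        return now.strftime(qualifiers["date"])
    # No keyword recognised by this sketch: leave the body alone.
    return old_body
```

For example, replacement_body({"date": "%Y/%m/%d"}, "", datetime(2000, 2, 28))
gives the text 2000/02/28.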
File update
After a new version of the file has been created (in a temporary
file, named $$$$$$$$.$$$), it is compared with the original.
If the files match, the temporary file is deleted. If they
don't match, the old file is renamed (to a name ending with a
dollar) and the temporary file is given the original name.
Thus, the timestamps of unchanged files are preserved.
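The update step described above can be sketched in Python as below. It is a
sketch, not the program's actual code: the function name commit_update is
invented, and the backup name is assumed to be the original name with a
dollar appended, since the text only says the name ends with a dollar.

```python
import filecmp
import os


def commit_update(original, temporary):
    """Install `temporary` in place of `original` only if they differ.

    Returns True if the original was replaced, False if it was left
    untouched (and so kept its timestamp).
    """
    # shallow=False forces a comparison of the file contents
    if filecmp.cmp(original, temporary, shallow=False):
        os.remove(temporary)
        return False
    # Assumed backup naming: original name plus a trailing dollar
    os.replace(original, original + "$")
    os.replace(temporary, original)
    return True
```

Because an unchanged file is never rewritten, a tool that mirrors by
timestamp has nothing new to upload for it.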
However, this would not be adequate for the 'date' keyword,
because a freshly generated date would make every file look
changed on every run. Rather than make this a specific
exception, a more general mechanism is provided. Sections
enclosed in $[ .. $] are not compared during the match. So
the normal way to use the 'date' keyword is:
Page updated <!--$[ date="%Y/%m/%d %H:%M:%S %Z" -->2000/02/28 21:24:59 GMT<!--$] -->
This still doesn't offer a full solution. If you manually
update a file, you need to manually alter the date within
this part. I still don't see a good solution for this; all
ideas gladly received.
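One way to picture the rule that $[ .. $] sections are left out of the
comparison is to blank their bodies in both versions before comparing. The
Python sketch below does that. The names VOLATILE, mask_volatile and
same_ignoring_volatile are invented for the example, and nested $[ .. $]
sections are not handled.

```python
import re

# Opening <!--$[name ... -->, then the body, then the matching <!--$]name -->
# group 1 = opening comment, group 2 = name (may be empty), group 3 = closing
VOLATILE = re.compile(
    r'(<!--\$\[(\w*)(?!\w).*?-->).*?(<!--\$\]\2\s*-->)',
    re.DOTALL,
)


def mask_volatile(text):
    """Drop the body of every $[ .. $] section, keeping the comments."""
    return VOLATILE.sub(r'\1\3', text)


def same_ignoring_volatile(old, new):
    """True if the two texts differ only inside $[ .. $] sections."""
    return mask_volatile(old) == mask_volatile(new)
```

With this, two versions of a page that differ only in the generated date
compare as equal, so the original file and its timestamp are kept.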
Configuration
You will probably want to run the program whenever you update
your site. A future version will have options held in a
configuration file. This will have the extension .wsy, and
any file matching this in the current directory will be used.
To do list
Implement configuration.
Consider how the section keyword should work when recursing
through directories.
Consider whether to document the 'date' keyword (currently
using the C strftime format) or change it to something
more friendly (e.g. yyyy/mm/dd).
Wish list
* Something that puts in the size of a file, so that lists of
files to download can be auto-updated.
* Some kind of macro substitution, so that e.g. a generic navigation
bar can be defined, and the image corresponding to the current
page can be changed.
For example, suppose I have 3 pages
page1.htm
..
button1_depressed.gif
button2.gif link-to-page2
button3.gif link-to-page3
page2.htm
..
button1.gif link-to-page1
button2_depressed.gif
button3.gif link-to-page3
page3.htm
..
button1.gif link-to-page1
button2.gif link-to-page2
button3_depressed.gif
Instead of having to hand craft each of those pages, I want
something like
<!--$(btns edit="some bit of magic" -->
button1.gif link-to-page1
button2.gif link-to-page2
button3.gif link-to-page3
<!--$)btns -->
If you can think of a way of specifying this, I'd like to hear
from you.
* Something to generate directory listings.
This appears to be a Catch-22. Think about the .htm file
in which you want to put a listing of the directory.
If it's in the same directory, consider what happens:
1. the .htm file is uploaded
2. remote directory is fetched (the ftpmir program
can do this automatically)
3. the .htm file is out of date, compared to the
directory listing, therefore
4. the .htm file is updated, so it is out of date
compared to the web site and... goto step 1.
I don't yet see how to get out of this loop in an elegant
way, while ensuring that non-trivial changes to the
directory listing do get uploaded.
* Something to put dates of other files into files, e.g. so that
you could have a page listing recent updates to the site,
so regular visitors know what's new.