URL2Link Rexx Program
/* ReXX */
/* #!/usr/local/bin/rexx */
/* Getting this running NOTES:
****************************
For IBM/Rexx standard interpreters, put
/* ReXX */ as first in this script,
For UNIX style systems, put the #!/... 'shebang' line
as first in file, correcting the interpreter
path/filename as needed.
For variations from Rexx standard Library functions
that need corrections (Lines, LineIn), see
'Function Corrections' comments, way down there.
These are known to be needed for OS/2 / AS400 interpreters,
and somewhat current GPL Regina interpreters.
If you printed this from an html web page and experience
traps for unmatched '/*' - '*/' pairs, check for a '*' and '/'
pair split by an end-of-line, and rejoin them.
It has been tested to 'auto process' itself into html
that runs when processed back to plain text.
*/
/* General NOTES:
**************
Based on prototype ReXX filter script
from 'Teach Yourself ReXX in 21 DAYS'
By Bill and Esther Schindler
p. 271-272
ReXX Language Association: http://www.rexxla.org
Delete #!/usr/local/bin/rexx 'shebang' line
for use on non-UNIX-like systems.
Make sure /* REXX */ comment is leftmost in
the topmost line.
(C) Copyright D.E. Legan,
You may modify or use this program as you wish,
as long as you do not remove my credit for
originally writing it.
If you make any juicy improvements,
let me know! :-)
If you figure out the full implications of the GPL,
let me know.
I assume no liability for the results of this program,
legal or otherwise.
D.E. Legan 12 Feb. 2001
mailto:leganii@surfree.com
http://www.lafn.org/~aw585/index.html
http://www.lafn.org/~aw585/dlegan.html
Thanks to Rollin White (mailto:rollin@scoug.com)
for a few tweaks.
*/
/* More General Notes:
*******************
url2link URL to HTML converter for text file input
27 Jan 2001 d.e. legan
__
Help:
----
url2link -h
Typical use, at the command line prompt:
-----------
url2link < input.txt > output.htm
List all substitutions made in text and URLs:
'url2link -Lr', 'url2link -Lx', 'url2link -Ll', or 'url2link -Lxl'
__
Purpose: to do about 90% of the work in converting
------- a simple text file to HTML.
Used 'HTML Visual Quick Reference' by Scharf / pub. Que
as a reference
__
To 'auto process': url2link -r url2link
----------------- or
url2link -r url2link.cmd
This should prepare it fairly cleanly for further
processing towards conversion of this source text to HTML.
See comments on 'self processing' below in the source for
specific changes to make this come off neatly.
This is probably a good test of handling special,
weird cases.
__
*Known* Limitations:
------------------
0) The first line, and the 'Lines' routine may have
to be changed depending on OS you are using.
1) All of URL must be on one line, no continuation on next line
2) URL must have one of the 'schema' i.e. http:, or ftp:
set up in schemix stem variable.
3) URL must end with ' ' (space) or End of line,
with a possible trailing '.', ',' '"', "'",
or ';' before the space or EOL.
4) Characters preceeded by % in URLs will be converted
to %xx, xx = hexidecimal digits, (0-9, A-F) EXCEPT
spaces and hexidecimal digits themselves.
(which shouldn't need escaping this way anyway.)
To embed a a space, you must put in by hand
'%20' (hex. 32).
(all '%' escaped hex codes for characters should be
passed unchanged in URLs)
To embed a '%' (as might occur in some e-mail addresses
to other networks that go through gateways)
use '%%%' (three (3) '%' in a row).
This escaping should be extremely usefull for embedding
Get methods inside URLs.
5) Reserved/Special characters must be in the
set up in stem variables in subroutine Datastructures.
6) Might want to add ability to do arbitrary inclusion of files
or macro substitution.
7) Some routines currently might be optimized by
converting tail recursion to iteration.
8) This was used as tool to learn HTML by me.
Some of the features may seem foolish in retrospect,
so bear with me. :-)
In short, this is tool that does not completely automate
the process of converting raw text to HTML, but takes care
of much of the dog work associated with the process.
Derived as a simplified version of the earlier
'url2html' program.
__
D.E. Legan 03 Feb. 2001
mailto:leganii@surfree.com
*/
/***********************************************************************
Start of Actual Program
***********************************************************************/
/* Since there shouldn't be any program errors,
bunch up all the traps in one handler.
This may not do much more than the defaults,
but it puts in place a skeleton to customize it more
if it is ever needed.
D.E.L. 09/Aug./2001
*/
SIGNAL ON SYNTAX NAME Handler ;
SIGNAL ON NOVALUE NAME Handler ;
SIGNAL ON ERROR NAME Handler ;
SIGNAL ON FAILURE NAME Handler ;
SIGNAL ON HALT NAME Handler ;
/* SIGNAL OFF NOTREADY NAME Handler ; */
SIGNAL OFF NOTREADY ;
/*
/* Alternate if ever needed: */
/* only these conditions are valid with this type of handler: */
/* (one that returns after doing "it's thing") */
CALL ON ERROR NAME HandlerR ;
CALL ON FAILURE NAME HandlerR ;
CALL ON HALT NAME HandlerR ;
CALL ON NOTREADY NAME HandlerR ;
*/
/* For simplified exposing of these variables in functions: */
basics = 'myself STDIN STDOUT STDERR n nshadow ' ,
'clearscreen TRUE FALSE ON OFF YES NO ' ,
'splitchar oddCharsInURLs dirseperator helptest ',
'hiddenstuff.' ;
/* say 'basics='basics'=' ; exit ; */
/* find out what the name of this program is:
(This routine also sets some OS / interpreter specific
parameters)
*/
myself = WhoAmI( ) ;
switches = '-hsLrxltnp' ; /* switches are on or off, there or not */
parameters = 'e' ; /* parameters take arguments */
options = switches || parameters ;
/* And now the Biggie: */
USAGE = 'USAGE: 'myself' ['options'=X] [[<file.in>|-] [<file.out>|-]]' ;
/* ----By convention, 'USAGE' should be near the head of the program */
/* Handle command line & switches ***************************/
PARSE ARG infile outfile scratch ;
infile = Strip( infile ) ;
IF 1 >< Verify( infile, helptest, 'M', 1 ) THEN
CALL Help infile ; /* First check for help */
/* Initialize switch flags: */
PARSE VAR OFF =1 reservetrans =1 debugmine =1 extendedtrans ,
=1 backslashtrans =1 listsubs ,
=1 extra =1 national =1 punctuation
/* -- using parse to do a multiple assignment,
instead of:
reservetrans = OFF ;
debugmine = OFF ;
extendedtrans = OFF ;
backslashtrans = OFF ;
listsubs = OFF ;
*/
splitchar = '' ;
escaper = '\' ;
/* Have made the option processing more ReXXcentric, should only
require only one pass to process each freestanding
clump of switches on the
command line.
D.E.L. 12/March/2001
Even more so, only with UNIX twist,
D.E.L. 14/July/2002
*/
DO WHILE( GetOpts( infile ) )
IF '--' = infile THEN /* End of options with '--' */
DO
PARSE VALUE outfile scratch WITH infile outfile scratch ;
infile = Strip( infile ) ;
Leave ;
END ;
IF '-' == infile THEN /* '-' is a valid infile spec. */
Leave ;
/* Start of processing 'parameters', 'options' with 'arguments' */
parameter = '' ;
cutpoint = Verify( infile, parameters, 'MATCH', 1 ) ;
IF 0 < cutpoint THEN
DO
PARSE VAR infile infile =(cutpoint) parameter +1 infile2 ;
IF '' << optarg THEN
infile = infile || infile2 ;
ELSE
optarg = infile2 ;
END ;
IF (( '' = parameter ) & ( '' >< optarg )) THEN
DO
SAY 'Spurious argument 'optarg', with no parameter specified.' ;
infile = '-help' ;
CALL Help ;
END ;
IF (( '' >< parameter ) & ( '' = optarg )) THEN
DO
optarg = outfile ;
outfile = '' ;
IF '' = optarg THEN
DO
SAY 'Parameter 'parameter' must have an argument!' ;
infile = '-help' ;
CALL Help ;
END ;
END ; /* parameter argument location/extraction */
CALL Help infile ; /* First check for help */
/* this had to be moved to after processing of
optarg / parameter values.
*/
IF '' >< parameter THEN /* There is a parameter with argument */
SELECT
/* Process parameters, with ARGUMENTs here */
/* Should only be one parameter per option cluster */
/* with possible argument following either ':', '=' or a space */
WHEN 'e' == parameter THEN
DO
/* Must handle this before testing for unacceptable options. */
escaper = optarg ;
IF 1 < Length( optarg ) THEN
DO
infile = '-help' ;
CALL Help ;
END ;
END ;
OTHERWISE
SAY 'Something seems to be wrong with parameter set up.' ;
infile = '-help' ;
CALL Help ;
END ;
/* End of processing 'parameters', 'options' with 'arguments' */
/* START of Hidden switch processing */
IF Inside( '?', infile ) THEN /* hidden switch */
DO /* Help synonymn, for the totally lost */
/* CALL LineOut outfile ; */
infile = '-help' ;
CALL Help ;
END ;
IF Inside( 'd', infile ) THEN /* hidden switch */
DO /* secret debug switch, for me, D.E.L. */
debugmine = intent ;
END ;
IF Inside( 'Q', infile ) THEN /* hidden switch */
DO /* secret switch to query version and source info */
CALL Whereami ;
END ;
/* END of Hidden switches */
/* Verify valid, non-hidden options */
/* Parameters, with arguments, should be sliced out of
infile at this point */
IF Verify( infile, options ) > 0 THEN /* check for invalid switches */
DO
/* CALL LineOut outfile ; */
infile = '-help' ;
CALL Help ;
END ;
/* End of Verifying valid, non-hidden switches */
/* Process cleartext switches, with no arguments here */
IF Inside( 'h', infile ) THEN
DO
/* CALL LineOut outfile ; */
infile = '-help' ;
CALL Help ;
END ;
IF Inside( 's', infile ) THEN
DO
splitchar = n ;
END ;
IF Inside( 'r', infile ) THEN
DO
reservetrans = intent ;
END ;
IF Inside( 'x', infile ) THEN
DO
reservetrans = intent ;
extendedtrans = intent ;
END ;
IF Inside( 'l', infile ) THEN
DO
reservetrans = intent ;
backslashtrans = intent ;
END ;
IF Inside( 'L', infile ) THEN
DO
listsubs = intent ;
END ;
IF Inside( 't', infile ) THEN
DO
extra = intent ;
END ;
IF Inside( 'n', infile ) THEN
DO
national = intent ;
END ;
IF Inside( 'p', infile ) THEN
DO
punctuation = intent ;
END ;
PARSE VALUE outfile scratch WITH infile outfile scratch ;
infile = Strip( infile ) ;
END ; /* DO WHILE( processing command line options ) */
/* Set up I/O */
SELECT
WHEN outfile = '' THEN
outfile = STDOUT ;
WHEN outfile = '-' THEN
outfile = STDOUT ;
OTHERWISE
NOP ;
END /* SELECT */
SELECT
WHEN infile = '' THEN
infile = STDIN ;
WHEN infile = '-' THEN
infile = STDIN ;
OTHERWISE
NOP ;
END /* SELECT */
IF debugmine THEN
DO
SAY 'infile is *'infile'*' ;
SAY 'outfile is *'outfile'*' ;
END
/* End of switch command line processing ***********************/
/*****************************************************************/
CALL DataStructures ; /* VERY Important call here */
/*****************************************************************/
/* Preliminary Steps now complete - Begin generating HTML */
/* Main Processing Loop */
lastline = '' ;
DO WHILE Lines( infile )
/*
See the Function Corrections section below.
See Daney, Lines() is not
generally Boolean
only for OS/2 and Regina
therefore, the test
Also in Regina Lines only seems to work after a read at the
end of the file returns a null string.
Previously, I tried a Blines() function that tried to deal
with this, but some variations of this failed with OS/2,
and also it seemed excessively complex.
So the LineIn() call was moved to the end of the loop,
and then these tests seemed to work fine, were reasonably fast
and fairly simple.
It also seems to be critical if the NOTREADY trap is set or not.
*/
linethat = LineIn( infile ) ;
linethat = Strip( linethat, 'T' ) ; /* remove trailing blanks */
/* Here, look at each possibility, and process appropriately */
IF '' >< linethat THEN
DO
/* Now process each reserved character */
/* Those that cannot be in an URL */
IF reservetrans THEN
DO cindex = 1 TO reservechar.0
linethat = Reserver( cindex, linethat ) ;
END cindex
/**** The most important call: ****/
linethat = Process( linethat ) ;
END /* IF */
linethat = Translate( linethat, cleanout, dirtyin ) ;
/* cleaning up the non-ASCII characters used
to simplify conflicts between freestanding
and HTML meaningfull uses.
*/
CALL LineOut outfile, linethat ;
END /* while */
EXIT
/******************************************************************
*******************************************************************
END OF MAIN PROGRAM
START OF SUBROUTINES
*******************************************************************
******************************************************************/
Process: PROCEDURE EXPOSE fperc reservetrans ,
extendedtrans backslashtrans ,
escaper splitchar ,
oddCharsInURLs ,
schemix. ,
specialchar. specialnum. ,
spexialchar. spexialnum. ,
specialvector. spexialvector. ,
specialvectors. spexialevector. ;
/* This is the most important subroutine!! */
PARSE ARG linehere ;
/* schemalength = 0 ; */
firstpart = linehere ; /* default if there is no URL on line */
url = '' ;
lastpart = '' ;
/* This part has gone through many convoluted versions.
Part of the reason was a desire to deal cleanly with
various character cases for the schema.
Philosophical conclusion to date:
You should simply use Parse to split up data,
not to *decide* if it should be split up
(or should of been split up).
*/
IF Inside( ':', linehere ) THEN /* No colon, no schema! */
DO
PARSE UPPER VAR linehere shadowline ;
/* Destroying case dependence.
Since several parses are to be done, this allows
PARSE VAR .... instead of
PARSE UPPER VAR ....
and should speed things up since the PARSE does
not have to do upper casing along with everything else.
Tests seem to indicate 'PARSE VAR' is 3/5's the time of a
'PARSE UPPER VAR', so do it once here.
'PARSE UPPER VAR' is itself about half the time of
a call to Tranlate( <defaults to upper> ) and assignment.
NOTE: The URL schema in the data setup, MUST be
uppercase.
*/
urlpoint = shadowline ; /* set an initial value */
DO index = 1 TO schemix.0
/* Search for the schema closest to the start of the string */
/* in this loop. */
/* Note that shadowline, uppercased above, is used, */
/* with the uppercase schema so that the numeric location */
/* will be case insensitive. */
/* Note that the position of the colon triggering this */
/* is ignored, since something like - */
/* ........: http://. */
/* would make it erroneous data to consider. */
PARSE VAR shadowline urlframe (schemix.index) . ;
/* urlframe == (one of the) string(s) that frames/brackets
the URL.
If schemix.index is not found, the PARSE match point moves
to the end of the record, and urlframe captures the complete
line/record.
*/
IF urlframe << urlpoint THEN
DO
urlpoint = urlframe ;
schemalength = schemix.length.index ;
END
/* In strict comparison ('<<'), the longer string is greater
if it is an extension of the shorter.
That is, it consists of the shorter concatenated to
some more characters.
This comparison moves the urlpoint to the leftmost
schema.
*/
END index
/* Having found the first schema, now split things apart. */
urlpoint = Length( urlpoint ) ;
/* Now, use the actual line, not the uppercased 'shadowline',
used for searching in a case insensitive way.
*/
PARSE VAR linehere =1 . +(urlpoint) url ' ' lastpart ,
=1 firstpart (url) . ;
/* trickiness needed, the straight ahead:
PARSE VAR linehere =1 firstpart +(urlpoint) url ' ' lastpart ;
causes firstpart to slurp up the whole record
if 0 == urlpoint, that is the url is first thing on the line.
This seems to be a result of '+(urlpoint)' when urlpoint = 0
produces the equivalent of a backup in the PARSE template.
(backups causing the preceeding variable to 'grab' the rest of
line/record. This is to facilitate multiple assignments
with PARSE.).
If no URL was found, 'urlpoint' will be the length of the line,
so the '+(urlpoint)' match point will be past the end of the record,
squeezing 'url' and 'lastpart' to be empty strings.
On the second pass of the parse, '(url)' will, due to one
of PARSES peculiarities, put the end point of 'firstpart' at the
end of the record. (An empty string is 'found' at the end of
the record by PARSE, matching nothing in a finitely sized string.)
*/
trimmer = ' ' ;
littlebit = '' ;
textlabel = '' ;
SELECT
WHEN Inside( '|', url ) THEN
DO
/* Now grab the text label for the link: */
PARSE VAR url url '|' textlabel '|' sp '|' littlebit ;
IF 1 < Length( sp ) THEN /* may want to change this in the future */
DO
SAY 'Invalid space character (it is a string, not a single character.)'
SAY ' URL='url'=' ;
SAY ' text label='textlabel'=' ;
SAY ' space character (the problem)='sp'=' ;
SAY ' text abutted to URL='littlebit'=' ;
EXIT ;
END ;
END ;
WHEN '' >< oddCharsInURLs & ( '' >< url ) THEN
DO
cutpoint = Verify( url, oddCharsInURLs, 'M', schemalength ) ;
IF 0 < cutpoint THEN
DO
PARSE VAR url url =(cutpoint) littlebit ;
trimmer = '' ;
END ; /* IF */
END ; /* WHEN */
OTHERWISE
NOP ;
END ;
SELECT /* use select to allote for future complications */
WHEN '' = lastpart THEN
lastpart = littlebit ;
WHEN '' >< lastpart THEN
lastpart = littlebit lastpart ;
OTHERWISE
NOP ;
END ; /* SELECT */
END /* IF */
/* Here, recursively handle multiple URLs on one line */
IF '' >< lastpart THEN
lastpart = trimmer || Process( lastpart ) ;
/* -- calling this routine itself. */
/* If not for the above, only the first URL would be processed */
/* At this point, firstpart is assured to not have any
part of an URL in it, so this would be an ideal point
to process it for special and resesrved characters that
might be part of an URL.
*/
IF '' >< firstpart THEN
firstpart = FreeText( firstpart ) ;
/* This DO block replaced with the FreeText() function call
DO
/* order of evaluation is important on these things in this DO block! */
/* handle these first, so that '&', and ';' don't cause trouble */
IF extendedtrans THEN
firstpart = Specialsx( firstpart ) ;
IF backslashtrans THEN
firstpart = Specialbsx( firstpart ) ;
IF reservetrans THEN
DO cindex = 1 TO specialchar.0
firstpart = Specials( cindex, firstpart ) ;
END cindex
END /* IF */
*/ /* FreeText */
IF '' >< url THEN
DO
/* url = schema || url ; */
/* process url for hex escapes here */
/* order of evaluation is important on these things! */
/* Convert '%symbols;' to hex encoded '%xx' in URL: */
IF extendedtrans THEN
url = Percent2sX( url ) ;
/* Convert '\x' character strings to hex escapes: */
IF backslashtrans THEN
url = Percent2bsX( url ) ;
/* Convert characters preceeded with '%' to hex escapes in URl.
('%%%' -> '%'): */
url = Percent2X( url ) ;
/*
strend = Right( url, 1 ) ;
Let's do this in a more sophisticated way: :-)
*/
PARSE VAR url . '' -1 strend ;
/* add more characters to terminate URL string here as needed */
SELECT
/* Adjust these spacings to preferences */
/*
These branchs no longer used:
WHEN strend == "'" THEN
DO
/* This branch is specificly to enable self processing */
lastpart = "' " || lastpart ;
url = Strip( url, 'T', "'" ) ;
END
WHEN strend == '"' THEN
DO
/* This branch is specificly to enable self processing */
lastpart = '"' || lastpart ;
url = Strip( url, 'T', '"' ) ;
END
End of eliminated branches
*/
WHEN strend == '.' THEN
DO
lastpart = '.' || lastpart ;
url = Strip( url, 'T', '.' ) ;
END
WHEN strend == ';' THEN
DO
lastpart = ';' || lastpart ;
url = Strip( url, 'T', ';' ) ;
END
/* Another branch eliminated:
WHEN strend == ',' THEN /* adding ',' to the terminators */
DO
lastpart = ',' || lastpart ;
url = Strip( url, 'T', ',' ) ;
/* Apparently, quoting the T is sometimes
needed with some interpreters with ',',
it tries to interprete 'T' as a varaible
otherwise, sometimes.
*/
END
end of branch eliminated */
OTHERWISE
NOP ;
/*
IF '' >< lastpart THEN
lastpart = ' ' || lastpart ;
*/
END
/* add other trailing charters here and above as needed */
/* Now process the text label for the link: */
IF '' >< textlabel THEN
DO
IF '' >< sp THEN
textlabel = translate( textlabel, ' ', sp ) ;
ELSE
textlabel = translate( textlabel, ' ', '_' ) ;
textlabel = FreeText( textlabel ) ;
END ;
ELSE
DO
textlabel = url ;
END ;
/* original:
url = "<A HREF='" || url || "'>" || url || '</A>' ;
suggested change by Rollin White:
url = "<A HREF=""" || url || """>" || url || '</A>' ;
*/
IF schemalength < Length( url ) THEN
url = "<A HREF=""" || url || """>" || splitchar || textlabel || ,
splitchar || '</A>' ;
/* modified to allow text label for the link.
url = "<A HREF=""" || url || """>" || splitchar || url || ,
splitchar || '</A>' ;
*/
/* This will skip null URLs, that are just schema standing alone
in text, making it easier to process this script by itself. :-)
*/
END
linehere = firstpart || url || lastpart ;
RETURN linehere ;
FreeText:
/* Bundle some function calls for processing free standing text
into one function call
The EXPOSE list is basicly borrowed from 'Process:' -
*/
PROCEDURE EXPOSE fperc reservetrans ,
extendedtrans backslashtrans ,
escaper splitchar ,
oddCharsInURLs ,
schemix. ,
specialchar. specialnum. ,
spexialchar. spexialnum. ,
specialvector. spexialvector. ,
specialvectors. spexialevector. ;
PARSE ARG bitoftext ;
/* order of evaluation is important on these things in this DO block! */
/* handle these first, so that '&', and ';' don't cause trouble */
IF extendedtrans THEN
bitoftext = Specialsx( bitoftext ) ;
IF backslashtrans THEN
bitoftext = Specialbsx( bitoftext ) ;
IF reservetrans THEN
DO cindex = 1 TO specialchar.0
bitoftext = Specials( cindex, bitoftext ) ;
END cindex
RETURN bitoftext ;
Reserver: PROCEDURE EXPOSE reservechar. reservenum. ;
/* This is mostly to handle freestanding, literal angle brackets */
PARSE ARG reserved, linehere ;
PARSE VAR linehere firstpart (reservechar.reserved) lastpart ;
/* Here, recursively handle multiple instances on one line */
IF '' >< lastpart THEN
lastpart = Reserver( reserved, lastpart ) ;
/* If not for the above, only the first instance would be processed */
/* IF 0 < Pos( rchar, linehere ) THEN */
IF linehere >< firstpart THEN
linehere = firstpart || reservenum.reserved || lastpart ;
/* linehere = firstpart || '&#' || rnum || ';' || lastpart ; */
RETURN linehere ;
Specials: PROCEDURE EXPOSE specialchar. specialnum. ;
/* this is for freestanding characters that might have special
meaning if they appeared in a URL.
*/
PARSE ARG specialized, linehere ;
PARSE VAR linehere firstpart (specialchar.specialized) lastpart ;
/* Here, recursively handle multiple instances on one line */
IF '' >< lastpart THEN
lastpart = Specials( specialized, lastpart ) ;
/* If not for the above, only the first instance would be processed */
/* IF 0 < Pos( rchar, linehere ) THEN */
IF linehere >< firstpart THEN
linehere = firstpart || specialnum.specialized || lastpart ;
/* linehere = firstpart || '&#' || rnum || ';' || lastpart ; */
RETURN linehere ;
Specialsx: PROCEDURE EXPOSE specialvector. ;
/* This is for '&abbrev/backslashcode;' to decimal translations */
PARSE ARG linehere ;
/* PARSE VAR linehere firstpart '&' guts ';' lastpart , */
PARSE VAR linehere firstpart '&' +0 guts ';' lastpart ,
=1 . '&' . ';' +0 lptest ,
=1 falsepatch (lptest) ;
lastpartorg = lastpart ;
IF '' >< lastpart THEN
lastpart = Specialsx( lastpart ) ;
IF '' >< lptest THEN
DO
/* guts = '&' || Translate( guts ) || ';' ; */
/* guts = Translate( guts ) || ';' ; */
PARSE UPPER VAR guts guts ;
guts = guts';' ;
SELECT
WHEN '' >< specialvector.guts THEN
linehere = firstpart || specialvector.guts || lastpart ;
WHEN lastpartorg >< lastpart THEN
linehere = falsepatch || ';' || lastpart ;
OTHERWISE
NOP ;
END /* SELECT */
END /* IF */
RETURN linehere ;
Specialbsx: PROCEDURE EXPOSE specialvectors. escaper ;
/* This is for prefix escaped substitutions, originally '\' */
PARSE ARG linehere ;
/* PARSE VAR linehere firstpart '\' guts +4 lastpart ; */
PARSE VAR linehere firstpart (escaper) guts +4 lastpart ;
/* IF '' >< guts & '\' >< guts THEN */
IF '' >< guts & escaper >< guts THEN
DO index = 3 TO 1 BY -1
/* guts = Translate( guts ) ; */
PARSE UPPER VAR guts guts ;
IF '' >< specialvectors.guts THEN
DO
IF '' >< lastpart THEN
lastpart = Specialbsx( lastpart ) ;
return firstpart || specialvectors.guts || lastpart ;
END
/* PARSE VAR linehere firstpart '\' guts +(index) lastpart ; */
PARSE VAR linehere firstpart (escaper) guts +(index) lastpart ;
END index
RETURN linehere ;
Percent2X: PROCEDURE EXPOSE fperc ;
/* to precede a character by '%' and convert to %xx format
in URL (xx == hex encoded number).
*/
PARSE ARG linehere ;
PARSE VAR linehere firstpart '%' +1 kar +1 lastpart,
=1 fp '%' +1 xkar +2 lp ;
DO WHILE( '' >< kar )
SELECT
WHEN '%%' = xkar THEN
linehere = fp || fperc || lp ; /* '%%%' passed as '%' */
WHEN 0 = Datatype( xkar, 'X' ) THEN /* xkar is not a hex number! */
linehere = firstpart || fperc || C2X( kar ) || lastpart ;
OTHERWISE /* When kar specifies an already escaped Hex. char. code */
linehere = firstpart || fperc || kar || lastpart ;
END
PARSE VAR linehere firstpart '%' +1 kar +1 lastpart,
=1 fp '%' +1 xkar +2 lp ;
END /* while */
RETURN linehere ;
Percent2sX: PROCEDURE EXPOSE spexialvector. ;
/* convert in URL '%symbols;' to hex encoded '%xx' */
PARSE ARG linehere ;
PARSE VAR linehere firstpart '%' +0 guts ';' +0 lptest ,
=1 . '%' . ';' lastpart ,
=1 falsepatch (lptest) ;
lastpartorg = lastpart ;
IF '' >< lastpart THEN
lastpart = Percent2sX( lastpart ) ;
IF '' >< lptest THEN
DO
/* guts = Translate( guts ) || ';' ; */
/* Translate() defaults to uppercase */
PARSE UPPER VAR guts guts ;
guts = guts';' ;
SELECT
WHEN '' >< spexialvector.guts THEN
linehere = firstpart || spexialvector.guts || lastpart ;
WHEN lastpartorg >< lastpart THEN
linehere = falsepatch || ';' || lastpart ;
OTHERWISE
NOP ;
END /* SELECT */
END /* IF */
RETURN linehere ;
Percent2bsX: PROCEDURE EXPOSE spexialevector. escaper ;
/* convert '\symbol' to '%xx' string in URL */
PARSE ARG linehere ;
/* PARSE VAR linehere firstpart '\' guts +4 lastpart ; */
PARSE VAR linehere firstpart (escaper) guts +4 lastpart ;
/* IF '' >< guts & '\' >< guts THEN */
IF '' >< guts & escaper >< guts THEN
DO index = 3 TO 1 BY -1
/* guts = Translate( guts ) ; */
PARSE UPPER VAR guts guts ;
IF '' >< spexialevector.guts THEN
DO
IF '' >< lastpart THEN
lastpart = Percent2bsx( lastpart ) ;
return firstpart || spexialevector.guts || lastpart ;
END
/* PARSE VAR linehere firstpart '\' guts +(index) lastpart ; */
PARSE VAR linehere firstpart (escaper) guts +(index) lastpart ;
END index
RETURN linehere ;
WhoAmI: PROCEDURE EXPOSE (basics) ;
/*
Here take care of interpreter specific stuff:
Borrowing some ideas from Rexxtry ;-)
*/
PARSE version particular '_' .
IF 'REXX-Regina' == particular THEN
DO
/* Some defaults for Regina: */
STDIN = '' ;
STDOUT = '' ;
STDERR = '' ;
/* These are handled where more specificly attributable:
STDIN = '/dev/stdin' ;
STDOUT = '/dev/stdout' ;
STDERR = '/dev/stderr' ;
/* Or possibly:
STDIN = '/dev/fd/0' ;
STDOUT = '/dev/fd/1' ;
STDERR = '/dev/fd/2' ;
*/
*/
END
ELSE
DO
STDIN = 'STDIN:' ;
STDOUT = 'STDOUT:' ;
STDERR = 'STDERR:' ;
END
TRUE = 1 ;
FALSE = 0 ;
ON = TRUE ;
OFF = FALSE ;
YES = TRUE ;
NO = FALSE ;
environment = address() ;
/* Storing the original environment for future use,
such as possible retreival of of environmental variables,
with value()
Where appropriate below, this should be overridden.
*/
/* Find out what is being executed! */
PARSE SOURCE os . pgmName ;
/*
n will be the end of line character for various OSs for 'here' documents.
See Help, Header, Footer and How functions for examples.
clearscreen will clear the screnn
*/
/* os = Translate( os ) ; */
PARSE UPPER VAR os os ;
SELECT
WHEN 'OS/2' = os THEN
DO
n = '0D 0A'x ;
clearscreen = 'CLS' ;
environment = 'OS2ENVIRONMENT' ;
END
WHEN abbrev( os, 'Windows' ) THEN
DO
n = '0D 0A'x ;
clearscreen = 'CLS' ;
END
WHEN 'DOS' = os THEN
DO
n = '0D 0A'x ;
clearscreen = 'CLS' ;
END
WHEN 'UNIX' = os THEN
DO
n = '0A'x ;
clearscreen = 'clear' ;
STDIN = '/dev/stdin' ;
STDOUT = '/dev/stdout' ;
STDERR = '/dev/stderr' ;
/* Or possibly:
STDIN = '/dev/fd/0' ;
STDOUT = '/dev/fd/1' ;
STDERR = '/dev/fd/2' ;
*/
END
WHEN 'LINUX' = os THEN
DO
n = '0A'x ;
clearscreen = 'clear' ;
STDIN = '/dev/stdin' ;
STDOUT = '/dev/stdout' ;
STDERR = '/dev/stderr' ;
/* Or possibly:
STDIN = '/dev/fd/0' ;
STDOUT = '/dev/fd/1' ;
STDERR = '/dev/fd/2' ;
*/
END
WHEN 'AIX' = os THEN
DO
n = '0A'x ;
clearscreen = 'clear' ;
STDIN = '/dev/stdin' ;
STDOUT = '/dev/stdout' ;
STDERR = '/dev/stderr' ;
/* Or possibly:
STDIN = '/dev/fd/0' ;
STDOUT = '/dev/fd/1' ;
STDERR = '/dev/fd/2' ;
*/
END
WHEN 'CMS' = os THEN
DO
/* n = ' ' || D2C( 10 ) ; /* Does anyone really know? */ */
n = ' ' || '85'x ; /* According to some UUASC e-list traffic */
clearscreen = 'VMFCLEAR' ;
END
WHEN 'MAC' = os THEN /* and various 8 bit machines :-) */
DO
n = '0D'x ;
/* ??? clearscreen = 'clear' ; */
END
OTHERWISE
DO
n = '0D 0A'x ;
clearscreen = 'clear' ; /* ????? */
END
END /* SELECT */
/* where = directory() ; */
where = pgmName ;
SELECT
WHEN Inside( '\', where ) THEN
dirseperator = '\' ; /* PC proprietary */
WHEN Inside( '/', where ) THEN
dirseperator = '/' ; /* UNIX */
WHEN Inside( '.', where ) THEN
dirseperator = '.' ; /* some IBM mainframes */
WHEN Inside( ':', where ) THEN
dirseperator = ':' ; /* some Macs? */
OTHERWISE
NOP ;
END ;
helptest = '-+' ;
IF ( '/' >< dirseperator ) Then
helptest = helptest'/' ;
nshadow = Translate( n, , n, ' ' ) ;
/* ....for use with record/line oriented functions */
pgmName = Translate( pgmName, ' ', '/\' ) ; /* typical dir. seperators */
/* Note that this may not work on certain IBM mainframe
OSs where '.' is effectively the directory seperator
*/
nameCnt = Words( pgmName ) ; /* count spaced words and */
pgmName = Word( pgmName, nameCnt ) ; /* get the last one */
pgmName = Translate( pgmName, ' ', '.' ) ; /* Now strip extensions */
PARSE VAR pgmName pgmName . ;
RETURN pgmName ;
HandlerR:
/* Type 2 Exception handler - returns to program */
/* By default this is inactive in this program, just here
for possible future use, as needed.
*/
/* Basicly ripped off of Daney p. 205 */
/* "Programming in REXX", (c) 1992, Charles Daney,
J. Ranade IBM Series, McGraw-Hill, Inc.
*/
ttype = Condition( 'C' ) ;
IF 'SYNTAX' = ttype THEN
SAY 'REXX error 'rc' ('errortext( rc )' ) occured in line 'sigl'.' ;
IF 'ERROR' = ttype | 'FAILURE' = ttype THEN
SAY 'Command return code: 'rc' occured in line 'sigl'.' ;
IF SourceLine() > 0 THEN
SAY '=====> 'sourceline( sigl ) ;
SAY 'Of 'sourceline()' total lines.' ;
/*
signal on syntax
signal restart
*/
tdescription = Condition( 'D' ) ;
tinstruction = Condition( 'I' ) ;
tstate = Condition( 'S' ) ;
SAY 'Trap type: 'ttype ;
SAY 'Trap description: 'tdescription ;
SAY 'Trap instruction: 'tinstruction ;
SAY 'Trap state: 'tstate ;
/* EXIT ; /* HandlerR */ */
RETURN ; /* HandlerR */
Handler:
/* Type 1 Exception handler - No Return */
/* Basicly ripped off of Daney p. 205 */
/* "Programming in REXX", (c) 1992, Charles Daney,
J. Ranade IBM Series, McGraw-Hill, Inc.
*/
ttype = Condition( 'C' ) ;
IF 'FAILURE' = ttype THEN
SAY 'Command return code: 'rc' occured in line 'sigl'.' ;
IF SourceLine() > 0 THEN
SAY '=====> 'sourceline( sigl ) ;
SAY 'Of 'sourceline()' total lines.' ;
/*
signal on syntax
signal restart
*/
tdescription = Condition( 'D' ) ;
tinstruction = Condition( 'I' ) ;
tstate = Condition( 'S' ) ;
SAY 'Trap type: 'ttype ;
SAY 'Trap description: 'tdescription ;
SAY 'Trap instruction: 'tinstruction ;
SAY 'Trap state: 'tstate ;
EXIT ; /* Handler */
Whereami: PROCEDURE EXPOSE (basics) ;
/* Here take care of interpreter specific stuff: */
/* Borrowing some ideas from Rexxtry ;-) */
PARSE VERSION version ;
PARSE SOURCE source ;
clearscreen ;
SAY ,
''n,
version n,
''n,
source n,
''n ; /* end of 'HERE' document */
EXIT ;
RETURN ;
/*
Lowercase:
PARSE ARG casetarget ;
casegoal = Translate( casetarget, lowers, uppers ) ;
/ * ( string, to, from ) * /
return casegoal ;
EXIT ;
*/
Inside: Procedure ;
/* This returns a TRUE / FALSE value if string 'letter'
is inside ( a substring ) of string 'string'.
It is used mainly for processing command line switches.
It uses the same 'needle' in 'haystack' ordering
of parameters as ReXX's Pos command.
This is really just a boolean version of the
CountStr function, and so has been rewritten
to use methods I developed in that article.
*/
/*
PARSE ARG letter, string ;
RETURN Pos( letter, string ) > 0 ;
*/
PARSE ARG needle, haystack =1 subhaystack (needle) . ;
RETURN subhaystack << haystack ;
GetOpts:
PROCEDURE EXPOSE infile intent optarg (basics) ;
PARSE ARG optclump ;
/* a default value: */
optarg = '' ;
intent = OFF ;
IF 1 >< Verify( optclump, helptest, 'M', 1 ) THEN
RETURN OFF ;
cutpoint = Verify( optclump, ':=', 'M', 1 ) ;
IF 0 < cutpoint THEN
PARSE VAR optclump optclump =(cutpoint) . +1 optarg ;
SELECT
WHEN '' = optclump THEN
RETURN OFF ;
WHEN Abbrev( optclump, '-', 1 ) THEN
DO
intent = ON ;
infile = optclump ;
END ;
WHEN Abbrev( optclump, '+', 1 ) THEN
DO
intent = OFF ;
infile = optclump ;
END ;
WHEN ( Abbrev( optclump, '/', 1 ) & ( '/' >< dirseperator ) ) THEN
DO
intent = ON ;
infile = optclump ;
END ;
OTHERWISE
RETURN OFF ;
END ;
RETURN ON ;
Help: PROCEDURE EXPOSE USAGE infile (basics) ;
/* Be flexible accepting help */
rescueCry = Translate( infile, '-', '/' ) ;
/* Translate( string, to, from ) */
rescueCry = Strip( rescueCry, 'L', '-' ) ;
PARSE UPPER VAR rescueCry =1 rescueCry =5 . ;
rescueCry = Strip( rescueCry ) ;
IF Abbrev( 'HELP', rescueCry, 1 ) | ,
Abbrev( 'HHHH', rescueCry, 1 ) | ,
Abbrev( '????', rescueCry, 1 )
THEN
DO
/*
clearscreen ;
*/
CALL LineOut STDERR, ,
USAGE n,
''n,
' 'myself', a simple tool to aid converting flat text to HTML.'n,
' Typical use: 'myself' -xl < input.txt > output.htm'n,
' Switches:'n,
' -h - Help (this) message'n,
' Reserve/special character processing:'n,
' -s - Split the portions of the anchor onto seperate lines'n,
' -r - Simple reserve character processing'n,
' -x - Extended reserve character processing'n,
' -l - Backslashed reserve character processing'n,
' -L - List schema / reserve/special character substitutions'n,
' -eX - Make ''X'' instead of ''\'' the escape character'n,
' -t - Allow RFC 1738 extra characters in URLs. (!*''(),)'n,
' -n - Allow RFC 1738 national characters in URLs. ({}\^[]`)'n,
' -p - Allow RFC 1738 punctuation characters in URLs. (<>"#)'n,
' For all:'n,
' "URL|someClabel|C|" -> "some label" linked to URL'n,
' "C" defaults to "_", "someClabel" defaults to URL'n,
; /* end of 'Here' document */
EXIT ;
END /* IF */
RETURN ;
/****************************************************************/
/* Function Corrections **************************************/
/* Uncomment as appropriate *********************************/
/****************************************************************/
/* */
/* For OS/2, AS400, and sometimes Regina: */
Lines:
PROCEDURE ;
PARSE ARG infile ;
RETURN 0 < 'LINES'( infile ) ;
/* These calls to the built-in functions must be in upper case. */
/* END OS/2, AS400 special function defs. */
/* */
/*
/* Sometimes these are needed for Regina, if lines() seems to act up: */
Lines:
PROCEDURE EXPOSE (basics) ;
PARSE ARG infile ;
hiddenstuff.infile.line = 'LINEIN'( infile ) ;
/* These calls to the built-in functions must be in upper case. */
IF '' == hiddenstuff.infile.line THEN
IF 0 = 'LINES'( infile ) THEN
/* These calls to the built-in functions must be in upper case. */
RETURN FALSE ;
RETURN TRUE ;
LineIn:
PROCEDURE EXPOSE (basics) ;
PARSE ARG infile ;
RETURN hiddenstuff.infile.line ;
/* End Regina special function defs. */
*/
/****************************************************************/
/* Initialize Data *********************************/
/****************************************************************/
DataStructures: /* PROCEDURE ......not! */
/* This must all be available. */
/* What would be known in FORTRAN as the BLOCK DATA */
/* or was that DATA BLOCK? Been a long time ago...... */
/* lowers = Xrange( 'a', 'z' ) ; */
/* uppers = Xrange( 'A', 'Z' ) ; */
/*
/* See Daney p.xxx - This is portable across ASCII to EBCIDC: */
lowers = Xrange( 'a', 'i' ) || Xrange( 'j', 'r') || Xrange( 's', 'z' ) ;
/* uppers = Translate( lowers ) ; */
/* Faster: */
PARSE UPPERS VAR lowers uppers ;
*/
/* Initialize schema array */
/* These should be arranged in ascending order if any one is a
substring of another, with some processing schemes,
also they should be uppercase: */
schemix. = '' ; /* set the compound variable default value */
schemix.1 = 'HTTP://' ;
schemix.2 = 'HTTPS://' ;
schemix.3 = 'FTP://' ;
schemix.4 = 'FILE://' ;
schemix.5 = 'MAILTO:' ;
schemix.6 = 'NEWS:' ;
schemix.7 = 'NNTP:' ;
schemix.8 = 'TELNET://' ;
schemix.9 = 'SNEWS:' ; /* easier to handle snews here */
schemix.10 = 'NEWSPOST://' ;
schemix.11 = 'NEWSREPLY://' ;
schemix.12 = 'SNEWSPOST://' ;
schemix.13 = 'SNEWSREPLY://' ;
schemix.14 = 'IKSD://' ; /* Internet Kermit Server Daemon */
schemix.15 = 'RLOGIN://' ;
schemix.16 = 'TN3270://' ;
schemix.17 = 'TN5250://' ;
schemix.18 = 'GOPHER://' ;
schemix.19 = 'WAIS://' ;
schemix.20 = 'FINGER://' ;
schemix.21 = 'CSO://' ; /* CSO? I can't even find an RFC
for this :-)
it seems related to gopher, mentioned in
gopher RFCs. Some predecessor to LDAP?
(was listed as a URL Lynx could cope with)
*/
/* add more schema here as appropriate */
schemix.0 = 21 ; /* keep number of elements in '.0'
as is traditional in ReXX
*/
DO I = 1 TO schemix.0
/* Used to spot null URLs that are just a schema */
schemix.length.I = Length( schemix.I ) ;
END I ;
IF listsubs THEN
DO
SAY 'Schema recognized:'
DO index = 1 TO schemix.0
SAY ' 'schemix.index ;
END index
END
/* characters with questionable URL use status, official status
(theory):
From RFC 1738:
extra = "!" | "*" | "'" | "(" | ")" | ","
national = "{" | "}" | "|" | "\" | "^" | "~" | "[" | "]" | "`"
punctuation = "<" | ">" | "#" | "%" | <">
*/
/* What will be allowed in practice: */
oddCharsInURLs = '' ; /* default status */
IF \ extra THEN
oddCharsInURLs = oddCharsInURLs || "!*'()," ;
IF \ national THEN
oddCharsInURLs = oddCharsInURLs || "{}\^[]`" ;
/* strictly, should be: "{}\^~[|]`" ,
but '~' is in too many URLs in path specification
and '|' is being used for link text
(in the future may want to allow for alternate symbol for this)
and '\' is being used for some character substitution sequences
*/
IF \ punctuation THEN
oddCharsInURLs = oddCharsInURLs || '<>"#' ;
/* strictly, should be: '<>#%"' ,
but '%' is used for hex encoding and so not included here */
/* Now specificly exempt some characters: */
oddCharsInURLs = Translate( oddCharsInURLs, , escaper, ' ' ) ;
oddCharsInURLs = Space( oddCharsInURLs, 0 ) ;
/* --- This may need extension in the future */
IF listsubs THEN
DO
SAY 'Odd characters not allowed in URLs:'
SAY ' "'oddCharsInURLs'"' ;
END
/* Initialize reserve characters to be translated
These are tricky ones that are used until final
cleanup toward the end
The actual values, 130-132, is pretty arbitrary as long
as they are not in the normally printable or control ASCII values
They may have to be adjusted to deal with non-standard ASCII
values.
*/
fake = 128 /* 128 < fake +1 & fake + last fake char < 256 */
famp = D2C( fake + 1 ) ; /* Fake ampersand */
fhash = D2C( fake + 2 ) ; /* Fake hash, sharp or pound sign */
fsemi = D2C( fake + 3 ) ; /* Fake semicolon */
fperc = D2C( fake + 4 ) ; /* Fake percent */
fcolon = D2C( fake + 5 ) ; /* Fake colon */
fampfhash = famp || fhash ;
/* For translation just before printing the final output: */
dirtyin = famp || fhash || fsemi || fperc || fcolon ;
cleanout = '&' || '#' || ';' || '%' || ':' ;
/* These are for outside URLs, plain text */
/* Here handling characters guaranteed not to be in URLs */
/* These are the main trouble makers! */
reservechar. = '' ; /* again, set the default */
reservechar.1 = '<' ;
reservechar.2 = '>' ;
/* add more characters here as appropriate */
reservechar.0 = 2 ;
/* now set up an array of the decimal values of the characters: */
/* These are default values that can be over ridden below. */
reservenum. = '' ; /* set default */
DO nindex = 1 TO reservechar.0
tnumber = C2D( reservechar.nindex ) ;
tnumber = Right( tnumber, 3, '0' ) ;
reservenum.nindex = fampfhash || tnumber || fsemi ;
END nindex
reservenum.0 = reservechar.0 ; /* redundant loop, but ready for */
reservenum.1 = famp'lt'fsemi ; /* growth in the future */
reservenum.2 = famp'gt'fsemi ;
IF reservetrans THEN
IF listsubs THEN
DO
SAY 'Reserve character translations (-r, -x, -l):' ;
SAY 'in all text:' ;
DO index = 1 TO reservechar.0
SAY ' 'reservechar.index 'goes to',
Translate( reservenum.index, cleanout, dirtyin ) ;
END index
END
/* Now these are for characters that will not be inside an
URL, but generally have special meaning in URLs
*/
specialchar. = '' ; /* initialize compound variable */
specialchar.1 = '&' ;
specialchar.2 = '"' ;
specialchar.3 = '#' ;
specialchar.4 = ';' ;
specialchar.5 = '%' ;
specialchar.6 = ':' ;
specialchar.7 = '?' ;
specialchar.8 = '@' ;
specialchar.0 = 8 ;
specialnum. = '' ; /* set default */
DO nindex = 1 TO specialchar.0 /* set defaults for &#nnn; trans. */
tnumber = C2D( specialchar.nindex ) ;
tnumber = Right( tnumber, 3, '0' ) ;
specialnum.nindex = fampfhash || tnumber || fsemi ;
END nindex
specialnum.0 = specialchar.0 ;
specialnum.1 = famp'amp'fsemi ;
specialnum.2 = famp'quot'fsemi ;
IF reservetrans THEN
IF listsubs THEN
DO
SAY 'Special character translations (-r, -l, -x):' ;
SAY '(inside free text):' ;
DO index = 1 TO specialchar.0
SAY ' 'specialchar.index 'goes to',
Translate( specialnum.index, cleanout, dirtyin ) ;
END index
END
/* seperating these to simplify expanding all */
IF extendedtrans THEN /* &code; --> &#dec.number; */
DO /* %code; --> %#hex.number */
specialstr. = '' ; /* initialize compound variable */
specialstr.1 = '&NUL;' ; /* and standard control chars. */
specialstr.2 = '&SOH;' ; /* shifted off by 1 due to Rexx */
specialstr.3 = '&STX;' ; /* convention vrs. ASCII values */
specialstr.4 = '&ETX;' ;
specialstr.5 = '&EOT;' ;
specialstr.6 = '&ENQ;' ;
specialstr.7 = '&ACK;' ;
specialstr.8 = '&BEL;' ;
specialstr.9 = '&BS;' ;
specialstr.10 = '&HT;' ;
specialstr.11 = '&LF;' ;
specialstr.12 = '&VT;' ;
specialstr.13 = '&NP;' ;
specialstr.14 = '&CR;' ;
specialstr.15 = '&SO;' ;
specialstr.16 = '&SI;' ;
specialstr.17 = '&DLE;' ;
specialstr.18 = '&DC1;' ;
specialstr.19 = '&DC2;' ;
specialstr.20 = '&DC3;' ;
specialstr.21 = '&DC4;' ;
specialstr.22 = '&NAK;' ;
specialstr.23 = '&SYN;' ;
specialstr.24 = '&ETB;' ;
specialstr.25 = '&CAN;' ;
specialstr.26 = '&EM;' ;
specialstr.27 = '&SUB;' ;
specialstr.28 = '&ESC;' ;
specialstr.29 = '&FS;' ;
specialstr.30 = '&GS;' ;
specialstr.31 = '&RS;' ;
specialstr.32 = '&US;' ;
specialstr.33 = '&SP;' ; /* and include space with the other whitespace
*/
specialstr.34 = '&TAB;' ; /* alternate tab symbol */
specialstr.35 = '&\T;' ; /* here define some common C world escapes */
specialstr.36 = '&\R;' ;
specialstr.37 = '&\F;' ; /* form feed */
specialstr.38 = '&\A;' ; /* bell */
specialstr.39 = '&\E;' ; /* escape */
specialstr.40 = '&\B;' ; /* backspace */
specialstr.41 = '&\N;' ; /* 'newline' = form feed */
specialstr.42 = '&\0;' ; /* null */
specialstr.43 = '&\V;' ; /* vertical tab */
specialstr.44 = '&\S;' ; /* space (from perl) */
specialstr.45 = '&\\;' ; /* backslash itself */
specialstr.0 = 45 ;
/* .........setting up defaults */
specialnums. = '' ; /* initializing the compound variable */
DO nindex = 1 TO specialstr.0
nvalue = nindex - 1 ;
nvalue = Right( nvalue, 3, '0' ) ;
specialnums.nindex = fampfhash || nvalue || fsemi ;
END nindex
/* Now the irregular stuff: */
specialnums.34 = fampfhash'009'fsemi ; /* alternate horizontal Tab */
specialnums.35 = fampfhash'009'fsemi ; /* horizontal tab */
specialnums.36 = fampfhash'013'fsemi ; /* carriage return */
specialnums.37 = fampfhash'011'fsemi ; /* form feed */
specialnums.38 = fampfhash'007'fsemi ; /* bell */
specialnums.39 = fampfhash'027'fsemi ; /* escape */
specialnums.40 = fampfhash'008'fsemi ; /* backspace */
specialnums.41 = fampfhash'010'fsemi ; /* linefeed ('newline') */
specialnums.42 = fampfhash'000'fsemi ; /* NUL */
specialnums.43 = fampfhash'011'fsemi ; /* Vertical Tab */
specialnums.44 = fampfhash'032'fsemi ; /* alternate space */
specialnums.45 = fampfhash'092'fsemi ; /* backslash */
specialnums.0 = 45 ;
spexialnum. = '' ; /* again, initializing the compound variable */
DO index = 1 TO specialstr.0
spexialstr.index = Translate( specialstr.index, '%', '&' ) ;
spexialnum.index = Strip( specialnums.index, 'L', famp ) ;
spexialnum.index = Strip( spexialnum.index, 'L', fhash ) ;
spexialnum.index = Strip( spexialnum.index, 'T', fsemi ) ;
spexialnum.index = D2X( spexialnum.index ) ;
spexialnum.index = Right( spexialnum.index, 2, '0' ) ;
spexialnum.index = fperc || spexialnum.index ;
END index
spexialnum.0 = specialchar.0 ;
spexialchar.0 = spexialnum.0 ;
/* .......finishing off specialnum's with some special cases */
/* so we can do things associatively !!! */
specialvector. = '' ; /* initializing these 2 compound variables */
spexialvector. = '' ;
DO index = 1 TO specialstr.0 ;
vindex = specialstr.index ;
specialvector.vindex = specialnums.index ;
vindex = spexialstr.index ;
spexialvector.vindex = spexialnum.index ;
END index ;
IF extendedtrans THEN
IF listsubs THEN
DO
SAY 'Extended translations (-x):' ;
SAY 'In free text:' ;
DO index = 1 TO specialstr.0
SAY ' 'Left( specialstr.index, 5 ) 'goes to',
Translate( specialnums.index, cleanout, dirtyin ) ;
END index
SAY 'Inside an URL:' ;
DO index = 1 TO specialstr.0
SAY ' 'Left( spexialstr.index, 5 ) 'goes to',
Translate( spexialnum.index, cleanout, dirtyin ) ;
END index
END
END /* extended translations */
IF backslashtrans THEN /* \code --> &#dec.number; in body */
DO /* \code --> %hex# in URL */
/* escaper = '\' ; /* the escape character */ */
specialesc. = '' ; /* initializing this compound variable */
specialesc.1 = escaper'NUL' ; /* and standard control chars. */
specialesc.2 = escaper'SOH' ; /* shifted off by 1 due to Rexx */
specialesc.3 = escaper'STX' ; /* convention vrs. ASCII values */
specialesc.4 = escaper'ETX' ;
specialesc.5 = escaper'EOT' ;
specialesc.6 = escaper'ENQ' ;
specialesc.7 = escaper'ACK' ;
specialesc.8 = escaper'BEL' ;
specialesc.9 = escaper'BS' ;
specialesc.10 = escaper'HT' ;
specialesc.11 = escaper'LF' ;
specialesc.12 = escaper'VT' ;
specialesc.13 = escaper'NP' ;
specialesc.14 = escaper'CR' ;
specialesc.15 = escaper'SO' ;
specialesc.16 = escaper'SI' ;
specialesc.17 = escaper'DLE' ;
specialesc.18 = escaper'DC1' ;
specialesc.19 = escaper'DC2' ;
specialesc.20 = escaper'DC3' ;
specialesc.21 = escaper'DC4' ;
specialesc.22 = escaper'NAK' ;
specialesc.23 = escaper'SYN' ;
specialesc.24 = escaper'ETB' ;
specialesc.25 = escaper'CAN' ;
specialesc.26 = escaper'EM' ;
specialesc.27 = escaper'SUB' ;
specialesc.28 = escaper'ESC' ;
specialesc.29 = escaper'FS' ;
specialesc.30 = escaper'GS' ;
specialesc.31 = escaper'RS' ;
specialesc.32 = escaper'US' ;
specialesc.33 = escaper'SP' ; /* and include space with the other
whitespace */
specialesc.34 = escaper'TAB' ; /* alternate tab symbol */
specialesc.35 = escaper'T' ; /* here define some common C world escapes */
specialesc.36 = escaper'R' ;
specialesc.37 = escaper'F' ; /* form feed */
specialesc.38 = escaper'A' ; /* bell */
specialesc.39 = escaper'E' ; /* escape */
specialesc.40 = escaper'B' ; /* backspace */
specialesc.41 = escaper'N' ; /* 'newline' = form feed */
specialesc.42 = escaper'0' ; /* null */
specialesc.43 = escaper'V' ; /* vertical tab */
specialesc.44 = escaper'S' ; /* space (from perl) */
specialesc.45 = escaper'\' ; /* backslash itself */
specialesc.46 = escaper || escaper ;
specialesc.0 = 46 ;
/* .........setting up defaults */
specialnums. = '' ; /* initializing a compound variable */
DO nindex = 1 TO specialesc.0
nvalue = nindex - 1 ;
nvalue = Right( nvalue, 3, '0' ) ;
specialnums.nindex = fampfhash || nvalue || fsemi ;
END nindex
/* Now the irregular stuff: */
specialnums.34 = fampfhash'009'fsemi ; /* alternate horizontal Tab */
specialnums.35 = fampfhash'009'fsemi ; /* horizontal tab */
specialnums.36 = fampfhash'013'fsemi ; /* carriage return */
specialnums.37 = fampfhash'011'fsemi ; /* form feed */
specialnums.38 = fampfhash'007'fsemi ; /* bell */
specialnums.39 = fampfhash'027'fsemi ; /* escape */
specialnums.40 = fampfhash'008'fsemi ; /* backspace */
specialnums.41 = fampfhash'010'fsemi ; /* linefeed ('newline') */
specialnums.42 = fampfhash'000'fsemi ; /* NUL */
specialnums.43 = fampfhash'011'fsemi ; /* Vertical Tab */
specialnums.44 = fampfhash'032'fsemi ; /* alternate space */
specialnums.45 = fampfhash'092'fsemi ; /* backslash */
specialnums.46 = fampfhash || Right( C2D( escaper ), 3, '0' ) || fsemi ;
specialnums.0 = 46 ;
spexialenum. = '' ; /* initializing a compound variable */
DO index = 1 TO specialesc.0
spexialenum.index = Strip( specialnums.index, 'L', famp ) ;
spexialenum.index = Strip( spexialenum.index, 'L', fhash ) ;
spexialenum.index = Strip( spexialenum.index, 'T', fsemi ) ;
spexialenum.index = D2X( spexialenum.index ) ;
spexialenum.index = Right( spexialenum.index, 2, '0' ) ;
spexialenum.index = fperc || spexialenum.index ;
END index
spexialenum.0 = specialchar.0 ;
/* .......finishing off specialnum's with some special cases */
/* so we can do things associatively !!! */
specialvectors. = '' ; /* initializing these 2 compound variables */
spexialevector. = '' ;
DO index = 1 TO specialesc.0 ;
vindex = specialesc.index ;
specialvectors.vindex = specialnums.index ;
spexialevector.vindex = spexialenum.index ;
END index ;
IF backslashtrans THEN
IF listsubs THEN
DO
SAY '"Backslash" translations (-l):' ;
SAY 'In free text:' ;
DO index = 1 TO specialesc.0
SAY ' 'Left( specialesc.index, 4 ) 'goes to',
Translate( specialnums.index, cleanout, dirtyin ) ;
END index
SAY 'Inside an URL:' ;
DO index = 1 TO specialesc.0
SAY ' 'Left( specialesc.index, 4 ) 'goes to',
Translate( spexialenum.index, cleanout, dirtyin ) ;
END index
END
END /* backslash escaped translations */
/* finished setting up data structures **********************/
IF listsubs THEN
EXIT ;
RETURN ;
/*
Dallas's home page
*/
Dallas E. Legan
mailto:aw585@lafn.org