/*************************************************************************/
/* gutcheck - check for assorted weirdnesses in a PG candidate text file */
/*                                                                       */
/* Version 0.95. Copyright 2001, 2002 Jim Tinsley <jtinsley@pobox.com>   */
/*                                                                       */
/* This program is free software; you can redistribute it and/or modify  */
/* it under the terms of the GNU General Public License as published by  */
/* the Free Software Foundation; either version 2 of the License, or     */
/* (at your option) any later version.                                   */
/*                                                                       */
/* This program is distributed in the hope that it will be useful,       */
/* but WITHOUT ANY WARRANTY; without even the implied warranty of        */
/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         */
/* GNU General Public License for more details.                          */
/*                                                                       */
/* You should have received a copy of the GNU General Public License     */
/* along with this program; if not, write to the                         */
/*      Free Software Foundation, Inc.,                                  */
/*      59 Temple Place,                                                 */
/*      Suite 330,                                                       */
/*      Boston, MA  02111-1307  USA                                      */
/*                                                                       */
/*                                                                       */
/*                                                                       */
/* Overview comments:                                                    */
/*                                                                       */
/* If you're reading this, you're either interested in how to detect     */
/* formatting errors, or very very bored.                                */
/*                                                                       */
/* Gutcheck is a homebrew formatting checker specifically for            */
/* spotting common formatting problems in a PG e-text. I typically       */
/* run it once or twice on a file I'm about to submit; it usually        */
/* finds a few formatting problems. It also usually finds lots of        */
/* queries that aren't problems at all; it _really_ doesn't like         */
/* the standard PG header, for example.  It's optimized for straight     */
/* prose; poetry and non-fiction involving tables tend to trigger        */
/* false alarms.                                                         */
/*                                                                       */
/* The code of gutcheck is not very interesting, but the experience      */
/* of what constitutes a possible error may be, and the best way to      */
/* illustrate that is by example.                                        */
/*                                                                       */
/*                                                                       */
/* Here are some common typos found in PG texts that gutcheck            */
/* will flag as errors:                                                  */
/*                                                                       */
/* "Look!John , over there!"                                             */
/* <this is a HTML tag>                                                  */
/* &so is this;                                                          */
/* Margaret said: " Now you should start for school."                    */
/* Margaret said: "Now you should start for school. (if end of para)     */
/* The horse is said to he worth a lot.                                  */
/* 0K - this'11 make you look close1y.                                   */
/*                                                                       */
/* There are some complications . The extra space left around that       */
/* period was an error . . . but that ellipsis wasn't.                   */
/*                                                                       */
/* The last line of a paragraph                                          */
/* is usually short.                                                     */
/*                                                                       */
/* This period is an error.But the periods in a.m. aren't.               */
/*                                                                       */
/* Checks that are do-able but not (well) implemented are:               */
/*     1. Spelling.                                                      */
/*          Gutcheck does some limited spelling checks,                  */
/*          but does not try to match a dedicated checker.               */
/*          I am playing around with models for a PG-friendly            */
/*          spelling-checker in gutcheck's baby brother gutspell.        */
/*          Maybe someday I'll integrate them.                           */
/*     2. Short lines.                                                   */
/*          Lots of short lines are deliberate, like tables of           */
/*          contents or letter headings. This is do-able, but way        */
/*          down the priority list, since I don't find the existing      */
/*          over-reporting to be an issue.                               */
/*     3. Single-quote checking.                                         */
/*          Despite 3 attempts at it, singlequote checking is still      */
/*          crap in gutcheck. Can you figure a better way to do it?      */
/*     4. Sentence punctuation and capitals.                             */
/*          It _is_ do-able, with acceptable accuracy, to check for      */
/*          "period-and-a-capital letter" at sentence ends, and this     */
/*          would be a worthwhile addition, since it's a fairly          */
/*          common problem.                                              */
/*                                                                       */
/* I'd also like to hear your ideas for more checks.                     */
/*                                                                       */
/*************************************************************************/
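
/*
 * Illustration only, not part of gutcheck. The overview notes that the
 * last line of a paragraph is usually short, so a short line is only
 * worth a query when the paragraph carries on after it. The sketch below
 * shows one way to express that test. The function name and parameters
 * are hypothetical, and the 55-column threshold is assumed to mirror
 * SHORTEST_PG_LINE.
 *
```c
#include <stddef.h>

// Return 1 if the candidate (previous) line looks suspiciously short.
// cur_len:    length of the line just read
// prev_len:   length of the candidate line
// before_len: length of the line before the candidate
// prev_start: first character of the candidate line
int short_line_query(size_t cur_len, size_t prev_len,
                     size_t before_len, char prev_start)
{
    const size_t shortest = 55;   // assumed, mirrors SHORTEST_PG_LINE
    return cur_len > 2                          // the paragraph continues
        && prev_len > 2 && prev_len < shortest  // short, but not blank
        && before_len > shortest                // preceded by a full line
        && prev_start != ' ';                   // not an indented line
}
```
 *
 * For example, short_line_query(60, 30, 70, 'T') returns 1, while
 * short_line_query(0, 30, 70, 'T') returns 0 because the short line is
 * followed by a blank line and so ends its paragraph.
 */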


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAXWORDLEN    50    /* max length of one word             */
#define LINEBUFSIZE 2048    /* buffer size for an input line      */

char aline[LINEBUFSIZE];
char prevline[LINEBUFSIZE];
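
/*
 * Illustration only, not part of gutcheck. The typo[] and okword[] tables
 * in this file each end with an empty string, which acts as a sentinel so
 * that no separate element count is needed. A minimal sketch of a lookup
 * against such a table follows. The helper name in_wordlist is
 * hypothetical, and the caller is assumed to have lowercased the word.
 *
```c
#include <string.h>

// Return 1 if word appears in list, where list is terminated by "".
int in_wordlist(const char *word, const char *const *list)
{
    int i;
    for (i = 0; *list[i]; i++)          // stop at the empty-string sentinel
        if (strcmp(word, list[i]) == 0)
            return 1;
    return 0;
}
```
 *
 * A linear scan like this is adequate for tables of a few hundred words;
 * a sorted table with bsearch() would be the obvious next step if the
 * lists grew much larger.
 */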

char *typo[] = { "teh", "th", "og", "fi", "ro", "adn", "yuo", "ot", "fo", "thet", "ane",
                "th", "te", "ig", "acn",  "ahve", "alot", "anbd", "andt", "awya", "aywa", "bakc", "om",
                "btu", "byt", "cna", "cxan", "coudl", "dont", "didnt", "couldnt", "wouldnt", "doesnt", "shouldnt", "doign", "ehr",
                "hmi", "hse", "esle", "eyt", "fitrs", "firts", "foudn", "frmo", "fromt", "fwe", "gaurd", "gerat", "goign",
                "gruop", "haev", "hda", "hearign", "seeign", "sayign", "herat", "hge", "hlep", "hsa", "hsi", "hte", "htere",
                "htese", "htey", "htis", "hvae", "hwich", "idae", "ihs", "iits", "int", "iwll", "iwth", "jsut", "loev",
                "sefl", "myu", "nkow", "nver", "nwe", "nwo", "ocur", "ohter", "omre", "onyl", "otehr", "otu", "owrk",
                "owuld", "peice", "peices", "peolpe", "peopel", "perhasp", "perhpas", "pleasent", "poeple", "porblem",
                "porblems", "rwite", "saidt", "saidh", "saids", "seh", "smae", "smoe", "sohw", "stnad", "stopry",
                "stoyr", "stpo", "tahn", "taht", "tath", "tehy", "tghe", "tghis", "theri", "theyll", "thgat", "thge",
                "thier", "thna", "thne", "thnig", "thnigs", "thsi", "thsoe", "thta", "timne", "tirne", "tje", "tjhe", "tkae",
                "tthe", "tyhat", "tyhe", "veyr", "vou", "vour", "vrey", "waht", "wasnt", "awtn", "watn", "wehn", "whic", "whcih",
                "whihc", "whta", "wihch", "wief", "wiht", "witha", "wiull", "wnat", "wnated", "wnats",
                "woh", "wohle", "wokr", "woudl", "wriet", "wrod", "wroet", "wroking", "wtih", "wuould", "wya", "yera",
                "yeras", "yersa", "yoiu", "youve", "ytou", "yuor",
                /* added h/b words for version 12 - removed a few with "tbe" v.25 */
                "abead","ahle","ahout","ahove","altbough","balf","bardly","bas","bave","baving","bebind",
                "beld","belp","belped","ber","bere","bim","bis","bome","bouse","bowever","buge","dehates",
                "deht","han","hecause","hecome","heen","hefore","hegan","hegin","heing",
                "helieve","henefit","hetter","hetween","heyond","hig","higber","huild","huy","hy","jobn","joh",
                "meanwbile","memher","memhers","notbing","numher","numhers",
                "perbaps","prohlem","puhlic","scbool","scbools","sometbing","witbin","witbout",
                /* and a few more for .18 */
                "arn", "hin", "hirn", "wrok", "wroked", "amd", "aud", "prornise", "prornised","modem","bo",
                 ""};

/* Common abbreviations and other OK words not to query as typos. */
char *okword[] = {"mr", "mrs", "ms", "mss", "mssrs", "ft", "pm", "st", "dr", "hmm", "h'm", "hmmm", "rd", "sh", "br",
                  "pp", "hm", "cf", "jr", "sr", "vs", "lb", "lbs", "ltd", "pompeii","hawaii","hawaiian",
                  ""};

/* ---- list of special characters ---- */
#define CHAR_SPACE        32
#define CHAR_TAB           9
#define CHAR_LF           10
#define CHAR_CR           13
#define CHAR_DQUOTE       34
#define CHAR_SQUOTE       39
#define CHAR_OPEN_SQUOTE  96
#define CHAR_TILDE       126
#define CHAR_ASTERISK     42

#define CHAR_OPEN_CBRACK   '{'
#define CHAR_CLOSE_CBRACK  '}'
#define CHAR_OPEN_RBRACK   '('
#define CHAR_CLOSE_RBRACK  ')'
#define CHAR_OPEN_SBRACK   '['
#define CHAR_CLOSE_SBRACK  ']'





/* ---- longest and shortest normal PG line lengths ----*/
#define LONGEST_PG_LINE   75
#define WAY_TOO_LONG      80
#define SHORTEST_PG_LINE  55

#define SWITCHES "ESTPXLOYHWV" /* switches:-                            */
                               /*     E - echo queried line, default on */
                               /*         -E turns it off               */
                               /*     S - check single quotes           */
                               /*     T - check common typos            */
                               /*     P - require closure of quotes on  */
                               /*         every paragraph               */
                               /*     X - "Trust no one" :-) Paranoid!  */
                               /*         Queries everything; default on*/
                               /*         -X turns it off               */
                               /*     L - line end checking defaults on */
                               /*         -L turns it off               */
                               /*     O - overview. Just shows counts.  */
                               /*     Y - puts errors to stdout         */
                               /*         instead of stderr             */
                               /*     H - Echoes header fields          */
                               /*     W - Defaults for use on Web upload*/
                               /*     V - Verbose - list EVERYTHING!    */
#define SWITNO 11              /* max number of switch parms            */
                               /*        - used for defining array-size */
#define MINARGS   1            /* minimum no of args excl switches      */
#define MAXARGS   1            /* maximum no of args excl switches      */

int pswit[SWITNO];             /* program switches set by SWITCHES      */
#define ECHO_SWITCH      0
#define SQUOTE_SWITCH    1
#define TYPO_SWITCH      2
#define QPARA_SWITCH     3
#define PARANOID_SWITCH  4
#define LINE_END_SWITCH  5
#define OVERVIEW_SWITCH  6
#define STDOUT_SWITCH    7
#define HEADER_SWITCH    8
#define WEB_SWITCH       9
#define VERBOSE_SWITCH   10
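
/*
 * Illustration only, not part of gutcheck. Each *_SWITCH index above is
 * the position of its letter within SWITCHES, so a switch letter can be
 * mapped to its pswit[] slot by searching that string. A minimal sketch,
 * with a hypothetical helper name, that also reports unknown letters:
 *
```c
#include <ctype.h>
#include <string.h>

// Return the index of switch letter c within letters, or -1 if unknown.
int switch_index(const char *letters, char c)
{
    const char *p;
    if (c == '\0')
        return -1;    // strchr would otherwise match the terminator
    p = strchr(letters, toupper((unsigned char)c));
    return p ? (int)(p - letters) : -1;
}
```
 *
 * With the string "ESTPXLOYHWV", switch_index(letters, 'x') gives 4,
 * which matches PARANOID_SWITCH, and an unlisted letter such as 'q'
 * gives -1.
 */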



long cnt_dquot;       /* for overview mode, count of doublequote queries */
long cnt_squot;       /* for overview mode, count of singlequote queries */
long cnt_brack;       /* for overview mode, count of brackets queries */
long cnt_bin;         /* for overview mode, count of non-ASCII queries */
long cnt_odd;         /* for overview mode, count of odd character queries */
long cnt_long;        /* for overview mode, count of long line errors */
long cnt_short;       /* for overview mode, count of short line queries */
long cnt_punct;       /* for overview mode, count of punctuation and spacing queries */
long cnt_dash;        /* for overview mode, count of dash-related queries */
long cnt_word;        /* for overview mode, count of word queries */
long cnt_html;        /* for overview mode, count of html queries */
long cnt_lineend;     /* for overview mode, count of line-end queries */
long cnt_spacend;     /* count of lines with space at end  V .21 */
long linecnt;         /* count of total lines in the file */
long checked_linecnt; /* count of lines actually gutchecked V .26 */

void proghelp(void);
void procfile(char *);

int mixdigit(char *);
char *getaword(char *, char *);
int matchword(char *, char *);
char *flgets(char *, int, FILE *, long);
void lowerit(char *);
int gcisalpha(unsigned char);
int gcisdigit(unsigned char);
char *gcstrchr(char *s, char c);

char wrk[LINEBUFSIZE];
#define MAX_QWORD           40                                        
#define MAX_QWORD_LENGTH    10                                        
char qword[MAX_QWORD][MAX_QWORD_LENGTH];
signed int dupcnt[MAX_QWORD];



int main(int argc, char **argv)
{
    char *argsw;
    int i, switno, invarg;


    switno = strlen(SWITCHES);
    for (i = switno ; --i >= 0 ; )
        pswit[i] = 0;           /* initialise switches */

    /* Standard loop to extract switches.                   */
    /* When we come out of this loop, the arguments will be */
    /* in argv[0] upwards and the switches used will be     */
    /* represented by their equivalent elements in pswit[]  */
    while ( --argc > 0 && **++argv == '-')
        for (argsw = argv[0]+1; *argsw !='\0'; argsw++)
            for (i = switno, invarg = 1; (--i >= 0) && invarg == 1 ; )
                if (toupper((unsigned char)*argsw) == SWITCHES[i] ) {
                    invarg = 0;
                    pswit[i] = 1;
                    }

    pswit[PARANOID_SWITCH] ^= 1;         /* Paranoid checking is turned OFF, not on, by its switch */

    if (pswit[PARANOID_SWITCH]) {                         /* if running in paranoid mode */
        pswit[TYPO_SWITCH] = pswit[TYPO_SWITCH] ^ 1;      /* typo checks default ON here, so -T turns them off */
        }                                                 /* v.20 removed s and p switches from paranoid mode */

    pswit[LINE_END_SWITCH] ^= 1;         /* Line-end checking is turned OFF, not on, by its switch */
    pswit[ECHO_SWITCH] ^= 1;             /* V.21 Echoing is turned OFF, not on, by its switch      */

    if (pswit[OVERVIEW_SWITCH])       /* just print summary; don't echo */
        pswit[ECHO_SWITCH] = 0;

    /* Web uploads - for the moment, this is really just a placeholder
       until we decide what processing we really want to do on web uploads */
    if (pswit[WEB_SWITCH]) {          /* specific override for web uploads */
        pswit[ECHO_SWITCH] =     1;
        pswit[SQUOTE_SWITCH] =   0;
        pswit[TYPO_SWITCH] =     1;
        pswit[QPARA_SWITCH] =    0;
        pswit[PARANOID_SWITCH] = 1;
        pswit[LINE_END_SWITCH] = 0;
        pswit[OVERVIEW_SWITCH] = 0;
        pswit[STDOUT_SWITCH] =   0;
        pswit[HEADER_SWITCH] =   1;
        pswit[VERBOSE_SWITCH] =  0;
        }


    if (argc < MINARGS || argc > MAXARGS) {  /* check number of args */
        proghelp();
        return(1);            /* exit */
        }

    fprintf(stderr, "gutcheck: Check and report on an e-text\n");

    cnt_dquot = cnt_squot = cnt_brack = cnt_bin = cnt_odd = cnt_long =
    cnt_short = cnt_punct = cnt_dash = cnt_word = cnt_html = cnt_lineend =
    cnt_spacend = 0;

    procfile(argv[0]);

    if (pswit[OVERVIEW_SWITCH]) {
                         printf("    Checked %ld lines of %ld (head+foot = %ld)\n\n",
                            checked_linecnt, linecnt, linecnt - checked_linecnt);
                         printf("    --------------- Queries found --------------\n");
        if (cnt_long)    printf("    Long lines:                             %5ld\n",cnt_long);
        if (cnt_short)   printf("    Short lines:                            %5ld\n",cnt_short);
        if (cnt_lineend) printf("    Line-end problems:                      %5ld\n",cnt_lineend);
        if (cnt_word)    printf("    Common typos:                           %5ld\n",cnt_word);
        if (cnt_dquot)   printf("    Unmatched quotes:                       %5ld\n",cnt_dquot);
        if (cnt_squot)   printf("    Unmatched SingleQuotes:                 %5ld\n",cnt_squot);
        if (cnt_brack)   printf("    Unmatched brackets:                     %5ld\n",cnt_brack);
        if (cnt_bin)     printf("    Non-ASCII characters:                   %5ld\n",cnt_bin);
        if (cnt_odd)     printf("    Proofing characters:                    %5ld\n",cnt_odd);
        if (cnt_punct)   printf("    Punctuation & spacing queries:          %5ld\n",cnt_punct);
        if (cnt_dash)    printf("    Non-standard dashes:                    %5ld\n",cnt_dash);
        if (cnt_html)    printf("    Possible HTML tags:                     %5ld\n",cnt_html);
        printf("\n");
        printf("    TOTAL QUERIES                           %5ld\n",
            cnt_dquot + cnt_squot + cnt_brack + cnt_bin + cnt_odd + cnt_long +
            cnt_short + cnt_punct + cnt_dash + cnt_word + cnt_html + cnt_lineend);
        }

    return(0);
}
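
/*
 * Illustration only, not part of gutcheck. The first pass in procfile()
 * sorts each "--" by whether it has a space on either side: none is the
 * PG style, both sides is the non-PG style. A minimal sketch of that
 * classification follows. The function name is hypothetical, and like
 * the first pass it looks only at the first "--" on the line.
 *
```c
#include <string.h>

// Classify the first "--" on a line by the spaces around it.
// Returns -1 if there is no "--", otherwise the number of adjacent
// spaces: 0 (PG style), 1 (one side only) or 2 (both sides).
int emdash_spacing(const char *line)
{
    const char *dash = strstr(line, "--");
    int spaces = 0;
    if (dash == NULL)
        return -1;
    if (dash > line && dash[-1] == ' ')
        spaces++;             // never look before the start of the line
    if (dash[2] == ' ')       // dash[2] is at worst the terminating NUL
        spaces++;
    return spaces;
}
```
 *
 * So emdash_spacing("wait--what") is 0, emdash_spacing("wait -- what")
 * is 2, and a line that begins with "--" is handled without reading
 * before the buffer.
 */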



/* procfile - process one file */

void procfile(char *filename)
{

    char *s, *t, laststart;
    char inword[MAXWORDLEN], testword[MAXWORDLEN];
    char parastart[81];     /* first line of current para */
    FILE *infile;
    long quot, squot, firstline, alphalen, totlen, binlen,
         shortline, longline, verylongline, spacedash, emdash,
         space_emdash, non_PG_space_emdash, PG_space_emdash,
         footerline, dotcomma, start_para_line;
    long spline, nspline;
    signed int i, j, llen, isemptyline, isacro, isellipsis, istypo, alower,
         eNon_A, eTab, eTilde, eAst;
    signed int warn_short, warn_long, warn_bin, warn_dash, warn_dotcomma;
    unsigned int lastlen, lastblen;
    signed int s_brack, c_brack, r_brack;
    signed int open_single_quote, close_single_quote, guessquote;
    signed int isnewpara, vowel, consonant;
    char dquote_err[80], squote_err[80], rbrack_err[80], sbrack_err[80], cbrack_err[80];
    signed int qword_index, isdup;
    signed int enddash;



    laststart = CHAR_SPACE;
    lastlen = lastblen = 0;
    *dquote_err = *squote_err = *rbrack_err = *cbrack_err = *sbrack_err = *prevline = 0;
    linecnt = firstline = alphalen = totlen = binlen =
        shortline = longline = spacedash = emdash = checked_linecnt =
        space_emdash = non_PG_space_emdash = PG_space_emdash =
        footerline = dotcomma = start_para_line = 0;
    quot = squot = s_brack = c_brack = r_brack = 0;
    i = llen = isemptyline = isacro = isellipsis = istypo = 0;
    warn_short = warn_long = warn_bin = warn_dash = warn_dotcomma = 0;
    isnewpara = vowel = consonant = 0;
    spline = nspline = 0;
    qword_index = isdup = 0;
    *inword = *testword = 0;
    open_single_quote = close_single_quote = guessquote = 0;


    for (j = 0; j < MAX_QWORD; j++) {
        dupcnt[j] = 0;
        for (i = 0; i < MAX_QWORD_LENGTH; i++)
            qword[j][i] = 0;    /* qword is [MAX_QWORD][MAX_QWORD_LENGTH] */
        }


    if ((infile = fopen(filename, "rb")) == NULL) {
        if (pswit[STDOUT_SWITCH])
            fprintf(stdout, "gutcheck: cannot open %s\n", filename);
        else
            fprintf(stderr, "gutcheck: cannot open %s\n", filename);
        exit(1);
        }

    fprintf(stdout, "\n\nFile: %s\n\n", filename);
    firstline = shortline = longline = verylongline = 0;


    /*****************************************************/
    /*                                                   */
    /*  Run a first pass - verify that it's a valid PG   */
    /*  file, decide whether to report some things that  */
    /*  occur many times in the text like long or short  */
    /*  lines, non-standard dashes, and other good stuff */
    /*  I'll doubtless think of later.                   */
    /*                                                   */
    /*****************************************************/

    /*****************************************************/
    /* V.24  Sigh. Yet Another Header Change             */
    /*****************************************************/

    while (fgets(aline, LINEBUFSIZE-1, infile)) {
        llen = strlen(aline);   /* strip trailing CR/LF without indexing before the buffer */
        while (llen > 0 && (aline[llen-1] == CHAR_LF || aline[llen-1] == CHAR_CR)) aline[--llen] = 0;
        linecnt++;
        if (strstr(aline, "*END") && strstr(aline, "SMALL PRINT") && (strstr(aline, "PUBLIC DOMAIN") || strstr(aline, "COPYRIGHT"))) {
            if (spline)
                printf("   --> Duplicate header?\n");
            spline = linecnt + 1;   /* first line of non-header text, that is */
            }
        if (!strncmp(aline, "*** START", 9) && strstr(aline, "PROJECT GUTENBERG")) {
            if (nspline)
                printf("   --> Duplicate header?\n");
            nspline = linecnt + 1;   /* first line of non-header text, that is */
            }
        if (spline || nspline) {
            lowerit(aline);
            if (strstr(aline, "end") && strstr(aline, "project gutenberg")) {
                if (strstr(aline, "end") < strstr(aline, "project gutenberg")) {
                    if (footerline) {
                        if (!nspline) /* it's an old-form header - we can detect duplicates */
                            printf("   --> Duplicate footer?\n");
                        else 
                            ;
                        }
                    else {
                        footerline = linecnt;
                        }
                    }
                }
            }
        if (spline) firstline = spline;
        if (nspline) firstline = nspline;  /* override with new */

        llen = strlen(aline);
        totlen += llen;
        for (i = 0; i < llen; i++) {
            if ((unsigned char)aline[i] > 127) binlen++;
            if (gcisalpha(aline[i])) alphalen++;
            }
        if (strlen(aline) > 2
            && lastlen > 2 && lastlen < SHORTEST_PG_LINE
            && lastblen > 2 && lastblen > SHORTEST_PG_LINE
            && laststart != CHAR_SPACE)
                shortline++;

        if (*aline)
            if ((unsigned char)aline[strlen(aline)-1] <= CHAR_SPACE) cnt_spacend++;   /* cast so 8-bit letters don't count as space */

        if (strstr(aline, ".,")) dotcomma++;

        if (llen > LONGEST_PG_LINE) longline++;
        if (llen > WAY_TOO_LONG) verylongline++;
        /* Check for spaced em-dashes */
        if (strstr(aline,"--")) {
            char *dash = strstr(aline, "--");
            /* a "--" at the very start of the line has no character before it */
            int sp_before = (dash > aline && *(dash-1) == CHAR_SPACE);
            int sp_after  = (*(dash+2) == CHAR_SPACE);
            emdash++;
            if (sp_before || sp_after)
                    space_emdash++;
            if (sp_before && sp_after)
                    non_PG_space_emdash++;             /* count of em-dashes with spaces both sides */
            if (!sp_before && !sp_after)
                    PG_space_emdash++;                 /* count of PG-type em-dashes with no spaces */
            }


        /* Check for spaced dashes */
        if (strstr(aline," -"))
            if (*(strstr(aline, " -")+2) != '-')
                    spacedash++;
        lastblen = lastlen;
        lastlen = strlen(aline);
        laststart = aline[0];

        }
    fclose(infile);


    /* now, based on this quick view, make some snap decisions */
    if (cnt_spacend > 0) {
        printf("   --> %ld lines in this file have white space at end\n", cnt_spacend);
        }

    warn_dotcomma = 1;
    if (dotcomma > 5) {
        warn_dotcomma = 0;
        printf("   --> %ld lines in this file contain '.,'. Not reporting them.\n", dotcomma);
        }

    /* if more than 50 lines, or one-tenth, are short, don't bother reporting them */
    warn_short = 1;
    if (shortline > 50 || shortline * 10 > linecnt) {
        warn_short = 0;
        printf("   --> %ld lines in this file are short. Not reporting short lines.\n", shortline);
        }

    /* if more than 50 lines, or one-tenth, are long, don't bother reporting them */
    warn_long = 1;
    if (longline > 50 || longline * 10 > linecnt) {
        warn_long = 0;
        printf("   --> %ld lines in this file are long. Not reporting long lines.\n", longline);
        }

    if (verylongline > 0) {
        printf("   --> %ld lines in this file are VERY long!\n", verylongline);
        }

    /* If there are more non-PG spaced dashes than PG em-dashes,    */
    /* assume it's deliberate                                       */
    /* Current PG guidelines say don't use them, but older texts do,*/
    /* and some people insist on them whatever the guidelines say.  */
    /* V.20 removed requirement that PG_space_emdash be greater than*/
    /* ten before turning off warnings about spaced dashes.         */
    warn_dash = 1;
    if (spacedash + non_PG_space_emdash > PG_space_emdash) {
        warn_dash = 0;
        printf("   --> There are %ld spaced dashes and em-dashes. Not reporting them.\n", spacedash + non_PG_space_emdash);
        }

    /* if more than a quarter of characters are hi-bit, bug out */
    warn_bin = 1;
    if (binlen * 4 > totlen) {
        printf("   --> This file does not appear to be ASCII. Terminating. Best of luck with it!\n");
        exit(1);
        }
    if (alphalen * 4 < totlen) {
        printf("   --> This file does not appear to be text. Terminating. Best of luck with it!\n");
        exit(1);
        }
    if ((binlen * 100 > totlen) || (binlen > 200)) {
        printf("   --> There are a lot of foreign letters here. Not reporting them.\n");
        warn_bin = 0;
        }
    if (firstline && footerline)
        printf("    The PG header and footer appear to be already on.\n");
    else {
        if (firstline)
            printf("    The PG header is on - no footer.\n");
        if (footerline)
            printf("    The PG footer is on - no header.\n");
        }
    printf("\n");

    /* V.22 George Davis asked for an override switch to force it to list everything */
    if (pswit[VERBOSE_SWITCH]) {
        warn_bin = 1;
        warn_short = 1;
        warn_dotcomma = 1;
        warn_long = 1;
        warn_dash = 1;
        printf("   *** Verbose output is ON -- you asked for it! ***\n");
        }

    if ((infile = fopen(filename, "rb")) == NULL) {
        if (pswit[STDOUT_SWITCH])
            fprintf(stdout, "gutcheck: cannot open %s\n", filename);
        else
            fprintf(stderr, "gutcheck: cannot open %s\n", filename);
        exit(1);
        }

    if (footerline > 0 && firstline > 0 && footerline > firstline && footerline - firstline < 100) { /* ugh */
        printf("   --> I don't really know where this text starts. \n");
        printf("       There are no reference points.\n");
        printf("       I'm going to have to report the header and footer as well.\n");
        firstline=0;
        }
        


    /*****************************************************/
    /*                                                   */
    /* Here we go with the main pass. Hold onto yer hat! */
    /*                                                   */
    /*****************************************************/

    /* Re-init some variables we've dirtied */
    quot = squot = linecnt = 0;
    laststart = CHAR_SPACE;
    lastlen = lastblen = 0;

    while (flgets(aline, LINEBUFSIZE-1, infile, linecnt+1)) {
        linecnt++;
        if (linecnt < firstline || (footerline > 0 && linecnt > footerline)) {
            if (pswit[HEADER_SWITCH]) {
                if (!strncmp(aline, "Title:", 6))
                    printf("    %s\n", aline);
                if (!strncmp (aline, "Author:", 7))
                    printf("    %s\n", aline);
                if (!strncmp(aline, "Release Date:", 13))
                    printf("    %s\n", aline);
                if (!strncmp(aline, "Edition:", 8))
                    printf("    %s\n\n", aline);
                }
            continue;                /* skip through the header */
            }
        checked_linecnt++;
        s = aline;
        isemptyline = 1;      /* assume the line is empty until proven otherwise */

        /* If we are in a state of unbalanced quotes, and this line    */
        /* doesn't begin with a quote, output the stored error message */
        /* If the -P switch was used, print the warning even if the    */
        /* new para starts with quotes                                 */
        /* Version .20 - if the new paragraph does start with a quote, */
        /* but is indented, I was giving a spurious error. Need to     */
        /* check the first _non-space_ character on the line rather    */
        /* than the first character when deciding whether the para     */
        /* starts with a quote. Using *t for this.                     */
        t = s;
        while (*t == ' ') t++;
        if (*dquote_err)
            if (*t != CHAR_DQUOTE || pswit[QPARA_SWITCH]) {
                if (!pswit[OVERVIEW_SWITCH]) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", parastart);
                    printf("%s", dquote_err);
                    }
                else
                    cnt_dquot++;
            }
        if (*squote_err) {
            if ((*t != CHAR_SQUOTE && *t != CHAR_OPEN_SQUOTE) || pswit[QPARA_SWITCH] || squot) {
                if (!pswit[OVERVIEW_SWITCH]) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", parastart);
                    printf("%s", squote_err);
                    }
                else
                    cnt_squot++;
                }
            squot = 0;
            }
        if (*rbrack_err) {
            if (!pswit[OVERVIEW_SWITCH]) {
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", parastart);
                printf("%s", rbrack_err);
                }
            else
                cnt_brack++;
            }
        if (*sbrack_err) {
            if (!pswit[OVERVIEW_SWITCH]) {
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", parastart);
                printf("%s", sbrack_err);
                }
            else
                cnt_brack++;
            }
        if (*cbrack_err) {
            if (!pswit[OVERVIEW_SWITCH]) {
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", parastart);
                printf("%s", cbrack_err);
                }
            else
                cnt_brack++;
            }

        *dquote_err = *squote_err = *rbrack_err = *cbrack_err = *sbrack_err = 0;


        /* look along the line, accumulate the count of quotes, and see */
        /* if this is an empty line - i.e. a line with nothing on it    */
        /* but spaces.                                                  */
        /* V .12 also if line has just spaces, * and/or - on it, don't  */
        /* count it, since empty lines with asterisks or dashes to      */
        /* separate sections are common.                                */
        /* V .15 new single-quote checking - has to be better than the  */
        /* previous version, but how much better? fingers crossed!      */
        /* V .20 add period to * and - as characters on a separator line*/
        s = aline;
        while (*s) {
            if (*s == CHAR_DQUOTE) quot++;
            if (*s == CHAR_SQUOTE || *s == CHAR_OPEN_SQUOTE)
                if (s == aline) { /* at start of line, it can only be an openquote */
                    if (strncmp(s+1, "tis", 3) && strncmp(s+1, "Tis", 3)) /* hardcode a very common exception! */
                        open_single_quote++;
                    }
                else
                    if (gcisalpha(*(s-1)) && gcisalpha(*(s+1)))
                        ; /* do nothing! - it's definitely an apostrophe, not a quote */
                    else        /* it's outside a word - let's check it out */
                        if (*s == CHAR_OPEN_SQUOTE || gcisalpha(*(s+1))) { /* it damwell better BE an openquote */
                            if (strncmp(s+1, "tis", 3) && strncmp(s+1, "Tis", 3)) /* hardcode a very common exception! */
                                open_single_quote++;
                            }
                        else { /* now - is it a closequote? */
                            guessquote = 0;   /* accumulate clues */
                            if (gcisalpha(*(s-1))) { /* it follows a letter - could be either */
                                guessquote += 1;
                                if (*(s-1) == 's') { /* looks like a plural apostrophe */
                                    guessquote -= 3;
                                    if (*(s+1) == CHAR_SPACE)  /* bonus marks! */
                                        guessquote -= 2;
                                    }
                                }
                            else /* it doesn't have a letter either side */
                                if (strchr(".?!,;:", *(s-1)) && (strchr(".?!,;: ", *(s+1))))
                                    guessquote += 8; /* looks like a closequote */
                                else
                                    guessquote += 1;
                            if (open_single_quote > close_single_quote)
                                guessquote += 1; /* give it the benefit of some doubt - if a squote is already open */
                            else
                                guessquote -= 1;
                            if (guessquote >= 0)
                                close_single_quote++;
                            }

            if (*s != CHAR_SPACE
                && *s != '-'
                && *s != '.'
                && *s != CHAR_ASTERISK
                && *s != 13
                && *s != 10) isemptyline = 0;  /* ignore lines like  *  *  *  as spacers */
            if (*s == CHAR_OPEN_CBRACK) c_brack++;
            if (*s == CHAR_CLOSE_CBRACK) c_brack--;
            if (*s == CHAR_OPEN_RBRACK) r_brack++;
            if (*s == CHAR_CLOSE_RBRACK) r_brack--;
            if (*s == CHAR_OPEN_SBRACK) s_brack++;
            if (*s == CHAR_CLOSE_SBRACK) s_brack--;
            s++;
            }

        if (isnewpara && !isemptyline) {   /* This line is the start of a new paragraph */
            start_para_line = linecnt;
            strncpy(parastart, aline, 80); /* Capture its first line in case we want to report it later */
            parastart[79] = 0;
            s = aline;
            while (*s && !gcisalpha(*s) && !gcisdigit(*s)) s++;   /* stop at end of line if there is no letter or digit */
            if (*s >= 'a' && *s <='z') { /* and its first letter is lowercase */
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                if (!pswit[OVERVIEW_SWITCH])
                    printf("    Line %ld - Paragraph starts with lower-case\n", linecnt);
                else
                    cnt_punct++;
                }
            isnewpara = 0; /* Signal the end of new para processing */
            }

        /* Check for an em-dash broken at line end */
        if (enddash && *aline == '-') {
            if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
            if (!pswit[OVERVIEW_SWITCH])
                printf("    Line %ld - Broken em-dash?\n", linecnt);
            else
                cnt_punct++;
            }
        enddash = 0;
        if (*aline) {   /* nothing to scan on an empty line */
            for (s = aline + strlen(aline) - 1; s > aline && *s == ' '; s--);
            if (*s == '-')
                enddash = 1;
            }

        /* Check for invalid or questionable characters in the line */
        /* Anything above 126 (DEL and 8-bit codes) is flagged, and */
        /* non-printable control characters should also be flagged. */
        /* Tabs should generally not be there.                      */
        if (warn_bin) {
            eNon_A = eTab = eTilde = eAst = 0;  /* don't repeat multiple warnings on one line */
            for (s = aline; *s; s++) {
                if (!eNon_A && ((*s < CHAR_SPACE && *s != 9 && *s != '\n') || (unsigned char)*s > 126)) {
                    i = *s;                           /* annoying kludge for signed chars */
                    if (i < 0) i += 256;
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Non-ASCII character %d\n", linecnt, i);
                    else
                        cnt_bin++;
                    eNon_A = 1;
                    }
                if (!eTab && *s == CHAR_TAB) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Tab character?\n", linecnt);
                    else
                        cnt_odd++;
                    eTab = 1;
                    }
                if (!eTilde && *s == CHAR_TILDE) {  /* often used by OCR software to indicate an unrecognizable character */
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Tilde character?\n", linecnt);
                    else
                        cnt_odd++;
                    eTilde = 1;
                    }
                /* report asterisks only in paranoid mode, since they're often deliberate */
                if (!eAst && pswit[PARANOID_SWITCH] && !isemptyline && *s == CHAR_ASTERISK) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Asterisk?\n", linecnt);
                    else
                        cnt_odd++;
                    eAst = 1;
                    }
                }
            }

        /* Check for line too long */
        if (warn_long) {
            if (strlen(aline) > LONGEST_PG_LINE) {
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                if (!pswit[OVERVIEW_SWITCH])
                    printf("    Line %ld - Long line %d\n", linecnt, (int)strlen(aline));
                else
                    cnt_long++;
                }
            }

        /* Check for line too short.                                     */
        /* This one is a bit trickier to implement: we don't want to     */
        /* flag the last line of a paragraph for being short, so we      */
        /* have to wait until we know that our current line is a         */
        /* "normal" line, then report the _previous_ line if it was too  */
        /* short. We also don't want to report indented lines like       */
        /* chapter heads or formatted quotations. We therefore keep      */
        /* lastlen as the length of the last line examined, and          */
        /* lastblen as the length of the last but one, and try to        */
        /* suppress unnecessary warnings by checking that both were of   */
        /* "normal" length. We keep the first character of the last      */
        /* line in laststart, and if it was a space, we assume that the  */
        /* formatting is deliberate. I can't figure out a way to         */
        /* distinguish something like a quoted verse left-aligned or     */
        /* the header or footer of a letter from a paragraph of short    */
        /* lines - maybe if I examined the whole paragraph, and if the   */
        /* para has less than, say, 8 lines and if all lines are short,  */
        /* then just assume it's OK? Need to look at some texts to see   */
        /* how often a formula like this would get the right result.     */
        if (warn_short) {
            if (strlen(aline) > 2
                && lastlen > 2 && lastlen < SHORTEST_PG_LINE
                && lastblen > 2 && lastblen > SHORTEST_PG_LINE
                && laststart != CHAR_SPACE) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", prevline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Short line %d?\n", linecnt-1, (int)strlen(prevline));
                    else
                        cnt_short++;
                    }
            }
        lastblen = lastlen;
        lastlen = strlen(aline);
        laststart = aline[0];

        /* look for punctuation at start of line */
        if  (*aline && strchr(".?!,;:",  aline[0]))  {            /* if it's punctuation */
            if (aline[1] != ' ' || aline[2] != '.') {   /* exception for ellipsis */
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                if (!pswit[OVERVIEW_SWITCH])
                    printf("    Line %ld - Begins with punctuation?\n", linecnt);
                else
                    cnt_punct++;
                }
            }

        /* Check for spaced em-dashes */
        /* V.20 must check _all_ occurrences of "--" on the line */
        /* hence the loop - even if the first double-dash is OK  */
        /* there may be another that's wrong later on.           */
        if (warn_dash) {
            s = aline;
            while (strstr(s,"--")) {
                s = strstr(s,"--");
                /* don't look at the character before the start of the line */
                if ((s > aline && *(s-1) == CHAR_SPACE) ||
                   (*(s+2) == CHAR_SPACE)) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Spaced em-dash?\n", linecnt);
                    else
                        cnt_dash++;
                    }
                s += 2;
                }
            }

        /* Check for spaced dashes */
        if (warn_dash)
            if (strstr(aline," -")) {
                if (*(strstr(aline, " -")+2) != '-') {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Spaced dash?\n", linecnt);
                    else
                        cnt_dash++;
                    }
                }
            else
                if (strstr(aline,"- ")) {
                    if (strstr(aline, "- ") == aline || *(strstr(aline, "- ")-1) != '-') {
                        if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                        if (!pswit[OVERVIEW_SWITCH])
                            printf("    Line %ld - Spaced dash?\n", linecnt);
                        else
                            cnt_dash++;
                        }
                    }



        /* Check for "to he" and other easy he/be errors          */
        /* This is a very inadequate effort on the he/be problem, */
        /* but the phrase "to he" is always an error, whereas "to */
        /* be" is quite common. I chuckle when it does catch one! */
        /* Similarly, '"Quiet!", be said.' is a non-be error      */
        /* V .18 - "to he" is _not_ always an error!:             */
        /*           "Where they went to he couldn't say."        */
        /* but I'm leaving it in anyway.                          */
        /* V .20 Another false positive:                          */
        /*       What would "Cinderella" be without the . . .     */
        /* V .21 Added " is be " and " be is " and " be was "     */

        if (strstr(aline," to he ")
            || strstr(aline,"\" be ")
            || strstr(aline,"\", be ")
            || strstr(aline," is be ")
            || strstr(aline," be is ")
            || strstr(aline," was be ")
            || strstr(aline," be was ")
            ) {
            if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
            if (!pswit[OVERVIEW_SWITCH])
                printf("    Line %ld - Query he/be error?\n", linecnt);
            else
                cnt_word++;
            }

        /* Special case - angled bracket in front of "From" placed there by an MTA */
        /* when sending an e-mail.  V .21                                        */
        if (strstr(aline, ">From")) {
            if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
            if (!pswit[OVERVIEW_SWITCH])
                printf("    Line %ld - Query angled bracket with From\n", linecnt);
            else
                cnt_word++;
            }


        /* Check for commonly mistyped words, and digits like 0 for O in a word */
        for (s = aline; *s;) {
            s = getaword(s, inword);
            if (!*inword) continue; /* don't bother with empty lines */
            if (mixdigit(inword)) {
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                if (!pswit[OVERVIEW_SWITCH])
                    printf("    Line %ld - Query digit in %s\n", linecnt, inword);
                else
                    cnt_word++;
                }

            /* put the word through a series of tests for likely typos and OCR errors */
            /* V.21 I had allowed lots of typo-checking even with the typo switch     */
            /* turned off, but I really should disallow reporting of them when        */
            /* the switch is off. Hence the "if" below.                               */
            if (pswit[TYPO_SWITCH]) {
                istypo = 0;
                strcpy(testword, inword);
                alower = 0;
                for (i = 0; i < (signed int)strlen(testword); i++) { /* lowercase for testing */
                    if (testword[i] >= 'a' && testword[i] <= 'z') alower = 1;
                    if (alower && testword[i] >= 'A' && testword[i] <= 'Z') {
                        /* we have an uppercase mid-word. However, there are common cases:
                             Mac and Mc like McGill
                             French contractions like l'Abbe
                        */
                        if ((i == 2 && testword[0] == 'm' && testword[1] == 'c') ||
                            (i == 3 && testword[0] == 'm' && testword[1] == 'a' && testword[2] == 'c') ||
                            (i > 0 && testword[i-1] == CHAR_SQUOTE))
                                ; /* do nothing! */

                        else {
                            if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                            if (!pswit[OVERVIEW_SWITCH])
                                printf("    Line %ld - Query uppercase in %s\n", linecnt, inword);
                            else
                                cnt_word++;
                            break;
                            }
                        }
                    testword[i] = (char)tolower((unsigned char)testword[i]);
                    }

                /* check for certain unlikely letter combinations involving b, h */
                /* ch, sh, th, wh are common at word start, but never with b */
                if (strlen(testword)>1)
                    if (testword[1] == 'b')
                        if (strchr("cstw", testword[0]))
                            istypo = 1;

                /* bl, br, are common at word start, but never with h */
                if (!strncmp(testword, "hl", 2))
                    istypo = 1;
                if (!strncmp(testword, "hr", 2))
                    istypo = 1;

                /* tl, tn don't happen at word start, but are common scannos for th */
                /* ditto rn - common for m */
                if (!strncmp(testword, "tl", 2))
                    istypo = 1;
                if (!strncmp(testword, "tn", 2))
                    istypo = 1;
                if (!strncmp(testword, "rn", 2))
                    istypo = 1;

                /* ght is common, gbt never. Like that. */
                if (strlen(testword)>3) {
                    if (strstr(testword, "gbt")) istypo = 1;
                    if (strstr(testword, "pbt")) istypo = 1;
                    if (strstr(testword, "tbs")) istypo = 1;
                    if (strstr(testword, "mrn")) istypo = 1;
                    }
                if (strlen(testword)>4) {
                    if (strstr(testword, "ahle")) istypo = 1;
                    }

                /* "TBE" does happen - like HEARTBEAT - but uncommon. Only if asked for.
                    Similarly "ii" like Hawaii, or Pompeii, and in Roman numerals,
                    but these are covered in V.20. "ii" is a common scanno. */
                if (strstr(testword, "tbe")) istypo = 1;
                if (strstr(testword, "ii")) istypo = 1;

                /* ch gh ph sh th are common at end of word, but not with Bs */
                if (strlen(testword) > 1)
                    if (testword[strlen(testword)-1] == 'b')
                        if (strchr("cgpst", testword[strlen(testword)-2]))
                            istypo = 1;

                /* check for no vowels or no consonants.
                   If none, flag a typo */
                if (!istypo && strlen(testword)>1) {
                    vowel = consonant = 0;
                    for (i = 0; testword[i]; i++)
                        if (testword[i] == 'y' || gcisdigit(testword[i])) {  /* Yah, this is loose. */
                            vowel++;
                            consonant++;
                            }
                        else
                            if  (strchr("aeiou", testword[i])) vowel++;
                            else consonant++;
                    if (!vowel || !consonant) {
                        istypo = 1;
                        }
                    }

                /* now exclude the word from being reported if it's in */
                /* the okword list                                     */
                for (i = 0; *okword[i]; i++)
                    if (!strcmp(testword, okword[i]))
                        istypo = 0;

                /* what looks like a typo may be a Roman numeral. Exclude these */
                if (istypo) {
                    istypo = 0;
                    for (i = 0; testword[i]; i++)   /* exclude words that contain only numeral characters */
                        if (!strchr("ivxlcdm", testword[i]))   /* 'd' (500) is a Roman numeral character too */
                            istypo = 1;
                    }

                /* check the manual list of typos */
                if (!istypo)
                    for (i = 0; *typo[i]; i++)
                        if (!strcmp(testword, typo[i]))
                            istypo = 1;

                /* V.21 - check lowercase s and l - special cases */
                if (!istypo && strlen(testword) == 1)
                    if (*inword == 's' || *inword == 'l')
                        istypo = 1;


                if (istypo) {
                    isdup = 0;
                    if (strlen(testword) < MAX_QWORD_LENGTH && !pswit[VERBOSE_SWITCH])
                        for (i = 0; i < qword_index; i++)
                            if (!strcmp(testword, qword[i])) {
                                isdup = 1;
                                ++dupcnt[i];
                                }
                    if (!isdup) {
                        if (qword_index < MAX_QWORD && strlen(testword) < MAX_QWORD_LENGTH) {
                            strcpy(qword[qword_index], testword);
                            qword_index++;
                            }
                        if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                        if (!pswit[OVERVIEW_SWITCH]) {
                            printf("    Line %ld - Query word %s", linecnt, inword);
                            if (strlen(testword) < MAX_QWORD_LENGTH && !pswit[VERBOSE_SWITCH])
                                printf(" - not reporting duplicates");
                            printf("\n");
                            }
                        else
                            cnt_word++;
                        }
                    }
                }        /* end of typo-checking */

            if (pswit[PARANOID_SWITCH]) {   /* in paranoid mode, query all 0 and 1 standing alone */
                if (!strcmp(inword, "0") || !strcmp(inword, "1")) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Query standalone %s\n", linecnt, inword);
                    else
                        cnt_word++;
                    }
                }
            }

        /* look for added or missing spaces around punctuation and quotes */
        /* If there is a punctuation character like ! with no space on    */
        /* either side, suspect a missing!space. If there are spaces on   */
        /* both sides , assume a typo. If we see a double quote with no   */
        /* space or punctuation on either side of it, assume unspaced     */
        /* quotes "like"this.                                             */
        llen = strlen(aline);
        for (i = 1; i < llen; i++) {                               /* for each character in the line after the first */
            if  (strchr(".?!,;:_", aline[i])) {                    /* if it's punctuation */
                isacro = 0;                       /* we need to suppress warnings for acronyms like M.D. */
                isellipsis = 0;                   /* we need to suppress warnings for ellipsis . . . */
                if ( (gcisalpha(aline[i-1]) && gcisalpha(aline[i+1])) ||     /* if there are letters on both sides of it or ... */
                   (gcisalpha(aline[i+1]) && strchr("?!,;:", aline[i]))) { /* ...if it's strict punctuation followed by an alpha */
                    if (aline[i] == '.') {
                        if (i > 2)
                            if (aline[i-2] == '.') isacro = 1;
                        if (i + 2 < llen)
                            if (aline[i+2] == '.') isacro = 1;
                        }
                    if (!isacro) {
                        if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                        if (!pswit[OVERVIEW_SWITCH])
                            printf("    Line %ld column %d - Missing space?\n", linecnt, i+1);
                        else
                            cnt_punct++;
                        }
                    }
                if (aline[i-1] == CHAR_SPACE && (aline[i+1] == CHAR_SPACE || aline[i+1] == 0)) { /* if there are spaces on both sides, or space before and end of line */
                    if (aline[i] == '.') {
                        if (i > 2)
                            if (aline[i-2] == '.') isellipsis = 1;
                        if (i + 2 < llen)
                            if (aline[i+2] == '.') isellipsis = 1;
                        }
                    if (!isemptyline && !isellipsis) {
                        if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                        if (!pswit[OVERVIEW_SWITCH])
                            printf("    Line %ld column %d - Spaced punctuation?\n", linecnt, i+1);
                        else
                            cnt_punct++;
                        }
                    }
                }
            }

        /* v.21 breaking out the search for unspaced doublequotes */
        /* This is not as efficient, but it's more maintainable */
        for (i = 1; i < llen; i++) {                               /* for each character in the line after the first */
            if (aline[i] == CHAR_DQUOTE) {
                if ((!strchr(" -.'`,;:!([{?}])",  aline[i-1]) &&
                     !strchr(" -.'`,;:!([{?}])",  aline[i+1]) &&
                     aline[i+1] != 0)
                     || (!strchr(" -([{'`", aline[i-1]) && gcisalpha(aline[i+1]))) {
                        if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                        if (!pswit[OVERVIEW_SWITCH])
                            printf("    Line %ld column %d - Unspaced quotes?\n", linecnt, i+1);
                        else
                            cnt_punct++;
                        }
                }
            }

        /* v.20 also look for double punctuation like ,. or ,,     */
        /* I'm putting this in a separate loop for clarity         */
        /* Thanks to DW for the suggestion!                        */
        /* In books with references, ".," and ".;" are common      */
        /* e.g. "etc., etc.," and vol. 1.; vol 3.;                 */
        /* OTOH, from my initial tests, there are also fairly      */
        /* common errors. What to do? Make these cases paranoid?   */
        /* V.21 ".," is the most common, so invented warn_dotcomma */
        /* to suppress detailed reporting if it occurs often       */
        llen = strlen(aline);
        for (i = 0; i < llen; i++)                  /* for each character in the line */
            if (strchr(".?!,;:", aline[i])          /* if it's punctuation */
            && (strchr(".?!,;:", aline[i+1]))
            && aline[i] && aline[i+1])      /* followed by punctuation, it's a query, unless . . . */
                if (
                (aline[i] == aline[i+1]
                && (aline[i] == '.' || aline[i] == '?' || aline[i] == '!'))
                || (!warn_dotcomma && aline[i] == '.' && aline[i+1] == ',')
                )
                        ; /* do nothing for .. !! and ?? which can be legit */
                else {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld column %d - Double punctuation?\n", linecnt, i+1);
                    else
                        cnt_punct++;
                    }

        /* v.21 breaking out the search for spaced doublequotes */
        /* This is not as efficient, but it's more maintainable */
        s = aline;
        while (strstr(s," \" ")) {
            if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
            if (!pswit[OVERVIEW_SWITCH])
                printf("    Line %ld column %ld - Spaced doublequote?\n", linecnt, (long)(strstr(s," \" ")-aline+1));
            else
                cnt_punct++;
            s = strstr(s," \" ") + 2;
            }

        /* v.20 also look for spaced singlequotes ' and `  */
        s = aline;
        while (strstr(s," ' ")) {
            if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
            if (!pswit[OVERVIEW_SWITCH])
                printf("    Line %ld column %ld - Spaced singlequote?\n", linecnt, (long)(strstr(s," ' ")-aline+1));
            else
                cnt_punct++;
            s = strstr(s," ' ") + 2;
            }

        s = aline;
        while (strstr(s," ` ")) {
            if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
            if (!pswit[OVERVIEW_SWITCH])
                printf("    Line %ld column %ld - Spaced singlequote?\n", linecnt, (long)(strstr(s," ` ")-aline+1));
            else
                cnt_punct++;
            s = strstr(s," ` ") + 2;
            }

        /* v.21 Now check special cases - start and end of line - */
        /* for single and double quotes. Start is sometimes [sic] */
        /* but better to query it anyway.                         */
        /* While I'm here, check for dash at end of line          */
        llen = strlen(aline);
        if (llen > 1) {
            if (aline[llen-1] == CHAR_DQUOTE ||
                aline[llen-1] == CHAR_SQUOTE ||
                aline[llen-1] == CHAR_OPEN_SQUOTE)
                if (aline[llen-2] == CHAR_SPACE) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld column %d - Spaced quote?\n", linecnt, llen);
                    else
                        cnt_punct++;
                    }
            if (aline[0] == CHAR_DQUOTE ||
                aline[0] == CHAR_SQUOTE ||
                aline[0] == CHAR_OPEN_SQUOTE)
                if (aline[1] == CHAR_SPACE) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld column 1 - Spaced quote?\n", linecnt);
                    else
                        cnt_punct++;
                    }
            /* dash at end of line may well be legit - paranoid mode only */
            /* and don't report em-dash at line-end                       */
            if (pswit[PARANOID_SWITCH]) {
                if (aline[llen-1] == '-' && aline[llen-2] != '-') {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Hyphen at end of line?\n", linecnt);
                    else
                        cnt_punct++;
                    }
                }

            }

        /* v.21 also look for brackets surrounded by alpha                    */
        /* Brackets are often unspaced, but shouldn't be surrounded by alpha. */
        /* If so, suspect a scanno like "a]most"                              */
        llen = strlen(aline);
        for (i = 1; i < llen-1; i++) {           /* for each character in the line except 1st & last*/
            if (strchr("{[()]}", aline[i])         /* if it's a bracket */
                && gcisalpha(aline[i-1]) && gcisalpha(aline[i+1])) {
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                if (!pswit[OVERVIEW_SWITCH])
                    printf("    Line %ld - Unspaced bracket?\n", linecnt);
                else
                    cnt_punct++;
                }
            }

        llen = strlen(aline);

        /* Check for <HTML TAG> */
        /* If there is a < in the line, followed at some point  */
        /* by a > then we suspect HTML                          */
        if (strstr(aline, "<") && strstr(aline, ">")) {
            i = (int) (strstr(aline, ">") - strstr(aline, "<") + 1);
            if (i > 0) {
                strncpy(wrk, strstr(aline, "<"), i);
                wrk[i] = 0;
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                if (!pswit[OVERVIEW_SWITCH])
                    printf("    Line %ld - HTML Tag? %s \n", linecnt, wrk);
                else
                    cnt_html++;
                }
            }

        /* Check for &symbol; HTML */
        /* If there is a & in the line, followed at  */
        /* some point by a ; then we suspect HTML    */
        if (strstr(aline, "&") && strstr(aline, ";")) {
            i = (int)(strstr(aline, ";") - strstr(aline, "&") + 1);
            if (i > 0) {
                strncpy(wrk, strstr(aline,"&"), i);
                wrk[i] = 0;
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", aline);
                if (!pswit[OVERVIEW_SWITCH])
                    printf("    Line %ld - HTML symbol? %s \n", linecnt, wrk);
                else
                    cnt_html++;
                }
            }

        /* At end of paragraph, check for mismatched quotes.           */
        /* We don't want to report an error immediately, since it is a */
        /* common convention to omit the quotes at end of paragraph if */
        /* the next paragraph is a continuation of the same speaker.   */
        /* Where this is the case, the next para should begin with a   */
        /* quote, so we store the warning message and only display it  */
        /* at the top of the next iteration if the new para doesn't    */
        /* start with a quote.                                         */
        /* The -p switch overrides this default, and warns of unclosed */
        /* quotes on _every_ paragraph, whether the next begins with a */
        /* quote or not.                                               */
        /* Version .16 - only report mismatched single quotes if       */
        /* an open_single_quotes was found.                            */

        if (isemptyline) {          /* end of para - add up the totals */
            if (quot % 2)
                sprintf(dquote_err, "    Line %ld - Mismatched quotes\n", linecnt);
            if (pswit[SQUOTE_SWITCH] && open_single_quote && (open_single_quote != close_single_quote) )
                sprintf(squote_err,"    Line %ld - Mismatched single quotes?\n", linecnt);
            if (pswit[SQUOTE_SWITCH] && open_single_quote
                                     && (open_single_quote != close_single_quote)
                                     && (open_single_quote != close_single_quote +1) )
                squot = 1;    /* flag it to be noted regardless of the first char of the next para */
            if (r_brack)
                sprintf(rbrack_err, "    Line %ld - Mismatched round brackets?\n", linecnt);
            if (s_brack)
                sprintf(sbrack_err, "    Line %ld - Mismatched square brackets?\n", linecnt);
            if (c_brack)
                sprintf(cbrack_err, "    Line %ld - Mismatched curly brackets?\n", linecnt);
            quot = s_brack = c_brack = r_brack =
            open_single_quote = close_single_quote = 0;
            isnewpara = 1;     /* let the next iteration know that it's starting a new para */
            }

        /* V.21 _ALSO_ at end of paragraph, check for omitted punctuation. */
        /*      by working back through prevline. DW.                      */
        /* Hmmm. Need to check this only for "normal" paras.               */
        /* So what is a "normal" para? ouch!                               */
        /* Not normal if one-liner (chapter headings, etc.)                */
        /* Not normal if doesn't contain at least one locase letter        */
        /* Not normal if starts with space                                 */
        if (isemptyline) {          /* end of para */
            for (s = prevline, i = 0; *s; s++)
                if (*s >= 'a' && *s <='z')
                    i = 1;    /* use i to indicate the presence of a locase letter */
            if (i
                && lastblen > 2
                && start_para_line < linecnt - 1
                && *prevline > CHAR_SPACE
                ) {
                for (i = strlen(prevline)-1; i > 0; i--) {
                    if (gcisalpha(prevline[i])) {
                        if (pswit[ECHO_SWITCH]) printf("\n%s\n", prevline);
                        if (!pswit[OVERVIEW_SWITCH])
                            printf("    Line %ld - No punctuation at para end?\n", linecnt);
                        else
                            cnt_punct++;
                        break;
                        }
                    if (strchr("-.',;:!([{?}])\"", prevline[i]))
                        break;
                    }
                }
            }
        strcpy(prevline, aline);
    }
    fclose (infile);
    if (!pswit[OVERVIEW_SWITCH])
        for (i = 0; i < MAX_QWORD; i++)
            if (dupcnt[i])
                printf("\nNote: Queried word %s was duplicated %d time%s\n", qword[i], dupcnt[i], (dupcnt[i] == 1) ? "" : "s");
}



/* flgets - get one line from the input stream, checking for   */
/* the existence of exactly one CR/LF line-end per line.       */
/* Returns a pointer to the line, or NULL at end of file.      */

char *flgets(char *theline, int maxlen, FILE *thefile, long lcnt)
{
    char c;
    int len, isCR, cint;

    *theline = 0;
    len = isCR = 0;
    c = cint = fgetc(thefile);
    do {
        if (cint == EOF)
            return (NULL);
        if (c == 10) {  /* either way, it's end of line */
            if (isCR)
                break;
            else {   /* Error - a LF without a preceding CR */
                if (pswit[LINE_END_SWITCH]) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", theline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - No CR?\n", lcnt);
                    else
                        cnt_lineend++;
                    }
                break;
                }
            }
        if (c == 13) {
            if (isCR) { /* Error - two successive CRs */
                if (pswit[LINE_END_SWITCH]) {
                    if (pswit[ECHO_SWITCH]) printf("\n%s\n", theline);
                    if (!pswit[OVERVIEW_SWITCH])
                        printf("    Line %ld - Two successive CRs?\n", lcnt);
                    else
                        cnt_lineend++;
                    }
                }
            isCR = 1;
            }
        else {
            if (pswit[LINE_END_SWITCH] && isCR) {
                if (pswit[ECHO_SWITCH]) printf("\n%s\n", theline);
                if (!pswit[OVERVIEW_SWITCH])
                    printf("    Line %ld position %d - CR without LF?\n", lcnt, len+1);
                else
                    cnt_lineend++;
                }
             theline[len] = c;
             len++;
             theline[len] = 0;
             isCR = 0;
             }
        c = cint = fgetc(thefile);
    } while(len < maxlen);
    return(theline);
}
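
/*
   Illustrative sketch, not part of gutcheck. count_line_end_faults() is a
   hypothetical helper that applies the same three rules as flgets (LF with
   no CR before it, two CRs in a row, CR followed by ordinary text) to a
   buffer that is already in memory. It assumes every fault counts once and
   it ignores the -l switch, which in flgets turns the reporting off.

```c
#include <stddef.h>

// Hypothetical helper: counts line-end faults in buf[0..n).
int count_line_end_faults(const char *buf, size_t n)
{
    int faults = 0, is_cr = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        char c = buf[i];
        if (c == 10) {                 // LF ends the line either way
            if (!is_cr) faults++;      // LF without a preceding CR
            is_cr = 0;
        }
        else if (c == 13) {
            if (is_cr) faults++;       // two successive CRs
            is_cr = 1;
        }
        else {
            if (is_cr) faults++;       // CR followed by ordinary text
            is_cr = 0;
        }
    }
    return faults;
}
```

   For example, the 6 bytes of "a\r\nb\r\n" give 0 faults, and the 4 bytes
   of "a\nb\n" give 2, one for each bare LF.
*/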





/* mixdigit - takes a "word" as a parameter, and checks whether it   */
/* contains a mixture of alpha and digits. Generally, this is an     */
/* error, but may not be for cases like 4th or L5 12s. 3d.           */
/* Returns 0 if no error found, 1 if error.                          */

int mixdigit(char *checkword)   /* check for digits like 1 or 0 in words */
{
    int wehaveadigit, wehavealetter, firstdigits, query, wl;
    char *s;


    wehaveadigit = wehavealetter = query = 0;
    for (s = checkword; *s; s++)
        if (gcisalpha(*s))
            wehavealetter = 1;
        else
            if (gcisdigit(*s))
                wehaveadigit = 1;
    if (wehaveadigit && wehavealetter) {         /* Now exclude common legit cases, like "21st" and "12l. 3s. 11d." */
        query = 1;
        wl = strlen(checkword);
        for (firstdigits = 0; gcisdigit(checkword[firstdigits]); firstdigits++)
            ;
        /* digits, ending in st, rd, nd, th of either case */
        /* (matchword returns 1 on a match, unlike strcmp) */
        if (firstdigits + 2 == wl &&
              (matchword(checkword + wl - 2, "st")
            || matchword(checkword + wl - 2, "rd")
            || matchword(checkword + wl - 2, "nd")
            || matchword(checkword + wl - 2, "th"))
            )
                query = 0;
        /* digits, ending in sts, rds, nds, ths */
        if (firstdigits + 3 == wl &&
              (matchword(checkword + wl - 3, "sts")
            || matchword(checkword + wl - 3, "rds")
            || matchword(checkword + wl - 3, "nds")
            || matchword(checkword + wl - 3, "ths"))
            )
                query = 0;
        /* digits, ending in stly, rdly, ndly, thly */
        if (firstdigits + 4 == wl &&
              (matchword(checkword + wl - 4, "stly")
            || matchword(checkword + wl - 4, "rdly")
            || matchword(checkword + wl - 4, "ndly")
            || matchword(checkword + wl - 4, "thly"))
            )
                query = 0;

        /* digits, ending in l, L, s or d */
        if (firstdigits + 1 == wl &&
            (checkword[wl-1] == 'l'
            || checkword[wl-1] == 'L'
            || checkword[wl-1] == 's'
            || checkword[wl-1] == 'd'))
                query = 0;
        /* L at the start of a number, representing British pounds, like L500 */
        /* This is cute. We know the current word is mixeddigit. If the first */
        /* letter is L, there must be at least one digit following. If both   */
        /* digits and letters follow, we have a genuine error, else we have a */
        /* capital L followed by digits, and we accept that as a non-error.   */
        if (checkword[0] == 'L')
            if (!mixdigit(checkword+1))
                query = 0;
        }
    return (query);
}
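
/*
   Illustrative sketch, not part of gutcheck. is_ordinal() is a hypothetical
   helper showing the "digits followed by st, nd, rd or th" exception on its
   own, without going through matchword. Like mixdigit, it makes no attempt
   to check that the suffix agrees with the number, so "21th" is accepted.

```c
#include <ctype.h>
#include <string.h>

// Hypothetical helper: 1 if w is all digits followed by st, nd, rd or th
// (either case), 0 otherwise.
int is_ordinal(const char *w)
{
    static const char *suffix[] = { "st", "nd", "rd", "th" };
    size_t digits = 0, len = strlen(w), k;

    while (isdigit((unsigned char) w[digits]))
        digits++;
    if (digits == 0 || digits + 2 != len)
        return 0;                      // no leading digits, or wrong length
    for (k = 0; k < 4; k++)
        if (tolower((unsigned char) w[digits]) == suffix[k][0]
            && tolower((unsigned char) w[digits + 1]) == suffix[k][1])
            return 1;
    return 0;
}
```

   So is_ordinal("21st") and is_ordinal("4TH") return 1, while
   is_ordinal("12ab") returns 0.
*/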




/* getaword - extracts the first/next "word" from the line, and puts */
/* it into "thisword". A word is defined as one English word unit    */
/* or at least that's what I'm trying for.                           */
/* Returns a pointer to the position in the line where we will start */
/* looking for the next word.                                        */

char *getaword(char *fromline, char *thisword)
{
    int i, wordlen;
    char *s;


    wordlen = 0;
    for ( ; !gcisdigit(*fromline) && !gcisalpha(*fromline) && *fromline ; fromline++ );

    /* V .20                                                                   */
    /* add a look-ahead to handle exceptions for numbers like 1,000 and 1.35.  */
    /* Especially yucky is the case of L1,000                                  */
    /* I hate this, and I see other ways, but I don't see that any is _better_.*/
    /* This section looks for a pattern of characters including a digit        */
    /* followed by a comma or period followed by one or more digits.           */
    /* If found, it returns this whole pattern as a word; otherwise we discard */
    /* the results and resume our normal programming.                          */
    s = fromline;
    for (  ; (gcisdigit(*s) || gcisalpha(*s) || *s == ',' || *s == '.') && wordlen < MAXWORDLEN ; s++ ) {
        thisword[wordlen] = *s;
        wordlen++;
        }
    thisword[wordlen] = 0;
    for (i = 1; i < wordlen -1; i++) {
        if (thisword[i] == '.' || thisword[i] == ',') {
            if (gcisdigit(thisword[i-1]) && gcisdigit(thisword[i+1])) {   /* we have one of the damned things */
                fromline = s;
                return(fromline);
                }
            }
        }

    /* we didn't find a punctuated number - do the regular getword thing */
    wordlen = 0;
    for (  ; (gcisdigit(*fromline) || gcisalpha(*fromline) || *fromline == '\'') && wordlen < MAXWORDLEN ; fromline++ ) {
        thisword[wordlen] = *fromline;
        wordlen++;
        }
    thisword[wordlen] = 0;
    return(fromline);
}
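
/*
   Illustrative sketch, not part of gutcheck. has_punctuated_number() is a
   hypothetical helper that isolates the look-ahead test described in
   getaword: a comma or period counts as part of a number only when there
   is a digit on each side of it. It uses the standard isdigit() where
   gutcheck uses gcisdigit().

```c
#include <ctype.h>
#include <string.h>

// Hypothetical helper: 1 if w contains '.' or ',' with a digit on each side.
int has_punctuated_number(const char *w)
{
    size_t i, len = strlen(w);

    // Start at 1 and stop one short of the end so both neighbours exist.
    for (i = 1; i + 1 < len; i++)
        if ((w[i] == '.' || w[i] == ',')
            && isdigit((unsigned char) w[i - 1])
            && isdigit((unsigned char) w[i + 1]))
            return 1;
    return 0;
}
```

   So "1,000", "1.35" and "L1,000" all return 1, while "end.Next" and "3."
   return 0.
*/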





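/* matchword - just a case-insensitive string matcher    */
/* Returns 1 if the strings match, 0 if they don't -     */
/* note that this is the opposite sense to strcmp.       */
/* yes, I know this is not efficient. I'll worry about   */
/* that when I have a clear idea where I'm going with it.*/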

int matchword(char *checkfor, char *thisword)
{
    unsigned int ismatch, i;

    if (strlen(checkfor) != strlen(thisword)) return(0);

    ismatch = 1;     /* assume a match until we find a difference */
    for (i = 0; i <strlen(checkfor); i++)
        if (toupper((unsigned char) checkfor[i]) != toupper((unsigned char) thisword[i]))
            ismatch = 0;
    return (ismatch);
}





/* lowerit - lowercase the line. Yes, strlwr does the same job,  */
/* but not on all platforms, and I'm a bit paranoid about what   */
/* some implementations of tolower might do to hi-bit characters,*/
/* which shouldn't matter, but better safe than sorry.           */

void lowerit(char *theline)
{
    for ( ; *theline; theline++)
        if (*theline >='A' && *theline <='Z')
            *theline += 32;
}





/* gcisalpha is a special version that is somewhat lenient on 8-bit texts.     */
/* If we use the standard isalpha() function, 8-bit accented characters break  */
/* words, so that tete with accented characters appears to be two words, "t"   */
/* and "t", with 8-bit characters between them. This causes over-reporting of  */
/* errors. gcisalpha() recognizes accented letters from the CP1252 (Windows)   */
/* and ISO-8859-1 character sets, which are the most common PG 8-bit types.    */

int gcisalpha(unsigned char c)
{
    if (c >='a' && c <='z') return(1);
    if (c >='A' && c <='Z') return(1);
    if (c < 140) return(0);
    if (c >=192 && c != 208 && c != 215 && c != 222 && c != 240 && c != 247 && c != 254) return(1);
    if (c == 140 || c == 142 || c == 156 || c == 158 || c == 159) return (1);
    return(0);
}
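
/*
   Illustrative sketch, not part of gutcheck. It shows why a lenient letter
   test matters when splitting words in 8-bit text. ascii_letter(),
   latin1_letter() and count_words() are hypothetical names, and
   latin1_letter() is a simplified stand-in for gcisalpha(), covering only
   the 192-255 range and excluding the multiplication and division signs.

```c
// Hypothetical predicate: plain ASCII letters only.
int ascii_letter(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Hypothetical predicate: ASCII letters plus most of the ISO-8859-1
// accented range. 215 and 247 are the multiplication and division signs.
int latin1_letter(unsigned char c)
{
    return ascii_letter(c) || (c >= 192 && c != 215 && c != 247);
}

// Counts maximal runs of characters for which isletter() returns nonzero.
int count_words(const char *s, int (*isletter)(unsigned char))
{
    int words = 0, inword = 0;

    for ( ; *s; s++) {
        if (isletter((unsigned char) *s)) {
            if (!inword) words++;      // first letter of a new run
            inword = 1;
        }
        else
            inword = 0;
    }
    return words;
}
```

   For the four bytes 't', 234, 't', 'e' (written "t\352te" in C),
   count_words() returns 2 with ascii_letter and 1 with latin1_letter.
*/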

/* gcisdigit is a special version that doesn't get confused in 8-bit texts.    */
int gcisdigit(unsigned char c)
{
    if (c >= '0' && c <='9') return(1);
    return(0);
}




/* gcstrchr wraps strchr to return NULL if the character being searched for is zero */

char *gcstrchr(char *s, char c)
{
    if (c == 0) return(NULL);
    return(strchr(s,c));
}


void proghelp()                  /* explain program usage here */
{
    fputs("Version .95  Copyright 2001, 2002 Jim Tinsley <jtinsley@pobox.com>.\n",stderr);
    fputs("Gutcheck comes with ABSOLUTELY NO WARRANTY. For details, read the file COPYING.\n", stderr);
    fputs("This is free software; you may redistribute it under certain conditions;\n", stderr);
    fputs("read the file COPYING for details (GPL).\n\n", stderr);
    fputs("Usage is: gutcheck [-setpxloyhv] filename\n",stderr);
    fputs("  where -s checks single quotes, -e suppresses echoing lines, -t checks typos\n",stderr);
    fputs("  -x (paranoid) switches OFF -t and extra checks, -l turns OFF line-end checks\n",stderr);
    fputs("  -o just displays overview without detail, -h echoes header fields\n",stderr);
    fputs("  -v (verbose) unsuppresses duplicate reporting\n",stderr);
    fputs("Sample usage: gutcheck warpeace.txt >queries.lst\n",stderr);
    fputs("\n",stderr);
    fputs("Gutcheck looks for errors in Project Gutenberg(TM) etexts.\n", stderr);
    fputs("Gutcheck queries anything it thinks shouldn't be in a PG text; non-ASCII\n",stderr);
    fputs("characters like accented letters, lines longer than 75 or shorter than 55,\n",stderr);
    fputs("unbalanced quotes or brackets, a variety of badly formatted punctuation, \n",stderr);
    fputs("HTML tags, some likely typos. It is NOT a substitute for human judgement.\n",stderr);
    fputs("\n",stderr);
}


/*  fputs("gutcheck comes with ABSOLUTELY NO WARRANTY. This is free software (GPL); you may\n", stderr);
    fputs("redistribute it under certain conditions. For details, read the file COPYING.\n", stderr); */

/*********************************************************************
  Revision History:

  04/22/01 Cleaned up some stuff and released .10

           ---------------

  05/09/01 Added the typo list, added two extra cases of he/be error,
           added -p switch, OPEN_SINGLE QUOTE char as .11

           ---------------

  05/20/01 Increased the typo list,
           added paranoid mode,
           ANSIfied the code and added some casts
              so the compiler wouldn't keep asking if I knew what I was doing,
           fixed bug in l.s.d. condition (thanks, Dave!),
           standardized spacing when echoing,
           added letter-combo checking code to typo section,
           added more h/b words to typo array.
           Not too sure about putting letter combos outside of the TYPO conditions -
           someone is sure to have a book about the tbaka tribe, or something. Anyway, let's see.
           Released as .12

           ---------------

  06/01/01 Removed duplicate reporting of Tildes, asterisks, etc.
  06/10/01 Added flgets routine to help with platform-independent
           detection of invalid line-ends. All PG text files should
           have CR/LF (13/10) at end of line, regardless of system.
           Gutcheck now validates this by default. (Thanks, Charles!)
           Released as .13

           ---------------

  06/11/01 Added parenthesis match checking. (c_brack, cbrack_err etc.)
           Released as .14

           ---------------

  06/23/01 Fixed: 'No',he said. not being flagged.

           Improved: better single-quotes checking:

           Ignore singlequotes surrounded by alpha, like didn't. (was OK)

           If a singlequote is at the END of a word AND the word ends in "s":
                  The dogs' tails wagged.
           it's probably an apostrophe, but less commonly may be a closequote:
                  "These 'pack dogs' of yours look more like wolves."

           If it's got punctuation before it and is followed by a space
           or punctuation:
              . . . was a problem,' he said
              . . . was a problem,'"
           it is probably (certainly?) a closequote.

           If it's at start of paragraph, it's probably an openquote.
              (but watch dialect)

           Words with ' at beginning and end are probably quoted:
               "You have the word 'chivalry' frequently on your lips."
               (Not specifically implemented)
           V.18 I'm glad I didn't implement this, 'cos it jest ain't so
           where the convention is to punctuate outside the quotes.
               'Come', he said, 'and join the party'.

           If it is followed by an alpha, and especially a capital:
              'Hello,' called he.
           it is either an openquote or dialect.

           Dialect breaks ALL the rules:
                  A man's a man for a' that.
                  "Aye, but 'tis all in the pas' now."
                  "'Tis often the way," he said.
                  'Ave a drink on me.

           This version looks to be an improvement, and produces
           fewer false positives, but is still not perfect. The
           'pack dogs' case still fools it, and dialect is still
           a problem. Oh, well, it's an improvement, and I have
           a weighted structure in place for refining guesses at
           closequotes. Maybe next time, I'll add a bit of logic
           where if there is an open quote and one that was guessed
           to be a possessive apostrophe after s, I'll re-guess it
           to be a closequote. Let's see how this one flies, first.

           (Afterview: it's still crap. Needs much work, and a deeper insight.)

           Released as .15

           TODO: More he/be checks. Can't be perfect - counterexample:
              I gave my son good advice: be married regardless of the world's opinion.
              I gave my son good advice: he married regardless of the world's opinion.

           ---------------

  07/01/01 Added -O option.
           Improved singlequotes by reporting mismatched single quotes
           only if an open_single_quotes was found.

           Released as .16

           ---------------

  08/27/01 Added -Y switch for Robert Rowe to allow his app to
           catch the error output.

           Released as .17

           ---------------

  09/08/01 Added checking Capitals at start of paragraph, but not
           checking them at start of sentence.

           TODO: Parse sentences out so can check reliably for start of
                 sentence. Need a whole different approach for that.
                 (Can't just rely on periods, since they are also
                 used for abbreviations, etc.)

           Added checking for all vowels or all consonants in a word.

           While I was in, I added "ii" checking and "tl" at start of word.

           Added echoing of first line of paragraph when reporting
           mismatched quotes or brackets (thanks to David Widger for the
           suggestion)

           Not querying L at start of a number (used for British pounds).

           The spelling changes are sort of half-done but released anyway
           Skipped .18 because I had given out a couple of test versions
           with that number.

  09/25/01 Released as .19

           ---------------

           TODO:
           Use the logic from my new version of safewrap to stop querying
             short lines like poems and TOCs.
           Ignore non-standard ellipses like .  .  . or ...


           ---------------
  10/01/01 Made any line over 80 a VERY long line (was 85).
           Recognized openquotes on indented paragraphs as continuations
               of the same speech.
           Added "cf" to the okword list (how did I forget _that_?) and a few others.
           Moved abbrev to okword and made it more general.
           Removed requirement that PG_space_emdash be greater than
               ten before turning off warnings about spaced dashes.
           Added period to list of characters that might constitute a separator line.
           Now checking for double punctuation (Thanks, David!)
           Now if two spaced em-dashes on a line, reports both. (DW)
           Bug: Wasn't catching spaced punctuation at line-end since I
               added flgets in version .13 - fixed.
           Bug: Wasn't catching spaced singlequotes - fixed
           Now reads punctuated numbers like 1,000 as a single word.
               (Used to give "standalone 1" type  queries)
           Changed paranoid mode - not including s and p options. -ex is now quite usable.
           Bug: was calling `"For it is perfectly impossible,"    Unspaced Quotes - fixed
           Bug: Sometimes gave _next_ line number for queried word at end of line - fixed

  10/22/01 Released as .20

           ---------------

           Added count of lines with spaces at end. (cnt_spacend) (Thanks, Brett!)
           Reduced the number of hi-bit letters needed to stop reporting them
               from 1/20 to 1/100 or 200 in total.
           Added PG footer check.
           Added the -h switch.
           Fixed platform-specific CHAR_EOL checking for isemptyline - changed to 13 and 10
           Not reporting ".," when there are many of them, such as a book with many references to "Vol 1., p. 23"
           Added unspaced brackets check when surrounded by alpha.
           Removed all typo reporting unless the typo switch is on.
           Added gcisalpha to ease over-reporting of 8-bit queries.
           ECHO_SWITCH is now ON by default!
           PARANOID_SWITCH is now ON by default!
           Checking for ">From" placed there by e-mail MTA (Thanks Andrew & Greg)
           Checking for standalone lowercase "l"
           Checking for standalone lowercase "s"
           Considering "is be" and "be is" "be was" "was be" as he/be errors
           Looking at punct at end of para

  01/20/02 Released as .21

           ---------------

           Added VERBOSE_SWITCH to make it list everything. (George Davis)

           ---------------

  02/17/02 Added cint in flgets to try to fix an EOF failure on a compiler I don't have.
           after which
           This line caused a coredump on Solaris - fixed.
                Da sagte die Figur: " Das ist alles gar schoen, und man mag die Puppe
  03/09/02 Changed header recognition for another header change
           Called it .24
  03/29/02 Added qword[][] so I can suppress massive overreporting
           of queried "words" like "FN", "Wm.", "th'", people's 
           initials, chemical formulae and suchlike in some texts.
           Called it .25
  04/07/02 The qword summary reports at end shouldn't show in OVERVIEW mode. Fixed.
           Added linecounts in overview mode.
           Wow! gutcheck gutcheck.exe doesn't report a binary! :-) Need to tighten up. Done.
           "m" is a not uncommon scanno for "in", but also appears in "a.m." - Can I get round that?
  07/07/02 Added GPL.
           Added checking for broken em-dash at line-end (enddash)
           Released as 0.95

*********************************************************************/

