Web Wiz Guide's Site Searcher Script modified as a file content indexer to produce pipe-delimited csv output


THANK-YOU to Vladimir Miho for his logging script that creates the csv file from the output: http://planet-source-code.com/vb/scripts/showcode.asp?lngWId=4&txtCodeId=8911

I used the search script to index the content of html horse pedigree files and write it out as pipe-delimited csv. I imported the csv files into Access, so the file contents are now in my pedigrees database and searchable there, which is faster and less server intensive than searching the files themselves.

The idea was to get the text content of the html files into my database. My community pedigree files collection consists mostly of html files. It was easy enough to get basic information about each file into the Access database, such as file size, date uploaded, file name, and so on. What I really wanted in the database was the CONTENT of each file. When I used a file system object search script like Web Wiz Guide's to search the content of the files, it took too long to grind through about 650 (and growing) files. With the file contents in the Access database, content searches are much faster, especially for our community members who have slow computers or who are on dialup.

Here is an example of the pipe-delimited csv formatted results that the script gave me:

583|tiegste.htm|(c) 1999, AQHA. All Rights Reserved. 4 GENERATION RACE/SHOW PEDIGREE|There is no description available for this page| 339|Tuesday, August 20, 2002|14| (c) 1999, AQHA. All Rights Reserved. 4 GENERATION RACE/SHOW PEDIGREE A.newslink { COLOR: blue; FONT-FAMILY: arial; FONT-SIZE: 9pt; FONT-STYLE: normal; FONT-WEIGHT: bold; LINE-HEIGHT: 9pt; TEXT-ALIGN: left; TEXT-DECORATION: none } A:hover { COLOR: maroon } A:link { COLOR: navy; TEXT-DECORATION: underline } "); //--> Tiegs Te 2416842 1985 chestnut mare S.I.: 0 Starts: 0 Wins: 0 2nds: 0 3rds: 0 Earnings: 0 Hlt Pts: 11.0 Perf Pts: 7.0 SIRE side of pedigree Azure Te (TB) 1962      T0052758 bay        Nashville (TB) 1954      T0061755 bay        TE n' Te 1973       1135492 bay      H- 22.0  P- 0.0 Blue One (TB) 1952      T0152831 bay        Vila 1960       0137530 brown      H- 5.0  P- 0.0 Leon Bars 1954       0072126 sorrel      H- 0.0  P- 1.0 Skipalindas Te 1981       1712355 sorrel      H- 43.0  P- 0.0 Fairy Adams 1949       0030302 bay        Eternal Ben 1963       0300267 sorrel      H- 33.0  P- 14.5 Eternal Sun 1958       0151802 sorrel 95 12 2 1 1 $ 1,676    H- 41.0  P- 0.0 Skipalinda 1973       0987469 chestnut      H- 46.0  P- 110.0 Benetta Bar 1959       0104107 dun        Bar Y Skippa 1961       0170623 palomino        Skipper Jr 1951       0040514 sorrel      H- 17.0  P- 9.0 Bar Y Lady 8 1942       0005054 dun        DAM side of pedigree Torino 1967       0497174 sorrel      H- 55.0  P- 39.0 Ledo Bars 1963       0281380 sorrel      H- 29.0  P- 0.0 The Continental 1971       0739908 red roan      H- 34.0  P- 18.0 Rita Blue 1961       0160022 bay        Frosty Money 1963       0281133 roan      H- 157.0  P- 15.5 Jimmy Mac Bee 1958       0092825 dun      H- 1.0  P- 0.0 Cheryl Tiegs 1978       1385900 sorrel      H- 51.0  P- 1.0 Red Dee Money 1958       0093048 roan      H- 7.0  P- 0.0 Pute Cee Bonanza 1964       0431937 bay      H- 28.0  P- 21.5 Coy's Bonanza 1959       0143099 sorrel 85 16 1 3 2 $ 
289    H- 154.0  P- 7.5 Kim's Bonanza 1968       0611925 bay      H- 48.0  P- 0.0 Zon-Zoe (TB) 1959      T0067332 bay        Trixie Snipper 1964       0306773 sorrel      H- 22.0  P- 14.0 Little Feller 1955       0051631 bay      H- 1.0  P- 0.0 Miss Snipper 1956       0086123 bay        "); //--> (c) 1999, AQHA. All Rights Reserved. 4 GENERATION RACE/SHOW PEDIGREE

187|frecklessanpeppy.jpg|No Title|There is no description available for this page| 337|Monday, August 19, 2002|37|

190|FrostedDoc.htm|PEDIGREE RECORD 2003, AQHA. All Rights Reserved.|There is no description available for this page| 335|Sunday, June 01, 2003|8| PEDIGREE RECORD 2003, AQHA. All Rights Reserved. .page { PAGE-BREAK-AFTER: always } .page2 { PAGE-BREAK-AFTER: always } Frosted Doc 2498066 1986 red roan stallion SI: 0 Starts 0 Wins: 0 2nds: 0 3rds: 0 Earnings: 0 Hlt Pts: 0 Perf Pts: 0 SIRE side of pedigree Doc Bar 1956       0076136 chestnut 75 4 0 0 1 $ 95    H- 36.0  P- 0.0 Lightning Bar 1951       0037566 sorrel 95 10 4 3 1 $ 1,491    H- 18.0  P- 0.0 Mr Senbar 1968       0542728 bay      H- 0.0  P- 51.0 Dandy Doll 1948       0026556 chestnut 85 21 5 2 3 $ 876    H- 0.0  P- 1.5 Miss Sen Sen 1947       0024153 bay 85 44 7 7 3 $ 4,538      Okie Smokie 1       U0077226 n/a        Mr Senjell 1978       1416170 buckskin        Miss El Peco 1943       0010040 chestnut 75 7 2 3 0 $ 28      Barjo 1949       0032673 chestnut 95 78 21 20 12 $ 11,573      Three Bars (TB) 1940      T0065983 chestnut        Barjelle 1961       0184878 buckskin 95 61 11 11 8 $ 8,509      Betty Joe 1940       0000831 bay        Hancock Belle 1944       0005593 dun        Buck 8       U0240700 n/a        Miss Hancock 1       U0076388 n/a        DAM side of pedigree Doc's Gold Mine 1969       0697898 palomino      H- 7.0  P- 19.0 Doc Bar 1956       0076136 chestnut 75 4 0 0 1 $ 95    H- 36.0  P- 0.0 Gold Mine Sonny 1976       1221717 palomino        Saucy Bess 1957       0081082 palomino      H- 78.0  P- 10.0 Konspikuus 1956       0071418 sorrel        Poco Shade 1951       0048114 dun      H- 2.0  P- 11.5 Docs Candy Cat 1980       1676284 red roan        Sheffieldpeachescrea 1948       0028710 roan        Sugar Bars 1951       0042606 sorrel 95 30 7 4 7 $ 3,166    H- 2.0  P- 0.0 Three Bars (TB) 1940      T0065983 chestnut        Sugar Bar Lil 1961       0163407 bay        Frontera Sugar 1943       0005731 palomino        Pond Lillie 1949       0044388 gray        Pondfly 1945       0011636 dun        
J a Mare 1       U0072561 n/a          --> 1 PEDIGREE RECORD 2003, AQHA. All Rights Reserved.
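
For checking a file like this before importing it, here is a minimal sketch in Python (it is not part of the ASP script) that splits one record into its fields. The field names are assumptions read off the sample records above: the first field is the record number, followed by the file name, title, description, date, and page text. The two other numeric fields are not explained anywhere, so they only get placeholder names.

```python
# Field names are assumptions inferred from the sample records, not from the script.
FIELD_NAMES = ["record_id", "file_name", "title", "description",
               "number_a", "date", "number_b", "content"]


def parse_record(line):
    """Split one pipe-delimited record into a dict of named fields."""
    # Limiting the number of splits keeps any stray "|" in the page text
    # inside the last field instead of creating extra columns.
    parts = line.rstrip("\r\n").split("|", len(FIELD_NAMES) - 1)
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    record = {name: value.strip() for name, value in zip(FIELD_NAMES, parts)}
    record["record_id"] = int(record["record_id"])
    return record


# The short .jpg record from the sample output above.
sample = ("187|frecklessanpeppy.jpg|No Title|"
          "There is no description available for this page| 337|"
          "Monday, August 19, 2002|37|")
record = parse_record(sample)
print(record["record_id"], record["file_name"], record["date"])
```

A record that raises ValueError has too few fields, which usually means a line break slipped into the page text and split the record in two.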

I used the pipe character as the field delimiter for the csv output.
I made the search input text box hidden and set its value to an asterisk "*", because that is the traditional wildcard. I removed the code that displays the message when no results are found, so that by default the script shows everything it reads.

I used the variable intNumFilesShown as the primary key for each record when imported into Access.
As it runs, the script automatically creates a csv file from the output.
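
The numbering and file writing can be sketched roughly like this in Python (an illustration only, not the ASP code). The counter num_files_shown plays the role of intNumFilesShown. Replacing any pipe inside a field with a space is an assumption added here so that page text cannot be mistaken for a delimiter; it is not known whether the original script does this.

```python
import io


def write_records(rows, out):
    """Write rows as pipe-delimited lines, numbering each one as its key."""
    num_files_shown = 0  # stands in for intNumFilesShown
    for row in rows:
        num_files_shown += 1
        # Assumption: a "|" inside a field is replaced with a space so it
        # cannot be read as a field delimiter on import.
        fields = [str(num_files_shown)]
        fields += [str(value).replace("|", " ") for value in row]
        out.write("|".join(fields) + "\n")
    return num_files_shown


# io.StringIO stands in for the csv file on disk; the rows are made-up examples.
buffer = io.StringIO()
count = write_records(
    [("a.htm", "Title A", "text a"), ("b.htm", "Title | B", "text b")],
    buffer,
)
print(buffer.getvalue(), end="")
```

Because the key is just a running count, it is only unique within one run. Indexing new files later would need the count to start from the highest number already in the database.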

HTML tags, carriage returns, line feeds, and blank spaces are filtered out.
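
For comparison, here is a minimal Python sketch of the same kind of filtering, using the standard html.parser module. It is a different technique from whatever the ASP script does, not a port of it. It also skips the contents of style and script blocks, which is an addition: the sample output above shows that stylesheet text still comes through in the original.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of a page, ignoring tags, style and script blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the page text on one line, with runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # split() with no arguments drops carriage returns, line feeds, tabs,
    # and repeated spaces in one step.
    return " ".join(" ".join(parser.chunks).split())


page = ("<html><head><title>Pedigree</title>"
        "<style>A { COLOR: blue }</style></head>\r\n"
        "<body><p>Tiegs Te   1985\r\nchestnut mare</p></body></html>")
print(html_to_text(page))
```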

I used it to index just under 700 pedigree html files. It took quite a while for the script to grind through that many files. When I index files again, I will set the script to index only the new files. If you have a large number of files, you may want to set the script to process only a certain number at a time. I would have to study up on how to make it work that way. Maybe you can tell me how. :)
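
One possible approach to both ideas (new files only, and a limited number per run) is sketched below in Python rather than ASP, so it would need translating. It assumes the file name is the second field of each record, as in the sample output, and that only .htm and .html files should be indexed. The function names, the batch size of 50, and the extension list are all made up for illustration.

```python
import os


def indexed_names(csv_path):
    """Collect file names (second field) from an existing pipe-delimited file."""
    names = set()
    if os.path.exists(csv_path):
        with open(csv_path, encoding="utf-8") as handle:
            for line in handle:
                parts = line.split("|")
                if len(parts) > 1:
                    names.add(parts[1])
    return names


def next_batch(folder, already_indexed, batch_size=50,
               extensions=(".htm", ".html")):
    """Return up to batch_size file names in folder that are not yet indexed."""
    candidates = sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(extensions) and name not in already_indexed
    )
    return candidates[:batch_size]
```

Each run would call indexed_names on the csv file from earlier runs, pass the result to next_batch, and index only the names it returns. Repeating that until next_batch returns an empty list works through the whole folder in small pieces.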

I would be grateful for suggestions on how to improve this, or if someone else would take the idea and improve it.

I spent HOURS putting this script together. Now that it is done, I think the idea is fairly "simple", but I had to learn how to do some stuff I didn't know how to do before. If you use this script or find it helpful, I would appreciate a rating on www.aspin.com or an email to let me know how you're using it. (As usual, donations to my PayPal account are an excellent sign of appreciation.) --Lil

My favorite scripts site is CODANGO.


