StringSearch

The stringSearch class is used extensively by the getlinks class to return an array of strings containing the URL references found in a file.

The strSearch method takes the absolute path of a file saved on disk as its parameter. It scans the file for occurrences of the <A HREF="......." or <IMG SRC="......." tags, which link to HTML pages or GIF/JPG images, and retrieves the filename associated with each tag. Since each reference string consists of eight characters, an array of eight chars is used as a 'character' shift register. Initially, the first char read from the file is placed at the rightmost position of the shift register. The shift register contents are then compared with the reference strings. A perfect match returns zero; otherwise every char in the shift register is shifted left by one, a new char read from the source file is placed in the freed rightmost position, and the comparison is made again.
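The shift register idea can be tried out in isolation. The following is only a minimal sketch: the class name ShiftRegisterSketch and the method countTags are made up for illustration, and it scans an in-memory string rather than a file so that it runs on its own.

```java
class ShiftRegisterSketch {
    // Hypothetical helper: counts how often either 8-character
    // reference string appears in the given text.
    static int countTags(String text) {
        final String href = "<A HREF=";
        final String img = "<IMG SRC";
        char[] reg = new char[8];   // the 'character' shift register
        int matches = 0;
        for (int n = 0; n < text.length(); n++) {
            // Shift every char one place to the left...
            System.arraycopy(reg, 1, reg, 0, reg.length - 1);
            // ...and place the new char in the freed rightmost position.
            reg[reg.length - 1] = Character.toUpperCase(text.charAt(n));
            String window = new String(reg);
            // compareTo returns zero on a perfect match.
            if (href.compareTo(window) == 0 || img.compareTo(window) == 0) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"a.html\">x</a> <img src=\"b.gif\"></p>";
        System.out.println(countTags(html));   // prints 2
    }
}
```

Converting each char to upper case before it enters the register makes the comparison case-insensitive, so both <a href= and <A HREF= are found.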

In effect, the whole file is passed through a pipe, character by character, and examined for the specified strings. Each URL associated with a matching tag is copied into successive positions of an array of strings, and this array is returned to the calling procedure.
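As an illustration of the copying step, the sketch below collects the text between the two double quotes that follow each match. It is an assumption-laden simplification, not the listing from this page: LinkExtractSketch and extractLinks are hypothetical names, the input is an in-memory string, and a List is used in place of the fixed-size array.

```java
import java.util.ArrayList;
import java.util.List;

class LinkExtractSketch {
    // Hypothetical helper: returns the quoted value that follows each
    // <A HREF= or <IMG SRC tag, in order of appearance.
    static List<String> extractLinks(String text) {
        List<String> links = new ArrayList<>();
        char[] reg = new char[8];   // 8-char shift register
        int pos = 0;
        while (pos < text.length()) {
            System.arraycopy(reg, 1, reg, 0, 7);
            reg[7] = Character.toUpperCase(text.charAt(pos++));
            String window = new String(reg);
            if (!window.equals("<A HREF=") && !window.equals("<IMG SRC")) {
                continue;   // no tag yet, keep feeding the register
            }
            int open = text.indexOf('"', pos);         // 1st "
            if (open < 0) break;
            int close = text.indexOf('"', open + 1);   // 2nd "
            if (close < 0) break;
            links.add(text.substring(open + 1, close));
            pos = close + 1;   // resume scanning after the URL
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"a.html\">x</a><img src=\"b.gif\">";
        System.out.println(extractLinks(html));   // prints [a.html, b.gif]
    }
}
```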

Loopbacks occur if the same URL is referenced repeatedly in the same file or in different files. A call to xnoloops( ) therefore ensures that the returned array does not contain multiple references to the same URL.
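The source of xnoloops( ) is not shown on this page, so the following is only a guess at the kind of check it performs. SeenUrlsSketch and isNew are hypothetical names, and the use of a HashSet is an assumption rather than the original method.

```java
import java.util.HashSet;
import java.util.Set;

class SeenUrlsSketch {
    // URLs that have already been handed out once.
    private final Set<String> seen = new HashSet<>();

    // Returns true only the first time a given URL is offered.
    // Set.add returns false when the element is already present.
    boolean isNew(String url) {
        return url != null && seen.add(url);
    }

    public static void main(String[] args) {
        SeenUrlsSketch urls = new SeenUrlsSketch();
        System.out.println(urls.isNew("a.html"));   // prints true
        System.out.println(urls.isNew("a.html"));   // prints false
    }
}
```

Keeping one such object for the whole crawl, rather than one per file, would also catch repeated references across different files.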

Unlike C, Java does not support comparison between a char[ ] and a String. A char[ ] to String conversion, using the String(char value[ ]) constructor, therefore has to be performed every time before a comparison.
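A small sketch of that conversion follows. The class name CharArrayCompareSketch and the method sameText are invented for the example; only the String(char[]) constructor and compareTo come from the text above.

```java
class CharArrayCompareSketch {
    // Returns true when the register contents spell exactly the pattern.
    static boolean sameText(char[] register, String pattern) {
        String asString = new String(register);    // String(char value[]) constructor
        return pattern.compareTo(asString) == 0;   // zero means a perfect match
    }

    public static void main(String[] args) {
        char[] register = {'<', 'A', ' ', 'H', 'R', 'E', 'F', '='};
        System.out.println(sameText(register, "<A HREF="));   // prints true
        System.out.println(sameText(register, "<IMG SRC"));   // prints false
    }
}
```

This allocates a new String for every char read. Comparing against pattern.toCharArray( ) with java.util.Arrays.equals would avoid that, at the cost of departing from the approach described here.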

// filename: stringSearch.java
// sint.refstr, settrims.mytrims( ) and settrims.xnoloops( ) are defined outside this file.
import java.io.*;
import java.util.*;

public class stringSearch
{
    static int nlinks = 0;  // nlinks is used to return exact no of links.
                            // Used in getlinks.

    public stringSearch( ) { }  // constructor

    // String[ ] since it has to return all the file names;
    // individual filename is stored as a string in an array
    public static String[ ] strSearch(String absfilename) throws Exception
    {
        File fp;
        FileInputStream fip;
        String sr = "<A HREF=";  // HTML format. For other scripts replace the strings.
        String sr1 = "<IMG SRC";
        String arraytostring;
        char aa[ ];
        int i = 0;
        int k;
        int ptr, ptr1;
        int count = 0;
        int z;
        char myfile[ ];
        myfile = new char[100];
        String strlinknames[ ];
        strlinknames = new String[400];
        // contents of myfile array, after converting
        // as string are stored in strlinknames and returned
        // refer #1
        fp = new File(absfilename);
        aa = new char[8];
        if(fp.exists( ) && fp.isFile( ) && fp.canRead( ))
        {
            // open the stream only after the checks have passed
            fip = new FileInputStream(fp);
            nlinks = 0;
            boolean slash = false;  // must be initialised to compile; slash skipping stays disabled
            while((i=fip.read( ))!=-1)
            {
                for(k=1;k<=7;k++)  // shift the register left by one
                {
                    aa[k-1] = Character.toUpperCase(aa[k]);
                }
                aa[7] = Character.toUpperCase((char)i);  // new char at the rightmost position
                arraytostring = new String(aa);
                ptr = sr.compareTo(arraytostring);
                ptr1 = sr1.compareTo(arraytostring);
                if(ptr == 0 || ptr1 == 0)
                {
                    while((i=fip.read( ))!='"' && i!=-1);  // waits till it gets 1st " (or end of file)
                    if(i=='"')  // HTML format. For java script you can check for '
                    {
                        // slash = true;
                        while((i=fip.read( ))!='"' && i!=-1)  // waits till it gets 2nd "
                        {
                            if((i == '/') && (slash == true))
                            {
                                slash = false;
                                i = fip.read( );
                            }
                            if(count < myfile.length)  // do not run past the buffer
                            {
                                myfile[count] = (char)i;  // copies contents betn " "
                                count++;                  // i.e. copies file/addrsname
                            }
                        } // end while
                        count = 0;
                    } // end if(i=='"')
                    String myfiletostring;  //#1
                    myfiletostring = new String(myfile);  //#1
                    myfiletostring = myfiletostring.toLowerCase(Locale.getDefault( ));
                    if(myfiletostring.lastIndexOf(sint.refstr)!= -1 || ptr1 == 0)
                    {
                        // 7/7/97: ignore <a href="mailto: i.e. mailto URL part entirely.
                        // The string is lower case by now, so test for "mailto:".
                        if(!myfiletostring.startsWith("mailto:"))
                        {
                            String ss1;
                            ss1 = settrims.mytrims(myfiletostring);  // refer (b)
                            //System.out.println(" ####### SS1 = "+ss1);
                            if(ss1!=null && settrims.xnoloops(ss1) && nlinks < strlinknames.length)
                            {
                                strlinknames[nlinks] = myfiletostring;  //#1
                                nlinks++;
                            }
                        }
                    }
                    for(z=0;z<100;z++)  // To ensure that array is cleared run **ak352.java
                        myfile[z]='\0';
                } // end if ptr
            } // end while -1
            fip.close( );
        } // end if file check
        return strlinknames;  //#1 return the root links to linknames
    } // end strSearch( )
} // end of class stringSearch
