Fetch meta tags from crawled HTML docs
I have used Apache Nutch 1.4 to crawl a website. Now I want to fetch the meta tags from every HTML page. Is this possible?
I have just started using Nutch, so I don't even know how to compile the code. For crawling, I downloaded the binary distribution and ran some very simple commands.
So one question I have is: how do I run Nutch if I modify one of the source files?

And what modification can I make so that it shows me the meta tag info for each page's URL?
                    