You can import existing Web resources using wizards that invoke HTTP or FTP. These import wizards automate the transfer of complete Web sites into Web projects.
These import wizards also support importing from Web servers that are protected by firewalls. Both HTTP and FTP import support proxies; FTP import also supports SOCKS.
To use the HTTP or FTP import wizards, you must designate an existing project into which to import the files. After the import, you can view all the files from the imported Web site within the selected project folder.
The HTTP import uses the HTTP protocol to crawl through the Web site, starting from an initial URL that you provide. The import action uses the URL to retrieve any available HTML content and parses that content for HTTP links. The process then repeats for each linked page it encounters within the Web site, parsing that page's content and links in turn. HTTP import cannot parse pages that contain servlets or programs that are executed when a form is posted, or that are embedded in JavaServer Pages (JSPs).
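The crawl described above can be sketched in a few lines of Python. This is only an illustration of the general technique (fetch a page, parse it for links, repeat for each new link), not the wizard's actual implementation. The names `LinkCollector`, `crawl`, and `fetch` are hypothetical, and restricting the crawl to the host of the initial URL is an assumption made for the sketch.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href and src attribute values found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl starting at start_url; returns {url: html}.

    `fetch` is a hypothetical callable that takes a URL and returns the
    HTML response as text, or None if nothing could be retrieved.
    """
    host = urlparse(start_url).netloc  # assumption: stay on the starting host
    pages = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html  # the saved file is the response, not the server-side source
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, link))
            parsed = urlparse(absolute)
            if (parsed.scheme in ("http", "https")
                    and parsed.netloc == host
                    and absolute not in seen):
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Passing the retrieval function in as `fetch` keeps the sketch independent of any particular HTTP client, so it can be tried against an in-memory dictionary of pages before pointing it at a real server.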
The files transferred to your project represent a logical snapshot of the Web site at the URL you specify. This means that your Web project is populated with files built from the HTML responses of the serving site; the physical resources on the serving site are not necessarily copied to your project. For example, an HTTP request for a JSP page returns a rendered HTML response, not the JSP page itself. Use HTTP import for static pages and for sites that do not have FTP access.
To import existing Web resources into the Web project using HTTP, perform the following steps:
For example, you might specify a crawl depth of 2 and an initial URL of http://host/initialLevel/index.html. If index.html contains a reference to http://host/initialLevel/L2/L3/index2.html, then index2.html, which is at level 3, is filtered out and its content is not parsed for follow-on crawling.
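The level arithmetic in this example can be checked with a short Python sketch. It assumes one plausible reading of the example: the directory of the initial URL counts as level 1, and each further subdirectory adds one level. The functions `url_level` and `within_depth` are hypothetical names, and treating URLs outside the initial directory as filtered out is an assumption of the sketch, not documented wizard behavior.

```python
import posixpath
from urllib.parse import urlparse


def url_level(initial_url, candidate_url):
    """Return the level of candidate_url, with the initial URL's directory as level 1.

    Returns None if the candidate is on another host or lies outside the
    initial URL's directory (assumption: such URLs are not crawled).
    """
    start = urlparse(initial_url)
    candidate = urlparse(candidate_url)
    if start.netloc != candidate.netloc:
        return None
    # Compare directory segments, ignoring the file name at the end of each path.
    base_parts = [p for p in posixpath.dirname(start.path).split("/") if p]
    cand_parts = [p for p in posixpath.dirname(candidate.path).split("/") if p]
    if cand_parts[:len(base_parts)] != base_parts:
        return None
    return 1 + len(cand_parts) - len(base_parts)


def within_depth(initial_url, candidate_url, crawl_depth):
    """True if candidate_url is at or above the configured crawl depth."""
    level = url_level(initial_url, candidate_url)
    return level is not None and level <= crawl_depth
```

With the URLs from the example, `url_level` reports level 3 for index2.html, so `within_depth` with a crawl depth of 2 rejects it, while a page directly under http://host/initialLevel/L2/ would still be accepted.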
(C) Copyright IBM Corporation 2000, 2005. All Rights Reserved.