I am planning to leverage WebView/WebEngine in order to build a web spider.
One of my goals is to be able to save a complete web page (html+images) to my FS (File System).
Here are the options:
a) One option could be to read the DOM tree in Java and save it to the FS.
The whole DOM tree would have to be transferred from C++ to Java.
b) Another option is to leverage WebEngine and to call an API for such a save.
No DOM tree transfer from C++ to Java here, just a Java-to-C++ call in order to ask for the save.
Option (b) involves less processing: since the DOM tree already lives on the WebKit/C++ side, it is simple to save a complete web page directly from WebKit/C++.
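For comparison, here is a minimal sketch of what option (a) looks like today, assuming the page is available as an org.w3c.dom.Document (which is what WebEngine.getDocument() returns). The class name DomSaver and the stand-in document built in main are made up for illustration. It only writes the html; images would still have to be fetched and rewritten separately, which is part of why option (b) is attractive.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of JavaFX.
public class DomSaver {

    /** Serializes a W3C DOM document to the given file as html markup. */
    public static void save(Document doc, Path target)
            throws IOException, TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            transformer.transform(new DOMSource(doc), new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in document; in a WebView application this would be
        // the result of engine.getDocument() once the page has loaded.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        body.setTextContent("hello");
        html.appendChild(body);
        doc.appendChild(html);

        Path target = Files.createTempFile("page", ".html");
        save(doc, target);
        System.out.println(Files.readString(target));
    }
}
```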
So, I would like the WebEngine Java class to have an API offering at least the following methods:
- saveContent(String file) : save only the content currently loaded in WebEngine to a file (the html page only)
- saveComplete(String file) : save the complete content currently loaded in WebEngine (html+images in a directory, with the html file referencing the images in that directory)
- saveContent(String url, String file) : save only the content of the 'url' resource to a file (the html page only)
- saveComplete(String url, String file) : save the complete content of the 'url' resource (html+images in a directory, with the html file referencing the images in that directory)
- getSaveWorker() : get a worker that reports the status of all current saves and allows some of them to be cancelled if needed.
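To make the request concrete, the methods above could be written down as a Java interface along these lines. This is purely a sketch of the proposal: none of these methods exist on javafx.scene.web.WebEngine, and the names PageSaver, SaveTask, SaveWorker and State, as well as the void return types, are my assumptions.

```java
import java.util.List;

/** Hypothetical shape of the requested API; not an existing JavaFX type. */
public interface PageSaver {

    /** Assumed life cycle of a single save. */
    enum State { SCHEDULED, RUNNING, SUCCEEDED, CANCELLED, FAILED }

    /** One pending or finished save (assumed type). */
    interface SaveTask {
        String file();      // target file passed to the save call
        State state();      // current status of this save
        boolean cancel();   // returns true if the save was cancelled
    }

    /** Status view over all current saves (assumed type). */
    interface SaveWorker {
        List<SaveTask> tasks();
    }

    /** Saves the loaded html page only. */
    void saveContent(String file);

    /** Saves the loaded page plus its images in a directory. */
    void saveComplete(String file);

    /** Saves the html of the given url only. */
    void saveContent(String url, String file);

    /** Saves the given url plus its images in a directory. */
    void saveComplete(String url, String file);

    /** Gives access to the status of all current saves. */
    SaveWorker getSaveWorker();
}
```

Whether the save methods should return a per-save handle instead of void is an open design question; the sketch keeps them void to stay close to the list above.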