Introduction
I want to share my experience of downloading the JDK with Java from Oracle.com. I needed to automate the site navigation a user has to perform to download Java. There are several scenarios, some of which require logging into an Oracle account. The result of my work is shared on GitHub. Installing Java apps has always been annoying for both developers and end users, and the goal of this article is to show a less painful way to install the JDK automatically. By the way, this approach is used in my remote automation tool, Bear.
Background
When downloading the latest stable version of the JDK from Oracle.com, a user needs only two clicks.
In this article, I will highlight only the most interesting moments of solving this task with Java. If you have any questions, leave them in the comment section below, and I will answer and update the article.
The result is a few lines of code to download the JDK:
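import java.io.File;
import java.util.concurrent.TimeUnit;

import bear.fx.DownloadFxApp;
import static bear.fx.DownloadFxApp.*;

// DownloadJdk is only a wrapper class name chosen for this snippet
public class DownloadJdk {
    public static void main(String[] args) throws Exception {
        version = "7u51";             // JDK version to download
        tempDestDir = new File(".");  // directory to save the archive into
        miniMode = false;
        launchUI();
        // block until the download finishes or one hour passes
        DownloadFxApp downloadFxApp = getInstance().awaitDownload(1, TimeUnit.HOURS);
        DownloadFxApp.DownloadResult r = downloadFxApp.downloadResult.get();
        downloadFxApp.stop();
        System.out.println("Downloaded to: " + r.file.getPath());
    }
}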
There are no publicly available JDK archives on the net because Oracle's license prohibits them, even though Java is considered an open platform. Which is a strange thing, Oracle.
This article contains a small portion of criticism of Oracle, which is meant to be constructive. Having been a Java developer for 8+ years, I really want to avoid any sort of confrontation when I highlight Oracle's misses.
Which Approach or Technology to Choose?
I found two existing open source solutions which download JDK from Oracle's site:
- (Ruby) capistrano-jdk-installer
- (Java) <a href="http://grepcode.com/file/repo1.maven.org/maven2/org.eclipse.hudson/hudson-core/3.1.0/hudson/tools/JDKInstaller.java#JDKInstaller">JDKInstaller</a> in Jenkins Sources
The first approach is in Ruby, and I tried to mimic it using both HttpClient and <a href="https://github.com/GistLabs/mechanize">Mechanize</a>. This approach depends on a pre-populated list of JDK URLs hosted on the Jenkins site, and you also need to log in to Oracle's site to download older releases. While implementing this, I had issues with SSL certificate support and with the login not persisting across subsequent requests. In my experience, such solutions usually just don't work or require too much maintenance, so I decided to move on to something less confusing.
That something is browser emulation. Of course! You can always download a file the way a user would: browse the site, enter your data, fill in the forms, read the yada yada. -1 to Oracle for the forms to fill out, the UI of the 90s and the really messy experience of downloading a previous JDK version. Oracle receives another -1 for blocking my account. :-) Don't worry, this most probably won't happen to you - I just had too many debug sessions. Updating in time is nice, but in most situations that is not what happens once your code goes live.
Selenium WebDriver is fine and stable, but it brings a lot of dependencies and might not be an option for a project that does not already use it. So my choice was WebView from JavaFX 2.
Using WebView
WebView is a simple JavaFX component which is based on the WebKit browser engine and can be used to browse web pages. The whole browsing experience is very smooth and hard to distinguish from a regular browser like Chrome or Firefox. Java 8 also received a new fast JavaScript engine called Nashorn, which is claimed to be up to 20 times faster than Rhino from Java 7, but it runs scripts on the JVM and is separate from the WebKit JavaScript engine that executes pages inside WebView. (OK, +5 to Oracle for Nashorn, -1 for the lack of JSR-223 integration in WebView out of the box.)
My thoughts on WebView vs. JavaFX? Nowadays desktop apps seem a bit outdated, so I'm confused by Oracle replacing Swing with JavaFX 2. Why can't they use WebView to render modern web UI technologies like Backbone or AngularJS and bind them to the app? For example, Bear uses JavaFX to render a local web page UI, so instead of JavaFX 2 components it uses AngularJS and Twitter Bootstrap. When this becomes mainstream and desktop and web apps are transparent for the developer (i.e. you write your UI app for a server and the same code can also run locally), why would anyone want to hire a JavaFX developer to create some crazy FXML forms when they could build the app on a more productive JavaScript UI stack? How many times cheaper would this be? 2, 5, 10? I really hope Oracle will consider focusing on the modern web, which is growing so rapidly.
The Algorithm
Below is the workflow scheme for traversing the download pages and filling in the registration forms. The entry point is either the Latest JDK Version page or the Oracle Java Archive. Some elements of this scheme are described in more detail below.
SimpleBrowser
I developed a SimpleBrowser component which is a wrapper for WebView. At the moment, SimpleBrowser adds the following features to WebView:
- useFirebug - Embeds Firebug Lite panel
- useJQuery - Embeds jQuery when loading a page
- load(url, callback) - Opens a page and runs a callback when it is ready
- getHTML() - Gets HTML of a current page
- createWebView - WebView is an optional visual component; this enables a WebEngine-only mode. In Java 8, however, one is still required to start the stage, and there is a feature request in the JavaFX tracker to make JavaFX available for headless environments
- waitFor() - Waits for a jQuery condition to become true, i.e., an element to appear
- Logging via JavaScript's alert()s
Getting Links from a Web Page
This is simple when you have jQuery - just add a function which uses jQuery selectors to visit DOM tree nodes and return the link found:
browser.getEngine().executeScript(
    scriptText() + "\n " +
    "downloadIfFound('" + version + "', true, 'linux');"
);
The source of scriptText() is on GitHub. Note that this script is quite long, and it is not really efficient to compile it for each Java-to-JS call, but it's OK for less demanding tasks or demonstration purposes.
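Concatenating values straight into the script text breaks as soon as a value contains a quote or a backslash. Below is a minimal sketch of a helper that quotes arguments before they reach executeScript. The JsCall class and its methods are hypothetical and not part of SimpleBrowser, and the escaping covers only quotes, backslashes and line breaks.

```java
final class JsCall {
    private JsCall() {}

    /** Quotes a Java string as a single-quoted JavaScript string literal. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("'");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\'': sb.append("\\'"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default: sb.append(c);
            }
        }
        return sb.append('\'').toString();
    }

    /** Builds "name(arg1, arg2, ...);". Strings are quoted, other values are printed as-is. */
    static String call(String function, Object... args) {
        StringBuilder sb = new StringBuilder(function).append('(');
        for (int i = 0; i < args.length; i++) {
            if (i > 0) sb.append(", ");
            Object a = args[i];
            sb.append(a instanceof String ? quote((String) a) : String.valueOf(a));
        }
        return sb.append(");").toString();
    }

    public static void main(String[] args) {
        // Builds the same call as the snippet in the article
        System.out.println(call("downloadIfFound", "7u51", true, "linux"));
    }
}
```

With such a helper, the second argument of the concatenation could become JsCall.call("downloadIfFound", version, true, "linux").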
Clicking a Link
I copied the code from this Stackoverflow.com answer: How to simulate a click with JavaScript? clickIt clicks the result of a jQuery selector, which spawns navigation events for WebEngine.
var clickIt = function($el){
    var el = $el[0];   // unwrap the DOM element from the jQuery object
    var etype = 'click';
    clickDom(el, etype);
};
var clickDom = function(el, etype){
    if (el.fireEvent) {
        // legacy Internet Explorer event model
        el.fireEvent('on' + etype);
    } else {
        var evObj = document.createEvent('Events');
        evObj.initEvent(etype, true, false);
        el.dispatchEvent(evObj);
    }
};
Embedding jQuery
To embed jQuery (or any other library) into an existing page, one only needs to add a <script ...> entry to the <head ...> node, which is easy with JavaScript.
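As a rough illustration, the Java side can generate the JavaScript that adds the node. This is a sketch under assumptions rather than the SimpleBrowser source: the ScriptEmbed class, its method names and the example URL are hypothetical, and only the window.simpleBrowser.jQueryReady callback name is taken from this article.

```java
final class ScriptEmbed {
    private ScriptEmbed() {}

    /** Quotes a value as a single-quoted JavaScript string literal. */
    static String jsString(String s) {
        return "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    /**
     * Returns JavaScript that appends a script node to the head element and
     * notifies Java through window.simpleBrowser.jQueryReady(eventId) once
     * the script has loaded.
     */
    static String embedScript(String scriptUrl, int eventId) {
        return "(function(){"
             + "var node = document.createElement('script');"
             + "node.type = 'text/javascript';"
             + "node.src = " + jsString(scriptUrl) + ";"
             + "node.onload = function(){ window.simpleBrowser.jQueryReady(" + eventId + "); };"
             + "document.getElementsByTagName('head')[0].appendChild(node);"
             + "})();";
    }

    public static void main(String[] args) {
        // Hypothetical URL; point it at the jQuery build you actually use.
        System.out.println(embedScript("https://example.com/jquery.min.js", 42));
    }
}
```

The generated text would be passed to WebEngine.executeScript. The sketch assumes that a Java object has already been exposed to the page as window.simpleBrowser, for example with JSObject.setMember.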
The problem is that this script is loaded asynchronously, so I needed to provide a Java callback which is called from JavaScript when the script loads. Loads that follow each other too quickly could also be a problem, so there is a queue which stores callbacks with their eventIds and fires each one when the script finishes loading inside the web page for its eventId.
A short version of a page load schema is shown below.
Note: This is all hidden from the user of SimpleBrowser.
void SimpleBrowser::load(url, Runnable onLoad) {
    int eventId = random.nextInt();
    webEngine.load(url);
    whenPageLoaded({
        // remember the callback until the page reports that jQuery is ready
        jQueryLoads.put(eventId, onLoad);
        if (useJQuery) embedJQuery(eventId);
    } as Runnable);
}

void SimpleBrowser::embedJQuery(int eventId) {
    webEngine.executeScript("insertNode($eventId);");
}

// JavaScript side
var insertNode = function(eventId){
    var scriptNode = ...;
    scriptNode.onload = function(){
        // calls back into Java
        window.simpleBrowser.jQueryReady(eventId);
    };
};
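The callback queue from this schema can be sketched in plain Java. This is an assumption about one reasonable shape, not the SimpleBrowser source: the JQueryCallbacks class and its register and pendingCount methods are hypothetical, while jQueryReady mirrors the method the page calls. The sketch uses a counter instead of random.nextInt() so that two loads can never share an eventId.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class JQueryCallbacks {
    private final Map<Integer, Runnable> pending = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    /** Stores the callback and returns the eventId to hand to the page script. */
    int register(Runnable onLoad) {
        int eventId = nextId.incrementAndGet();
        pending.put(eventId, onLoad);
        return eventId;
    }

    /**
     * Meant to be called from the page when the script for eventId has loaded.
     * Runs the callback once; returns false for unknown or already-fired ids.
     */
    boolean jQueryReady(int eventId) {
        Runnable onLoad = pending.remove(eventId);
        if (onLoad == null) return false;
        onLoad.run();
        return true;
    }

    /** Number of callbacks still waiting for their script to load. */
    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        JQueryCallbacks callbacks = new JQueryCallbacks();
        int id = callbacks.register(() -> System.out.println("jQuery is ready"));
        callbacks.jQueryReady(id); // in the browser this call comes from JavaScript
    }
}
```

Removing the entry before running it means a duplicate onload notification for the same eventId cannot fire the callback twice.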
Following Redirects
WebView does this for you. However, I was unable to wait for a page to load after a redirect in the normal way; I don't know whether this is a bug. So I added SimpleBrowser::waitFor, which waits in a separate Thread for an element to appear, based on a jQuery condition, e.g. $('#id').length > 0.
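A polling wait of this kind can be sketched with the standard library alone. The Poller class, its parameters and the use of a BooleanSupplier are assumptions made for illustration. In SimpleBrowser the condition would be a jQuery expression evaluated inside the page, and in JavaFX that evaluation has to run on the JavaFX application thread.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class Poller {
    private Poller() {}

    /**
     * Re-checks the condition every pollMillis until it holds or the timeout expires.
     * Returns true if the condition became true in time, false on timeout or interrupt.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) return true;
            if (System.nanoTime() - deadline >= 0) return false;
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for a page check such as $('#id').length > 0
        boolean appeared = waitFor(() -> System.currentTimeMillis() - start > 200, 2000, 50);
        System.out.println("Element appeared: " + appeared);
    }
}
```

A timeout matters here: without one, a page that never shows the expected element would leave the waiting thread running forever.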
Logging
I found two ways to log from JavaScript, and at the moment both are more of a joke: -1 to WebView (and +10 to WebView for otherwise being a great and unique technology). The first is to use the console of the third-party Firebug Lite. Though this extension is awesome, I didn't like working with it in conjunction with WebView at all. The latest 1.4 version just didn't work for me, and in some cases you spend a couple of hours almost crying "Firebug panel, please come back!" -1 to Oracle for emotional abuse. Oh, just kidding.
The other one is to define an alert event handler for WebEngine and use alert() to log in JS. It works well, but guys, seriously?
Downloading a File
The download starts when WebEngine fires a location property change event, which provides the download location. I use Apache's HttpClient library to download the JDK in a separate thread, and a bar shows the download progress.
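The copy-with-progress part of the download can be sketched independently of the HTTP library. The example below uses plain java.io streams instead of HttpClient, and the ProgressCopy class and its callback parameter are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.function.LongConsumer;

final class ProgressCopy {
    private ProgressCopy() {}

    /**
     * Copies in to out chunk by chunk and reports the running byte count
     * after every chunk. Returns the total number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, LongConsumer onProgress) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
            onProgress.accept(total);
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeArchive = new byte[100_000];                    // stands in for the HTTP response body
        ByteArrayOutputStream file = new ByteArrayOutputStream();  // stands in for the target file
        copy(new ByteArrayInputStream(fakeArchive), file,
             done -> System.out.printf("%d%%%n", done * 100 / fakeArchive.length));
    }
}
```

In a real download the input stream would come from the HTTP response for the location reported by WebEngine, the output stream would be a file in the destination directory, and the callback would update the progress bar on the JavaFX application thread, for example through Platform.runLater.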
Points of Interest
In this article, I showed how to work with WebView to load pages, embed external JS libraries into a page, click links and wait for content to appear.
One of the messiest things in this process was debugging the states of WebView, understanding the states WebEngine goes through to load pages, and tracking down problems that end in heap dumps, i.e. situations where you do something illegal, like calling a Java method with a bad argument from JavaScript, or weird bugs like calling a method of a GC-ed object (this is my guess at the kind of bugs I did not understand).
Pieces of this code could be used as the basis for a library built upon WebView. I am really happy that in the end I was able to simplify such a challenging task as downloading Java with Java.