Part 1: Welcome Back
In high school, I always watched guys from the "cool kids
club" show up late to a party, make a huge splashy entrance, get all the
attention, and leave again with the girls I’d wanted to talk to. Getting to the
party early meant I got a good seat to watch the various goings-on, but
sometimes, if you were too early, there wasn’t much of a crowd to talk to.
The same, it seems, is true of technologies: the first
entrants into a new technology space sometimes find that the crowd isn’t yet
ready to hear about their offering, at least not until there are a few more
players and a better understanding of what this new thing is all about. This
was certainly true of the cloud: In 2008, Google joined (arguably the verb
should be "launched") the nascent cloud movement with the release of "Google
App Engine", but between the "Why is this better?" questions and the hoopla
raised by the other companies jumping onto the cloud bandwagon with their cloud
plays (Azure, Heroku, VMware, and others) shortly thereafter, it became pretty
easy to lose sight of Google’s offerings.
What to do? Same thing party veterans learned to do: Leave,
then come back again, this time dressed to kill and with a big entrance.
Google’s "Google Cloud Platform" announcement in 2012 not only gave Google App Engine (and the developers using it) a slew of new partners to
help complement the act of developing applications in the cloud, but also swung
renewed focus back onto the company’s offerings and the opportunities to write code
that runs on one of the biggest server farms in the world. For those who’ve
never looked at Google’s Cloud Platform implementation, it’s well past time to
do so. For those who looked into Google App Engine back in 2008, welcome back,
and stick around—it’s a whole new ballgame.
This is the first of several articles on Google Cloud
Platform, in which we’ll walk through getting started with Google App Engine
(since most new applications start with some kind of code). Then, in successive
articles, we’ll examine more of the Google Cloud Platform feature set, including
some of the APIs and tools offered by Google App Engine and the data storage
options within Google Cloud Platform. We’re not going to cover the whole of
it—a whole book would have a hard time doing that in any level of detail—but
we’ll cover enough to get started and know where else to look when needed.
Overview
Google Cloud Platform consists of several components that
combine to form a comprehensive platform for developing applications.
Compute
Google App Engine (PaaS) and Google Compute Engine (IaaS)
are the server-side processing components. Compute Engine lets you run Linux
virtual machines in the cloud (similar to Amazon’s EC2). Of the two, we’ll
focus on Google App Engine, the more traditional approach: an application
hosting platform that takes care of the servers for you. For Google App
Engine, developers write code in one of four languages, usually either Python
or Java. Google App Engine also has support for PHP and Go. In this and other
articles in this series, we’ll use Java; that shouldn’t be taken as a technical
judgment—Python is an equally viable option for those who prefer dynamic
scripting languages over statically-typed compiled ones.
That said, bear in mind that Google App Engine runs Java
bytecode (and doesn’t care where that bytecode came from), so those developers
who want to write in Scala or Clojure or any other JVM-based language are very
welcome, so long as they can compile it into .class files and collect it into a
JavaEE WAR format. (Google doesn’t directly offer support for anything beyond
Java, but as we’ll see later, if it runs on the local machine, it should run in
Google Cloud Platform, so this isn’t quite as scary a prospect as it sounds.)
Storage
Most cloud applications will need to store data, and Google
Cloud Platform offers three options for doing so: Google Cloud Datastore
(NoSQL, non-relational database), Google Cloud SQL (MySQL, relational database), and Google Cloud Storage
(object / blob storage). Of the three, Google Cloud SQL is a relatively
straightforward choice for storing data in a relational format, given that most
Web developers are already familiar with relational databases.
Not everything fits into a relational format, though, and
rather than struggle to force objects that don’t really follow a relational
model into one, Google provides Google Cloud Storage, which offers a
"bucket-oriented" model for storing data and metadata. It’s a model that
vaguely resembles Amazon’s S3 storage system, but with significant enough
differences that it deserves its own discussion.
Analysis
Although we won’t cover it in this
series, Google also offers Big Data options. Google BigQuery allows for
large-scale queries against terabyte-sized data sets. Or, for those more
Hadoop-ish minded, there’s also Hadoop
on Compute Engine.
Extras
Google Cloud Platform also offers some non-traditional cloud
options. One such is Google Prediction API, which offers some machine-learning
algorithms for those applications that can benefit from things like customer
feedback analysis, spam detection, or document classification. Another is
Google Translate API, which provides language-translation facilities, obviating
much of the need for developers to do the traditional things regarding
internationalization (storing text in resources, fetching the translated
versions of those resources according to user-selected localization settings,
and so on).
But before any of those spiffy features can get turned on, a
developer should get started with the basics of Google App Engine.
Getting Started with Google App Engine
The first step with any tool or technology is to create a
"Hello World" application in said tool, and this one will be no different. It
begins by picking one of the four supported languages for Google App Engine (in
this case, I choose Java), downloading the appropriate tools for that language
if they aren’t already present on the developer’s machine (such as, in this case,
the JDK) as well as the language-specific Google Cloud SDK (in this case, the
Java one), available from Google. Once all those tools are in place,
it’s time to create the Hello World application. Although setting up a Google
Cloud Platform account will be necessary before deployment, we don’t need it to
develop the application, since the tools include a local web
server that mimics the behavior of Google App Engine.
Once everything’s installed on the development machine, put
the Google App Engine SDK’s "bin" directory on the PATH, and verify that it’s
working by issuing the "appcfg" command at the command-line. This batch
file/shell script is a command-line tool that provides a variety of
utilities related to uploading, downloading, and tracking the application once
it’s in the cloud; our only interest in it at the moment is to make sure the
SDK installed correctly. Assuming it works, we can start to look at developing code
for Google App Engine.
Google App Engine
Fundamentally, Google App Engine for Java
is a Java Servlet platform, which means that Java developers will be working
with something both familiar and well-understood, not to mention extensible.
(Many of the "alternative" JVM-based languages and web toolkits assume a
baseline platform based on Servlets and the corresponding Servlet
Specification, which means that, for the most part, they should work with
minimal, if any, modification.) Given the popularity and ubiquity of the
Servlet platform, we won’t be covering the ins-and-outs of writing a servlet
here; numerous resources exist that discuss Servlets in some detail, including
tutorials on the Oracle website as well as here on CodeProject. Taking
Servlets as the base platform also means that all of the
traditional Java-Web players are available for use on Google App Engine,
such as Java Server Pages (JSPs). However, basing the platform on Servlets does
mean that, in contrast to some of the recent Web toolkits’ desires to embrace the
Ruby-on-Rails-style "convention over configuration", several different
configuration files must be in place for the Java-based Google App Engine
code to work correctly.
In addition, there are several tools
that will typically be in use during the development of a Google App Engine
application, and in an effort to simplify the use of those tools, Google offers
an Eclipse-based plugin. Since not everybody uses Eclipse for servlet
development, however, Google also offers Ant integration, and since it’s
helpful to understand what’s happening "under the covers", so to speak, we will
use the Ant script instead of the Eclipse plugin.
In the grand tradition of Java developers everywhere, which
is to figure out a base Ant script and copy-paste that over and over again, a
useful starting point is to use the example Ant script listed in Figure 1. It
assumes a basic directory structure consisting of two subdirectories: "src",
containing the usual subdirectory-structured package structure that is common
to most Java projects, and "war", which is the WAR (Web ARchive)-based layout
for the resources and code of the application. Again, these are described in
great detail in other places, but for those who don’t remember the details, the
WAR layout looks roughly as described in Figure 2. The only other dependency in
the Ant file is on a property named "sdk.dir", which must point to the root of the Google
App Engine SDK installation; in this (slightly modified) version of the example
Ant script, that value is being pulled out of the environment, so as to allow
for global installs of the SDK, instead of being relative to the project
directory as the example Ant script assumes.
As with any Ant script, running "ant -projecthelp" will list
all the targets, but several targets are immediately obvious:
-
"compile" takes the source code, compiles it to
"war/WEB-INF/classes", where the WAR format expects compiled class files to
reside, and copies over Google App Engine JAR files from the SDK into
"war/WEB-INF/lib", where (again) the WAR format expects dependent JARs to
reside. Note that if the application makes use of any libraries outside of Google App Engine, it’s the developers’ responsibility to put them into
"war/WEB-INF/lib" themselves; the Ant script will assume any JAR files in that
directory are part of the compile process and put them on the compilation
CLASSPATH.
-
"datanucleusenhance" will take the compiled code and run an
"enhancement" process on it, preparing it for data access using the custom Ant
task "enhance_war". Since the application doesn’t (yet) require any data
access, this step is unnecessary, but it expects an "enhancement descriptor"
file that we won’t be discussing (or building) yet, so for the moment, comment
out this part of the task; otherwise we’ll run into errors we don’t want to
deal with quite yet.
-
"runserver" launches a local HTTP server that mimics the Google App
Engine environment, by default using the local machine’s 8080 port. This is a
relatively popular port number for local/development web servers to use, so if
there’s an error launching, check to make sure there isn’t another server
currently listening on that port. It also uses the "war" subdirectory as the
"root" of the web server, so any HTML files in the "war" subdirectory should be
immediately browsable and visible. This task doesn’t return when run, so
launching "ant runserver" should only be done from a separate terminal/command
prompt window.
-
"update" and "rollback" will upload and roll back, respectively,
the application to Google Cloud Platform, which we’re also not yet ready
for—more on this later.
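If "ant runserver" refuses to start and a port conflict is the suspect, a few lines of Java can confirm it. What follows is a minimal sketch and not part of the SDK: the class name "PortCheck" is made up for illustration, and all it does is try to bind the port the development server would want (8080 unless another is given on the command line).

```java
import java.io.IOException;
import java.net.ServerSocket;

// Sketch: checks whether a local TCP port can be bound, which is a quick way
// to tell if some other server is already listening where "ant runserver"
// wants to be. PortCheck is a hypothetical helper, not an SDK class.
final class PortCheck {
    private PortCheck() { }

    /** Returns true if a listening socket can be opened on the port right now. */
    static boolean isFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false; // bind failed: most likely something else owns the port
        }
    }

    public static void main(String[] args) {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8080;
        System.out.println("Port " + port + (isFree(port) ? " is free" : " is in use"));
    }
}
```

Run it before launching the development server; if it reports the port as in use, shut down the other server first.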
Aside from configuring the SDK location (and commenting out
the for-the-moment unnecessary enhancement step), the Ant script is pretty much
good-to-go. Thus, if we kick off "ant runserver", even without any code, we
should see the contents of a simple "index.html" file in the root of the "war"
subdirectory.
Configurations
Well, sort of good-to-go. The Ant script itself is ready,
but two configuration files need to be in place before the Ant script can
launch the development Web server. One is the Servlet-mandated "web.xml" file,
which describes (among other things) the relationship of URL patterns to
servlets within the app, and the other is Google App Engine-mandated
"appengine-web.xml" file, which describes Google App Engine-specific elements
of this application.
The first, "web.xml", can be easily slapped together by
starting with the world’s simplest web.xml file, shown in Figure 3. The
"<welcome-file-list>" element simply describes the list, in order, of
files the server will use for the "default" resource served up when hit by a
request. The other elements we will add, in a moment, will be the
"<servlet>" element to describe a servlet class and give it a unique (to
this application) name, and "<servlet-mapping>", which will take a URL
pattern and associate it with a servlet by name.
(Note, by the way, that the order these elements appear in
the "web.xml" file is significant; see the Servlet Specification for more
details, or use the XSD for code-completion help in your favorite editor or IDE
to avoid stupid order-dependent errors and mistakes.)
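To make the two-step relationship concrete (URL pattern to servlet name, servlet name to class), here is a deliberately simplified model in plain Java. It is only an illustration of the idea, not how any container is implemented: "MappingModel" is a made-up class, it handles exact-match patterns only (real containers also support prefix and extension patterns), and the name, class, and URL it uses are the ones this article’s Hello servlet uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical model of the lookup web.xml describes:
// URL pattern -> servlet name -> servlet class name.
// Exact-match patterns only; a real container does considerably more.
final class MappingModel {
    private final Map<String, String> classByName = new LinkedHashMap<>();
    private final Map<String, String> nameByPattern = new LinkedHashMap<>();

    /** Mirrors a <servlet> element: gives a class an application-unique name. */
    void servlet(String name, String className) {
        classByName.put(name, className);
    }

    /** Mirrors a <servlet-mapping> element: ties a URL pattern to a servlet name. */
    void mapping(String name, String urlPattern) {
        nameByPattern.put(urlPattern, name);
    }

    /** Returns the class that would handle the path, or null if nothing is mapped. */
    String resolve(String path) {
        String name = nameByPattern.get(path);
        return name == null ? null : classByName.get(name);
    }

    public static void main(String[] args) {
        MappingModel model = new MappingModel();
        model.servlet("Hello", "com.tedneward.hello.HelloServlet");
        model.mapping("Hello", "/hello");
        System.out.println(model.resolve("/hello"));
    }
}
```

The point to notice is that the servlet name is the only thing connecting the two elements, which is why a typo in one of them leaves the URL unmapped.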
The second, "appengine-web.xml", is Google App Engine-specific
information, shown in Figure 4, and for now, the only critical element in the
file is the "<threadsafe>" element, which indicates whether the
application can allow multiple threads within the application; if this element
is missing, the development server won’t launch. (The "<application>" and
"<version>" elements will be more important when we push this to the
cloud, but for now we can leave them as-is.) The "<system-properties>"
are properties that will be passed into the JVM (as if they had been passed at
the command-line using "-D"), allowing for configuration capabilities, which in
this case consists of setting diagnostic log configuration.
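Since these values arrive as ordinary JVM system properties, application code can read them with "System.getProperty", exactly as it would a "-D" value. The sketch below assumes a hypothetical property named "hello.greeting" (it is not part of this article’s configuration) and falls back to a default when the property is absent.

```java
// Sketch: reading a value that <system-properties> in appengine-web.xml
// (or -D on the command line) would place into the JVM.
// "hello.greeting" is a hypothetical property name used only for illustration.
final class GreetingConfig {
    static final String PROPERTY_NAME = "hello.greeting";
    static final String DEFAULT_GREETING = "Hello, world";

    private GreetingConfig() { }

    /** Returns the configured greeting, or the default when the property is unset. */
    static String greeting() {
        return System.getProperty(PROPERTY_NAME, DEFAULT_GREETING);
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

Running it locally with "java -Dhello.greeting=Howdy GreetingConfig" exercises the same code path a configured property would.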
Both of these files must reside in the "war/WEB-INF"
directory, and if both are present, "ant runserver" at the command-line should
bring up the contents of your "index.html" file in the "war" directory.
Show me the code!
Whoof. Lots of setup, and we still haven’t gotten to the
point of code yet. Rectifying that is pretty straightforward, though: create a
simple servlet, like so:
package com.tedneward.hello;

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class HelloServlet extends HttpServlet
{
    // Handles GET requests for whatever URL pattern web.xml maps to this servlet
    public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws IOException, ServletException
    {
        // Tell the browser that the body is HTML
        response.setContentType("text/html");

        PrintWriter out = response.getWriter();
        out.println("<html>");
        out.println("<body>");
        out.println("<p>Hello, world</p>");
        out.println("</body>");
        out.println("</html>");
    }
}
Then doctor up the "web.xml" file to include a
"<servlet>" element that names it and a "<servlet-mapping>" element
to tie it to a URL pattern:
<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
  <servlet>
    <servlet-name>Hello</servlet-name>
    <servlet-class>com.tedneward.hello.HelloServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>Hello</servlet-name>
    <url-pattern>/hello</url-pattern>
  </servlet-mapping>
  <welcome-file-list>
    <welcome-file>index.html</welcome-file>
  </welcome-file-list>
</web-app>
Kick off "ant runserver" again, and assuming it doesn’t find
any errors in the code, the server should come up. Now, in the browser, plug
"http://localhost:8080/hello" into the address bar, and the
servlet-generated response should appear.
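For those who would rather check from code than from a browser, a small client does the job. This is a sketch under a few assumptions: the development server is running on its default port, the JDK is Java 9 or later (for "readAllBytes"), and "HelloSmokeTest" is simply a name picked for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Sketch: fetches a URL and returns the response body as text, so the
// servlet's output can be checked without opening a browser.
// HelloSmokeTest is a hypothetical helper, not part of the SDK.
final class HelloSmokeTest {
    private HelloSmokeTest() { }

    /** Issues a GET request and returns the body decoded as UTF-8. */
    static String fetch(String address) throws IOException {
        HttpURLConnection connection =
            (HttpURLConnection) URI.create(address).toURL().openConnection();
        connection.setRequestMethod("GET");
        connection.setConnectTimeout(5000); // fail fast if the server isn't up
        connection.setReadTimeout(5000);
        try (InputStream in = connection.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String body = fetch("http://localhost:8080/hello");
        System.out.println(body.contains("Hello, world")
            ? "OK"
            : "Unexpected response:\n" + body);
    }
}
```

If the server isn’t running, "fetch" throws an IOException rather than returning, which is usually the quickest hint that "ant runserver" never came up.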
Not an impressive demo
In and of itself, what’s been accomplished doesn’t seem like
much; in fact, it feels like we spent more time talking about how to write and
deploy a servlet than we did about Google Cloud Platform. And in truth, that’s
part of the point: because Google App Engine is essentially a servlet platform,
Java developers need little to no retraining to understand
how Google App Engine will figure into the development and deployment process,
as opposed to an on-premises Tomcat or other JavaEE servlet container.
Plus, Google App Engine has a great deal more to show us—in
addition to the libraries that Google App Engine provides for Java developers
to use, there are the data access elements, plus a pretty straightforward
(meaning, "simple") deploy-to-the-cloud step. For now, though, we have a full
up-and-running local development story, meaning all of the traditional things
that Java developers want to do, such as unit- and/or end-to-end-tests, are
still here without complicating things with a cloud deployment step.
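As one illustration of what that local story allows, the page-writing logic can be pulled out of the servlet into a plain class and tested with nothing but the JDK. This is a sketch of one possible arrangement, not something the SDK requires; "HelloPage" is a hypothetical name, and the servlet’s "doGet" would simply call "HelloPage.writeTo(response.getWriter())".

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Sketch: the HTML generation from the Hello servlet, moved into a class that
// has no servlet dependencies so it can be exercised in an ordinary unit test.
// HelloPage is a hypothetical name chosen for this example.
final class HelloPage {
    private HelloPage() { }

    /** Writes the page to any PrintWriter, such as the one a servlet response provides. */
    static void writeTo(PrintWriter out) {
        out.println("<html>");
        out.println("<body>");
        out.println("<p>Hello, world</p>");
        out.println("</body>");
        out.println("</html>");
    }

    /** Renders the page into a String, which is convenient for assertions. */
    static String render() {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        writeTo(out);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```

A test can then assert on the returned string without starting the development server at all.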
Things heat up in the next one, so stick close. In the
meantime, however, happy coding!
Figure 1: Google App Engine Ant
<project>
  <property environment="env" />
  <property name="sdk.dir" location="${env.APPENG_HOME}" />
  <import file="${sdk.dir}/config/user/ant-macros.xml" />
  <path id="project.classpath">
    <pathelement path="war/WEB-INF/classes" />
    <fileset dir="war/WEB-INF/lib">
      <include name="**/*.jar" />
    </fileset>
    <fileset dir="${sdk.dir}/lib">
      <include name="shared/**/*.jar" />
    </fileset>
  </path>
  <target name="copyjars"
      description="Copies the Google App Engine JARs to the WAR.">
    <copy
        todir="war/WEB-INF/lib"
        flatten="true">
      <fileset dir="${sdk.dir}/lib/user">
        <include name="**/*.jar" />
      </fileset>
    </copy>
  </target>
  <target name="compile" depends="copyjars"
      description="Compiles Java source and copies other source files to the WAR.">
    <mkdir dir="war/WEB-INF/classes" />
    <copy todir="war/WEB-INF/classes">
      <fileset dir="src">
        <exclude name="**/*.java" />
      </fileset>
    </copy>
    <javac
        srcdir="src"
        destdir="war/WEB-INF/classes"
        classpathref="project.classpath"
        debug="on" />
  </target>
  <target name="datanucleusenhance" depends="compile"
      description="Performs JDO enhancement on compiled data classes.">
    <enhance_war war="war" />
  </target>
  <target name="runserver" depends="datanucleusenhance"
      description="Starts the development server.">
    <dev_appserver war="war" />
  </target>
  <target name="update" depends="datanucleusenhance"
      description="Uploads the application to Google App Engine.">
    <appcfg action="update" war="war" />
  </target>
  <target name="update_indexes" depends="datanucleusenhance"
      description="Uploads just the datastore index configuration to Google App Engine.">
    <appcfg action="update_indexes" war="war" />
  </target>
  <target name="rollback" depends="datanucleusenhance"
      description="Rolls back an interrupted application update.">
    <appcfg action="rollback" war="war" />
  </target>
  <target name="request_logs"
      description="Downloads log data from Google App Engine for the application.">
    <appcfg action="request_logs" war="war">
      <options>
        <arg value="--num_days=5"/>
      </options>
      <args>
        <arg value="logs.txt"/>
      </args>
    </appcfg>
  </target>
</project>
Figure 2: WAR layout
(project-root)
  /src
    ... (Java source goes here)
  /war
    /index.html
    ... (other directly-browsable elements here)
    /WEB-INF
      web.xml
      appengine-web.xml
      ... (non-browsable elements here)
      /lib
        ... (JAR files)
      /classes
        ... (.class files)
Figure 3: web.xml
<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
  <welcome-file-list>
    <welcome-file>index.html</welcome-file>
  </welcome-file-list>
</web-app>
Figure 4: appengine-web.xml
<?xml version="1.0" encoding="utf-8"?>
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <application>Hello</application>
  <version>1</version>
  <threadsafe>true</threadsafe>
</appengine-web-app>