Introduction
This application shows the basics of building an EPUB file viewer for Android.
The features of the viewer are:
- A "list view" of the .epub files on the SD card.
- A "list view" of the book's Table of Contents (ToC), with the ability to select a ToC entry, and jump to it.
- Viewing an EPUB's contents.
- Ability to set a bookmark and return to the bookmark.
- Using Android's Text to Speech API to read the book aloud.
What this article will cover:
- A summary of the EPUB file format, and the steps involved in reading it.
- How to use the android.sax library to parse an EPUB file.
- How to use the Android WebView to display HTML, including:
  - Using Android 3.0's shouldInterceptRequest() to fetch the HTML to display.
  - WebView differences between Android 2.3 and 3.0.
  - Dealing with the WebView's caching
  - Getting the current scroll position of the HTML document
  - Restoring the scroll position, when a document is reloaded
  - Adding "Fling" gesture handling to the WebView
- Formatting URIs
- How to set up a chain of XMLFilters to process an XML document with a SAX parser.
- Converting a SAX parser's output back into XML.
- Using Parcelable to package data into an intent, for passing to an activity.
- Using Android's Text to Speech API
Caution: as this project is intended to show the basics of viewing an EPUB file, I've made a number of simplifying assumptions.
- All files are UTF-8; UTF-16 is not handled.
- Language is English (only really relevant for Text to Speech)
- The EPUB files are well formed. (No error handling in XML parsing.)
- Only supporting EPUB 2.0 (and a limited set at that, e.g. minimal SVG support, not all manifest attributes.)
Using the code
If you don't know how to set up Eclipse and the Android SDK, go here for instructions.
Download the project, unzip it, and import it into Eclipse. It requires a minimum of Android 2.3.
EPUB File Format and Parsing an EPUB
Wikipedia provides a good description of the EPUB format, including links to the official documentation. For those of you who don't want to read it, here's the 30 second executive summary. An EPUB file is a ZIP file that contains HTML files and images (often in multiple folders) and a few XML files. The HTML (actually XHTML) files and images are the content of the book. The XML files are metadata, covering things like:
- Information on the book itself, e.g. Title, Author(s), Publisher, etc.
- Details of the HTML files. e.g. Format for each file, order to read them in, which file corresponds to each chapter of the book, etc.
- The table of contents.
As the EPUB file is a zip file, the first thing we need is a function that allows us to extract the files from a zip file. We can use the java.util.zip.ZipFile class in Android to do most of the work for us.
public class Book {
private ZipFile mZip;
public Book(String fileName) {
try {
mZip = new ZipFile(fileName);
} catch (IOException e) {
Log.e(Globals.TAG, "Error opening file", e);
}
}
public InputStream fetchFromZip(String fileName) {
InputStream in = null;
ZipEntry containerEntry = mZip.getEntry(fileName);
if (containerEntry != null) {
try {
in = mZip.getInputStream(containerEntry);
} catch (IOException e) {
Log.e(Globals.TAG, "Error reading zip file " + fileName, e);
}
}
return in;
}
}
We will also need to parse the XML files. Android has a number of ways to parse an XML file; IBM's DeveloperWorks has an excellent article on the options. We're going to use a SAX parser. The basic idea of the SAX approach is that you write a ContentHandler and plug it into the SAX pipeline. There is a certain amount of boilerplate involved in setting up a SAX pipeline, but the following helper function will handle it for us.
void parseXmlResource(String fileName, ContentHandler handler) {
InputStream in = fetchFromZip(fileName);
if (in != null) {
try {
try {
SAXParserFactory parseFactory = SAXParserFactory.newInstance();
XMLReader reader = parseFactory.newSAXParser().getXMLReader();
reader.setContentHandler(handler);
InputSource source = new InputSource(in);
source.setEncoding("UTF-8");
reader.parse(source);
} finally {
in.close();
}
} catch (ParserConfigurationException e) {
Log.e(Globals.TAG, "Error setting up to parse XML file ", e);
} catch (IOException e) {
Log.e(Globals.TAG, "Error reading XML file ", e);
} catch (SAXException e) {
Log.e(Globals.TAG, "Error parsing XML file ", e);
}
}
}
The first step in parsing an EPUB file is to read the container.xml file for the location of the .opf file. From the EPUB specification, the container file is always called "container.xml" and must be in the folder "META-INF". A typical example is:
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
<rootfiles>
<rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
</rootfiles>
</container>
To get the location of the .opf file, we look through the <rootfile> elements until we find one with a media-type of application/oebps-package+xml. The full-path attribute of this element gives us the name of the .opf file, which holds most of the metadata needed to understand the contents of the EPUB file. We need to write a ContentHandler to do this, and write the .opf filename to mOpfFileName, a String variable of the Book class.
The traditional way to create a ContentHandler is to write a class that implements the ContentHandler interface (usually by extending DefaultHandler), overriding the startElement() and/or endElement() functions (and possibly others) as required by your needs. The difficulty with this pattern is that these functions are called for every element, regardless of type. Thus, the startElement() implementation needs to know how to parse each type of element it will encounter. But things are even more complicated. It also has to keep track of where it is in the XML schema, so that it knows which element it is currently working on. When working with a non-trivial schema, this frequently results in complicated code, as the logic to parse the individual elements gets mixed up with the schema tracking logic.
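To make the problem concrete, here is a minimal sketch of the traditional approach applied to the container.xml file shown above. It uses the plain org.xml.sax classes rather than android.sax. The class name TraditionalContainerHandler and the helper findOpfFileName() are my own inventions for illustration and are not part of the project. Note how the depth counter and the insideRootfiles flag (schema tracking) end up interleaved with the attribute extraction (element parsing).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical hand-written handler for META-INF/container.xml.
final class TraditionalContainerHandler extends DefaultHandler {
    private static final String OPF_MEDIA_TYPE = "application/oebps-package+xml";

    // Schema tracking state: how deep we are, and whether we are inside <rootfiles>.
    private int depth = 0;
    private boolean insideRootfiles = false;
    private String opfFileName;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        depth++;
        if (depth == 2 && "rootfiles".equals(localName)) {
            insideRootfiles = true;
        } else if (depth == 3 && insideRootfiles && "rootfile".equals(localName)) {
            // Element parsing, tangled up with the tracking logic above.
            if (OPF_MEDIA_TYPE.equals(attrs.getValue("", "media-type"))) {
                opfFileName = attrs.getValue("", "full-path");
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth == 2 && "rootfiles".equals(localName)) {
            insideRootfiles = false;
        }
        depth--;
    }

    // Parses the given container.xml text and returns the .opf path, or null if none was found.
    static String findOpfFileName(String containerXml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed so localName is populated
            XMLReader reader = factory.newSAXParser().getXMLReader();
            TraditionalContainerHandler handler = new TraditionalContainerHandler();
            reader.setContentHandler(handler);
            byte[] bytes = containerXml.getBytes(StandardCharsets.UTF_8);
            reader.parse(new InputSource(new ByteArrayInputStream(bytes)));
            return handler.opfFileName;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse container.xml", e);
        }
    }
}
```

Even for a three-element schema, two pieces of tracking state are needed. With a larger schema, the tracking code tends to grow faster than the parsing code.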
A little thought suggests a better design for building parsers would separate the two concerns. Something like this:
- Describe the relationship of the XML elements
- Start with the root element of the tree.
- List the expected child elements of the root node we have an interest in.
- For each child element, recursively list its child elements we have an interest in.
- For each type of element we've defined, specify how to parse it by providing appropriate startElement() and/or endElement() logic.
The good news is that the android.sax package allows us to construct a ContentHandler in exactly this manner.
private static final String XML_NAMESPACE_CONTAINER = "urn:oasis:names:tc:opendocument:xmlns:container";
private String mOpfFileName;
private ContentHandler constructContainerFileParser() {
RootElement root = new RootElement(XML_NAMESPACE_CONTAINER,"container");
Element rootfilesElement = root.getChild(XML_NAMESPACE_CONTAINER,"rootfiles");
Element rootfileElement = rootfilesElement.getChild(XML_NAMESPACE_CONTAINER, "rootfile");
rootfileElement.setStartElementListener(new StartElementListener(){
public void start(Attributes attributes) {
String mediaType = attributes.getValue("media-type");
if ((mediaType != null) && mediaType.equals("application/oebps-package+xml")) {
mOpfFileName = attributes.getValue("full-path");
}
}
});
return root.getContentHandler();
}
As you can see from the above code, to use android.sax, you:
- Start by creating a RootElement.
- Add child Elements to match the XML elements you're interested in. Note, you can leave out any elements you're not interested in.
- For each element, use setStartElementListener, setEndTextElementListener, and/or setEndElementListener to add the logic to process/extract the wanted portions of the Element.
- Finally, call getContentHandler() to package it all up into a ContentHandler that can be passed to an XMLReader.
Putting all the pieces together, we can obtain the name of the .opf file with the following code:
parseXmlResource("META-INF/container.xml", constructContainerFileParser());
The .opf file contains a manifest, a spine, and some other things we don't care about, namely metadata and guide. A .opf file looks something like this:
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="calibre_id">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<!-- assorted values here, which we don't care about, so I've deleted them -->
</metadata>
<manifest>
<item id="id1" href="title_page.html" media-type="application/xhtml+xml"/>
<item href="chapter1.html" id="id2.1" media-type="application/xhtml+xml"/>
<item href="chapter2.html" id="id2.2" media-type="application/xhtml+xml"/>
<item id="id3.1" href="stylesheet1.css" media-type="text/css"/>
<item id="id3.2" href="stylesheet2.css" media-type="text/css"/>
<item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
<item href="content/resources/_cover_.jpg" id="id4.1" media-type="image/jpeg"/>
</manifest>
<spine toc="ncx">
<itemref idref="id1"/>
<itemref idref="id2.1"/>
<itemref idref="id2.2"/>
</spine>
</package>
The manifest element holds a list of all the files contained in the zip file.
A manifest item's href attribute is the name (including path) of the file in the zip. Note that the path may be relative to the .opf file's location. The media-type attribute is the MIME type of the file. The id attribute links to the spine's idref attribute.
The spine provides two things: the reading order for the files in the manifest and the table of contents file. The order to read the files is trivial. The items in the spine are in the correct order and it's just a case of matching the idref attributes in the spine to the id attributes in the manifest. Obtaining the table of contents is almost as easy. The spine's toc attribute matches the id attribute of the manifest entry for the table of contents file. Given this, the content handler to parse an .opf file is:
private static final String XML_NAMESPACE_PACKAGE = "http://www.idpf.org/2007/opf";
private HashMap<String, String> mManifestIndex;
private HashMap<String, String> mManifestMediaTypes;
private ArrayList<String> mSpine;
private String mTocName;
private ContentHandler constructOpfFileParser() {
RootElement root = new RootElement(XML_NAMESPACE_PACKAGE, "package");
Element manifest = root.getChild(XML_NAMESPACE_PACKAGE, "manifest");
Element manifestItem = manifest.getChild(XML_NAMESPACE_PACKAGE, "item");
Element spine = root.getChild(XML_NAMESPACE_PACKAGE, "spine");
Element itemref = spine.getChild(XML_NAMESPACE_PACKAGE, "itemref");
manifestItem.setStartElementListener(new StartElementListener(){
public void start(Attributes attributes) {
String href = attributes.getValue("href");
href = FilenameUtils.concat(FilenameUtils.getPath(mOpfFileName), href);
mManifestIndex.put(attributes.getValue("id"), href);
mManifestMediaTypes.put(href, attributes.getValue("media-type"));
}
});
spine.setStartElementListener(new StartElementListener(){
public void start(Attributes attributes) {
String toc = attributes.getValue("toc");
// mManifestIndex maps a manifest id straight to its href, so no further lookup is needed.
mTocName = mManifestIndex.get(toc);
}
});
itemref.setStartElementListener(new StartElementListener(){
public void start(Attributes attributes) {
mSpine.add(attributes.getValue("idref"));
}
});
return root.getContentHandler();
}
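Once the .opf file has been parsed, turning the spine into an ordered list of files is just a lookup per idref. Here is a small sketch of that step in plain Java. ReadingOrderSketch and readingOrder() are hypothetical names, and the two parameters stand in for the mManifestIndex and mSpine members above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: resolves spine idrefs to file names, in reading order.
final class ReadingOrderSketch {
    static List<String> readingOrder(Map<String, String> manifestIndex, List<String> spine) {
        List<String> files = new ArrayList<>();
        for (String idref : spine) {
            String href = manifestIndex.get(idref);
            if (href != null) {
                files.add(href); // idrefs with no matching manifest item are skipped
            }
        }
        return files;
    }
}
```

With the example .opf above, the spine idrefs id1, id2.1 and id2.2 would resolve to title_page.html, chapter1.html and chapter2.html, in that order.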
The table of contents file (also known as a .ncx file) contains a hierarchical table of contents, along with assorted metadata. It is used to provide the user of an e-book a table of contents where the user can select an item in the table, and then have the book jump to that position in the book. An example of a .ncx file is as follows:
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="en">
<head>
<meta name="dtb:uid" content="ae60509a-b048-5f93-abd0-5333f347e4c1"/>
<meta name="dtb:depth" content="3"/>
<meta name="dtb:totalPageCount" content="0"/>
<meta name="dtb:maxPageNumber" content="0"/>
</head>
<docTitle><text>Tax Guide</text></docTitle>
<docAuthor><text>IRS</text></docAuthor>
<navMap>
<navPoint id="116f4d31-2b73-4fbd-85c4-8d437f6fccc1" playOrder="1">
<navLabel><text>Volume 1</text></navLabel>
<content src="volume1.html"/>
<navPoint id="1563d3d9-33c5-472e-bcf4-587923f3137b" playOrder="2">
<navLabel><text>Chapter 1</text></navLabel>
<content src="volume1/chapter001.html"/>
</navPoint>
<navPoint id="1563d3d9-33c5-472e-bcf4-587923f3137d" playOrder="3">
<navLabel><text>Chapter 2</text></navLabel>
<content src="volume1/chapter002.html"/>
<navPoint id="1563d3d9-33c5-472e-bcf4-587923f3137c" playOrder="4">
<navLabel><text>Section 1</text></navLabel>
<content src="volume1/chapter002.html#Section_1"/>
</navPoint>
</navPoint>
</navPoint>
<navPoint id="1563d3d9-33c5-472e-bcf4-587923f3137a" playOrder="5">
<navLabel><text>Volume 2</text></navLabel>
<content src="volume2.html"/>
</navPoint>
</navMap>
</ncx>
The major points to note are:
- The table of contents information is provided by the <navPoint> elements.
- Each <navPoint> represents a Table of Contents item to show the user.
- The <navPoint> elements are in order.
- <navPoint> elements can be (but usually are not) nested.
- The name of the item to show to the user is the <navLabel> element.
- The src attribute of the <content> element is where the content is. If the attribute contains a hash '#' character, then the part to the left of the hash is the name of the file in the zip holding the content. The part to the right of the hash is the position in the file where the content begins.
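The last point is easy to show in code. This is a minimal sketch of splitting a src value at the hash. The names NavTargetSketch, fileOf() and fragmentOf() are hypothetical and do not appear in the project.

```java
// Hypothetical helper: splits an NCX src attribute into file name and position.
final class NavTargetSketch {
    // Returns the part to the left of '#', or the whole value if there is no '#'.
    static String fileOf(String src) {
        int hash = src.indexOf('#');
        return hash < 0 ? src : src.substring(0, hash);
    }

    // Returns the part to the right of '#', or null if there is no '#'.
    static String fragmentOf(String src) {
        int hash = src.indexOf('#');
        return hash < 0 ? null : src.substring(hash + 1);
    }
}
```

For the "Section 1" entry in the example, fileOf() gives volume1/chapter002.html and fragmentOf() gives Section_1.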
A navPoint can be represented by a Plain Old Java Object (POJO):
public class NavPoint {
private String mNavLabel;
private String mContent;
public String getNavLabel() { return mNavLabel; }
public String getContent() { return mContent; }
public void setNavLabel(String navLabel) { mNavLabel = navLabel; }
public void setContent(String content) { mContent = content; }
}
Using this class, we can parse the Table of Contents.
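// Namespace taken from the xmlns attribute of the <ncx> element shown above.
private static final String XML_NAMESPACE_TABLE_OF_CONTENTS = "http://www.daisy.org/z3986/2005/ncx/";
private ArrayList<NavPoint> mNavPoints;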
private int mCurrentDepth = 0;
private int mSupportedDepth = 1;
public NavPoint getLatestPoint() {
return mNavPoints.get(mNavPoints.size() - 1);
}
private ContentHandler constructTocFileParser() {
RootElement root = new RootElement(XML_NAMESPACE_TABLE_OF_CONTENTS, "ncx");
Element navMap = root.getChild(XML_NAMESPACE_TABLE_OF_CONTENTS, "navMap");
Element navPoint = navMap.getChild(XML_NAMESPACE_TABLE_OF_CONTENTS, "navPoint");
AddNavPointToParser(navPoint);
return root.getContentHandler();
}
private void AddNavPointToParser(final Element navPoint) {
Element navLabel = navPoint.getChild(XML_NAMESPACE_TABLE_OF_CONTENTS, "navLabel");
Element text = navLabel.getChild(XML_NAMESPACE_TABLE_OF_CONTENTS, "text");
Element content = navPoint.getChild(XML_NAMESPACE_TABLE_OF_CONTENTS, "content");
navPoint.setStartElementListener(new StartElementListener(){
public void start(Attributes attributes) {
mNavPoints.add(new NavPoint());
if (mSupportedDepth == ++mCurrentDepth) {
Element child = navPoint.getChild(XML_NAMESPACE_TABLE_OF_CONTENTS,"navPoint");
AddNavPointToParser(child);
++mSupportedDepth;
}
}
});
text.setEndTextElementListener(new EndTextElementListener(){
public void end(String body) {
getLatestPoint().setNavLabel(body);
}
});
content.setStartElementListener(new StartElementListener(){
public void start(Attributes attributes) {
getLatestPoint().setContent(attributes.getValue("src"));
}
});
navPoint.setEndElementListener(new EndElementListener(){
public void end() {
--mCurrentDepth;
}
});
}
You may notice that to cope with the arbitrary nesting of navPoints, the StartElementListener tracks the nesting level and adds additional levels (as required) to the ContentHandler while the XML is being parsed.
Providing a Table of Contents
The simplest way to provide a Table of Contents is to use a ListActivity, passing it mNavPoints. The list view shows the NavLabel as the text for each item. When an item is selected, the corresponding content is returned. Passing mNavPoints to the ListActivity is an interesting problem. Probably the easiest way is to make the NavPoint class parcelable, in which case we can add mNavPoints to the intent that launches the ListActivity with a single line of code.
showTocIntent.putExtra("CHAPTERS_EXTRA", mNavPoints);
Likewise, extracting mNavPoints from the bundle passed to ListActivity.onCreate() is also a single line of code.
mNavPoints = getIntent().getParcelableArrayListExtra("CHAPTERS_EXTRA");
The steps to make NavPoint parcelable are quite simple, and are given in Google's Android documentation. They are:
- Have the class implement the android.os.Parcelable interface.
- Add the boilerplate code provided in the Android instructions, replacing MyParcelable in the boilerplate with your class.
- Implement writeToParcel() and the private constructor as appropriate for your class.
The rest of the steps involved in creating a ListActivity to display a table of contents from an array of NavPoints are trivial. You can look at this project's file ListChaptersActivity.java. For a more detailed explanation, I recommend this article.
Viewing the Content (on Android 3.0 and above)
The obvious way to view the HTML files in the zip would be to use the android.webkit.WebView class, as its whole purpose is to display HTML files. And the obvious way to use it is to extract the HTML files from the EPUB, and send them to the WebView using loadDataWithBaseURL(String baseUrl, String data, String mimeType, String encoding, String historyUrl). Unfortunately, this doesn't work because the HTML files have links to other documents. For example:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Cover</title>
<link href="resources/stylesheet00.css" type="text/css" charset="UTF-8" rel="stylesheet"/>
</head>
<body>
<img src="resources/cover.jpg" alt="cover" style="height: 100%"/>
</body>
</html>
As you can see, this HTML file has both a stylesheet and an image. So, when WebView tries to show this document, it will try to obtain the stylesheet and JPEG. As these are both buried inside the EPUB's zip file, the WebView is unable to obtain them and it will fail to show the document. If we're running on Android 3.0 (i.e. Honeycomb) or above, we can solve this problem by intercepting the calls the WebView makes to obtain the linked resources. This is done by setting the WebView's WebViewClient to a WebViewClient instance that overrides shouldInterceptRequest() to retrieve the desired file. E.g.:
public class EpubWebView extends WebView {
public EpubWebView(Context context, AttributeSet attrs) {
super(context, attrs);
getSettings().setCacheMode(WebSettings.LOAD_NO_CACHE);
setWebViewClient(new WebViewClient() {
@Override
public WebResourceResponse shouldInterceptRequest(WebView view, String url) {
return onRequest(url);
}
});
}
}
However, there is still a small problem. If you look at onRequest() you will note that the requested file is given by a URL. You may also note that loadDataWithBaseURL() takes a URL parameter. Thus, we need a way to convert the file names in the zip file into URLs, and back. We also need to cope with the fact that the zip file may contain folders, and the links in the HTML files may be relative. E.g.:
Assume we wish to show the HTML file "content/title.xhtml". This file references the image "content/resource/cover.jpg". Then the <img> element in the HTML file will look something like '<img src="resource/cover.jpg"/>'.
Converting the file names into file URIs solves this problem. If "file:///content/title.xhtml" is passed in as the URL, then the WebView will ask for the JPEG with a URL of "file:///content/resource/cover.jpg". Converting file names into URIs is not just a case of tacking "file:///" onto the file name, as many of the characters that can appear in a file name need to be escaped. Fortunately, Android provides APIs to do most of the work.
private static String url2ResourceName(Uri url) {
String resourceName = url.getPath();
if (resourceName.charAt(0) == '/') {
resourceName = resourceName.substring(1);
}
return resourceName;
}
public static Uri resourceName2Url(String resourceName) {
return new Uri.Builder().scheme("file")
.authority("")
.appendEncodedPath(Uri.encode(resourceName, "/"))
.build();
}
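If you want to experiment with the round trip off-device, the same idea can be sketched with java.net.URI instead of android.net.Uri. This is not the project's code. ZipUriSketch and its three methods are hypothetical names, and java.net.URI's escaping rules are not guaranteed to match Uri.encode() character for character.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical plain-Java counterpart of resourceName2Url() / url2ResourceName().
final class ZipUriSketch {
    // "content/my file.xhtml" -> "file:///content/my%20file.xhtml"
    static String toUrl(String resourceName) {
        try {
            // The multi-argument URI constructor escapes illegal characters in the path.
            String rawPath = new URI(null, null, "/" + resourceName, null).getRawPath();
            return "file://" + rawPath;
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Bad resource name: " + resourceName, e);
        }
    }

    // "file:///content/my%20file.xhtml" -> "content/my file.xhtml"
    static String toResourceName(String url) {
        return stripLeadingSlash(URI.create(url).getPath()); // getPath() decodes escapes
    }

    // Resolves a relative link found in a page against that page's URL,
    // returning the zip entry name the link refers to.
    static String resolveLink(String pageUrl, String href) {
        return stripLeadingSlash(URI.create(pageUrl).resolve(href).getPath());
    }

    private static String stripLeadingSlash(String path) {
        return path.startsWith("/") ? path.substring(1) : path;
    }
}
```

The resolveLink() method mirrors what the WebView does for us: given the page URL and the relative src from the <img> example, it arrives at content/resource/cover.jpg.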
Given all the above, implementing onRequest(), in our Book class, is trivial.
public WebResourceResponse onRequest(String url) {
String resourceName = url2ResourceName(Uri.parse(url));
return new WebResourceResponse(
// WebResourceResponse takes (mimeType, encoding, data), in that order.
mManifestMediaTypes.get(resourceName),
"UTF-8",
fetchFromZip(resourceName)
);
}
At this point, calling loadDataWithBaseURL() is no longer required. Instead, call loadUrl(String url). However, before calling loadUrl(), clearCache(false) must be called. This is because WebView caches resource requests, and when the WebView thinks a resource is cached, it will not call shouldInterceptRequest(). This can be a problem, because many EPUB books use the same name for their stylesheet (stylesheet.css) and cover page image (cover.jpg). Thus, if you view one EPUB book, and then view another, the WebView will use the cached resource from the previous book. It can be quite disorienting for the user to open an EPUB and get the title page from the previous book. The only workaround I've found to the problem is to explicitly clear the cache before calling loadUrl(). Setting the WebView's cache mode to LOAD_NO_CACHE does not work.
Viewing the Content on Android 2.3 (and below)
As you've probably guessed by now, Android 2.3 does not have shouldInterceptRequest(). The workaround is to rewrite the XHTML to put any resources inline and then pass the XHTML to the WebView by calling loadDataWithBaseURL(). To use our example XHTML from earlier:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Cover</title>
<link href="resources/stylesheet00.css" type="text/css" charset="UTF-8" rel="stylesheet"/>
</head>
<body>
<img src="resources/cover.jpg" alt="cover" style="height: 100%"/>
</body>
</html>
This has two sorts of links: a stylesheet and an image. Conceptually, a stylesheet link is easy to remove. The steps are:
- Find the stylesheet link(s). In this case, there's just one,
<link href="resources/stylesheet00.css" type="text/css" charset="UTF-8" rel="stylesheet"/>
- Find the referred stylesheet file ("resources/stylesheet00.css") in the zip.
- Fetch the contents of the stylesheet. e.g.
.bold {font-weight: bold}
- Replace the link in the XHTML with a stylesheet element holding the stylesheet's contents. e.g.
<style>.bold {font-weight: bold}</style>
Removing the link in an image element is a similar, but slightly more difficult, process because we can't just inject the JPEG's raw bytes into the XHTML. Instead, we need to pack the JPEG into a DataURI and put that into the XHTML, which I will get to in a moment.
The final kind of link I've seen in an EPUB is an SVG image. These look something like:
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1"
width="100%" height="100%" viewBox="0 0 300 400" preserveAspectRatio="xMidYMid meet">
<image width="600" height="800" xlink:href="Cover.jpg"/>
</svg>
This is a problem because the Android 2.3 WebView does not have support for SVG elements. The solution I use is to convert these into <img> elements.
DataURI
Wikipedia has an excellent article on DataURIs. In short, the concept is to replace a link to a file with the base64 encoded contents of the file, e.g. replace the image tag from the example above:
<img src="resources/cover.jpg" alt="cover" style="height: 100%"/>
with something like:
<img src="data:image/jpeg;base64,iQU228cmaui98as..." alt="cover" style="height: 100%"/>
The "iQU228cmaui98as..." that I've shown here is just the start of a base64 sequence that usually runs to tens of thousands of characters. You will note that the DataURI includes the MIME type of the file. Obtaining this does not present a problem, as the manifest includes this information, which we stored away in the mManifestMediaTypes member. So, we can retrieve a file as a DataURI with the following function:
// Not static: it uses the instance members mManifestMediaTypes and fetchFromZip().
public String fetchDataUri(String fileName) throws IOException {
StringBuilder sb = new StringBuilder("data:");
sb.append(mManifestMediaTypes.get(fileName));
sb.append(";base64,");
int buflen = 4096;
byte[] buffer = new byte[buflen];
int offset = 0;
int len = 0;
InputStream in = fetchFromZip(fileName);
while (len != -1) {
len = in.read(buffer, offset, buffer.length - offset);
if (len != -1) {
int total = offset + len;
offset = total % 3;
int bytesToProcess = total - offset;
if (0 < bytesToProcess) {
sb.append(Base64.encodeToString(buffer, 0, bytesToProcess, Base64.NO_WRAP));
}
System.arraycopy(buffer, bytesToProcess, buffer, 0, offset);
} else if (0 < offset) {
sb.append(Base64.encodeToString(buffer, 0, offset, Base64.NO_WRAP));
}
}
return sb.toString();
}
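The reason the function above only ever encodes a multiple of three bytes at a time is that base64 turns each group of three bytes into four characters. If a chunk that is not a multiple of three is encoded mid-stream, padding appears in the middle of the output and the result is corrupt. Here is a sketch of the same carry-over technique using java.util.Base64 (rather than android.util.Base64) so it can be tried off-device. DataUriSketch, toDataUri() and the bufferSize parameter are hypothetical names for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical plain-Java version of the chunked DataURI encoder.
final class DataUriSketch {
    static String toDataUri(String mimeType, InputStream in, int bufferSize) {
        if (bufferSize < 3) {
            throw new IllegalArgumentException("bufferSize must be at least 3");
        }
        StringBuilder sb = new StringBuilder("data:").append(mimeType).append(";base64,");
        Base64.Encoder encoder = Base64.getEncoder(); // no line wrapping
        byte[] buffer = new byte[bufferSize];
        int carried = 0; // bytes left over from the previous read (0, 1 or 2)
        try {
            int read;
            while ((read = in.read(buffer, carried, buffer.length - carried)) != -1) {
                int total = carried + read;
                carried = total % 3;
                int whole = total - carried; // largest multiple of 3 available
                if (whole > 0) {
                    sb.append(encoder.encodeToString(Arrays.copyOfRange(buffer, 0, whole)));
                }
                // Move the leftover bytes to the front for the next read.
                System.arraycopy(buffer, whole, buffer, 0, carried);
            }
            if (carried > 0) {
                // Only the final chunk may need padding.
                sb.append(encoder.encodeToString(Arrays.copyOfRange(buffer, 0, carried)));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

Whatever buffer size is used, the output should be identical to encoding the whole file in one call, which makes the function easy to check.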
Implementing an XML filter pipeline with XMLFilterImpl
So we're now at the stage where we want to take the XHTML file, run a series of conversions (e.g. turn any links into embedded resources) and feed the resulting XHTML into the WebView. How can we do this?
Well, so far we've been doing our processing using a ContentHandler hooked into an XMLReader. But for this XHTML conversion, writing a single ContentHandler to do the job would require a pretty complex ContentHandler. A better solution is to use the XMLFilterImpl class. This class allows us to build an XML processing "pipeline" of simpler processing stages, where the output of each stage is the input of the next. I.e. instead of this:
file -> XMLReader -> ContentHandler -> file
We have this:
file -> XMLReader -> InlineStyleSheetFilter -> RemoveSvgFilter -> InlineImageFilter -> XMLWriter -> file
Using the XMLFilterImpl is pretty simple. It implements ContentHandler, so the basic steps are:
- Build a ContentHandler much as you normally would, except derive from XMLFilterImpl.
- In each ContentHandler function you override, call XMLFilterImpl's version of the function with the changed parameters.
For example, here's what a filter to convert <img> elements into DataURIs looks like.
public class InlineImageElementFilter extends XMLFilterImpl {
@Override
public void startElement(String namespaceURI, String localName,
String qualifiedName, Attributes attrs) throws SAXException {
if (localName.equals("img")) {
attrs = XmlUtil.replaceSrcAttributeValueWithDataUri(attrs);
}
super.startElement(namespaceURI, localName, qualifiedName, attrs);
}
}
Hooking the filters together in a pipeline is also easy, although it's somewhat non-intuitive.
- Start with the XMLReader.
- Add filters by calling setParent() for the new filter, passing it the existing filter that will precede it in the pipeline.
- Once the pipeline has been created, call parse() on the last filter added to the pipe.
E.g. to implement the pipeline:
XMLReader -> InlineStyleSheetFilter -> RemoveSvgFilter -> InlineImageFilter -> XMLWriter
The code would look like:
XMLReader reader = makeReader();
XMLFilterImpl stylesheetFilter = new InlineStyleSheetFilter(uri, getBook());
XMLFilterImpl svgFilter = new RemoveSvgFilter();
XMLFilterImpl imageFilter = new InlineImageFilter(uri, getBook());
XMLFilterImpl xmlWriter = new XmlWriter(uri, getBook());
stylesheetFilter.setParent(reader);
svgFilter.setParent(stylesheetFilter);
imageFilter.setParent(svgFilter);
xmlWriter.setParent(imageFilter);
xmlWriter.parse(source);
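XMLFilterImpl is part of the standard org.xml.sax.helpers package, so the wiring can be tried outside Android. The following is a small self-contained sketch of a two-stage pipeline. The filters here are made up for illustration: PrefixImgSrcFilter rewrites the src attribute of <img> elements with a hypothetical prefix, and RecordingFilter stands in for the final stage by recording what it receives. Neither is one of the project's filters.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

final class FilterPipelineSketch {
    // Stage 1 (hypothetical): prefixes the src attribute of every <img>.
    static final class PrefixImgSrcFilter extends XMLFilterImpl {
        private final String prefix;
        PrefixImgSrcFilter(String prefix) { this.prefix = prefix; }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attrs) throws SAXException {
            if ("img".equals(localName)) {
                AttributesImpl copy = new AttributesImpl(attrs); // attrs itself is read-only
                int i = copy.getIndex("", "src");
                if (i >= 0) {
                    copy.setValue(i, prefix + copy.getValue(i));
                }
                attrs = copy;
            }
            super.startElement(uri, localName, qName, attrs); // pass on to the next stage
        }
    }

    // Stage 2 (hypothetical): records the src values it is handed.
    static final class RecordingFilter extends XMLFilterImpl {
        final List<String> seen = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attrs) throws SAXException {
            if ("img".equals(localName)) {
                seen.add(attrs.getValue("", "src"));
            }
            super.startElement(uri, localName, qName, attrs);
        }
    }

    // Builds XMLReader -> PrefixImgSrcFilter -> RecordingFilter and runs it.
    static List<String> run(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            PrefixImgSrcFilter first = new PrefixImgSrcFilter("inline:");
            first.setParent(reader);
            RecordingFilter last = new RecordingFilter();
            last.setParent(first);
            last.parse(new InputSource(new StringReader(xml))); // parse() on the LAST stage
            return last.seen;
        } catch (Exception e) {
            throw new IllegalStateException("Pipeline failed", e);
        }
    }
}
```

The non-intuitive part is visible in run(): parse() is called on the last filter, which asks its parent to parse, and so on back to the XMLReader. The events then flow forward through the stages.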
Producing XML output from an Android SAX parser
As some of you may be aware, ContentHandlers do not produce XML output. Which raises the question, how is it done? Well, Android provides us with the org.xmlpull.v1.XmlSerializer interface for producing XML. Unfortunately, it does not extend ContentHandler, so it can't just be plugged into a SAX pipeline.
However, its member functions are very similar to those of the ContentHandler; in fact, they're almost a one-to-one match. Thus, I introduce to you the XmlSerializerToXmlFilterAdapter, which takes an XmlSerializer and wraps it in an XMLFilterImpl so the XmlSerializer can be plugged into a pipeline. With the error handling stripped out, it looks like this:
public class XmlSerializerToXmlFilterAdapter extends XMLFilterImpl {
XmlSerializer mSerializer;
public XmlSerializerToXmlFilterAdapter(XmlSerializer serializer) {
mSerializer = serializer;
}
@Override
public void startElement(String namespaceURI, String localName,
String qualifiedName, Attributes attrs) throws SAXException {
super.startElement(namespaceURI, localName, qualifiedName, attrs);
mSerializer.startTag(namespaceURI, localName);
for(int i = 0; i < attrs.getLength(); ++i) {
mSerializer.attribute(attrs.getURI(i),
attrs.getLocalName(i), attrs.getValue(i));
}
}
@Override
public void endElement(String namespaceURI, String localName,
String qualifiedName) throws SAXException {
super.endElement(namespaceURI, localName, qualifiedName);
mSerializer.endTag(namespaceURI, localName);
}
}
Supporting both 2.3 and 3.0 in one app
From the previous sections, it should be reasonably obvious that there are really only two differences between an Android 2.3 and 3.0 EPUB viewer.
- In 3.0, the WebViewClient we create implements a shouldInterceptRequest() handler, which is not available in 2.3.
- In 3.0, we can load content into the WebView by calling loadUrl() with a URI for the file to load. In 2.3, we have to fetch the XML and pre-process it before passing the actual XML to the WebView via loadDataWithBaseURL().
To produce a single app that works on both 2.3 and 3.0, the following steps are required.
- In the AndroidManifest.xml, we set android:minSdkVersion="9" and android:targetSdkVersion="16" (API level 9 is Android 2.3 and API level 16 is Android 4.1; Android 3.0 itself is API level 11).
- We take our WebView derived class "EpubWebView" and make two virtual functions, createWebClient() and loadUri().
- We derive two classes from EpubWebView, one for Android 2.3, the other for Android 3.0. In each class we implement the virtual functions, using logic that is appropriate for the intended Android version.
- We decorate the Android 3.0 class with "@TargetApi(Build.VERSION_CODES.HONEYCOMB)", so we don't get lint warnings that we're using functions that are not available in 2.3.
- In MainActivity's onCreate() we check the version of Android, create an appropriate EpubWebView and set it to the activity's view. E.g.:
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
EpubWebView epubWebView = null;
if (android.os.Build.VERSION.SDK_INT <= android.os.Build.VERSION_CODES.GINGERBREAD_MR1) {
epubWebView = new EpubWebView23(this);
} else {
epubWebView = new EpubWebView30(this);
}
setContentView(epubWebView);
}
Obviously, we could have solved the problem by just using the Android 2.3 code, but that would have cost us the benefits of using the 3.0 features when they were available (such as SVG support).
We could also have solved the problem by having a single EpubWebView class, and inside the createWebClient() and loadUri() functions check the Android version and execute the appropriate logic. This is not a good idea, as you wind up with OS dependent code (and tests for the OS version) being spread throughout the code. E.g. consider what EpubWebView would look like if there were, say, 10 differences between 2.3 and 3.0 we needed to allow for.
Implementing a bookmark
A bookmark requires three things to be recorded.
- The EPUB file being viewed.
- The selected HTML file of the EPUB.
- What part of the HTML is being viewed. This is needed because the HTML files can be many pages long, often an entire chapter.
Unfortunately, in Android, there's no official API to obtain the text being shown on the screen. But we can do something close. WebView's getContentHeight() function returns an integer indicating the viewing length of the HTML, and getScrollY() returns an integer indicating how the text currently at the top of the screen corresponds to the length of the HTML. Dividing getScrollY() by getContentHeight() gives the ratio of how far through the current HTML the user is. We use a ratio because it allows us to handle font size changes and changing the screen orientation between landscape and portrait.
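The arithmetic is simple enough to show on its own. This sketch assumes the scroll position and content height are expressed in the same units. BookmarkMathSketch, toRatio() and toScrollY() are hypothetical names, not functions from the project.

```java
// Hypothetical helper showing the bookmark ratio arithmetic.
final class BookmarkMathSketch {
    // Saving: how far through the document the top of the screen is (0.0 to 1.0).
    static float toRatio(int scrollY, int contentHeight) {
        if (contentHeight <= 0) {
            return 0f; // layout not calculated yet, treat as top of document
        }
        return (float) scrollY / contentHeight;
    }

    // Restoring: the scroll position for the same ratio in a re-laid-out document.
    static int toScrollY(float ratio, int contentHeight) {
        return Math.round(ratio * contentHeight);
    }
}
```

For example, a bookmark taken at scroll position 500 in a document of height 2000 is stored as 0.25. If the user then rotates the device and the document height becomes 3000, the restored scroll position is 750.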
To restore a bookmark, we scroll the WebView to the desired position by calling scrollTo() once the HTML is loaded. According to the Android docs, if you add an onPageFinished() handler to the WebViewClient, the handler will be called when an HTML file finishes loading. So the obvious solution is to have this handler call scrollTo(). Unfortunately, this doesn't work, because the handler is called before the viewing length is calculated, so the scrollTo() call does nothing. We must wait until the viewing length has been calculated. We can do this by using a PictureListener. This is a little complicated because the PictureListener gets called a lot and is very resource intensive, which is probably why it has been deprecated. The solution is to set the PictureListener only when we need it, i.e. when onPageFinished() is called, and tear it down when we're done.
Thus, the steps for restoring a bookmark are:
- Set a flag to indicate we're restoring a bookmark. (Because onPageFinished() is called every time an HTML file is loaded.)
- Call loadUrl().
- When the page is initially loaded, onPageFinished() is called by the OS. In onPageFinished(), if the bookmark flag is set, register a PictureListener.
- When the WebView finishes working out the page layout, the PictureListener is called, which unregisters itself, gets the content height and calls scrollTo() to set the correct position.
The resulting code for handling the PictureListener is simple; the only point of interest is the use of @SuppressWarnings to stop the compiler warnings. Note that setting an onPageFinished() handler is very similar to setting a shouldInterceptRequest() handler, which has already been shown.
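// True while a bookmark restore is pending.
private boolean mRestoringBookmark;
// Bookmark position, as the ratio getScrollY() / getContentHeight().
private float mScrollY;

@SuppressWarnings("deprecation")
protected void onPageFinished() {
    if (mRestoringBookmark) {
        // Layout isn't finished yet, so wait for the next picture.
        setPictureListener(mPictureListener);
        mRestoringBookmark = false;
    }
}

@SuppressWarnings("deprecation")
private PictureListener mPictureListener = new PictureListener() {
    @Override
    public void onNewPicture(WebView view, Picture picture) {
        // One shot: unregister, then scroll to the bookmarked position.
        setPictureListener(null);
        scrollTo(0, (int)(getContentHeight() * mScrollY));
    }
};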
Implementing horizontal swipe to move between HTML files
This is almost identical to the method used in my Comic Book Viewer article. To detect flings, we override the WebView's onTouchEvent(), pass the MotionEvents on to a GestureDetector, and override GestureDetector.SimpleOnGestureListener.onFling(), which is called when a fling occurs. The only part that's different is that the WebView has its own gesture processing logic. To make sure that still works, in onTouchEvent() we must pass any MotionEvents that our GestureDetector doesn't use on to the WebView. E.g.:
@Override
public boolean onTouchEvent(MotionEvent event) {
    // Give our detector first refusal, then fall back to the WebView's own handling.
    return mGestureDetector.onTouchEvent(event) ||
        super.onTouchEvent(event);
}
The fling logic is trivial. As we have the HTML files in viewing order (from the spine), we just locate the current HTML file in the spine, find the next (or previous) HTML file, and call WebView's loadUrl() with the HTML's URI.
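The spine lookup might be sketched as below. This is an illustration only: SpineNavigatorSketch and neighbour() are hypothetical names, and the spine is assumed to be a simple list of URI strings in reading order.

```java
import java.util.List;

/** Sketch only: finds the file before or after the current one in the spine. */
final class SpineNavigatorSketch {
    /**
     * @param spine     HTML file URIs in reading order
     * @param current   URI of the file being viewed
     * @param direction +1 for the next file, -1 for the previous one
     * @return the neighbouring URI, or null if there is none
     */
    static String neighbour(List<String> spine, String current, int direction) {
        int index = spine.indexOf(current);
        if (index < 0) {
            return null; // Current file is not in the spine.
        }
        int target = index + direction;
        if (target < 0 || target >= spine.size()) {
            return null; // Already at the first or last file.
        }
        return spine.get(target);
    }
}
```

In onFling(), a non-null result would be handed to loadUrl(). A null result means the fling is ignored.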
Using Android Text To Speech (TTS) API
The basic steps for doing text to speech in Android are:
- Check if TTS is present. This requires launching an intent and checking the return in onActivityResult(). Except, on Android 4.1 and above, this doesn't work. But the 4.1 specs require TTS to be present, so skip the test.
- Create an instance of android.speech.tts.TextToSpeech, passing it an OnInitListener handler to call when the TextToSpeech is available.
- When the handler is called, configure the TTS, e.g. set the language, reading speed and an OnUtteranceCompletedListener.
- You can now call speak(), passing in the text to speak.
- Your OnUtteranceCompletedListener will be called when TTS finishes reading the text. At this point, call speak() with the next text to speak.
I was planning to write more about TTS. But, this article covers the details very well. And I can't think of anything more I can usefully add.
Extracting Text from XHTML
In order to do text to speech, we need the text. Thus, we need to extract the text from the XHTML.
To do this, we use, you guessed it, a ContentHandler. There's not really much to it. The text that is shown to a user is the inner text of the elements, so extracting the text is just collecting the results of the ContentHandler's characters() function. Beyond that, the only issues are:
- None of the text in the <head> element is visible to the user, so ignore any text found there.
- Add white space after the text in some elements, e.g. <h1>, to prevent the text from adjacent elements running together.
For the fine details, look at the file XhtmlToText.java in the attached source.
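As a rough illustration of the idea (not the contents of XhtmlToText.java), the sketch below uses the JDK's SAX parser with a DefaultHandler, which implements ContentHandler. The class name XhtmlTextSketch and the list of elements that get trailing white space are assumptions. Like the rest of the project, it assumes well-formed input, and it has only been tried on input without a DOCTYPE.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch only: collects the user-visible text of an XHTML document. */
final class XhtmlTextSketch extends DefaultHandler {
    // Assumed list of elements whose text should be followed by a space.
    private static final Set<String> BLOCK_ELEMENTS = new HashSet<>(
            Arrays.asList("h1", "h2", "h3", "h4", "h5", "h6", "p", "div", "li"));

    private final StringBuilder mText = new StringBuilder();
    private int mHeadDepth = 0; // Greater than zero while inside <head>.

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("head".equals(qName)) {
            mHeadDepth++;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (mHeadDepth == 0) {
            mText.append(ch, start, length); // Text in <head> is never shown.
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("head".equals(qName)) {
            mHeadDepth--;
        } else if (mHeadDepth == 0 && BLOCK_ELEMENTS.contains(qName)) {
            mText.append(' '); // Stop adjacent elements running together.
        }
    }

    /** Parses the XHTML with a non-namespace-aware parser and returns the text. */
    static String extract(String xhtml) {
        XhtmlTextSketch handler = new XhtmlTextSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XHTML", e);
        }
        return handler.mText.toString();
    }
}
```

The resulting string is what would be split up and passed to speak().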
Credits and Thanks
I'd like to thank:
- Paul, Ross and Sean at Pharos Systems for their feedback and proofing of early drafts of this document.
- Baen Books, for their permission to use excerpts from some of their epub books as examples. (Although in the end I wrote my own.)