Introduction
The HTML5 File API is used to access files on the client machine via web browsers. The HTML5 standards define three levels of File API:
- File API (referred to here as the File Reader API)
- File API: Writer (also called the File Writer API)
- File API: Directories and System (also called the File System API)
I spent some time analyzing the capabilities of each level and its compatibility with different web browsers. The results should help me choose technologies for future web applications that need to access client-side local files, for example, to upload and download files.
Experimental Browsers
I chose the following web browsers for the experiments. Each was the latest version available at the time of the research.
- Microsoft Internet Explorer 10
- Google Chrome 30
- Mozilla Firefox 24
- Safari 6 (Mac OS and iOS 7)
NOTE: Safari 5.1.7 is the last version for Windows; Apple dropped Windows support starting with Safari 6. Safari 5.1.7 does not support the File API, so we cannot use the File API in Safari on Windows.
Demo Code
The demo code files are in a VS 2012 web application. It contains the following files:
- FileReadApi.html: demonstrates reading a small or huge file and showing the progress. It also supports stopping and resuming. This demo can run in all previously listed browsers.
- FileSaver.html: demonstrates how to implement and use FileSaver. This demo can run in all previously listed browsers.
- FileSystemApi.html: demonstrates how to use the File System API and FileWriter. Among the previously listed browsers, it works only in Chrome.
- FileSystemApiConsumer.html: lists all files created in FileSystemApi.html and reads the user-specified one. Only for Chrome.
- DownloadFileViaBrowser.html and DownloadHandler.ashx: demonstrate the server code that enables browser downloading without exposing the file locations on the server. This demo can run in almost all browsers.
File Reader API
For standard details, please refer to http://www.w3.org/TR/FileAPI/.
Capabilities
With the File Reader API, we can:
- Get one or multiple file objects (instances of the File interface) which the client selects via a pop-up open-file-dialog (<input type="file"/>). Any further file operations must be done on these file objects.
- Only read the following information of the file object:
  - Name without path.
  - MIME type.
  - Size in bytes. Updated in real-time.
  - Last modified date and time. Updated in real-time.
- Read file data from and to any byte positions (via the FileReader interface). This can help us to implement interrupting and resuming upload.
- Support RWW (Read-while-Writing: read a file that is still being written by another program which has set the file sharing mode to read).
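The byte-range reading described above can be sketched as follows. This is a minimal illustration rather than the demo project's code: the helper names chunkRanges and sliceChunk are assumptions, as is the idea of tracking a resume offset. It relies only on Blob.slice, which File inherits.

```javascript
// Hypothetical helper: split a file of `size` bytes into [start, end) ranges.
// `offset` is the byte position to resume from after an interruption.
function chunkRanges(size, chunkSize, offset = 0) {
  if (!(chunkSize > 0)) {
    throw new RangeError("chunkSize must be positive");
  }
  const ranges = [];
  for (let start = offset; start < size; start += chunkSize) {
    ranges.push({ start, end: Math.min(start + chunkSize, size) });
  }
  return ranges;
}

// File inherits slice() from Blob; the result covers bytes [start, end).
function sliceChunk(file, range) {
  return file.slice(range.start, range.end);
}

// In a browser, each chunk could then be handed to a FileReader, e.g.:
//   const reader = new FileReader();
//   reader.onload = () => handleBytes(reader.result); // handleBytes is hypothetical
//   reader.readAsArrayBuffer(sliceChunk(file, range));
```

To resume an interrupted read, the caller would remember the end of the last completed range and pass it as offset on the next call.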
Limitations
File Reader API has the following limitations:
- Only available for the file selected explicitly by the client from a pop-up open-file-dialog. We are not able to open a file even if we know the full file path.
- No way to monitor whether a file is deleted. So we cannot monitor flag files in RWW scenarios.
One interesting finding came out of this. I tried to find a workaround to monitor file deletion. My planned steps were:
- Let the client choose the existing flag file explicitly so that I can get the flag file object.
- Let my code periodically get the flag file size and last modified date, and read the first byte. If something unusual happens, such as a file-not-found exception, I can conclude that the flag file has been deleted.
However, in my experiment the behavior differed considerably between browsers:
- Firefox: throws a file-not-found exception. Very good!
- Chrome: no exception, and the last modified date keeps changing to the current time. Weird!
- IE 10: no exception and no information changes. Very weird!
- Safari: not tested.
The experiment indicated that the planned workaround could not work in all browsers. I believe this is because the HTML5 File API standards do not explicitly describe how to handle such an error. So I had to give up the workaround plan.
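For reference, the probing step of the abandoned plan might look like the sketch below. It is an illustration only: probeFlagFile is a hypothetical name, and the code uses the promise-based Blob.arrayBuffer() found in current browsers, whereas the browsers tested here predate it and would need FileReader callbacks instead. As the results above show, whether the read actually fails after deletion depends on the browser.

```javascript
// Hypothetical probe: read the metadata and the first byte of a File object.
// Resolves to a snapshot instead of throwing, so a caller can compare results.
async function probeFlagFile(file) {
  try {
    const firstByte = await file.slice(0, 1).arrayBuffer();
    return {
      readable: true,
      size: file.size,
      lastModified: file.lastModified,
      bytesRead: firstByte.byteLength,
    };
  } catch (err) {
    // Firefox reported a file-not-found error here in the experiment;
    // Chrome and IE 10 did not raise any error.
    return { readable: false, error: err.name };
  }
}

// Possible periodic use in a page (interval length is an assumption):
//   setInterval(async () => console.log(await probeFlagFile(file)), 5000);
```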
Browser Support
All our experimental browsers support File Reader API.
File Writer API
For standard details, please refer to http://www.w3.org/TR/file-writer-api/.
Capabilities
With the File Writer API, we can:
- Save one blob as the file which the client specifies via a pop-up save-file-dialog (via the saveAs function and FileSaver interface). If the file exists, it will be overwritten.
- In Sandbox, write multiple blobs to a single file at any byte position (FileWriter interface).
Please refer to the File System API chapter for more information about Sandbox.
Limitations
File Writer API has the following limitations:
- Only one blob is allowed when writing a file using FileSaver. That means we cannot write to a file with multiple data chunks continuously received from the Internet. It is not feasible to wait for all chunks to arrive and then construct a huge blob. The blob size limit is determined by the client machine’s hardware capacity, so it is impossible and meaningless to work out a number which is both safe and as large as possible. Still, it is OK to write small files (maybe <= 10MB).
- FileSaver will not write a file before all data in the blob is ready on the client side. That means the file cannot be read by other programs on the client side before writing finishes.
- FileWriter is only available in Sandbox. In fact, the standards say it could potentially be moved to the File System API.
- No support for deleting files.
NOTE: although BlobBuilder and FileSaver are in the standards, they are deprecated by all browser vendors. Instead, we need to construct a Blob directly and implement our own FileSaver or use third-party implementations.
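A self-made FileSaver replacement can be quite small. The sketch below is one possible approach and not the code in FileSaver.html: saveBlob and its env parameter are hypothetical names, and it assumes the browser offers either IE 10's navigator.msSaveBlob or the download attribute on anchor elements. Browsers with neither would need a different fallback. It is written in ES5 style so that older browsers can parse it.

```javascript
// Hypothetical helper: offer `blob` to the user as a download named `fileName`.
// `env` stands in for the browser's window object and exists only so the
// function can be exercised without a real page.
function saveBlob(blob, fileName, env) {
  env = env || window;

  // IE 10 and later expose a dedicated save function.
  if (env.navigator && typeof env.navigator.msSaveBlob === "function") {
    env.navigator.msSaveBlob(blob, fileName);
    return "msSaveBlob";
  }

  // Otherwise point a temporary <a download> link at an object URL.
  var url = env.URL.createObjectURL(blob);
  var link = env.document.createElement("a");
  link.href = url;
  link.download = fileName;
  env.document.body.appendChild(link);
  link.click();
  env.document.body.removeChild(link);

  // Release the object URL once the click has been dispatched.
  env.setTimeout(function () {
    env.URL.revokeObjectURL(url);
  }, 0);
  return "anchor";
}

// Browser usage:
//   saveBlob(new Blob(["hello"], { type: "text/plain" }), "hello.txt");
```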
Browser Support
All our experimental browsers support our own implementation of FileSaver.
Chrome is the only one which supports FileWriter.
File System API
For standard details, please refer to http://www.w3.org/TR/file-system-api/.
Capabilities
With the File System API, we can:
- Create a Sandbox as a private file system (using the requestFileSystem function).
- No size limit for the Sandbox. The only limitation is the hard disk capacity.
- Do all directory related operations within the Sandbox, including creating, renaming, deleting, copying, moving, etc.
- Do all file related operations within the Sandbox, including creating, renaming, reading, writing, deleting, copying, moving, etc.
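As a rough sketch of these capabilities, the function below appends a blob to a file inside the Sandbox. It is not taken from FileSystemApi.html: writeToSandbox is a hypothetical name, the 5 MB size passed to the request is an assumed value, and the code expects Chrome's prefixed webkitRequestFileSystem, falling back to reporting that the API is missing.

```javascript
// Hypothetical helper: append `blob` to `fileName` in the temporary Sandbox.
// `win` is the browser's window object. Returns false when the File System
// API is unavailable, true when the request has been started.
function writeToSandbox(win, fileName, blob, onDone, onError) {
  const request = win.requestFileSystem || win.webkitRequestFileSystem;
  if (typeof request !== "function") {
    return false; // e.g. Firefox, IE 10 and Safari in this article's tests
  }

  const requestedBytes = 5 * 1024 * 1024; // assumed size, adjust as needed
  request.call(win, win.TEMPORARY, requestedBytes, (fs) => {
    // Create the file if it does not exist yet.
    fs.root.getFile(fileName, { create: true }, (entry) => {
      entry.createWriter((writer) => {
        writer.onwriteend = () => onDone(entry);
        writer.onerror = onError;
        writer.seek(writer.length); // move to the end so the blob is appended
        writer.write(blob);
      }, onError);
    }, onError);
  }, onError);
  return true;
}

// Browser usage:
//   writeToSandbox(window, "log.txt", new Blob(["line\n"]),
//     (entry) => console.log("written", entry.fullPath),
//     (err) => console.error(err));
```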
Limitations
File System API can only be used within an Application Isolated Sandbox. Applications are differentiated by the combination of protocol, address and port. For example, “http://www.codeproject.com”, “https://www.codeproject.com”, “http://www.codeproject.com:8080” and “http://us.codeproject.com” are all different applications; “http://www.codeproject.com/my/default.aspx” and “http://www.codeproject.com/api/data/” are in the same application.
Therefore, File System API has the following limitations:
- Items in one application’s Sandbox are invisible to others.
- Items in the Sandbox are invisible to other types of web browsers. For example, a Sandbox created in Google Chrome is invisible to non-Chrome browsers.
- Items in the Sandbox are invisible to non-browser programs.
NOTE: with the help of File Reader API and File Writer API, it is still possible to copy files between the Sandbox and the client’s file system.
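The application boundary described above (protocol, address and port) corresponds to what browsers call the origin of a URL. A small sketch, assuming the standard URL class is available, shows how the example addresses compare; sameApplication is a hypothetical name.

```javascript
// Two URLs belong to the same application (and therefore share a Sandbox)
// when protocol, host and port all match. URL.origin combines the three.
function sameApplication(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

const base = "http://www.codeproject.com";
console.log(sameApplication(base, "https://www.codeproject.com"));     // false: protocol differs
console.log(sameApplication(base, "http://www.codeproject.com:8080")); // false: port differs
console.log(sameApplication(base, "http://us.codeproject.com"));       // false: address differs
console.log(sameApplication(
  "http://www.codeproject.com/my/default.aspx",
  "http://www.codeproject.com/api/data/"
));                                                                    // true: only the path differs
```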
Browser Support
Chrome is the only one of our experimental browsers which supports the File System API. No other browser vendor has committed to implementing it in future. In fact, it has failed to gain traction among browser vendors and is only supported in some WebKit-based browsers.
The following screenshot shows the test result for Firefox on http://html5test.com. Please note the central italic comment.
Related Topics
Download via Web Browsers
One reason that we need FileSaver or the File Writer API is to enable special file download logic. Without any plugins like Flash, Silverlight or ActiveX, the best way to download a file in a web browser is to use the embedded download feature of the browser. This feature can:
- Cover all functionalities of FileSaver.
- Download a file to the browser’s default download folder or any other folder which the client specifies in a pop-up save-file-dialog. The client can choose the preferred behavior via the browser settings.
- Let the client choose whether to open the file after download.
- Potentially resume interrupted downloads. The HTTP protocol supports Breakpoint Resuming in downloading files. Although no browser implements HTTP Breakpoint Resuming at present, some downloading software supports it, and it is quite possible that some or all web browsers will support it in future.
This feature has its limitations as well:
- The file data is written to a temporary file with a fake name before download is done. The browser renames the file when download is done. So the file cannot be read by other programs until download is done.
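Breakpoint Resuming rests on HTTP range requests: the client sends a Range header naming the bytes it still needs, and a server that supports it answers with a 206 Partial Content response. The sketch below shows how a server might interpret such a header. parseByteRange is a hypothetical helper, and it deliberately handles only a single byte range.

```javascript
// Hypothetical helper: turn a Range header such as "bytes=100-" into an
// inclusive { start, end } pair for a resource of `size` bytes.
// Returns null when the header is malformed or cannot be satisfied.
function parseByteRange(header, size) {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!match || size <= 0) return null;

  const first = match[1];
  const last = match[2];
  if (first === "" && last === "") return null;

  let start;
  let end;
  if (first === "") {
    // Suffix form "bytes=-N": the final N bytes of the resource.
    const count = Number(last);
    if (count === 0) return null;
    start = Math.max(size - count, 0);
    end = size - 1;
  } else {
    start = Number(first);
    end = last === "" ? size - 1 : Math.min(Number(last), size - 1);
  }

  // Covers both a start beyond the end of the file and a reversed range.
  return start > end ? null : { start, end };
}
```

A client resuming a 1000-byte download after receiving 100 bytes would send "bytes=100-", which this helper maps to start 100 and end 999.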
The easiest way to enable browser downloading is to expose a file directly to the client as a URL like “http://mysite.com/GDrive/Data/Org2_53/Loc2_230/Site2_232/Folder2_242/My_244.data” which maps to the physical path of the file on the InterACT web site. No extra coding work needs to be done. However, this approach has the following weaknesses:
- It exposes too much information to the client.
- Path mapping and deployment in a web farm are complicated.
- Authorization is complicated to implement.
- It is not flexible for future changes, e.g. a folder structure change.
A better way is to write a web service, web API or HTTP handler to serve the download requests coming from web browsers. Then we can provide download URLs like “http://mysite.com/api/files/download?id=244”, which are cleaner and more flexible, secure and manageable.
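The core of such a handler is mapping an opaque id to a server-side file and telling the browser to treat the response as a download. The sketch below illustrates that idea independently of any server framework and is not the code in DownloadHandler.ashx: buildDownloadResponse, the catalog map and the sample path are all hypothetical.

```javascript
// Hypothetical helper: look up a file by id and describe the HTTP response.
// `catalog` maps ids to { name, size, path }. The physical path stays on the
// server and is never sent to the client.
function buildDownloadResponse(id, catalog) {
  const entry = catalog.get(id);
  if (!entry) {
    return { status: 404, headers: {} };
  }

  // Plain-ASCII fallback name, plus a percent-encoded UTF-8 name for
  // browsers that understand the filename* parameter.
  const asciiName = entry.name.replace(/[^\x20-\x7e]|["\\]/g, "_");
  const encodedName = encodeURIComponent(entry.name).replace(
    /['()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );

  return {
    status: 200,
    headers: {
      "Content-Type": "application/octet-stream",
      "Content-Length": String(entry.size),
      // "attachment" asks the browser to download instead of displaying.
      "Content-Disposition":
        `attachment; filename="${asciiName}"; filename*=UTF-8''${encodedName}`,
    },
    path: entry.path, // for the server to stream from
  };
}

// Example catalog entry (values are made up for illustration):
const catalog = new Map([
  ["244", { name: "My_244.data", size: 1024, path: "/data/files/My_244.data" }],
]);
console.log(buildDownloadResponse("244", catalog).headers["Content-Disposition"]);
```

Because the client only ever sees the id, the folder structure can change and authorization checks can be added in one place.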
NOTE: do NOT use HttpResponse.WriteFile because it is not able to transfer big files to the client. The size limitation is determined by the server’s hardware capability. There is a better solution introduced by Microsoft: http://support.microsoft.com/kb/812406.
Conclusion
In summary, the key points are:
- Any read or write operation on the client file system requires the client's explicit approval via an open-file or save-file dialog.
- The File Reader API is widely supported.
- The FileSaver part of the File Writer API can be implemented by ourselves, and such an implementation is widely supported.
- The File System API and FileWriter are only available in some WebKit browsers like Chrome, and they can only operate on files within an Application Isolated Sandbox.
- The browser's embedded download feature is the best choice for implementing your special file download logic if you don't want to use any plugin technologies.
References
[1] “W3C HTML5 File API”, http://www.w3.org/TR/FileAPI/
[2] “W3C HTML5 File API: Writer”, http://www.w3.org/TR/file-writer-api/
[3] “W3C HTML5 File API: Directories and System”, http://www.w3.org/TR/file-system-api/
[4] “The HTML5 Test”, http://html5test.com
[5] “Exploring the FileSystem APIs”, http://www.html5rocks.com/en/tutorials/file/filesystem/, Eric Bidelman