A possible algorithm for preventing directory traversal would be to:
- Process URI requests that do not result in a file request, e.g., executing a hook into user code, before continuing below.
- When a URI request for a file or directory is made, build the full path to the file or directory, if it exists, and normalize all characters (e.g., %20 converted to a space).
- It is assumed that a fully qualified, normalized 'Document Root' path is known, and that this string has length N. Assume that no files outside this directory can be served.
- Ensure that the first N characters of the fully qualified path to the requested file are exactly the same as the 'Document Root'.
- If so, allow the file to be returned.
- If not, return an error, since the request falls outside what the web server is allowed to serve.
- As an additional check, test whether the requested path contains '..' (two consecutive periods) and reject the request if it does.
- Appending a hard-coded, predefined file extension to the path does not necessarily limit the scope of the attack to files with that extension.
<?php
include($_GET['file'] . '.html');
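In older PHP versions, the user can append %00 (the URL-encoded NUL byte, which marks the end of a string) to the parameter so that everything after the $_GET value, here the '.html' suffix, is ignored.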
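
The algorithm above could be sketched in PHP roughly as follows. This is a minimal illustration, not a definitive implementation: the function name resolveInsideRoot and its parameters are hypothetical, and it assumes the raw, still percent-encoded URI path is passed in. It relies on PHP's built-in realpath() to produce the fully qualified, normalized path. One refinement over the plain "first N characters" comparison is labeled in the comments: the prefix includes a trailing directory separator, so that a root of /var/www does not also match /var/www-private.

```php
<?php
/**
 * Hypothetical helper: returns the canonical path of $uriPath if it lies
 * inside $documentRoot, or null if the request is out of bounds or the
 * target does not exist.
 */
function resolveInsideRoot(string $documentRoot, string $uriPath): ?string
{
    // Normalize percent-escapes once (e.g. %20 becomes a space).
    $decoded = rawurldecode($uriPath);

    // Reject embedded NUL bytes (%00) outright.
    if (strpos($decoded, "\0") !== false) {
        return null;
    }

    // Canonicalize the document root; realpath() returns false if it is missing.
    $root = realpath($documentRoot);
    if ($root === false) {
        return null;
    }

    // Build the full path and canonicalize it. realpath() resolves '..', '.'
    // and symbolic links, and returns false when the target does not exist.
    $full = realpath($root . DIRECTORY_SEPARATOR . ltrim($decoded, '/\\'));
    if ($full === false) {
        return null;
    }

    // Compare the leading characters against the document root.
    // Assumption beyond the original steps: the prefix carries a trailing
    // separator so that sibling directories sharing the prefix are rejected.
    $prefix = rtrim($root, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if ($full !== $root && strncmp($full, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    return $full;
}
```

With this sketch, a request such as /../../etc/passwd yields null, because the canonical path either does not exist or does not begin with the document root. Because the comparison runs on the canonical path, a separate check for '..' is not strictly needed here.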
[From Wikipedia - Directory traversal attack]