|
| 1 | +<!DOCTYPE qhelp PUBLIC |
| 2 | + "-//Semmle//qhelp//EN" |
| 3 | + "qhelp.dtd"> |
| 4 | +<qhelp> |
| 5 | + |
| 6 | +<overview> |
| 7 | +<p>Extracting files from a malicious tar archive without validating that the destination file path |
| 8 | +is within the destination directory can cause files outside the destination directory to be |
| 9 | +overwritten, due to the possible presence of directory traversal elements (<code>..</code>) in |
| 10 | +archive paths.</p> |
| 11 | + |
| 12 | +<p>Tar archives contain archive entries representing each file in the archive. These entries |
| 13 | +include a file path for the entry, but these file paths are not restricted and may contain |
| 14 | +unexpected special elements such as the directory traversal element (<code>..</code>). If these |
| 15 | +file paths are used to determine an output file to write the contents of the archive item to, then |
| 16 | +the file may be written to an unexpected location. This can result in sensitive information being |
| 17 | +revealed or deleted, or an attacker being able to influence behavior by modifying unexpected |
| 18 | +files.</p> |
| 19 | + |
| 20 | +<p>For example, if a tar archive contains a file entry <code>..\sneaky-file</code>, and the tar archive |
| 21 | +is extracted to the directory <code>c:\output</code>, then naively combining the paths would result |
| 22 | +in an output file path of <code>c:\output\..\sneaky-file</code>, which would cause the file to be |
| 23 | +written to <code>c:\sneaky-file</code>.</p> |
| 24 | + |
| 25 | +</overview> |
| 26 | +<recommendation> |
| 27 | + |
| 28 | +<p>Ensure that output paths constructed from tar archive entries are validated |
| 29 | +to prevent writing files to unexpected locations.</p> |
| 30 | + |
| 31 | +<p>The recommended way of writing an output file from a tar archive entry is to check that |
| 32 | +the entry's path is not absolute and that <code>".."</code> does not occur in it. |
| 33 | +</p> |
| 34 | + |
| 35 | +</recommendation> |
| 36 | + |
| 37 | +<example> |
| 38 | +<p> |
| 39 | +In this example, an archive is extracted without validating file paths. |
| 40 | +If <code>archive.tar</code> contained paths that refer to a parent directory (for |
| 41 | +instance, if it were created by something like <code>tar -cf archive.tar |
| 42 | +../file.txt</code>) then executing this code could write to locations |
| 43 | +outside the destination directory. |
| 44 | +</p> |
| 45 | + |
| 46 | +<sample src="examples/tarslip_bad.py" /> |
| 47 | + |
| 48 | +<p>To fix this vulnerability, we need to check that the path does not |
| 49 | +contain any <code>".."</code> elements. |
| 50 | +</p> |
| 51 | + |
| 52 | +<sample src="examples/tarslip_good.py" /> |
| 53 | + |
| 54 | +</example> |
| 55 | +<references> |
| 56 | + |
| 57 | +<li> |
| 58 | +Snyk: |
| 59 | +<a href="https://snyk.io/research/zip-slip-vulnerability">Zip Slip Vulnerability</a>. |
| 60 | +</li> |
| 61 | +<li> |
| 62 | +OWASP: |
| 63 | +<a href="https://www.owasp.org/index.php/Path_traversal">Path Traversal</a>. |
| 64 | +</li> |
| 65 | +<li> |
| 66 | +Python Library Reference: |
| 67 | +<a href="https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extract">TarFile.extract</a>. |
| 68 | +</li> |
| 69 | +<li> |
| 70 | +Python Library Reference: |
| 71 | +<a href="https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall">TarFile.extractall</a>. |
| 72 | +</li> |
| 73 | + |
| 74 | +</references> |
| 75 | +</qhelp> |
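
As a rough illustration of the validation this help file recommends, the Python sketch below resolves each entry's destination and skips any entry that would land outside the target directory. It is not the content of `examples/tarslip_good.py`, which is not shown in this diff; the helper names `is_safe_member` and `safe_extract` are hypothetical and exist only for this example.

```python
import os
import tarfile


def is_safe_member(member_name, dest_dir):
    """Return True if member_name, joined onto dest_dir, stays inside dest_dir.

    Hypothetical helper for illustration only.
    """
    dest_root = os.path.realpath(dest_dir)
    # os.path.join discards dest_root when member_name is absolute, so
    # absolute entry names are caught by the containment check as well.
    target = os.path.realpath(os.path.join(dest_root, member_name))
    try:
        return os.path.commonpath([dest_root, target]) == dest_root
    except ValueError:
        # Raised, for example, when the two paths are on different drives.
        return False


def safe_extract(archive_path, dest_dir):
    """Extract regular files and directories whose paths stay inside dest_dir.

    Returns the names of the members that were extracted.
    """
    extracted = []
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            # Assumption for this sketch: links and special files are skipped.
            if not (member.isfile() or member.isdir()):
                continue
            if not is_safe_member(member.name, dest_dir):
                continue
            tar.extract(member, path=dest_dir)
            extracted.append(member.name)
    return extracted
```

Resolving the final path, rather than only searching the name for `..`, also rejects absolute entry names. Skipping symbolic links and other non-regular members is a simplification made for this sketch; recent Python versions also provide extraction filters in the `tarfile` module that may be preferable where available.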