Not normalising before validation bypasses security checks

A security patch added to Apache’s httpd to prevent path traversal was itself still vulnerable to path traversal. The vulnerability was reported as CVE-2021-41773: Apache Path Traversal.

    while (path[l] != '\0') {
        /* RFC-3986 section 2.3:
         *  For consistency, percent-encoded octets in the ranges of
         *  ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D),
         *  period (%2E), underscore (%5F), or tilde (%7E) should [...]
         *  be decoded to their corresponding unreserved characters by
         *  URI normalizers.
         */
        if ((flags & AP_NORMALIZE_DECODE_UNRESERVED)
                && path[l] == '%' && apr_isxdigit(path[l + 1])
                                  && apr_isxdigit(path[l + 2])) {
            /* Decode only the escape at the current read position */
            const char c = x2c(&path[l + 1]);
            if (apr_isalnum(c) || (c && strchr("-._~", c))) {
                /* Replace last char and fall through as the current
                 * read position */
                l += 2;
                path[l] = c;
            }
        }
        /* ... rest of the loop omitted ... */
    }

So what is wrong with the above patch?

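If we do not normalise the input, or otherwise enforce charset safety, before running security checks on it, those checks can be bypassed. Apache’s patch for the path traversal bug above is a prime example. Adding validation was the right thing to do, but the code decodes a percent-encoded character only at the current read position. A ".." check that compares the following character against a literal '.' therefore never sees a dot that is still encoded as %2e, so an encoded version of the path passes validation and only turns into a traversal sequence when it is decoded later.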
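To make the "normalise first, validate second" order concrete, here is a minimal sketch in C. It is not Apache's code or Apache's actual fix; the helper names hex_val, percent_decode, has_dotdot_segment and is_safe_path are hypothetical, and the sketch assumes the caller passes a writable, NUL-terminated buffer.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if it is not one. */
static int hex_val(char ch)
{
    unsigned char u = (unsigned char)ch;
    if (isdigit(u)) return u - '0';
    if (u >= 'a' && u <= 'f') return u - 'a' + 10;
    if (u >= 'A' && u <= 'F') return u - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode every well-formed %XX escape in place.
 * Malformed escapes are copied through unchanged (a simplification).
 * Returns 0 on success, -1 if an escape decodes to NUL. */
static int percent_decode(char *s)
{
    char *w = s;
    const char *r = s;
    while (*r != '\0') {
        int hi, lo;
        if (r[0] == '%' && (hi = hex_val(r[1])) >= 0
                        && (lo = hex_val(r[2])) >= 0) {
            char c = (char)(hi * 16 + lo);
            if (c == '\0') return -1;
            *w++ = c;
            r += 3;
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
    return 0;
}

/* Returns 1 if any '/'-separated segment of path is exactly "..". */
static int has_dotdot_segment(const char *path)
{
    const char *seg = path;
    for (;;) {
        size_t len = strcspn(seg, "/");
        if (len == 2 && seg[0] == '.' && seg[1] == '.') return 1;
        if (seg[len] == '\0') return 0;
        seg += len + 1;
    }
}

/* Decode the whole path first, then validate the decoded form.
 * Returns 1 if the path is acceptable, 0 otherwise. */
static int is_safe_path(char *buf)
{
    if (percent_decode(buf) != 0) return 0;
    return !has_dotdot_segment(buf);
}
```

With this ordering, a buffer holding "/cgi-bin/.%2e/%2e%2e/etc/passwd" is decoded to "/cgi-bin/../../etc/passwd" before the check runs, so is_safe_path rejects it. The decoded buffer is also the one that should be used afterwards: if a later layer decodes the path again, the check no longer describes what actually gets opened.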

How can it be attacked? The picture below tells the story.
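In text form, the bypass can be sketched roughly like this. The function below is a deliberately simplified, hypothetical stand-in for a check that only recognises ".." when both dots are literal characters; it is not the httpd implementation, and the request paths in the usage note are illustrative.

```c
/* Hypothetical, simplified literal-only check: reports a ".." segment
 * only when both dots appear as raw '.' characters in the path. */
static int literal_dotdot_check(const char *path)
{
    for (const char *p = path; *p != '\0'; p++) {
        /* Short-circuit evaluation stops at the terminating NUL, so
         * p[1]..p[3] are never read past the end of the string. */
        if (p[0] == '/' && p[1] == '.' && p[2] == '.'
                && (p[3] == '/' || p[3] == '\0')) {
            return 1;
        }
    }
    return 0;
}
```

For "/cgi-bin/../../etc/passwd" the check returns 1 and the request can be refused. For "/cgi-bin/.%2e/.%2e/etc/passwd" it returns 0, because the character after each literal dot is '%', not '.'. Once a later stage percent-decodes the path, each ".%2e" becomes "..", and the traversal goes through unchecked.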

Thanks to @raaqim for pointing me to this finding.
