ASP applications can use the Server.HTMLEncode API to sanitize common
malicious characters within a user-controllable string before it is copied into
the server’s response. This API converts the characters " & < and > to their
corresponding HTML entities, and also converts any character with a code above
0x7f using the numeric form of encoding.
On the Java platform, there is no equivalent built-in API available; however,
it is simple to construct your own equivalent method using just the numeric
form of encoding. For example:
public static String HTMLEncode(String s)
{
    // Replace ", &, <, > and any character above 0x7f with its
    // numeric HTML entity (for example, < becomes &#60;).
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < s.length(); i++)
    {
        char c = s.charAt(i);
        if (c > 0x7f || c == '"' || c == '&' || c == '<' || c == '>')
            out.append("&#" + (int) c + ";");
        else
            out.append(c);
    }
    return out.toString();
}
A common mistake made by developers is to HTML-encode only the characters
that immediately appear to be of use to an attacker in the specific context.
For example, if an item is being inserted into a double-quoted string, the
application might encode only the " character; if the item is being inserted
unquoted into a tag, it might encode only the > character. This approach
considerably increases the risk of bypasses being found. As you have seen, an
attacker can often exploit browsers’ tolerance of invalid HTML and JavaScript
to change context or inject code in unexpected ways. Further, it is often
possible to span an attack across multiple controllable fields, exploiting the
different filtering being employed in each one. A far more robust approach is
to always HTML-encode every character that may be of potential use to an
attacker, regardless of the context where it is being inserted. To provide the
highest possible level of assurance, developers may elect to HTML-encode
every non-alphanumeric character, including whitespace. This approach
normally imposes no measurable overhead on the application, and presents a
severe obstacle to any kind of filter bypass attack.
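The stricter policy described above, encoding every non-alphanumeric
character, might be sketched as follows. This is only an illustration: the
class name StrictHtmlEncoder and the method name encodeAll are hypothetical,
and the sketch assumes that "alphanumeric" means the ASCII letters and digits
only, so that everything else (including whitespace and non-ASCII letters) is
written as a numeric entity.

```java
final class StrictHtmlEncoder {
    private StrictHtmlEncoder() {}

    // Hypothetical helper: leaves only A-Z, a-z and 0-9 unencoded and
    // writes every other code point as a decimal numeric entity.
    static String encodeAll(String s) {
        StringBuilder out = new StringBuilder(s.length() * 6);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            boolean asciiAlnum = (cp >= 'a' && cp <= 'z')
                    || (cp >= 'A' && cp <= 'Z')
                    || (cp >= '0' && cp <= '9');
            if (asciiAlnum) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            // Advance by one or two chars so surrogate pairs stay together.
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

With this sketch, a space is emitted as &#32; and the angle brackets and
quotation marks of a typical attack string never reach the response in raw
form, whatever context the value is inserted into.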
The reason for combining input validation and output sanitization is that this
involves two layers of defense, either one of which will provide some
protection if the other one fails. As you have seen, many filters that perform
input and output validation are subject to bypasses. By employing both
techniques, the application gains some additional assurance that an attacker
will be defeated even if one of its two filters is found to be defective. Of
the two defenses, output validation is the more important and is absolutely
mandatory. Performing strict input validation should be viewed as a secondary
failover.
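As a rough illustration of the two layers working together, the sketch below
validates a value on input and encodes it again on output. All names here
(LayeredDefense, isValidName, renderGreeting, handleRequest) and the
whitelist pattern are assumptions made for the example, not part of any
framework; a real application would choose validation rules to suit each
field.

```java
import java.util.regex.Pattern;

final class LayeredDefense {
    private LayeredDefense() {}

    // Layer 1 (secondary): strict whitelist for a hypothetical "name" field.
    private static final Pattern NAME = Pattern.compile("[A-Za-z0-9 ]{1,40}");

    static boolean isValidName(String input) {
        return input != null && NAME.matcher(input).matches();
    }

    // Layer 2 (mandatory): numeric-entity encode ", &, <, > and
    // anything above 0x7f at the point of output.
    static String htmlEncode(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7f || c == '"' || c == '&' || c == '<' || c == '>')
                out.append("&#").append((int) c).append(';');
            else
                out.append(c);
        }
        return out.toString();
    }

    // Always encodes, whether or not the caller validated the value.
    static String renderGreeting(String name) {
        return "<p>Hello, " + htmlEncode(name) + "</p>";
    }

    // Rejects invalid input first, then renders with encoding.
    static String handleRequest(String name) {
        if (!isValidName(name)) {
            return "<p>Invalid name.</p>";
        }
        return renderGreeting(name);
    }
}
```

The point of the design is that renderGreeting does not trust its caller: if
the input filter were bypassed or omitted on some code path, the output
encoding would still neutralize the value.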