The Dark Side of Covariance

What’s wrong with the following code?

var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
...
if (names.Contains(sqlCommand.ExecuteScalar())

This  code is intended to check whether the result of a SQL query is contained in a case-insensitive collection of names.  However, if you run this code, the resulting check will be case-sensitive.  Why?

As you may have guessed from the title, this is caused by covariance.  In fact, this code will not compile at all against .Net 3.5. 

The problem is that ExecuteScalar() returns object, not string.  Therefore, it doesn’t call HashSet<string>.Contains(string), which is what it’s intending to call (and which uses the HashSet’s comparer).  Instead, on .Net 4.0, this calls the  Enumerable.Contains<object>(IEnumerable<object>, string) extension method, using the covariant conversion from IEnumerable<string> to IEnumerable<object>.  Covariance allows us to pass object to the Contains method of any strongly-typed collection (of reference types).

Still, why is it case-sensitive?  As Jon Skeet points out, the LINQ Contains() method is supposed to call any built-in Contains() method from ICollection<T>, so it should still use the HashSet’s case-insensitive Contains().

The reason is that although HashSet<String> implements ICollection<string>, it does not implement ICollection<object>.  Since we’re calling Enumerable.Contains<object>, it checks whether the sequence implements ICollection<object>, which it doesn’t.  (ICollection<T> is not covariant, since it allows write access)

Fortunately, there’s a simple fix: just cast the return value back to string (and add a comment explaining the problem).  This allows the compiler to call HashSet<string>.Contains(string), as was originally intended.

//Call HashSet<string>.Contains(string), not the
//covariant Enumerable.Contains(IEnumerable<object>, object)
//http://blog.slaks.net/2011/12/dark-side-of-covariance.html
if (names.Contains((string)sqlCommand.ExecuteScalar())
(I discovered this issue in my StringListConstraint for ASP.Net MVC)

CAPTCHAs do not mitigate XSS worms

One common misconception about web security is that protecting important actions with CAPTCHAs can prevent XSS attacks from doing real damage.  By preventing malicious code from scripting critical tasks, the idea goes, XSS injections won’t be able to accomplish much.

This idea is dangerously wrong. 

First of all, this should not even be considered except as a defense-in-depth mechanism.  Regardless of whether the actions you care about are protected by CAPTCHAs, XSS attacks can create arbitrary UI on your pages, and can thus make “perfect” phishing attacks.

Also, even with CAPTCHAs, an XSS injection can wait until the user performs the critical action, then change the submitted data to the attacker’s whim.

For example, if Twitter took this approach to prevent XSS injections from sending spammy tweets, the attacker could simply wait until the user sends a real tweet, then silently append advertising to the tweet as the user submits it and fills out the CAPTCHA.

However, there is also a more fundamental issue.  Since the injected Javascript is running in the user’s browser, it simply display the CAPTCHA to the user and block all page functionality until the user solves the CAPTCHA.  The attacker can even put his own text around the CAPTCHA to look like a legitimate security precaution, so that the (typical) user will not realize that the site has been compromised.  (that could be prevented by integrating a description of the action being performed into the CAPTCHA itself in a way that the attacker can’t hide)

I haven’t even mentioned the inconvenience of forcing all legitimate, uncompromised users to fill out CAPTCHAs every time they do anything significant.

In summary, CAPTCHAs should only be used to prevent programs from automatically performing actions (eg, bulk-registering Google accounts), and as a rate-limiter if a user sends too many requests too quickly (eg, getting a password wrong too many times in a row).

XSS can only be stopped by properly encoding all user-generated content that gets concatenated into markup (whether HTML, Javascript, or CSS)