Even the most casual user of the internet is familiar with the irritating moment at which a website asks you to "Type in the characters in this image" - surely it knows what they are?!
The slightly more advanced surfer is acquainted with the reason why, too - the point is that automated programs are bad at reading distorted text in images! And, if you don't want spammers using automated programs to send thousands of comments/applications/requests to your website every day, you need to ensure that the end-user is an actual human. This is akin to a concept well known in some circles: the Turing Test.
The Turing Test is a test of a machine's ability to respond to questions in the way a human would. The classic example is a game in which a person and a computer both answer the same questions, and another person (the judge) attempts to work out which is which, given that both are trying to convince the judge that they are the human. A computer which succeeds in convincing the judge that it is human is said to have passed the test.
This particular test is orientated towards textual responses to textual questions; however, the concept of the CAPTCHA (NB: the term is ™ Carnegie Mellon University, so don't go using it everywhere :D) deals with a substantially different problem.
The difference is twofold.
Firstly, the aspect of the human mind being tested is no longer the ability to give sensible answers to written questions - it is the ability to identify the shapes of letters concealed within an image, i.e. from a set of coloured pixels. This task is of a significantly different nature.
Secondly, the entity (thing, person, computer etc.) which sets and marks the test is not a human but a computer - the server. A CAPTCHA-style test therefore consists of a computer trying to work out whether the entity it is communicating with is a human or another computer. In practice, this means that the test must be something a program can generate and mark entirely on its own, with one unambiguous right answer.
The general method for using CAPTCHA-style tests runs roughly as follows:
- Receive new connection
- Make some note of a detail which identifies it (such as PHP session ID, or IP address)
- Generate the string (i.e. piece of text) to be disguised
- Record it along with the identifying information
- Write the text to an image, applying distortions and so on
- Send the image to the user
- Wait for a reply
- Test whether the response matches the recorded string, and act accordingly
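To make the bookkeeping in those steps concrete, here is a minimal sketch. It is in Python rather than PHP, it is not the script running on this site, and it leaves out the image drawing entirely. The names `new_challenge`, `check_response` and the in-memory `_pending` store are all made up for illustration; a real site would more likely keep the text in its own session storage.

```python
import hmac
import secrets
import string

# Hypothetical in-memory store: identifying detail (e.g. session ID) -> expected text.
# Assumption for illustration only; a real site would use its session mechanism.
_pending = {}

# Upper-case letters and digits, minus characters that look alike once distorted.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "O0I1"
)


def new_challenge(session_id, length=6):
    """Generate the text to be disguised and record it against the session."""
    text = "".join(secrets.choice(ALPHABET) for _ in range(length))
    _pending[session_id] = text
    return text  # the caller would now write this onto a distorted image


def check_response(session_id, response):
    """Return True only if the reply matches the recorded text.

    The recorded text is discarded on every attempt, right or wrong,
    so one image can never be answered twice.
    """
    expected = _pending.pop(session_id, None)
    if expected is None:
        return False
    given = response.strip().upper()  # be forgiving about case and stray spaces
    return hmac.compare_digest(expected.encode(), given.encode())
```

Throwing the recorded text away after a single attempt is a design choice, not a requirement: it means a wrong guess costs the visitor a fresh image, but it also stops a program from hammering away at the same one.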
I've chosen to use the most common platform for this - a PHP script with the GD library. However, I wrote my own script to generate the images... and it's not very good. Have a look at the commenting page for some idea of how very bad they are :D
Anyway, I think that this will make a good front line against spammers (especially as comments must be approved anyway), since it is a unique format, with even the image format being randomised. As I say on the commenting page, I will soon make it possible to register. Then, once an account is approved, no more verification images or moderation delays! Obviously, the registration system will be abused, but I can then ban accounts, and make the registration process tedious for anyone abusing it.
That's all folks! Why not leave me a comment?