Welcome to the MacNN Forums.

We love Captchas!
Andrew Stephens
Mac Elite
Join Date: Jan 2004
Status: Offline
Aug 19, 2008, 04:49 PM
 
http://news.bbc.co.uk/1/hi/technology/7567692.stm

Converting scanned text from old books into digital format by handing out the words for use in captchas. Sort of like a human OCR but in reverse. With over 100 million captchas used every day, that's a lot of old text digitised for free.

Inspired.
     
Dakar the Fourth
Addicted to MacNN
Join Date: Nov 2007
Location: In the hearts and minds of MacNNers
Status: Offline
Aug 19, 2008, 04:51 PM
 
Clever.
     
Doofy
Clinically Insane
Join Date: Jul 2005
Location: Vacation.
Status: Offline
Aug 19, 2008, 05:11 PM
 


Maybe I'm being ultra-dumb here, but if the system running the captcha doesn't know what text is being displayed in the graphic then how exactly does it function as a captcha? And if it does know what text the graphic is displaying, then what's the point in the conversion at user-level?
Been inclined to wander... off the beaten track.
That's where there's thunder... and the wind shouts back.
     
moep
Senior User
Join Date: Nov 2003
Status: Offline
Aug 19, 2008, 05:18 PM
 
I was thinking the same thing, but I guess it runs the same captcha past multiple users, compares all entered words and then picks the one with the most matches.
(does that make any sense?)
( Last edited by moep; Aug 20, 2008 at 02:25 AM. )
"The road to success is dotted with the most tempting parking spaces."
     
Andrew Stephens  (op)
Mac Elite
Join Date: Jan 2004
Status: Offline
Aug 19, 2008, 05:39 PM
 
Originally Posted by moep View Post
I was thinking the same thing, but I guess it runs the same captcha past multiple users, compares all entered words and then picks the one with the most matches.
(does that make any sense?)
Yes, that's exactly what they do. With 99.1% accuracy apparently, which is the same as a professional human transcriber.
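
For anyone who wants to see the voting idea spelled out, here is a minimal Python sketch. It is only an illustration of "compare all entered words and pick the one with the most matches"; the function name `consensus` and both thresholds are assumptions made up for this example, not details of the real system.

```python
from collections import Counter


def consensus(answers, min_votes=3, min_share=0.6):
    """Return the agreed transcription, or None if there is no clear winner yet.

    min_votes and min_share are hypothetical thresholds chosen for illustration.
    """
    # Normalise whitespace and case so "Morning " and "morning" count as one vote.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if len(normalized) < min_votes:
        return None  # not enough people have seen this word yet
    word, count = Counter(normalized).most_common(1)[0]
    # Accept the most common answer only if enough of the voters agree on it.
    return word if count / len(normalized) >= min_share else None
```

Calling `consensus(["morning", "Morning ", "mourning", "morning"])` returns `"morning"`, because three of the four answers agree. Lowercasing everything is a simplification; a real transcription job would presumably want to keep capitalisation.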
     
Doofy
Clinically Insane
Join Date: Jul 2005
Location: Vacation.
Status: Offline
Aug 19, 2008, 06:25 PM
 
Originally Posted by Andrew Stephens View Post
Yes, that's exactly what they do. With 99.1% accuracy apparently, which is the same as a professional human transcriber.
But how does that captcha the first user to see a particular image?
Been inclined to wander... off the beaten track.
That's where there's thunder... and the wind shouts back.
     
Laminar
Posting Junkie
Join Date: Apr 2007
Location: Iowa, how long can this be? Does it really ruin the left column spacing?
Status: Offline
Aug 19, 2008, 07:33 PM
 
That reminds me of another captcha use. A program will take a captcha that blocks automated email signups, display it to a human with the promise of porn if entered correctly, then use the answer to automatically sign up for an email account.
     
andreas_g4
Professional Poster
Join Date: Mar 2002
Location: adequate, thanks.
Status: Offline
Aug 19, 2008, 09:12 PM
 
Now that is one clever use of technology.
     
MarkLT1
Mac Enthusiast
Join Date: Nov 2002
Location: More Cowbell...
Status: Offline
Aug 19, 2008, 09:48 PM
 
Originally Posted by Doofy View Post
But how does that captcha the first user to see a particular image?
The captcha system requires the user to decipher two words. One of the two words is known (and is what is used by the client system to prove you are human); the other is an unknown from a scanned text, and your response to it is added to a database. You don't know which is known and which is unknown, so you answer both.
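
Roughly, in Python, the two-word check could look like the sketch below. This is a guess at the shape of the logic from the description above, not the actual implementation; `Challenge`, `check`, the `votes` dictionary and the ID format are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Challenge:
    control_answer: str  # the word the system already knows
    unknown_id: str      # hypothetical ID of the scanned word nobody has confirmed yet


def check(challenge, control_guess, unknown_guess, votes):
    """Pass or fail the user on the control word only.

    The guess for the unknown word is stored in `votes` only when the
    control word was typed correctly, since a failed control suggests a bot
    or a careless answer.
    """
    passed = control_guess.strip().lower() == challenge.control_answer.lower()
    if passed:
        guess = unknown_guess.strip().lower()
        votes.setdefault(challenge.unknown_id, []).append(guess)
    return passed
```

The sketch leaves out the display side. As described above, the user must not be able to tell which word is the control, so a real page would presumably show the two images in random order.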
     
vmarks
Moderator Emeritus
Join Date: Apr 2001
Location: Up In The Air
Status: Offline
Aug 19, 2008, 10:30 PM
 
Captchas are usually a bad idea for security: the email and blogging systems that use captchas don't stop spamming, and do inhibit the sight-disabled.

At least using them for OCR serves a decent purpose. Using them for security is a waste of time.
     
Doofy
Clinically Insane
Join Date: Jul 2005
Location: Vacation.
Status: Offline
Aug 20, 2008, 02:40 AM
 
Originally Posted by MarkLT1 View Post
The captcha system requires the user to decipher two words. One of the two words is known (and is what is used by the client system to prove you are human); the other is an unknown from a scanned text, and your response to it is added to a database. You don't know which is known and which is unknown, so you answer both.
Ahhhh. That makes sense now. Thanks.
Been inclined to wander... off the beaten track.
That's where there's thunder... and the wind shouts back.
     
Railroader
Banned
Join Date: Jun 2005
Location: Indy.
Status: Offline
Aug 20, 2008, 11:24 AM
 
Interesting.

I got an eyebrow-raising captcha when signing up for coupons for digital converter boxes this morning:

     
turtle777
Clinically Insane
Join Date: Jun 2001
Location: planning a comeback !
Status: Offline
Aug 20, 2008, 10:26 PM
 
Why the fark do they not use OCR ?

This is the absolute mostest stupidest idea I have heard today.

(Yes, only today. Thanks to my employer, I run across a lot of stupid sh!t).

-t
     
hayesk
Guest
Status:
Aug 23, 2008, 12:31 PM
 
Originally Posted by turtle777 View Post
Why the fark do they not use OCR ?
Because it's not good enough on old text. These aren't crisp and clear laser prints they're dealing with.
Originally Posted by turtle777 View Post
This is the absolute mostest stupidest idea I have heard today.
(Yes, only today. Thanks to my employer, I run across a lot of stupid sh!t).
You only think it's stupid because you thought OCR was a good solution.
     
turtle777
Clinically Insane
Join Date: Jun 2001
Location: planning a comeback !
Status: Offline
Aug 23, 2008, 03:05 PM
 
Originally Posted by hayesk View Post
Because it's not good enough on old text. These aren't crisp and clear laser prints they're dealing with.

You only think it's stupid because you thought OCR was a good solution.
I don't buy this. OCR these days is highly sophisticated. It doesn't even have problems recognizing handwriting w/o training. Example: Evernote: even your hand-scribbled notes will be converted using OCR, no training needed.

Even if a text is barely readable, a computer with OCR can do a much better job trying to understand what the different characters are by cross-comparing to other sections in that book.

-t
     
MarkLT1
Mac Enthusiast
Join Date: Nov 2002
Location: More Cowbell...
Status: Offline
Aug 25, 2008, 09:19 AM
 
Originally Posted by turtle777 View Post
I don't buy this. OCR these days is highly sophisticated. It doesn't even have problems recognizing handwriting w/o training. Example: Evernote: even your hand-scribbled notes will be converted using OCR, no training needed.

Even if a text is barely readable, a computer with OCR can do a much better job trying to understand what the different characters are by cross-comparing to other sections in that book.

-t
IIRC, they start with OCR, and it properly encodes most of the text. The OCR software flags the words that it cannot recognize and sends them off to be interpreted by the peoples.
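
As a sketch of that split, assuming the OCR step hands back each word with a confidence score between 0 and 1 (the `triage` name, the tuple format and the 0.9 cutoff are all assumptions for illustration):

```python
def triage(ocr_words, threshold=0.9):
    """Split OCR output into words to accept and words to hand to humans.

    ocr_words: iterable of (text, confidence) pairs from a hypothetical OCR pass.
    threshold: hypothetical cutoff below which a word is flagged as unrecognized.
    """
    accepted, needs_human = [], []
    for text, confidence in ocr_words:
        if confidence >= threshold:
            accepted.append(text)      # OCR is sure enough, keep its reading
        else:
            needs_human.append(text)   # flagged: show this one in a captcha
    return accepted, needs_human
```

With `[("the", 0.99), ("qu1ck", 0.41), ("fox", 0.95)]` as input, "the" and "fox" are accepted and "qu1ck" goes into the pile for people to read.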
     
turtle777
Clinically Insane
Join Date: Jun 2001
Location: planning a comeback !
Status: Offline
Aug 25, 2008, 12:13 PM
 
Originally Posted by MarkLT1 View Post
IIRC, they start with OCR, and it properly encodes most of the text. The OCR software flags the words that it cannot recognize and sends them off to be interpreted by the peoples.
Now THAT would make sense.

-t
     
- - e r i k - -
Posting Junkie
Join Date: May 2001
Location: Brisbane, Australia
Status: Offline
Aug 25, 2008, 09:11 PM
 
I was wondering why I got a weird captcha containing an indecipherable name consisting of punctuated initials the other day. Something like D.C.tol. I thought I'd never get it right, but it got through. Must be this thing, then.

     
   
All times are GMT -4.