In time: May 2010

Saturday, May 29, 2010

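The Impressive Picasa

Picasa Web Albums has added a powerful "name tags" feature. Its built-in face detection and face recognition are highly accurate, and combined with a friendly labeling interface they let a user quickly tag every face in an album, so that the whole album ends up organized by person. My guess is that internally it includes:
1. A multi-view face detector (MVFD) that first detects all the faces.
2. A face clustering algorithm that groups the detected faces and presents the clusters to the user for labeling from largest to smallest (a large cluster means that person dominates the album).
3. Within each cluster, the first round shown to the user consists of faces in canonical views; once the user has labeled those, a second round is presented (there is presumably a learning step in between, i.e. online learning).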
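
Step 2 of the guess above (largest clusters first) can be sketched in C as below. This only illustrates the ordering idea; it is not Picasa's code, and the `FaceCluster` type and function names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical record for one cluster of detected faces. */
typedef struct {
    int    cluster_id;   /* illustrative identifier */
    size_t face_count;   /* number of faces grouped into this cluster */
} FaceCluster;

/* qsort comparator: larger clusters sort first. */
static int by_size_desc(const void *a, const void *b)
{
    const FaceCluster *x = a;
    const FaceCluster *y = b;
    if (x->face_count < y->face_count) return 1;
    if (x->face_count > y->face_count) return -1;
    return 0;
}

/* Reorders the clusters in place so the dominant people are labeled first. */
void order_clusters_for_labeling(FaceCluster *clusters, size_t n)
{
    qsort(clusters, n, sizeof clusters[0], by_size_desc);
}
```

Labeling the biggest clusters first means each name the user types covers as many photos as possible.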

Checking whether two large files are identical

Compute the MD5 hash of each file and check whether the two values match.
// Computes the MD5 hash of a file and prints it as a hex string
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

namespace Md5Sum
{
    class Program
    {
        static void Main(string[] args)
        {
            StringBuilder sb = new StringBuilder();
            // The file path; opened read-only
            // The using blocks close the stream and release the hash object even if an exception is thrown
            using (FileStream fs = new FileStream(@"F:\tiny_images.bin", FileMode.Open, FileAccess.Read))
            using (MD5 md5 = MD5.Create())
            {
                // ComputeHash reads the stream in chunks, so the file need not fit in memory
                byte[] hash = md5.ComputeHash(fs);
                foreach (byte b in hash)
                    sb.Append(b.ToString("x2")); // two lowercase hex digits per byte
            }
            string md5sum = sb.ToString();
            Console.WriteLine(md5sum);
        }
    }
}
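
When both files sit on the same machine, a direct chunk-by-chunk comparison is an alternative to hashing: it can stop at the first difference and needs no hash library. This is a different technique from the MD5 approach above, shown as a minimal C sketch; the function name `files_identical` and the 64 KiB chunk size are arbitrary choices for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the two files have identical contents, 0 if they differ,
   and -1 if either file cannot be opened or read. */
int files_identical(const char *path_a, const char *path_b)
{
    enum { CHUNK = 1 << 16 };            /* 64 KiB per read */
    unsigned char buf_a[CHUNK];
    unsigned char buf_b[CHUNK];
    int result = 1;

    FILE *fa = fopen(path_a, "rb");
    if (fa == NULL)
        return -1;
    FILE *fb = fopen(path_b, "rb");
    if (fb == NULL) {
        fclose(fa);
        return -1;
    }

    for (;;) {
        size_t na = fread(buf_a, 1, CHUNK, fa);
        size_t nb = fread(buf_b, 1, CHUNK, fb);
        if (ferror(fa) || ferror(fb)) {
            result = -1;                 /* read error */
            break;
        }
        if (na != nb || memcmp(buf_a, buf_b, na) != 0) {
            result = 0;                  /* different length or content */
            break;
        }
        if (na == 0)
            break;                       /* both files ended with no difference */
    }

    fclose(fa);
    fclose(fb);
    return result;
}
```

Hashing is still the better fit when the two files live on different machines, since only the short hash values have to be exchanged.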

Friday, May 28, 2010

Robot Vision

Pictures taken by a robot can often have significantly different properties, both in terms of image quality and viewing geometry, when compared to those taken by a human.

Thursday, May 27, 2010

Visual Studio Won't Debug

To enable debugging for a C/C++ project:
1) Go to Project->Properties
2) Make sure "Configuration" at the top is "Debug"
3) On the left, select "C/C++", then "General"
4) On the right, change "Debug Information Format" to "Program Database for Edit And Continue (/ZI)"
5) On the left, select "Optimization"
6) On the right, change "Optimization" to "Disabled (/Od)"
7) On the left, select "Code Generation"
8) On the right, change "Runtime Library" to "Multi-threaded Debug (/MTd)"
9) On the left, expand "Linker" and select "Debugging"
10) On the right, change "Generate Debug Info" to "Yes (/DEBUG)"
11) Rebuild your project.

Tuesday, May 25, 2010

Human Computation

CAPTCHA

A CAPTCHA or Captcha (pronounced /ˈkæptʃə/) is a type of challenge-response test used in computing to ensure that the response is not generated by a computer. The process usually involves one computer (a server) asking a user to complete a simple test which the computer is able to generate and grade. Because other computers are unable to solve the CAPTCHA, any user entering a correct solution is presumed to be human. Thus, it is sometimes described as a reverse Turing test, because it is administered by a machine and targeted to a human, in contrast to the standard Turing test that is typically administered by a human and targeted to a machine. The name is a contrived acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart."
Applications

CAPTCHAs are used to prevent automated software from performing actions which degrade the quality of service of a given system, whether due to abuse or resource expenditure. CAPTCHAs can be deployed to protect systems vulnerable to e-mail spam, such as the webmail services of Gmail, Hotmail, and Yahoo! Mail.

CAPTCHAs found active use in stopping automated posting to blogs, forums and wikis, whether as a result of commercial promotion, or harassment and vandalism. CAPTCHAs also serve an important function in rate limiting, as automated usage of a service might be desirable until such usage is done in excess, and to the detriment of human users. In such a case, a CAPTCHA can enforce automated usage policies as set by the administrator when certain usage metrics exceed a given threshold. The article rating systems used by many news web sites are another example of an online facility vulnerable to manipulation by automated software.
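
The threshold idea in the paragraph above can be sketched as a fixed-window request counter: automated use passes freely until a client exceeds a limit, after which a CAPTCHA is demanded. This is only an illustrative C sketch; the `ClientUsage` type, the function name, and the limits (30 requests per 60 seconds) are assumptions, not taken from any real service.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-client usage record kept by the server. */
typedef struct {
    time_t   window_start;   /* when the current counting window began */
    unsigned requests;       /* requests seen in the current window */
} ClientUsage;

/* Illustrative policy values an administrator might choose. */
enum { WINDOW_SECONDS = 60, MAX_REQUESTS_PER_WINDOW = 30 };

/* Records one request made at time `now` and reports whether the client
   has exceeded the threshold and should be shown a CAPTCHA. */
bool captcha_required(ClientUsage *usage, time_t now)
{
    if (difftime(now, usage->window_start) >= WINDOW_SECONDS) {
        /* The old window has expired: start counting afresh. */
        usage->window_start = now;
        usage->requests = 0;
    }
    usage->requests++;
    return usage->requests > MAX_REQUESTS_PER_WINDOW;
}
```

A server would call `captcha_required` once per incoming request and serve the challenge only when it returns true.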

Circumvention

There are a few approaches to defeating CAPTCHAs:

* exploiting bugs in the implementation that allow the attacker to completely bypass the CAPTCHA,
* improving character recognition software, or
* using cheap human labor to process the tests (see below).

CAPTCHA is vulnerable to a relay attack that uses humans to solve the puzzles. One approach involves relaying the puzzles to a group of human operators who can solve CAPTCHAs. In this scheme, a computer fills out a form and when it reaches a CAPTCHA, it gives the CAPTCHA to the human operator to solve.

Spammers pay about $0.80 to $1.20 per 1,000 solved CAPTCHAs to companies employing human solvers in Bangladesh, China and India.

Another approach involves copying the CAPTCHA images and using them as CAPTCHAs for a high-traffic site owned by the attacker. With enough traffic, the attacker can get a solution to the CAPTCHA puzzle in time to relay it back to the target site. In October 2007, a piece of malware appeared in the wild which enticed users to solve CAPTCHAs in order to see progressively further into a series of striptease images. A more recent view is that this is unlikely to work due to unavailability of high-traffic sites and competition by similar sites.

These methods have been used by spammers to set up thousands of accounts on free email services such as Gmail and Yahoo!. Since Gmail and Yahoo! are unlikely to be blacklisted by anti-spam systems, spam sent through these compromised accounts is less likely to be blocked.

Human solvers are a potential weakness for strategies such as Asirra. If the database of cat and dog photos can be downloaded, then paying workers $0.01 to classify each photo as either a dog or a cat means that almost the entire database of photos can be deciphered for $30,000.
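
Working the numbers backwards, $30,000 at $0.01 per photo covers 3,000,000 photos, which is the database size that figure implies. The same arithmetic applies to the solver prices quoted earlier. A small C sketch, using integer cents to avoid floating-point rounding; the function names are made up for illustration.

```c
#include <stdint.h>

/* Total cost in cents of paying `cents_per_unit` for each of `units`. */
int64_t total_cost_cents(int64_t units, int64_t cents_per_unit)
{
    return units * cents_per_unit;
}

/* Number of whole units a budget covers at a given unit price. */
int64_t units_for_budget(int64_t budget_cents, int64_t cents_per_unit)
{
    return budget_cents / cents_per_unit;
}
```

For example, `units_for_budget(3000000, 1)` is the number of photos a $30,000 budget labels at one cent each, and `total_cost_cents(1000, 80)` is the cost of one million solved CAPTCHAs at $0.80 per thousand, with each unit being a batch of 1,000.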

Monday, May 17, 2010

Read file larger than 2GB

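After opening the file with fopen:
fseek(out, offset, SEEK_SET); /* the offsets used by fseek and ftell are long (signed, and only 32 bits on Windows), so you may have trouble with files greater than 2GB */
_fseeki64(out, offset, SEEK_SET); // takes a 64-bit offset, so it can read files larger than 2GB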
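
`_fseeki64` belongs to the Microsoft C runtime. A portable wrapper might fall back to POSIX `fseeko`/`ftello` elsewhere, as in the sketch below; the names `seek64`, `tell64`, and `read_at` are hypothetical helpers, not library functions.

```c
/* On POSIX systems, request a 64-bit off_t and the fseeko/ftello
   declarations; these must come before any system header. */
#ifndef _WIN32
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdio.h>
#include <stdint.h>
#ifndef _WIN32
#include <sys/types.h>   /* off_t */
#endif

/* Seeks with a 64-bit offset; returns 0 on success, like fseek. */
int seek64(FILE *fp, int64_t offset, int whence)
{
#ifdef _WIN32
    return _fseeki64(fp, offset, whence);
#else
    return fseeko(fp, (off_t)offset, whence);
#endif
}

/* Returns the current position as a 64-bit value, or -1 on error. */
int64_t tell64(FILE *fp)
{
#ifdef _WIN32
    return _ftelli64(fp);
#else
    return (int64_t)ftello(fp);
#endif
}

/* Reads up to n bytes starting at an absolute offset.
   Returns the number of bytes read (0 if the seek fails). */
size_t read_at(FILE *fp, int64_t offset, void *buf, size_t n)
{
    if (seek64(fp, offset, SEEK_SET) != 0)
        return 0;
    return fread(buf, 1, n, fp);
}
```

The file should be opened in binary mode ("rb") so that offsets correspond to byte positions.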
