Why data mining?

Edo Liberty

Edo Liberty: Why data mining?

1 / 19

Old programming paradigm

Input -> Program -> Output

The input is small, and the program can store and read it many times.

There is a lot of domain intelligence built into the program.

Old programming paradigm

"I is a lawyer" -> Complicated grammar correction -> "I am a lawyer"

A short sentence is given to grammar-correction software. Programmers and linguists produced highly specialized code for it.

Old programming paradigm

Part of a stemming module (tiny fraction of the whole process)
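
The code pictured on the original slide is not reproduced here. As a stand-in, the following is a minimal sketch of what hand-written stemming rules tend to look like. The rule table `SUFFIX_RULES` and the function `stem` are hypothetical names invented for this illustration, and the rules are deliberately simplistic.

```python
# Hypothetical hand-written suffix rules (an assumption for illustration,
# not the module shown on the slide). Order matters: first match wins.
SUFFIX_RULES = [
    ("sses", "ss"),  # "classes" -> "class"
    ("ies", "y"),    # "parties" -> "party"
    ("ss", "ss"),    # "class"   -> "class" (guards against the "s" rule)
    ("ing", ""),     # "walking" -> "walk"
    ("ed", ""),      # "walked"  -> "walk"
    ("s", ""),       # "lawyers" -> "lawyer"
]

MIN_STEM_LENGTH = 3  # do not strip a suffix if fewer letters would remain


def stem(word: str) -> str:
    """Apply the first matching suffix rule to the lowercased word."""
    lowered = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        remaining = len(lowered) - len(suffix)
        if lowered.endswith(suffix) and remaining >= MIN_STEM_LENGTH:
            return lowered[:remaining] + replacement
    return lowered
```

Every exception (short words, double consonants, irregular plurals) needs yet another rule written by hand, which is the sense in which the intelligence lives in the program.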

New programming paradigm

Data, Data, Data

Input -> Program -> Output

There is a huge (virtually infinite) amount of data.

The "brain" is the data and not the program.

New programming paradigm

Text from Web pages, blog posts, Q&A, forums, literature, news editorials, Wikipedia, emails, IMs, SMS messages, etc.

"I is a lawyer" -> Counter -> "I am a lawyer"

"I is a lawyer" appeared 800,000 times, usually in forms like "i) is a lawyer ..." or "George I. is a lawyer".

"I am a lawyer" appeared as is 1,200,000 times in respected sources.

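The counting idea can be sketched in a few lines. This is only an illustration under stated assumptions: `CORPUS` is a tiny invented stand-in for the web-scale text sources, and the helper names `normalize`, `count_phrases`, and `most_common_variant` are hypothetical. Punctuation is stripped before matching, so noisy hits such as "George I. is a lawyer" are counted for the wrong phrase, as in the slide.

```python
import re
from collections import Counter

# Invented toy corpus (assumption); the real source would be web-scale text.
CORPUS = [
    "I am a lawyer and I work downtown.",
    "She said: I am a lawyer.",
    "Yes, I am a lawyer.",
    "George I. is a lawyer in Boston.",
    "The candidate i) is a lawyer, ii) has ten years of experience.",
]


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    without_punctuation = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(without_punctuation.split())


def count_phrases(sentences, candidates):
    """Count whole-word occurrences of each candidate in the normalized text."""
    counts = Counter()
    for sentence in sentences:
        cleaned = normalize(sentence)
        for phrase in candidates:
            pattern = r"\b" + re.escape(normalize(phrase)) + r"\b"
            counts[phrase] += len(re.findall(pattern, cleaned))
    return counts


def most_common_variant(sentences, candidates):
    """Return the candidate phrase seen most often (first one wins ties)."""
    counts = count_phrases(sentences, candidates)
    return max(candidates, key=lambda phrase: counts[phrase])
```

With this toy corpus, `most_common_variant(CORPUS, ["I is a lawyer", "I am a lawyer"])` picks "I am a lawyer". The program contains no grammar rules at all; the decision comes entirely from the counts.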