Master Project Report

PhishLurk: A Mechanism for Classifying and Preventing Phishing Websites

by: Mohammed Alqahtani

1. Committee Members and Signatures:

Approved by                                      Date
__________________________________               _____________
Advisor: Dr. Edward Chow
__________________________________               _____________
Committee member: Dr. Albert Glock
__________________________________               _____________
Committee member: Dr. Chuan Yue

Abstract

Phishing attackers have been steadily improving and refining their attempts, using different methods to target users. At the same time, users have begun accessing the internet in a variety of ways, with different platforms, different computation capabilities, and various levels of protection support, which expands the attack surface for phishers and complicates the provisioning of security protection.

I propose PhishLurk, an anti-phishing search website that classifies and prevents phishing attacks. PhishLurk provides protection from the server side and uses a coloring scheme and short text warnings for classification, in order to consume as little computation and screen resource as possible on the client side. It can work efficiently with a variety of devices having different capabilities. PhishLurk uses PhishTank as the blacklist provider and checks the list in real time to achieve the maximum possible accuracy. The idea behind PhishLurk could be a useful enhancement if adopted by major search engines, e.g., Google and Yahoo. The mechanism can also be optimized to work efficiently on smartphones.

Introduction

Phishing is a cybercrime in which an attacker tries to gather personal and financial information, such as usernames, passwords, and credit card numbers, from recipients by pretending to be a legitimate website. Phishing attacks mostly come in two forms: emails and webpages that spoof a trusted party and lure the user into entering sensitive information.
In other words, phishing directs users to fraudulent web sites in order to obtain sensitive information, which can be confidential information or financial data [22]. Figure 1 shows a sample phishing website. Phishers used to rely on emails to lure targets into giving away information. Lately, phishers have started to use different methods to lure and steal the targeted users' information, such as faked websites, Trojans, key-loggers, and screen captures [23].

Figure 1: Sample of a phishing website (source: )

1.1 Impact of phishing

Phishing has been a major concern in IT security. In the U.S., companies lose more than $2 billion every year as a result of phishing attacks [6]. 1.2 million users in the U.S. were phished between May 2004 and May 2005, at an approximate cost of $929 million [6]. AOL-UK announced that one out of twenty users has lost money to phishing attacks [25]. A 2010 survey indicates that the overall loss from cybercrime, due to the loss of confidential banking information or corporate data, is between half a billion dollars and $1 trillion every year [25].

Background

Recently, users have gained more varieties of access for surfing the internet, for example notebooks, PCs, game consoles, handhelds, and smartphones. However, this variety of devices with different abilities and features makes it complicated to provide full protection, especially against phishing attack methods, and currently there is no such complete protection. One of the most used device types is the smartphone. According to a survey by comScore, Inc., the number of smartphone subscribers increased 60 percent in 2010 compared to 2009 [4]. Another report, by the Nielsen Company, indicates that by 2011 half of cell-phone users would be using smartphones [5]. Users prefer these types of access for their activities and tasks because of the advantages they provide;
for example, the smartphone is preferred because of the ease of use, flexibility, and mobility it offers. Some activities, such as online banking, paying bills, online shopping, emailing, and social networking [5], require users to enter sensitive information, such as credit-card numbers, passwords, and usernames, to complete the authentication and authorization process. In fact, having this variety of access to the internet expands the attack surface for phishers and complicates protection.

History of Phishing

The idea of luring people into giving away their sensitive information goes back to the seventies [27]. Phishers used a combined technique: making phone calls ("phreaking") and luring the target client ("fishing"). In the mid-1990s, the main target of phishing attackers was America Online (AOL). Phishers kept sending instant messages to users, using social engineering and similar domain names, to lure users into revealing their passwords, and then used the victims' accounts for free. Later, attackers started seeking more details and information, such as credit card numbers and social security numbers. During the past ten years, phishing attackers have moved to a higher level and target users of financial services and online payment directly, such as E-buyers, PayPal, eBay, and banks. In addition to the previous techniques, attackers now use more advanced techniques such as key-logging, browser vulnerabilities, and link obfuscation [27].

Most Targeted Industries

As a result of their intensely confidential content and financial use, financial services and online payment are the industries most targeted by phishing attackers [22]. Figure 2 shows the distribution of phishing activity by targeted area.

Figure 2.
Phishing Activity Trends Report - 2nd Half 2010 - Anti-Phishing Working Group (APWG)

Why Phishing Works

Phishing works for many reasons. One of the most common is users' carelessness and ignorance of how to tell whether a website is legitimate or phishing [1]. Moreover, phishing attackers work hard, sending millions of messages, looking for vulnerabilities, and seeking sensitive information.

Existing Work

Anti-Phishing: Many techniques have been proposed for anti-phishing, using different methods of filtering and detection, such as blacklists, plug-ins, extensions, and toolbars for browsers [2]. The developers of desktop browsers try hard to provide solid protection, such as warning the user with a message box if a website is a potential phishing website or contains an invalid or expired SSL certificate. Often a third party and blacklists are involved in identifying and displaying phishing websites [3].

Related Work

PhishTank is a nonprofit project that aims to build a dependable database of phishing URLs [7]. The project collects, verifies, tracks, and shares phishing data. In order to report a phishing link, the user has to be registered as a member, so the administrators can learn and judge each member's contribution. Phishing websites can be reported and submitted via email or via PhishTank's website. The data are verified by a committee after they are submitted by members. PhishTank's database can be shared via an API. The links in the original database are only classified as "phishing" and "unknown". I classify the phishing links based on the PhishTank database with a more precise modification and use them in the proposed project. PhishTank has been working effectively to fight phishing attacks: thousands of phishing links are detected and verified as valid phishing sites monthly [9].
It uses the public's effort and contribution to build a trustworthy and dependable database that is open for everyone to use and share. As a result, several well-known organizations and browsers have started using the PhishTank database, such as Yahoo Mail, Opera, McAfee, and Mozilla Firefox [10]. In my prototype, I use PhishTank as the phishing blacklist provider.

In the paper titled "Large-Scale Automatic Classification of Phishing Pages" [2], Colin Whittaker, Brian Ryner, and Marria Nazif proposed an automatic classifier to detect phishing websites. The classifier maintains Google's phishing blacklist automatically and analyzes millions of pages a day, examining the URL and the page contents to verify whether a page is phishing. The proposed classifier works automatically in a large-scale system, maintains a false positive rate below 0.1%, and reduces the lifetime of phishing pages. The authors used machine learning techniques to analyze web page content. In my project, the determination is based on PhishTank's blacklist; my ultimate goal is not to determine whether a page is phishing (PhishLurk decides that based on PhishTank's blacklist) but to provide a new method to classify phishing links while considering two factors: consuming as little memory and screen space as possible, which eventually improves the overall classification efficiency.

In the paper titled "PhishGuard: A Browser Plug-in for Protection from Phishing" [8], Y. Joshi, S. Saklikar, D. Das, and S. Saha proposed a mechanism to detect a forged website by submitting fake credentials before the actual credentials during the login process of a website, and then analyzing the server's responses to all those submissions to determine whether the website is phishing.
The mechanism was implemented on the browser (user) side as a Mozilla Firefox plug-in. However, the mechanism only detects during a user's log-in process; if another user logs in to the same phishing website, he goes through the same detection process. In my project, once a website is reported as a phishing site, the reported link is blocked, so no other user can access it.

In the paper titled "BogusBiter: A Transparent Protection Against Phishing Attacks" [17], Chuan Yue and Haining Wang proposed a client-side tool called BogusBiter that sends a large number of bogus credentials to suspected phishing sites and hides the real credentials from phishers. BogusBiter is unique in that it also helps legitimate web sites detect stolen credentials in a timely manner, by having the phisher verify the credentials he has collected at the legitimate web site. BogusBiter was implemented as a Firefox 2 extension. My project is different in that it uses the server side to provide the protection.

In the paper titled "The Battle Against Phishing: Dynamic Security Skins" [18], Rachna Dhamija and J. D. Tygar proposed, in an extension of [1], an anti-phishing tool that helps users distinguish whether they are interacting with a trusted site. This approach uses a shared cryptographic image that remote web servers use to prove their identity to users, in a way that supports easy verification for humans and is hard for attackers to spoof. However, [18] cannot provide protection when users are on public access machines, because the approach requires support from both the client side and the server side. In my project there is no dependency on the client side.

Blacklisting

Blacklisting is simply the idea of denying access to resources based on a list. The blacklist is built either automatically by a mechanism, e.g.,
Google's blacklist [2], or from users' feedback, as in PhishTank [7], where users submit and report suspicious websites. The object of a blacklist can be a user, an IP address, a website, or software. Varieties of blacklists can be classified as follows:

Content filter: A proxy server that filters content. The proxy server not only blocks banned URLs using a blacklist but also uses keywords, metadata, and pictures to filter the content. Examples of content filters include DansGuardian [28] and SquidGuard [Refs]. In SquidGuard, the proxy uses advanced web filtering policies to prevent content inappropriate for the organization or company. The filter blocks URLs using a blacklist and controls content by keyword blocking inferred from the metadata and the page content. SquidGuard is used mostly in educational environments and for children's protection. The main goal of a content filter is to manage access control efficiently. In DansGuardian, the client requests URLs; DansGuardian collects them and compares them against the blacklist and whitelist. If a request is clean, DansGuardian passes the URL request along; if it is not clean, DansGuardian blocks it [28].

E-mail spam filter: Monitors, prevents, and blocks spam and phishing emails, using a blacklist of spam email sources, before they reach the client side. There are many anti-spam email blacklists, e.g., GFI MailEssentials' list, ATL Abuse Block List, Blacklist Master, Composite Blocking List (CBL), and SpamCop. Many web browsers and companies also use their own blacklists against spam and phishing, e.g., IE, Google, and Norton.

Current Browsers' Phishing Protection

Most popular browsers provide a phishing filter that warns users about malicious websites, including phishing websites. These filters mainly depend on certain lists to detect malicious websites.
IE7 used a "Phishing Filter" that, due to its weak protection, was improved into the SmartScreen Filter in later versions of IE [15]. In IE 8 and IE 9, the SmartScreen Filter verifies visited websites against a list of malicious websites that Microsoft creates and updates continuously [11] [12]. Similar to IE, the Safari browser has filters that check the websites the user is browsing against a list of phishing sites. After PayPal warned its members that Safari was not safe for their service [13], Safari started to use extended validation certificates to support analyzing websites [14]. Earlier versions of Firefox took advantage of anti-phishing companies such as GeoTrust, or of PhishTank, using their lists to help identify malicious websites. The current version of Firefox has adopted Google's anti-phishing program to support its phishing protection.

Many research projects have proposed mechanisms implemented as browser plug-ins or toolbars against phishing attacks. The main problem with plug-ins and toolbars is the need for the user's cooperation: users may not cooperate and install the tool, and some users occasionally prefer to turn the filter off to browse faster [16]. Plug-ins and toolbars on some devices may not be as effective as on a desktop browser, due to limitations in performance and screen space, as is the case on smartphones.

Classification of Phishing Defense

The different phishing defense approaches can be further classified based on where the alerts are generated:
- Browsers themselves: IE9, Firefox 5.
- Browser extensions or plug-ins: BogusBiter, PhishGuard.
- Anti-phishing search site: PhishLurk (my project).
- Proxy server: DansGuardian [20].
- Anti-phishing server: OpenDNS [19], GFI MailEssentials [21], and some browser extensions that partially use the server side, such as Dynamic Security Skins [18].
According to its official website [20], DansGuardian is an active web content filter that filters web sites based on a number of criteria, including website URL, words and phrases included in the page, file type, MIME type, and more. DansGuardian is configured as a proxy server that controls, filters, and monitors all content, so its functions go beyond anti-phishing. There is no existing project that uses a proxy server purely for anti-phishing, but it could be a very effective technique to classify and prevent phishing websites.

The Proposed Project

In this project I propose a software tool, called PhishLurk, that aims to classify and block phishing links. PhishLurk uses PhishTank as the provider of the blacklist. PhishLurk indicates the risk to users while consuming as little computation and screen resources as possible, using a coloring scheme and warning annotations. The process is done entirely on the search server side, which delivers classified and protected links to the users. Even if phishing protection is disabled or uninstalled on the client side, PhishLurk still provides protected and classified links to the user. Figure 3 shows PhishLurk's scenario against phishing sites. In addition, PhishLurk has a database that records the visits to each website and how many times each website has been visited.

Figure 3: Diagram explaining PhishLurk's scenario against phishing sites

Design of PhishLurk

PhishLurk components:
- Classifier: assesses and classifies the links based on PhishTank's blacklist.
- Logger: records the visits to each link and how many times the link has been visited.
- Blacklist: a periodically updated blacklist plus live checking using the API.
- Database: stores every visited link, the number of visits for each link, and the link's class.
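The Classifier component can be sketched in a few lines of PHP. This is an illustrative sketch, not PhishLurk's actual code: the function names and the blacklist encoding (URL mapped to 'yes' for verified phishing, 'no' for submitted but unverified) are assumptions; the class numbers and colors (0 = safe/blue, 1 = phishing/red, 2 = unknown/orange) follow the scheme used throughout this report.

```php
<?php
// Illustrative sketch of the three-way classification.
// $blacklist is assumed to map URL => 'yes' (verified phishing) or 'no'
// (submitted but not yet verified); unlisted URLs are treated as safe.

function classifyUrl($url, $blacklist) {
    if (!isset($blacklist[$url])) {
        return 0;                                   // not listed: safe link
    }
    return $blacklist[$url] === 'yes' ? 1 : 2;      // verified vs. unverified
}

// Map a class number to the color used by the CSS-based warning scheme.
function classColor($class) {
    $colors = array(0 => 'blue', 1 => 'red', 2 => 'orange');
    return $colors[$class];
}

// Hypothetical example entries:
$blacklist = array('http://evil.example/login' => 'yes',
                   'http://odd.example/'       => 'no');

echo classifyUrl('http://evil.example/login', $blacklist); // 1
echo classColor(2); // orange
```

The real classifier resolves the lookup against PhishTank (locally or live) rather than an in-memory array, but the class-to-color mapping is the same.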
Figure 4 is a diagram showing the design of PhishLurk.

Figure 4: Diagram showing the design of PhishLurk

Classifier: PhishLurk's mechanism assesses and classifies the links based on PhishTank's blacklist, as follows:

Phishing link (Red): An absolute phishing link. The link is disabled, so even if the user is ignorant or surfing carelessly, as seen in the survey [1], there is no way to access the link.

Unknown link (Orange): A suspicious link that might potentially be phishing, for example a link containing the name, or part of the name, of a real company while asking the user to provide sensitive information. The link has been submitted as phishing but has not been verified yet. The user is warned before accessing the link; if the user clicks through to this type of site, it is at their own risk.

Safe link (Blue): A link that is not blacklisted. The user can access the link without triggering warning messages.

Figure 5 shows the categories of links that PhishLurk classifies.

Type          | Description                                                          | Color  | Treatment
Phishing link | A valid phishing link, high risk.                                    | Red    | Disabled; users are strongly warned not to access the link.
Unknown link  | Suspicious link, might potentially be phishing, but not verified yet. | Orange | Users are warned about the potential impact.
Safe link     | A link that is not blacklisted.                                      | Blue   | The user can access the link without triggering warning messages.

Figure 5: Table showing the categories of links PhishLurk classifies

Blacklist: PhishLurk utilizes PhishTank's blacklist. In order to achieve the maximum possible accuracy, PhishLurk updates the blacklist using two different methods:
- Updating the blacklist periodically: downloading it every 24 hours.
- Live checking using the API.

Here, live checking refers to checking each individual URL with PhishTank. If there are 10 URLs in the web page, 10 queries to PhishTank will be issued.
Therefore there are trade-offs between these two approaches.

Logger: PhishLurk has a logger that records the number of visits for every visited link within the web application and stores the logs in PhishLurk's database, including each URL, its visit count, and its current class.

Database: A database that stores a record of every visited link, including the number of visits for each link and the link's class. Users can access the database to view the table of all links visited by PhishLurk's users; the links are also colored based on their class on the view page.

Implementation

PhishLurk is programmed in PHP. PHP is widely used in server-side web programming and is deployed on many web servers; it is currently supported by most web servers, including Apache and Microsoft Internet Information Server. PHP works easily with HTML and provides the ability to interact with the user dynamically. Given that PhishLurk's mechanism aims to use as little space and computation as possible on the client side, PhishLurk uses CSS for classifying and indicating the risk level of the links, because of the light computation CSS requires. I created a database using MySQL to store the log records. PhishLurk utilizes PhishTank's blacklist, so I used two methods to read and update the blacklist from PhishTank: live checking and a periodically downloaded blacklist.

The Information Flow

The information flow in PhishLurk starts by receiving the search keywords from the user. Next, the keywords are passed to the search engine, which executes the query. Then the PhishLurk classifier receives the query results and classifies them based on PhishTank's blacklist. After classification, PhishLurk creates log records for all the visited URLs and registers the visits. Finally, the requested URLs are delivered to the user's browser.
Figure 6 explains the information flow in PhishLurk.

Figure 6: Flowchart showing the information flow in PhishLurk.

PhishLurk needs a search engine to process the search queries. I used Google to process the queries; I will explain why in Section 8. To send queries to Google, I used the following statements:

    $gg_url = ''. urlencode($query) . '&start=';
    $ch = curl_init($gg_url.$page.'0');
    curl_setopt_array($ch, $options);
    $scraped = "";
    $scraped .= curl_exec($ch);
    curl_close($ch);

To receive the results back from Google and display them, PhishLurk uses the following statements:

    $results = array();
    preg_match_all('/a href="([^"]+)" class=l.+?>.+?<\/a>/', $scraped, $results);

For each link in the page of results, a metadata function is used to show the website's title and the description related to the URL:

    $content = file_get_contents($url);
    $title = getMetaTitle($content);
    $description = getMetaDescription($content);

Blacklist

PhishLurk needs the blacklist to classify a link. To check against the blacklist I used two methods: an updated blacklist and live checking.

Updated Blacklist

PhishTank provides a downloadable database ("blacklist") in different formats, updated hourly, to facilitate the use of PhishTank's blacklist and phishing detection in other applications. The PHP format of the blacklist is available at: (). However, the blacklist file is big: its average size is between 13 and 17 MB, which takes more time to process and slows performance during updating.
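The 24-hour update described above could be sketched as follows. This is a hedged sketch, not PhishLurk's actual code: the function name, the file name blist.txt, and the use of the file's modification time as the staleness check are assumptions, and $dump_url stands in for the real PhishTank download URL (omitted here, as in the report).

```php
<?php
// Sketch: refresh the local copy of the blacklist dump at most once per day.
// $dump_url stands in for the real PhishTank download URL (omitted in the report).

function refreshBlacklist($dump_url, $local_file = 'blist.txt') {
    $day = 24 * 60 * 60; // 24 hours, matching the update period described above
    if (!file_exists($local_file) || time() - filemtime($local_file) > $day) {
        $data = file_get_contents($dump_url); // download the fresh dump
        if ($data !== false) {
            file_put_contents($local_file, $data);
        }
    }
    return $local_file;
}
```

Because file_get_contents accepts both URLs and local paths, the same sketch can be exercised against a local test file.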
To improve performance, I minimized the size of the blacklist by first changing its format from

    Phish-id, Phish_detail_url, URL, Submission_time, Verified, Verification_time, Online, Target

to

    Phish-id, URL, Class

removing the fields my prototype does not use. I created a function that reads the list from the file blist.txt; if a link is blacklisted and verified, it is classified as "phishing", and if a link has been reported as a potential phishing link but not yet verified, it is classified as "unknown":

    $class = 0;
    $file_handle = fopen("blist.txt", "rb");
    while (!feof($file_handle)) {
        $line_of_text = fgets($file_handle);
        $parts = explode(',', $line_of_text);   // Phish-id, URL, Class
        if ($url == $parts[1]) {                // match on the URL field
            $class = $parts[2];                 // take its recorded class
            break;
        }
    }
    fclose($file_handle);

Due to parsing errors in processing the blacklist, I resorted to changing the format using Excel's functions, with the drawback that the process becomes partially manual. This problem is solved by using live checking.

Checking the URLs Live

I used the API to perform a live check against the blacklist. This method works with an HTTP POST request, the same method PhishLurk uses, and responds with the URL's status in the database. I created a parameter called $phishtank that PhishLurk sends to the PhishTank API check:

    $phishtank = file_get_contents("$url");

For example, when the link "uccs.edu" is received in the search results, it is sent to PhishTank for a live check:

    $phishtank = file_get_contents("");

The response arrives in XML format, as follows:

    <response>
      <meta>
        <timestamp>2011-08-18T04:09:22+00:00</timestamp>
        <serverid>2d5c2cb</serverid>
        <requestid>192.168.0.109.4e4c90729dea26.99932296</requestid>
      </meta>
      <results>
        <url0>
          <url><![CDATA[ ]]></url>
          <in_database>false</in_database>
        </url0>
      </results>
    </response>
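One way to pull the in_database flag out of a response like the one above is PHP's built-in SimpleXML extension. This is a sketch under that assumption, not necessarily how PhishLurk parses the response:

```php
<?php
// Sketch: extract the <in_database> flag from a PhishTank-style XML response.
// The $xml_text value below is a trimmed example response, not live API output.
$xml_text = '<response><results><url0>' .
            '<url><![CDATA[http://uccs.edu/]]></url>' .
            '<in_database>false</in_database>' .
            '</url0></results></response>';

$xml = simplexml_load_string($xml_text);
$in_database = (string) $xml->results->url0->in_database;

// "false" means the URL is not in PhishTank's database,
// so it would be treated as a safe link (class 0).
echo $in_database; // false
```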
After the check, the links go through the classification function. The process is as follows.

Phishing links:

    if ($class == 1) {
        // show the note: "This web page has been reported as a phishing
        // webpage based on our security preferences"
        // the user is redirected to warning.php with class 1 and the URL
    }

The color scheme colors the link red and prints a small tag next to the title: (Phishing Link).

Unknown links:

    elseif ($class == 2) {
        // show the note: "This web page might potentially be a phishing page"
        // the user is redirected to warning.php with class 2 and the URL
    }

The color scheme colors the link orange and prints a small tag next to the title: (Unknown Link).

Safe links:

    else {  // $class == 0
        // the user is sent directly to the logger, log.php, with class 0
        // and the URL, and from there to the targeted URL
    }

Warning

I created one dynamic page, warning.php, for generating the warnings. Having one dynamic warning page for all classes is useful for controlling the writing of the log records. First, the warning page recognizes the link's class using ($_GET['class'] == "class #"), then it shows the warning for that class. The process is as follows:

    if the class == 1  // phishing link
        Print "Phishing Site!"
        Display the warning note: "This web page is reported as a phishing
        website. We recommend you to exit; otherwise, click on Proceed."
        "This URL has been visited: <visits number> by PhishLurk's users"
    elseif the class == 2  // unknown link
        Print "Unknown page!"
        Display the warning note: "This web page might potentially be a
        phishing page. If you trust this page click Proceed; otherwise, exit."
        "This URL has been visited: <visits number> by PhishLurk's users"
    else
        die();

If the class number is not listed, or the user tries to use an unlisted class number, PhishLurk kills the request using the PHP function die().
To show the user how many times a link has been visited by PhishLurk's users, I created visited.php, which connects to the database and queries for the link's visits. The function is as follows:

    <?php
    $link = mysql_connect('server-name', 'root', 'password');
    mysql_select_db('visits', $link);
    $sql = "SELECT * FROM `visits` WHERE `link` = '$url'"; // look for the link
    $result = mysql_query($sql);
    if (mysql_num_rows($result) == 1) { // if the link exists, it has one record
        $line = mysql_fetch_array($result);
        // leave a message for the user
        echo "<br>This URL has been visited: $line[2] <td>by PhishLurk's users";
    } else {
        // zero visits if there is no record
        echo "<br>This URL has been visited: 0 <td>by PhishLurk's users";
    }
    ?>

Logger

The logger's function is to count the visits to each URL and display the log records from the database.

Creating and updating the records

After the user decides to access a website, the browser is directed to go.php, either via the warning page warning.php by clicking on "Proceed", or directly from the results page in case the link was safe:

    href="go.php?url=<? echo $url; ?>&class=<class number>"

Next, go.php receives the URL and its class in order to record the new visit. Then go.php connects to the database and looks for the URL. If the URL has a record, it increases the number of visits; otherwise, it creates a new record.
If the URL doesn't have a record, that means it is a newly visited URL.

    $url = $_GET['url'];
    $class = $_GET['class'];
    // connect to the DB
    $link = mysql_connect('server-name', 'root', 'password');
    mysql_select_db('visits', $link);
    // query for the visited URL in the DB
    $sql = "SELECT * FROM `visits` WHERE `link` = '$url'";
    $result = mysql_query($sql);
    if (mysql_num_rows($result) == 1) { // if there is a record
        $line = mysql_fetch_array($result);
        $id = $line[0];
        $old_visits = $line[2];
        $new_visits = $old_visits + 1; // count one more visit
        $sql = "UPDATE `visits` SET `visits` = '$new_visits', `class` = '$class' WHERE `id` = '$id'";
        mysql_query($sql); // update the DB
    } else {
        // if there is nothing, add a new record for the new URL
        $sql = "INSERT INTO `visits` VALUES (0, '$url', 1, $class)";
        mysql_query($sql);
    }
    ?>

Finally, the browser is immediately redirected to the requested URL:

    <META http-equiv="refresh" content="0;URL=<? echo $url; ?>">

Viewing the logs

The logger uses log.php to display all the log records:

    <?php
    // reading the logs
    if (mysql_num_rows($result) >= 1) {
        while ($line = mysql_fetch_array($result, MYSQL_NUM)) {
    ?>
        <tr>
          <td class="style<?php echo $line[3]; ?>"> <?php echo $line[1]; ?> </td> <!-- 1st col = URL -->
          <td class="style<?php echo $line[3]; ?>"> <?php echo $line[2]; ?> </td> <!-- 2nd col = visits # -->
          <td class="style<?php echo $line[3]; ?>"> <?php echo $line[3]; ?> </td> <!-- 3rd col = class # -->
        </tr>
    <?php
        }
    }
    ?>

Figure 7 shows how the logger displays the records to the user in the log.php page.

    Link | Visits | Class

Figure 7: A sample of the log table in the database

Database: Since PhishLurk needs to update and write the log records, I created a database using MySQL to make it easier to update the records. I called the database "visits"; it has one table, "visits", to store the URLs, the number of visits, and the class.
    CREATE TABLE `visits` (
        `id` INT(2) NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- URL id
        `link` VARCHAR(300) NOT NULL,                     -- the URL
        `visits` INT(2) NOT NULL DEFAULT '0',             -- number of visits
        `class` INT(2) NOT NULL DEFAULT '0'               -- class number
    ) ENGINE = MYISAM;

Performance Evaluation

7.1 Challenges

Correctness: How correct are the results PhishLurk sends, and from how many kinds of access can a user benefit from PhishLurk?

Timeliness: How to keep the blacklist as up to date as possible?

Overhead: How long does it take to get the classified results back, and how big is the difference in execution time after applying PhishLurk's mechanism?

7.2 Test bed

In the test bed experiments, I used a local Apache server in a Windows environment; Apache supports both PHP and MySQL.

7.3 Experiment

I created a test bed to examine the functionality of PhishLurk, including its correctness and timeliness, in order to determine how correctly and how up to date PhishLurk performs. I tested PhishLurk by sending queries searching for websites assumed to be blacklisted, using keywords common in phishing websites. In my searches, 20 blacklisted phishing URLs and 13 unknown websites appeared in the search results, and PhishLurk was able to detect and classify all of them. Figure 8 is a chart showing how many of the assumed blacklisted links PhishLurk was able to detect and classify.

Figure 8: PhishLurk was able to detect and classify all of the blacklisted links.

After I changed the updating process to the live check, there was a slight increase in execution time: on average 0.1238 seconds per link between the PhishLurk local blacklist and PhishLurk live checking. Figure 9 shows the average execution time for each link.
Figure 10 shows the difference in execution time across 20 queries.

Figure 9: The average execution time for each link

Figure 10: The difference in execution time for 20 queries

7.3.1 Impact of the page size with different alerting schemes

In PhishLurk, the total size of the results page is 1.01 KB (1,038 bytes). By contrast, Norton and McAfee use an image scheme to rate and warn about the results in the search page. In Norton's scheme, the size of a single image by itself is 3 KB; in McAfee's, the size of a single image by itself is 1 KB. Figure 11 shows the size and location of the image alerting scheme in Norton, and Figure 12 shows the size and location of the image alerting scheme in McAfee.

Figure 11: The size of the image used in Norton

Figure 12: The size of the image used in McAfee

PhishLurk has a smaller web page size because the coloring scheme and the text-based warnings take little space.

7.3.2 Impact on time performance across different browsers

To test the impact on time performance across different browsers, I tested PhishLurk on 5 different browsers: Chrome, IE, Firefox, Opera, and Safari, by sending the same 10 queries to each of the 5 browsers. There were only slight differences across the queries: each query's execution time was between 0.0014 and 0.0015 seconds. Figure 13 shows the execution time for 10 queries using the 5 different browsers.

Figure 13: 10 queries sent by PhishLurk using 5 different browsers

Discussion

8.1 Expanding the categories

Lately, it has been noticed that some official websites have been found hosting phishing websites on their servers [29] [30] [31]. However, it is really rare for the official website of a government, hospital, or university to host a phishing website.
When such an official website hosts phishing sites or produces attacks, the cause is typically an insider who has privileges to access and control the system, or the site may have been compromised through cross-site scripting or SQL injection attacks [30]. Another possibility is that someone reported the site in an attempt to damage the organization's reputation. In my opinion, links to these kinds of websites should have their own class, which could be called the unlikely link. An unlikely link is the same as an unknown link, except that the blacklist has received a report about a link that is unlikely to be phishing; for example, websites whose top-level domain (TLD) is .edu or .gov fall into this category. The link maintains the unlikely status until it is verified and changed to a safe link.
Figure 14: Global Phishing Survey: Trends and Domain Name Use - April 2011.
As Figure 14 shows, 60% of phishing attacks were launched from servers in these TLDs: .COM, .NET, .TK, and .CC.
8.2 Search Engine
Building your own search engine is beneficial and hard at the same time. One advantage is that you can narrow the range of the search to exactly the content you are looking for. In our case, however, we need a widely used search engine. I tried to create a PHP search engine, but a site searching for phishing needs a huge database and crawling functions to cover most of the results we look for, because such a search engine can only search the data stored in its own database. As a result, I used Google as the search engine. Google already has its own phishing protection, so we combine Google's protection with that of PhishLurk. During the evaluation, this made it difficult to find phishing links.
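The extra unlikely category proposed above can be folded into the existing classification step. The sketch below is illustrative, not PhishLurk's actual code: the function name, the string labels (PhishLurk stores classes as integers), and the exact rule that an unverified report on an .edu or .gov host is demoted to unlikely are my assumptions about how the category could work.

```python
from urllib.parse import urlparse

# Illustrative class labels; the real table stores a class number.
SAFE, UNKNOWN, UNLIKELY, PHISHING = "safe", "unknown", "unlikely", "phishing"

TRUSTED_TLDS = (".edu", ".gov")  # TLDs rarely seen hosting real phishing sites

def classify(url, blacklisted, verified=False):
    """Classify a search-result URL against blacklist reports."""
    host = urlparse(url).hostname or ""
    if blacklisted:
        if verified:
            return PHISHING            # confirmed report: full warning
        if host.endswith(TRUSTED_TLDS):
            return UNLIKELY            # reported but unverified, trusted TLD
        return UNKNOWN                 # reported but unverified
    return SAFE

print(classify("http://bank-login.example.com/", blacklisted=True, verified=True))  # phishing
print(classify("http://cs.someuniversity.edu/~user/", blacklisted=True))            # unlikely
print(classify("http://www.example.org/", blacklisted=False))                       # safe
```

Once verification arrives from the blacklist provider, the link leaves the unlikely state in either direction, so the category never suppresses a confirmed warning.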
Having Google as the first-line search engine makes it really hard to evaluate the prototype. I started with a PHP search engine, but it did not work efficiently; it would have to be loaded with a very large database of URLs. In fact, creating a capable search engine is much more difficult than I expected.
8.3 Disadvantages of Ajax
Ajax provides dynamic interaction between the browser and the server in order to generate dynamic results or offer suggestions to the user. However, Ajax relies on JavaScript, which can be difficult to run consistently across different browsers because JavaScript is not handled the same way in all of them. Processing on the client side also means more interaction with JavaScript and the browser, which works against one of the ultimate goals of PhishLurk: to provide protection from the server side without requiring client-side cooperation, so that it is efficient for any device a user might have, even devices without protection. Ajax is best suited to small applications that handle slight amounts of data, unlike a search engine such as PhishLurk [32][33]. It also requires loading or referencing an additional Ajax library, which increases the page size.
8.4 Client-Side Protection
The idea of PhishLurk could usefully be implemented on the client side as a browser plug-in for smartphones. However, it takes more effort to design and deploy a smartphone web application because of the limitations of smartphones, i.e. the small screen size and the limited computation and memory resources; a smartphone cannot function as efficiently as a desktop PC. Moreover, smartphone web development demands more skills, including web server configuration, web browser configuration, and an IDE that supports the markup languages used.
I consider a client-side plug-in for smartphone web browsers as future work.
9. Future Work
The current version of PhishLurk's mechanism works efficiently, but its functions can be further improved and enhanced; for example, it could be extended to enhance spam protection as well. PhishLurk can be enhanced with the following features and implemented on a wider variety of devices and systems:
- Tune up the code further to optimize its performance.
- Improve client-side protection: create a plug-in for smartphone browsers such as the BlackBerry WebKit browser.
- Fix the conflicting session creation to make it more reliable.
- Follow-up reporting: create a module to email users who decided to visit potential phishing sites and collect their feedback.
- Conduct a survey on how useful PhishLurk is after making it available on the internet.
10. Conclusion
I designed and developed PhishLurk, an anti-phishing search website that classifies and prevents phishing attacks. PhishLurk provides protection from the server side and uses the coloring scheme for classification in order to consume as little computation and screen resource as possible on the client side. It can be ported to work efficiently with a variety of devices with different capabilities. PhishLurk uses PhishTank as the blacklist provider and checks the list live to achieve the maximum possible accuracy.
The efficiency of PhishLurk is affected by several factors, discussed in Section 8, including the accuracy of blacklists and search engines. I believe the idea of PhishLurk can be a good enhancement feature if it is adopted by a major search engine such as Google or Yahoo. Moreover, the mechanism can be optimized to be applied to and work efficiently on smartphones.
11. Acknowledgment
I would like to thank my advisor, Dr. Edward Chow, for his support and continual encouragement during my research. I also thank Dr. Albert Glock and Dr. Chuan Yue for their willingness to serve as committee members for my project.
12. References
[1] Rachna Dhamija, J. D. Tygar, and Marti Hearst, "Why Phishing Works," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '06), ACM, New York, NY, USA, 2006, pp. 581-590. DOI: 10.1145/1124772.1124861.
[2] Whittaker, Brian Ryner, and Marria Nazif, "Large-Scale Automatic Classification of Phishing Pages," NDSS '10, 2010.
[3] Ben Gross, "Smartphone Anti-Phishing Protection Leaves Much to Be Desired," Messaging News, 26 Feb. 2010.
[4] comScore, Inc., "Smartphone Subscribers Now Comprise Majority of Mobile Browser and Application Users in U.S.," comScore, Inc., 1 Oct. 2010.
[5] Roger, "Smartphones to Overtake Feature Phones in U.S. by 2011," Nielsen Wire, 26 Mar. 2010.
[6] Paul L., "How Can We Stop Phishing and Pharming Scams?," CSO Online, CSO Magazine - Security and Risk, 19 July 2005.
[7] PhishTank: An Anti-Phishing Site. [Online].
[8] Y.; Saklikar, S.; Das, D.; Saha, S., "PhishGuard: A Browser Plug-in for Protection from Phishing," 2nd International Conference on Internet Multimedia Services Architecture and Applications (IMSAA 2008), pp. 1-6, 10-12 Dec. 2008. DOI: 10.1109/IMSAA.2008.4753929.
[9] PhishTank - Statistics about phishing activity and PhishTank usage.
[10] Friends of PhishTank.
[11] "SmartScreen Filter: Frequently Asked Questions," Windows Home - Microsoft Windows.
[12] "SmartScreen Filter - Microsoft Windows," Windows Home - Microsoft Windows.
[13] "Safari - Learn about the Features Available in Safari," Apple.
[14] "PayPal Warns Buyers to Avoid Safari Browser from Apple," top technology news.
[15] "Firefox 2 Phishing Protection Effectiveness Testing," Home of the Mozilla Project.
[16] "AVIRA News - Anti-Virus Users Are Restless, Avira Survey Finds," Avira.
[17] Chuan Yue and Haining Wang, "BogusBiter: A Transparent Protection Against Phishing Attacks," ACM Transactions on Internet Technology, vol. 10, no. 2, Article 6, June 2010, 31 pages. DOI: 10.1145/1754393.1754395.
[18] R. Dhamija and J. D. Tygar, "The Battle Against Phishing: Dynamic Security Skins," in Proceedings of the 2005 Symposium on Usable Privacy and Security (SOUPS '05), ACM, New York, NY, USA, 2005, pp. 77-88. DOI: 10.1145/1073001.1073009.
[19] "DNS-Based Web Security."
[20] "True Web Content Filtering for All."
[21] "Web, Email and Network Security Solutions for SMBs, on Premise and Hosted."
[22] Anti-Phishing Working Group, "Phishing Activity Trends Report - 2nd Half 2010," APWG, Dec. 2010.
[23] G. Ollman, The Phishing Guide: Understanding and Preventing Phishing Attacks, 22 Sept. 2004.
[24] Anders, "Exploring Phishing Attacks and Countermeasures," Blekinge Institute of Technology, Dec. 2007.
[25] Richardson, "Brits Fall Prey to Phishing," The Register, May 2005 (accessed 7 Aug. 2011).
[26] Red Condor, "Phishing for Disaster: The Cost of Corporate Ignorance," Red Condor, July 2010.
[27] Ramzan, "A Brief History of Phishing: Part I," Symantec Connect Community, 29 June 2009.
[28] Lin; Chih-Wei Jan; Po-Ching Lin; Yuan-Cheng Lai, "Designing an Integrated Architecture for Network Content Security Gateways," Computer, vol. 39, no. 11, pp. 66-72, Nov. 2006. DOI: 10.1109/MC.2006.379.
[29] Kirk, "Sony Server Said to Have Been Hacked to Host Credit-card Phishing Site," May 2008.
[30] Jeremy Kirk, "Hacked Bank Server Hosts Phishing Sites," Computerworld, Mar. 2006 (accessed 18 Aug. 2011).
[31] Fisher, "Researchers Find Government Site Hosting Phishing Data," Threatpost, Apr. 2008 (accessed 18 Aug. 2011).
[32] J.S.; Chapa, S.V., "From Desktop Applications Towards Ajax Web Applications," 4th International Conference on Electrical and Electronics Engineering (ICEEE 2007), pp. 193-196, 5-7 Sept. 2007. DOI: 10.1109/ICEEE.2007.4345005.
[33] Jing; Xu Feng, "The Research of Ajax Technique Application Based on the J2EE," 2nd International Workshop on Database Technology and Applications (DBTA 2010), pp. 1-3, 27-28 Nov. 2010. DOI: 10.1109/DBTA.2010.5659073.
13. Index
1. Introduction
1.1 Impact of Phishing
2. Background
2.1 History of Phishing
2.2 Most Targeted Industries
2.3 Why Phishing Works
2.4 Existing Anti-Phishing Work
3. Related Work
3.1 Blacklisting
3.2 Current Browsers' Phishing Protection
3.3 Classification of Phishing Defense
4. The Proposed Project
5. Design of PhishLurk
5.1 PhishLurk Components
6. Implementation
6.1 The Information Flow
6.2 Blacklist
6.2.1 Updated Blacklist
6.2.2 Checking the URLs Live
6.3 Classification
6.4 Warning
6.5 Logger
6.5.1 Creating and Updating the Records
6.6 Database
7. Performance Evaluation
7.1 Challenges
7.2 Test-bed Experiment
7.3 Experiment
8. Discussion
8.1 Expanding the Categories
8.2 Search Engine
8.3 Disadvantages of Ajax
8.4 Client-Side Protection
9. Future Work
10. Conclusion
11. Acknowledgment
12. References
13. Index
Appendix A. User Guide
PhishLurk is simple to use. The main page includes a text box where the user inputs the keywords.
Figure: The main page of PhishLurk.
After the user enters the keywords, the search results are shown as links with their titles and descriptions.
How to read the classification of the links. There are three classes:
Phishing Link: a risky link. PhishLurk displays the phishing link in red and adds text next to the title indicating that the link is phishing.
Figure: Phishing links as they appear in the search results.
If the user clicks a phishing link, PhishLurk reminds the user that the link is risky by transferring him to the warning page. The warning page alerts the user to the risk and shows how many times the website has been visited.
Figure: The warning page for phishing links.
Unknown Link: a suspicious link. PhishLurk displays the unknown link in orange and adds text next to the title indicating that the link is unknown.
Figure: Unknown links as they appear in the search results.
If the user clicks an unknown link, PhishLurk reminds the user that the link is suspicious by transferring him to the warning page. The warning page alerts the user that the link is potentially risky and shows how many times the website has been visited.
Figure: The warning page for unknown links.
Safe Link: the link is safe and not blacklisted. The user can access it safely.
Figure: Safe links as they appear in the search results.
Appendix B. Installation and Configuration of PhishLurk
<Detailed steps on how to install PhishLurk. Reference the source code and the URLs of any software packages used.> You can assume a virtual machine running Windows 7 or Windows Server 2008.
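The three classes described in the user guide (Appendix A) map directly onto PhishLurk's coloring scheme. The sketch below shows how one result link could be rendered with that scheme; the colors and note wording follow the guide, while the function name and the exact HTML structure are illustrative assumptions, not the actual PHP templates.

```python
# Class -> (color, note) mapping taken from the user guide above.
SCHEME = {
    "phishing": ("red", "Phishing link"),     # risky: red title plus note
    "unknown":  ("orange", "Unknown link"),   # suspicious: orange title plus note
    "safe":     ("black", ""),                # safe: plain title, no note
}

def render_result(title, url, link_class):
    """Return the HTML for one search-result link, colored by its class."""
    color, note = SCHEME[link_class]
    suffix = f" ({note})" if note else ""
    return f'<a href="{url}" style="color:{color}">{title}{suffix}</a>'

html = render_result("Your Bank Login", "http://phish.example/", "phishing")
print(html)  # <a href="http://phish.example/" style="color:red">Your Bank Login (Phishing link)</a>
```

Because the warning is a styled text node rather than an image, it costs only a few dozen bytes per result, which is the source of the page-size advantage measured in Section 7.3.1.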

