Black Hat Spam SEO

>> SOBRIER: The Zscaler security team has a blog
at research.zscaler.com. You will see two to five posts a week about web security
and several posts about Black Hat Spam SEO. If you have any questions after this presentation,
you can always email me at [email protected]. I’ve co-written a book called
‘Security Power Tools’ and I’ve been working on Black Hat Spam SEO since December 2009.
So during this presentation, I’ll try to answer these questions: what is Black Hat Spam SEO?
How are attackers using Black Hat Spam SEO? How prevalent is it, especially in Google
search results? What are the different types of attack that they use? What are search engines
currently doing about Black Hat Spam SEO? How can users protect themselves and–oh,
so you cannot see the bottom–how can search engines protect their users better? So, what is
Black Hat Spam SEO? So, what attackers are doing is creating thousands of fake pages
for popular search terms and creating it in a way that they will rank high in Google search
results. So to get the list of popular searches, they use Google Hot Trends. And Google
Hot Trends displays every day the list of popular searches in the U.S. for that day. There are
20 Hot Trends popular searches a day. So they get their information from this tool to know
what kind of pages they should write. Then, they need to get the content. And to do that,
they do another Google search for this popular term and along with the links, Google will
show a short paragraph for each webpage. So they actually use this paragraph from the
Google Search to create the spam pages. And they usually add a lot of dates in their content
to look up to date to Google and get a better ranking. And so now that they have
all their spam pages they need to feed them into Google. So they create additional pages
that are links to these spam pages. So this is an example of a spam page that links to
a spam content. You see a lot of popular keywords that were popular in August, I think. So what
is the goal of these fake pages? So, these fake pages are used only to get indexed in
Google, but what they actually want is for users to click on one of these spam pages
in the search results in order to redirect them to malicious sites. And the way they
differentiate between a search engine crawler like Googlebot and a user is by checking
the referrer. So they check if the request is coming from somebody who was on Google,
Bing, or Yahoo or any other search engine, and they also can do the redirection with
JavaScript or Flash to make sure that this is a real browser hitting the spam page.
So that way, they can feed the spam page to Google in order to get indexed and redirect
the user to the malicious page.
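
The referrer check described above can be modelled in a few lines. This is a minimal sketch in Python for reasoning about the behaviour, not code taken from any actual spam page; the function names and the list of search engine names are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumption: only the three engines named in the talk are checked.
SEARCH_ENGINE_NAMES = {"google", "bing", "yahoo"}


def came_from_search_engine(referrer: str) -> bool:
    """Return True if the Referer value points at a known search engine host."""
    host = (urlparse(referrer).hostname or "").lower()
    # Compare whole host labels so "evil-google.example.com" does not match.
    return any(label in SEARCH_ENGINE_NAMES for label in host.split("."))


def would_redirect(headers: dict) -> bool:
    """Model of the decision: visitors arriving from a search result get
    redirected; crawlers and direct visitors (no referrer) see the spam page."""
    return came_from_search_engine(headers.get("Referer", ""))
```

A request with `Referer: http://www.google.com/search?q=meteor+shower+tonight` would be treated as a search visitor, while a request without a Referer header would not.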
So, to do that, to create their spam pages, they actually don’t use their own website.
They hijack some legitimate websites. And to not be detected by the webmasters, they
don’t modify the existing pages; they add new pages on the website. They try–they always
try to hide their spam pages, so you will see spam pages using xmlrpc.php, which is
for WordPress, a page that’s used to remotely control your blog. They use robots.txt which
is used by, again, search engine crawlers and not seen by regular users. They hide in the image
folder where the webmaster does not expect to see pages. They create hidden folders on
Linux like a dot something, a dot-file folder. So they really try to hide their pages from
the webmaster to not be detected. And this is what spam pages can look like. So here
you see just plain text, basically with no link, no CSS, no style sheets, no images;
just contents with a bunch of dates. And the sentences put together don’t actually make
sense. It’s just a bunch of sentences that they got from the Google search result, put
together around one keyword. But now the spam pages are getting a lot smarter and look
more like a real website. So, this one is not an MTV page. It was taken from some other
hijacked site and you will see a lot of dates. But here, you will actually see real images.
They make links to a legitimate website. I’ve seen links to Flickr, to Facebook, to NSA,
[INDISTINCT] sheets. They really look like a real page. But, of course, the page is actually
very different from the rest of the website. So now that they have created their webpage,
they have a script on the hijacked site that redirects users to the malicious site. And,
again, they do it based on the referrer header. They do it based on the user agent,
so they may check if you are actually using Internet Explorer or Firefox or any of the regular
browsers. And they can even use Flash and JavaScript to tell a fake browser from a real browser.
This is an example of a request, and unfortunately, we–okay, we cannot see the entire slide on
the screen. But the first request is from a Google search, you can see the referrer
here is from Google.com and the search was for ‘meteor shower tonight.’ And the first
request is to a spam page hosted on spacecoast-trading.ipower.com. And when you hit the webpage,
the user does not actually see any content. There is a redirection to another domain with a .ws
domain extension, which does additional
checks to differentiate between a real user and a crawler or a security tool. And what you
don’t see here is that this website at .ws will then redirect
the user to a malicious site that hosts the malware; in this case, it’s kill77-virus-co.cc.
But unfortunately, I can’t show the entire page here in this slide. So in a nutshell,
that’s how it works. So Google first will follow the blue line. So, Google
will crawl the hijacked site and it will be served the spam content,
and that’s what will be indexed in Google. That’s what Google will display to the user.
But if a user, the red dot, makes a popular search in Google, he will find this hijacked
site in the search results. He clicks on the link, but instead of going to the spam content,
he will actually be redirected to malware. There could be more than one redirection;
this is a simplified version, so you get the idea. There’s usually a command and control
server involved, which tells the hijacked site what keyword they should create
a page on, what content should be there, and to what malicious website they should redirect.
So how prevalent is it? So again, that’s for the 20 daily popular searches that Google
Hot Trends shows. So for these searches, more than 50% of the popular searches contain at
least one spam link in the first 10 pages. So all the research that
I’ve done was only on the first 10 pages, the first 100 results, and that was from, I
think, March to August. So, almost 60% of the popular searches contain a spam link in
Google. And I’ve seen up to 90% of the first 100 links being malicious. And this is a breakdown
of all the popular searches that have at least one malicious
link. And you can see here that 10% of the infected search results have more than 50%
malicious spam links. So, more than 50 spam links in the first 100 results. So what are the types of attacks? So what
are the types of website that you actually are redirected to after connecting to a spam
page? The most popular one is the fake antivirus page. I have a screenshot here. So it looks
like a desktop antivirus running on your computer, which warns you that your computer is infected
with viruses and nicely prompts you with a download box where you can download the solution,
the fake–the antivirus which is, of course, a malicious executable. I have a video of
it that I can show you if I can–if I know how to use the Mac. Let’s see. See it’s here.
Yeah, this one. So, this was done on April 15, just after Tax Day, which is, I think,
April 14. So one of the popular–many of the popular searches were on the Tax Day. And
in this example I will show you in a minute, I went to Google Hot Trend, looked–saw a
trend for Google’s tax freebie, I believe. I did the–I entered the query in Google and
the first [INDISTINCT], you know, under the Twitter links and the Facebook link and some
of the other real-time links, the first link on the page, on–in Google was a malicious
spam link that’s redirecting me to a fake antivirus page. Okay. I might not be able
to do it full screen. So this is Google Hot Trends. This is one of the hot trends for
that day. I do a search on Google and get to the first link, click on it, and here it
is. This is a fake antivirus page. So, you see it looks like a desktop antivirus. It
shows me that the PC is infected. And whatever you do, even if you don’t click, you actually
get prompted for an executable to download. If you try to leave the page, you get prompted
again to download the executable. So that’s, by far, the most popular type of attack these
days with spam SEO. Okay, where is my OpenOffice here? But, of course, they are changing their
attacks all the time. So the other popular type of attacks is a fake video player. You
go to a page that often looks like a YouTube page, and they warn you that you either don’t
have the right codec to read the video or you don’t have the right ActiveX in this case
or you don’t have the right video player. And again, you’re asked to download and install
the video player. And it’s, of course, a malicious file. They do the same thing with Flash updates.
They tell you your Flash version is out of date and you need to update to be able to
read the video. They also do more targeted attacks for specific browsers. So, for example
for Firefox, they may tell you that your Firefox version is out of date. So on this page, I
was actually running Firefox 3.6.8, but it told me my version was too old. They also
told me my Flash version was too old and I had to download a new version. And
if you use Firefox, you will see that the page really looks like the official Firefox
page. They use the same fonts, the same image background, the same type of icons, and even
the domain names are pretty good. For example, for the malicious executable for the Flash update,
the domain name was flashupdate.co.cc, so it is very easy to fool a user. They are pretty
well done. And, of course, there is non-malicious spam. You may be more familiar with it. So
this presentation is not on this harmless spam but really on malicious spam, but I thought
I would mention it. So the most popular non-malicious spam is a fake search engine. So they do the
same thing as with malicious spam. They hijack legitimate websites,
they create a new spam page, but you get redirected to a fake search engine and all the links
that they show to the user are actually advertising. So they get money every time the user clicks
on them. And there’s, of course, just plain old spam. So what are the search engines doing
about it? So in my opinion, not enough when you see that 50% of the popular search results
are actually infected. I think there’s some improvement that can be done. So Google Safe
Browsing is not, I believe, the answer. So if you’re not familiar with Google Safe Browsing,
Google maintains a list of–a blacklist of malicious and phishing sites. This blacklist
is used by most of the browsers: Firefox, Opera, Chrome, Safari. But there are limitations.
I will go into more details. One of them is that Internet Explorer does not use Google
Safe Browsing. Google will sometimes show a warning in the search results, if they know
that the link is malicious. They will actually warn the user. And I’ll show you an example
a little bit later. Well, what was very surprising for me is in Google you have an option called
SafeSearch which is on by default. And with the word ‘Safe’ in it, I
thought that it would be used to filter out dangerous, unsafe results. But this one, I think, is only
used for adult content. So this is an example of a warning that Google might show in a search
result, at the top here. So under the link, they say this site may harm your computer
and if you still decide to click on the link, they will not direct you to the website; they
will redirect you to another Google page, warning you again that you should not visit
this website. And there’s no direct link actually on this page. You have to copy and paste the
URL from the page to go to the malicious site. Yeah. So–but what about this warning? How
many warnings can you see in a Google search? So I’ve made a comparison of–at the top the
number–the distribution of warnings among search results and the number of spam links
which don’t have malicious–spam links which don’t have a Google warning. So you see here
in purple that 41 search results had more than 50 spam links without any warning, but
there were only three popular searches that had more than 50 warnings. So the number of
warnings is much lower than the number of actual malicious spam links. And
if you see how they evolve over time, I’ve looked at two examples, two popular searches
at some point, and I’ve checked how many malicious links–that’s the blue
line–there are in the first 100 results, so the ones that don’t have any warning, and how
many warnings Google is showing. So you see here for Anacostia River, a pop singer, we
started on day two, so the second day after it got popular and got listed on Google
Hot Trends, there were about 75 malicious spam links in the first 100 results. And there
were, on the second day, only one or two warnings. And you see over time the number
of warnings goes up; the number of malicious links goes down. But you see, even after ten
days, there are, yes, there are more warnings than spam links without warning, but there
are still about seven spam links in the search result. And in the second example, it’s even
worse. So this one, I’ve looked at the search, 16 days after it was popular to 20 days after
it was popular. And we started with around 37 spam links without warning. And after 20
days, we still have more spam links that don’t have any warnings than Google warnings. So,
even though over time Google is doing some cleanup in the search result, there are still
a lot of spam links that are not being blocked by Google. And then we get the other issue
with spam links. So here was, I think, the first popular search that I saw that had more
than 90 dangerous links in the search results. So this time, Google did a good job of cleaning
out the search results after a few days, of issuing warnings. The problem
is you had to go to page seven to see the first link without a warning. So the first
seven pages of results were links that you could not click on because they had the
warning, and you may not see it very well on the screenshot, but SafeSearch is on. So
I would argue that there’s no value for the user in seeing seven pages of dangerous results
when he’s looking for something, because these pages are dangerous and even if you’re not
getting redirected to the malicious websites there’s absolutely no interesting content
on these pages. So Google Safe Browsing, what are the limitations of Google Safe Browsing?
So number one I think it’s a big one; it’s not part of Internet Explorer. So, whatever
you’re doing with Google Safe Browsing, you’re not protecting more than 70% of the users
who are using Internet Explorer and which do not have Google Safe Browsing in their
browser. Google Safe Browsing focuses on detecting the malicious sites. So they try to detect
not the spam pages, but the malicious sites that you are redirected
to after the spam page. So as a result, you’re always a step behind. Spam pages may redirect
you to different malicious sites at different times, depending on your browser, depending
on the time of the day or depending on how many times you clicked on them. And to me, it does
not always make sense. So I’ll show you one example of–sorry, a diagnostic page for Google
Safe Browsing. So consul.net is a website that apparently is not maintained by its
webmaster at all. You can actually go there. If you just go to the root directory, it’s
fine. Nothing will happen to your computer. So what does it say here?
So this website has been hijacked. It’s redirecting users to malicious sites all the
time. So what does Google Safe Browsing say about it? So they clearly state here in the
first line that it’s not suspicious, which means it’s not part of Google Safe Browsing.
But then in the first box they say that in the last 90 days, they actually found a lot
of redirection from this domain to malicious pages and then they list a couple of domains,
of malicious domains, that the website was redirecting to. And you’ll notice excel.pl,
that was the first domain that was hosting 95% of the fake AV pages in the first months
of this year. So, Google knows that this website is redirecting users, and it’s still doing
it as of today, to malicious pages, even not [INDISTINCT] domain, but it’s not part
of Google Safe Browsing. So, what about the other search engines, namely, Bing and Yahoo?
Bing is surprisingly very clean. I’ve seen very, very few spam pages in the search results
for the popular searches. I’m not sure why. I don’t know if it’s because it doesn’t index
new content very fast. I don’t know if they actually intentionally clean out their search
results, but it is very clean. Yahoo is pretty bad. They also have a lot of malicious links,
but it takes them a lot of time to index new content. So the spam content will show up
on Google the next day after it is created, but it will show up in Yahoo
maybe a few weeks later. And by the time it shows up in a Yahoo search result, the
hijacked site may have been fixed, so these pages don’t exist anymore. The malicious site
may be down, so the redirection does not happen or–and simply the search is not popular anymore,
so users are not looking for these keywords and it doesn’t really matter if they have
malicious pages in there. So now, what about protection, how can users protect themselves?
The first thing that you may think of is antivirus, right, because all these malicious
pages try to download and run a malicious executable, your antivirus should be able
to catch it. Unfortunately, that’s not true. Typically, less than 25% and pretty
often less than 10% of the antivirus vendors will actually detect anything malicious. There’s
a website called VirusTotal, where you can upload files and they will run
the files through 40 or 42 different antivirus products including McAfee, Symantec, any of the antivirus
you could buy at Fry’s or other stores. And if you can see here in red, these are the antiviruses
that found the executable as being malicious, and only one, two, three, four,
five, six out of the 40 antivirus vendors found something malicious. And so, it’s very likely
that your antivirus will not find anything. My antivirus, I think, that I’m using is AVG,
and pretty much, it didn’t find anything wrong with all the executables I downloaded.
Of course, that actually makes my life easier as a security researcher, not having to disable
and re-enable my AV all the time. So the second thing will be Google Safe Browsing, again,
part of Firefox, Chrome, Opera, IE–sorry, not Internet Explorer, that’s the problem.
Internet Explorer has its own technology called the SmartScreen Filter. I did a quick comparison
of what it detects and what Google Safe Browsing detects, and there’s not too much
overlap. So, I don’t know which one is better than the other, but clearly, neither of them
catches everything. And, of course, they cannot detect all the types of security issues. So
at Zscaler, we’ve tried to approach this problem in a different way, and we released
a Firefox extension called Search Engine Security to protect Firefox users. And it’s a different
approach from AV and Google Safe Browsing in the sense that, remember in the process,
you go from the Google search result to a spam page, the spam page does a couple of
checks and figures out if you are a regular user or if you are a bot or some kind of
automation tool. So if you can fool the spam page into thinking that you are not a regular
user, then you will not be redirected to the fake–to the malicious site. You will either
see the spam page content or sometimes you get a blank page. So what this plug-in does
is very simple. It actually changes the referrer. When you click on the link, when you leave
Google, Bing or Yahoo, from a search result, when you click on the link to go to the spam
page it will actually change the referrer and the spam page will not redirect you to
the malicious site. So, this is a quick screenshot of the plug-in. So it
works for Bing, Google and Yahoo, and the reason is that at least 90% of
the spam pages check if you’re coming from one of these three websites,
and you can change your referrer to anything or you can leave it blank. If you install
the plug-in on Firefox and you do a search on any of the three engines, you will see
under the search bar a small notification to show you if the plug-in is on or
off. So this plug-in is helping Firefox users to not get redirected
to the malicious website. Okay. It was kind of a joke here, but you cannot see it. So
the other solution might be to simply switch to Bing since they have much cleaner search
results. I didn’t switch to Bing, I still like Google, but you have to be aware that
if you’re looking for everything other people are also looking for, you are at risk. And
that’s probably something we always say about security. User education really is the
only way to fight this. Until users understand that they should not download and install
any executable that they find on the Internet unless it’s from trusted sources, they will
always run into issues. So what could search engines, and especially Google, do to protect
their users better? I think the best way to protect users is actually
to focus on the spam pages; that way, you’re always a step ahead, right? If you don’t go
to the spam page from the Google search results, then you will not end up on the malicious
site. Also, the way spam pages work, the types of URLs they use are always
the same. So, if you find one spam page on one website, you know what all the spam
pages are for the previous Google Hot Trends and you know what the URL will be for
any of the next Google Hot Trends. And you can also know what
the URLs are that will be used on other domains, and the spam pages
always look the same. I mean, there are probably two or three types of spam pages. So once you
are able to detect one type of spam page you should be able to recognize other spam pages.
One other test you can do to double-check if it is a spam page is simply to access the
page in two ways. First, you go directly without any referrer header. You should end up on
the same domain as what you saw in the link, or possibly on another domain:
one of the things that they do sometimes is, if they figure that you are not
a regular user but you’re also not Googlebot or a search engine bot, they actually redirect
you to some random website like CNN.com or Google.com. So you will end up on domain-a.com.
And if you follow the link with the correct referrer, meaning a referrer that shows that
you were coming from a Google search, you will end up on a different site. So that’s how
I actually do most of the pre-filtering for malicious domains: I go to the same webpage
in two different ways and check if I end up on the same domain in both cases
or on different domains. And if you’re on different domains, that’s a good indication
that something bad is going on. Of course, one other problem with Google is just the scale
of everything you do, right? You probably index millions of pages a day, so even any
simple kind of filtering will take up a lot of resources. So, I’ve produced about
four posts about trends and statistics about spam SEO. And the one thing I was trying to
do for my own research is to figure out quickly what URLs I should focus on. So,
I came up with a simple regular expression that allows me to inspect only 2.3% of all
the search results that I see and still get 80% of the spam. And that gets me as well,
70% of the spam. So once I apply this regular expression just on the URL, more than 50%
of the URLs I get are malicious, and then I can use the other trick, visiting
the URL in two different ways, to figure out if it is actually malicious or not. So my point
is I think it will not be that hard for Google to get rid of at least 80% to 90% of the spam
SEO. So that’s not a hard problem. And I know the security team at Google is well aware
of spam SEO. They published white papers about fake antivirus pages. They have Google Safe
Browsing which is full of these malicious sites and after redirections. So I think it’s
really a matter of getting this information upstream and actually cleaning out search
results. And that’s the end of the presentation. I want you to have enough time for questions.
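
The URL pre-filter mentioned in the talk is described by three numbers: the share of search results it selects for inspection, the share of all spam it still catches, and the share of selected URLs that are actually malicious. The sketch below, in Python, shows one way to measure those three numbers on a labeled sample. The regular expression is purely illustrative and is NOT the expression used by the speaker, which is not given in the talk; the sample URLs are made up.

```python
import re
from typing import Dict, Iterable, Pattern, Tuple

# Hypothetical pattern (an assumption, not the speaker's): a PHP page whose
# single query value is three or more keywords joined by "-" or "+".
HYPOTHETICAL_PATTERN = re.compile(
    r"\.php\?[a-z]+=[a-z0-9]+(?:[-+][a-z0-9]+){2,}$", re.IGNORECASE
)


def evaluate_prefilter(
    pattern: Pattern[str], labeled_urls: Iterable[Tuple[str, bool]]
) -> Dict[str, float]:
    """Measure a URL pre-filter on (url, is_spam) pairs."""
    total = selected = spam_total = spam_selected = 0
    for url, is_spam in labeled_urls:
        total += 1
        spam_total += is_spam
        if pattern.search(url):
            selected += 1
            spam_selected += is_spam
    return {
        # Share of all results that need a closer look.
        "fraction_inspected": selected / total if total else 0.0,
        # Share of all spam that the filter still selects.
        "spam_caught": spam_selected / spam_total if spam_total else 0.0,
        # Share of selected URLs that are really spam.
        "precision": spam_selected / selected if selected else 0.0,
    }


SAMPLE = [
    ("http://hijacked.example/images/index.php?q=meteor-shower-tonight", True),
    ("http://hijacked.example/.cache/view.php?page=tax-day-freebies", True),
    ("http://news.example/2010/08/meteor-shower", False),
    ("http://blog.example/index.php?id=42", False),
]
stats = evaluate_prefilter(HYPOTHETICAL_PATTERN, SAMPLE)
```

URLs selected by such a filter would then go through the slower check of visiting the page with and without a search engine referrer.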
And so again, if you are interested in the web security and Black Hat spam SEO, you can
check our blog at research.zscaler.com. If you have questions, you can always talk to
me in a few minutes or just email me at [email protected] So I don’t know if you have any questions.
Yes?
>>I think you used to say this is a dangerous
link if you actually download it.
>>SOBRIER: Right.
>>[INDISTINCT]
>>SOBRIER: So, let me go back to the slide
which explained–so I started out by using a web browser, Firefox and Selenium to do
all the automation. And I was just clicking, going to Google Hot Trends, entering search
terms and going through the links. And what I was checking is, does a page redirect me
to another domain? So I get the list of links from my browser. I visit
each link in two different ways. First, I click on the Google link in the search results,
so I’m like a regular user, so I go to the spam page and if it’s malicious, I’m redirected
to another website. So, I can find out which domain I end up on in the end. And the other
way is I just type the URL in my browser and I go directly without any referrer, and the vast
majority of the spam pages will say, ‘You’re not a regular user. Here is the spam page,’
or redirect me to a different domain, a non-malicious domain. So by comparing the two domains and
the two ways of doing the redirection, I’m able to tell that something is probably wrong.
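
The two-visit comparison can be sketched as follows. This is a simplified Python illustration, not the speaker's tooling: he describes driving Firefox with Selenium, whereas this sketch uses the standard library and therefore only follows HTTP redirects, not JavaScript or Flash redirections. The function names and the User-Agent string are assumptions, and any real run against suspect pages should happen on an isolated machine.

```python
import urllib.request
from typing import Callable, Optional
from urllib.parse import urlparse


def fetch_final_url(url: str, referrer: Optional[str] = None,
                    timeout: float = 10.0) -> str:
    """Follow HTTP redirects and return the URL of the final response."""
    headers = {"User-Agent": "Mozilla/5.0"}  # assumed browser-like value
    if referrer:
        headers["Referer"] = referrer
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()


def host_of(url: str) -> str:
    """Lower-cased host name of a URL, or an empty string if there is none."""
    return (urlparse(url).hostname or "").lower()


def lands_on_different_domains(
    url: str,
    search_referrer: str,
    fetch: Callable[[str, Optional[str]], str] = fetch_final_url,
) -> bool:
    """Visit once directly and once with a search engine referrer,
    then report whether the two visits end on different hosts."""
    direct_host = host_of(fetch(url, None))
    referred_host = host_of(fetch(url, search_referrer))
    return direct_host != referred_host
```

The `fetch` parameter exists so the comparison can be exercised with a stub instead of real network traffic, or swapped for a browser-driven fetcher.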
Then for the company itself, we actually create signatures that scan the content and tell
the user if they were going to fake AV pages or fake video pages, so that’s
how I can double-check. If none
of the signatures triggered, I will look at it manually, but also I’ll disregard the content
because most of these malicious pages are always the same. I mean, they make changes
from time to time, but the fake AV pages that I found a few months ago still exist
right now. They have different variations of the page but not that much change,
in general. Yes?
>>So you’re absolutely right, [INDISTINCT].
>>SOBRIER: So–sorry.
>>So, this is continually an arms race if
you ask me between good guys and the bad guys and so let’s assume that your tool gets traction
and it’s out there and you’ll rewrite the programs. What happens as soon as you start
getting that data with your tool and the bad guys notice it and that they stop looking
for the referrer and start looking at something else? So, you know, how is this getting…
>>SOBRIER: Sorry?
>>Can you repeat his question?
>>SOBRIER: Okay. So the question was, ‘Are you sure that they are looking at the referrer
to redirect people from the spam page to a malicious site?’ And the follow-up was, ‘Since
the bad guy–since–it’s always a race between good guys and bad guys, so if your Firefox
extension is getting popular, will the bad guy change the tactic and use a different
trick to redirect users?’ So, how do we know that they are using the referrer to redirect
users? So, most of them do it, not 100%, but most of them do it. So, the way we know it
is I was able to see some of the PHP scripts that they use to do this redirection and I saw
what they’re actually doing to differentiate between bots and regular users.
>>All right, I agree with you there.
>>SOBRIER: Okay. So, I mean, the reason for
this tool is really because Google is not doing the filtering, right? What I hope will
happen is that Google will actually be able to filter out the search results and there
won’t be any need for this tool anymore. So it’s true that this tool is protecting
against one type of attack, but there are already attacks that are not being detected by this
tool because they don’t actually check for the referrer. I did a post a few days ago
about the hot video issue. So this one is similar, but what they do is they don’t actually
check for the referrer. They have a Flash all over the page and they do the redirection
with Flash only. So, what they look at is, do you have a full browser with Flash support?
So again, Googlebot will only see the spam page. It won’t follow the Flash redirection,
but the user will get redirected to this Flash. Is there any other question? Yes?
>>What was the time variance you set for your study? Was it March or…
>>SOBRIER: So for the statistics, it was from, like, March to July.
>>Okay. Did you notice or have you been keeping track of–over time if, you know, if the [INDISTINCT]
getting more or less spammy or did you still get it all in there?
>>SOBRIER: So–yeah, I did look at this. It’s very hard to tell because you might get
10 searches with just one link and then one of them gets 90 links. So it really depends–it
really depends on how much good content there is for this keyword. So, most of the popular
searches are about celebrities. If the celebrity is not that well-known, there’s not too much
content over there, so the spam pages will be pretty high in the list of Google search
result. If it is for Britney Spears or some other popular search result, the good content
will come pretty fast and it will push out the spam pages from the first 100 links. So,
it’s hard to tell how it’s evolving over time, but it’s still there. I’m pretty much finding
as many malicious links as I was a few months ago. No other question? Yes?
>>You mentioned that a lot of these queries come from Google Hot Trends.
>>SOBRIER: Yeah.
>>Do you think it will help if Google Hot
Trends were to keep refreshing in different ways, you know?
>>SOBRIER: That’s…
>>Or do you think they just come up with
queries with some other [INDISTINCT]
>>SOBRIER: So that’s a very interesting question.
So the question was, ‘They use Google Hot Trends to get the list of keywords. So should
we change the way we actually display information?’ That’s very interesting because two days
ago something changed in Google Hot Trends and all my automation to get the list of Google
Trends actually broke. So I don’t know if it was intentional or not, but basically the
type of URL you can enter in Google Trends became much more strict. So if you go to Google Trends,
the date will be, for example, 2010-8-8, and before I was entering 2010-08-08 and that
was working fine, but two days ago I started getting an error for that, and the same thing
actually before accessing Google. It looks like something changed a few days ago and I
was able to do that. You can actually not–it’s not that hard to find the source code for
some of these spam pages; how they do the redirections. So you can actually look at
how they connect to–how they connect to Google Hot Trends or even how they do the redirections
and make some modification to do that. But I think, you know, by the time you make the
modification there, it’s–for me, from my point, they made this small modification to
Google Hot Trends, it didn’t take me much time to work around it. So as long as regular users
are able to see this page, bots will be able to see it as well. But this page only exists
for the U.S. I was trying to see if this type of spam SEO is popular in other countries
and I could not find any of this. Another tool that used trends for other country in
the world but it’s very different. And the result, the Hot Trends you get for U.S., for
example, are very different from Google Hot Trends. Yes?
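
The date format change described in this answer (2010-8-8 accepted, 2010-08-08 rejected) is the kind of small difference that breaks automation. As a minimal Python sketch, here are the two spellings of the same date; the helper names are made up for illustration, and how the date is placed into the actual Google Trends URL is not stated in the talk, so it is left out.

```python
from datetime import date


def unpadded_date(d: date) -> str:
    """Format a date without leading zeros, e.g. 2010-8-8."""
    return f"{d.year}-{d.month}-{d.day}"


def padded_date(d: date) -> str:
    """Format a date with leading zeros (ISO 8601), e.g. 2010-08-08."""
    return d.isoformat()


example = date(2010, 8, 8)
formats = (unpadded_date(example), padded_date(example))
```

Building the string from the numeric fields avoids platform-specific `strftime` flags for dropping the zero padding.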
>>[INDISTINCT] with the first 10 page of results [INDISTINCT]
>>SOBRIER: So did I try with other pages? Did I try to check more pages? So I didn’t
really want to go over 10 pages because I don’t think a user will actually go that far.
I have some statistics about how many links you see per page and that’s pretty well distributed.
So you pretty much see as many bad pages on the first page as on the 10th page. But over
time, yeah, it does get pulled down to later pages. I’ve also looked at how much
time it takes for the spam to get into the Google index, and the first day, you don’t get
anything, and actually most of the spam sites will only create spam pages
for the day before. But usually, if you look for a trend five days to 12 days after it
was popular, that’s where you get the most number of spam pages. Yes?
>>So, what do you know about people who are doing all these pages? How many there are
and what kind of tools they are using, and are they mostly using something they rolled
themselves that may have [INDISTINCT] or have they actually purchased some scripts from other
parties [INDISTINCT]?
>>SOBRIER: Okay. So what about the people
who are behind this? So I don’t know that much. What I know is actually mainly from
what my colleagues have done, other security researchers have done. It looks like there
are only a few groups who are doing this. If you see the fake AV pages, they look very
similar. And even when there are some small variations, the source code is pretty much
always the same. I’ve seen the source code of some of the botnets and what you can see
is old code they commented out and the revisions being done. So it really looks like it’s a
small number of people in Eastern Europe, mostly Russia. One of the problems is it’s
very hard to track down even who is hosting these sites because spam pages are on legitimate
websites. The malicious domains use easy, free domains like a sub-domain of .co.cc, which
is a free registrar. Anybody can get this domain for free, no icon registration. Or they
also hijack other legitimate websites to host their malicious sites. So it looks like it
is a small group of people doing this. But I’m not 100% sure. Okay. Any other questions? Otherwise, we can
either talk later or you can send me an email if you have a question. There should be a
video available, so if you want to forward this to any friend or colleague, you can do
that. Thank you.
