How do Google algorithms identify paid backlinks from natural ones?

Google has launched a new algorithm update called Penguin 2.1 that penalizes spammy websites and ranks high-quality websites higher in search results. More specifically, it tries to work out which websites have acquired unnatural backlinks by paying other websites and which websites have natural backlinks.

I was thinking about the possible ways Google's algorithms could tell paid links apart from natural ones.

Of course, Google won't tell us how they do it; but we can try to come up with technical as well as non-technical reasons.

Replies

  • Nayan Goenka
    Well, Google's bots also analyse all your ad words and headers wherever they traverse pages. They detect that multiple combinations of the same words are repeatedly used in headers pointing to the same IP or mask, and from there they can tell which links are paid and which are organic. Google's bots are just marvelous little creatures that keep the traffic running on the internet. With that weapon, Google still holds the top spot in advertising and optimization tools.

    And you won't believe it, but even Bing's bots give Google a higher priority. Want to test it? Go to Bing.com and search for "Webmaster". It will show Google Webmaster first, then Bing Webmaster. That is the power of the bots. Their report generation algorithm works in favor of its advertisers and users.

    So even if paid backlinks are generated, they have no way to modify or bounce the header address, which ultimately gets detected by the bots. Plus, their content is always the same. Since Google has a strong plagiarism check for content, that comes under the scanner as well. For organic backlinks the header is the same, but the plagiarism check clears them. Since the link is organic and unintentional, the content never backfires. So it is clean.
  • Kaustubh Katdare
    So you are suggesting that repetition of words is one of the important factors. I'm not sure what you mean by headers being a parameter as well; I would like to know more. A smart manipulator can create different sets of words for the links and foil Google's efforts to make sure legit links aren't treated as spam.
  • Nayan Goenka
    By headers I mean the HTTP requests that are sent whenever a page is loaded. With advertisement backlinks, everything points to one place, so they all share the same header. That is how it can be detected.

    As for manipulation, even if you change the words, it becomes a new link with a similar header, so it is no longer a repeated advertisement; it becomes an organic link, and that is acceptable. But the main role is still played by the HTTP request, or headers. They tell the complete story to the Google bots.
  • Kaustubh Katdare
    I do have a fair idea about HTTP headers, but I wasn't sure how they'd contribute to the overall backlink profile. The majority of websites will have almost identical headers on almost all of their web pages. Usually, SEO companies hire link creators to post backlinks, and those people have absolutely no control over any website's header information.

    For example, if a webmaster has hired an SEO company to generate backlinks to 'Flipkart.com' (just an example), the SEO company will usually have its agents create fake accounts on various websites and forums to post a link to 'Flipkart.com'. The anchor text for the link could be anything: 'buy mobile phones online', 'the online megastore', 'buy perfumes online' and so on.

    On the other hand, there may be bloggers who have written about their experience of shopping at Flipkart and have created natural links like 'I recommend you to buy mobile phones online from flipkart', with the 'buy...online' keywords forming the anchor text for the backlink!

    Now, in such cases, the whole business of sorting them out becomes difficult, because you can't say for sure which links are natural and which aren't.
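To make the anchor-text point concrete, here is a minimal Python sketch of one signal that could be computed over a site's whole backlink profile: how concentrated the anchor texts are. It is purely illustrative and not a description of what Google does. The function names, the `brand` parameter and the 0.5 threshold are all assumptions made up for this example. It also shows the difficulty raised above: a single blogger's keyword-rich link looks exactly like a paid one, so the signal only means something in aggregate.

```python
from collections import Counter


def anchor_text_profile(anchors):
    """Return (most_common_anchor, share_of_all_links) for a list of anchor texts."""
    normalized = [a.strip().lower() for a in anchors if a.strip()]
    if not normalized:
        return None, 0.0
    text, count = Counter(normalized).most_common(1)[0]
    return text, count / len(normalized)


def looks_over_optimized(anchors, brand, max_keyword_share=0.5):
    """Hypothetical heuristic: one non-brand anchor dominating the profile.

    The threshold is an arbitrary assumption, not a known ranking parameter.
    """
    text, share = anchor_text_profile(anchors)
    if text is None:
        return False
    return brand.lower() not in text and share > max_keyword_share


# Made-up sample profiles for illustration only.
paid_like = ["buy mobile phones online"] * 8 + ["flipkart", "the online megastore"]
natural_like = ["flipkart", "Flipkart.com", "buy mobile phones online",
                "this store", "here", "flipkart review"]

print(looks_over_optimized(paid_like, "flipkart"))     # True
print(looks_over_optimized(natural_like, "flipkart"))  # False
```

A manipulator who varies the anchor text, as suggested earlier in the thread, would slip under a threshold like this, which is why a single signal of this kind cannot settle the question on its own.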
  • Nayan Goenka
    The concept of backlinking usually falls under the black-hat marketing domain. The examples you are giving me are natural ones, or what we call organic linking. How many websites, webpages, forums or accounts do you think you can cover? 10? 20? 50? 100? 1000? Well, it helps, but not at scale for giants like Flipkart. What they want is mass marketing. Now, creating fake accounts and deliberately trying to mention Flipkart here and there in a restricted or, let's say, open area is black hat. For example, if I am talking here on Crazy Engineers and trying to publicize another engineering community, that is black hat. You, as an admin, would object to it. The examples you gave are related to this, and no more than 100 backlinks per source are covered this way. This is not an issue at all for Google's bots or for the company that did the hiring.

    What actually matters is when you can mass-manipulate the entire keyword search. Anyone types "shopping" and, no matter what results pop up, Flipkart may show up everywhere. Now that is real advertising skill. It obviously requires going beyond access limits, but it is still a marvel. And this type of advertising is only funded by the company itself, i.e. Flipkart. Why would anyone else do that?

    As for deliberate black-hat marketing through fake IDs and posts, the Google bot doesn't differentiate between them. It's impossible to detect whether the user behind a post is a marketing agent or a real person. But yes, repeated content is marked.

    To catch this, you would have to come up with some sort of revolutionary algorithm, if it is to be used in public.

    OR OR OR

    You can still do it. But companies like Google will never do so, since it involves an illegal technique. Let's say you track the IP address of each post that was made with backlinks. If the same IPs keep posting the same headers, they get marked then and there. But gaining access to IPs in a third-party environment is against the law. That is why it cannot be deployed in spiders, and it is the main reason piracy, or black-hat marketing as we say, persists. If you devise a way to extract the IP address just by looking at the posted content, or whatever we give you to judge, you are eligible for a Nobel Prize.

    There are more treasures on this topic hidden in my brain. I could spit it all out, but I can't. I have a sharp eye on the new advertising concept Google is rolling out with Google Glass. Something along the lines of: only if you see an advertisement does it get sanctioned in ad words. We are more concerned about the ill effects. In the course of doing business, they are compromising on a lot of things.

    PS: I have already racked my brain over creating such a crawler script. I even succeeded, but I cannot release it since it violates Internet policies and rules. I'm sure Google has had a similar one for ages. The thing is, they cannot deploy it either.
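The IP idea from the last reply can be sketched in a few lines of Python, under one big assumption: the person running it is the site operator (a forum admin, say) who already holds `(ip, linked_url)` pairs in the site's own logs. An outside crawler does not see poster IPs, which is exactly the limitation described above. The function name, the `min_posts` threshold and the sample addresses (taken from the reserved documentation ranges) are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def flag_repeat_posters(posts, min_posts=3):
    """Count posts per (ip, linked domain) and keep pairs seen at least min_posts times.

    posts: iterable of (ip, url) tuples, assumed to come from the site's own logs.
    URLs without a scheme parse to an empty domain and are simply grouped under "".
    """
    counts = defaultdict(int)
    for ip, url in posts:
        domain = urlparse(url).netloc.lower()
        if domain.startswith("www."):
            domain = domain[4:]
        counts[(ip, domain)] += 1
    return {key: n for key, n in counts.items() if n >= min_posts}


# Made-up log entries for illustration only.
sample_posts = [
    ("203.0.113.5", "http://www.flipkart.com/phones"),
    ("203.0.113.5", "http://flipkart.com/perfumes"),
    ("203.0.113.5", "https://flipkart.com/offers"),
    ("198.51.100.7", "http://flipkart.com/phones"),
]

print(flag_repeat_posters(sample_posts))
```

Grouping by domain instead of by exact URL is a deliberate choice in this sketch, since an agent posting links to different pages of the same site would otherwise look like several unrelated posters.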

You are reading an archived discussion.
