Bill Slawski is the president and founder of SEO by the Sea, and has worked in professional SEO and Internet marketing consulting since 1996. With a degree in English from the University of Delaware and a Juris Doctor from the Widener University School of Law, Bill worked for the highest-level trial court in Delaware for 14 years as a court manager and administrator, and as a technologist/management analyst. While working for the court, Bill also began to build and promote websites, and became a full-time SEO in 2005. He works on a wide range of sites, from Fortune 500 companies to small business pages, and also blogs about search engine patents and white papers on his seobythesea.com blog.
What are the signals that can be used by Panda?
Eric Enge: Let's talk about some of the patents that may be playing a role in Panda 1, 2, 3, 4, 5, 6, 7 and beyond. I would like to get your thoughts on what signals are used to measure either content quality or user engagement.
Bill Slawski: I have been looking at sites affected by Panda. I started from the beginning with remedial SEO. I went through the sites, crawled through them, looked for duplicate content problems within the same domain, looked for things that were indexed that should not have been, and went through the basic list that Google provides in its Webmaster Tools area.
In the Wired interview with Amit Singhal and Matt Cutts about this update, they mentioned an engineer named Panda. I found his name on the list of papers written by Googlers and read through his material. I also found three other tools and systems engineers named Panda, and another engineer who writes about information retrieval and architecture. I concluded that the Panda in question was the one who worked on the PLANET paper (more on this later).
For signals regarding quality, we can look at the lists of questions from Google. For example, does your site read like a magazine? Would people trust you with their credit card? There are many things on a website that could indicate quality, make the page seem more credible and trustworthy, and lead the search engine to believe it was written by someone who has more expertise.
The way things are presented on pages, for example where ad blocks appear, may or may not be signals. If we look at the PLANET white paper, "Massively Parallel Learning of Tree Ensembles with MapReduce," its focus is not so much on quality signals, or even user feedback, but rather on how Google is able to take a machine learning process that deals with decision trees and scale it up to use multiple computers simultaneously. They could put a lot of things in memory and compare one page against another to see whether certain characteristics and signals appear on those pages.
Eric Enge: So, the PLANET white paper describes how to take a machine learning process that was previously confined to one computer and put it in a distributed environment to gain much more power. Is that a fair assessment?
Bill Slawski: That would be a fair assessment. It would use the Google File System and Google's MapReduce. It would let them draw a lot of things into memory to compare against each other, and to change multiple variables simultaneously. For example, a regression-model type of approach.
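The scaling idea described here can be made concrete with a small sketch. The following is an illustration only, not code from the PLANET paper or from Google: it assumes a single numeric feature, a handful of made-up data shards standing in for separate machines, and a fixed list of candidate thresholds. Each "mapper" computes partial statistics for its shard, a "reducer" sums them, and the split with the lowest squared error is chosen without any one machine seeing all the data. All names (`SHARDS`, `map_shard`, `best_split`, and so on) are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (feature_value, quality_label) pairs spread over two "machines".
SHARDS = [
    [(1.0, 0.0), (2.0, 0.0), (3.0, 1.0)],
    [(4.0, 1.0), (1.5, 0.0), (3.5, 1.0)],
]
CANDIDATE_THRESHOLDS = [1.25, 2.5, 3.75]


def map_shard(shard, thresholds):
    """Mapper: for one shard, build (threshold, side) -> [count, sum_y, sum_y_squared]."""
    stats = defaultdict(lambda: [0, 0.0, 0.0])
    for x, y in shard:
        for t in thresholds:
            side = "left" if x < t else "right"
            entry = stats[(t, side)]
            entry[0] += 1
            entry[1] += y
            entry[2] += y * y
    return stats


def reduce_stats(mapped):
    """Reducer: add up the partial statistics produced by every shard."""
    total = defaultdict(lambda: [0, 0.0, 0.0])
    for stats in mapped:
        for key, (n, s, ss) in stats.items():
            total[key][0] += n
            total[key][1] += s
            total[key][2] += ss
    return total


def sum_squared_error(n, s, ss):
    """Squared error around the mean, computed from the summed statistics alone."""
    return ss - (s * s) / n if n else 0.0


def best_split(shards, thresholds):
    """Pick the threshold whose left/right groups have the lowest total squared error."""
    total = reduce_stats(map_shard(shard, thresholds) for shard in shards)

    def cost(t):
        return sum(sum_squared_error(*total[(t, side)]) for side in ("left", "right"))

    return min(thresholds, key=cost)


chosen_threshold = best_split(SHARDS, CANDIDATE_THRESHOLDS)
```

Because only counts and sums cross the shard boundary, the same pattern would extend to many shards and many features, which is the sense in which a tree learner can be scaled out.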
Something that might have been very difficult to use on a very large data set becomes much easier when it can scale. It's important to think about what shows up on your web page as a quality signal.
Their approach is likely to manually identify pages that have quality (content quality, presentation, and so on) and use those as a seed set for the machine learning process. They would then identify other pages and see how well they rank in terms of these different characteristics. That makes it harder for us to determine exactly which signals the search engines are looking for.
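The seed-set idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of what Google does: the two features (`ad_blocks`, `duplicate_ratio`), their values, and the labels are all invented, and the learner is a single-rule decision stump rather than a full tree ensemble.

```python
# Hypothetical seed set: hand-labelled pages with made-up feature values.
SEED_PAGES = [
    ({"ad_blocks": 1, "duplicate_ratio": 0.05}, "high"),
    ({"ad_blocks": 2, "duplicate_ratio": 0.10}, "high"),
    ({"ad_blocks": 6, "duplicate_ratio": 0.60}, "low"),
    ({"ad_blocks": 8, "duplicate_ratio": 0.40}, "low"),
]


def train_stump(seed_pages):
    """Find the single (feature, threshold) rule that misclassifies the fewest seeds."""
    best = None
    for feature in seed_pages[0][0]:
        values = sorted({features[feature] for features, _ in seed_pages})
        # Candidate thresholds are midpoints between neighbouring observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            # Try both orientations: which label goes below the threshold.
            for below, above in (("high", "low"), ("low", "high")):
                errors = sum(
                    (below if features[feature] < threshold else above) != label
                    for features, label in seed_pages
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, below, above)
    _, feature, threshold, below, above = best
    return feature, threshold, below, above


def classify(page_features, stump):
    """Apply the learned rule to a page that nobody labelled by hand."""
    feature, threshold, below, above = stump
    return below if page_features[feature] < threshold else above


stump = train_stump(SEED_PAGES)
unlabelled_page = {"ad_blocks": 7, "duplicate_ratio": 0.20}
predicted_label = classify(unlabelled_page, stump)
```

Note that the rule the learner settles on depends entirely on the seed set, which is one reason the signals are hard to infer from the outside.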
If they are following this PLANET-type machine learning approach with Panda, there may be other things mixed in. It's hard to say. Google may not have used this approach exclusively. They may have tightened up phrase-based indexing and made it stronger in a way that helps rank and re-rank search results.
It seems that Panda is a re-ranking approach. It is not a replacement for relevance and PageRank and the two hundred-plus signals that we are accustomed to hearing about from Google. It may be a filter on top of those, where certain websites are promoted and other websites are demoted based on some kind of quality signal score.
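A filter of this kind can be pictured as a second pass over results that have already been scored. The sketch below is purely illustrative: the URLs, the scores, the cut-offs, and the adjustment factor are assumptions chosen for the example, and nothing here reflects Google's actual scoring.

```python
# Hypothetical results: (url, base_relevance_score, quality_score between 0 and 1).
RESULTS = [
    ("example.com/a", 0.90, 0.20),
    ("example.com/b", 0.80, 0.90),
    ("example.com/c", 0.70, 0.60),
]


def rerank(results, demote_below=0.3, promote_above=0.8, factor=0.25):
    """Scale each base score up or down according to quality, then sort again."""
    adjusted = []
    for url, base, quality in results:
        if quality < demote_below:
            multiplier = 1.0 - factor  # demote low-quality pages
        elif quality > promote_above:
            multiplier = 1.0 + factor  # promote high-quality pages
        else:
            multiplier = 1.0  # leave everything in between untouched
        adjusted.append((url, base * multiplier))
    return sorted(adjusted, key=lambda item: item[1], reverse=True)


ranked_urls = [url for url, _ in rerank(RESULTS)]
```

In this toy case the page with the best base score drops to last place because of its low quality score, while the underlying relevance scoring is left unchanged.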
Eric Enge: That's my sense of it too. Google uses the term classifier, so you could imagine that, either before running the basic algorithm or after, it acts like a scale or a factor up or down.
Bill Slawski: Right. That's what it seems like.