I started a little miniseries last week talking about the types of online personas and yesterday I continued with why Google would want to know if you were human or not. I just want to finish the thought concerning online personalities here by cataloging a few of the potential ways that humans are separated from automated processes in the online world.
The effort to separate artificial intelligence from human intelligence starts around 1950 with Alan Turing and the Turing Test. What was a somewhat fantastical notion then is commonplace today. Often when entering form data you are challenged to solve a small puzzle to prove you are human. In an extremely interesting twist, researchers at Carnegie Mellon developed reCAPTCHA, which not only identifies humans but slyly uses them to transcribe scanned text. For example, a great deal of the New York Times is not in digital format. We can try to scan this text, but because computers can’t always read text (or the print may be slightly garbled), some of it doesn’t get recognized. These bits are placed alongside CAPTCHAs for human transcription, a brilliant bit of crowdsourcing developed in part by Luis von Ahn.
The strange coincidence of simultaneously using computers to test humans and humans to test computers has led to some interesting modern applications of process filtering:
- Ad companies can use CAPTCHA for branding.
- Twitter users can force followers to validate through an email response mechanism.
- Of course, everyone loves kittehs.
- CAPTCHA has even led to entire communities of mock-worship and a new religion.
- Not everybody likes CAPTCHA, and there are alternatives.
- But spammers have attempted to harness cheap labor to circumvent Turing Tests.
- They’ve also harnessed simple motivation techniques.
- And CAPTCHA has shown flaws in maintaining a consistent filter of automated processes.
Once again, we see that using machines to identify humans turns into an arms race. Spammers even fight back using the exact same conceptual approaches, like bypassing email spam filters using images.
I love this quote from an interview in Walrus Magazine with Luis von Ahn, that he had “unwittingly created a system that was frittering away, in ten-second increments, millions of hours of a most precious resource: human brain cycles”.
This is exactly what Google and other search engines need in order to find the best content on the web: the real value of human brain cycles. We’ve seen the flaws of humans identifying machines, and likewise machines identifying humans. Until automated processes become indistinguishable from humans, the solution may simply be to let humans identify humans through actual interaction, one human being at a time.
I just finished reading this great post on the spam arms race problem that Google and other search engines have. It is a great summary of the problem in general. A really interesting implied point of the article is that in some ways the algorithmic approach to search indexing is impossibly flawed. It puts algorithm gamers eternally ahead of great content, because people focused on great content aren’t gaming algorithms (or, more realistically, they are afraid to because they might get blacklisted or penalized). So there will always be some latency before great content rises to the top of search results while search engines work to filter out gamed results.
Near the end of the post he says:
“The good news is that webmasters who don’t invest in gaming will still see the best long-term results. Focusing on quality, basic promotion through guest blogging or social sites, and honestly providing value will get organic results over time – and won’t be tossed to the sidelines with algorithm updates.”
Why? I mean, we all hope this is true, but what is the actual next step in addressing the problems Rob raises? What advantage could search engines get back over the algorithm gamers? There are human indexes, but they can’t seem to keep up with all the content on the web, especially emerging, fresh, or daily content. One solution might be to crowdsource the problem. I think Google has attempted to do this a little bit with its +1 button. But this could be gamed as well by sending out bots to vote up content.
In every case, the problem keeps coming back to automated processes. And there are only two solutions to this problem that I can see:
1. Build an automated process to identify automated processes. Presumably automated processes have patterns or signals to them that can be identified. Content that is promoted or generated by an automated process could potentially be given lower value by a search ranking system.
2. Somehow harness the abilities of real human beings to help identify valuable content.
The problem with #1 is that it remains an arms race. With each development comes a new battleground. With every adjustment Google makes comes a counter-adjustment by anyone looking to game them. #1 will presumably always be a part of the search indexing process, but it alone is not enough.
#2 is the answer, because at the end of the day humans know what they want. They may even *want* what spammers are trying to push on them. But Google doesn’t necessarily care about that; it just wants to get searchers paired with what they are looking for. So by threading real human feedback into the quality of the results, you can ensure that stuff people don’t want gets marginalized or de-prioritized.
Of course easier said than done.
The gamers may create an army of things that look human, or even employ cheap labor in order to create networks of people that are directed to promote their product, creating an artificial demand that would get noticed by Google.
Last Friday I talked about the different kinds of personas that may exist on the web. Tomorrow I will take a look at the different ways human beings can be validated and therefore used to improve search quality.
The way we interact with each other within a digital space is sort of incredible if you take a step back and think about it. It removes all physical time and space and replaces these concepts with new definitions, new ways of moving around, and new types of engagement, all of which were impossible before.
On Google Plus I can socialize with someone who doesn’t even speak my language as if they were just another friend. On a place like Reddit, I can sign up for an account with a minimum amount of detail and instantly have a brand new persona, history-free. A new start on digital life or a way to separate my real life from my false one. Lately we’ve seen groups of users harness the concept of an anonymous mob into the form of a powerful, anti-corporate vigilante.
The arguments over who you are and the ethics of purposefully shedding your “true” identity are of primary importance to how we interact digitally. On the front lines of this at the moment are Google, its new social network Google Plus, and a vocal group of users.
In this context, there are essentially four different kinds of identity within the digital world:
1. Named Persons – this would be a direct mapping of a real-world person to their online presence. The level of detail given might be extremely minimal, gratuitously unnecessary or anywhere in between. An offline parallel would be a driver’s license or a passport.
2. Pseudonyms – an identity taken up that is usually intentionally separated from the offline person. These are useful when discussing sensitive topics, protecting identity or just providing a new outlet of expression wholly unattached to an offline history. Pseudonyms generally intend to have their own, separate history, almost like the creation of a new person. In the offline world, these are often pen names, dummy corporations or front organizations.
3. Anonymous Personas – unlike named persons, anonymity means no tie to an offline personality, but also unlike pseudonyms there is no intent to establish a new personality or even have a history. Anonymous could be anyone. Websites like 4chan demonstrate what anonymous communities with extended lifetimes can develop into. Offline examples include symbolic uses, anonymous donation, and protection of sources in media.
4. Non-humans, Bots, or Automated processes – In this context, these are programs that are written in order to accomplish tasks that are either well-defined, repetitive or technological. In most cases within this context, these programs are made to look like humans. An offline example that I found yesterday is this actroid, which at the moment is better at acting human than performing tasks. Automated technology is certainly improving, though. Meanwhile AI is a little behind.
What does any of this have to do with SEO? I think identity is important to SEO for two reasons: search indexing and search quality.
If you didn’t know already, web crawlers index the web for search engines. Since the web is so big, automating the job of indexing what’s out there makes sense. These web crawlers are part of group #4, especially in that they are an attempt to mimic human behavior: how would a human navigate the page and how would a human understand what the page is all about. In this context web forms are an excellent illustration of the problem behind this methodology. Bots don’t have personal data. They aren’t human. So how can a search engine truly index the content on the web that requires human information or personal data? Back in 2008 Google announced it was experimenting with forms, but we haven’t heard much about this since.
One cool side note for technical SEOs, you can now fetch your website *as* Googlebot to see what your website looks like to the crawler.
But the real interest here is search quality. Search engines have a vested interest in identifying the four identity types above.
The most obvious of these is #4 again. Generally speaking, if people can write bots that look like humans, they can create a false market of demand for content. If the search engines don’t identify these false markets, their search results could be things that aren’t what people really want, or simply biased toward one product or service.
At the moment however, Google is particularly interested in identifying the difference between the first two: named people and pseudonyms. Google has stated they are an identity service and that Google “works best in an identified state”. Twitter, Facebook and other networks have also encountered this problem, especially with celebrity impersonators, often resorting to “verified accounts”.
But I think Google has a bigger problem on its hands that it isn’t discussing openly. My belief is that Google wants to use individual opinions as a major search ranking signal. This could significantly improve search quality: if real humans are helping shape the results in addition to bots, we can index the web much better (even the stuff behind forms!). Individuals are becoming more prominent on the web (think about blogs becoming as popular as commercial websites). If links from websites were a way of understanding value in the last decade, links (or referrals) from individuals could become a big part of understanding value during the next decade.
But what if people make TWO profiles? Or five? Or 500? Will they have a greater voice? If Google allows pseudonyms it has to account for this problem. Google has already dealt with this problem once, in the form of websites proliferated purely to create links. I don’t think they want to deal with it again.
It seems to me pseudonyms, like websites, could have their own PageRank. Who cares if it’s a pseudonym? What really matters is whether or not that pseudonym is human.
Are you scouring the Internet for quality backlinks, looking for websites with good PageRank and possible linking opportunities? Are you TIRED of all the HASSLE of sending off email requests for links, forming real relationships, and socializing with actual people who share your interests?
Well look no more. I promise you GUARANTEED results for ZERO effort on your part. No strings attached! Hundreds of thousands of free backlinks to your site at NO COST TO YOU. You’ll be on the top of every search result for every keyword search term ever.
Ahhh, what’s the catch? How do I do it? NO CATCH. Here’s how to do it. Click on the links below:
Then just start signing up.
Then, while you are doing this, start wondering to yourself: why doesn’t everyone do this? What if my competitor used the exact same “free backlinks” tool as I did, who would rank better? Hmmm. Not sure. I need the one tool my competitor doesn’t have. But, if all these tools are easy to find on Google, how can you possibly get ahead in the game? I guess you can just hope it’s a little secret between you and… everyone doing searches on Google.
You see where I am going with this. We need to find ANOTHER way to automatically generate backlinks without any work that our competitors don’t know about.
Or, maybe not. Because if you think about it, using an automated process to generate links means you are doing something that has a pattern to it. Sure, you can randomize three different ad copies and three different keywords, but it becomes self-defeating because you want a LOT of backlinks. The more backlinks you get, the more easily identifiable the pattern. The more “value” you get out of automated generation, the more likely it is that the cheap promotion of your site will be seen by Google as not a real promotion or relationship. It may even be seen as just paid links. Because that is essentially what it is, right? You are either paying for a product or using a service that generates wealth off of generating links on the Internet. They aren’t providing anything of real value here. It’s a proliferation of links, nothing more.
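To make the pattern idea concrete, here is a toy sketch (purely illustrative, not any search engine’s actual method; the function and sample data are invented for this example) of one signal an automated link campaign leaves behind: anchor-text concentration.

```javascript
// Toy signal: automated link campaigns tend to reuse the same "money keyword"
// as anchor text, while natural links use varied anchors. Concentration is
// the share of links that use the single most common anchor text.
function anchorConcentration(anchors) {
  const counts = {};
  for (const a of anchors) counts[a] = (counts[a] || 0) + 1;
  const max = Math.max(...Object.values(counts));
  return max / anchors.length; // 1.0 means every link uses the same anchor
}

// Invented link profiles for a hypothetical site:
const natural = ["Acme", "acme.com", "this article", "Acme widget guide", "source"];
const automated = ["buy widgets", "buy widgets", "buy widgets", "buy widgets", "Acme"];
```

A real classifier would combine many such signals (link velocity, source diversity, and so on), but even this one shows why scale works against the spammer: the more links the tool generates, the sharper the pattern becomes.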
All of this backlink generation really kind of reminds me of snake oil. Or get rich quick schemes. If you just stop for a second and think it through, this probably isn’t going to work. OK well maybe it sometimes works for a short period of time. But there are quite a few risks to consider:
1. The links you are generating are not carefully examined relationships. With some of these tools you don’t even know where your links are coming from. Are you really comfortable with that? Blindly acquiring links from potentially bad websites that may actually make your site look *worse* than it did before?
2. What if a lot of the websites you are getting links from are blacklisted by Google all of a sudden? That can’t be good. Whether or not that happens is irrelevant. The point is you simply don’t know who you are associating yourself with.
3. Automated processes are the search engines’ specialty. They have elite engineers (getting paid big cheddar) to use algorithmic processes to identify good content. You don’t think they can identify a process some guy built in his free time?
4. Google and Bing don’t want links generated this way to promote content in their search engines. Why? Because it is not a real signal about the value of your website. It’s a generic promotion that anybody can use from the crappiest of websites to the best of websites. It isn’t a differentiation. Why would they assign any value to links acquired this way? It lowers the quality of their business model.
5. Opportunity cost. Why spend even an extra thought messing around on this when you know it can’t be real? Spend time on growing the value of your business. Developing real online relationships will bear more long term fruit than these schemes ever will. Yes, it takes some work. So what? Just a thought here: if you are not happy doing what you are doing maybe you should rethink a few things.
I am not here to say this won’t move you up in search ranking. But if you’ve read this far and are still not convinced, drop me an email and I will set you up with a super secret automated backlinking tool that only you and I know about. Just $5.99 a month and a small startup fee.
And by opportunity I mean the situation was pretty bad. It was actually a great example of how to royally screw up your search visibility so I thought I would catalog the things that were wrong. Most of these were really easy things to fix as well.
Problem: Duplicate Content
Severity: Very High
Cost: Very Low
Originally, their brand URL (let’s say it was www.brand.com) was not available, so they bought something close, like www.mybrand.com. When their brand URL became available they copied the entire website verbatim to the new URL without doing anything with the old website. So now there are two exact duplicate versions of their website out on the net.
Duplicate content should be avoided because it waters down the value of *both* versions of the page. It raises search indexing questions of originality, authorship, and ownership of content. Here is Google’s official page on various kinds of duplicate content and how to deal with potential scenarios.
In this scenario, you want to 301 redirect all pages on the old site to the corresponding new page on the new site. This will let the search engines know the content has moved permanently.
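On an Apache host, for example, the old site’s .htaccess can do this in a few lines (a sketch assuming mod_rewrite is enabled; the hostnames are the placeholder ones from above):

```apache
# .htaccess on the old site (www.mybrand.com):
# permanently redirect every path to the same path on the new domain.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?mybrand\.com$ [NC]
RewriteRule ^(.*)$ http://www.brand.com/$1 [R=301,L]
```

Redirecting page-to-page (rather than everything to the new homepage) preserves as much of each old page’s value as possible.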
Problem: Homepage Redirect
Severity: Very High
Cost: Very Low
When you type in the URL there is a redirect to another page. Something like:
http://www.brand.com/ redirects to http://www.brand.com/brand/content_web_default.html
There is no need for a redirect in this situation. Just host your homepage content at the root. The way that you organize your content and put it into folders is important to both users and search engines. It helps them understand generally where they are on your website in relation to your homepage. Further, this website is using a 302 redirect, awkwardly informing the search engines that the homepage content is in a temporary location. The homepage is generally your most important page, there is nothing temporary about it or its location. Finally, when you do put content into folders on your website, avoid redundant path names. In this case there is the folder “brand” off the root. Each folder should categorize your pages if possible using keywords to describe the content. It is helpful to users and search engines alike. Here is a great SEO Moz page on appropriate use of redirects.
Drop the redirect entirely and host the homepage content at the root.
Problem: Only 5 inbound links, 4 of them link to a PDF that doesn’t exist
Severity: Very High
Google built its original algorithm on backlinks. Since then we’ve only seen how important they are. In 2007, Google’s webmaster tools started providing tracking tools for backlinks. So having only 5 backlinks is tough, and having 4 of them point to a PDF that is missing is even worse, especially since this document is a critical part of their business. Easy enough to fix the 404s. Getting more backlinks is a tougher problem that I won’t get into on this post.
Problem: No Meta tagging or H1 tagging, only brand in the title
This is one of the most common SEO best practices. Your title tags should contain keywords that you want to associate with your company. Make use of Meta tags to describe your page content. The Meta description tag can end up being ad copy for your search listing. Describing your images using the ALT attribute is an opportunity to provide more descriptors about your content. And H1 tags are the right way to highlight and summarize the copy on page.
In this case, only the brand was in the title. I would add keywords to each page title getting a little more specific about what each page is talking about. The brand is fine but it doesn’t help searchers understand what each page is about.
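As a sketch, the relevant tags on a page for a hypothetical widget-repair company (all names and keywords here are made up) might look like:

```html
<head>
  <!-- Keywords first, brand last -->
  <title>Widget Repair in Chicago | Brand</title>
  <meta name="description" content="Brand offers same-day widget repair in Chicago. Free estimates.">
</head>
<body>
  <!-- One H1 summarizing the copy on the page -->
  <h1>Same-Day Widget Repair in Chicago</h1>
  <img src="repair-team.jpg" alt="The Brand repair team fixing a widget">
</body>
```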
Problem: No unique copy
This company’s copy was generic and repeated throughout the web. There was nothing unique about the website. It essentially offered no value beyond other sites on the web. This is another tough problem to address, similar to backlinks, but it’s probably the most important thing about any website: what value are you offering? In this case, they should spend some time talking through what the company brings to the table and then develop that content for the website in order to improve search visibility.
Problem: No robots.txt
Cost: Very small
The robots.txt file is an easy way to help make your website search friendly. It allows you to tell search engines what you do and don’t want indexed. Being clear upfront with a robots file is simply best practice and it is very easy to create and add.
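A minimal robots.txt (the disallowed path and sitemap URL here are hypothetical) might look like:

```
User-agent: *
Disallow: /admin/
Sitemap: http://www.brand.com/sitemap.xml
```

The Sitemap line also points crawlers at the XML sitemap discussed below, so the two files work together.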
Problem: No XML sitemap
The website had an HTML sitemap, one for humans to use, but no XML sitemap. Googlebot and other search engine web crawlers will look for an XML sitemap to see your content and how it is laid out. It is really easy to generate an XML sitemap.
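The format is simple; a minimal sitemap for the example site (the URLs are placeholders) would be:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.brand.com/</loc>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.brand.com/services.html</loc>
  </url>
</urlset>
```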
Problem: No Google analytics
If you aren’t tracking your visitor activity you don’t know if you are doing things well or not. Google Analytics is easy to implement and use. And best of all, it is free.
The guy who circled him commented that he thought they had some things in common, to which my friend replied “How’d you even find me?”.
A while back another friend, we’ll call her Ms. Prim, commented that I was the only one posting in Google Plus, and she didn’t understand why anyone would be into it. (Apparently I’m not interesting enough to be someone’s entire social stream. That is a tough thing to realize.)
What I am noticing here from some of my friends is that their expectations have not only been set but *solidified* by other forms of social media (Facebook in particular, but also Twitter and blogs, which I’ll get to shortly). For Ms. Prim, nobody that she has circled is posting anything except me. It does not mean there isn’t content out there; it just means she doesn’t know where to find it, or who to circle. Or maybe she doesn’t even care to find it. Essentially nobody in her circles has “moved” from Facebook onto Google Plus, so Google Plus has no value for her at the moment. The value is entirely based on community usage (see MySpace).
For Mr. Onion, Facebook’s walled garden has been so imprinted that he likely didn’t realize how public his posts/comments really were. Since Google indexes public G+ posts (your posts don’t have to be public, btw), anybody can search the web and find what you are talking about. Facebook very carefully keeps this data hidden from “public” consumption. Facebook recognizes this data as their exclusive value and they view opening it up as a dilution.
However if Mr. Onion had posted on Twitter and got a reply from someone he didn’t know, would he be surprised? Or if he posted his thoughts on his blog and some anonymous person surfed through and commented I don’t think he would have blinked. The dynamics of Twitter and blogging are from the onset a conversation with the entire web. Facebook is just a conversation with your friends. Facebook is of course very slowly growing the size of that conversation. You can interact with brands, restaurants, politicians, etc… But it is still very much a one-way selection of content. The Facebook user carefully chooses what they want to see, and generally does not want to *be seen*.
Thinking along these lines, Facebook is like a television set. You have your friends over and are watching TV. Since it’s your house, you have the remote.
When you microblog on Twitter though, it is a much more public activity. The expectations are that you are standing on a podium in Public Square, and anybody can stop by to hear you or they can simply choose to ignore you. One potential problem with Twitter is that it is a bit like doing this in the middle of an outdoor concert or an amusement park. There are many conversations and forms of entertainment happening simultaneously. Everyone is talking to everyone else about everything at once. It is pretty hard to hear a focused conversation.
The blogger gets to keep the conversation focused on his comment thread. Like Twitter it is public; however, blogging is more like a town hall meeting where there is a smaller, identified group of people having a focused conversation. Anonymous people can submit questions if they like through a forum moderator. But who are these anonymous people? Why should anyone care what their thoughts are if they are not willing to take ownership of their contribution to a focused conversation?
This leads to the very hot topic on Google Plus at the moment: Nymwars. As I have talked about before, in Google’s eyes reputation is an increasing concern. They need *some* kind of reputation when attempting to identify relevancy and eliminate spam. People with nothing at stake have nothing to lose and therefore zero accountability for their actions. But this is not to say there is no value in anonymity or pseudonymity. Those concepts are just really difficult for Google to solve. This post on Google Plus is an amazing summary of a lot of the issues in play and how Google might be struggling with them. UPDATE: Kee Hinckley makes a very strong case here for pseudonyms.
I don’t see any reason why Google can’t eventually solve the pseudonym problem. Pseudonyms can still be assigned reputation, just like any Twitter account or Facebook account. Whether social media can or even needs to handle the anonymity issue is another topic of conversation.
Thanks to LucidChart.com, I drew a Venn diagram summarizing what I think are the dynamics of Google Plus. This chart of course presumes Google eventually solves the Nym problem.
By having an “Interactive Nym” (essentially a public persona like a Facebook login, Twitter handle or Google profile), we are able to level the field of credibility between online social interactions. We can each assign some sort of value to that persona, allowing us to take recommendations from it, know more about its intent, or essentially have more context when we engage. I don’t think online personas are required for anybody. On the contrary, there is a huge need in many cases to take advantage of anonymity as well as pseudonymity. Ideally we’d be able to take advantage of all forms of identity for any given context.
The way I see it, Google Plus does all of the things that the previous modes of social publishing did not cover entirely. I guess it remains to be seen whether or not Ms. Prim’s friends care.
Yesterday a friend of mine had some questions about installing the newer asynchronous version of Google Analytics and our conversation raised a few concepts that I thought were interesting.
First off, I’d like to say that Google Analytics is a great tool for anyone involved with websites, but especially if you are interested in SEO. You can access traffic source data like what keywords people are using to get to your site, or segment your search customers by any metric you can think of. Plus it’s free. Well, it’s Google’s version of free in that you turn over your data to them in exchange for the service.
Anyway, my friend was trying to find out if the newer script would only be compatible with browsers that support AJAX. (In case you were wondering, AJAX is a way to make server calls without reloading the entire webpage.) As it turns out, the asynchronous tracking code doesn’t use AJAX at all. The “asynchronous” aspect of the tracking code has to do with the way the page loads.
Page loading is an important part of SEO. Websites that load fast quite obviously offer a much better user experience than slow sites that take forever to load content. And Google announced that site speed was one of their search ranking factors in April of last year.
The new asynchronous tracking code attacks this problem in several ways:
- It starts packing user data into a queue so that when the JavaScript code is finally ready it can send everything immediately to Google. This helps address the problem of the tracking code not always being loaded.
- It dynamically adds the <script> element to the DOM, which works around the download bottleneck.
- It adds the async=true attribute to the <script> tag. This attribute was added in the HTML5 spec, so only browsers that support HTML5 can make use of it. It redundantly works around the download bottleneck, but Google is obviously looking forward here, hoping to keep the tag as efficient as possible before having to make another change to it.
One thing to note, Google recommends putting this tracking code in the bottom of the <head> element. This is an important point from a page load time perspective in itself. You should always load your CSS, Meta tags and other header elements before any script resources so that they don’t slow down the page load process.
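For reference, the asynchronous snippet looks roughly like this (UA-XXXXX-X is a placeholder account ID); you can see the queue, the dynamically created script element, and the async attribute all at work:

```html
<script type="text/javascript">
  // The queue: _gaq collects commands until ga.js finishes loading.
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-XXXXX-X']);
  _gaq.push(['_trackPageview']);
  (function() {
    // Dynamically add the <script> element with async set.
    var ga = document.createElement('script');
    ga.type = 'text/javascript';
    ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') +
             '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0];
    s.parentNode.insertBefore(ga, s);
  })();
</script>
```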
While preparing this post I found a cool post from a guy who decided to further optimize the actual tracking code itself. While I am not sure this gains most sites too much, especially if you compress your resources, I think it’s a nice walkthrough for anyone interested in how the tracking code works.
Finally, here’s a usage guide for asynchronous tracking.
I have noticed some recent shifts in Black Hat Marketing and wanted to offer my take on how black hat tactics affect the broader world of online marketing.
Google and other search engines have a strangely unique product that they offer:
- It is dynamic in nature as opposed to static products like movies, toys, or power drills. Every search can be different.
- It is near real-time as opposed to billboards, phone books, hamburgers or grocery stores which offer dynamic products (in that they can vary in quality or value) but at a very slow rate of change.
- The composed whole of their offering can be contributed to by anyone with a website. This differentiates it from other real-time dynamic services like airline ticket or insurance shopping websites where the product is predefined by a closed group of individuals. In a strange sense, a search engine’s offering is “open source”.
Search engines are real time brokers with a nearly flat supply chain. The algorithms handle all the hard work that once had to be done by humans: fetching all of the major products and presenting a tiered recommendation, like Consumer Reports.
Because of this, Google’s *brand* is open source. And in more ways than one. Not only is their offering affected by the moment’s choice of websites, but also by the way the algorithm does that selection. If the algorithm is known, Google’s brand is at the mercy of outside actors. And guess what, people won’t use Google if it doesn’t fetch them what they are looking for.
I happened upon a great discussion of white vs black hat with Rand Fishkin where they each argue the merits of their particular discipline. The one thing I think Rand never fully articulates is how important brand is in all of this. Both Google’s and everyone else’s. Because Black Hat is the enemy of brand. Both Google’s and everyone else’s.
In economic terms, Black Hat activity essentially creates an artificial demand around a website. With link buys, there aren’t actual people out there “voting” for a website with links. And with content scraping the website isn’t the one actually producing the valuable content. So its value is probably not great. Sure, you can have use for so-called black hat tactics when your product has actual, real value. This article does a great job cataloging legitimate SEO techniques that might be considered black hat. But most of the time we’re looking at get rich quick schemes. And there’s a reason the snake oil salesman’s got to get out of town fast.
He is about to have a terrible brand.
Identifying and classifying brand is of course tough, but necessary. It’s how Google keeps the snake oil out of the results. This is what I think Rand was getting at: a cultivated brand is worth mountains more than any number of individual cash grabs. If you treat your customers right, your level of effort decreases over time – your customers start promoting for you. Earl Grey’s counter-argument, that you can make a long term strategy of many cash grabs, falls pretty hollow against this. Who wants to spend their entire life hustling endlessly?
All of this brings me to the new black hat, which I believe looks more and more like it will sidestep this entire argument. Instead of worrying about brand or cash grabs, these guys spend their time coming up with technical ways to just outright steal the customer. This presents a strange new quandary for both Google and its users: Google has not been gamed, nor has it returned a bad result, but the theft will still reflect poorly on Google even though its involvement is highly indirect (the most targeted websites are the ones Google currently favors).
Overall, the black hat effect creates a difficult problem for small businesses and startups (who don’t have as much brand), and for companies that cannot afford great IT support to protect themselves from search redirection. The more these companies can focus on brand development, the better off they are going to be in the long run. I mean, a great brand can attract some great IT, right?
As an interesting side note, it seems like most black hat activity happens around darker markets like pharmaceuticals, gambling and pornography. Perhaps the conversion rates are better when the customers don’t care about brand…
Today I started out with last Friday’s Whiteboard Friday on SEOmoz. It was a good recap of the value of PageRank and what it is (and isn’t) useful for. Rand likes to emphasize that PR is not a key performance indicator but rather an overview-type metric. He also emphasizes how it measures inbound links without really speaking to their quality or relevancy. I like PR as a first look at the general value of a website. Combined with something like Alexa Rank or Compete, you can get a great first read on the popularity of a website.
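For the curious, the core idea behind PageRank is just a link-voting calculation. Here is a minimal sketch of the classic power-iteration formulation on a made-up three-page link graph; the page names are illustrative, and the 0.85 damping factor is the commonly cited default, not anything specific to Google’s live system:

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with an even split
    for _ in range(iterations):
        new_rank = {}
        for p in pages:
            # Each page that links to p passes along an equal share of
            # its own rank -- inbound links act as weighted "votes".
            inbound = sum(rank[q] / len(links[q]) for q in pages if p in links[q])
            new_rank[p] = (1 - damping) / n + damping * inbound
        rank = new_rank
    return rank

# Toy graph: "home" collects the most inbound votes.
graph = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # the most-linked-to page comes out on top
```

Notice what this does and doesn’t capture: it counts and weights inbound links, but says nothing about whether those links are relevant or trustworthy, which is exactly Rand’s point about PR being an overview metric.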
But while surfing the SEOmoz blog index, the words Link Building caught my eye. All SEOs seek this elusive art form. It reminds me of get-rich-quick schemes: they never really work. Link building is hard work. I started with this really great (but long) post of tips for good link building. It led me (through Justin Briggs’s website) to a good short interview on link building with Melanie Nathan. Brief but informative, she talks about paid links and offers advice to new SEOs.
Finally, she pointed to another article from two years ago on Reciprocity Linking. I really like this approach, and I generally practice it myself. Note this is not just *reciprocal links*; it’s almost more of an attitude or way of being. The line of thinking goes: offer value and you get value in return.
It’s funny, because this exact same thing happened to me last week when I blogged about spam. I mentioned how Jay Purner got his content nabbed. He occasionally blogs for AdHub.com, a local advertising network for the northeastern United States. The next day, Walter Ketcham from AdHub called wanting to chat about what happened. We exchanged thoughts, I gave him some feedback on his website, and he gave Productive Edge a listing on his. Beyond all that, we simply connected and had a nice chat. Now, a link from AdHub might not be a link from DMOZ, but it still reflects an offline connection and an online relationship. That wasn’t so hard, was it? Maybe he’ll let me guest blog on his site next?
As time goes by, there is an increasing intensity on the Internet surrounding online reputation and privacy. Since the Internet is interactive by nature, you can hardly make use of it without leaving some kind of footprint of your online behavior. This frightens some people because they are used to one-way media: TV, radio, magazines, things to be consumed in one direction. We frequently see a level of paranoia around how much of our personal life is available online.
But a lot of the value of the Internet comes from an exchange of information. If you shop at Amazon, they need to know where to ship your stuff. If you want to find your friends on Facebook, they need to know where you went to school, what email addresses are in your contacts, or something else about you. These links and connections are what make the web. In SEO, link text and metadata are important ways of not only establishing a connection but also conveying information about that connection. In fact, as the web evolves it will only accumulate more information about the meaning of these connections.
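To make the link-metadata point concrete, here is a small sketch, using Python’s standard-library HTML parser, of pulling out the anchor text and `rel` attribute that describe a connection rather than merely create it. The sample markup and URLs are made up for illustration:

```python
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collect (href, rel, anchor text) for each <a> tag -- the pieces
    of link metadata that say something *about* the connection."""
    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = dict(attrs)
            self._current = {"href": attrs.get("href"),
                             "rel": attrs.get("rel", ""),
                             "text": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None

# Made-up sample markup: the anchor text and rel value carry the meaning.
parser = LinkParser()
parser.feed('<p>Read this <a href="https://example.com/seo-guide">'
            'great SEO guide</a> or this '
            '<a href="https://example.com/ad" rel="nofollow">sponsored link</a>.</p>')
for link in parser.links:
    print(link["href"], link["rel"], repr(link["text"]))
```

The anchor text tells a search engine what the target is about, and a `rel="nofollow"` tells it the link shouldn’t count as an endorsement: two small examples of metadata enriching a connection.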
But there also seems to be an equally strong rise in anonymous behavior on the Internet. Groups like Anonymous and websites like Wikileaks seek a web that strips associative metadata away from content. There is also the ever-present concern over how much personal or financial data businesses retain about their customers.
Despite these diverging directions, both kinds of content have serious value on the web. Content that is connected to real people can yield a rich experience driven by trusted sources and common interest. We see this all the time with personalized web experiences being offered as a desirable feature.
Anonymous information and behavior allows people to explore controversial topics, provide transparency, or empower individuals. Protection of sources is a prime example of the value of anonymous information.
At the intersection of these two concepts is SEO. Search engines will have a hard time knowing what content is valuable to users without knowing the reputation of that content or its authors. For most information, the metadata surrounding the content helps determine this: who created it, what people are saying about it, and what people normally say about the producer of the content. In this sense it’s important to know not only the author but also the people who recommend the content. We trust our friends and our heroes.
For anonymous information, since we don’t know the author, the reputation of the recommending source becomes all the more crucial. As this blog goes on, I will keep exploring this theme: online user profiles will become an increasingly important element in determining search results.
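One way to picture the idea is a score that blends an author’s reputation with the reputation of whoever recommends the content. To be clear, this is a toy model of my own; the function, weights, and inputs are entirely made up and not anything a real search engine does:

```python
def content_score(author_reputation, recommenders):
    """Toy model: blend the author's reputation with the average
    reputation of the people recommending the content.
    The 50/50 weighting is illustrative, not a real ranking formula."""
    if not recommenders:
        return author_reputation
    recommendation_signal = sum(recommenders) / len(recommenders)
    return 0.5 * author_reputation + 0.5 * recommendation_signal

# Anonymous content (author reputation 0) can still score well
# when trusted sources vouch for it.
print(content_score(0.0, [0.9, 0.8]))
```

The point the sketch makes: when the author is unknown, the whole score rides on the recommenders, which is exactly why their reputation becomes crucial for anonymous content.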