Filed under: General, LSI/LSA, SEM Research by Stuntdubl SEO at 10:08 am, 12/21/2005
Thanks to Rand for pointing me in the direction of the staff of Fortune Interactive (including Andy Beal and Mike Marshall). SEM Logic is a brilliant piece of programming done by a guy nice enough to talk to me for 25 minutes even though he knew I was just “kicking his tires” (and I really dislike it when people do that to me).
How sweet is it to be able to conduct semantic analysis both on and off the page? It’s a really nice graphical and statistical implementation of SEO principles. What could be better than this? It’s a brilliant piece of technology from someone who has been trained to reverse engineer scientifically and take a lot of the guesswork (gut feel) out of the optimization process. Very nice work that I’m sure would provide very useful data for nearly any site.
Among other things, the software is able to give information on:
- Document relevancy
- Supporting terms relevancy
- Competitive analysis
This in and of itself doesn’t really do the tool justice, and information can be gathered for both on page AND off page optimization. If you could benefit from this software (and aren’t just a filthy tire kicker like me) be sure to give these guys a jingle.
Disclaimer: This is not an advertisement; I’m just impartially impressed with someone’s SEO knowledge, its application, and their business acumen.
Filed under: General, LSI/LSA, Mr. Ploppy, SEM Research, Tools by Mr. Ploppy at 8:02 am, 4/11/2005
You want higher rankings, but you just want to press a magical button and see the traffic roll in. The truth is you’re probably not gonna get anywhere near that magic button, so you had better start building two things - content and links. Here are some tools that will help in the process. Links are King! …but content is still important. Learn how to use your links wisely - lots of links - lots of deep links - lots of anchor text randomization….LOTS of quality ON-TOPIC LINKS!
Brand New Tool! - WebuildPages Query Combination Tool
Link Tools
Content Tools
Extras: A DMOZ keyword tool list
B-spam goes mainstream
These tools brought to you by the number 5 (tool that is)
If you have tools that you would like to see Mr. Ploppy list, please e-mail: ploppy@stuntdubl.com
Tag: SEO Tools
Filed under: LSI/LSA by Stuntdubl SEO at 7:53 am, 4/6/2005
With all the talk of patents and search technology development lately, I wanted to have this to go back and reference. The Applied Semantics White Paper is fairly readable, and definitely worth a good perusal. If you can’t seem to get it, try the Google cache.
Thanks to Marcia for the tipoff.
ResourceShelf Original Press Release
Tags: Applied Semantics
This has been a post I’ve been procrastinating on for quite some time now, but with the release of the most important SEO related document in the last 5 years according to WebGuerrilla, I thought it was about time to forge ahead, and at least write SOMETHING on this topic. Here is the Google patent that is creating the big stir. A cold war has been raging for quite some time, and it is only getting more intense.
There are some definite gems being extracted from the newly released patent, and I imagine more good working hypotheses will come from it. Here’s the roundup from around my favorite webs of information:
The thesis of this post is that there is an escalating information arms race between SEO’s and search engines. The latest patent release from Google is a prime example. While it still reads like a glorified wishlist of functionality, I believe that a lot of it has already been implemented or will be shortly.
Some of my favorite quotes so far about the patent:
- 63. The method of claim 62, wherein adjusting the ranking includes penalizing the ranking if the link churn is above a threshold.
Game on.
Later thoughts: [0090] Additionally, or alternatively, search engine 125 may monitor time-varying characteristics relating to “advertising traffic”…
Let me summarize: We will favor pages that contain AdSense, and we will penalize pages that contain ads from our competitors. Don’t even think you can cloak in order to prevent us from seeing our competitor’s code on your site. That won’t work because we will be using our millions of toolbars to track what your visitors are clicking on.
-WebGuerrilla
- Maybe the patent is a double sly. Good way to throw everyone off the real scent

Filed it 1 day too early. - SlyOldDog (who also did a very nice roundup in msg #36)
- I don’t think any one thing like a short registration is going to penalize you. The problem would come in if you have a series of things that look “less than desirable” from the algo’s point of view.
The legal firm Harrity & Snyder LLP is registered thru 2010, coincidence?
- graywolf
- I’ve already got a plan on bookmark spam
but I’ll keep that sucka to myself
- seomike
- Knowing there are others as paranoid as me is assuring. - caveman
To properly understand the scale to which this race has evolved, it is necessary to understand the history behind it.
SEO History through 2001 is a good start in this study. SEO’s are the parasites that force SE’s to evolve to become more effective. A few of many examples:
- Alt tags - SE’s use them, SEO’s abuse them, SE’s devalue them.
- Internal anchor text - SE’s use it, SEO’s abuse it, SE’s devalue it
- Links power rankings - SEO’s create “link directories” - reciprocal links are devalued
- Anchor Text - SE’s use it, SEO’s abuse it, well…you get the idea.
- Content is king - DMOZ clones and scraper sites appear - duplicate content filtering is improving
The race to be more informed is becoming even more intense as information is traded over dinner and drinks at conferences, and the information becomes more sophisticated and harder for the lay person to understand. Of course there is also the potential for disinformation (misinforming webmasters with phoney PR - both kinds - and untimely, unreliable, incomplete information). This, however, is probably a different topic for a different day.
While as an SEO I definitely respect the intelligence and hard work of the search engines, the limits on cooperation are huge. I saw this well exemplified when Threadwatch discussed building the perfect link analysis tool.
Barry Schwartz summed it up quite nicely: “You can’t have a win - win situation on this topic. You either go the WPG route and don’t care (SEOs win - Google Loses) OR you go the API route and limit the users big time (SEOs lose - Google Wins).” – rustybrick
Why Will the Race Continue?
Different Goals
SEO Goals
- Create a subset of data and test to produce optimal results (high ranking serps)
- Use this information to maximize profits from commercial sites that benefit from high rankings
SE Goals
- Create a subset of data and test to produce optimal results (relevant serps)
- Profit from clearly identified advertising
SEO Methodology
Use any and all techniques that are most important in creating relevant results by isolating and testing variables, then creating pages that are optimal for the results.
SE Methodology
Study and identify the ways the algorithms are being manipulated and adjust accordingly.
The goals and methodologies of the two camps make them natural adversaries (even though we often get along well and like each other). It’s an unfortunate but inevitable occurrence. Anyone who has been to a conference and seen a search engine rep sitting down and enjoying several late night pints with some of the most well known and self-professed black hat SEO’s in the industry will understand the true definition of irony. They will also understand the “quid pro quo” nature of the search industry, where the information you hold loses value the longer you hold it. The information is only valuable if you monetize it, or if someone else can monetize it and return the favor when they can. This makes the battle against time and for dollars more intense as well.
SEO is getting more difficult by the day, and evolving constantly. Arm yourself with good information, and defend yourself against the disinformation to stay in the know. Remember that SEO’s are the number one SE variable. Reading through it now, it’s pretty obvious they are focusing on reducing the effectiveness of SEO techniques. - one last bit of advice by graywolf who also has a great roundup analysis of the new Google patent
Added resources:
Tags: SEO
Filed under: General, LSI/LSA, Mr. Ploppy, Tools by Mr. Ploppy at 10:32 am, 3/13/2005
So I have a section of this site that is for tools, and I have a ton of tools in my bookmarks. Creative use of tools, after all, is one of the things that makes for a good SEO. I figured I’d make a weekly SEO/Internet marketing tool post, and list a few of my favorite online tools each week for others to play with. So here goes…volume I of the Monday Tool List.
If you have some good tools you’d like to share with Mr. Ploppy, be sure to drop Mr. Ploppy a line.
*command originates from a member at WMW.
Tag: SEO Tools
Filed under: General, LSI/LSA, SEM Research, Search Engine Optimization, Tools by Stuntdubl SEO at 6:12 am, 2/10/2005
Inverse Document Frequency/ Term Vector Theory
Well, I haven’t had as much chance as I would like to do research on this latest update, but I did see Jake mention IDF in this post at SEW (which also has a ton of other good information in it, msg #50 from xan among others). I’ve also heard it mentioned a handful of times, and figured it was high time to sit down and do some dedicated research on at least one of the speculative new technologies (LSI, Hilltop, and all the incredible information orion is bombarding us with these days)…while everything seems to point back to “quality relevant links”, I think it’s good to broaden one’s horizons and understand what determines “quality” in a changing environment.
Inverse Document Frequency - term used to help determine the position of a term in a vector space model.
Formula for IDF:
IDF = log(D/d) where D = collection size and d = number of documents containing a given term.
weight of a term, w=tf*IDF
- see also Term Vector Theory
According to orion at the above mentioned TVT thread, the formula for term vector theory is as follows:
w(i) = tf(i)*IDF = tf(i)*log[D/df(i)]
where
tf(i) = term frequency, number of times a term i occurs in a document
IDF = Inverse document frequency = log[D/df(i)]
D = database size or number of documents available
df(i) = number of documents containing term i
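For anyone who would rather poke at the numbers than read the formula, here is a minimal Python sketch of w(i) = tf(i)*log[D/df(i)]. It is only an illustration: the function name `term_weights`, the whitespace tokenizing, the base-10 log, and the three toy "documents" are all my own assumptions, not anything from orion's thread or from how a real search engine does it.

```python
import math
from collections import Counter

def term_weights(docs, log=math.log10):
    """Return one {term: weight} dict per document, using
    w(i) = tf(i) * log(D / df(i)).

    The base-10 log and the lowercase/whitespace tokenizing are
    assumptions made for this sketch only.
    """
    tokenized = [doc.lower().split() for doc in docs]
    D = len(tokenized)  # D = number of documents in the collection

    # df(i) = number of documents containing term i (count each doc once)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # tf(i) = times term i occurs in this document
        weights.append(
            {term: count * log(D / df[term]) for term, count in tf.items()}
        )
    return weights

# Hypothetical toy collection, just to have something to run.
docs = [
    "seo tools for link building",
    "link building and link analysis",
    "semantic analysis of documents",
]
w = term_weights(docs)
for doc, doc_weights in zip(docs, w):
    print(doc, doc_weights)
```

One thing worth noticing when you play with it: a term that shows up in every document gets log(D/D) = 0, so its weight is zero no matter how many times you repeat it on the page.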
I wish they’d do more pictures of this stuff for the slower people in the crowd:

More on Term Vector Theory at Webmasterworld and an Art vs. Science discussion - on term weight formula from HighRankings.
Not sure if I digested all this, but at least now I have some good bookmarks for later. My take is that you may start seeing more pages in the serps (if you haven’t already;) that don’t actually contain the keyphrase you searched for.
Filed under: General, Google, LSI/LSA, SEM Research by Stuntdubl SEO at 12:24 pm, 2/6/2005
“Welcome to LSI hell, folks” - BakedJake - webmasterworld.com
So the last thing was pretty worthless, so I figured I’d take a little time and comment on the new bane of SEO existence. It isn’t really NEW per se…we’re just starting to see it in action more with the new Google update. Most SEO’s realize this trend is only going to continue…the simple essence is that if you want to rank for “internet marketing consulting”, you are going to have to use lots of other related words in order to prove to the SE’s that you are really about what you say you are about, and not just a dirty keyword stuffer. For the above mentioned phrase, most of the words on this site are good for the “theme”…search engine optimization, internet advertising, and pay-per-click may all be seen as part of the same theme as internet marketing consulting. This applies to anchor text as well as on page factors. I’m sure I’ll be commenting on this more in the near future, since this is going to be the hotbed of SEO conversation for quite some time. If you have any good ontology tools…be sure to e-mail me;) In the meantime, here are some of the latent semantic indexing/latent semantic analysis resources I’ve been perusing.
Wow…this stuff makes me bleed from the ears…I wish I could still just buy ROS text links and rank high:)