Managed Chaos
Naresh Jain's Random Thoughts on Software Development and Adventure Sports
     

Archive for the ‘SEO’ Category

The Periodic Table Of SEO Ranking Factors

Friday, March 30th, 2012

I found this periodic-table representation of SEO ranking factors refreshing. Thanks to Search Engine Land for creating it.

How to Name your Software Product?

Wednesday, March 28th, 2012

You are a startup and you’re building a product. It all sounds exciting until you sit down to decide the product name. Coming up with a public name for your product is one of the early decisions you’ll need to make.

What criteria do you use for naming your product?

I’ve used the following:

1. AdWords: When people want to find something similar, what keywords are they searching for? I would use Google AdWords to find keywords/phrases that people are already searching for. Look for related searches.

2. Competitors: If there are similar products in the market, what have they named their product and what keywords are they focusing on?

3. Unique Name: Based on keywords from the first 2 steps and your own preference, pick a few unique names that communicate the outcome achieved by using your product.

For example, if I were building a product that helps me search and find my files, I would call the product Found instead of File-Searcher or something else.

Sometimes, you might need to search for synonyms or replace certain characters in your name to make it distinctly unique.

Choose an appealing name. Something that appeals not only to you but also to your target audience. Choose a comforting or familiar name that conjures up pleasant memories so customers respond on an emotional level. Usually long or confusing names are not favourable.

Also try to avoid names that are spelled differently than they sound.

4. Domain Name: Is a .com domain available for this name? What about other popular TLDs? Personally I prefer getting a .com, unless your product naturally blends with some other TLD, like talk.to. You also want to make sure your domain name is different enough from your competitors’ domain names.

People often make mistakes while typing URLs, so you need to make sure there are no stupid websites at small variations of your domain name.

5. Trademark: It is worth checking whether your product name is already a registered trademark owned by someone else in the same business domain, especially in the country where you plan to sell your product. In the US you can search trademarks on USPTO’s website.

6. Test your name: It’s generally a good idea to present your shortlisted names to a few people and see their reaction.

7. App Stores: If you plan to build an app as part of your product, it is worth checking whether other apps use the same name, even though all popular App Stores allow duplicate app names.
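For the domain check in step 4, here is a minimal Python sketch that lists some easy-to-mistype variations of a candidate name. The function name typo_variants and the three kinds of typo it covers are my own illustrative choices, and it only generates names. Checking who actually owns each variation still needs a registrar or WHOIS lookup.

```python
from typing import Set


def typo_variants(name: str) -> Set[str]:
    """Return simple one-slip misspellings of a candidate product name."""
    variants = set()
    for i in range(len(name)):
        # Dropped character: "found" -> "fund"
        variants.add(name[:i] + name[i + 1:])
        # Doubled character: "found" -> "foound"
        variants.add(name[:i] + name[i] + name[i:])
        # Swapped neighbours: "found" -> "fuond"
        if i + 1 < len(name):
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # The correct spelling and the empty string are not typos.
    variants.discard(name)
    variants.discard("")
    return variants


for candidate in sorted(typo_variants("found")):
    print(candidate + ".com")
```

Run it on each shortlisted name and look up the resulting domains before you commit.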

Wikipedia has an excellent article on this topic called: Product Naming

 

All Your Money Are Belong To Us [Spam from FBI Headquarters, Washington]

Thursday, March 22nd, 2012

Recently I got this really “stupid” spam from someone claiming to be the FBI. It really made me laugh. Do people really fall for this crap?

  1. First of all, someone from the FBI sends an email from sbcglobal.net and has [email protected] in the reply-to address.
  2. The phone number given is in Nigeria (country code +234).
  3. The email was sent to an undisclosed recipients list (which means many people won the lottery).
  4. There is no division in the FBI called “Anti-Terrorist And Monetory Crimes Division”. If you google for “Anti-Terrorist And Monetory Crimes Division” you’ll find all kinds of interesting scams.
  5. If paying $300 USD could get me $800,000 USD, everyone in this world would be all over it.
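The header-related checks above (points 1 and 3) are easy to automate. Below is a rough Python sketch, not a real spam filter. The function names and the two mailbox addresses in SAMPLE are hypothetical stand-ins, since the real addresses are masked in the email; only the sbcglobal.net sender domain and the undisclosed recipient list come from the message itself.

```python
import re
from email import message_from_string
from typing import List

# Captures the domain part of the first email address found in a header value.
ADDRESS_RE = re.compile(r"[\w.+-]+@([\w-]+(?:\.[\w-]+)+)")


def domain_of(header_value: str) -> str:
    match = ADDRESS_RE.search(header_value)
    return match.group(1).lower() if match else ""


def red_flags(raw_headers: str, claimed_domain: str) -> List[str]:
    """List header inconsistencies for a sender claiming to be claimed_domain."""
    message = message_from_string(raw_headers)
    flags = []
    from_domain = domain_of(message.get("From", ""))
    reply_domain = domain_of(message.get("Reply-To", ""))
    if from_domain != claimed_domain:
        flags.append(f"sender domain {from_domain} is not {claimed_domain}")
    if reply_domain and reply_domain != from_domain:
        flags.append(f"replies go to a different domain: {reply_domain}")
    if "undisclosed-recipients" in message.get("To", "").lower():
        flags.append("sent to an undisclosed recipient list")
    return flags


# Hypothetical addresses; the originals are masked in the email below.
SAMPLE = "\n".join([
    "From: FBI Headquarters, Washington <sender@sbcglobal.net>",
    "Subject: Treat As Urgent",
    "To: undisclosed-recipients:;",
    "Reply-To: agent@mailbox.example",
])

for flag in red_flags(SAMPLE, "fbi.gov"):
    print(flag)
```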

Email follows:

From: FBI Headquarters, Washington <[email protected]>

Subject: Treat As Urgent

To: undisclosed-recipients:;

Reply-To: [email protected]

Anti-Terrorist And Monetory Crimes Division

FBI Headquarters In Washington, D.C.

Federal Bureau Of Investigation

J. Edgar Hoover Building

935 Pennsylvania Avenue, NW Washington, D.C. 20535-0001

Website: www.fbi.gov

Telephone Number : (206) 973-2572

Attn: Beneficiary,

This is to Officially inform you that it has come to our notice and we have thoroughly completed an Investigation with the help of our Intelligence Monitoring Network System that you legally won the sum of $800,000.00 USD from a Lottery Company outside the United States of America. During our investigation we discovered that your e-mail won the money from an Online Balloting System and we have authorized this winning to be paid to you via a Certified Cashier’s Check.

Normally, it will take up to 10 business days for an International Check to be cashed by your local bank. We have successfully notified this company on your behalf that funds are to be drawn from a registered bank within the United States Of America so as to enable you cash the check instantly without any delay, henceforth the stated amount of $800,000.00 USD has been deposited with Bank Of America.

We have completed this investigation and you are hereby approved to receive the winning prize as we have verified the entire transaction to be Safe and 100% risk free, due to the fact that the funds have been deposited at Bank Of America you will be required to settle the following bills directly to the Lottery Agent in-charge of this transaction whom is located in Lagos, Nigeria. According to our discoveries, you were required to pay for the following –

(1) Deposit Fee’s ( Fee’s paid by the company for the deposit into an American Bank which is – Bank Of America )

(2) Cashier’s Check Conversion Fee ( Fee for converting the Wire Transfer payment into a Certified Cashier’s Check )

(3) Shipping Fee’s ( This is the charge for shipping the Cashier’s Check to your home address and this fee includes Insurance )

The total amount for everything is $300.00 (Three Hundred-US Dollars). We have tried our possible best to indicate that this $300.00 should be deducted from your winning prize but we found out that the funds have already been deposited at Bank Of America and cannot be accessed by anyone apart from you the winner, therefore you will be required to pay the required fee’s to the Agent in-charge of this transaction via Western Union Money Transfer Or Money Gram.

In order to proceed with this transaction, you will be required to contact the agent in-charge ( Mr. Jack Williams) via e-mail. Kindly look below to find appropriate contact information:

CONTACT AGENT NAME: MR. JACK WILLIAMS

E-MAIL ADDRESS: [email protected]

Telephone Number : +234-704-566-7523

You will be required to e-mail him with the following information:

FULL NAME:

ADDRESS:

CITY:

STATE:

ZIP CODE:

DIRECT CONTACT NUMBER:

You will also be required to request Western Union or Money Gram details on how to send the required $300.00 in order to immediately ship your prize of $800,000.00 USD via Certified Cashier’s Check drawn from Bank Of America, also include the following transaction code in order for him to immediately identify this transaction : EA2948-910.

This letter will serve as proof that the Federal Bureau Of Investigation is authorizing you to pay the required $300.00 ONLY to Mr. James Wellington via information in which he shall send to you, if you do not receive your winning prize of $800,000.00 we shall be held responsible for the loss and this shall invite a penalty of $3,000 which will be made PAYABLE ONLY to you (The Winner).

Mr. Bill Nicholson

Special Agent.

Washington DC FBI.

Room, 7367

J. Edgar Hoover Building

935 Pennsylvania Avenue, NW

Washington, D.C. 20535-0001                                                                 

NOTE: In order to ensure your check gets delivered to you ASAP, you are advised to immediately contact Mr. Jack Williams via contact information provided above and make the required payment of $300.00 to information in which he shall provide to you

Various Prefixes for Nginx’s Location Directive

Thursday, November 3rd, 2011

Often we need to create short, more expressive URLs. If you are using Nginx as a reverse proxy, one easy way to create short URLs is to define different locations under the respective server directive and then do a permanent rewrite to the actual URL in the Nginx conf file as follows:

http { 
    ....
    server {
        listen          80;
        server_name     www.agilefaqs.com agilefaqs.com;
        server_name_in_redirect on;
        port_in_redirect        on; 
 
        location ^~ /training {
            rewrite ^ http://agilefaqs.com/a/long/url$uri permanent;  
        }
 
        location ^~ /coaching {
            rewrite ^ http://agilecoach.in$uri permanent;  
        }
 
        location = /blog {
            rewrite ^ http://blogs.agilefaqs.com/show?action=posts permanent;  
        }
 
        location / {
            root   /path/to/static/web/pages;
            index   index.html; 
        }
 
        location ~* ^.+\.(gif|jpg|jpeg|png|css|js)$ {
            add_header Cache-Control public;
            expires max;
            root   /path/to/static/content;
        }
    } 
}

I’ve been using this feature of Nginx for over 2 years, but never actually fully understood the different prefixes for the location directive.

If you check Nginx’s documentation for the syntax of the location directive, you’ll see:

location [=|~|~*|^~|@] /uri/ { ... }

The URI can be a literal string or a regular expression (regexp).

For regexps, there are two prefixes:

  • “~” for case sensitive matching
  • “~*” for case insensitive matching

If we have a list of locations using regexps, Nginx checks each location in the order it’s defined in the configuration file. The first regexp to match the requested URL stops the search. If no regexp matches, Nginx uses the longest matching literal string.

For example, if we have the following locations:

location ~* /.*php$ {
   rewrite ^ http://content.agilefaqs.com$uri permanent; 
}
 
location ~ /.*blogs.* {
    rewrite ^ http://blogs.agilefaqs.com$uri permanent;    
}  
 
location /blogsin {
    rewrite ^ http://agilecoach.in/blog$uri permanent;    
} 
 
location /blogsinphp {
    root   /path/to/static/web/pages;
    index   index.html; 
}

If the requested URL is http://agilefaqs.com/blogs/index.php, Nginx will permanently redirect the request to http://content.agilefaqs.com/blogs/index.php. Even though both regexps (/.*php$ and /.*blogs.*) match the requested URL, the first satisfying regexp (/.*php$) is picked and the search is terminated.

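However, let’s say the requested URL is http://agilefaqs.com/blogsinphp. Nginx first compares it against the literal string locations, /blogsin and then /blogsinphp (and any other literal string locations, if there were more), and remembers the longest match, /blogsinphp. It then still checks the regexp locations in order. Since /.*php$ also matches /blogsinphp, the request is redirected to http://content.agilefaqs.com/blogsinphp; the remembered literal location would be used only if no regexp matched.

If you want an exact URL to bypass the regexp checks (which also slightly speeds up matching), use the “=” prefix, i.e.: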

location = /blogsinphp {
    root   /path/to/static/web/pages;
    index   index.html; 
}

and, for readability, move this location to the top of the other locations. Nginx checks “=” locations before anything else, wherever they are declared; if the request is an exact literal string match, it stops right there without looking at any other location directives.

However note that if http://agilefaqs.com/my/blogsinphp is requested, none of the literal strings will match and hence the first regexp (/.*php$) would be picked up instead of the string literal.

And if http://agilefaqs.com/blogsinphp/my is requested, the “=” location does not match exactly, and although the literal string /blogsin matches as a prefix, the regexps are still checked, so the first matching regexp (/.*blogs.*) is selected.

What if you don’t know the exact string literal, but you want to avoid checking all the regexps?

We can achieve this by using the “^~” prefix as follows:

location = /blogsin {
    rewrite ^ http://agilecoach.in/blog$uri permanent;    
}
 
location ^~ /blogsinphp {
    root   /path/to/static/web/pages;
    index   index.html; 
}
 
location ~* /.*php$ {
   rewrite ^ http://content.agilefaqs.com$uri permanent; 
}
 
location ~ /.*blogs.* {
    rewrite ^ http://blogs.agilefaqs.com$uri permanent;    
}

Now when we request http://agilefaqs.com/blogsinphp/my, Nginx checks the first location (= /blogsin); /blogsinphp/my is not an exact match. It then looks at (^~ /blogsinphp). It’s not an exact match either, but it is the longest matching literal prefix, and since we’ve used the ^~ prefix, this location is selected and all the remaining regexp locations are discarded.

However if http://agilefaqs.com/blogsin is requested, Nginx will permanently redirect the request to http://agilecoach.in/blog/blogsin even without considering any other locations.

To summarize:

  1. Search stops if a location with the “=” prefix is an exact literal string match.
  2. All remaining literal string locations are matched and the longest match is remembered. If that longest matching location uses the “^~” prefix, it is used and regexp locations are not searched.
  3. Otherwise, regexp locations are matched in the order they are defined in the configuration file. Search stops on the first matching regexp.
  4. If none of the regexps match, the longest matching literal string location is used.
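To check my understanding, here is a small Python sketch of the selection order described above. It is only a model, not Nginx’s implementation: select_location and the (prefix, pattern) tuples are names I made up, Python’s re module stands in for Nginx’s regexp engine, and named (“@”) locations are ignored.

```python
import re
from typing import List, Optional, Tuple

# Each location is (prefix, pattern); prefix is one of "", "=", "^~", "~", "~*".
Location = Tuple[str, str]


def select_location(locations: List[Location], uri: str) -> Optional[Location]:
    """Return the location a request URI would select under the rules above."""
    # 1. An exact "=" match wins immediately.
    for prefix, pattern in locations:
        if prefix == "=" and uri == pattern:
            return (prefix, pattern)

    # 2. Remember the longest matching literal string ("" or "^~").
    best: Optional[Location] = None
    for prefix, pattern in locations:
        if prefix in ("", "^~") and uri.startswith(pattern):
            if best is None or len(pattern) > len(best[1]):
                best = (prefix, pattern)

    # If the longest literal match uses "^~", regexps are not searched.
    if best is not None and best[0] == "^~":
        return best

    # 3. Regexps are tried in configuration order; the first match wins.
    for prefix, pattern in locations:
        if prefix in ("~", "~*"):
            flags = re.IGNORECASE if prefix == "~*" else 0
            if re.search(pattern, uri, flags):
                return (prefix, pattern)

    # 4. No regexp matched: fall back to the remembered literal match.
    return best


first_config = [
    ("~*", r"/.*php$"),
    ("~", r"/.*blogs.*"),
    ("", "/blogsin"),
    ("", "/blogsinphp"),
]
second_config = [
    ("=", "/blogsin"),
    ("^~", "/blogsinphp"),
    ("~*", r"/.*php$"),
    ("~", r"/.*blogs.*"),
]

print(select_location(first_config, "/blogs/index.php"))
print(select_location(second_config, "/blogsinphp/my"))
print(select_location(second_config, "/blogsin"))
```

Feeding it the two sets of locations from this post is a quick way to sanity-check a config on paper before reloading Nginx.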

Even though the order of the literal string locations doesn’t matter, it’s generally a good practice to declare the locations in the following order:

  1. start with all the “=” prefix,
  2. followed by “^~” prefix,
  3. then all the literal string locations
  4. finally all the regexp locations (since the order matters, place them with the most likely ones first)

BTW, adding a break directive inside any of the location directives has no effect on which location is selected.

Who is viewing this blog?

Saturday, August 6th, 2011

Pleasantly surprised to see my blog attracted 86,616 viewers from 170 countries.

Thanks to Google Analytics for helping me analyze all this info for free.

Pharma Hack: Spammy Links visible to only Search Engine Bots in WordPress, CMS Made Simple and TikiWiki

Sunday, April 10th, 2011

Over the last 6 months, I’ve been blessed with various pharma hacks on almost all my sites.

(http://agilefaqs.com, http://agileindia.org, http://sdtconf.com, http://freesetglobal.com, http://agilecoachcamp.org, to name a few.)

This is one of the cleverest hacks I’ve seen. As a normal user, if you visit the site, you won’t see any difference. But when search engine bots visit the page, it shows up with a whole bunch of spammy links, either at the top of the page or in the footer.

Clearly the hacker is after search engine ranking via backlinks. But in the process suddenly you’ve become a major pharma pimp.

There are many interesting things about this hack:

  1. It affects all PHP sites. WordPress tops the list. Others like CMS Made Simple and TikiWiki are also attacked by this hack.
  2. If you search for pharma keywords on your server (both files and database) you won’t find anything. The spammy content is first deflated using gzdeflate and then encoded with MIME base64. At run time it is base64-decoded, inflated and eval’ed in PHP.

This is what the hacked PHP code looks like:

If you inflate and decode this code it looks like:

  3. The code is well documented and mostly self-descriptive.
  4. Different PHP frameworks have been hacked using slightly different approaches:
    • In WordPress, the hackers created a new file called wp-login.php inside the wp-includes folder containing some spammy code. They then modified the wp-config.php file to include('wp-includes/wp-login.php'). Inside the wp-login.php code, they further include the actual spammy links from a folder inside wp-content/themes/mytheme/images/out/’.$dir’
    • In TikiWiki, the hackers modified /lib/structures/structlib.php to directly include the spammy code.
    • In CMS Made Simple, the hackers created a new file called modules/mod-last_visitor.php to directly include the spammy code.
      Again, the interesting part here is the timestamp. When you run ls -al, mod-last_visitor.php shows a 1969-12-31 date, which is the Unix epoch as displayed in a US time zone: 

      -rwxr-xr-x 1 username groupname 1551 2008-07-10 06:46 mod-last_tracker_items.php

      -rwxr-xr-x 1 username groupname 44357 1969-12-31 16:00 mod-last_visitor.php

      -rwxr-xr-x 1 username groupname 668 2008-03-30 13:06 mod-last_visitors.php

      In the case of WordPress, the newly created file had the same timestamp as the rest of the files in that folder.

How do you find out if your site is hacked?

  1. After searching for your site in Google, check if the Cached version of your site contains anything unexpected.

  2. Using User Agent Switcher, a Firefox extension, you can view your site as it appears to a search engine bot. Again, look for anything suspicious.

  3. Set up Google Alerts on your site to get notified when something you don’t expect shows up on your site.

  4. Set up a cron job on your server to run the following commands at the top-level web directory every night and email you the results:
    • mysqldump your_db into a file and run
    • find . -type f | xargs grep -F "eval(gzinflate(base64_decode("
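The User Agent Switcher check (step 2) can also be scripted. Here is a rough Python sketch that fetches a page twice, once with a browser user agent and once with Googlebot’s, and lists the links only the bot gets to see. The function names and the browser user-agent string are my own choices. It only catches cloaking based on the user agent, so a hack that looks at the visitor’s IP address instead would slip through.

```python
import re
import urllib.request
from typing import Set

# Illustrative browser user agent, and the user agent Googlebot announces.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"
BOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Crude href extractor; good enough for spotting injected links.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def fetch(url: str, user_agent: str) -> str:
    """Download a page while announcing the given user agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_links(html: str) -> Set[str]:
    return set(HREF_RE.findall(html))


def bot_only_links(browser_html: str, bot_html: str) -> Set[str]:
    """Links present in the bot's copy of the page but not the browser's."""
    return extract_links(bot_html) - extract_links(browser_html)


def check_site(url: str) -> Set[str]:
    return bot_only_links(fetch(url, BROWSER_UA), fetch(url, BOT_UA))
```

Calling check_site("http://agilefaqs.com") should return an empty set on a clean site; anything else deserves a closer look.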

If the grep command finds a match, take the encoded content and check what it means using the following site: http://www.tareeinternet.com/scripts/decrypt.php

If it looks suspicious, clean up the file and all its references.
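If you would rather not paste suspicious code into a third-party site, the payload can be decoded locally. Below is a minimal Python sketch, assuming the payload is a quoted base64 string passed directly to base64_decode. The names PAYLOAD_RE, decode_payload and scan are mine. It never executes the decoded PHP; it only returns it as text.

```python
import base64
import re
import zlib
from pathlib import Path
from typing import Iterator, Tuple

# Matches eval(gzinflate(base64_decode('...'))) and captures the payload.
PAYLOAD_RE = re.compile(
    r"eval\s*\(\s*gzinflate\s*\(\s*base64_decode\s*\(\s*['\"]([A-Za-z0-9+/=\s]+)['\"]"
)


def decode_payload(encoded: str) -> str:
    """Undo base64 and then deflate, without running the result."""
    raw = base64.b64decode(encoded)
    # PHP's gzinflate() reads a raw DEFLATE stream, hence wbits=-15.
    return zlib.decompress(raw, -15).decode("utf-8", errors="replace")


def scan(root: str) -> Iterator[Tuple[Path, str]]:
    """Yield (file, decoded payload) for every match in .php files under root."""
    for path in Path(root).rglob("*.php"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for match in PAYLOAD_RE.finditer(text):
            try:
                yield path, decode_payload(match.group(1))
            except (ValueError, zlib.error):
                # Not valid base64 or not a raw DEFLATE stream; flag it anyway.
                yield path, "<could not decode payload>"
```

Looping over scan("/path/to/web/root") and printing each result gives you the decoded code next to the file it was hiding in.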

There are also many other blogs explaining similar, but different, attacks.

Hope you don’t have to deal with this mess.

Basics of making Webpages Search Engine Friendly

Tuesday, January 25th, 2011

I’m just learning the basics of how to make webpages easily searchable. Search Engine Optimization (SEO) is a vast topic; in this post, I won’t even scratch the surface.

Following are some simple things I learned today that are considered basic website hygiene:

  • Titles: The title of a web page appears as a clickable link in search results and bookmarks. A descriptive, compelling page title with relevant keywords can increase the number of people visiting your site. Search engines view the text of the title tag as a strong indication of what the page is about. Accurate keywords in the title tag can help the page rank better in search results. A title tag should have fewer than 70 characters, including spaces. Major search engines won’t display more than that.
  • Description Meta-tags: The description meta-tag should tell searchers what a web page is about. It is often displayed below the title in search results, and helps people decide if they want to visit that website. Search engines will read 200 to 250 characters, but usually display only 150, including spaces. The first 150 characters of the meta description should contain the most important keywords for that web page.
  • H1 Heading: The H1 heading is an important sentence or phrase on a web page that quickly and clearly tells people and search engines what they can expect to find there. The H1 heading for a page should be different from its title. Each can target different important keywords for better SEO.
  • Outbound Links: Outbound links tell search engines which websites you find valuable and relevant. Including links to relevant sites is good for your website’s standing with search engines. Outbound links also help search engines classify your site in relationship to others.
  • Inbound Links: The more websites that link to your site, the better. Most search engines look at the reputation of the sites linking to your site. They also consider the anchor text (keywords) used to link to your site.
    • Self Links: Link back to your archives frequently when creating new content. Make sure your webpages are all well connected with proper anchor text (keywords) used to link back.
  • Create a sitemap: A site map (or sitemap) is a list of the pages of your web site that are accessible to crawlers or users. The fewer clicks necessary to get to a page on your website, the better.
  • Pretty URLs: Easy-to-understand URLs, especially ones that contain the correct keywords, are more search engine friendly than cryptic URLs with many request parameters. Favor mysite.com/album/track/page over mysite.com/process?albumname=album&trackname=track&page=name
  • Avoid non-Linkable Content: Some things might look pretty but might not be good from an SEO point of view, for example Flash-based or JavaScript-based content to which you can’t link.
  • Image descriptions: AKA alt text – is the best way to describe images to search engines and to visitors using screen readers. Describing images on a web page with alt text can help the page rank higher in search results if you include important and relevant keywords.
  • Keywords Meta-tag: Search engines don’t use the keyword meta-tag to determine what the page is about. Search engines detect keywords by looking at how often each word or phrase occurs on the page, and where it occurs. The words that appear most often and prominently are judged to be keywords. If the meta keywords and detected keywords match, that means the desired keywords appear frequently enough, and in the right places.
  • First 250 words: The first 250 words on a web page are the most important. They tell people and search engines what the page is about. The two to three most important keywords for any web page should appear about five times each in the first 250 words of web page copy. They should appear two to three times each for every additional 250 words on the page.
  • Robots.txt file: A website’s robots.txt file is used to let search engines know which pages or sections of the site shouldn’t be crawled.
  • Canonical URL: A canonical URL is the standard URL for a web page. Because there are many ways a URL can be written, it’s possible for the same web page content to live at several different addresses, or URLs. This becomes a problem when you’re trying to enhance the visibility of a web page in search results. One factor that makes a web page rank higher in search results is the number and quality of other websites that link to it. If a web page is useful enough that lots of people create links to it, you don’t want to dilute the value of those links by having them spread across two or more URLs. Use a 301 redirect on any other version of that web page to get people – and search engines – to the standard version. Some common mistakes people make:
    • Leave both www.mysite.com and mysite.com in place.
    • Leave default documents directly accessible. (mysite.com/ and mysite.com/index.html) More details: Twin Home Pages: Classic SEO Mistake
  • Web Presence: Having as much information and as many links about your website on the web as possible is key, be it on other people’s websites, news sharing and community sites, various social media sites or any other site which many people refer to. Alexa and Compete are two companies which give you a pretty good analysis of your web presence.
  • Fresh Content:  The best sites for users, and consequently for search engines, are full of often-updated, useful information about a given service, product, topic or discipline. Social media distribution via Blogs, Microblog (Twitter), Discussion forums, User Comments, etc. are great in this regard.
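A few of the checks above (title length, description meta-tag, H1 heading, image alt text) are mechanical enough to script. Here is a minimal Python sketch using only the standard library. HygieneParser and audit are names I made up, the 70 and 150 character limits are the ones quoted in the list, and the rest of the list (links, keywords, canonical URLs) is not covered.

```python
from html.parser import HTMLParser
from typing import List, Optional


class HygieneParser(HTMLParser):
    """Collects the few page elements the checks below need."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description: Optional[str] = None
        self.h1_count = 0
        self.images_without_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and not (attributes.get("alt") or "").strip():
            self.images_without_alt += 1
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html: str) -> List[str]:
    """Return a list of hygiene problems found in the page."""
    parser = HygieneParser()
    parser.feed(html)
    problems = []
    title = parser.title.strip()
    if not title:
        problems.append("missing title")
    elif len(title) >= 70:
        problems.append("title has 70 or more characters")
    if parser.description is None:
        problems.append("missing description meta-tag")
    elif len(parser.description) > 150:
        problems.append("description longer than 150 characters")
    if parser.h1_count == 0:
        problems.append("missing H1 heading")
    if parser.images_without_alt:
        problems.append(f"{parser.images_without_alt} image(s) without alt text")
    return problems
```

Pass it the HTML of any page, for example one saved from your own site, and it returns an empty list when these basics are in place.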

Big thanks to AboutUs.org for helping me understand these basic concepts.

    Licensed under
Creative Commons License