Our glossary contains over 100 popular terms used in the Web Marketing, Search Engine Marketing, and Search Engine Optimization (SEO) field. For larger glossaries, we recommend the following:
- Glossary of Internet and Web Jargon by UC Berkeley
- Web Site Building Glossary by Lycos
- Web Site Building Glossary by W3Schools
- Glossary of the Open Directory Project (dmoz.org)
Popular Search Engine Marketing Acronyms
- CPM - Cost Per Thousand Impressions
- CPA - Cost Per Action
- CPC - Cost Per Click
- CPS - Cost Per Sale
- CTR - Click-Through Rate
- CPI - Cost Per Impression
- CPL - Cost Per Lead
- PPL - Pay Per Lead
- PFI - Pay For Inclusion
- PFP - Pay For Performance
- PPC - Pay Per Click
Web Marketing and Search Engine Optimization Glossary
AdSense is a Google advertising program for website owners that helps Google earn billions of dollars of revenue. You display Google's AdWords ads on your site and share the resulting revenue with Google. Premium AdSense services are available for large sites, i.e. sites that receive 5 million+ search queries or 20 million+ content page views a month.
A blog (short for "web log") is a type of web page that serves as a publicly accessible personal journal (or log) for an individual. Typically updated daily, blogs often reflect the personality of the author. Blog software usually has archives of old blogs and is searchable. Frequently, blogging software is used by web pages providing excellent information on many topics, although very frequently the content is personal and requires careful evaluation.
To follow links in a page, to shop around in a page, exploring what's there, a bit like window shopping (but you can't type keywords to search). The opposite of browsing a page is searching it. When you search a site, you find a search box, enter terms, and find all occurrences of the terms throughout the site. When you browse, you have to guess which words on the page pertain to your interests. Searching is usually more efficient, but sometimes you find things by browsing that you would not find by searching, because you might not think of the "right" term to search by.
Browsers are software programs that enable you to view web documents. They "translate" HTML-encoded files into the text, images, sounds, and other features you see. Microsoft Internet Explorer (called simply IE), Netscape, and Mosaic are examples of browsers that enable you to view text and images and many other web features. They are software that must be installed on your computer.
A bot is a software tool for digging through data. You give a bot directions and it brings back answers. The word is short for 'robot'. Google's crawler is nicknamed Googlebot, for example.
In browsers, "cache" is used to identify a space where web pages you have visited are stored in your computer. A copy of documents you retrieve is stored in cache. When you use GO, BACK, or any other means to revisit a document, the browser first checks to see if it is in cache and will retrieve it from there, because it is much faster than retrieving it from the server.
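The check-the-cache-first behavior described above can be sketched in a few lines of Python. This is only an illustration: the function names and the simulated server are assumptions, not how any particular browser is implemented.

```python
cache = {}           # URL -> stored copy of the document
server_requests = 0  # counts slow trips to the (simulated) server


def fetch_from_server(url: str) -> str:
    """Stand-in for a real network request."""
    global server_requests
    server_requests += 1
    return f"<html>content of {url}</html>"


def get_page(url: str) -> str:
    """Return the cached copy if there is one, else fetch and store it."""
    if url not in cache:
        cache[url] = fetch_from_server(url)
    return cache[url]


first = get_page("http://www.example.com/")
second = get_page("http://www.example.com/")  # served from the cache
print(server_requests)
```

The second call never reaches the simulated server, which is why revisiting a page is faster than the first visit.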
In search results from Google, Yahoo! Search, and some other search engines, there is usually a cached link which allows you to view the version of a page that the search engine has stored in its database. The live page on the web might differ from this cached copy, because the cached copy dates from whenever the search engine's spider last visited the page and detected modified content. Use the cached link to see when a page was last crawled and, in Google, where your terms are and why you got a page when all of your search terms are not in it.
See also Link
CSS (Cascading Style Sheets)
A W3C-recommended language for defining style (such as font, size, color, and spacing) for web documents. A style sheet file acts as a hub that lets you control the style of all your site's pages from one place.
CGI stands for 'Common Gateway Interface', a standard interface between web server software and other programs running on the same machine. A CGI program is any program which handles its input and output data according to the CGI standard. In practice, CGI programs are used to handle forms and database queries on web pages, and to produce non-static web page content.
A computer, program or process which makes requests for information from another computer, program or process. Web browsers are client programs. Search engine spiders are (or can be said to behave as) clients.
The number of times visitors click on a hyperlink (or advertisement) on a page, as a percentage of the number of times the page has been displayed. Good ranking may be useless if visitors do not click on the link which leads to the indexed site. The secret here is to provide a good descriptive title and an accurate and interesting description.
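The percentage described above is simple arithmetic. A minimal sketch in Python, assuming you already have click and display counts (the function name and the figures are illustrative only):

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return clicks as a percentage of impressions (page or ad displays)."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return 100.0 * clicks / impressions


# Hypothetical figures: a link displayed 2,000 times and clicked 50 times.
ctr = click_through_rate(clicks=50, impressions=2000)
print(f"CTR: {ctr:.1f}%")
```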
The hiding of page content. Normally carried out to stop page thieves stealing optimized pages.
Information from a web server, stored on your computer by your web browser. The purpose of a cookie is to provide information about your visit to the website for use by the server during a later visit. It's like us getting a ticket or a customer card at a shop, spa or movie cinema when we go there for the first time. This card ensures that we are remembered when we come back and helps the service provider give us a better (i.e. personalized) service. In technical terms, a message from a web server computer, sent to and stored by your browser on your computer. When your computer consults the originating server computer, the cookie is sent back to the server, allowing it to respond to you according to the cookie's contents. The main use for cookies is to provide customized Web pages according to a profile of your interests.
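The round trip described above (server sends a cookie, browser stores it, browser sends it back on a later visit) can be sketched with Python's standard http.cookies module. The cookie name "visitor_interest" and its value are hypothetical.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header sent with the first response.
# The cookie name "visitor_interest" and its value are hypothetical.
outgoing = SimpleCookie()
outgoing["visitor_interest"] = "seo"
outgoing["visitor_interest"]["path"] = "/"
set_cookie_header = outgoing.output(header="Set-Cookie:")
print(set_cookie_header)

# Later visit: the browser sends the stored cookie back in a Cookie header,
# and the server parses it to personalize the page.
incoming = SimpleCookie()
incoming.load("visitor_interest=seo")
interest = incoming["visitor_interest"].value
print(f"Returning visitor interested in: {interest}")
```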
This is a measure of what you pay Google and other search engines for displaying your ad. As a rule, every time someone clicks on your ad, you get charged a certain amount, i.e. your CPC.
See also AdSense, AdWords
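To see how the charge adds up, here is a small Python sketch comparing pay-per-click billing with per-thousand-impressions (CPM) billing. The prices and traffic figures are invented for illustration and do not reflect any search engine's actual rates.

```python
def cpc_cost(clicks: int, cost_per_click: float) -> float:
    """Total charge when you pay for each click on your ad."""
    return clicks * cost_per_click


def cpm_cost(impressions: int, cost_per_thousand: float) -> float:
    """Total charge when you pay per thousand displays of your ad."""
    return impressions / 1000 * cost_per_thousand


# Hypothetical campaign: 20,000 impressions that produced 40 clicks.
print(cpc_cost(clicks=40, cost_per_click=0.25))
print(cpm_cost(impressions=20000, cost_per_thousand=2.50))
```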
That part of a search engine which surfs the web, storing the URLs and indexing the keywords and text of each page it finds. Google's crawler is called Googlebot.
Data stored in a computer in such a way that a computer program can easily retrieve and manipulate the data. A database system is a computer program (like MS Access, Oracle, and MySQL) for manipulating data in a database.
An Internet link which doesn't lead to a page or site, probably because the server is down or the page has moved or no longer exists. Most search engines have techniques for removing such pages from their listings automatically, but as the Internet continues to increase in size, it becomes more and more difficult for a search engine to check all the pages in the index regularly. Reporting of dead links helps to keep the indexes clean and accurate, and this can usually be done by submitting the dead link to the search engine.
See also Link
The removal of pages from a search engine's index. Removal can occur for various reasons, including unreliability of the machine that hosts a site or because of perceived attempts at spamdexing. Note that search engines contain an automatic de-listing function in their algorithm, and errors do occur. If you think that your site has been unjustly de-listed, contact the search engine's support department, explain your story, be very nice and polite, and request to be re-included in the index. You may need to do so a number of times before you hear from them.
A name that identifies one or more IP addresses. For example, the domain name microsoft.com represents about a dozen IP addresses. Domain names are used in URLs to identify particular web pages. For example, in the URL http://www.google.com/about.html, the domain name is google.com.
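The URL example above can be taken apart with Python's standard urllib.parse module. Note one simplification in this sketch: stripping a leading "www." is only a rough approximation of finding the domain name, and it is an assumption made here for illustration.

```python
from urllib.parse import urlparse

url = "http://www.google.com/about.html"
parts = urlparse(url)
host = parts.hostname  # the full host name in the URL
path = parts.path      # the page requested on that host

# Simplification (an assumption for this sketch): treat a leading "www."
# as a host label rather than part of the domain name.
domain = host[4:] if host.startswith("www.") else host
print(host, path, domain)
```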
Information on web pages which changes or is changed automatically, e.g. based on database content or user information. You can often spot that dynamic content is being used if the URL ends with .asp, .cfm, .cgi or .shtml. But it is also possible to serve dynamic content using standard (normally static) .htm or .html type pages. Search engines will currently index dynamic content in a similar fashion to static content, although they will not usually index URLs which contain the ? character.
Pages created as the result of a search are called dynamically generated pages. The answer to your query is encased in a web page designed to carry the answer and sent to your computer. Often the page is not stored anywhere afterward, because its unique content (the answer to your specific query) is probably not of use to many other people. It's easier for the database to regenerate the page when needed than to keep it around.
The opposite of a dynamic page is a "static" page. Static pages reside on servers, each identified by a unique URL, and waiting to be retrieved when their URL is invoked. Spiders can find a static page if it is linked to in any other page they "know" about. They follow links to it and retrieve it much as you would by clicking if you knew the link. Static pages are not invisible, although search engines might choose to omit them for policy reasons discussed below.
An HTML technique for combining two or more separate HTML documents within a single web browser screen. Compound interacting documents can be created to make a more effective web page presented in multiple windows or sub-windows.
A framed web site often causes great problems for search engines, and may not be indexed correctly. Search engines will often index only the part of a framed site within the <NOFRAMES> section, so make sure that the <NOFRAMES> section includes relevant text which can be indexed by the spiders.
If your site uses frames, include proper scripting to allow search engines to "see" the framed content. Submit the main page, the one containing the <FRAMESET> tag, to the search engines. If you use a gateway page, submit this separately.
See also NOFRAMES tag
File Transfer Protocol. A protocol for rapidly transferring entire files from one computer to another, intact, for viewing or other purposes.
Google is a play on "Googol", the mathematical term for a 1 followed by 100 zeros. Google was the name Larry Page and Sergey Brin selected for their future company in September 1998. Google Inc. is the developer of the award-winning Google search engine, which is designed to provide a simple, fast way to search the Internet for information. Offering users access to an index comprising more than 8 billion URLs, Google is the largest search engine on the World Wide Web. In 2004, Google became a publicly traded company, with over 5 billion dollars in revenue. Google's crawler, the program that collects pages for its index, is called Googlebot.
Many search engines give extra weight and importance to the text found inside HTML heading sections. It is generally considered good advice to use headings when designing web pages and to place keywords inside headings.
Text on a web page which is visible to search engine spiders but not visible to human visitors. This is sometimes because the text has been set the same colour as the background, because multiple TITLE tags have been used or because the text is an HTML comment. Hidden text is often used for spamdexing. Many search engines can now detect the use of hidden text, and often remove offending pages from their database or lower their positioning.
HyperText Markup Language - the (main) language used to write web pages.
HyperText Transfer Protocol - the (main) protocol used to communicate between web servers and web browsers (clients).
IE stands for 'Microsoft Internet Explorer' browser. Browsers are software programs that enable you to view web documents.
Some types of pages and links are excluded from most search engines by policy. Others are excluded because search engine spiders cannot access them. Pages that are excluded are referred to as the Invisible web, i.e. what you don't see in search engine results. The Invisible Web is estimated to be two to three or more times bigger than the visible web.
IP is an Internet Protocol number or Internet Protocol address. This is a unique number consisting of 4 parts separated by dots, e.g. 188.8.131.52. Every machine on the Internet has a unique IP address. If a machine does not have an IP number, it is not really on the Internet. Most machines also have one or more domain names that are easier for people to remember.
See also Domain name
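The "4 parts separated by dots" format can be checked with Python's standard ipaddress module. In this sketch the helper name is_ipv4 is an assumption, and the sample addresses come from the 192.0.2.0/24 range reserved for documentation.

```python
import ipaddress


def is_ipv4(text: str) -> bool:
    """Return True if text is a dotted address with four parts in 0-255."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


print(is_ipv4("192.0.2.10"))   # four parts, each in range
print(is_ipv4("192.0.2"))      # only three parts
print(is_ipv4("192.0.2.300"))  # last part is out of range
```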
A computer programming language whose programs can run on a number of different types of computer and/or operating system. Used extensively to produce applets for web pages.
A simple interpreted computer language used for small programming tasks within HTML web pages. The scripts are normally interpreted (or run) on the client computer by the web browser. Some search engines have been known to index these scripts, in some cases presumably erroneously.
A word which forms (part of) a search engine Query. The keywords are picked up by search engines based on their evaluation of your website's content and their "understanding" of your focus, i.e. what's most important to you as a business. Higher ranking means that when someone performs a search using keywords within your business or area of expertise, search engines display a web link to your site among the first few search results, i.e. on the first three pages. Depending on the size, specialization and location of a business, keywords can be as broad as one term, e.g. 'sports', or as specific as a phrase (hence, the term 'key phrase'), e.g. 'healthcare jobs in Canada'.
See also Search engines
A property of the text in a web page which indicates how close together the keywords appear. Some search engines use this property for positioning. Analyzers are available which allow comparisons between pages. Pages can then be produced with keyword densities similar to those found in high-ranking pages.
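Analyzers differ in how they measure this property. One common approach, assumed here purely for illustration, is the share of words on a page that match the keyword. A minimal Python sketch (the function name and sample text are hypothetical):

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Percentage of words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return 100.0 * hits / len(words)


sample = "Foreign nurses wanted. We recruit nurses for hospitals in Canada."
print(f"{keyword_density(sample, 'nurses'):.1f}%")
```

Running the same function over a high-ranking page and over your own page gives the kind of comparison the analyzers mentioned above provide.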
The use of keywords as part of the URL to a website. Positioning is improved on some search engines when keywords are reinforced in the URL.
A phrase which forms (part of) a search engine query.
The buying of search keywords from search engines, usually to control banner ad placement.
See also AdWords, AdSense
The repeating of keywords and keyword phrases in links, META tags, copy of the page or elsewhere.
Link is a colloquial term for a hyperlink: a pointer to another document, most often another web page. A hyperlink is also known as a hotlink and is sometimes called a hypertext connection to another document or web page.
HTML tags that are not visible on a formatted page, but are used by search engines for indexing purposes. The most common META tags (and those most relevant to search engines) are KEYWORDS and DESCRIPTION.
The KEYWORDS tag allows the author to emphasize the importance of certain words and phrases used within the page. Some search engines will respond to this information - others will ignore it. Don't use quotes around the keywords or key phrases.
The DESCRIPTION tag allows the author to control the text of the summary displayed when the page appears in the results of a search. Again, some search engines will ignore this information. The HTTP-EQUIV meta tag is used to issue HTTP commands, and is frequently used with the REFRESH tag to refresh page content after a given number of seconds. Gateway pages sometimes use this technique to force browsers to a different page or site. Most search engines are wise to this, and will index the final page and/or reduce the ranking. Infoseek has a strong policy against this technique, and they might penalize your site, or even ban it by removing it from the index.
Other common meta tags are GENERATOR (usually advertising the software used to generate the page) and AUTHOR (used to credit the author of the page, and often containing e-mail address, homepage URL and other information).
<TITLE>PulseHR: Recruitment of Foreign Nurses</TITLE>
<meta name="DESCRIPTION" CONTENT="PulseHR is a recruiting agency specializing in recruitment of foreign nurses into the United States, Canada, and the United Kingdom">
<meta name="KEYWORDS" CONTENT="recruitment of foreign nurses in USA, foreign nurse recruitment, foreing nurse recruitment, foreign nurses to Canada, foreign RNs, recruitment of foreign RNs, recruit foreign nurses, hiring foreign nurses, hiring international nurses, international nurses, international recruitment of foreign nurses, requirements to recruit foreign nurses, nursing immigration, employment based immigration of nurses, immigration of foreign nurses, foreign nurses immigration">
<meta name="robots" content="noarchive">
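To get a feel for how a program might read these tags, here is a rough Python sketch using the standard html.parser module. It is not how any particular search engine works; the class name MetaTagReader is hypothetical, and the sample page is abbreviated from the example above.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collect <meta name=... content=...> pairs and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # attribute names arrive lowercased
        name = attributes.get("name")
        if tag == "meta" and name:
            self.meta[name.lower()] = attributes.get("content") or ""
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


page = """<TITLE>PulseHR: Recruitment of Foreign Nurses</TITLE>
<meta name="DESCRIPTION" CONTENT="PulseHR is a recruiting agency">
<meta name="robots" content="noarchive">"""

reader = MetaTagReader()
reader.feed(page)
print(reader.title)
print(reader.meta)
```

Tag and attribute names are matched case-insensitively, so DESCRIPTION and description are treated alike.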
Also known as Doorway pages or Doorway sites. Multiple copies of identical web sites or web pages, often on different servers. The process of registering these multiple copies with search engines is often treated as spamdexing, because it artificially increases the relevancy of the pages. Many search engines now remove multiple mirrors from the indexes.
It used to be possible to repeat the HTML title tag in the header section of a page several times to improve search engine positioning. Most search engines now detect this trick. Below is an example of what would be considered multiple Titles or Title tags.
<TITLE>PulseHR: Recruitment of Foreign Nurses</TITLE>
<TITLE>PulseHR: Foreign Nurse Recruitment</TITLE>
<TITLE>PulseHR: Recruitment of International Nurses</TITLE>
The use of more than one Keywords META tag in order to try to increase the relevancy of the best keywords on a page. This is not recommended. It may be detected as a spamming technique, or all but one of the tags may simply be ignored.
The NOFRAMES tag allows browsers that do not support frames to see what's inside the frame. This is particularly important for search engine optimization. It is best not to use frames when designing a site, but if frames are used, the following example shows how to open the framed content for search engines:
<frameset border="1" cols="200,*" frameBorder="0" frameSpacing="4">
<noframes>
<body>
You should include HTML here to support webcrawlers and browsers that don't support frames.
You may want to include a second copy of your index and set your colors in the BODY statement above the same as you would in your index file.
</body>
</noframes>
<frame name="left" src="/htmlindex.html">
<frame name="right" src="/htmlintroduction.html">
</frameset>
The NOSCRIPT tag allows browsers to "see" what users would see when they push a button (i.e. view dynamic content). NOSCRIPT content is shown by script-aware browsers if scripting is disabled or if the script uses a scripting language that the browser does not understand. NOSCRIPT supports all core attributes, international attributes, and events, though they are not needed.
If you wish to make pages which are widely compatible and will work with the next generation of browsers without breaking, it is wise to use the Type AND Language attributes, while avoiding the Src, For and Event attributes. You should also provide alternate content. For example:
<SCRIPT TYPE="text/vbscript" LANGUAGE="VBScript"></SCRIPT>
<NOSCRIPT>Alternate content for browsers that cannot run the script goes here.</NOSCRIPT>
A measure of the number and quality of links to a particular page (inbound links). Many search engines (and most noticeably Google) are increasingly using this number as part of the positioning process. The number and quality of inbound links is becoming as important as the optimization of page content. A free service to measure page popularity can be found at http://www.linkpopularity.com.
A small box that appears over a visited page to deliver information or display an ad.
See also NOSCRIPT tag
The process of ordering web sites or web pages by a search engine or a directory so that the most relevant sites appear first in the search results for a particular query. There are a number of software programs that can be used to determine how a URL is positioned for a particular search engine when using a particular search phrase.
A method of modifying a web page so that search engines (or a particular search engine) treat the page as more relevant to a particular query (or a set of queries).
A word, a phrase or a group of words, possibly combined with other syntax used to pass instructions to a search engine or a directory in order to locate web pages.
The process of informing a search engine or directory that a new web page or web site should be indexed.
The method a search engine or directory uses to match the keywords in a query with the content of each web page, so that the web pages found can be ordered suitably in the query results. Each search engine or directory is likely to use a different algorithm, and to change or improve its algorithm from time to time.
The most common method for determining the order in which search results are displayed. Each search tool uses its own unique algorithm. Most use "fuzzy and" combined with such factors as how often your terms occur in documents, whether they occur together as a phrase, and whether they are in Title or how near the top of the text. Popularity is another ranking system.
Repeating the search engine registration process one or more times for the same page or site. Under certain circumstances, this is regarded with suspicion by the search engines, as it could indicate that someone is experimenting with spamming techniques.
The Infoseek and Altavista search engines are particularly vulnerable to spamming because they list sites very quickly, and are thus easy to experiment with. Both engines de-list sites for repeated re-submission and Infoseek, for example, does not allow more than one submission of the same page in a 24 hour period. Occasional re-submission of changed pages is not normally a problem.
Any browser program which follows hypertext links and accesses web pages but is not directly under human control. Examples are the search engine spiders, the "harvesting" programs which extract e-mail addresses and other data from web pages and various intelligent web searching programs. A database of web robots is maintained by Webcrawler.
A text file stored in the top level directory of a web site to deny access by robots to certain pages or sub-directories of the site. Only robots which comply with the Robots Exclusion Standard will read and obey the commands in this file. Robots will read this file on each visit, so that pages or areas of sites can be made public or private at any time by changing the content of robots.txt before re-submitting to the search engines.
Take a look at this simple example provided by Google:
<META NAME="Googlebot" CONTENT="nofollow">
In this example, a robot should neither index this document, nor analyse it for links:
<META NAME="ROBOTS" CONTENT="noindex, nofollow">
In this example, you are asking robots not to archive your pages, so that your old pages, the ones you have removed, for example, do not get displayed in search results:
<META NAME="ROBOTS" CONTENT="noarchive">
For more information about robots.txt, see also the HTML Author's Guide to the Robots META tag. Note, however, that currently only a few robots support the Robots META tag.
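To illustrate how a compliant robot interprets a robots.txt file, the sketch below feeds a made-up file to Python's standard urllib.robotparser module. The file contents, the robot name "BadBot" and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: keep all robots out of /private/,
# and keep one robot (named "BadBot" here) out of the whole site.
robots_txt = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

public_ok = parser.can_fetch("Googlebot", "http://www.example.com/index.html")
private_ok = parser.can_fetch("Googlebot", "http://www.example.com/private/a.html")
badbot_ok = parser.can_fetch("BadBot", "http://www.example.com/index.html")
print(public_ok, private_ok, badbot_ok)
```

Keep in mind that the file is only a request: robots that ignore the Robots Exclusion Standard can still fetch the listed pages.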
RSS stands for Rich Site Summary or RDF Site Summary. A machine-readable file format designed for syndicating content of websites. It is a format for distributing and gathering content from sources across the Web, including newspapers, magazines, and blogs. News sites publish via RSS, and then individuals and websites automatically get the updated content.
A server or a collection of servers dedicated to indexing Internet web pages, storing the results and returning lists of pages which match particular queries. Some of the major search engines are Google, Altavista, MSN, Excite, Hotbot, Infoseek, Lycos, and Webcrawler. Note that Yahoo is a directory, not a search engine.
The term Search Engine is also often used to describe both directories and search engines.
Search Engines for the general web (like all those listed above) do not really search the World Wide Web directly. Each one searches a database of the full text of web pages selected from the billions of web pages out there residing on servers. When you search the web using a search engine, you are always searching a somewhat stale copy of the real web page. When you click on links provided in a search engine's search results, you retrieve from the server the current version of the page.
Crawler-based search engines have three major elements. Search engine databases are selected and built by computer robot programs called spiders, the first major element. A spider is that part of a search engine which surfs the web, storing the URLs and indexing the keywords and text of each page it finds. Google's spider, also called a crawler, is named Googlebot. Although it is said spiders "crawl" the web in their hunt for pages to include, in truth they stay in one place. They find the pages for potential inclusion by following the links in the pages they already have in their database (i.e., already "know about"). They cannot think or type a URL or use judgment to "decide" to go look something up on the Internet.
If a web page is never linked to in any other page, search engine spiders cannot find it. The only way a brand new page - one that no other page has ever linked to - can get into a search engine is for its URL to be sent by some human to the search engine companies as a request that the new page be included. All search engine companies offer ways to do this.
After spiders find pages, they pass them on to another computer program for "indexing." The index is the second major element of a search engine. Sometimes called the catalog, the index is like a giant book containing a copy of every Web page that the spider finds. If a Web page changes, then this index is updated with new information. Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a Web page may have been "spidered" but not yet "indexed." Until it is indexed (added to the index), it is not available to those searching with the search engine.
The third part of a search engine is the search engine software that sifts through the millions of pages recorded in the index to find matches to a search query and rank them in order of what it believes is most relevant.
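The second and third elements (the index and the software that searches it) can be sketched as a toy example in Python. This is a deliberately simplified illustration: the pages are hypothetical, matching requires every query word to be present, and the relevance ranking a real search engine performs is left out.

```python
from collections import defaultdict

# Hypothetical pages a spider has already fetched: URL -> page text.
pages = {
    "http://example.com/a": "foreign nurses recruitment agency",
    "http://example.com/b": "search engine optimization tips",
    "http://example.com/c": "recruitment tips for nurses",
}

# Indexing: map each word to the set of URLs that contain it.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)


def search(query: str) -> list[str]:
    """Return URLs containing every query word, sorted for stable output."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


print(search("nurses recruitment"))
```

The query runs against the stored index rather than the live pages, which is why search results can lag behind the current version of a page.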
Some types of pages and links are excluded from most search engines by policy. Others are excluded because search engine spiders cannot access them. Pages that are excluded are referred to as the "Invisible Web", i.e. what you don't see in search engine results. The Invisible Web is estimated to be two to three or more times bigger than the visible web. (Source: Berkeley University).
More information about how different search engines work: http://comparesearchengines.dogpile.com/OverlapAnalysis.pdf
Usually, search engine optimization (SEO) of a website is defined as a set of techniques applied to the website to improve its ranking and positioning with search engines, with the goal of helping visitors or potential customers find a website through a keyword search on search engines. Optimization may involve design/layout changes, new text for the TITLE tags, META tags, alt attributes, headings, and above all, changes to the main text. Additionally, it is also advisable to supplement "organic" site optimization with pay-per-click campaigns, such as Google's AdWords, for example.
Below are a few SEO rules:
- Quality links, especially incoming links, should be created.
- Keyword-rich copy (content) should be present on all pages.
- Content of the Title tag and META tags should be page relevant, i.e. these tags should contain keywords that are repeated throughout the body (content) of a particular page.
- Large sites with heavy long pages rank higher than small sites.
- Supplementary documentation in .PDF and other formats adds great SEO value (articles, manuals, white papers).
- A link to a 'Site Map' should be placed on every page (avoid using 'include' files, if possible).
- Frames should be avoided (unless proper coding is implemented).
A computer, program or process which responds to requests for information from a client. On the Internet, all web pages are held on servers. This includes those parts of the search engines and directories which are accessible from the Internet.
The use of various means to steal another site's traffic. Techniques used include the wholesale copying of web pages (with the copied page altered slightly to direct visitors to a different site, and then registered with the search engines) and the use of keywords or keyword phrases "belonging" to other organizations, companies or web sites.
The alteration or creation of a document with intent to deceive an electronic catalog or filing system. Any technique that increases the potential position of a site at the expense of the quality of the search engine's database can also be regarded as spamdexing, also known as spamming or spoofing. Note that search engines contain an automatic de-listing function in their algorithm, and errors do occur. If you think that your site has been unjustly de-listed, contact the search engines' support department, explain your story, be very nice and polite, and request to be re-included in the index. You may need to do so a number of times before you get to hear from them.
Spamming is also used more generally to refer to the sending of unsolicited bulk electronic mail, and the search engine use is derived from this term.
See also Spamdexing
That part of a search engine which surfs the web, storing the URLs and indexing the keywords and text of each page it finds. Please refer to the Search Engine Watch SpiderSpotting Chart for details of individual spiders. See also Robot.
The process of surfing the web, storing URLs and indexing keywords, links and text. Typically, even the largest search engines cannot spider all of the pages on the net. This is due to the huge amount of data available, the speed at which the new data appears, the use of politeness windows and practical limits on the number of pages that can be visited in a given time. The search engines have to make compromises in order to visit as many sites as possible, and they do this in different ways. For example, some only index the home pages of each site, some only visit sites they're explicitly told about, and some make judgment about the importance of sites (from number and quality of inbound links) before "digging deeper" into the sub-pages of a site.
See also Spamdexing
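The link-following process and the practical page limit described above can be sketched in Python as a breadth-first walk over a tiny, made-up web. The dictionary of pages is hypothetical; a real spider would download each page and extract its links instead.

```python
from collections import deque

# A hypothetical, in-memory "web": each URL maps to the links on that page.
web = {
    "/home": ["/about", "/products"],
    "/about": ["/home"],
    "/products": ["/products/widget"],
    "/products/widget": [],
    "/orphan": ["/home"],  # no page links here, so it is never found
}


def spider(start: str, max_pages: int) -> list[str]:
    """Visit pages breadth-first from start, stopping after max_pages."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


print(spider("/home", max_pages=10))
print(spider("/home", max_pages=2))
```

The page that nothing links to is never reached, and lowering max_pages shows how a visit limit keeps a spider from "digging deeper" into a site.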
Static pages reside on servers, each identified by a unique URL, and waiting to be retrieved when their URL is invoked. Spiders can find a static page if it is linked to in any other page they "know" about. They follow links to it and retrieve it much as you would by clicking if you knew the link. Static pages are not invisible, although search engines might choose to omit them for policy reasons discussed below.
The opposite of a static page is a "dynamically generated" page. Pages created as the result of a search are called dynamically generated pages. The answer to your query is encased in a web page designed to carry the answer and sent to your computer. Often the page is not stored anywhere afterward, because its unique content (the answer to your specific query) is probably not of use to many other people. It's easier for the database to regenerate the page when needed than to keep it around.
See also Dynamically generated page
In database searching, "stop words" are small and frequently occurring words like and, or, in, of that are often ignored when keyed as search terms. Sometimes putting them in quotes " " will allow you to search them. Sometimes + immediately before them makes them searchable. As a rule, it is advisable to check with search engines themselves as to what they omit and what not.
See also how Google works
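A rough Python sketch of stop-word handling follows. The word list is a small illustrative sample, and treating a leading + as "keep this word" mirrors the convention mentioned above; actual behavior varies by search engine.

```python
# A small illustrative stop-word list; each search engine keeps its own.
STOP_WORDS = {"and", "or", "in", "of", "the", "a", "an", "to"}


def strip_stop_words(query: str) -> list[str]:
    """Drop stop words from a query unless they are prefixed with '+'."""
    kept = []
    for term in query.lower().split():
        if term.startswith("+"):
            kept.append(term[1:])  # '+' forces the word to be searched
        elif term not in STOP_WORDS:
            kept.append(term)
    return kept


print(strip_stop_words("recruitment of nurses in Canada"))
print(strip_stop_words("+the office"))
```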
Notifications or commands written into a web document. Tags is a colloquial name for 'HTML Tags', a code to identify the different parts of a document so that a web browser will know how to display it. Tags are enclosed in angle brackets: <tag>. The end of a tagged section is marked with a closing tag: </tag>.
Some of the most important tags from the SEO perspective are the following:
The title tag is used to identify a page. Title text is important because it normally forms the link to the page from the search engine listings, and because the search engines pay special attention to the title text when indexing the page. Don't confuse this text with heading text within the web page which often looks like the title. Usually this will be rendered either using the HTML heading tags or just rendered with a large font size. Note that placing multiple Title tags is considered spam (see Multiple titles).
<TITLE>Anna Tulchinsky: Helping you attract customers through Search Engine Optimization</TITLE>
See also Tags
The visitors to a web page or web site. Also refers to the number of visitors, hits, accesses etc. over a given period.
A real visitor to a web site. Web servers record the IP addresses of each visitor, and this is used to determine the number of real people who have visited a web site. If for example, someone visits twenty pages within a web site, the server will count only one unique visitor (because the page accesses are all associated with the same IP address) but twenty page accesses.
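The difference between page accesses and unique visitors can be shown with a short Python sketch. The log entries are invented for illustration, and the addresses come from the 192.0.2.0/24 range reserved for documentation.

```python
# Hypothetical access-log entries: (visitor IP address, page requested).
access_log = [
    ("192.0.2.10", "/index.html"),
    ("192.0.2.10", "/services.html"),
    ("192.0.2.10", "/contact.html"),
    ("192.0.2.77", "/index.html"),
]

page_accesses = len(access_log)
unique_visitors = len({ip for ip, _page in access_log})
print(f"{page_accesses} page accesses from {unique_visitors} unique visitors")
```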
Uniform Resource Locator. An address which can specify your website uniquely. For example: www.AnnaTulchinsky.com.
The writing of text particularly for a web page. Similar to the writing of copy for any other type of publication, good web copywriting can have a great effect on search engine positioning, and it forms a major part of search engine optimization.