September 2009 – Samwebman info (New)

September 16, 2009

Windows – Auto Backup Mysql Database

1. Save the command lines below as a .bat file.

2. Schedule the .bat file with Windows Task Scheduler.

@echo off
REM Edit the values below for your environment.
REM Path to the MySQL\bin folder
SET MysqlBinPath=D:\Mysql\bin
REM Backup folder
SET BackupPath=D:\backup
REM IP address of the database machine
SET DBhost=211.58.1.2
REM Database username
SET DBuser=root
REM Database password
SET DBpass=12345
REM Database name
SET DBname=databaseName
SET Argument=--opt --compress --force --default-character-set=utf8

REM No need to change anything below this line.

REM Get the date.
FOR /F "tokens=1-4 delims=/ " %%a IN ("%date%") DO (
    SET _MyDate=%%a-%%b-%%c %%d
)

REM Export the .sql file with the date in its name.
echo Backing up database %DBname% ...
"%MysqlBinPath%\mysqldump" --host=%DBhost% --user=%DBuser% --password=%DBpass% %Argument% %DBname% > "%BackupPath%\%_MyDate%.sql"

REM Report errors and remove the incomplete dump.
IF NOT %ERRORLEVEL% == 0 (
    del "%BackupPath%\%_MyDate%.sql"
    echo.
    echo Something went wrong! Please refer to the messages above.
    pause
)

Reference from: http://www.ge.net.tw/?q=node/521

September 15, 2009

Tracking Zero Result Searches in Google Analytics

September 8, 2009 by Justin Cutroni

I <3 Google Analytics Site Search reports. There’s amazingly actionable data in those reports. But they’re missing one vital piece of information: searches that don’t produce any results.

Why is this important? Don’t you want to know when visitors search and don’t get any results? Zero result searches can help you identify missing content on your site or a problem with your site search engine.


Many search solutions will provide this information for you. For example, I use Search Meter for WordPress and it shows me which search queries generate zero results. But I thought it would be interesting to add this data to Google Analytics. That way all my site search information would be in one place.

Unfortunately there is no easy way to add this data to GA. You need to do some programming to collect the data. So this post is really meant for those folks with programming resources AND for those developers that maintain GA plugins. Like my buddy Joost, who has a great GA plugin for WordPress.

If you’re interested in the data and analysis, skip to the bottom of this post.

Conceptual Overview

Our goal with this hack is to modify site search data in two ways. First, we’re going to put all search queries with zero results in a category. This will allow us to use the Search Categories report to easily find all the search terms that yielded zero results.

Second, we’ll modify the actual search terms to indicate that a term yielded zero results. This will make it easy to scan a list of all the search terms and identify which generated no results.

Before we get into the implementation, a big THANK YOU to Charles Miller, one of the lead consultants here. He wrote the JavaScript below. Thanks Charles.

Step 1: Identify No Result Search

The first step is to identify a zero results search page. Most websites have the same search results page regardless of the number of results. You need to identify something that differentiates a zero results search page from a non-zero results search page.

This must be done programmatically and is the hardest part of the implementation.

For example, a zero results search page on this blog has the text “No posts found. Try a different search?”


I can create code (or more specifically, Charles can create code) to look for the text “No posts found. Try a different search?” If the code finds this text in the page, then I know that the visitor’s search yielded zero results and I can modify the data sent to GA. Here’s the code that I’m using on this blog:

var content = document.getElementById('content');
// indexOf() returns -1 when the phrase is absent, so compare against -1.
if (content && content.innerHTML.indexOf('No posts found.') !== -1) {
     // Zero-result search page: modify the data sent to GA (see Step 2).
}

The code looks for a section of the page called ‘content’ and then searches that section for the phrase ‘No posts found.’. If ‘no posts found.’ is found (oh, the irony!) then we will modify the data sent to GA.

Important! The way you detect a zero result search page may be different. It’s VERY difficult to create an example that will work for everyone. Take this as a conceptual overview.
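As a rough illustration of the detection idea, here is a minimal sketch written as a standalone function so it can be tried outside a browser. The function name `isZeroResultPage` and the sample markup are hypothetical, and the marker text has to be whatever your own zero-results page prints. The explicit comparison matters because both `indexOf` and `search` return -1 on a miss, and -1 is truthy in JavaScript.

```javascript
// Hypothetical helper: returns true when the page markup contains the
// "no results" marker text. The marker string is site-specific.
function isZeroResultPage(html, marker) {
  if (typeof html !== "string" || marker === "") {
    return false;
  }
  // indexOf returns -1 when the marker is absent, and 0 when the marker
  // sits at the very start, so the result is compared explicitly.
  return html.indexOf(marker) !== -1;
}

// Made-up sample markup for illustration only.
var zeroPage = "<div id='content'><p>No posts found. Try a different search?</p></div>";
var normalPage = "<div id='content'><h2>Results</h2></div>";

console.log(isZeroResultPage(zeroPage, "No posts found."));   // true
console.log(isZeroResultPage(normalPage, "No posts found.")); // false
```

In a page you would pass `content.innerHTML` as the first argument.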

Step 2: Tweak GA Tracking Code

Once we know what differentiates a zero results search page we can add some code that tweaks the data. Remember, we want to modify the data in two ways: 1. by placing it in a special search category and 2. by modifying the search term to indicate it did not yield any results.

To create the category all we need to do is add an extra query string parameter to the URL.

To manipulate the search term we need to split apart the page URL and then put it back together with the phrase no-results.

Here’s the complete code.


<script type='text/javascript'>
var pageTracker = _gat._getTracker("UA-XXXXXX-1");
var content = document.getElementById('content');
// These lines get the search data from the URL.
var sn = "s";
var sr = new RegExp(sn+"=[^&]+"),
    p = document.location.pathname,
    s = document.location.search,
    m = s.match(sr);
// String.search() returns -1 (which is truthy) when nothing matches,
// so test indexOf() against -1 instead. Also skip pages with no search term.
if (content && m && content.innerHTML.indexOf('No posts found.') !== -1) {
     // Deconstruct the matched parameter into its name and value.
     var sm = m.toString(),
      srs = sm.split("="),
      // The next line is where we add the category and add
      // the phrase no-results to the search term.
      sre = sm.replace(sr,srs[0]+"=no-results: "+srs[1]+"&cat=no-results"),
      sf = s.replace(sr,sre);
      // Send the data to Google as a Pageview
      pageTracker._trackPageview(p+sf);
} else {
      // If this is a regular page on the site, use the standard GA code.
      pageTracker._trackPageview();
}
</script>

The code starts with the section that identifies a zero result search page.

Then we deconstruct the URL to identify the search term. Finally we add the category named ‘no-results’ and the phrase ‘no-results’ to the search term.

If the code does NOT find the term ‘No posts found.’ then a pageview is created as normal.
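To make the URL manipulation easier to follow, here is a minimal sketch of the same rewrite as a pure function that takes the path and query string as arguments. The name `buildNoResultsPath` is hypothetical, it assumes the parameter name is a plain word such as `s`, and it drops the space after the `no-results:` prefix so the result is easier to read. It is an illustration of the idea, not a replacement for the tracking code.

```javascript
// Hypothetical helper: builds the virtual pageview path for a zero-result
// search. paramName is the site's search query parameter ("s" here).
function buildNoResultsPath(pathname, search, paramName) {
  // Require "?" or "&" in front so that "cats=1" is not mistaken for "s=1".
  var pattern = new RegExp("([?&])" + paramName + "=([^&]*)");
  if (!pattern.test(search)) {
    // No search parameter in the URL: leave the path untouched.
    return pathname + search;
  }
  // A replacer function keeps any "$" in the search term literal.
  var rewritten = search.replace(pattern, function (whole, sep, term) {
    return sep + paramName + "=no-results:" + term + "&cat=no-results";
  });
  return pathname + rewritten;
}

console.log(buildNoResultsPath("/blog/", "?s=fenway&paged=2", "s"));
// /blog/?s=no-results:fenway&cat=no-results&paged=2
```

The returned string is what would be handed to `pageTracker._trackPageview()` on a zero-result page.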

That’s it for the coding part (thank goodness!)

Step 3: Configure Site Search Settings

The last step is to add the new category parameter to the Site Search settings so GA can identify the no-results search category. This is easy: it’s in the profile settings section of Google Analytics.

[Figure: How to set a search Category parameter in Google Analytics]

I also like to set the ‘Strip Query Parameter’ to YES. This removes the category parameter after site search is done processing and normalizes your pageview data.

That’s it for the configuration! We’re cleared for insight-hunting!

Analyzing The Data

When a visitor performs a search that yields zero results the search term will be placed in a category named ‘no-results’. To find this data navigate to the Content>Site Search>Categories Report:

Immediately you’ll be able to see what percentage of your searches yield zero results. Hopefully it’s very low! Want to see if this impacts conversions or revenue? Click the Goals or Ecommerce tab to check the conversion rate:

This is a bad picture, but you get the point.

Next you can click on the no-results line in the data and see exactly which search terms yielded zero results.

This is super-actionable data. Now you know where you may be missing content or if your site search engine might be broken. You should be asking yourself, “Why are there no results for these terms? Is there missing content or is there a problem with my site search engine?”

You’ll also notice that the search terms now have ‘no-results’ in them. This provides a lot of flexibility for viewing the search data in other ways. For example, let’s use the Search Terms report:

Here we can see the search terms ranked by searches. What percent of your top 10, 20 or 50 are no-result searches? How is that impacting your bottom line?
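If you export the search terms, the share of zero-result searches can be worked out with a few lines of code. The sketch below assumes a hypothetical export format (an array of objects with `term` and `searches` fields) and made-up numbers; only the ‘no-results’ prefix comes from the setup described above.

```javascript
// Hypothetical helper: fraction of all searches whose term carries the
// zero-result prefix. rows is an array of { term, searches } objects.
function zeroResultShare(rows, prefix) {
  var total = 0;
  var zero = 0;
  rows.forEach(function (row) {
    total += row.searches;
    if (row.term.indexOf(prefix) === 0) {
      zero += row.searches;
    }
  });
  return total === 0 ? 0 : zero / total;
}

// Made-up sample data for illustration only.
var sampleRows = [
  { term: "google analytics", searches: 60 },
  { term: "no-results: fenway tickets", searches: 25 },
  { term: "regex", searches: 15 }
];

console.log(zeroResultShare(sampleRows, "no-results:")); // 0.25
```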

This is just the start. You can use other metrics, like % Search Exits, to understand if visitors who receive zero results refine their search or exit.

While this is not the easiest thing to configure, I hope you see the value of the data. More so, I hope that all those folks that maintain plugins add this type of feature to their GA plugins. Joost, you listening!?

Resource from: http://www.epikone.com/blog/2009/09/08/tracking-ero-result-searches-in-google-analytics/

September 15, 2009

SEO – Common Tag and Search Engine Optimization

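The ultimate task of a search engine is to be able to “read” the data on the web; only then can it truly process what the data means. Google, Yahoo and others have long been able to process RDF data tags, and a few days ago Yahoo announced support for Common Tag ... What do these changes mean for SEO? Which SEO tasks will they affect?

What is RDF (Resource Description Framework)? RDF is a model for describing data. For example, the sentence “Mr. Wang is an artist” contains three elements: “Mr. Wang”, “is a”, and “artist”.

“Mr. Wang” is the subject, “is a” is the predicate, and “artist” is the object.

If we also say “Mr. Li is an artist”, then “Mr. Wang” and “Mr. Li” share the same property under the predicate “is a”. In plain language: they are both artists!

If data has the structure described above, a search engine can work out what the content of a document means.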
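The subject / predicate / object idea can be sketched in a few lines of code. This is only an illustration of the data model with plain objects, not an RDF library; the third triple and the helper name `subjectsWith` are made up for the example.

```javascript
// Each statement is a subject / predicate / object triple.
var triples = [
  { subject: "Mr. Wang", predicate: "is a", object: "artist" },
  { subject: "Mr. Li", predicate: "is a", object: "artist" },
  // Made-up extra statement so the query below has something to exclude.
  { subject: "Mr. Wang", predicate: "lives in", object: "Taipei" }
];

// Return every subject that shares the given predicate and object.
function subjectsWith(store, predicate, object) {
  return store
    .filter(function (t) { return t.predicate === predicate && t.object === object; })
    .map(function (t) { return t.subject; });
}

console.log(subjectsWith(triples, "is a", "artist")); // [ 'Mr. Wang', 'Mr. Li' ]
```

Because the statements are structured, "who is an artist?" becomes a simple lookup rather than a guess based on the literal text.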

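In the figure below, the left side shows what browser software sees in a document, and the right side shows what a human sees.

The goal of these technologies is to let computers see both sides of the figure clearly and understand what the content actually is.

The red part of the figure above is the RDFa (RDF-in-attributes) description. RDFa can be seen as a simplified form of RDF designed to be used together with current XHTML, and dc stands for Dublin Core ... This makes it clear that “The trouble with Bob” is the title and that the author is Alice.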

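OK ... so what is Common Tag?
Common Tag uses the RDFa framework so that commonly used, already-defined tags can be used to describe your content.

As Commontag.org puts it:

Unlike free-text tags, Common Tags are references to unique, well-defined concepts, complete with metadata and their own URLs. With Common Tag, site owners can more easily create topic hubs, cross-promote their content, and enrich their pages with free data, images and widgets.

A Common Tag is not a free-text tag: you must find an already-defined tag that describes your content before you can use it to describe your document. The architecture is as follows: companies such as AdaptiveBlue, DERI (NUI Galway), Faviki, Freebase, Yahoo!, Zemanta, and Zigtag develop the platforms on which tags are defined.

For example, the markup below uses Common Tag to define an image as: the Phoenix Mars mission

For now, these ways of describing data will not have much impact on SEO, but their influence will keep growing. Without clearly described data, a search engine cannot capture the real meaning of a document and can only analyze the literal text. Seeing “Mr. Wang”, it would not know that he and “Mr. Li” are both “artists”; it would only recognize a “Mr.” whose surname is “Wang” ... or, more likely, just the three characters “王先生”, which mean nothing at all to a search engine!

This is what the previous post, “The future of SEO and SEM: the arrival of 3.0”, was getting at: future SEO will no longer be something a few external links can take care of. Today’s SEO vendors rely on intensive manual labor to set up large numbers of blogs, copy in all sorts of assorted content, and link it back to the sites they operate. Look closely and you will find unrelated content linked together at random. This approach only creates piles of web garbage, and how long it will stay effective is anyone’s guess ...

Further references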
ReadWriteWeb: Common Tag Bring Standards to Metadata
Freebase
DBPedia

September 15, 2009

What is RDFa

RDFa (or Resource Description Framework – in – attributes) is a W3C Recommendation that adds a set of attribute level extensions to XHTML for embedding rich metadata within Web documents. The RDF data model mapping enables its use for embedding RDF triples within XHTML documents; it also enables the extraction of RDF model triples by compliant user agents. (Source: Wikipedia)

Reference:

RDFa Primer (Chinese), About RDFa – Google, SearchMonkey support RDFa Enabled

RDFa helps machines understand your pages better

Video: How to use RDFa markup to specify an image’s license type

September 15, 2009

Introducing Rich Snippets

Tuesday, May 12, 2009 at 12:00 PM

Webmaster Level: All

As a webmaster, you have a unique understanding of your web pages and the content they represent. Google helps users find your page by showing them a small sample of that content — the “snippet.” We use a variety of techniques to create these snippets and give users relevant information about what they’ll find when they click through to visit your site. Today, we’re announcing Rich Snippets, a new presentation of snippets that applies Google’s algorithms to highlight structured data embedded in web pages.

Rich Snippets give users convenient summary information about their search results at a glance. We are currently supporting data about reviews and people. When searching for a product or service, users can easily see reviews and ratings, and when searching for a person, they’ll get help distinguishing between people with the same name. It’s a simple change to the display of search results, yet our experiments have shown that users find the new data valuable — if they see useful and relevant information from the page, they are more likely to click through. Now we’re beginning the process of opening up this successful experiment so that more websites can participate. As a webmaster, you can help by annotating your pages with structured data in a standard format.

To display Rich Snippets, Google looks for markup formats (microformats and RDFa) that you can easily add to your own web pages. In most cases, it’s as quick as wrapping the existing data on your web pages with some additional tags. For example, here are a few relevant lines of the HTML from Yelp’s review page for “Drooling Dog BarBQ” before adding markup data:

and now with microformats markup:

or alternatively, use RDFa markup. Either format works:

By incorporating standard annotations in your pages, you not only make your structured data available for Google’s search results, but also for any service or tool that supports the same standard. As structured data becomes more widespread on the web, we expect to find many new applications for it, and we’re excited about the possibilities.

To ensure that this additional data is as helpful as possible to users, we’ll be rolling this feature out gradually, expanding coverage to more sites as we do more experiments and process feedback from webmasters. We will make our best efforts to monitor and analyze whether individual websites are abusing this system: if we see abuse, we will respond accordingly.

To prepare your site for Rich Snippets and other benefits of structured data on the web, please see our documentation on structured data annotations.

Now, time for some Q&A with the team:

If I mark up my pages, does that guarantee I’ll get Rich Snippets?

No. We will be rolling this out gradually, and as always we will use our own algorithms and policies to determine relevant snippets for users’ queries. We will use structured data when we are able to determine that it helps users find answers sooner. And because you’re providing the data on your pages, you should anticipate that other websites and other tools (browsers, phones) might use this data as well. You can let us know that you’re interested in participating by filling out this form.

What about other existing microformats? Will you support other types of information besides reviews and people?

Not every microformat corresponds to data that’s useful to show in a search result, but we do plan to support more of the existing microformats and define RDFa equivalents.

What’s next?

We’ll be continuing experiments with new types (beyond reviews and people) and hope to announce support for more types in the future.

I have too much data on my page to mark it all up.

That wasn’t a question, but we’ll answer anyway. For the purpose of getting data into snippets, we don’t need every bit of data: it simply wouldn’t fit. For example, a page that says it has “497 reviews” of a product probably has data for 10 and links to the others. Even if you could mark up all 497 blocks of data, there is no way we could fit it into a single snippet. To make your part of this grand experiment easier, we have defined aggregate types where necessary: a review-aggregate can be used to summarize all the review information (review count, average/min/max rating, etc.).

Why do you support multiple encodings?

A lot of previous work on structured data has focused on debates around encoding. Even within Google, we have advocates for microformat encoding, advocates for various RDF encodings, and advocates for our own encodings. But after working on this Rich Snippets project for a while, we realized that structured data on the web can and should accommodate multiple encodings: we hope to emphasize this by accepting both microformat encoding and RDFa encoding. Each encoding has its pluses and minuses, and the debate is a fine intellectual exercise, but it detracts from the real issues.

We do believe that it is important to have a common vocabulary: the language of object types, object properties, and property types that enable structured data to be understood by different applications. We debated how to address this vocabulary problem, and concluded that we needed to make an investment. Google will, working together with others, host a vocabulary that various Google services and other websites can use. We are starting with a small list, which we hope to extend over time.

Wherever possible, we’ll simply reuse vocabulary that is in wide use: we support the pre-existing vCard and hReview types, and there are a variety of other types defined by various communities. Sites that use Google Custom Search will be able to define their own types, which we will index and present to users in rich Custom Search results pages. Finally, we encourage and expect this space to evolve based on new ideas from the structured data community. We’ll notice and reach out when our crawlers pick up new types that are getting broad use.

Written by Kavi Goel, Ramanathan V. Guha, and Othar Hansson

Reference from:

http://googlewebmastercentral.blogspot.com/2009/05/introducing-rich-snippets.html

Google enhances its search engine: new Options feature

September 15, 2009

Media RSS Module – RSS 2.0 Module

“Media RSS” Specification Version 1.1.2

An RSS module that supplements the <enclosure> element capabilities of RSS 2.0 to allow for more robust media syndication.

Change Notes

12/01/2004 – Created

02/21/2005 – Major consolidation of all requested changes: [1.0.0]

Added <media:group> element for grouping <media:content> objects.
Added framerate, height, width attributes to media:content.
Added additional type of expression for continuous streams.
Added <media:adult> element to distinguish content of an adult nature.
Added <media:title> element.
Modified <media:thumbnail> to have a url attribute for better module consistency.
Added scheme attribute to <media:category> to allow specifying categorization scheme.
Added label attribute to <media:category> for a human readable label.
Added <media:hash> element for media binary hashing.
Added <media:player> element and removed playerUrl attributes from <media:content>.
Overhaul of <media:people> to become <media:credit>.
Added type attribute to <media:text> to distinguish formatting of text.
Improved descriptions of various elements and attributes.

08/22/2005 – Improved global syndication capabilities: [1.1.0]

Corrected spelling mistakes.
Add lang attributes to media:text and media:content.
Deprecated <media:adult>, and added <media:rating> as a replacement.
Modified <media:credit> to give more flexibility to represent things other than people or companies.
Added <media:description> element.
Added medium attribute to <media:content> to explicitly determine what type of media is expressed.
Added channels, samplingrate attributes to <media:content>.
Added <media:restriction> element.
Added <media:keywords> element.
Added a “Best Practices” section to encourage use of the Feed History module, and Dublin Core’s expiration capability.
Add time code information to <media:thumbnail> and <media:text>.

10/22/2005 – Minor improvements: [1.1.1]

Added capability for elements to appear at the <channel> level.
Fleshed out <media:restriction> to allow explicit global relationships to be expressed.

03/12/2008 – Namespace corrections: [1.1.2]

Added trailing slash to namespace

Namespace declaration

The namespace for Media RSS is defined to be: http://search.yahoo.com/mrss/

For example:

        <rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">

NOTE: There is a trailing slash in the namespace, although there has been confusion around this in earlier versions.

Description

“Media RSS” is a new RSS module that supplements the enclosure capabilities of RSS 2.0. RSS enclosures are already being used to syndicate audio files and images. Media RSS extends enclosures to handle other media types, such as short films or TV, as well as provide additional metadata with the media. Media RSS enables content publishers and bloggers to syndicate multimedia content such as TV and video clips, movies, images, and audio.

Primary Elements

<media:group>

<media:group> is a sub-element of <item>. It allows grouping of <media:content> elements that are effectively the same content, yet different representations. For instance: the same song recorded in both the WAV and MP3 format. It’s an optional element that must only be used for this purpose.

<media:content>

<media:content> is a sub-element of either <item> or <media:group>. Media objects that are not the same content should not be included in the same <media:group> element. The sequence of these items implies the order of presentation. While many of the attributes appear to be audio/video specific, this element can be used to publish any type of media. It contains 14 attributes, most of which are optional.

 
        <media:content 
               url="http://www.foo.com/movie.mov" 
               fileSize="12216320" 
               type="video/quicktime"
               medium="video"
               isDefault="true" 
               expression="full" 
               bitrate="128" 
               framerate="25"
               samplingrate="44.1"
               channels="2"
               duration="185" 
               height="200"
               width="300" 
               lang="en" />

url should specify the direct url to the media object. If not included, a <media:player> element must be specified.

fileSize is the number of bytes of the media object. It is an optional attribute.

type is the standard MIME type of the object. It is an optional attribute.

medium is the type of object (image | audio | video | document | executable). While this attribute can at times seem redundant if type is supplied, it is included because it simplifies decision making on the reader side, as well as flushes out any ambiguities between MIME type and object type. It is an optional attribute.

isDefault determines if this is the default object that should be used for the <media:group>. There should only be one default object per <media:group>. It is an optional attribute.

expression determines if the object is a sample or the full version of the object, or even if it is a continuous stream (sample | full | nonstop). Default value is ‘full’. It is an optional attribute.

bitrate is the kilobits per second rate of media. It is an optional attribute.

framerate is the number of frames per second for the media object. It is an optional attribute.

samplingrate is the number of samples per second taken to create the media object. It is expressed in thousands of samples per second (kHz). It is an optional attribute.

channels is the number of audio channels in the media object. It is an optional attribute.

duration is the number of seconds the media object plays. It is an optional attribute.

height is the height of the media object. It is an optional attribute.

width is the width of the media object. It is an optional attribute.

lang is the primary language encapsulated in the media object. Language codes possible are detailed in RFC 3066. This attribute is used similar to the xml:lang attribute detailed in the XML 1.0 Specification (Third Edition). It is an optional attribute.

These optional attributes, along with the optional elements below, contain the primary metadata entries needed to index and organize media content. Additional supported attributes for describing images, audio, and video may be added in future revisions of this document.
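As an illustration of how a feed generator might emit this element, here is a minimal sketch that serializes a plain object into a `<media:content />` tag. The helper names `escapeAttr` and `mediaContentTag` are hypothetical; the attribute list is the 14 names described above, and the sketch does no validation of the values themselves.

```javascript
// Escape characters that are not allowed inside an XML attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// The 14 attribute names listed in the text, in the order of the example.
var CONTENT_ATTRS = ["url", "fileSize", "type", "medium", "isDefault",
  "expression", "bitrate", "framerate", "samplingrate", "channels",
  "duration", "height", "width", "lang"];

// Hypothetical helper: serialize an object into a <media:content /> tag,
// ignoring keys that are not one of the known attributes.
function mediaContentTag(attrs) {
  var parts = CONTENT_ATTRS
    .filter(function (name) { return attrs[name] !== undefined; })
    .map(function (name) { return name + '="' + escapeAttr(attrs[name]) + '"'; });
  return "<media:content " + parts.join(" ") + " />";
}

console.log(mediaContentTag({
  url: "http://www.foo.com/movie.mov?a=1&b=2",
  medium: "video",
  duration: 185
}));
// <media:content url="http://www.foo.com/movie.mov?a=1&amp;b=2" medium="video" duration="185" />
```

Escaping matters mostly for url, since query strings commonly contain an ampersand.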

Note: While both <media:content> and <media:group> have no limitations on the number of times they can appear, the general nature of RSS should be preserved: an <item> represents a “story”. Simply stated, this is similar to the blog style of syndication. However, if one is using this module to strictly publish media, there should be one <item> element for each media object/group. This is to allow for proper attribution for the origination of the media content through the <link> element. It also allows the full benefit of the other RSS elements to be realized.

Optional Elements

The following elements are optional and may appear as sub-elements of <channel>, <item>, <media:content> and/or <media:group>.

When an element appears at a shallow level, such as <channel> or <item>, it means that the element should be applied to every media object within its scope.

Duplicated elements appearing at deeper levels of the document tree have higher priority over other levels. For example, <media:content> level elements are favored over <item> level elements. The priority level is listed from strongest to weakest: <media:content>, <media:group>, <item>, <channel>.
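The priority rule can be sketched as a simple lookup that walks the levels from strongest to weakest. The function name `resolveOptional` and the input shape (an object keyed by level name) are assumptions made for this illustration; a real reader would collect these values while parsing the feed.

```javascript
// Levels ordered from strongest to weakest, as stated in the text.
var PRIORITY = ["media:content", "media:group", "item", "channel"];

// Hypothetical input: an object mapping each level to the value of one
// optional element (for example media:rating) declared at that level.
function resolveOptional(valuesByLevel) {
  for (var i = 0; i < PRIORITY.length; i++) {
    var value = valuesByLevel[PRIORITY[i]];
    if (value !== undefined && value !== null) {
      return value;
    }
  }
  // The element was not declared at any level.
  return undefined;
}

console.log(resolveOptional({ channel: "nonadult", item: "adult" })); // adult
console.log(resolveOptional({ channel: "nonadult" }));                // nonadult
```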

<media:adult>

[NOTE: This is deprecated, and has been replaced with the more flexible <media:rating>]

<media:rating>

This allows the permissible audience to be declared. If this element is not included, it assumes that no restrictions are necessary. It has one optional attribute.

               <media:rating scheme="urn:simple">adult</media:rating>
               <media:rating scheme="urn:icra">r (cz 1 lz 1 nz 1 oz 1 vz 1)</media:rating>
               <media:rating scheme="urn:mpaa">pg</media:rating>
               <media:rating scheme="urn:v-chip">tv-y7-fv</media:rating>

scheme is the URI that identifies the rating scheme. It is an optional attribute. If this attribute is not included, the default scheme is urn:simple (adult | nonadult).

<media:title>

The title of the particular media object. It has 1 optional attribute.

        <media:title type="plain">The Judy's - The Moo Song</media:title>

type specifies the type of text embedded. Possible values are either ‘plain’ or ‘html’. Default value is ‘plain’. All html must be entity-encoded. It is an optional attribute.

<media:description>

Short description describing the media object, typically a sentence in length. It has 1 optional attribute.

        <media:description type="plain">This was some really bizarre band I listened to as a young lad.</media:description>

type specifies the type of text embedded. Possible values are either ‘plain’ or ‘html’. Default value is ‘plain’. All html must be entity-encoded. It is an optional attribute.

<media:keywords>

Highly relevant keywords describing the media object with typically a maximum of ten words. The keywords and phrases should be comma delimited.

        <media:keywords>kitty, cat, big dog, yarn, fluffy</media:keywords>
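Reading the keywords back is a matter of splitting on commas and trimming whitespace. A minimal sketch, with the hypothetical helper name `parseKeywords`:

```javascript
// Split a comma-delimited keyword string into trimmed, non-empty entries.
function parseKeywords(text) {
  return text
    .split(",")
    .map(function (k) { return k.trim(); })
    .filter(function (k) { return k.length > 0; });
}

console.log(parseKeywords("kitty, cat, big dog, yarn, fluffy"));
// [ 'kitty', 'cat', 'big dog', 'yarn', 'fluffy' ]
```

Splitting on commas rather than whitespace keeps phrases such as “big dog” intact.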

<media:thumbnail>

Allows particular images to be used as representative images for the media object. If multiple thumbnails are included, and time coding is not at play, it is assumed that the images are in order of importance. It has 1 required attribute and 3 optional attributes.

        <media:thumbnail url="http://www.foo.com/keyframe.jpg" width="75" height="50" time="12:05:01.123" />

url specifies the url of the thumbnail. It is a required attribute.

height specifies the height of the thumbnail. It is an optional attribute.

width specifies the width of the thumbnail. It is an optional attribute.

time specifies the time offset in relation to the media object. Typically this is used when creating multiple keyframes within a single video. The format for this attribute should be in the DSM-CC’s Normal Play Time (NPT) as used in RTSP [RFC 2326 3.6 Normal Play Time]. It is an optional attribute.

Notes:

NPT has a second or subsecond resolution. It is specified as H:M:S.h (npt-hhmmss) or S.h (npt-sec), where H=hours, M=minutes, S=second and h=fractions of a second.

A possible alternative to NPT would be SMPTE. It is believed that NPT is simpler and easier to use.
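To show how the two Normal Play Time forms map to a number, here is a minimal parsing sketch. The function name `parseNpt` is hypothetical, and the sketch only covers the H:M:S.h and S.h forms described in the notes above; other values defined by RTSP, such as the literal “now”, are not handled and yield NaN.

```javascript
// Parse a Normal Play Time value into seconds.
// Accepts "H:M:S.h" (npt-hhmmss) or "S.h" (npt-sec); returns NaN otherwise.
function parseNpt(text) {
  var hms = /^(\d+):([0-5]?\d):([0-5]?\d)(\.\d+)?$/.exec(text);
  if (hms) {
    var fraction = hms[4] ? parseFloat(hms[4]) : 0;
    return Number(hms[1]) * 3600 + Number(hms[2]) * 60 + Number(hms[3]) + fraction;
  }
  if (/^\d+(\.\d+)?$/.test(text)) {
    return parseFloat(text);
  }
  return NaN;
}

console.log(parseNpt("00:00:10.000"));            // 10
console.log(parseNpt("185.5"));                   // 185.5
console.log(parseNpt("12:05:01.123").toFixed(3)); // 43501.123
```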

<media:category>

Allows a taxonomy to be set that gives an indication of the type of media content, and its particular contents. It has 2 optional attributes.

        <media:category scheme="http://search.yahoo.com/mrss/category_schema">music/artist/album/song</media:category>

        <media:category scheme="http://dmoz.org" label="Ace Ventura - Pet Detective">Arts/Movies/Titles/A/Ace_Ventura_Series/Ace_Ventura_-_Pet_Detective</media:category>

        <media:category scheme="urn:flickr:tags">ycantpark mobile</media:category>

scheme is the URI that identifies the categorization scheme. It is an optional attribute. If this attribute is not included, the default scheme is ‘http://search.yahoo.com/mrss/category_schema’.

label is the human readable label that can be displayed in end user applications. It is an optional attribute.

<media:hash>

This is the hash of the binary media file. It can appear multiple times as long as each instance is a different algo. It has 1 optional attribute.

        <media:hash algo="md5">dfdec888b72151965a34b4b59031290a</media:hash>

algo indicates the algorithm used to create the hash. Possible values are ‘md5’ and ‘sha-1’. Default value is ‘md5’. It is an optional attribute.

<media:player>

Allows the media object to be accessed through a web browser media player console. This element is required only if a direct media url attribute is not specified in the <media:content> element. It has 1 required attribute, and 2 optional attributes.

        <media:player url="http://www.foo.com/player?id=1111" height="200" width="400" />

url is the url of the player console that plays the media. It is a required attribute.

height is the height of the browser window that the url should be opened in. It is an optional attribute.

width is the width of the browser window that the url should be opened in. It is an optional attribute.

<media:credit>

Notable entity and the contribution to the creation of the media object. Current entities can include people, companies, locations, etc. Specific entities can have multiple roles, and several entities can have the same role. These should appear as distinct <media:credit> elements. It has 2 optional attributes.

        <media:credit role="producer" scheme="urn:ebu">entity name</media:credit>

role specifies the role the entity played. Must be lowercase. It is an optional attribute.

scheme is the URI that identifies the role scheme. It is an optional attribute. If this attribute is not included, the default scheme is ‘urn:ebu’. See: European Broadcasting Union Role Codes.

Example roles:

        actor
        anchor person
        author
        choreographer
        composer
        conductor
        director
        editor
        graphic designer     
        grip
        illustrator
        lyricist
        music arranger
        music group
        musician
        orchestra
        performer
        photographer
        producer
        reporter
        vocalist

Additional roles: European Broadcasting Union Role Codes

<media:copyright>

Copyright information for the media object. It has 1 optional attribute.

        <media:copyright url="http://blah.com/additional-info.html">2005 FooBar Media</media:copyright>

url is the url for a terms of use page or additional copyright information. If the media is operating under a Creative Commons license, the Creative Commons module should be used instead. It is an optional attribute.

<media:text>

Allows the inclusion of a text transcript, closed captioning, or lyrics of the media content. Many of these elements are permitted to provide a time series of text. In such cases, it is encouraged, but not required, that the elements be grouped by language and appear in time sequence order based on the start time. Elements can have overlapping start and end times. It has 4 optional attributes.

        <media:text type="plain" lang="en" start="00:00:03.000" end="00:00:10.000">Oh, say, can you see</media:text>
        <media:text type="plain" lang="en" start="00:00:10.000" end="00:00:17.000">By the dawn's early light</media:text>

type specifies the type of text embedded. Possible values are either ‘plain’ or ‘html’. Default value is ‘plain’. All html must be entity-encoded. It is an optional attribute.

lang is the primary language encapsulated in the media object. Language codes possible are detailed in RFC 3066. This attribute is used similar to the xml:lang attribute detailed in the XML 1.0 Specification (Third Edition). It is an optional attribute.

start specifies the start time offset that the text starts being relevant to the media object. An example of this would be for closed captioning. It uses the NPT time code format (see: the time attribute used in <media:thumbnail>). It is an optional attribute.

end specifies the end time that the text is relevant. If this attribute is not provided, and a start time is used, it is expected that the end time is either the end of the clip or the start of the next <media:text> element.
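One way a reader could fill in missing end times is sketched below. It assumes the cues are already sorted by start time and expressed in seconds, and it picks the start of the next cue when there is one and the end of the clip otherwise; that ordering of the two fallbacks is an interpretation, since the text allows either. The name `fillEndTimes` and the input shape are hypothetical.

```javascript
// Hypothetical helper: cues is an array of { text, start, end } objects
// sorted by start time (seconds); end may be missing.
function fillEndTimes(cues, clipDuration) {
  return cues.map(function (cue, i) {
    if (cue.end !== undefined) {
      return cue;
    }
    var next = cues[i + 1];
    return {
      text: cue.text,
      start: cue.start,
      // Fall back to the next cue's start, or to the end of the clip.
      end: next ? next.start : clipDuration
    };
  });
}

var sampleCues = [
  { text: "Oh, say, can you see", start: 3 },
  { text: "By the dawn's early light", start: 10 }
];

var filled = fillEndTimes(sampleCues, 185);
console.log(filled[0].end); // 10
console.log(filled[1].end); // 185
```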

<media:restriction>

Allows restrictions to be placed on the aggregator rendering the media in the feed. Currently, restrictions are based on distributor (uri) and country codes. This element is purely informational and no obligation can be assumed or implied. Only one <media:restriction> element of the same type can be applied to a media object – all others will be ignored. Entities in this element should be space separated. To allow the producer to explicitly declare his/her intentions, two literals are reserved: ‘all’, ‘none’. These literals can only be used once. This element has 1 required attribute, and 1 optional attribute (with strict requirements for its exclusion).

        <media:restriction relationship="allow" type="country">au us</media:restriction>

relationship indicates the type of relationship that the restriction represents (allow | deny). In the example above, the media object should only be syndicated in Australia and the United States. It is a required attribute.

Note: If the element is empty and the relationship is “allow”, it is assumed that the empty list means “allow nobody” and the media should not be syndicated.

A more explicit method would be:

        <media:restriction relationship="allow" type="country">au us</media:restriction>

type specifies the type of restriction (country | uri) that applies to syndication of the media. It is an optional attribute; however, it can only be omitted when using one of the literal values “all” or “none”.

“country” allows restrictions to be placed based on country code. [ISO 3166]

“uri” allows restrictions based on URI. Examples: urn:apple, http://images.google.com, urn:yahoo, etc.
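One way an aggregator might interpret a country restriction is sketched below in Python. The function name is hypothetical, and two points are assumptions rather than statements from the specification: that a “deny” list permits everyone not listed, and that country codes are compared case-insensitively.

```python
def country_allowed(relationship, entities, country):
    """Decide whether media may be shown in `country`.

    relationship: "allow" or "deny" (the relationship attribute)
    entities: the element's text content, space separated
    """
    listed = entities.split()
    if listed == ["all"]:
        matches = True
    elif listed == ["none"] or not listed:
        # An empty list matches nobody, so an empty "allow" allows nobody.
        matches = False
    else:
        matches = country.lower() in (code.lower() for code in listed)
    # Assumption: "deny" simply inverts the match.
    return matches if relationship == "allow" else not matches
```

With the example above, `country_allowed("allow", "au us", "us")` is true and `country_allowed("allow", "au us", "de")` is false.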

Best Practices

The following are encouraged “best practices” when using Media RSS:

Feed History Specification

If your feed is not “incremental” in the traditional RSS sense, but rather an entire snapshot of all media available, please take note of the <fh:incremental> element. When set to “false”, this element informs the RSS reader that the current feed replaces the previously fetched feed.

If you prefer to syndicate media more along the lines of traditional RSS, this specification also allows you to daisy chain multiple feeds together to compose a history of media that is available on your site.

Expirations Using Dublin Core

To the best of your ability, media that is scheduled to expire after a given time should be duly noted through Dublin Core’s <dcterms:valid> element.

Examples

A recently created movie, using the RSS 2.0 <enclosure> element and without the use of the Media RSS module.

<rss version="2.0">
<channel>
<title>Title of page</title>
<link>http://www.foo.com</link>
<description>Description of page</description>
    <item>
        <title>Story about something</title>
        <link>http://www.foo.com/item1.htm</link>
        <enclosure url="http://www.foo.com/file.mov" 
        length="320000" type="video/quicktime"/>
    </item>
</channel>
</rss>

A movie review with a trailer, using a Creative Commons license.

<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/"
xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">
<channel>
<title>My Movie Review Site</title>
<link>http://www.foo.com</link>
<description>I review movies.</description>
    <item>
        <title>Movie Title: Is this a good movie?</title>
        <link>http://www.foo.com/item1.htm</link>
        <media:content url="http://www.foo.com/trailer.mov" 
        fileSize="12216320" type="video/quicktime" expression="sample"/>
        <creativeCommons:license>
        http://www.creativecommons.org/licenses/by-nc/1.0
        </creativeCommons:license>
        <media:rating>nonadult</media:rating>
    </item>
</channel>
</rss>

A music video with a link to a player window, and additional metadata about the video, including expiration date.

<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/"
xmlns:dcterms="http://purl.org/dc/terms/">
<channel>
<title>Music Videos 101</title>
<link>http://www.foo.com</link>
<description>Discussions of great videos</description>
    <item>
        <title>The latest video from an artist</title>
        <link>http://www.foo.com/item1.htm</link>
        <media:content url="http://www.foo.com/movie.mov" fileSize="12216320" 
        type="video/quicktime" expression="full">
        <media:player url="http://www.foo.com/player?id=1111" 
        height="200" width="400"/>
        <media:hash algo="md5">dfdec888b72151965a34b4b59031290a</media:hash>
        <media:credit role="producer">producer's name</media:credit>
        <media:credit role="artist">artist's name</media:credit>
        <media:category scheme="http://blah.com/scheme">music/artist 
        name/album/song</media:category>
        <media:text type="plain">
        Oh, say, can you see, by the dawn's early light
        </media:text>
        <media:rating>nonadult</media:rating>
        <dcterms:valid>
            start=2002-10-13T09:00+01:00;
            end=2002-10-17T17:00+01:00;
            scheme=W3C-DTF
        </dcterms:valid>
        </media:content>
    </item>
</channel>
</rss>
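The <dcterms:valid> value in the example above is a semicolon-separated list of name=value pairs. A reader could parse it roughly as follows; this is a sketch, not part of the specification. The function names are hypothetical, and it only handles the full date-time-with-offset form shown in the example (date-only values, or a trailing “Z” on older Python versions, would need extra handling).

```python
from datetime import datetime, timezone

SAMPLE = """
    start=2002-10-13T09:00+01:00;
    end=2002-10-17T17:00+01:00;
    scheme=W3C-DTF
"""


def parse_valid_period(text):
    """Parse 'start=...; end=...; scheme=...' into a dict."""
    fields = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    period = {"scheme": fields.get("scheme")}
    for key in ("start", "end"):
        period[key] = datetime.fromisoformat(fields[key]) if key in fields else None
    return period


def is_expired(period, now=None):
    """True once `now` is past the period's end; no end means never expired."""
    if now is None:
        now = datetime.now(timezone.utc)
    return period["end"] is not None and now > period["end"]
```

An aggregator could call `is_expired(parse_valid_period(text))` before displaying an item and skip media whose period has ended.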

Several different songs that relate to the same topic.

<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
<channel>
<title>Song Site</title>
<link>http://www.foo.com</link>
<description>Discussion on different songs</description>
    <item>
        <title>These songs make me think about blah</title>
        <link>http://www.foo.com/item1.htm</link>
        <media:content url="http://www.foo.com/band1-song1.mp3" 
        fileSize="1000" type="audio/mpeg" expression="full">
        <media:credit role="musician">member of band1</media:credit>
        <media:category>music/band1/album/song</media:category>
        <media:rating>nonadult</media:rating>
        </media:content>
        <media:content url="http://www.foo.com/band2-song1.mp3" 
        fileSize="2000" type="audio/mpeg" expression="full">
        <media:credit role="musician">member of band2</media:credit>
        <media:category>music/band2/album/song</media:category>
        <media:rating>nonadult</media:rating>
        </media:content>
        <media:content url="http://www.foo.com/band3-song1.mp3" 
        fileSize="1500" type="audio/mpeg" expression="full">
        <media:credit role="musician">member of band3</media:credit>
        <media:category>music/band3/album/song</media:category>
        <media:rating>nonadult</media:rating>
        </media:content>
    </item>
</channel>
</rss>

Same song with multiple files at different bitrates and encodings. (Bittorrent example as well)

<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
<channel>
<title>Song Site</title>
<link>http://www.foo.com</link>
<description>Songs galore at different bitrates</description>
    <item>
        <title>Cool song by an artist</title>
        <link>http://www.foo.com/item1.htm</link>
        <media:group>
            <media:content url="http://www.foo.com/song64kbps.mp3" 
            fileSize="1000" bitrate="64" type="audio/mpeg" 
            isDefault="true" expression="full"/>
            <media:content url="http://www.foo.com/song128kbps.mp3" 
            fileSize="2000" bitrate="128" type="audio/mpeg" 
            expression="full"/>
            <media:content url="http://www.foo.com/song256kbps.mp3" 
            fileSize="4000" bitrate="256" type="audio/mpeg" 
            expression="full"/>
            <media:content url="http://www.foo.com/song512kbps.mp3.torrent" 
            fileSize="8000" type="application/x-bittorrent;enclosed=audio/mpeg" 
            expression="full"/>
            <media:content url="http://www.foo.com/song.wav" 
            fileSize="16000" type="audio/x-wav" expression="full"/>
            <media:credit role="musician">band member 1</media:credit>
            <media:credit role="musician">band member 2</media:credit>
            <media:category>music/artist name/album/song</media:category>
            <media:rating>nonadult</media:rating>
        </media:group>
    </item>
</channel>
</rss>
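A client reading a <media:group> like the one above has to choose one rendition. The Python sketch below shows one possible policy: take the highest declared bitrate that fits a limit, otherwise the element marked isDefault, otherwise the first one. The policy, the `pick_content` name, and the shortened sample group are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

MEDIA = "{http://search.yahoo.com/mrss/}"

# Shortened, hypothetical version of the group in the example above.
SAMPLE = """<media:group xmlns:media="http://search.yahoo.com/mrss/">
  <media:content url="http://www.foo.com/song64kbps.mp3" bitrate="64" isDefault="true"/>
  <media:content url="http://www.foo.com/song128kbps.mp3" bitrate="128"/>
  <media:content url="http://www.foo.com/song.wav"/>
</media:group>"""


def pick_content(group, max_bitrate=None):
    """Return the URL of the preferred <media:content> in a <media:group>."""
    contents = group.findall(MEDIA + "content")
    if max_bitrate is not None:
        # Highest declared bitrate that fits; entries without one are skipped.
        fitting = [c for c in contents
                   if c.get("bitrate") and float(c.get("bitrate")) <= max_bitrate]
        if fitting:
            return max(fitting, key=lambda c: float(c.get("bitrate"))).get("url")
    for c in contents:
        if c.get("isDefault") == "true":
            return c.get("url")
    return contents[0].get("url") if contents else None
```

For instance, `pick_content(ET.fromstring(SAMPLE), max_bitrate=200)` selects the 128 kbps file, while calling it without a limit falls back to the default 64 kbps file.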

Acknowledgements

Thank you to everyone who has contributed to this specification, and to all those that sent suggestions and corrections. The Yahoo! Group “rss-media” has been instrumental in helping transform the initial Media RSS proposal into a working specification. While there have been many helpful individuals from this community, special thanks go to Danny Ayers, Marc Canter, Lucas Gonze, Vadim Zaliva, Greg Smith, Robert Sayre, Suzan Foster, Erwin van Hunen, Greg Gershman, Jennifer Kolar, Bill Kearney, and Andreas Haugstrup Pedersen.

On the Yahoo! team: David Hall, John Thrall, Eckart Walther, Jeremy Zawodny, Andy Volk, and Bradley Horowitz.

On the Google team: David Marwood and Peter Chane.

Reference from: http://video.search.yahoo.com/mrss

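September 15, 2009

Google – Media Sitemap

About video Sitemaps

A Google Video Sitemap is an extension of the Sitemap protocol that lets you publish your online video content and its related metadata to Google, making it searchable in the Google Video index. You can use a video Sitemap to add descriptive information, such as a video’s title, description, and duration, that makes it easier for users to find a particular piece of content. When users find your video through Google, they are linked to the environment that hosts the video to watch the full playback.

When you submit a video Sitemap to Google, we make the video URLs it contains searchable on Google Video. Search results include a thumbnail of the video content (provided by you or generated automatically by Google) along with information contained in the video Sitemap, such as the title. Your videos may also appear in other Google search products. During the trial period we cannot predict or guarantee whether, or when, your videos will be added to our index, but as we continue to refine the product we expect to improve both coverage and indexing speed.

Google can crawl the following video file types: .mpg, .mpeg, .mp4, .mov, .wmv, .asf, .avi, .ra, .ram, .rm, .flv. All files must be accessible via HTTP. We do not currently support metafiles that require the source file to be downloaded over a streaming protocol.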

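Submitting a video Sitemap

Create the video Sitemap and save it at a publicly accessible URL. Currently, Google cannot fetch files from URLs protected by authentication (even basic HTTP authentication). Both the feed itself and the URLs the feed points to must have their robots.txt file correctly configured for the User-agent “Googlebot”.
Sign in to Google Webmaster Tools with your Google Account and confirm that the site has been added to your account.
Click [Add Sitemap] next to the site.
Select [Video Sitemap].
Enter the URL of the video Sitemap in the field provided. Be sure to enter the full URL, for example “http://www.example.com/videofeed.xml”.
Click [Add Video Sitemap].

When a Sitemap is first added, its status is shown as [Pending]. After Google has processed your video Sitemap (which may take several hours), the status changes to [OK] or [Error]. If you receive an error message, click the error to view more information. Not all errors are serious; sometimes the process can still complete even though an error was reported.

Where should I put my video Sitemap?

You must place the video Sitemap at a publicly accessible URL. Currently, Google cannot fetch files from URLs protected by authentication (even basic HTTP authentication). Both the feed itself and the URLs the feed points to must have their robots.txt file correctly configured for the User-agent “Googlebot”.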
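Before submitting, it can be worth checking that robots.txt does not block Googlebot from the Sitemap or the video URLs. The Python sketch below uses the standard library's `urllib.robotparser` on robots.txt text held in a string, so it runs without network access; the robots.txt content, the helper name, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may fetch everything except /private/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def googlebot_can_fetch(robots_txt, url):
    """Check a URL against robots.txt rules for the Googlebot user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)
```

In this sample, `googlebot_can_fetch(ROBOTS_TXT, "http://www.example.com/videofeed.xml")` is true, while a video under `/private/` would be reported as blocked.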

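Creating a video Sitemap

Using an mRSS feed as a video Sitemap
Creating a video Sitemap

Using an mRSS feed as a video Sitemap

Google supports mRSS, an RSS module that supplements the element capabilities of RSS 2.0 to allow more robust media syndication. If you publish an mRSS feed of the video content on your site, you can submit the feed’s URL as a Sitemap. For details on creating an mRSS feed, including examples and best practices, see the Media RSS specification. Google also supports RSS 2.0 feeds that use enclosure tags for video content and thumbnail URLs.

Creating a video Sitemap

A video Sitemap uses the Sitemap protocol together with additional video-specific tags, which are defined below. If the text on the video page does not match the text you provide in the video Sitemap, Google uses the text on the video page.

Once the video Sitemap has been created, you can submit it to Google using Webmaster Tools. Although a video Sitemap can help Google find content on your site that it might not otherwise discover, we do not guarantee that every video in the Sitemap will appear in our search results, or that all the information in your video Sitemap will be used.

Here is a sample video Sitemap entry that uses the video-specific tags, for reference: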

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
<url>
  <loc>http://www.example.com/videos/some_video_landing_page.html</loc>
    <video:video>
      <video:content_loc>http://www.site.com/video123.flv</video:content_loc>
      <video:player_loc allow_embed="yes">http://www.site.com/videoplayer.swf?video=123</video:player_loc>
      <video:thumbnail_loc>http://www.example.com/thumbs/123.jpg</video:thumbnail_loc>
      <video:title>Grilling steaks for summer</video:title>
      <video:description>Tips for cooking delicious steaks</video:description>
      <video:rating>4.2</video:rating>
      <video:view_count>12345</video:view_count>
      <video:publication_date>2007-11-05T19:20:30+08:00</video:publication_date>
      <video:expiration_date>2009-11-05T19:20:30+08:00</video:expiration_date>
      <video:tag>steak</video:tag>
      <video:tag>meat</video:tag>
      <video:tag>summer</video:tag>
      <video:category>Grilling</video:category>
      <video:family_friendly>yes</video:family_friendly>
      <video:duration>600</video:duration>
    </video:video>
</url>
</urlset>

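Video-specific tag definitions

Tag	Required?	Description
<loc>	Required	This tag is required and specifies the landing page (also called the play page or referrer page) for the video. When users click a video result on a search results page, they are taken to this landing page.
<video:video>	Required
<video:player_loc>	Required	The required attribute `allow_embed` specifies whether Google can embed the video in search results. Allowed values are “Yes” or “No”.
<video:content_loc>	Required
<video:thumbnail_loc>	Required	A URL pointing to the video’s thumbnail image file.
<video:title>	Required	The title of the video. Maximum 100 characters.
<video:description>		The description of the video. Descriptions longer than 2048 characters are truncated.
<video:rating>	Optional	The rating of the video. The value must be a floating-point number between 0.0 and 5.0.
<video:view_count>	Optional	The number of times the video has been viewed.
<video:publication_date>	Optional	The date the video was first published, in W3C format. Acceptable values are a complete date (YYYY-MM-DD) and a complete date plus hours, minutes, and seconds (YYYY-MM-DDThh:mm:ss). Fractional seconds and a time zone may optionally be appended. For example, `2007-07-16T19:20:30+08:00`.
<video:tag>	Optional	A tag associated with the video. Tags are usually short descriptions of the key concepts of a video or piece of content. A single video can have several tags, and those tags may all belong to the same category. For example, a video about grilling food belongs to the “Grilling” category but can be tagged “steak”, “meat”, “summer”, and “outdoor”. Create a new `<video:tag>` element for each tag associated with the video. A maximum of 32 tags is allowed.
<video:category>	Optional	The category of the video. For example, `cooking`. The value should be a string of no more than 256 characters. In general, categories are broad groupings of content by subject, and a video usually belongs to only one category. For example, a site about cooking might have separate categories such as “Broiling”, “Baking”, and “Grilling”.
<video:family_friendly>	Optional
<video:duration>	Optional	The duration of the video in seconds. The value must be between 0 and 28800 (8 hours). Non-digit characters are not allowed.
<video:expiration_date>	Optional	Acceptable values are a complete date (YYYY-MM-DD) and a complete date plus hours, minutes, and seconds (YYYY-MM-DDThh:mm:ss). Fractional seconds and a time zone may optionally be appended. For example, 2007-07-16T19:20:30+08:00.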
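The limits in the table lend themselves to a quick pre-submission check. The following Python is a rough sketch of such a check over a plain dict; the dict keys and the function name are hypothetical, and “characters” is approximated by Python string length.

```python
def check_video_entry(entry):
    """Return a list of problems found in a dict describing one video."""
    problems = []
    title = entry.get("title", "")
    if not title:
        problems.append("title is required")
    elif len(title) > 100:
        problems.append("title exceeds 100 characters")
    if len(entry.get("description", "")) > 2048:
        problems.append("description will be truncated at 2048 characters")
    rating = entry.get("rating")
    if rating is not None and not 0.0 <= float(rating) <= 5.0:
        problems.append("rating must be between 0.0 and 5.0")
    duration = entry.get("duration")
    # Digits only, and no more than 28800 seconds (8 hours).
    if duration is not None and not (str(duration).isdigit() and int(duration) <= 28800):
        problems.append("duration must be 0 to 28800 seconds, digits only")
    if len(entry.get("tags", [])) > 32:
        problems.append("at most 32 tags are allowed")
    if len(entry.get("category", "")) > 256:
        problems.append("category exceeds 256 characters")
    return problems
```

An entry such as `{"title": "Grilling steaks for summer", "rating": 4.2, "duration": 600}` passes with an empty list, while an over-long title or a duration of 30000 is reported.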

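When creating a video Sitemap, keep the following in mind:

A video Sitemap should contain only URLs that refer to video content. Video content includes web pages that embed video, URLs of video players, and URLs of raw video content hosted on your site. If Google cannot find video content at the URLs you provide, Googlebot ignores those records.
Each video is identified by its unique content URL (the location of the actual video file) or, when no content URL exists, by its player URL (the URL pointing to the video player), so you must include either the <video:player_loc> tag or the <video:content_loc> tag. If you omit these tags, we cannot find this information and cannot index the video.
Each Sitemap file you provide can contain at most 10,000 video entries and must be no larger than 10 MB when uncompressed. Individual video files and thumbnails (specified in the <video:content_loc> and <video:thumbnail_loc> tags respectively) must be no larger than 30 MB. If you have more than 10,000 videos, submit multiple Sitemaps and a Sitemap index file.
Google can crawl the following video file types: .mpg, .mpeg, .mp4, .mov, .wmv, .asf, .avi, .ra, .ram, .rm, .flv. All files must be accessible via HTTP. We do not support metafiles that require the source file to be downloaded over a streaming protocol.
URLs included in the Sitemap must have their robots.txt file correctly configured for the User-agent “Googlebot”.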
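For sites with more than 10,000 videos, the entries have to be split across several Sitemap files that are listed in a Sitemap index. The Python sketch below shows the splitting and a minimal index document in the sitemaps.org 0.9 namespace. The function names and file URLs are hypothetical, and the 10 MB size limit is not checked here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_ENTRIES = 10000  # per-file limit stated above


def chunk(entries, size=MAX_ENTRIES):
    """Split a list of video entries into groups of at most `size`."""
    return [entries[i:i + size] for i in range(0, len(entries), size)]


def sitemap_index(sitemap_urls):
    """Build a Sitemap index document listing the given Sitemap URLs."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = url
    return ET.tostring(root, encoding="unicode")
```

With 25,000 entries, `chunk` produces three groups (10,000, 10,000, and 5,000); each group would be written to its own Sitemap file and the three file URLs passed to `sitemap_index`.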

Reference from: Google Media Sitemap

September 8, 2009

20 Ways Opt-in E-Mailers Can Outsmart Spam Filters

by Dr. Ralph F. Wilson, E-Commerce Consultant
Web Marketing Today, Issue 119, December 3, 2002

It’s a jungle out there. Assurance Systems estimates that 5% of e-mails are blocked by spam filters. MarketingSherpa found a similar number but estimates that many companies will be instituting filters in the near future.

After several of my friends informed me that my Doctor Ebiz newsletter had been rejected by SpamAssassin, I decided to do some checking on my own to see how this could happen. I’ve done some evaluation on tests performed by SpamAssassin ver 2.43 (http://www.spamassassin.org/tests.html). I make no claim to be an expert, but I learned a lot from studying their tests.

Filters these days are much more sophisticated than the typical e-mail filters in Eudora and Outlook that can be made to delete an e-mail message that contains a “bad” word. Filters such as SpamAssassin look for patterns and add or subtract points for certain factors. Then, if the total score reaches a predetermined level, the message is flagged as spam. By looking at what adds points (bad) and subtracts points (good), I’ve learned how to construct e-mails that will do better with the filters, if not escape them entirely.
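The additive scoring described above can be sketched in a few lines of Python. This is not SpamAssassin's code: the rule names are invented labels, the point values are ones quoted elsewhere in this article, and the 5.0 threshold is an assumed cut-off that an ISP may well set differently.

```python
# Hypothetical rule names mapped to point values quoted in this article.
RULES = {
    "html_table_thick_border": 0.41,
    "link_uses_ip_address": 3.10,
    "subject_all_capitals": 0.48,
    "habeas_header": -4.00,
    "size_20k_to_40k": -0.71,
}

THRESHOLD = 5.0  # assumed cut-off; adjustable by whoever runs the filter


def spam_score(hits):
    """Sum the points for every rule the message triggered."""
    return round(sum(RULES[name] for name in hits), 2)


def is_flagged(hits, threshold=THRESHOLD):
    """A message is flagged once its total reaches the threshold."""
    return spam_score(hits) >= threshold
```

A message that triggers `link_uses_ip_address` and `subject_all_capitals` scores 3.58; adding a negative-point rule such as `size_20k_to_40k` pulls the total back down.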

Note: spam filters are a moving target, and my suggestions may not be as useful a few months from now. Moreover, the SpamAssassin defaults listed here can be (and sometimes are) adjusted by over-eager spam-averse ISPs, so don’t count on them. They’re best used as a way of seeing what the filters consider bad or good, rather than as a precise measure.

I’ve found 20 different strategies that can help. Some of these are crucial; others are only of minor importance. But taken together they can help you get more of your legitimate opt-in e-mails through the obstacle course to your recipients.

1. Avoid E-Mail Software or Listservers Used by Spammers

Certain desktop e-mail listserver programs, as well as ASP hosted listservers, have developed a bad reputation for sending spam.

SpamAssassin looks for “fingerprints” of programs on its “bad list,” and adds points to your spam score if it detects them. For example, any e-mail address that includes @email-publisher.com costs you 1.00 points. Employing various free web hosting services that are commonly used by spammers can hurt, too.

The desktop e-mailing software used most often by spammers (if it can be identified as such by SpamAssassin) is penalized from 3.0 to 2.0, in descending order: jpfree, VC_IPA, StormPost, JiXing, MMailer (Gammadyne, 2.73), EVAMAIL, IMktg, screwup1, Outlook 3.14159, GroupMail, hash 2. Group Mail (ver 2.0) is dinged 1.84. Other identifiable bulk mailers are penalized about 1.00 points. (Note: While I don’t spam, I use Gammadyne Mailer routinely. The current version has no tell-tale headers identifying it as in some earlier versions. I am told Group Mail 3.x does not use such headers either.)

You might study e-mails you have sent out for any header lines that indicate the brand of mailer. You’ll sometimes see this in the User-Agent and X-Mailer header lines. If you find them, disable them or insist that the software vendor remove them. It is better to send e-mail from an unknown e-mail program than from one that can be identified as used by spammers. Or use Apple Mail, which has such a good record (spammers can’t make it work well for them?) that your point score is reduced by 1.78. (Just kidding.)
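To look for such header lines, you can save a copy of a sent message and inspect it with Python's standard `email` package. In this sketch the raw message, the mailer name “ExampleBulkMailer”, and the function name are made up for illustration.

```python
from email import message_from_string

# Hypothetical raw message as it might be saved from a mail client.
RAW = """\
From: news@example.com
To: reader@example.org
Subject: Example Newsletter, December issue
X-Mailer: ExampleBulkMailer 1.0

Hello subscribers.
"""

REVEALING_HEADERS = ("X-Mailer", "User-Agent")


def mailer_fingerprints(raw_message):
    """Return the headers that reveal which program sent the message."""
    msg = message_from_string(raw_message)
    return {name: msg[name] for name in REVEALING_HEADERS if msg[name] is not None}
```

For the sample above the result is `{"X-Mailer": "ExampleBulkMailer 1.0"}`; an empty dict means neither header is present.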

2. Use Capitalization Carefully

Capital letters are seen as “yelling” and spammy. Excess capital letters cost you .21. I had been using capitalized titles until I found that I was being penalized for these. Since then, I’ve stopped using whole lines of capitalized type as headlines in my text newsletters. Instead I limit capitalization to partial lines only.

3. Keep HTML Simple

According to SpamAssassin, if your HTML message has more than 50% HTML tags (that is, has very specific formatting), you are fined 0.31 to 1.78 points. The lesson is to keep your HTML very simple. Highly stylized formats can hurt your score. Here are a few more elements to avoid, if possible:

An HTML table with a thick border (0.41 points)
JavaScript contained in the message (.21 to .30 points)
HTML comments “which obfuscate text” cost 2.08 (whatever that means).
An HTML form in your e-mail message can also be costly. An “obfuscated action attribute” in an HTML form costs 1.00 point.

4. Watch Your Hyperlinks

SpamAssassin gives links a good looking over, so be careful.

Links without an http:// prefix cost 1.28. Oops. I’ve been shortening them, but does that spamify my newsletters? I hope not.
Don’t link to URLs using IP address numbers instead of a domain name (3.1).
More on mailto links below under unsubscription systems.
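The two link checks above can be approximated with regular expressions. This Python sketch only looks at href attributes in HTML, skips mailto links, and is a simplification rather than SpamAssassin's actual tests; the function name and sample URLs are illustrative.

```python
import re

HREF = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)
# A dotted-quad host directly after the scheme, e.g. http://192.0.2.10/
IP_HOST = re.compile(r'^https?://\d{1,3}(?:\.\d{1,3}){3}(?=[:/]|$)', re.IGNORECASE)
HAS_SCHEME = re.compile(r'^https?://', re.IGNORECASE)


def risky_links(html):
    """Return (url, reason) pairs for links likely to add spam points."""
    findings = []
    for url in HREF.findall(html):
        if url.lower().startswith("mailto:"):
            continue
        if IP_HOST.match(url):
            findings.append((url, "IP address instead of domain name"))
        elif not HAS_SCHEME.match(url):
            findings.append((url, "missing http:// prefix"))
    return findings
```

Running it over a newsletter's HTML before sending gives a short list of links to rewrite.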

5. Use Color Judiciously

Realize that high art is likely to cost you something. A font color tag that isn’t formatted quite right can cost you .21. If you are using special font colors that aren’t in the palette of 216 web-safe colors, you are dinged .30 points. Hidden letters (same color as the background color) cost you .34 points. Beware the color police.

Black	0
Blue	.21
Red	.32
Gray	.33
Green	.41
Cyan	.41
Yellow	.42
Unknown color	.42
Magenta	.44

Black fonts are safe, but I’m not ready to desert color yet. I’ll try to avoid using it in font tags, however. Rather I’ll control color with style sheets and see if that helps. Unfortunately, many e-mail client programs don’t handle style sheets very well yet. Also be aware that using a background color other than white is suspect, and racks up 0.317 points.

6. Use Large Fonts and Characters Judiciously

Fonts larger than +2 or size 3 (normal) cost you 0.34 points. I don’t believe this applies to H1, H2, and H3 headings, so I’ll probably use HTML headers in the future rather than font tags to increase font size.

7. Avoid Suspect Spam Phrases

This list is a long one. I’ve included it on its own webpage so you can print it out for easy reference — “Words and Phrases that Trigger Some Spam Filters,” Web Marketing Today, 12/3/02. http://www.wilsonweb.com/wmt8/spamfilter_phrases.htm

Does it help to include * or ^ characters in place of vowels? The jury’s still out. I suspect that some spam filters are smart enough to detect this ruse, but I’m not sure.

8. Be Careful with Subject Lines

SpamAssassin is particularly interested in subject lines. Here are a few subject-line no-nos to learn from:

Contains “FREE” in CAPS	0.43
Starts with dollar amount	1.10
GUARANTEED	0.62
Starts with “Free”	0.30
Starts with “Hello”	1.58
To: username at front of subject	2.86
Subject includes a question mark or exclamation point	0.10
Subject contains lots of white space	2.64
Subject is all in capitals	0.48
Subject talks about savings	0.41
Subject talks about losing pounds	0.51
Subject is missing	0.34
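Several rows of the table above are simple enough to check mechanically before a mailing goes out. The Python below is a sketch that covers only those rows, using the point values from the table; the pattern for each rule is my own approximation, not SpamAssassin's.

```python
import re


def subject_penalties(subject):
    """Return (reason, points) pairs for a subject line."""
    if not subject.strip():
        return [("subject is missing", 0.34)]
    hits = []
    if re.search(r"\bFREE\b", subject):
        hits.append(('contains "FREE" in caps', 0.43))
    if re.match(r"\$\s?\d", subject):
        hits.append(("starts with dollar amount", 1.10))
    if subject.startswith("Free"):
        hits.append(('starts with "Free"', 0.30))
    if subject.startswith("Hello"):
        hits.append(('starts with "Hello"', 1.58))
    if "?" in subject or "!" in subject:
        hits.append(("question mark or exclamation point", 0.10))
    letters = [ch for ch in subject if ch.isalpha()]
    if letters and all(ch.isupper() for ch in letters):
        hits.append(("all in capitals", 0.48))
    return hits


def total(hits):
    """Add up the points of all triggered rules."""
    return round(sum(points for _, points in hits), 2)
```

A subject such as “FREE OFFER!” triggers three rules at once, which shows how quickly small penalties add up.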

9. Carefully Word Your Unsubscribe System

It seems ironic that legitimate opt-in e-mailers are penalized for having unsubscription information. But since so many spammers have bogus systems, it is apparently a spam indicator. For example:

List removal information	1.00
Click-to-remove with PHP/ASP action found	0.30
Claims you can be removed from the list	2.70
Claims to listen to some removal request list	1.00
Says: “to be removed, reply via email” or similar	0.45
Header contains exists:X-List-Unsubscribe	1.11

You need to include ways to unsubscribe, of course, but avoid the phrase “click here to…” and substitute something like “use this link to ….” You’re especially hurt by using mailto e-mail links with “remove” — or anything, for that matter — in the subject. Make sure that the program you are using to unsubscribe people doesn’t have “unsubscribe” or “remove” in the URL.

10. Flaunt Being a Newsletter

Fortunately, being a legitimate newsletter lowers your spam score.

Subject contains newsletter header (list)	-0.22
Subject contains newsletter header (news)	-0.62
Subject contains newsletter header (in review)	-1.00
Subject contains a frequency – probable newsletter	-0.73
Subject contains a month name – probable newsletter	-0.48
Subject contains a date	-1.60

Other words and phrases which may help you include a PGP signature, or something about a forgotten password or a registration system.

11. Use a Signature

You’re helped if your e-mail contains an e-mail signature — since so many spam messages don’t.

Short signature present (no empty lines)	-0.30
Short signature present (empty lines)	-2.09
Long signature present (no empty lines)	-3.13
Long signature present (empty lines)	-0.30
Contains what looks like an ‘E-Mail Disclaimer’	-0.70
Contains what looks like an email attribution	-1.63
Contains what looks like a quoted email text	-0.83

12. Don’t Mention Spam Law Compliance

It’s very unwise to claim that you observe all the spam laws. Only spammers say that. SpamAssassin will assess you from .91 to 3.47 points for this. If you mention House Bill 4176 you’ll be fined 2.02 points. H.R. 3113 dings you 2.93.

13. Message Size of 20K to 40K Helps

Since so many spam messages are under 20K, SpamAssassin gives you credit for a message size between 20K and 40K (-.71). Over 40K helps you less (-.12).

14. Remove Spam Flag Addresses from Your List

Occasionally, evil-minded people will add e-mail addresses to your list just to get you in trouble with the anti-spammers. Try scanning your e-mail database for an e-mail address that starts with abuse@, postmaster@, or nospam@. Sometimes an e-mail address will be inserted that subscribes you to an autoresponder each time you send out an e-mailing. You might scan for the word “subscribe” among your e-mail addresses (though this one won’t affect you with the spam filters).
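A scan like the one described above takes only a few lines of Python. The prefixes come from the paragraph; the function name and sample addresses are hypothetical, and note that matching on “subscribe” also catches addresses containing “unsubscribe”.

```python
FLAG_PREFIXES = ("abuse@", "postmaster@", "nospam@")


def suspicious_addresses(addresses):
    """Return the addresses that deserve a manual look before mailing."""
    flagged = []
    for address in addresses:
        lowered = address.strip().lower()
        if lowered.startswith(FLAG_PREFIXES) or "subscribe" in lowered:
            flagged.append(address)
    return flagged
```

Run it over an export of the subscriber list and review the flagged entries by hand rather than deleting them automatically.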

15. Monitor Your “From” E-mail Address for Challenge Systems

I am seeing a small but increasing number of recipients who use challenge systems that block all e-mails except those whose senders take the trouble to respond to a challenge message, and perhaps give a name and a reason for the e-mail. Thus, it’s important to monitor the mailbox for your “From” e-mail address to catch these challenges.

16. Ask Subscribers to Put Your Address in their “Whitelist” or Address Book

Some e-mail client programs such as AOL 8.0 and Hotmail have recently changed their interface to allow users to sort their mail into preferred folders. As people subscribe, ask them specifically to place you in their address book (AOL), “safe list” (Hotmail), or “whitelist” (some spam filters). That way your e-mail will come directly into their inbox. Asking may be a little trouble, but it may make the difference between your recipients seeing or not seeing your e-mail.

17. Monitor Blacklists and Test Accounts

ISPs and spam filter systems often check blacklists of known spammers to help them reject e-mails. If your listserver’s IP address or domain — or yours — gets on a spam blacklist because of complaints of spam, it will prevent some of your e-mails from getting to their recipients. Your listserver vendor should be actively working with ISPs and anti-spam services to keep an excellent reputation in the e-mail community and resolve any problems. But if they fail to — or cater to spammers — your e-mail delivery can suffer.

SpamAssassin currently checks three blacklists, and addresses that appear on such lists cause substantial penalties to any e-mails coming from them.

Razor2 (http://razor.sf.net)
DCC List (www.rhyolite.com/anti-spam/dcc/dcc-tree/dcc.html)
Pyzor (http://pyzor.sf.net)

Some other blacklists that may prevent your recipients from receiving their e-mail include:

Mail Abuse Prevention System (MAPS, www.mail-abuse.org) maintains the Realtime Blackhole List, an important blacklist, and has many ISPs as subscribers.
Network Abuse Clearinghouse (www.abuse.net)
NJABL.ORG (Not Just Another Blacklist, www.njabl.org)
SPAM Blocking Blackhole List (http://blackholes.bruli.net)

Other anti-spam organizations are listed in the Yahoo! Directory under “Email > Spam”.

You can check many blacklists at once to see if your domain is on any of them using a utility from OsiruSoft Research & Engineering (http://relays.osirusoft.com/cgi-bin/rbcheck.cgi).

In addition to checking blacklists periodically, it might be a good idea to subscribe to some of the more important ISPs (or find a friend who subscribes) so you can monitor if your e-mails are getting through. ISPs with the largest blocks of subscribers include America Online (with CompuServe and RoadRunner), MSN, Earthlink (with Mindspring and others), United Online (Juno and NetZero), and SBC/Prodigy. If you find your newsletter blacklisted, contact the service(s) involved and actively work to see the ban removed.

18. Move Immediately to Confirmed Opt-in

As I argued a few months ago in “Why I’m Moving to Double Opt-in Subscription Confirmation,” Web Marketing Today, 9/10/02 (www.wilsonweb.com/wmt7/double_optin.htm), the time has come for each company to require the higher standard of confirmed opt-in for new subscribers. If the government doesn’t require it, then the free marketplace driven by spam filters may require the higher standard. When you’re falsely accused of spamming, it’s a whole lot easier to argue your case before an ISP or blacklist when you have a confirmed opt-in standard than if you don’t.

19. Use the Habeas Header If You Qualify

Finally, if you do use a confirmed opt-in system and qualify to apply for a Habeas warrant mark (www.habeas.com), then I suggest you purchase a license to use it. Habeas is actively working with the anti-spam community and leading spam filters to have their mark (contained in headers) recognized as certifying your e-mail as confirmed opt-in. SpamAssassin, for example, subtracts 4.00 points from your score if the e-mail message contains the Habeas header lines. For more information on Habeas, read my Review of Habeas, Web Marketing Today, 1/7/03 (www.wilsonweb.com/reviews/habeas.htm).

I wish that I could guarantee that if you took all the above steps, your legitimate opt-in e-mails would get through the spam filters. But I can’t. I can’t even get all my newsletters through. Another important piece of this problem is to reduce the quantity of spam, and to do that requires legislation.

20. Use a Spam Checker to Test Your Message

We’re now seeing some services you can use to test the spam quotient of your e-zines and e-mail offers before sending them out.

SiteSell SpamCheck Report tests your message at no charge using SpamAssassin and sends you a report. Send your test e-mails to mailto:sales-spamcheck@sitesell.net. Be careful, however, to put the word TEST as the first word in the subject, and make sure it is capitalized. Otherwise, the system will delete the mail, thinking it’s spam. Following the word TEST, add the subject line that would normally appear in the e-mail.
Assurance Systems offers three functions as part of a paid service. (1) Message Checker rates your e-mail message for spam. (2) Mailbox Monitor checks test addresses for each of the major ISPs to make sure your e-mail is being delivered. (3) Blacklist Alert lets you know what blacklists you are appearing on so you can work to get your domain or IP number off the list. http://www.assurancesys.com

I don’t want intrusive government regulation any more than you. But I believe that the time has come for clear federal regulations to prohibit spam in the same way as unrequested faxes are prohibited. State and provincial laws can’t really regulate what is a national and international problem. Federal regulations won’t stop spam entirely, but they’ll certainly put a dent in it. Yes, some spammers will move offshore. But thousands of small spammers who are willing to spam now because it’s cheap and legal will no longer spam because it is illegal, and the risks are too great. I encourage you to advocate with your legislator for federal anti-spam regulations in your country. Perhaps we can recover for legitimate business use a communications medium that was once called the “killer app.” I hope so.

additional info: http://www.list-unsubscribe.com/