Since I'm already running my feeds through Plagger, I figured I might as well try this.
The YAML looks like this.
plugins:
  - module: Subscription::LivedoorReader
    config:
      username: USERNAME
      password: PASSWORD
  - module: Filter::StripRSSAd
  - module: Filter::EntryFullText::SiteInfo
    config:
      impersonate: 0
      force_upgrade: 1
  - module: Store::Fastladder
    config:
      sync_rate: 1
      connect_info:
        - dbi:mysql:fastladder_production
        - root
        - on_connect_do:
            - SET NAMES utf8
      member_id: 1
Run Plagger.
$ plagger -c Sites/plagger/fastladder-crawler.yaml
Plagger [info] plugin Plagger::Plugin::Subscription::LivedoorReader loaded.
Plagger [info] plugin Plagger::Plugin::Filter::StripRSSAd loaded.
Plagger [info] plugin Plagger::Plugin::Filter::BloglinesContentNormalize loaded.
Can't locate Web/Scraper.pm in @INC (@INC contains: /opt/local/bin/lib /Users/Madhat/Sites/plagger/plagger/lib /opt/local/lib/perl5/5.8.8/darwin-2level /opt/local/lib/perl5/5.8.8 /opt/local/lib/perl5/site_perl/5.8.8/darwin-2level /opt/local/lib/perl5/site_perl/5.8.8 /opt/local/lib/perl5/site_perl /opt/local/lib/perl5/vendor_perl/5.8.8/darwin-2level /opt/local/lib/perl5/vendor_perl/5.8.8 /opt/local/lib/perl5/vendor_perl .) at /Users/Madhat/Sites/plagger/plagger/lib/Plagger/Plugin/Filter/EntryFullText/SiteInfo.pm line 9.
BEGIN failed--compilation aborted at /Users/Madhat/Sites/plagger/plagger/lib/Plagger/Plugin/Filter/EntryFullText/SiteInfo.pm line 9.
Compilation failed in require at /Users/Madhat/Sites/plagger/plagger/lib/Plagger.pm line 234.
It complained, so I installed Web::Scraper.
$ sudo cpan -i Web::Scraper
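As a side note, the module to install can be read straight off the "Can't locate Web/Scraper.pm in @INC" line, and it is easy to check whether the module loads before rerunning Plagger. Below is a minimal sketch, assuming a POSIX shell with sed and perl on the PATH. The function names pm_path_to_module and has_perl_module are made up for illustration and are not part of Plagger or cpan.

```shell
# Hypothetical helper: turn the file path from a "Can't locate X/Y.pm" error
# into the module name that cpan expects (Web/Scraper.pm becomes Web::Scraper).
pm_path_to_module() {
  printf '%s\n' "$1" | sed -e 's|\.pm$||' -e 's|/|::|g'
}

# Hypothetical helper: succeed only if perl can load the named module.
# Uses perl's -M switch, which loads the module before running the one-liner.
has_perl_module() {
  perl -M"$1" -e 1 >/dev/null 2>&1
}

mod=$(pm_path_to_module "Web/Scraper.pm")
if has_perl_module "$mod"; then
  echo "$mod is installed"
else
  echo "$mod is missing"
fi
```

If the check reports the module as missing even after installing, the install probably went to a different perl than the one plagger runs under, which is worth ruling out when several perls are installed.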
Run it again.
$ plagger -c Sites/plagger/fastladder-crawler.yaml
Plagger [info] plugin Plagger::Plugin::Subscription::LivedoorReader loaded.
Plagger [info] plugin Plagger::Plugin::Filter::StripRSSAd loaded.
Plagger [info] plugin Plagger::Plugin::Filter::BloglinesContentNormalize loaded.
Plagger [info] plugin Plagger::Plugin::Filter::EntryFullText::SiteInfo loaded.
Plagger::Plugin::Filter::EntryFullText::SiteInfo [debug] siteinfo: ^http://b\.hatena\.ne\.jp/entry/ id("entry-info")/div[@class="section"][1]|id("bookmarked_user")
Plagger::Plugin::Filter::EntryFullText::SiteInfo [debug] siteinfo: ^http://(feeds\.)?japan\.cnet\.com //div[contains(@class,"leaf_body")]
Plagger::Plugin::Filter::EntryFullText::SiteInfo [debug] siteinfo: ^http://www\.excite\.co\.jp/News/bit //div[@class="lh140"]
...
The siteinfo entries scroll by one after another as they load. Looks like it worked.