Installing Plagger
From Dev411: The Code Wiki
Blog Tags: Plagger (http://www.dev411.com/blog/tag/plagger)
Plagger (http://plagger.org) is a Perl-based pluggable framework for RSS/Atom feed aggregation. Popular uses include building planet-style feed aggregation websites, as Planet (http://www.planetplanet.org/) does, and sending aggregated feeds to your email account, as Newspipe (http://newspipe.sourceforge.net/) does. Planet and Newspipe are both written in Python and have more limited capabilities. Plagger's big advantage over Planet is its plugin architecture, which makes adding features a breeze. Planet has a more monolithic architecture (web pages, XML feeds, etc. are all generated in the 953-line __init__.py file), which makes it harder to extend but easier to get up and running (you just untar it into place). The wealth of plugins and the ability to add more make Plagger seem like a much more capable product. I don't have any experience with Newspipe yet, so I can't say anything about it except that its capabilities are limited to converting RSS feeds to email.
This article covers installing Plagger 0.7.5 on *nix systems, including CentOS 4.2 and Dreamhost (Debian), and setting up a standard "planet" website that provides aggregated feeds via a web page as well as Atom/RSS feeds. While this is what the Python-based Planet software does, Plagger does much more, including scanning a large variety of blog search engines. This article covers only setting up a basic planet website.
Installation
System-wide installation
If you are a superuser on your system, installing Plagger is straightforward, though it takes a while because of the long list of CPAN dependencies. If CPAN is already configured, just type the following:
$ perl -MCPAN -e shell
cpan> install Plagger
This will install the Plagger modules in your Perl lib directory and install the plagger script:
$ which plagger
/usr/bin/plagger
Local installation
If you are on a shared system and do not have permissions to install into the system directories, installation is a bit more involved. First you'll need to configure CPAN to use your local directory and then use CPAN. This information is from the 2005 Catalyst Advent Calendar (http://www.catalystframework.org/calendar/2005/10). These instructions assume you will install the files under the ~/local directory which you should create.
Configure CPAN
If you haven't configured CPAN already, do so by typing:
$ perl -MCPAN -e shell
Answer the configuration questions and then quit without installing anything.
Configure local directories
Edit the ~/.bashrc file and add the following:
export PATH=$HOME/local/bin:$HOME/local/script:$PATH
perlversion=`perl -v | grep 'built for' | awk '{print $4}' | sed -e 's/v//;'`
export PERL5LIB=$HOME/local/share/perl/$perlversion:$HOME/local/lib/perl/$perlversion:$HOME/local/lib:$PERL5LIB
Reload .bashrc with the following to update your environment:
$ source ~/.bashrc
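The perlversion pipeline above depends on the wording of the first line of perl -v. The sketch below feeds it two sample lines to show what it extracts; the function name parse_version and both sample strings are illustrative, not taken from the original instructions. It then shows one alternative that asks perl for its version directly, which may be a safer choice if the pipeline does not produce a plain version number on your system.

```shell
# parse_version: the same grep/awk/sed pipeline as in ~/.bashrc,
# reading a "perl -v" line from standard input (name is illustrative).
parse_version() {
    grep 'built for' | awk '{print $4}' | sed -e 's/v//'
}

# Sample line in the style printed by a Perl 5.8-era build.
old_line='This is perl, v5.8.4 built for i386-linux-thread-multi'
# Sample line in the style printed by more recent Perl releases.
new_line='This is perl 5, version 30, subversion 0 (v5.30.0) built for x86_64-linux-gnu-thread-multi'

old_parsed=$(printf '%s\n' "$old_line" | parse_version)
new_parsed=$(printf '%s\n' "$new_line" | parse_version)
echo "old format gives: $old_parsed"   # 5.8.4
echo "new format gives: $new_parsed"   # 5, (the fourth word is no longer the version)

# Alternative: let perl print its own version, independent of the banner text.
if command -v perl >/dev/null 2>&1; then
    perlversion=$(perl -e 'printf "%vd", $^V')
    echo "reported by perl: $perlversion"
fi
```

If the second form suits your system, the perlversion line in ~/.bashrc could use it in place of the grep/awk/sed pipeline.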
Edit ~/.cpan/CPAN/MyConfig.pm
Look for the make_install_arg and makepl_arg configuration variables and set them to the following:
'make_install_arg' => qq[SITEPREFIX=$ENV{HOME}/local],
'makepl_arg' => qq[INSTALLDIRS=site install_base=$ENV{HOME}/local],
On Dreamhost, with their 5.8.4 /usr/local/bin/perl, $ENV{HOME} and install_base wouldn't work for me so I ended up hardcoding the local directory like the following where username is your *nix username:
'make_install_arg' => qq[SITEPREFIX=/home/username/local],
'makepl_arg' => qq[INSTALLDIRS=site PREFIX=/home/username/local],
Install via CPAN shell
I then did a normal CPAN installation via the shell. I ended up installing a lot of the dependencies separately on Dreamhost. I also had one test fail for Template Toolkit on Dreamhost but force installed it and it seems to be working fine.
$ perl -MCPAN -e shell
cpan> install Plagger
This will install the Perl modules under /home/username/local/share/perl and the plagger script under /home/username/local/bin:
$ which plagger
/home/username/local/bin/plagger
Install via tarball
If you want to install from the tarball on Dreamhost instead, first set the environment variables as described above, and make sure to also set PREFIX:
$ perl Makefile.PL PREFIX=/home/<username>
$ make
$ make test
$ make install
Configuration
Installing Plagger Assets
To use the Planet website capabilities, as well as some of the other features, you'll need to install the assets directory that comes with Plagger. The CPAN shell unpacks it under your ~/.cpan/build directory, and you can move it into place with the following command, which also renames the directory to plagger-assets:
$ mv ~/.cpan/build/Plagger-0.7.5/assets ~/local/plagger-assets
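The move fails in confusing ways if the build directory has been cleaned up or if plagger-assets already exists (mv would then place assets inside it). A minimal sketch of a guarded version follows; the function name install_assets is hypothetical, and it does nothing beyond the mv above plus two checks.

```shell
# install_assets SRC DEST: move the unpacked assets directory into place.
# Refuses to run if SRC is missing or DEST already exists.
# (Function name is illustrative, not part of Plagger.)
install_assets() {
    src=$1
    dest=$2
    if [ ! -d "$src" ]; then
        echo "assets not found at $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "$dest already exists, leaving it alone" >&2
        return 1
    fi
    # Create the parent directory (e.g. ~/local) if needed, then move.
    mkdir -p "$(dirname "$dest")" && mv "$src" "$dest"
}
```

With the paths used in this article it would be called as: install_assets ~/.cpan/build/Plagger-0.7.5/assets ~/local/plagger-assets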
Creating config.yaml
Plagger runs with a YAML configuration file. This file can be specified on the command line when plagger is run with the -c or --config parameter as follows:
$ plagger -c /path/to/myplanet.yaml
If no file is specified Plagger will look for a file named config.yaml in the same directory as the plagger script (typically /usr/bin or /home/username/local/bin).
For some example config.yaml files take a look at the examples directory which is available locally in ~/.cpan/build/Plagger-0.7.5/examples or online at http://plagger.org/trac/browser/trunk/plagger/examples. The following is based on the planet.yaml example (http://plagger.org/trac/browser/trunk/plagger/examples/planet.yaml). The Bundle::Planet plugin automatically generates RSS 2.0, Atom 1.0, FOAF and OPML XML files in addition to the X/HTML file.
global:
  assets_path: /home/username/local/plagger-assets
  log:
    level: debug
plugins:
  # The debug module
  #- module: Publish::Debug
  # Subscribe to a couple of "My Feeds" on the web
  # They could be either Feeds URL or Blog URL (with Auto-Discovery support)
  # Either of the following configuration styles can be used
  # (a) url only or (b) url and title
  - module: Subscription::Config
    config:
      feed:
        - http://feeds.feedburner.com/Dev411Blog
        - http://bulknews.typepad.com/blog/index.rdf
  - module: Bundle::Planet
    config:
      duration: 7 days
      title: Planet Catalyst
      # You can just set dir to your public_html directory, e.g. on Dreamhost:
      # dir: /home/username/your.domain.com
      dir: /tmp/planet
      url: http://planet.catalystframework.org/
      # Plagger comes with the 'default' and 'sixapart-std' themes
      theme: sixapart-std
      stylesheet: http://bulknews.typepad.com/blog/styles.css
      # If you only want articles with certain tags, add:
      #extra_rule:
      #  expression: grep $args->{entry}->has_tag($_), qw/catalyst dbic/
Running Plagger
Command Line
Running plagger is as simple as executing the plagger script.
$ which plagger
/home/username/local/bin/plagger
$ plagger
or
$ plagger -c /path/to/myplanet.yaml
Cron
To run plagger periodically use cron. The following will run plagger twice an hour looking for a config.yaml in the same directory as the plagger script:
0,30 * * * * plagger
The following uses a config file of your specification:
0,30 * * * * plagger -c /path/to/myplanet.yaml
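With a local installation, a bare plagger in the crontab may not be found, because cron starts jobs with a minimal environment and does not read ~/.bashrc. One way around this is a small wrapper script that recreates the settings from the "Configure local directories" step. The sketch below is an assumption-laden example: the script name run-plagger.sh and the default config path are hypothetical, and it asks perl for its version directly rather than parsing perl -v.

```shell
#!/bin/sh
# Sketch of a cron wrapper. The file name run-plagger.sh and the default
# config path below are assumptions, not part of Plagger.

# Recreate the ~/.bashrc settings, since cron does not source that file.
setup_plagger_env() {
    PATH="$HOME/local/bin:$HOME/local/script:$PATH"
    # Empty if perl is not on the PATH; the version-specific entries are then unused.
    perlversion=$(perl -e 'printf "%vd", $^V' 2>/dev/null || true)
    PERL5LIB="$HOME/local/share/perl/$perlversion:$HOME/local/lib/perl/$perlversion:$HOME/local/lib${PERL5LIB:+:$PERL5LIB}"
    export PATH PERL5LIB
}

setup_plagger_env

# First argument is the config file; fall back to a hypothetical default.
config="${1:-$HOME/myplanet.yaml}"

# Run only when both the plagger script and the config file are present.
if command -v plagger >/dev/null 2>&1 && [ -f "$config" ]; then
    plagger -c "$config"
fi
```

Saved as, say, /home/username/local/bin/run-plagger.sh and made executable, the crontab entry would then become: 0,30 * * * * /home/username/local/bin/run-plagger.sh /path/to/myplanet.yaml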
Customization
Theming
Plagger's assets directory comes with two themes, default and sixapart-std, in the following directory:
~/local/plagger-assets/plugins/Publish-Planet
Each theme has two subdirectories. For the default theme they are:
~/local/plagger-assets/plugins/Publish-Planet/default/template
~/local/plagger-assets/plugins/Publish-Planet/default/static
index.tt in the template subdirectory is used to create the Planet HTML files. Files and subdirectories under the static subdirectory are copied into the public_html directory specified by dir: in config.yaml.
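Rather than editing a bundled theme in place, you may prefer to work on a copy. The sketch below copies the default theme under a new name; the function name clone_theme and the theme name mytheme are hypothetical. It assumes, without my having verified it, that Plagger will pick up the copy when theme: in config.yaml is set to the new directory name, in the same way it finds default and sixapart-std.

```shell
# clone_theme ASSETS_DIR NEW_NAME: copy the default theme to a new
# directory next to it. (Function name is illustrative.)
clone_theme() {
    themes="$1/plugins/Publish-Planet"
    new="$themes/$2"
    if [ ! -d "$themes/default" ]; then
        echo "no default theme under $themes" >&2
        return 1
    fi
    if [ -e "$new" ]; then
        echo "$new already exists" >&2
        return 1
    fi
    # Copies both the template and static subdirectories.
    cp -R "$themes/default" "$new"
}
```

For example, clone_theme ~/local/plagger-assets mytheme would create ~/local/plagger-assets/plugins/Publish-Planet/mytheme, whose template/index.tt can then be edited freely.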
Additional Resources
- Planet Engines: Plagger and Planet (http://www.dev411.com/blog/2006/07/30/planet-engines-plagger-and-planet)
- IRC: #plagger on FreeNode
