Planet engines are applications that aggregate RSS/Atom feeds and generate composite feeds as well as a website. The generated feeds typically include RSS, Atom, FOAF and OPML. Two popular open source planet engines are Planet and Plagger. I've used both to create planet-style websites and here are my observations:
The first problem I ran into was with re.match, which definitely does not follow the Python philosophy of "There should be one -- and preferably only one -- obvious way to do it": there is more than one way to match a pattern, and re.match is not the obvious one because it works differently than matching does in other languages. Python's re.match only matches at the beginning of the string, unlike the default matching in other regex tools (grep|sed|awk|perl|ruby), which find the pattern anywhere in the input. This results in the following behavior:
import re

# matches - pattern is at start of string
m = re.match('b', 'bc')

# does not match - pattern is not at start of string
m = re.match('b', 'abc')

# matches - add the .* to match any leading characters
# but .* isn't needed after '.*b' to match 'c'
m = re.match('.*b', 'abc')

# matches - second .* matches 'c' but isn't needed
m = re.match('.*b.*', 'abc')
I was sticking with re.match because I needed the MatchObject to capture groups, but it didn't do what I wanted (DWIW). After posting to a forum, I learned that I should use re.search instead, which matches anywhere in the string. re.search also returns a MatchObject, but I didn't expect this at first because search != match. I get the impression re.match and MatchObject were created first and re.search was added later. Perhaps the MatchObject should be called something generic, like ResultObject, in the docs. I'm curious: when is re.match preferable to re.search? It seems like re.match is useful only in a limited number of situations, and even then it's easy to get the same behavior from re.search by anchoring the pattern at the start of the string.
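As a minimal sketch of that last point: re.search returns the same kind of match object with the captured groups, and anchoring the pattern with \A makes it behave like re.match. The helper name match_via_search is hypothetical, made up for this illustration, and the wrapping assumes the pattern carries no leading inline flags such as (?i).

```python
import re

# re.search finds the pattern anywhere and still exposes captured groups.
m = re.search(r'(b)(c)', 'abc')
print(m.groups())   # the two captured groups
print(m.start())    # offset where the match begins


def match_via_search(pattern, text):
    """Hypothetical helper: emulate re.match using re.search.

    \\A anchors at the very start of the string, which is what
    re.match does implicitly. The non-capturing group keeps any
    alternation inside the pattern under the anchor.
    """
    return re.search(r'\A(?:' + pattern + r')', text)


# Same results as re.match for the examples above.
print(match_via_search('b', 'bc') is not None)
print(match_via_search('b', 'abc') is None)
```

One case where re.match still reads more clearly is validating a prefix, for example checking that a line begins with a given token, since the anchoring is then part of the function name rather than buried in the pattern.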
The open issue for me is Python's dependency on indenting. Lately I've been using tabs exclusively for indentation because I like to see the tickmarks many editors show for tabs. This lets me quickly know what level I'm at. This python-list thread was titled Tab Wars and argued in favor of spaces over tabs. Planet is indented with spaces and that's what I used for my hacking. Given my preference for tabs, the more time I spend with Python the more likely I'll switch to tabs. Do I need to convert the entire script to tabs or can I mix tabs with spaces? What do people with a preference for tabs do with Python code that uses spaces?
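On the mixing question, here is a small sketch of a check, assuming a Python 3 interpreter (older Python 2 releases were more permissive and treated a tab as advancing to the next multiple of eight columns). Python 3 raises TabError when a block's indentation mixes tabs and spaces in a way that depends on how wide a tab is. The function name mixes_inconsistently and the sample strings are made up for this illustration.

```python
def mixes_inconsistently(source):
    """Return True if Python 3 rejects the source with a TabError."""
    try:
        # compile() only parses the code; nothing is executed.
        compile(source, '<example>', 'exec')
    except TabError:
        return True
    return False


# Both lines of the block are indented with a single tab.
tabs_only = "if True:\n\tx = 1\n\ty = 2\n"

# Same block, but the second line uses eight spaces instead of a tab.
mixed = "if True:\n\tx = 1\n        y = 2\n"

print(mixes_inconsistently(tabs_only))
print(mixes_inconsistently(mixed))
```

The practical upshot under that assumption is that a file should use one style throughout, so converting the whole script is the safer route than mixing.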
I've been thinking about releasing my enhanced version of PlanetPlanet, which allows one planet site to run multiple planets, but now I want to make more enhancements first so it's even more user friendly. Let me know how Planet MVC is working for you or if I can add any additional blogs. I've already found it pretty useful for finding out about things I otherwise wouldn't have.
Here are some notes on getting Planet running.