
Automated web screen shots with Perl

Posted Fri, 21 Jul 2006 15:02:00 GMT

I've been looking for a program that takes full-page screen shots of web pages, even when the page is larger than the window on my physical screen and requires scrolling. This morning I found one: Petr Šmejkal's Win32::CaptureIE, which Displeaser mentioned in the "Screenshot of webpage" thread on DevShed Forums. It uses ImageMagick for image manipulation.

From reading the Win32::CaptureIE POD, the CapturePage function does exactly what I want:

CapturePage ( )

    Captures whole page currently loaded in the Internet Explorer window. Only the page content will be captured - no window, no scrollbars. If the page is smaller than the window only the occupied part of the window will be captured. If the page is longer (scrollbars are active) the function will capture the whole page step by step by scrolling the window content (in all directions) and will return a complete image of the page.

After installing ImageMagick, Image::Magick and Win32::CaptureIE on my Windows / ActiveState Perl system, I generated this screen shot with the following short program using no additional processing:

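#!perl
use strict; use warnings;
use Win32::CaptureIE;

# Start Internet Explorer with a 900-pixel-wide window and load the page
StartIE( width => 900 );
Navigate( 'http://www.dev411.com/blog/' );

# CapturePage returns an Image::Magick object covering the whole page
my $img = CapturePage();
$img->Write( 'capture.png' );
QuitIE;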
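The same calls extend naturally to several pages. The following is only a sketch, assuming that one IE session can be reused across Navigate calls and that Write returns an error message on failure, as Image::Magick methods generally do. The helpers url_to_filename and capture_all are names I made up for illustration and are not part of Win32::CaptureIE.

```perl
#!perl
use strict; use warnings;

# Hypothetical helper: derive a file name from a URL.
sub url_to_filename {
    my ($url) = @_;
    ( my $name = $url ) =~ s{^\w+://}{};   # drop the scheme (http://, https://)
    $name =~ s/[^A-Za-z0-9.-]+/_/g;        # collapse other characters to "_"
    $name =~ s/^_+|_+$//g;                 # trim stray underscores at the ends
    return "$name.png";
}

# Hypothetical helper: capture each URL in a single IE session.
sub capture_all {
    my @urls = @_;
    require Win32::CaptureIE;              # loaded here, so it is only needed on Windows
    Win32::CaptureIE::StartIE( width => 900 );
    for my $url (@urls) {
        Win32::CaptureIE::Navigate($url);
        my $img  = Win32::CaptureIE::CapturePage();
        my $file = url_to_filename($url);
        my $err  = $img->Write($file);     # assumed: a message on failure, empty on success
        warn "Could not write $file: $err\n" if "$err";
    }
    Win32::CaptureIE::QuitIE();
}

# Capture every URL given on the command line.
capture_all(@ARGV) if @ARGV;
```

Saved under a name such as capture_all.pl (also hypothetical) and run with the blog URL above as its argument, this should write the capture to www.dev411.com_blog.png.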

Perl and CPAN continue to amaze me with their treasure trove of functionality. Are there similar tools for Firefox, Linux, or other image libraries and languages?

UPDATE: ishnid has found two programs with CLIs (posted to the same thread):

  • khtml2png on SourceForge. This is a command-line program that looks like it can be run without a browser. It uses libkhtml (used by Konqueror) and ImageMagick's convert.
  • Pearl Crescent Page Saver, a commercial app but available in a free version. This is a Firefox extension and requires the browser.

UPDATE 2: I recently tried Win32::CaptureIE with ImageMagick 6.3.0 and it doesn't work. Apparently older versions of ImageMagick included a link to "PerlMagick" that may not exist anymore. Unfortunately, Win32::CaptureIE relies on PerlMagick.
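A quick way to tell whether PerlMagick is the missing piece is to check that the Image::Magick module still loads. This is a minimal sketch: module_version is a hypothetical helper of mine, and it only reports whether each module can be loaded, not why a load failed.

```perl
#!perl
use strict; use warnings;

# Hypothetical helper: return the module's version if it loads, undef otherwise.
sub module_version {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;   # Image::Magick -> Image/Magick.pm
    return eval { require $file; 1 }
        ? ( $module->VERSION || 'unknown' )
        : undef;
}

# Report on the two modules the capture script depends on.
for my $module (qw(Image::Magick Win32::CaptureIE)) {
    my $version = module_version($module);
    print defined $version
        ? "$module $version is installed\n"
        : "$module could not be loaded\n";
}
```

If Image::Magick is reported as not loadable, the capture script cannot work, whichever ImageMagick binaries are installed.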

UPDATE 3: I just tried the free version of Pearl Crescent with Firefox 1.5.0.7, which it says it supports, but I get a "Download error" with pageserverbasic-1.3.xpi.


4 comments

Comments

  1. Chisel Wright said about 4 hours later:

    I’ve looked around for this before without much success.

    I’m amazed that someone hasn’t found a way to hook into the Gecko engine, and instead of rendering to Browser, rendered to a PNG/JPG file.

    I wouldn’t know where to start looking, but it seems that all the hard work has been done already, and some clever b**tard just needs to redirect the output.

    I’m not keen on solutions that require you to be running in/under X/Windows to render the page.

    One day …

  2. John Wang said about 6 hours later:

    Great idea. I’d be excited about a solution that could run without a windowing system as well. Definitely something to keep an eye out for.

  3. Demoric said 7 months later:

    Version 1.4 of Pearl Crescent has been released and works well with Firefox 2.0.0.2.

  4. bswrjj@yahoo.co.uk said over 2 years later:

    Is there a utility (or combination of utilities) I can use to automate reading a series of web pages and copying their text? Would the ‘Mechanize’ utility help? Thank you, BJ.
