Pharo-Chrome

Hi Torsten and All,

Quick Introduction for those not familiar with Pharo-Chrome:

Pharo-Chrome enables Pharo to control and query Chrome / Chromium, in
particular to retrieve the DOM of a page.  This is useful as many modern
pages are just a template which then loads some JavaScript to
asynchronously build the DOM, meaning that the ZnEasy / Soup combination
doesn’t get the bulk of the information on a page.

Pharo-Chrome is now mostly working, i.e. it is possible to open
a connection to Chrome, navigate to a requested URL, wait for it to
load, retrieve the DOM and then navigate the DOM using a subset of the
Soup API, e.g. #findAllStrings:, #findAllTags:, #attributeAt:, etc.

GoogleChrome class>>exampleNavigation has been updated to retrieve the
DOM from http://pharo.org.

GoogleChrome class>>get: is analogous to ZnEasy class>>get:, although it
returns a ChromeNode, not an HTML string.
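
As a rough, untested sketch of how #get: and the Soup-style queries might
fit together: it uses only the selectors named in this post, and it
assumes that #findAllTags: and #attributeAt: take strings as they do in
Soup, and that Pharo-Chrome can reach a Chrome / Chromium instance.

```smalltalk
"Sketch only. Assumptions: Chrome / Chromium is reachable by Pharo-Chrome,
 and #findAllTags: / #attributeAt: accept strings as in Soup."
| root links |

"Load the page and get the root ChromeNode (not an HTML string)."
root := GoogleChrome get: 'http://pharo.org'.

"Collect every <a> node found under the root."
links := root findAllTags: 'a'.

"Show each link target; printString also copes with a missing href (nil)."
links do: [ :each |
	Transcript show: (each attributeAt: 'href') printString; cr ]
```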

I wasn’t able to get rid of the delay while waiting for the page to
finish loading.  This actually makes sense, since, as mentioned above,
many modern pages build the DOM asynchronously, so there’s no clear
indication of when it is complete.  The default delay is currently 2000
milliseconds, which is about twice the maximum I saw needed (983ms), but
this can be changed (ChromeTabPage>>pageLoadDelay:).

I had three use cases for this library: one which works with
ZnEasy+Soup, one that used to work with ZnEasy+Soup but no longer does
after a page redesign, and one which I hadn’t got working before.  All
three are working now.

Unlike Soup, I’ve currently modelled the nodes as a single class, and
have only implemented a subset of Soup’s methods, but it is enough for
what I need.

I’ve introduced a dependency on the Beacon logging framework.  I find it
useful, but can remove it if you don’t want the additional dependency.
(I’m planning to add some GoogleChrome specific logging classes and use
those to better understand what pageLoadDelay should be).

I was focussed on trying to understand the events that Chrome generates,
so documentation is still lacking (read “missing” :-)).

I’ll generate a pull request after some more testing, tweaking and
documenting, but if you would like to take a look, the code is available
at:

https://github.com/akgrant43/Pharo-Chrome/tree/development

I haven’t yet updated BaselineOfChrome with the Beacon dependency.  I
did merge in your two commits from May 23.

If you, or anyone else, finds this useful, I welcome any feedback.

P.S.  I’ve just realised that I need to tidy up #sendMessage:,
#sendMessageDictionary and #sendMessageDictionary:wait:.  I’ll do that
as part of the general tidy up.

Cheers,
Alistair
