Wednesday, December 30, 2009

End-of-Year Miscellanea

I haven't posted too much lately. Here and there I have been working on a few interesting projects; I continue to add little bits of functionality to Antichrist Watch and have added dojox.gfx vector graphics to a handful of sites (including, among others, NEARTA, Keep Saugus A Town, and Shining Stars Learning Center). Using dojox.gfx vector graphics is really interesting; it enables the sizing of elements within a site to be completely based on em size and/or window size without a significant increase in load times or loss of graphical clarity. My svg2gfx.xslt utility was recently promoted to Dojo proper (in time for the 1.4 release), too, and this should hopefully make creating dojox.gfx graphics a bit simpler for some. I'll try to write something up describing the general process of making a vector image logo for a site in the relatively near future. I've also been messing around with getting egg-based builds of Zope 2 and Plone working under Repoze with varying degrees of success. I don't have a solid, simple enough solution for either of these to post much of anything yet.

I've also spent a not-insignificant amount of time lately messing around with e-mail. I've been using Apple's Mail.app for a while now, and whenever I've needed encryption I've simply used its built-in S/MIME capabilities. With a free Thawte client certificate, it was extremely easy to set up and just worked everywhere. As you're probably aware, Thawte recently dropped its free certificate service. This meant that I had to change my ways. The simplest path was to start using the free CAcert service for client certificates and instruct those with whom I'd already been communicating via S/MIME to pull in the appropriate root certificates. I did this fairly painlessly (although I still have to hunt around for some locals willing to vouch for my ID; if you're in the North-of-Boston area and willing to do so, please drop me a line). However, as I'll be working more closely with the SitePen folks in the near future and they all use PGP, I wanted to get it working as well. Sadly, Mail.app does not have the nice built-in support for PGP that it does for S/MIME; to make matters worse, the GPGMail plug-in that gives Mail.app PGP capabilities isn't quite ready yet for the current version of Mail.app. There are some unofficial releases available through the forums, and I got the newest of these to work. Thus I now (finally) have both S/MIME and PGP signing & encryption working for my e-mail. (Although I also have to hunt around for locals to sign my PGP key; if you're in the North-of-Boston area and willing to do so, please drop me a line.)

Sunday, October 04, 2009

Plone, Repoze, and the Dojo Toolkit

I was hoping it would be made public a couple weeks ago in September, but it just went live this weekend. It's the first of a series of sites Saugus.net has been working on to utilize cross-platform vector graphics via the Dojo Toolkit. To make it more interesting, most of these sites also use the Plone CMS.

Plone and Dojo don't coexist so easily. There is an old product called ZPDojo (with a slightly newer fork) that managed to pull it off a few years ago, but it was using an old version of Dojo. I decided to take a different approach and instead used Repoze and Deliverance to layer the Dojo portion in a logically separate portion from the Plone portion. By doing so it became fairly straightforward to use Dojo's excellent build system to build the requisite custom JavaScript package to insert into the Deliverance portion while continuing to use Plone's own excellent merge system to operate on Plone's requisite JavaScript (and even preserve KSS capabilities).

All is not perfect. KSS relies upon jQuery. Theoretically this is somewhat wasteful: by pulling in Dojo's GFX package one is already getting pretty much all the capabilities that jQuery offers, so loading both results in a slightly slower-than-necessary initial load time. Practically, it breaks unless one removes jQuery from Plone's automerge system and pulls it in manually. The next version of Dojo has a jQuery compatibility package planned, so long term the theoretical solution is probably one of removing jQuery entirely and relying on the compatibility package (perhaps even gaining a bit of speed), but it's still too early to say for certain. In the meantime, separating out jQuery from the rest of the Plone JavaScript source files and linking it directly solves the practical problem.

The site in question is the Shining Stars Learning Center. They wanted lots of stars on a light blue background, and they got them in spades. A single star drawing routine is looped through a number of times based upon the client's screen size, filling the current browser window (actually the maximum possible browser window should the window be resized) with stars of random size, rotation, and position. To add a bit of whimsy, all the stars can be independently dragged around the window.

The Shining Stars logo in the upper left hand corner is also drawn via vector graphics, although it can't be moved. I disallowed stars being drawn behind it in the initial star populating phase, but they can be dragged behind it after the window has been drawn. It was converted from SVG to Dojo GFX JSON using my fork of svg2gfx.xsl (which will soon hopefully go into the main trunk).
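
The star-populating idea described above can be sketched in a language-neutral way. The real site does this in JavaScript with dojox.gfx; the Python below is only an illustration of the geometry, and every name in it (`star_count`, `star_points`, `populate`, the density figure, and the size range) is a hypothetical choice rather than anything taken from the site's code.

```python
import math
import random

def star_count(width, height, area_per_star=40000):
    """Scale the number of stars with the window area (density is an assumption)."""
    return max(1, (width * height) // area_per_star)

def star_points(cx, cy, outer, inner, rotation, tips=5):
    """Return the 2*tips vertices of a star centred on (cx, cy)."""
    points = []
    for i in range(2 * tips):
        # Even vertices are the tips, odd vertices are the inner notches.
        radius = outer if i % 2 == 0 else inner
        angle = rotation + i * math.pi / tips
        points.append((cx + radius * math.cos(angle),
                       cy + radius * math.sin(angle)))
    return points

def overlaps(cx, cy, radius, rect):
    """True if the star's bounding box intersects rect = (left, top, right, bottom)."""
    left, top, right, bottom = rect
    return (cx + radius > left and cx - radius < right and
            cy + radius > top and cy - radius < bottom)

def populate(width, height, logo_rect, count, rng=random):
    """Pick `count` random stars, rejecting any that would touch the logo.

    Assumes the logo leaves plenty of free space; otherwise this loops for long.
    """
    stars = []
    while len(stars) < count:
        outer = rng.uniform(10, 40)
        cx = rng.uniform(0, width)
        cy = rng.uniform(0, height)
        if overlaps(cx, cy, outer, logo_rect):
            continue
        rotation = rng.uniform(0, 2 * math.pi)
        stars.append(star_points(cx, cy, outer, outer * 0.4, rotation))
    return stars
```

Each list of points would then be handed to whatever drawing layer is in use as a closed polygon. Rejecting on the bounding box rather than the exact outline is slightly conservative, which seems a reasonable trade for simplicity.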

The whole site, including the logo and stars, is built around em size, so resizing the text causes the logo and stars to resize accordingly, too, and since they're defined using vectors they never get fuzzy.

Actually, for the fun of it I tried copying a few of the resulting vector images into bitmap images (one of which is used for the Shining Stars Twitter background). Besides being noticeably fuzzier, the image ended up being somewhat larger than the JavaScript used to define it, too. Obviously this won't always be the case depending upon specifics (and it certainly wasn't my driving motivation; I wanted a clean appearance with nice resizing) but I found it to be an interesting perk.

When I say it's a cross-platform solution I really mean it. Not only does it work on the usual suspects (Firefox, Safari, Opera, Chrome, MSIE, etc.) but it works on the Wii Internet Channel as well. Shining Stars has a Wii in their boardroom (how cool is that?) that they can use for browsing the Web (among other things) with the Wiimote as a pointer. I thus did as much testing of the site on the Wii as I did with the other more common browsers, and it works surprisingly well. It's a bit slow rendering the vector graphics, but not disastrously so, and the star dragging even works. The only problem I had with it is that when one switches into its so-called "Single Column Mode" almost nothing gets displayed at all. This could be simply alleviated by selectively turning off some JavaScript and CSS when the mode is enabled, but I've yet to find a way to reliably detect it. It's not a huge worry though as the site does work fine in the Wii's default display mode.

Microsoft Internet Explorer (as usual) provided its own set of problems. While on most browsers (including the Wii Internet Channel) the Dojo GFX stuff gets rendered as SVG, MSIE instead uses VML. Unfortunately Microsoft's recent crusade to catch up with other browsers' support of standards has not included standard vector graphics in the form of SVG, and to make matters worse they've allowed their own VML to rot with each new version of MSIE seeming to introduce new, unaddressed VML problems. In particular, some of the vector graphics the site uses cause VML issues in MSIE8, and the only way I was able to fix it was to use the Microsoft-specific X-UA-Compatible to explicitly request that MSIE8 render the site as if it were really MSIE7. Since I was forced to mess with this anyway, I took it one step further and added Chrome Frame support with the line:

<meta http-equiv="X-UA-Compatible" content="chrome=1;IE=7">

and dropped all MSIE-specific hacks when it is engaged.

The site now uses Chrome Frame (if present) in preference to any of MSIE's native rendering modes, and uses MSIE7 rendering mode if in a newer version (that lacks Chrome Frame). Chrome Frame is visibly faster than all of the MSIE native rendering modes for this site, so much so that we added a notice (for the moment, anyway; we may remove it) to let regular users of the site know about it in order to improve their browsing experience. I'm assuming that the speed difference is mostly due to a combination of Chrome's faster JavaScript implementation and the fact that no extra JavaScript hacks are required to make up for CSS deficiencies, but regardless of its source it is significant.

Please note that the site is still in a beta mode of sorts and thus may not only still have bugs but will also be in fairly frequent flux. If things don't look as you'd expect from the above description, please try again in a few minutes; you probably just caught it during an update.

Friday, August 28, 2009

Maintaining Data Integrity

I promised several posts ago that I'd discuss some of the interesting internals of Antichrist Watch in more detail.

I've since gotten a bit off track, posting three entries about some of the Repoze stuff I've been doing lately. As I've been getting deeper and deeper into other projects I realize I'd better catch up and make good on my original promise or else I may never get back to it.

Today I figured I'd cover the principles behind Antichrist Watch's back-end database connection. It's using PostgreSQL and making heavy use of some of the features that many other database servers lack. In particular, PostgreSQL has great support for embedded logic (also known in database land as stored functions or stored procedures); not only does it have its own PL/pgSQL (similar to Oracle's PL/SQL) it also supports embedding code in other languages like Python, Perl, Tcl, and others besides. This embedded code can be tied into constraints that get executed (roughly) prior to a SQL command, rules that get executed (roughly) during a SQL command, and triggers that get executed after a SQL command. Additionally these stored procedures can be called directly allowing not only the effective batching of SQL commands but also multiple SQL commands bound together logically (that is, it's trivial to make a stored procedure execute one list of SQL commands under one set of conditions but execute another under a different set of conditions), or fashioned into views that behave like tables.

This combination gives the designer two big advantages over less capable database systems: first, it becomes possible to embed code to help guarantee data integrity within the database itself by careful use of constraints, rules, and triggers; and second it becomes possible to hide the database's implementation details from any client code by creating a public API via stored procedures. Thus we're basically using information hiding and restricting data access to certain controlled channels. I'm guessing that anyone who has read this far will understand in principle why these are good things. I'll carry it further though and list some specific benefits obtained by using this type of database design.

No raw tables are accessed in Antichrist Watch. It has a database access API fashioned from stored procedures and views. This level insulates the client from the raw tables; right now internally it is heavily normalized and even uses table inheritance to avoid repetition, but if at some point down the road it became necessary to duplicate some information for the sake of performance, the tables could be completely redesigned without requiring any change in client code whatsoever. Furthermore, with a clean interface it's easy to change clients. Antichrist Watch has had three completely different front-ends already. My original version was hacked together quickly from basic Python; my second version used PHP; the current version uses Twisted and Zope Page Templates (via Repoze Chameleon). At each stage making the transition was reasonably easy. It would have been a bear (possibly even unthinkable) if the interface were using raw tables and the client was responsible for data integrity.
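
As a rough illustration of how a view can insulate client code from a table redesign, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL purely so that it runs anywhere, and the `author`, `entry`, `entry_flat`, and `entry_listing` names are hypothetical; they are not Antichrist Watch's actual schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Normalized internals that clients never touch directly.
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE entry (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES author(id),
        title TEXT NOT NULL
    );
    -- The public interface: a view that behaves like a table.
    CREATE VIEW entry_listing AS
        SELECT author.name AS author, entry.title AS title
        FROM entry JOIN author ON author.id = entry.author_id;
    INSERT INTO author (id, name) VALUES (1, 'Example Author');
    INSERT INTO entry (author_id, title) VALUES (1, 'Example Title');
""")

def list_entries(connection):
    """Client code: it knows only the view's name and columns."""
    return connection.execute(
        "SELECT author, title FROM entry_listing ORDER BY title").fetchall()

before = list_entries(db)

# Redesign the internals (denormalize for speed) and re-point the view.
db.executescript("""
    CREATE TABLE entry_flat AS SELECT author, title FROM entry_listing;
    DROP VIEW entry_listing;
    CREATE VIEW entry_listing AS SELECT author, title FROM entry_flat;
""")

after = list_entries(db)
```

The client function is untouched by the redesign and returns the same rows before and after, which is the property that makes swapping front-ends (or back-end layouts) tolerable.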

On that note it must be said that data integrity is also improved. I'm a big believer in trying to enforce data constraints as close to the data itself as practically possible. The further away the constraint enforcement is from the data, the more space there is for loopholes, security holes, and logic errors to breed. Antichrist Watch has all data constraints and relationships enforced by the database itself. Even if someone with malicious intent were to somehow get beyond the client-side JavaScript and server-side Twisted Python (or some bugs were to crop up in my code, be it client-side or server-side), the allowed options on the database side are restricted, limiting the potential damage that could be done. Basically it would be possible for fake data to get inserted, but it would not be possible for invalid data to get inserted, and this can mean the difference between a minor failure (accompanied by some embarrassment) and a critical failure (accompanied by a system crash). As an added bonus, overall performance also often improves by moving such data constraint enforcement into the tightly optimized world of the database server itself.
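
The distinction between fake data and invalid data can be made concrete with a small sketch. Again this uses sqlite3 only as a convenient stand-in for PostgreSQL, and the `subject` and `rating` tables and the 1-to-10 score range are assumptions invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite needs this switched on explicitly
db.executescript("""
    CREATE TABLE subject (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE rating (
        id INTEGER PRIMARY KEY,
        subject_id INTEGER NOT NULL REFERENCES subject(id),
        score INTEGER NOT NULL CHECK (score BETWEEN 1 AND 10)
    );
    INSERT INTO subject (id, name) VALUES (1, 'Example Subject');
""")

def try_insert(subject_id, score):
    """Return True if the database accepted the row, False if it refused."""
    try:
        with db:  # commit on success, roll back on error
            db.execute(
                "INSERT INTO rating (subject_id, score) VALUES (?, ?)",
                (subject_id, score))
        return True
    except sqlite3.IntegrityError:
        return False

fake_but_valid = try_insert(1, 7)      # made-up data, but well formed
out_of_range = try_insert(1, 99)       # violates the CHECK constraint
dangling_reference = try_insert(42, 5) # violates the foreign key
```

Even a buggy or hostile client that reaches the database can only add rows that satisfy the rules, so the worst case is a plausible-looking bogus row rather than a corrupted relationship.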

Please note that I'm not advocating the moving of all logic out of things like Zope, Twisted, or PHP into the database server itself. This would be counterproductive. Tools for developing in languages like PL/pgSQL tend to be more primitive than tools for developing in languages like unembedded Python, and application logic definitely belongs outside the database. What I am advocating is making the database responsible for its own data checking while hiding its source internals.

Friday, July 31, 2009

Plone 3.3 and Repoze

While I've already discussed what's generally required for making Plone run with Repoze and mod_wsgi, I had been using Plone 3.3rc3. Anyone (myself included) who's tried to sub in anything more recent (like Plone 3.3rc4) has run into a problem with version conflicts due to changes between 3.3rc3 and 3.3rc4. Specifically the upgrade of five.localsitemanager from 0.4 to 1.1 sets off a zope.component version conflict, as any version of it 1.0 or beyond tries to pull in a whole bunch of eggs that aren't really necessary. Now obviously pinning five.localsitemanager to a version less than 1.0 will allow the buildout to complete properly, but this isn't really a good solution as it creates a mutant (and thus untested) version of Plone that may or may not work in random situations.

The real fix isn't that hard, though. It's simply a matter of utilizing fake Zope 2 eggs. The only catch here is that in order for them to work, they must be in place before the Plone eggs get pulled, but as always they cannot be created until after Zope 2 itself is in place. This necessitates splitting the one Zope part I showed last time into two and sticking a fake eggs part in between. The second part will have to reference the eggs of the first part. To make things just a little more interesting, z3c.recipe.fakezope2eggs expects a location to be provided by the [zope2] part. We thus have to create one. The changed section will look as follows:

[zope2]
recipe = zc.recipe.egg
dependent-scripts = true
location = ${buildout:directory}/parts
eggs =
    lxml
    repoze.zope2

[zopeliblink]
recipe = iw.recipe.cmd
on_install = true
on_update = true
cmds =
    mkdir -p ${buildout:directory}/parts/lib/python
    ln -sh ${buildout:directory}/eggs/zopelib*/zope ${buildout:directory}/parts/lib/python/zope

[fakezope2eggs]
recipe = z3c.recipe.fakezope2eggs

[plone]
recipe = zc.recipe.egg
dependent-scripts = true
interpreter = zopepy
eggs =
    ${zope2:eggs}
    Plone
    PIL
    Products.DocFinderTab
    Products.ExternalEditor
    plone.openid
    deliverance
    repoze.urispace
    repoze.dvselect
    mysite.policy

It's probably obvious, but just in case it's not, the two new parts have to be added to the parts section near the top:

parts =
    lxml
    zope2
    zopeliblink
    fakezope2eggs
    plone
    instance
    slugs
    addpaths

Don't forget that the order is important, as the fake Zope eggs have to be created after Zope proper but before Plone.
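
Since the ordering is easy to get wrong when editing by hand, a tiny check can help. This is only a sketch written for a modern Python 3 interpreter (not the Python that Zope 2.10 itself runs on); `parts_in_order` is a hypothetical helper, not part of buildout.

```python
def parts_in_order(parts_text, required=("zope2", "fakezope2eggs", "plone")):
    """Return True if every required part is listed, in the required order."""
    parts = parts_text.split()
    if any(name not in parts for name in required):
        return False
    positions = [parts.index(name) for name in required]
    return positions == sorted(positions)

# The parts list from the buildout above, one part per line.
PARTS = """
    lxml
    zope2
    zopeliblink
    fakezope2eggs
    plone
    instance
    slugs
    addpaths
"""

ok = parts_in_order(PARTS)
```

Pasting the value of the parts option into `PARTS` and checking `ok` before running bin/buildout is cheaper than waiting for a version conflict to show up halfway through a build.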

Deliverance and URISpace should continue to work without changes.

Also, I've created eggs for zopelib 2.10.8.0 and ZODB 3.7.3 (the official versions that the Plone 3.3 builds are based upon; I've also made a zopelib 2.11.3 for people using straight Zope) and submitted them to the Repoze guys for inclusion in their distribution section. If you'd like to play around with these before they become official, you can find them in the Saugus.net distribution section.

Wednesday, July 15, 2009

Repoze, Zope, Plone, URISpace, and Deliverance

I've further developed the technique I outlined in my last post for making Plone run with Repoze and mod_wsgi. In particular I've included the ability to skin via Deliverance (including the ability to separate out different sections of the site for different treatments via URISpace) and enhanced the addpaths.py script to remove the limitations I'd indicated were there without going into details; it now both handles update cases better and is far more intelligent with regard to ensuring that egg path order is maintained regardless of whether Paste or mod_wsgi is being used. If you had trouble getting my prior instructions to work in your environment, you may just want to update to the smarter addpaths.py included below.
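
The path-ordering goal can be shown in isolation from the file rewriting. The function below is a simplified sketch of the intent (old-style products path and the zopelib egg ahead of everything else); `order_paths` is a hypothetical name, it works on a plain list rather than on generated scripts, and it is written for Python 3 rather than being a drop-in piece of addpaths.py.

```python
def order_paths(paths, products_path, marker="zopelib"):
    """Return paths with products_path first, then zopelib eggs, then the rest."""
    zopelib = [p for p in paths if marker in p]
    others = [p for p in paths if marker not in p and p != products_path]
    return [products_path] + zopelib + others

# Hypothetical egg locations, used only to exercise the function.
example = [
    "/buildout/eggs/a.egg",
    "/buildout/eggs/zopelib-2.10.7.0.egg",
    "/buildout/eggs/c.egg",
]
ordered = order_paths(example, "/buildout/products")
```

Because the function drops any existing copy of the products path before re-adding it, running it twice gives the same answer as running it once, which is the property that matters when buildout re-runs the script on update.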

My use case is Saugus.org (although note that the current public version may not yet reflect the changes discussed in this post as they're obviously being done on a separate development server). Basically several different not-for-profit entities have their sites located on this domain, and while there is a significant overlap between members of the various sub-sites (probably 10% - 25% or so) it's certainly not absolute, and each sub-site has different administrators and different styles. While there are ways of making Deliverance theme things differently for different portions of a site, I didn't really care for any of them in this application as they'd make it too easy to break things for particular sub-sites and/or make it too difficult to actually do the theming for individual sub-sites. The obvious solution is repoze.urispace (which implements the W3C URISpace specification in a way that can be used to make Deliverance distinguish between different portions of a site). With repoze.urispace, it becomes easy to give each sub-site its own independent set of Deliverance rules and themes, and working on one will in no way affect another.
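
To make the idea of per-sub-site selection concrete, here is a minimal sketch of choosing a theme and rules pair by URL prefix. It is loosely modelled on what a URISpace configuration expresses and is not the repoze.urispace API or its matching algorithm; the sub-site prefixes and file names are invented for the example.

```python
def select_theme(path, table, default):
    """Return the (theme, rules) pair whose prefix most specifically matches path."""
    best, best_length = default, -1
    for prefix, choice in table.items():
        prefix = prefix.rstrip("/")
        # Match whole path segments only, so /garden-clubhouse is not /garden-club.
        matches = path == prefix or path.startswith(prefix + "/")
        if matches and len(prefix) > best_length:
            best, best_length = choice, len(prefix)
    return best

# Hypothetical sub-sites, each with its own independent theme and rules.
DEFAULT = ("themes/default.html", "rules/default.xml")
SUBSITES = {
    "/garden-club": ("themes/garden.html", "rules/garden.xml"),
    "/garden-club/archive": ("themes/archive.html", "rules/archive.xml"),
}

chosen = select_theme("/garden-club/events", SUBSITES, DEFAULT)
```

Keeping each sub-site's pair in its own entry is what gives the isolation described above: editing one entry cannot change what another sub-site receives.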

Making It Happen

Now that you understand the motivation it's time to see the implementation. Generally one needs to follow the instructions I outlined in my earlier post, but make a few changes:

  1. First, substitute its copy of addpaths.py with the following much improved version prior to running bin/buildout:

    import os
    from dircache import listdir
    
    # The bin directory holds all the executables for the buildout
    BinDir = 'bin'
    
    # The unadorned files aren't given any of the egg paths on creation
    UnadornedFiles = (os.path.abspath(BinDir+'/zope2.wsgi'),)
    
    # Most regular files are given the egg paths, but not the old-style products path
    RegularFiles = [os.path.abspath('%s/%s'%(BinDir,filename)) for filename in listdir(BinDir)]
    
    # Eggs can live in more than one location
    EggDirs = ('eggs',)
    
    # Old style products should all be contained in one location
    ProductsPath = os.path.abspath('products')
    
    # The sample file should be a regular file with a regular egg path
    SampleFile = os.path.abspath(BinDir+'/paster')
    
    def main(options, buildout):
        # We have to ensure that the zopelib directory is earlier in the search path
        # than other possibly competing packages.  Unfortunately, we don't know its
        # full name a priori and have to hunt it down first.
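        # Default to None so the check further down cannot raise a NameError
        # when no zopelib egg is present.
        zopelibPath = None
        for eggDir in EggDirs:
            for filename in listdir(eggDir):
                if 'zopelib' in filename:
                    zopelibPath=os.path.abspath('%s/%s/%s'%(os.getcwd(),eggDir,filename))
                    break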
    
        # First we handle the regular files.  We both relocate the zopelib component
        # (if present) to the beginning of the list and prepend the old-style products
        # path.  Note that this will ignore unadorned files mixed in.
        for filename in RegularFiles:
            lines = open(filename, 'r').readlines()
            file = open(filename, 'w')
            for line in lines:
                if 'zopelib' not in line:
                    file.write(line)
                if line.startswith('sys.path'):
                    if ProductsPath not in ' '.join(lines):
                        file.write("  '%s',\n"%ProductsPath)
                    if zopelibPath:
                        file.write("  '%s',\n"%zopelibPath)
            file.close()
    
        # The path list should now be perfect in our sample file.  Grab it.
        lines=open(SampleFile,'r').readlines()
        lineNum=begLineNum=endLineNum=0
        for line in lines:
            lineNum+=1
            if line.startswith('import sys'):
                begLineNum=lineNum-1
            elif begLineNum and ']' in line:
                endLineNum=lineNum
        eggLines=lines[begLineNum:endLineNum]
    
        # Now we're ready to handle the unadorned files.  Simply replace any existing
        # sys path with the good one we've obtained from the sample, or add it if nothing
        # yet exists.  We're assuming that in a healthy file the import os statement will
        # occur after the import sys statement.
        for filename in UnadornedFiles:
            lines = open(filename,'r').readlines()
            alreadyProcessed=False
            lineNum=begLineNum=endLineNum=0
            for line in lines:
                lineNum+=1
                if line.startswith('import sys'):
                    alreadyProcessed=True
                    begLineNum=lineNum-1
                elif begLineNum and ']' in line:
                    endLineNum=lineNum
                elif line.startswith('import os') and not alreadyProcessed:
                    begLineNum=endLineNum=lineNum-1
            lines[begLineNum:endLineNum]=eggLines
            file = open(filename,'w')
            file.writelines(lines)
            file.close()
    
        # All done!  Let's just tell the user roughly what we've done.
        print "Egg paths added to %s" % ', '.join(UnadornedFiles)
        print "Product path added to %s" % ', '.join(RegularFiles)
    
  2. Next, a slightly modified buildout.cfg must be used (obviously also before running bin/buildout). The following should work:

    [buildout]
    extends =
        http://good-py.appspot.com/release/repoze.zope2/1.0
        http://dist.plone.org/release/3.3rc3/versions.cfg
    
    versions = versions
    
    find-links =
        http://dist.repoze.org/zope2/latest
        http://dist.repoze.org/zope2/dev
        http://dist.plone.org/release/3.3rc3
        http://download.zope.org/ppix/
        http://download.zope.org/distribution/
        http://effbot.org/downloads
    
    develop =
        src/mysite.policy
    
    parts =
        lxml
        zope2
        instance
        slugs
        addpaths
    
    [versions]
    zopelib = 2.10.7.0
    
    [lxml]
    recipe = z3c.recipe.staticlxml
    egg = lxml
    libxml2-url = http://xmlsoft.org/sources/libxml2-2.7.2.tar.gz
    libxslt-url = http://xmlsoft.org/sources/libxslt-1.1.24.tar.gz
    
    [zope2]
    recipe = zc.recipe.egg
    dependent-scripts = true
    interpreter = zopepy
    eggs =
        lxml
        repoze.zope2
        Plone
        PIL
        Products.DocFinderTab
        Products.ExternalEditor
        plone.openid
        deliverance
        repoze.urispace
        repoze.dvselect
        mysite.policy
    
    [slugs]
    recipe = collective.recipe.zcml
    zope2-location=${buildout:directory}
    zcml =
        mysite.policy
    
    [instance]
    recipe = iw.recipe.cmd
    on_install = true
    cmds =
       bin/mkzope2instance --use-zeo --zeo-port=${buildout:directory}/var/zeo.zdsock --zope-port=8888
       sed -i "" "s/server localhost:/server /" ${buildout:directory}/etc/zope.conf
       echo "Please run 'bin/runzeo -C etc/zeo.conf' and 'bin/paster serve etc/zope2.ini', then 'bin/addzope2user  '"
    
    [addpaths]
    recipe = z3c.recipe.runscript
    install-script = addpaths.py:main
    update-script = addpaths.py:main
    

    As before note the dummy mysite.policy product included to show the most general solution. If you are not developing any products and do not need any additional ZCML slugs, these can be totally omitted. Otherwise they should be changed accordingly.

  3. Next, add rules and themes subdirectories to the etc directory.

  4. Create a default.xml file in the rules directory. The following sample:

    <?xml version="1.0" encoding="UTF-8"?>
    <rules xmlns:xi="http://www.w3.org/2001/XInclude"
           xmlns="http://www.plone.org/deliverance">
      <replace theme="/html/head" content="/html/head" />
      <replace theme="/html/body" content="/html/body||/html/frameset" />
    </rules>
    

    is basically a NOP and will get you started. There's more information on the Deliverance site on the sorts of commands you can add in here. Ultimately you'll be making more of these rules files with each one being used for a different section of your site.

  5. Create a default.html file in the themes directory. Something as basic as:

    <html>
      <head>
        <title>default.html theme for Deliverance</title>
      </head>
      <body>
      </body>
    </html>
    

    will likewise be enough to get you started. You can ultimately create this file however you'd like using any sorts of HTML generation tools that tickle your fancy. Ultimately you'll be making more of these theme files (probably one to go with each rules file) with each one being used for a different section of your site.

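  6. Lastly you'll need to create a URISpace configuration file urispace.xml in the etc directory. While there's more info on the repoze.urispace site on the options that you can include in here, the following is a minimal one that directs everything to a single default theme and rules file. Note that the file URLs shown are absolute paths from my own installation; change them to point at the default theme and rules files created in the previous two steps: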

    <?xml version="1.0" ?>
    <themeselect
       xmlns:uri='http://www.w3.org/2000/urispace'
       xmlns:uriext='http://repoze.org/repoze.urispace/extensions'
       xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
       >
    
     <!-- default theme and rules -->
     <theme>file:///home/eric/WSGI/SaugusOrg/etc/themes/theme.xhtml</theme>
     <rules>file:///home/eric/WSGI/SaugusOrg/etc/rules/rules.xml</rules>
    </themeselect>
    
  7. Finally, before trying to run the new site one must ensure that the WSGI pipeline is properly prepared. The following new zope2.ini will do so:

    [DEFAULT]
    debug = True
    
    [app:zope2]
    paste.app_factory = repoze.obob.publisher:make_obob
    repoze.obob.get_root = repoze.zope2.z2bob:get_root
    repoze.obob.initializer = repoze.zope2.z2bob:initialize
    repoze.obob.helper_factory = repoze.zope2.z2bob:Zope2ObobHelper
    zope.conf = %(here)s/zope.conf
    
    [filter:errorlog]
    use = egg:repoze.errorlog#errorlog
    path = /__error_log__
    keep = 20
    ignore = paste.httpexceptions:HTTPUnauthorized
           paste.httpexceptions:HTTPNotFound
           paste.httpexceptions:HTTPFound
    
    [filter:deliverance]
    use = egg:deliverance#main
    theme_uri = http://www.example.com/
    rule_uri = file:///%(here)s/rules/rules.xml
    
    [filter:urispace]
    use = egg:repoze.urispace#urispace
    filename = %(here)s/urispace.xml
    
    [pipeline:main]
    pipeline = egg:Paste#cgitb
               egg:Paste#httpexceptions
    #           egg:Paste#translogger
               egg:repoze.retry#retry
               egg:repoze.tm#tm
               egg:repoze.vhm#vhm_xheaders
               errorlog
               urispace
               egg:repoze.dvselect#main
               deliverance
               zope2
    
    # Note: replace egg:Paste#cgitb with egg:Paste#evalerror above to get
    # the browser to display eval'able traceback stacks (unsuitable for
    # production).
    
    # If you enable (uncomment) the translogger, it will show access log
    # info to the console.
    
    [server:main]
    use = egg:repoze.zope2#zserver
    host = 127.0.0.1
    port = 8888
    

    The theme referenced in the Deliverance section will never actually be used as it'll be overridden by repoze.urispace.
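
Why the configured theme is never used becomes clearer with a toy pipeline. The sketch below uses plain WSGI callables with no Repoze or Deliverance code at all; the `Selector` and `Themer` classes and the `example.theme_uri` environ key are hypothetical stand-ins chosen for illustration, not the names the real packages use.

```python
from wsgiref.util import setup_testing_defaults

def content_app(environ, start_response):
    """Innermost application (the role zope2 plays in the pipeline)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"content"]

class Themer:
    """Inner filter: uses its configured theme unless the environ overrides it."""
    def __init__(self, app, theme_uri):
        self.app = app
        self.theme_uri = theme_uri

    def __call__(self, environ, start_response):
        theme = environ.get("example.theme_uri", self.theme_uri)
        body = b"".join(self.app(environ, start_response))
        return [("[%s] " % theme).encode("utf-8") + body]

class Selector:
    """Outer filter: picks a theme per request and records it in the environ."""
    def __init__(self, app, theme_for):
        self.app = app
        self.theme_for = theme_for

    def __call__(self, environ, start_response):
        environ["example.theme_uri"] = self.theme_for(environ.get("PATH_INFO", "/"))
        return self.app(environ, start_response)

def call(wsgi_app, path="/"):
    """Drive a WSGI callable once and return the response body."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    return b"".join(wsgi_app(environ, lambda status, headers: None))

themed_only = Themer(content_app, "http://www.example.com/")
pipeline = Selector(themed_only, lambda path: "file:///themes/default.html")

without_selector = call(themed_only)
with_selector = call(pipeline)
```

Because the selecting filter sits outside the theming filter, its choice is already in the environ by the time the theming filter runs, so the configured fallback only matters if the selector is removed from the pipeline.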

Details To Note

First off, thanks to Tres Seaver not just for originally writing repoze.urispace and its companion repoze.dvselect, but also for putting up with (and acting on) both my bug reports and suggestions for making things easier to use. I had originally gotten this all working with an earlier version of repoze.urispace and the original unreleased version of repoze.dvselect, but the method described here is much cleaner, and it wouldn't have been possible without Tres' numerous recent updates to both repoze.urispace and repoze.dvselect.

Second, as seems to happen a lot lately, just as I was starting to write this I was informed of another effort to do something similar. Wojciech Lichota's technique seems to do some things better and some things worse than the technique described above. Depending upon what you're doing, you may find that what he's doing more directly addresses your needs. Probably a hybrid technique will be better than either...

Also, one may wonder about the whole z3c.recipe.staticlxml used above to build lxml and whether or not it's really necessary. It may look like extra work, but this method allows the exact same buildout to work on Mac OS X in addition to regular more traditional UNIX and UNIX-like environments.

Finally, note that as presented here Deliverance and URISpace won't actually do anything... it's expected that you'll add rules and themes appropriate to your own project.

Friday, June 19, 2009

Buildout with Repoze, Zope, and Plone

I know in my last post I indicated that I'd post more about the various technologies I'm using for the Antichrist Watch site, but I've now had a few requests related to some of the Repoze work I'm doing with some of the local Saugus, MA sites, so I'll be making a slight detour. I will get back to the Dojo stuff. I promise.

I discovered Repoze when first dabbling around with what would eventually become the aforementioned Antichrist Watch, and in fact I directly made use of repoze.chameleon to handle its templating needs. I liked what I saw, and decided to try first experimenting a bit with it and then actually porting over some real-world sites to it. We were in the process of doing some hardware upgrades on some of the servers anyway, so the sites they hosted seemed like good candidates. To make things interesting, most of them came in basically two distinct flavors: straight Zope, and customized Plone.

The first group was largely composed of mostly non-technical customers who take advantage of the bottom of the so-called "Z shaped curve" to do basic site edits through the Web. Some of these customers enjoy having fairly current versions of Zope. The second group featured various degrees of customization, from the fairly vanilla to the strikingly different. Most of these were handled via custom Plone products designed more or less in the manner described in Martin Aspeli's Professional Plone Development. With this logical grouping, buildout seemed a logical choice for site construction as I could make just three buildouts and then trivially generate as many sites as desired. The only problems were that the Repoze guys themselves seem not to use it much and (understandably considering the drudgery involved) don't have prepackaged versions of the latest-and-greatest of either Zope or Plone currently available.

Ultimately I wanted all the sites running through mod_wsgi using as few physical servers as practical. I wanted to only have one supervisord instance per physical server monitoring all the ZEO servers it contained. I also wanted to minimize the number of ports in use in order to reduce the bureaucracy of port tracking we'd have to do afterwards.

Getting current versions of Zope and Plone running under Repoze turned out not to be so simple. I read a few articles on the topic in addition to the Repoze Quick Start, but due to differences in versions and/or environment none of them did what I needed.

Making It Happen

Enough introduction! Here's what I did for the variant with Plone:

  1. paster create -t zope2_buildout targetname

    It doesn't much matter what answers are given, as buildout.cfg gets overwritten anyway.

  2. cd targetname

  3. Replaced buildout.cfg with the following:

    [buildout]
    extends =
        http://good-py.appspot.com/release/repoze.zope2/1.0
        http://dist.plone.org/release/3.3rc3/versions.cfg
    
    versions = versions
    
    find-links =
        http://dist.repoze.org/zope2/latest
        http://dist.repoze.org/zope2/dev
        http://dist.plone.org/release/3.3rc3
        http://download.zope.org/ppix/
        http://download.zope.org/distribution/
        http://effbot.org/downloads
    
    develop =
        src/mysite.policy
    
    parts =
        zope2
        instance
        slugs
        addpaths
    
    [zope2]
    recipe = zc.recipe.egg
    dependent-scripts = true
    interpreter = zopepy
    eggs =
        repoze.zope2
        Plone
        PIL
        Products.DocFinderTab
        Products.ExternalEditor
        plone.openid
        mysite.policy
    
    [slugs]
    recipe = collective.recipe.zcml
    zope2-location=${buildout:directory}
    zcml =
        mysite.policy
    
    [instance]
    recipe = iw.recipe.cmd
    on_install = true
    cmds =
       bin/mkzope2instance --use-zeo --zeo-port=${buildout:directory}/var/zeo.zdsock --zope-port=8888
       sed -i "" "s/server localhost:/server /" ${buildout:directory}/etc/zope.conf
       echo "Please run 'bin/runzeo -C etc/zeo.conf' and 'bin/paster serve etc/zope2.ini', then 'bin/addzope2user  '"
    
    [addpaths]
    recipe = z3c.recipe.runscript
    install-script = addpaths.py:main
    update-script = addpaths.py:main
    
  4. Added the file addpaths.py to the buildout's top level directory with the following contents:

    import os
    from dircache import listdir
    
    BinDir = 'bin'
    # The WSGI script is generated without any sys.path set-up at all.
    UnadornedFiles = ('bin/zope2.wsgi',)
    # Every script in bin gets the old-style products directory added.
    RegularFiles = ['%s/%s'%(BinDir,filename) for filename in listdir(BinDir)]
    EggDirs = ('eggs',)
    ProductsPath = os.path.abspath('products')
    
    def main(options, buildout):
        # Pass 1: give the unadorned scripts a sys.path block listing every egg.
        for filename in UnadornedFiles:
            lines = open(filename,'r').readlines()
            out = open(filename,'w')
            alreadyProcessed=False
            for line in lines:
                if line.startswith('import sys'):
                    # Script already imports sys (e.g. from an earlier run); leave it alone.
                    alreadyProcessed=True
                elif line.startswith('import os') and not alreadyProcessed:
                    out.write("import sys\nsys.path[0:0] = [\n")
                    for eggDir in EggDirs:
                        for eggName in listdir(eggDir):
                            out.write("  '%s',\n"%os.path.abspath('%s/%s'%(eggDir,eggName)))
                    out.write("  ]\n\n")
                out.write(line)
            out.close()
        # Pass 2: add the products directory right after each line starting with sys.path.
        for filename in RegularFiles:
            lines = open(filename, 'r').readlines()
            out = open(filename, 'w')
            for line in lines:
                out.write(line)
                if line.startswith('sys.path'):
                    out.write("  '%s',\n"%ProductsPath)
            out.close()
    
        print "Egg paths added to %s" % ', '.join(UnadornedFiles)
        print "Product path added to %s" % ', '.join(RegularFiles)
    
  5. python2.4 bootstrap.py

  6. bin/buildout

    Just ignore any errors of the "'return' outside function" variety that you see.

Once these steps have been completed, ZEO can be started with the command bin/runzeo -C etc/zeo.conf (which can be easily controlled via supervisord) and Zope can be started manually for testing with bin/paster serve etc/zope2.ini or in a more production-ready form via mod_wsgi using zope2.wsgi.
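
As a rough sketch of the supervisord side, the snippet below generates one [program:...] section per buildout, each running the same bin/runzeo -C etc/zeo.conf command mentioned above. The site names and the /srv/sites base directory are made up for illustration, and only a handful of common supervisord options are shown; treat it as a starting point rather than a finished configuration.

```python
import configparser
import io
import posixpath

# Hypothetical layout: every buildout lives in its own directory under BASE_DIR.
BASE_DIR = "/srv/sites"
SITES = ["site-one", "site-two"]  # placeholder names, not real sites


def zeo_program_sections(base_dir, sites):
    """Build one supervisord [program:x] section per ZEO server."""
    config = configparser.ConfigParser()
    for site in sites:
        root = posixpath.join(base_dir, site)
        config["program:zeo-%s" % site] = {
            # Same command used to start ZEO by hand, but with absolute paths.
            "command": "%s/bin/runzeo -C %s/etc/zeo.conf" % (root, root),
            "directory": root,
            "autostart": "true",
            "autorestart": "true",
        }
    return config


def render(config):
    """Return the configuration as text suitable for a supervisord include file."""
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render(zeo_program_sections(BASE_DIR, SITES)))
```

Generating the sections from a list keeps the "one supervisord per physical server" arrangement manageable as sites get added, since each new buildout is just one more entry.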

Details To Note

I borrowed but heavily modified the addpath.py concept already living in the Plone collective to add all missing paths to all the scripts in bin. The WSGI script as created lacks pretty much everything, and all the others lack the products directory used for old-style products. As written it's not very clever and could be greatly improved, but it serves my current needs.
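
To make the effect on the scripts concrete, here is a small in-memory sketch of the same idea in present-day Python. It is not the script from step 4, just an illustration that works on lists of lines instead of files; the sample script contents and the egg and products paths are invented.

```python
def inject_egg_paths(lines, egg_paths):
    """Insert a sys.path block before the first 'import os' line.

    Nothing is inserted if an 'import sys' line appears first, which is
    taken to mean the script was already processed.
    """
    result = []
    already_processed = False
    for line in lines:
        if line.startswith("import sys"):
            already_processed = True
        elif line.startswith("import os") and not already_processed:
            result.append("import sys\n")
            result.append("sys.path[0:0] = [\n")
            for path in egg_paths:
                result.append("  '%s',\n" % path)
            result.append("  ]\n\n")
            already_processed = True
        result.append(line)
    return result


def add_products_path(lines, products_path):
    """Add the products directory right after any line starting with 'sys.path'."""
    result = []
    for line in lines:
        result.append(line)
        if line.startswith("sys.path"):
            result.append("  '%s',\n" % products_path)
    return result


if __name__ == "__main__":
    script = ["import os\n", "application = None\n"]  # invented stand-in for zope2.wsgi
    eggs = ["/buildout/eggs/Example-1.0.egg"]         # hypothetical egg path
    patched = add_products_path(inject_egg_paths(script, eggs), "/buildout/products")
    print("".join(patched))
```

Note that the second function adds the products path every time it runs, so unlike the first it is not safe to apply repeatedly to the same lines.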

The order of parts matters somewhat, as both slugs and addpaths rely on pieces created earlier.

I'm using direct sockets in lieu of ports since these sites are living on single physical servers anyway and it means I have less to manage afterwards. Unfortunately mkzope2instance doesn't do exactly what one might want in this case, so the sed line is necessary afterwards to clean up.
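
For anyone not fluent in sed, the clean-up amounts to the substitution sketched below. The sample zope.conf fragment is only an assumption about what mkzope2instance writes when handed a socket path in place of a port number, so check your own generated file. Also note that sed -i "" is the BSD/Mac OS X spelling; GNU sed takes -i without the separate empty argument.

```python
def strip_localhost(conf_text):
    """Mimic sed 's/server localhost:/server /': first match on each line."""
    fixed = [
        line.replace("server localhost:", "server ", 1)
        for line in conf_text.splitlines(True)
    ]
    return "".join(fixed)


if __name__ == "__main__":
    # Assumed shape of the generated ZEO client section; the path is a placeholder.
    sample = (
        "<zeoclient>\n"
        "  server localhost:/path/to/buildout/var/zeo.zdsock\n"
        "  storage 1\n"
        "</zeoclient>\n"
    )
    print(strip_localhost(sample))
```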

The slugs section adds ZCML slugs for those products that need them, including the hypothetical product mysite.policy being actively developed in src.

I just recently discovered Martin Aspeli's Good-Py. I was originally individually pinning the versions of the pieces that mattered, and only made this simplification this morning... so far it seems to be fine, though.

Just before starting this post I spotted Alex Clark's post also discussing this topic. We're doing a lot alike, but a few things differently. Depending upon what you're doing, you may find that what he's doing more directly addresses your needs.

Earlier on I mentioned how I needed to handle two main groups of sites: Plone ones and straight Zope ones. The above instructions only cover how I handled the Plone ones. This post has gotten long enough already, and although my treatment of the straight Zope ones is similar it's different enough to make things confusing for those who don't need it. If you need it, let me know and I'll probably give it a similar treatment to what I did here.

Friday, May 29, 2009

Antichrist Structure

Last time I mentioned that I'd post some articles discussing how some of the key technologies (like the Dojo Toolkit, PostgreSQL, Twisted, Zope, Python, and Repoze) are used by Antichrist Watch. In this post I'll discuss some of the general architecture behind the site, along with motivations behind some of the design decisions I made. In future posts I'll follow up with specifics related to some of the key technologies employed.

I've already covered the originally planned core features of the site, so I won't repeat them here. What's important for this discussion is that those features require fast access to live data on votes, tallies, and comments, and the ability to display that data in a number of ways, and that everything be done cleanly according to current industry standards (that is, using decent semantic HTML backed by appropriate CSS and RDF that'll not cause the respective W3C validators to choke). Obviously an AJaX design is called for, and the individual toolkits and frameworks used must be flexible enough to support reasonably standards-compliant code and mark-up. This may sound trivial on the surface, but oftentimes toolkits, frameworks, and even design tools have their own axes to grind and introduce deliberate incompatibilities or proprietary extensions in an attempt to promote their own stuff or denigrate others' stuff.

PostgreSQL for the back-end database was an easy choice. It performs well, has great support for standard SQL (and even XML) built-in, and has advanced capabilities like triggers, rules, and stored procedures that can be used to help ensure data integrity. I'm a big believer in keeping code that enforces data integrity as deep within a Web application as practically possible; the closer it is to the front-end the more space there is for something to go wrong. With PostgreSQL I was able to get most of the code that handles data integrity embedded in the database itself, so not only is there not even a theoretical way for a crafty user to bypass constraints, there's not even a theoretical way for misbehaving middleware to bypass them. I'll discuss this in more detail later.
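
A small sketch of the principle, with heavy caveats: it uses Python's bundled sqlite3 module as a stand-in for PostgreSQL purely so that it runs without a server, and the candidates/votes schema is hypothetical rather than the site's actual one. The point is only that rules declared in the database hold no matter what the calling code tries to do.

```python
import sqlite3

# Hypothetical schema; the constraints live in the database, not the caller.
SCHEMA = """
CREATE TABLE candidates (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE CHECK (length(trim(name)) > 0)
);
CREATE TABLE votes (
    id           INTEGER PRIMARY KEY,
    candidate_id INTEGER NOT NULL REFERENCES candidates(id),
    cast_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_database():
    """Create an in-memory database with the constraints switched on."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite needs this; PostgreSQL does not
    db.executescript(SCHEMA)
    return db


def try_insert(db, sql, params):
    """Return True if the database accepted the row, False if it refused."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = open_database()
    # A well-formed candidate, a blank name, and a vote for a nonexistent candidate.
    print(try_insert(db, "INSERT INTO candidates (name) VALUES (?)", ("Example",)))
    print(try_insert(db, "INSERT INTO candidates (name) VALUES (?)", ("   ",)))
    print(try_insert(db, "INSERT INTO votes (candidate_id) VALUES (?)", (999,)))
```

The calling code here does no validation of its own, which is the situation misbehaving middleware would create; the bad rows are refused anyway.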

The Dojo Toolkit for the front-end JavaScript will be surprising to some, as many believe its recent (post 1.0) versions cannot be used within a site that validates. This is incorrect. What is true is that Dojo's Dijit widgets cannot be used in a validating site via mark-up without modifying Dojo's parser. This is a completely different statement, as there are other (and better) ways to apply Dijit widgets to a site. I'll discuss my use of the Dojo Toolkit in quite a bit more detail later. Most of the site's code is actually Dojo code, so I suspect I'll dedicate a couple of posts to it.

Of course, the JavaScript needs to be tied into some mark-up, and for this I used a single Zope Page Template designed to output clean XHTML. This would be trivial for a typical Zope site, but in this case it's certainly not typical as only scattered Zope technologies are being used. I'll get back to this in a bit.

The Dojo Toolkit has the concept of an abstracted data store that it leverages heavily for most of its widgets. It has numerous concrete implementations of this store that make binding it to various data sources fairly trivial. The one it includes for dealing with relational databases requires data to adhere to a particular format that would be awkward to generate directly in PostgreSQL, so it was necessary to create some code to fetch data from PostgreSQL and format it for Dojo. Since I had already built a great deal of intelligence into the PostgreSQL database itself, it was actually fairly simple to create this middleware, and during development I tested out a couple of different approaches (including raw Python and PHP) before settling on one designed around Twisted Python. It makes good use of Twisted Enterprise and the psycopg2 database adapter (including its splendid DictCursor) to make database access fast, efficient, and comprehensible. I also tied the rendering of the main Zope Page Template into this Twisted app by virtue of Repoze Chameleon in lieu of making it a typical Zope site. This whole layer is probably also worthy of its own post.
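
As a rough illustration of the formatting step, the sketch below wraps dictionary-style rows (the sort of thing a psycopg2 DictCursor row turns into when passed to dict()) in the identifier/label/items envelope read by dojo.data.ItemFileReadStore. That envelope is an assumption made for the example, as the store used by the site may well expect something different, and the column names are invented. The Twisted and database plumbing is left out entirely.

```python
import json


def rows_to_store(rows, identifier="id", label="name"):
    """Wrap row dictionaries in an ItemFileReadStore-style JSON document."""
    items = [dict(row) for row in rows]
    for item in items:
        if identifier not in item:
            raise KeyError("row is missing identifier column %r" % identifier)
    payload = {"identifier": identifier, "label": label, "items": items}
    # ensure_ascii=False keeps non-ASCII names readable; the caller encodes as UTF-8.
    return json.dumps(payload, ensure_ascii=False)


if __name__ == "__main__":
    # Invented rows standing in for a query result.
    sample_rows = [
        {"id": 1, "name": "Candidate A", "votes": 12},
        {"id": 2, "name": "Candidate B", "votes": 7},
    ]
    print(rows_to_store(sample_rows))
```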

That's a quick run-through of what it takes to reveal the Antichrist. I'll be posting more details on these individual components later; please feel free to point out areas of particular interest.

Wednesday, March 18, 2009

Who Is The Antichrist?

I know it's been a long time since I've written anything here. A really long time. I figured I'd break this silence with an overview of a project I've been working on lately that will hopefully prove interesting.

It's a site called Antichrist Watch. It attempts to answer this article's titular question, "who is the Antichrist?", in a Web 2.0 sort of way using a fun mix of AJaX technologies.

I recently realized that much of the work that I do ends up buried within various intranets and extranets and is thus invisible to the world at large. While I've worked on all sorts of dynamic systems that do all sorts of things (both server-side and client-side), I really don't have any that I can show people. Most of the public sites that I work on tend to be simpler sorts of things, and people viewing them tend to mostly remember the graphic design elements (which in most cases I didn't actually work on anyway). With Antichrist Watch I have something that I can show both curious acquaintances and potential clients. Its dynamic properties are fundamental to how the site works, and even people not all that familiar with the Web can get it.

Under the hood it uses a somewhat unusual combination of key technologies that not too many others use together: the Dojo Toolkit, Zope, Twisted, Python, and PostgreSQL. While people familiar with my company Saugus.net will not be at all surprised by Zope, Python, and PostgreSQL (we've been using them all pretty prominently for years), Dojo and Twisted may be unexpected. We've actually also been using these off and on for quite a while too, but not in too many public projects. In fact, thinking back, there are currently only two public areas of Saugus.net sites that utilize Dojo, and no public areas at all that use Twisted. While we are hoping to change this in the relatively near future, Antichrist Watch demonstrates how these technologies can work together in harmony now.

While I'm still working on the site fixing bugs and adding new minor features, all of my originally planned core features are in place:

  1. The ability to vote on one's own choice of Antichrist, whether or not he or she is already in the list, with autocompletion based upon the live list of everyone's past choices.
  2. Charts and tables showing current leaders for the past day, the past week, the past month, the past year, and all time.
  3. Additional charts and tables showing votes as they come in with individual votes mapped to rough geographical areas, and the recent levels of voting activity over time.
  4. A way to drill down and see the numbers for any particular Antichrist candidate.
  5. A facility that allows one to argue for or against the likelihood of a particular Antichrist candidate being the Antichrist, respond to others' comments, and display it all in a tree.
  6. The means to moderate the aforementioned comments.
  7. A login system allowing people to establish identities within the site and voluntarily add personal information that will be displayed along with their comments to add personality.
  8. Support for various icon types to go along with login identities, including at minimum PIcons, Gravatars, and .Mac icons.
  9. Support for OpenID to ease the whole login process.
  10. Basic support for forward and back buttons within the site.
  11. A reasonably semantic design including a fully populated external RDF metadata file boasting appropriate Dublin Core terms and DOAP information (plus a few other useful nuggets).
  12. Clean handling of the full UTF-8 character set throughout.
  13. Proper validation of pretty much everything according to the W3C's respective validators.

It does all of this in a way that almost totally avoids page reloads (the only exception is with OpenID authentication, which by definition requires a redirect offsite and back). It also makes heavy use of lazy loading within its tables and trees, so that although access is provided to all data all the way back to day one, only the data actually required for the current view gets downloaded, keeping things fast on both the client and server side. Of course, through the magic of Dojo it also works pretty well in all major browsers.
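
The server side of lazy loading boils down to answering "give me count rows starting at start" and reporting the total, so that the client knows how much more there is to ask for. Below is a minimal sketch of that contract over an in-memory list; the function name, the start/count parameters, and the numRows key are illustrative assumptions, not the site's actual interface.

```python
def fetch_page(all_rows, start=0, count=25):
    """Return only the requested window of rows plus the overall total."""
    if start < 0 or count < 0:
        raise ValueError("start and count must be non-negative")
    window = all_rows[start:start + count]
    return {"numRows": len(all_rows), "items": window}


if __name__ == "__main__":
    # Invented tallies standing in for real vote data.
    tallies = [{"rank": n, "votes": 1000 - n} for n in range(1, 501)]
    page = fetch_page(tallies, start=50, count=10)
    print(page["numRows"], len(page["items"]), page["items"][0]["rank"])
```

With a real database behind it, the slicing would more sensibly be pushed into the query itself (for instance with LIMIT and OFFSET) so that unneeded rows never leave the database.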

There are still lots of little details that need to be added. A means for automated password recovery quickly comes to mind, as well as more styling (what's there now is pretty close to being bare Dojo tundra), some more selective refreshing of data based on user input, better error handling (both in terms of weird user input and HTTP type errors like the dreaded 404), some friendly help for new users, and some more intelligence added to certain buttons and dialogs already within the site.

I'll post some more articles here discussing some of the key technologies and how I put them to use within the site. If anyone has any particular areas of interest, don't be afraid to speak up.