<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>notes from Ken &#187; Python</title>
	<atom:link href="http://www.notes.xythian.net/category/programming/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.notes.xythian.net</link>
	<description>Links, technical notes, whatnot.</description>
	<lastBuildDate>Sun, 04 Jul 2010 19:01:15 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Python and libyahoo2, take 2</title>
		<link>http://www.notes.xythian.net/2009/06/01/python-and-libyahoo2-take-2/</link>
		<comments>http://www.notes.xythian.net/2009/06/01/python-and-libyahoo2-take-2/#comments</comments>
		<pubDate>Tue, 02 Jun 2009 04:48:08 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/?p=247</guid>
		<description><![CDATA[Last October I worked on a libyahoo2 binding using Pyrex and got it far enough along to discover the version of libyahoo2 I was using couldn&#8217;t log into YIM. Several Ubuntu updates later I noticed libyahoo2 had been upgraded and, sure enough, this version worked. I updated the Git repository mentioned in that post but [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.notes.xythian.net/2008/10/27/pyrex-and-libyahoo2-or-not/">Last October</a> I worked on a libyahoo2 binding using Pyrex and got it far enough along to discover the version of libyahoo2 I was using couldn&#8217;t log into YIM.</p>
<p>Several Ubuntu updates later I noticed libyahoo2 had been upgraded and, sure enough, this version worked.</p>
<p>I updated the Git repository mentioned in that post, but if I continue to work on it, it will likely move to <a href="http://github.com/xythian/python-yahoo2/tree/master">python-yahoo2 on GitHub</a>.</p>
<p>Pyrex&#8217;s blindness where &#8220;const&#8221; is concerned is pretty annoying.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2009/06/01/python-and-libyahoo2-take-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ubuntu 9.04 and python-virtualenv</title>
		<link>http://www.notes.xythian.net/2009/05/31/ubuntu-904-and-python-virtualenv/</link>
		<comments>http://www.notes.xythian.net/2009/05/31/ubuntu-904-and-python-virtualenv/#comments</comments>
		<pubDate>Sun, 31 May 2009 19:26:17 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Experience]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/?p=245</guid>
		<description><![CDATA[I recently upgraded a bunch of physical and virtual machines from various Ubuntu v.older version to Ubuntu 9.04. Naturally, this broke my Python development environment since Python was upgraded from 2.5 to 2.6. I had instructed easy_install to put things into /usr/local. The upgrades from Ubuntu v.older to 9.04 went quite smoothly on every machine [...]]]></description>
			<content:encoded><![CDATA[<p>I recently upgraded a bunch of physical and virtual machines from various Ubuntu v.older version to Ubuntu 9.04.   Naturally, this broke my Python development environment since Python was upgraded from 2.5 to 2.6.  I had instructed easy_install to put things into /usr/local.</p>
<p>The upgrades from Ubuntu v.older to 9.04 went quite smoothly on every machine including the ones where I had to do several stepped upgrades since on various machines I skipped one or more upgrades prior to 9.04.</p>
<p>Rather than return to my old ways of just hosing packages (and, worse, &#8220;setup.py develop&#8221; symlinks) into /usr/local I&#8217;ve decided to use python-virtualenv to create some non-root-owned Python environments to work on my various Python-y apps.  The main difference to me is being able to keep things I&#8217;m working on separate from one another.</p>
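<p>For a concrete picture of what such an isolated environment looks like, here is a minimal sketch. It swaps in the standard library&#8217;s venv module (a later relative of virtualenv that ships with Python 3) rather than python-virtualenv itself, and the directory name is made up for illustration.</p>

```python
import os
import tempfile
import venv


def make_env(parent_dir, name="scratch-env"):
    """Create an isolated Python environment under parent_dir and return its path."""
    env_dir = os.path.join(parent_dir, name)
    # with_pip=False keeps the sketch quick and offline; a working
    # environment would normally be created with pip available.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        env_dir = make_env(parent)
        # pyvenv.cfg marks the directory as a self-contained environment.
        print(os.path.isfile(os.path.join(env_dir, "pyvenv.cfg")))
```

<p>Each app gets its own directory like this one, owned by an ordinary user, so nothing has to land in /usr/local.</p>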
<p>We&#8217;ll see how it goes.</p>
<p>I&#8217;m afraid all these years of using package-managed software have made me soft &#8212; now a piece of software not having a nice package means I need to think a bit harder about whether it&#8217;s worth dealing with the hassle of having that software outside of the package manager.  (That applies to when I&#8217;d like to be using a more recent version of the package, too, though generally on Ubuntu I run into that a lot less frequently than I did on Debian.)   I have mixed feelings about setuptools and easy_install.</p>
<p>I am pleased that mod_wsgi is now in a package so I can switch to that version rather than the one I installed by hand.   mod_wsgi was worth installing by hand until a nice package was available.  It&#8217;s Just Better than my previous mechanisms for running WSGI apps in, under, or behind Apache.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2009/05/31/ubuntu-904-and-python-virtualenv/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Singleshot and Git and first steps</title>
		<link>http://www.notes.xythian.net/2009/03/04/singleshot-and-git-and-first-steps/</link>
		<comments>http://www.notes.xythian.net/2009/03/04/singleshot-and-git-and-first-steps/#comments</comments>
		<pubDate>Wed, 04 Mar 2009 08:12:18 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Experience]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Singleshot]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/?p=240</guid>
		<description><![CDATA[Singleshot is the photo album software I use to host photos.xythian.com. The last release was .. rather a while ago though I have been making minor enhancements and bug fixes since then for my own use (including flash video support which extracts thumbnails using mplayer and embeds flowplayer to play) though there&#8217;s no video on [...]]]></description>
			<content:encoded><![CDATA[<p>Singleshot is the photo album software I use to host <a href="http://photos.xythian.com/">photos.xythian.com</a>.  The last release was .. rather a while ago, though I have been making minor enhancements and bug fixes since then for my own use (including flash video support, which extracts thumbnails using mplayer and embeds flowplayer for playback, though there&#8217;s no video on my public photo site).</p>
<p>Prior to today every time I took the time to try and update the source tree on Sourceforge I ran into a snag such as &#8220;Sourceforge CVS is unavailable.&#8221;   Months ago, when I tried to migrate from CVS to SVN on Sourceforge I ran into errors.  I didn&#8217;t even try to investigate what went awry as I was low on patience and time.</p>
<p>As a result, my own tree (first in Perforce and now in Subversion) has gotten pretty out of sync with that tree.  I decided to try GitHub and try using Git for something &#8220;real&#8221; &#8212; so I&#8217;m going to move Singleshot&#8217;s source to GitHub and then push all my changes to there.</p>
<p>I followed (mostly) the steps from <a href="http://stackoverflow.com/questions/584522/how-to-export-revision-history-from-mercurial-or-git-to-cvs/584567">How to export revision history from mercurial or git to cvs</a>, which describes how to use the git cvsimport command.  I pulled Singleshot&#8217;s revision history from its Sourceforge CVS repository, tested making a tiny change to a README, and pushed that change back to the Sourceforge CVS tree.</p>
<p>(After backing up the source tree on Sourceforge using rsync.)</p>
<pre>
% export CVS_RSH=ssh
% git cvsimport -d :ext:xythian@singleshot.cvs.sourceforge.net:/cvsroot/singleshot -C test1 \
    -r cvs -k -A /home/fox/src/singleshot-import/authors.txt  singleshot
Initialized empty Git repository in /home/fox/src/singleshot-import/test1/.git/
[longish pause]
Counting objects: 561, done.
Compressing objects: 100% (502/502), done.
Writing objects: 100% (561/561), done.
Total 561 (delta 317), reused 0 (delta 0)
</pre>
<p>(Some poking around to see if things look reasonable, making a small change to the README, committing it first to the git repository and then back to Sourceforge using git cvsexportcommit.)</p>
<p>Then I pushed the whole thing to a GitHub tree:</p>
<p>http://github.com/xythian/singleshot/tree/master</p>
<pre>
% git remote add origin git@github.com:xythian/singleshot.git
% git push origin master
</pre>
<p>That was pretty straightforward and appears to have worked.</p>
<p>My plan is to merge my changes into the out of date tree and then push the new code to GitHub.  If it all works out I&#8217;ll probably indicate that the Sourceforge project is defunct there and host the code on GitHub henceforth. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2009/03/04/singleshot-and-git-and-first-steps/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pyrex and libyahoo2 (or not)</title>
		<link>http://www.notes.xythian.net/2008/10/27/pyrex-and-libyahoo2-or-not/</link>
		<comments>http://www.notes.xythian.net/2008/10/27/pyrex-and-libyahoo2-or-not/#comments</comments>
		<pubDate>Tue, 28 Oct 2008 04:24:58 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Experience]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/2008/10/27/pyrex-and-libyahoo2-or-not/</guid>
		<description><![CDATA[I had something I wanted to try in Python running against Yahoo! Messenger. The obvious choice of library for talking to Yahoo! Messenger was libyahoo2.&#160;&#160; I could not find a Python binding for it, so I started sketching one together with SWIG.&#160; The first step is creating a bunch of empty callbacks. The libyahoo2 in [...]]]></description>
			<content:encoded><![CDATA[<p>I had something I wanted to try in Python running against Yahoo! Messenger.  The obvious choice of library for talking to Yahoo! Messenger was <a href="http://libyahoo2.sourceforge.net/">libyahoo2</a>.&#160;&#160; I could not find a Python binding for it, so I started sketching one together with SWIG.&#160; The first step is creating a bunch of empty callbacks.  The libyahoo2 in Ubuntu&#8217;s package is compiled without USE_CALLBACK_STRUCT, so libyahoo2 expects to find a bunch of extern functions defined to interact with the host.&#160;I made empty callbacks in a C file and started reading more about the API. </p>
<p>It rapidly became clear that I was going to want a layer on top of the library to make interacting with it from Python more palatable.&#160;&#160; I switched to Pyrex, since I wanted to write that wrapper in Python (or something Python-like) rather than building a straight-C wrapper just so that using SWIG would continue to make sense.&#160;&#160; SWIG&#8217;s big benefit in my mind over Pyrex is easy support for more languages and better tools for defining straight wrappers.&#160; I wasn&#8217;t going to get &quot;free&quot; use in other languages and it wasn&#8217;t going to be a straight wrapper now but rather a module that exposed the functionality of libyahoo2 to Python. </p>
<p> I kept the callbacks.c I had defined but started migrating the definitions to the Pyrex file as I implemented.&#160; This way my library would continue to link without complaint about functions I didn&#8217;t have yet.</p>
<p>Following the usual pattern for Python bindings to libraries that need wrappers to be more Pythonic, I planned to have a &#8216;yahoo2&#8217; module in Python and a &#8216;_yahoo2.so&#8217; extension module.&#160; The _yahoo2 module is written in Pyrex. </p>
<p>libyahoo2 appears not to adhere to the documentation it defines.&#160; It&#8217;ll call ext_yahoo_remove_handler with a tag that was never returned by ext_yahoo_add_handler&#8230; (0).&#160; It looks like an undocumented (that I found) part of the charter of ext_yahoo_add_handler is not to add a given handler more than once. </p>
<p>This also made defining the callbacks the way libyahoo2 expected easier.</p>
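<p>The handler contract described above (tags that were never handed out, such as 0, may be passed to remove; a given handler should not be added twice) can be sketched in plain Python. This is only an illustration: HandlerRegistry and its methods are hypothetical names, and treating a handler&#8217;s identity as its (fd, condition) pair is an assumption, not something libyahoo2 documents.</p>

```python
import itertools


class HandlerRegistry:
    """Hypothetical bookkeeping for ext_yahoo_add_handler-style callbacks."""

    def __init__(self):
        self._tags = itertools.count(1)  # 0 is never handed out as a tag
        self._by_tag = {}                # tag -> (fd, condition, callback)
        self._by_key = {}                # (fd, condition) -> tag

    def add(self, fd, condition, callback):
        key = (fd, condition)
        # Assumption: adding the same handler again returns the existing tag.
        if key in self._by_key:
            return self._by_key[key]
        tag = next(self._tags)
        self._by_tag[tag] = (fd, condition, callback)
        self._by_key[key] = tag
        return tag

    def remove(self, tag):
        # Tolerate tags that were never returned by add(), such as 0.
        entry = self._by_tag.pop(tag, None)
        if entry is None:
            return False
        fd, condition, _callback = entry
        del self._by_key[(fd, condition)]
        return True
```

<p>With this shape, a stray remove(0) is a harmless no-op instead of a KeyError raised inside a C callback.</p>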
<p>D&#8217;oh, I got it far enough along to get this:</p>
<blockquote>
<p>libyahoo2.c:620: debug: Key: 4          Value: Yahoo_Messenger<br />
	    libyahoo2.c:620: debug: Key: 5          Value: hodorbot<br />
	    libyahoo2.c:620: debug: Key: 14         Value: This version of Messenger expired on April 2, 2008. Please upgrade now to the latest supported version: http://messenger.yahoo.com Learn more: http://messenger.yahoo.com/eol<br />
	    libyahoo2.c:620: debug: Key: 15         Value: 1225167367<br />
	    libyahoo2.c:620: debug: Key: 97         Value: 1
      </p>
</blockquote>
<p>I&#8217;ve learned what I wanted to, so instead of seeing if a more recent libyahoo2 than what&#8217;s in Ubuntu works (&gt; 0.7.5+dfsg-3), I&#8217;m just going to call it here.&#160; I&#8217;ll post it in case someone can learn something useful from it.</p>
<p>You can pull a working repository with a command like: </p>
<blockquote>
<p>git clone http://notes.xythian.net/media/2008/10/pythonlibyahoo2.git/ mydirectory </p>
</blockquote>
<p>There are three files that do anything: </p>
<ul>
<li>yahoo2.py &#8211; wraps some of the lower level details from the Pyrex layer, including exposing the IO bits as an asyncore dispatcher</li>
<li>_yahoo2.pyx &#8211; is the binding</li>
<li>callbacks.c &#8211; exists to have empty functions defined to satisfy the linker until those have implementations in the Python binding  </li>
</ul>
<p>Not much works, really; there are implementations of connect and async_connect, but only async_connect is called by libyahoo2 before it gets far enough along to fail to log in with the error message above.&#160;There&#8217;s a wrapper for setting the log level (which was key to discovering the above fact&#8230;).</p>
<p>This is a typical definition of one of the library callbacks: </p>
<blockquote class="code"><pre>
cdef public int ext_yahoo_connect_async(int id, char *host, int port, \
  yahoo_connect_callback callback, void *callback_data):
   cdef ConnectionHandle handle
   handle = ConnectionHandle(id, host, port)
   handle.connect_callback = callback
   handle.connect_data = callback_data
   handle.async_connect(callback, callback_data)
   HANDLER_MAP[id].connections.append(handle)
   MANAGER.add(handle)
   return handle.fileno()
</pre>
</blockquote>
<p>ConnectionHandle is an extension class which wraps a Python socket object and can make the callbacks the libyahoo2 IO layer expects.&#160;&#160; MANAGER is the connection manager, which is implemented in the higher level layer (yahoo2.py) so it could be replaced with something that interacted with a GUI event loop. </p>
<p>I&#8217;m probably done with this for the foreseeable future and hope publishing it along with a git repository may allow someone to build on or learn from what I&#8217;ve done.&#160;&#160; I stumbled upon Cython while working on this, but didn&#8217;t want to derail any progress to something working by playing with it.</p>
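<p>To show what a replaceable connection manager might look like, here is a minimal sketch built on the standard selectors module. It is not the code from yahoo2.py: ConnectionManager, EchoHandle, poll and handle_read are hypothetical names, and the only thing assumed of a handle is that it exposes fileno(), as the definition above does.</p>

```python
import selectors
import socket


class ConnectionManager:
    """Hypothetical stand-in for MANAGER: tracks handles and polls them."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def add(self, handle):
        # Any object with a fileno() method can be registered.
        self._selector.register(handle, selectors.EVENT_READ, data=handle)

    def remove(self, handle):
        self._selector.unregister(handle)

    def poll(self, timeout=0.0):
        """Run one pass; a GUI event loop could call this from a timer."""
        ready = self._selector.select(timeout)
        for key, _events in ready:
            key.data.handle_read()
        return len(ready)


class EchoHandle:
    """Toy handle that records whatever arrives on its socket."""

    def __init__(self, sock):
        self.sock = sock
        self.received = []

    def fileno(self):
        return self.sock.fileno()

    def handle_read(self):
        self.received.append(self.sock.recv(4096))


if __name__ == "__main__":
    left, right = socket.socketpair()
    manager = ConnectionManager()
    handle = EchoHandle(right)
    manager.add(handle)
    left.sendall(b"ping")
    manager.poll(timeout=1.0)
    print(handle.received)
    manager.remove(handle)
    left.close()
    right.close()
```

<p>Because poll does a single pass and returns, whatever owns the main loop (asyncore-style polling, a wx timer, anything else) decides when IO happens.</p>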
<p>So what did I learn?</p>
<ul>
<li>Write toy bots against open protocols with existing libraries in your language of choice, such as XMPP, lest your project get hijacked working on a binding rather than the toy.</li>
<li>Pyrex is pretty nice for building wrappers that have more meat than a typical SWIG binding.</li>
<li>Cython is probably worth checking out.</li>
<li>git is worth playing with more</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2008/10/27/pyrex-and-libyahoo2-or-not/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>python-cdb 0.32 (-5.2ubuntu2) with Python 2.5 causes double-free corruption crash on dealloc</title>
		<link>http://www.notes.xythian.net/2007/10/24/python-cdb-032-52ubuntu2-with-python-25-causes-double-free-corruption-crash-on-dealloc/</link>
		<comments>http://www.notes.xythian.net/2007/10/24/python-cdb-032-52ubuntu2-with-python-25-causes-double-free-corruption-crash-on-dealloc/#comments</comments>
		<pubDate>Thu, 25 Oct 2007 01:59:03 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/2007/10/24/python-cdb-032-52ubuntu2-with-python-25-causes-double-free-corruption-crash-on-dealloc/</guid>
		<description><![CDATA[I&#8217;ve recently started moving my linux box to a new x86_64 machine running Ubuntu 7.10. I searched for references to this bug but didn&#8217;t find anything with Google or launchpad, so I wanted to make a note of it so future victims can see what is going on and to remind me to report it. [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently started moving my linux box to a new x86_64 machine running Ubuntu 7.10.   I searched for references to this bug but didn&#8217;t find anything with Google or launchpad, so I wanted to make a note of it so future victims can see what is going on and to remind me to report it.</p>
<p>The symptom is a crash when your cdb object is deallocated usually with a &#8220;double-free&#8221; memory corruption error message.  Assuming a .cdb file named &#8220;foo.cdb&#8221;, the following script will repro the bug:</p>
<blockquote class="code"><p>
#!/usr/bin/python<br />
import cdb<br />
c = cdb.init('foo.cdb')<br />
del c
</p></blockquote>
<p>with the following message:</p>
<blockquote><p>
fox@hercules:~$ python cdbrepro.py<br />
*** glibc detected *** python: free(): invalid pointer: 0x00002b56c25bd750 ***<br />
======= Backtrace: =========<br />
/lib/libc.so.6[0x2b56c300ab0a]<br />
/lib/libc.so.6(cfree+0x8c)[0x2b56c300e6fc]<br />
python(PyDict_DelItem+0xfa)[0x44370a]<br />
python(PyEval_EvalFrameEx+0x2e40)[0x485140]<br />
python(PyEval_EvalCodeEx+0x830)[0x489d60]<br />
python(PyEval_EvalCode+0x32)[0x489da2]<br />
python(PyRun_FileExFlags+0x10e)[0x4ab4fe]<br />
[...]<br />
Aborted (core dumped)
</p></blockquote>
<p>Some other searching suggests that python-cdb&#8217;s use of PyMem_DEL is no longer recommended.   I haven&#8217;t verified that this doesn&#8217;t cause other problems, but replacing cdbmodule.c&#8217;s use of PyMem_DEL with PyObject_Del (and the PyObject_NEW with PyObject_New, to use consistent naming) appears to avoid the crash.</p>
<p>As soon as Ubuntu&#8217;s bug tracker (launchpad) works again for me I&#8217;ll report the bug.  Launchpad is timing out with an error message for me now.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2007/10/24/python-cdb-032-52ubuntu2-with-python-25-causes-double-free-corruption-crash-on-dealloc/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Tools and libraries</title>
		<link>http://www.notes.xythian.net/2007/07/13/tools-and-libraries/</link>
		<comments>http://www.notes.xythian.net/2007/07/13/tools-and-libraries/#comments</comments>
		<pubDate>Sat, 14 Jul 2007 04:57:15 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Experience]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/2007/07/13/tools-and-libraries/</guid>
		<description><![CDATA[A tool and a library I&#8217;ve been using or at least trying out: Launchy Free, open-source Windows app that indexes Program Files and any other directories you tell it to. Then alt-space pops up a command line box and it autocompletes as you type. I installed it a while ago and meant to mention it [...]]]></description>
			<content:encoded><![CDATA[<p>A tool and a library I&#8217;ve been using or at least trying out:</p>
<dl>
<dt><a href="http://www.launchy.net/">Launchy</a></dt>
<dd>Free, open-source Windows app that indexes Program Files and any other directories you tell it to.  Then alt-space pops up a command line box and it autocompletes as you type.   I installed it a while ago and meant to mention it but it kind of faded into the background and I don&#8217;t think about using it anymore.   It&#8217;s just there.  No more do I need to concern myself with the program menu&#8217;s lengthy &#8220;organization&#8221; by vendor nor do I keep needing to add paths to my PATH.  I used to do that so I could use Windows-Run to run things rather than rummaging in Program Files but Launchy is a much better solution.  I&#8217;ve read that it resembles Quicksilver for the Mac.</dd>
<dt><a href="http://www.sqlalchemy.org/">SQLAlchemy</a></dt>
<dd>&#8220;SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL.&#8221; &#8212; mostly it handles some tedium without getting in the way.   ORMs still make me nervous since they tend to have a lot of &#8220;magic&#8221; in them but SQLAlchemy makes it pretty easy to use the convenient parts and override the &#8220;magic&#8221; if/when it&#8217;s necessary for performance.  Which probably won&#8217;t happen too much but knowing it&#8217;s easy makes it easier to rely on SQLAlchemy in the meantime.   I don&#8217;t know if I&#8217;ll keep using it, because like every other time I use a library like this I find myself spending more time trying to figure out how SQLAlchemy expresses something and I already know how to use SQL.</dd>
<dt>
</dt>
</dl>
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2007/07/13/tools-and-libraries/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>wx.lib.iewin and NewWindow3</title>
		<link>http://www.notes.xythian.net/2007/03/22/wxlibiewin-and-newwindow3/</link>
		<comments>http://www.notes.xythian.net/2007/03/22/wxlibiewin-and-newwindow3/#comments</comments>
		<pubDate>Fri, 23 Mar 2007 05:29:22 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Readshot]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/2007/03/22/wxlibiewin-and-newwindow3/</guid>
		<description><![CDATA[Suppose your wxPython app is using an embedded Internet Explorer window to display HTML. You want links clicked in that window to open the user&#8217;s default browser rather than following the link within the embedded Internet Explorer. The obvious solution is to hook OnBeforeNavigate2 like so: self.Bind(iewin.EVT_BeforeNavigate2, self.OnBeforeNavigate2, self.ie) ... def OnBeforeNavigate2(self, evt): wx.LaunchDefaultBrowser(evt.URL) [...]]]></description>
			<content:encoded><![CDATA[<p>Suppose your wxPython app is using an embedded Internet Explorer window to display HTML.  You want links clicked in that window to open the user&#8217;s default browser rather than following the link within the embedded Internet Explorer.</p>
<p>The obvious solution is to hook OnBeforeNavigate2 like so:</p>
<pre class="code python">
        self.Bind(iewin.EVT_BeforeNavigate2, self.OnBeforeNavigate2, self.ie)
...
    def OnBeforeNavigate2(self, evt):
        wx.LaunchDefaultBrowser(evt.URL)
        evt.Cancel = True
</pre>
<p>but if the link the user clicked is set to open a new window, the embedded IE will just open a new window.   So you hook NewWindow2.  Oops, NewWindow2 doesn&#8217;t tell you what the URL is!</p>
<p>Here&#8217;s what I did, since I don&#8217;t care if my reader works on pre-SP2 Windows XP:</p>
<p>Define an Event to permit me to hook NewWindow3, which is a new event that includes the URL being opened, then hook it appropriately:</p>
<pre class="code python">
wxEVT_NewWindow3 = wx.activex.RegisterActiveXEvent('NewWindow3')
EVT_NewWindow3 = wx.PyEventBinder(wxEVT_NewWindow3, 1)
...
        self.Bind(EVT_NewWindow3, self.OnNewWindow3, self.ie)
...
    def OnNewWindow3(self, evt):
        self.logEvt(evt)
        # Veto the new window.  Cancel is defined as an "out" param
        # for this event.  See iewin.py
        evt.Cancel = True
        wx.LaunchDefaultBrowser(evt.bstrUrl)
</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2007/03/22/wxlibiewin-and-newwindow3/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Readshot</title>
		<link>http://www.notes.xythian.net/2007/03/18/readshot/</link>
		<comments>http://www.notes.xythian.net/2007/03/18/readshot/#comments</comments>
		<pubDate>Mon, 19 Mar 2007 04:45:35 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[Readshot]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/2007/03/18/readshot/</guid>
		<description><![CDATA[Ongoing small irritations finally piled high enough to push me to evolve my feed reader further. I finally created a category for the RSS aggregator-related posts: Readshot. A brief history of this project: January 2005 &#8212; A proof-of-concept toy that used an IMAP store and copied posts from a single RSS feed to an IMAP [...]]]></description>
			<content:encoded><![CDATA[<p>Ongoing small irritations finally piled high enough to push me to evolve my feed reader further.   I finally created a category for the RSS aggregator-related posts: <a href="/category/programming/readshot/">Readshot</a>.</p>
<p>A brief history of this project:</p>
<ul>
<li><a href="/2005/01/16/rss-to-imap-the-proof-of-concept/">January 2005</a> &#8212; A proof-of-concept toy that used an IMAP store and copied posts from a single RSS feed to an IMAP server.   I was still using Thunderbird and rawdog to do my feed reading at this point.</li>
<li><a href="/2005/10/25/rss-fetcher/">October 2005</a> &#8212; began using my new aggregator.  It had two components &#8212; a fetcher which ran from cron, used sqlite to keep track of state, and emitted new article messages to a Spread group and an IMAP storer which read from the Spread group and copied articles into an IMAP folder.  I read the IMAP folder using Thunderbird.</li>
<li><a href="/2006/03/06/rss-updated/">March 2006</a> &#8212; I collapsed the components into a single fetcher which ran from cron, used a SQLite data store, and put messages into an IMAP folder.   I also improved the feed scheduling to reduce the number of fetches on relatively inactive feeds.</li>
<li><a href="/2007/01/30/on-the-importance-of-encoding/">January 2007</a> &#8212; moved from the SQLite-based aggregator to a new aggregator based on PostgreSQL.  The new reader implements its own IMAP server using Twisted.  It&#8217;s much faster than the old SQLite solution and keeps all of the state in the database rather than having &#8220;feed state&#8221; in a SQLite database and &#8220;user state&#8221; (flagging, read state) in an IMAP folder. </li>
</ul>
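<p>The posts above do not spell out how the scheduling works, so the following is only a sketch of one common approach: keep a per-feed wait time in SQLite, double it after a fetch that finds nothing new, and halve it after a fetch that does. The table layout, the function names and the 30-minute and one-day bounds are all assumptions made for illustration.</p>

```python
import sqlite3

MIN_WAIT = 30 * 60        # assumed floor: 30 minutes, in seconds
MAX_WAIT = 24 * 60 * 60   # assumed ceiling: one day, in seconds


def open_state(path=":memory:"):
    """Open (or create) the fetcher's state database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS feeds ("
        "url TEXT PRIMARY KEY, wait_seconds INTEGER, next_fetch INTEGER)"
    )
    return db


def next_wait(current, had_new_articles):
    """Halve the wait after an active fetch, double it after a quiet one."""
    if had_new_articles:
        return max(MIN_WAIT, current // 2)
    return min(MAX_WAIT, current * 2)


def record_fetch(db, url, now, had_new_articles):
    """Store the new wait and the time of the next fetch for one feed."""
    row = db.execute(
        "SELECT wait_seconds FROM feeds WHERE url = ?", (url,)
    ).fetchone()
    current = row[0] if row else MIN_WAIT
    wait = next_wait(current, had_new_articles)
    db.execute(
        "INSERT OR REPLACE INTO feeds (url, wait_seconds, next_fetch) "
        "VALUES (?, ?, ?)",
        (url, wait, now + wait),
    )
    return wait


def due_feeds(db, now):
    """Feeds that a cron run at time `now` should fetch."""
    rows = db.execute(
        "SELECT url FROM feeds WHERE ? >= next_fetch ORDER BY url", (now,)
    )
    return [url for (url,) in rows]
```

<p>A cron-driven fetcher would call due_feeds at the start of each run and record_fetch after each feed, so quiet feeds drift toward one fetch a day while busy ones stay near the floor.</p>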
<p>I made a very basic web UI with the SQLite-based aggregator which allowed me to subscribe to new feeds with a bookmarklet.   I migrated that UI to the new aggregator.   It was always inadequate and I supplemented it with manual SQL commands to manage feeds.   I&#8217;d sit down to enhance the interface but keep getting distracted playing with Ajax toolkits because the UI I had in mind didn&#8217;t fit very well into the web browser.</p>
<p>This weekend I began working on a desktop client.  It&#8217;ll be backed by the same database so the IMAP server will continue to work in case I want to read via IMAP for some reason such as to do offline reading.   I have a very crude GUI which has already let me do some management of the feeds which was awkward to do before but the desktop client still exists more as doodles in a notebook than code.</p>
<p>The first version connects directly to the PostgreSQL database.   Eventually I plan to split the database access out into its own tier.  I don&#8217;t yet know enough of what I want that protocol to look like to do that now and want to get a better feel for what the client will need before taking that step.   The plan to split the storage layer into its own tier guides the design of that layer.</p>
<p>There&#8217;s always been a bit of a mismatch between my needs in a mail client and my needs for a feed reader but until recently I put up with all the little cuts that resulted from using a mail client for feed reading because the benefits of a desktop experience and shared server state outweighed the pain of using a mail client (and IMAP) for feed reading.  </p>
<p>I&#8217;m using wxPython to develop the client so at least in theory it should work on Windows, Linux, and Mac OS.  I only plan to test and develop using Windows for the foreseeable future, though.</p>
<p>No project of mine advances as quickly as those that scratch an itch or alleviate a pain for me.</p>
<p>I still don&#8217;t know if I&#8217;ll ever release any of this.   It would probably be academic since I imagine the demand for this kind of thing is very low given how excited people get about &#8220;no install&#8221; web readers.  I think the thought process that guides design decisions in the reader is more interesting to other folks than the actual code.  I suppose the RSS fetcher part of the system might be interesting to other folks as its own component.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2007/03/18/readshot/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>On the importance of encoding</title>
		<link>http://www.notes.xythian.net/2007/01/30/on-the-importance-of-encoding/</link>
		<comments>http://www.notes.xythian.net/2007/01/30/on-the-importance-of-encoding/#comments</comments>
		<pubDate>Tue, 30 Jan 2007 10:29:46 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Readshot]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/2007/01/30/on-the-importance-of-encoding/</guid>
		<description><![CDATA[I&#8217;m converting my RSS aggregator to use PostgreSQL as its back-end instead of SQLite.&#160; It has served me well for quite&#160;a while but it&#8217;s time to move on from SQLite so I can add some features such as a feed management UI and an IMAP server. I&#8217;m working on an IMAP server for my RSS [...]]]></description>
			<content:encoded><![CDATA[
<p>I&#8217;m converting my <a href="http://www.notes.xythian.net/2006/03/06/rss-updated/">RSS aggregator</a> to use PostgreSQL as its back-end instead of SQLite.&nbsp; It has served me well for quite&nbsp;a while but it&#8217;s time to move on from SQLite so I can add some features such as a feed management UI and an IMAP server.  </p>
<p>I&#8217;m working on an IMAP server for my RSS aggregator.&nbsp; I want to capture the read status and message flagging from my reading habits in my database.&nbsp; I&#8217;ve been using message flagging to indicate articles I want to review later or remember for some reason. The new aggregator will have its own IMAP server to serve articles directly from the database rather than copying articles into another IMAP server.&nbsp; This will also permit me to make a more dynamic folder view of the data rather than freezing the state of the IMAP view into a bunch of Maildirs.&nbsp; This part has been&nbsp; going well.&nbsp; I&#8217;ll have more to say about this, too, but the short version is &#8220;Twisted is still pretty cool and is handling <strike>all</strike> most&nbsp;of&nbsp;the grotty IMAP4rev1 protocol bits and letting me just implement a few interfaces to present a view of my data as IMAP folders and messages.&#8221;  </p>
<p>That isn&#8217;t why I&#8217;m posting now, though.&nbsp; I&#8217;m going to tell you a little story starring one of my favorite kinds of bugs with my favorite mysterious symptoms.  </p>
<p>My project was proceeding along nicely.&nbsp; Then I did another import and now mutt is segfaulting when it tries to load the folder view.&nbsp; I chase down a rabbit hole convinced that the reason is I&#8217;m sending more articles than mutt expects because my folder size counting code doesn&#8217;t agree with my FETCH processing code. Deceptively, I discover a case where these can be mismatched. But that isn&#8217;t what is happening.  </p>
<p>Eventually I set this issue aside and resume using Thunderbird to test.&nbsp; Deceptively, I&#8217;ve rebuilt the database in the meantime and am no longer using the same messages. The problem no longer appears and I thus blame mutt&#8217;s shoddy IMAP implementation.  </p>
<p>Everything is great again and&nbsp;my IMAP server is really shaping up. I rebuild the database to use a larger percentage of my real dataset and work through a number of issues in my article processing code.&nbsp;  </p>
<p>Now Thunderbird starts hanging.&nbsp; I recheck mutt. It&#8217;s segfaulting again. Curses! Both the Windows and Linux versions of Thunderbird appear to connect and work for a while and then they start hanging, spinning, and generally losing badly.&nbsp; I need to figure this out to proceed.  </p>
<p>After some printf debugging suggests everything is fine, it&#8217;s time to see what Thunderbird is actually seeing. I bring out <strike>Ethereal</strike> Wireshark.&nbsp; Wireshark rapidly shows that some of my IMAP server&#8217;s command responses look like this:</p>
<blockquote>
<p>2a 00 00 00 20 00 00 00 31 00 00 00 36 00 00 00<br />20 00 00 00 46 00 00 00 45 00 00 00 54 00 00 00<br />&#8230;</p>
</blockquote>
<p>Most of the responses look fine but some of them have three nulls between each real character.&nbsp; Uh oh.&nbsp; I peek with ptrace and confirm the process really is send()ing three nulls between each character.&nbsp; Why are my characters four-byt&#8230; oh.&nbsp;Before jumping to the conclusion, I refine my printf debugging to also print the type of the string being added to the outgoing buffer.</p>
<blockquote><p>WRITE &lt;type &#8216;str&#8217;&gt; : * 18 FETCH (<br />WRITE &lt;type &#8216;str&#8217;&gt; : UID 48319<br />WRITE &lt;type &#8216;str&#8217;&gt; : <br />WRITE &lt;type &#8216;str&#8217;&gt; : RFC822.SIZE 865<br />WRITE &lt;type &#8216;str&#8217;&gt; : <br />WRITE &lt;type &#8216;str&#8217;&gt; : FLAGS (\Unseen)<br />WRITE &lt;type &#8216;str&#8217;&gt; : <br />WRITE &lt;type &#8216;unicode&#8217;&gt; : BODY[HEADER.FIELDS (From To Cc Subject Date Message-Id Priority X-Priority References Newsgroups In-Reply-To Content-Type)] {333}</p>
</blockquote>
<p>&#8216;unicode&#8217;!&nbsp; It is as I suspected.&nbsp;&nbsp; It is idiomatic in Python when buffering writes to assemble a list of strings and then call .write(&#8220;&#8221;.join(list)).&nbsp; Twisted does this.</p>
<p>Python doesn&#8217;t have a &#8220;raw byte buffer&#8221; type. It has unicode strings (type &#8216;unicode&#8217;) and &#8220;raw&#8221; strings (type &#8216;str&#8217;). I&#8217;m using &#8220;unicode&#8221; strings for virtually all of my data (and in the database) and raw strings to represent byte buffers.&nbsp; I thought I had caught everywhere I was outputting and called an appropriate .encode() on the string.&nbsp;&nbsp; Unfortunately, concatenating a &#8216;str&#8217; and a &#8216;unicode&#8217; results in a &#8216;unicode&#8217; instead of a TypeError.&nbsp; Python got this wrong for &#8216;int&#8217; and &#8216;float&#8217; so it is no wonder that it is wrong for &#8216;str&#8217; and &#8216;unicode&#8217;. A single unicode leak will result in the entire write being unicode.&nbsp; Python, it turns out, will also cheerfully write out the raw bytes of a unicode string. </p>
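<p>As a rough check on that diagnosis, the bytes from the capture above decode cleanly as little-endian UTF-32, which is what a four-byte-per-character unicode string would look like in memory on a little-endian machine. That is an inference from the dump rather than something confirmed here. The sketch below is written for Python 3 (not the Python this post was written against), and the variable names are mine.</p>

```python
# Bytes copied from the Wireshark capture quoted earlier in this post.
captured = bytes.fromhex(
    "2a000000 20000000 31000000 36000000"
    "20000000 46000000 45000000 54000000"
)

# Three nulls after every character is what little-endian UTF-32 looks like.
decoded = captured.decode("utf-32-le")
print(repr(decoded))  # '* 16 FET'

# What the client expected to receive: one byte per ASCII character.
expected = decoded.encode("ascii")
print(len(captured), len(expected))  # 32 8
```

<p>The decoded text is the start of an ordinary untagged FETCH response, so the content was right and only the encoding step was missing.</p>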
<p>Java has different types for &#8220;byte&#8221; and &#8220;char&#8221;. You just can&#8217;t pass a char directly to anything that&#8217;s going to do i/o without casting or encoding. Java characters are all unicode, of course. Most of the time this is just another hoop to jump through when dealing with i/o in Java. Right now I really appreciate it. I wish Python i/o primitives threw an exception if you tried to write an unencoded &#8216;unicode&#8217;.&nbsp; Even encoding as &#8216;utf-8&#8217; would be better than writing out the raw bytes.</p>
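<p>One way to get that behaviour without changing the language is to put a small guard in front of the outgoing buffer. This is only a sketch, in Python 3 syntax, and the ByteOnlyBuffer class is a hypothetical helper of mine, not part of Twisted or the aggregator. It keeps the usual join-a-list-of-chunks idiom but fails at the point where unencoded text leaks in, instead of later on the wire.</p>

```python
class ByteOnlyBuffer:
    """Hypothetical write buffer that refuses anything that is not bytes."""

    def __init__(self):
        self._chunks = []

    def write(self, chunk):
        # Raise where the leak happens rather than sending garbage later.
        if not isinstance(chunk, (bytes, bytearray)):
            raise TypeError(
                "expected encoded bytes, got %s" % type(chunk).__name__
            )
        self._chunks.append(bytes(chunk))

    def getvalue(self):
        # The familiar idiom: collect chunks in a list, join once at the end.
        return b"".join(self._chunks)


buf = ByteOnlyBuffer()
buf.write(b"* 18 FETCH (")
buf.write("UID 48319".encode("ascii"))  # text is encoded explicitly

try:
    buf.write("BODY[HEADER.FIELDS (From To)] {333}")  # unencoded text
    leaked = True
except TypeError:
    leaked = False

print(leaked)          # False
print(buf.getvalue())  # b'* 18 FETCH (UID 48319'
```

<p>With a guard like this, the single unicode header line in the log above would have raised immediately instead of turning the whole write into four-byte characters.</p>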
<p>It&#8217;s not very robust of mutt and Thunderbird to hang, spin, or crash when they encounter unexpected nulls in the result from a network server.&nbsp;I pity the poor end user using one of these clients to connect to a shoddy server.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2007/01/30/on-the-importance-of-encoding/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Log analysis with Python, SQLite, and matplotlib</title>
		<link>http://www.notes.xythian.net/2006/07/04/log-analysis-with-python-sqlite-and-matplotlib/</link>
		<comments>http://www.notes.xythian.net/2006/07/04/log-analysis-with-python-sqlite-and-matplotlib/#comments</comments>
		<pubDate>Tue, 04 Jul 2006 22:30:22 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.notes.xythian.net/2006/07/04/log-analysis-with-python-sqlite-and-matplotlib/</guid>
		<description><![CDATA[I recently spent a day or so writing scripts to grovel over a service&#8217;s log files looking for information. I had a couple ideas about what I wanted to learn, but I wanted to make it easy to experiment a little. Not so easy I&#8217;d end up with results that weren&#8217;t easily reproducible, though. In [...]]]></description>
			<content:encoded><![CDATA[<p>I recently spent a day or so writing scripts to grovel over a service&#8217;s log files looking for information.  I had a couple ideas about what I wanted to learn, but I wanted to make it easy to experiment a little.  Not so easy I&#8217;d end up with results that weren&#8217;t easily reproducible, though.  In the past I&#8217;ve used a mix of Python to turn log files into data tables and then Excel to explore and chart (good old PivotTables).    The nice thing about Excel is that fiddling with the chart or table format, filtering, or grouping criteria is easy.  Unfortunately, it&#8217;s also hard to see how Excel is configured and to reproduce the resulting graphics or tables later.   It&#8217;s certainly possible to use Excel from a scripting language (or using Excel&#8217;s built-in VBA) but at that point one may as well skip Excel entirely since the appeal of Excel for this is the UI to change things and see results right away.</p>
<p>For this round, I wrote everything in Python.</p>
<p><a href="http://matplotlib.sourceforge.net/">matplotlib</a> is a library for Python that tries to present a MatLab-like interface for a bunch of graphics and data manipulation tools.  I don&#8217;t know how successful it is at emulating MatLab since I never used MatLab enough to become very familiar with it.   matplotlib is nice, though &#8212; probably the easiest library of this type that I&#8217;ve used.</p>
<p><a href="http://www.sqlite.org/">SQLite</a> I have mentioned here before &#8212; it&#8217;s an embeddable SQL database engine.   In this case, I&#8217;m using it primarily to avoid inventing another binary file format, with the added bonus of being able to query the data stored in my file.  I needed to parse the many log files and store them in an intermediate form that was easy to read and query so I didn&#8217;t have to wait for a script to grovel through the log files every time I wanted to revise how I analyzed the data.   For the <a href="http://www.notes.xythian.net/2006/03/01/sqlite-3-pysqlite2-and-incorrected-typed-data-segfault/">reasons I discussed earlier</a>, I&#8217;m still using the <a href="http://www.rogerbinns.com/apsw.html">APSW</a> Python binding for SQLite.</p>
<p>A script parsed the raw log files and output a SQLite database representing essentially all of the information in the log files in tables.    Then another script processed that data into some intermediate, derived tables.   Finally, the last script used matplotlib to create the graphics from both the raw data and the intermediate tables.   There was much experimentation in the last two steps as I generated graphics that led me to wonder about something which led me to compute more intermediate data to look at and/or graph.   I used matplotlib a little interactively to browse around on the graphics but I avoided using </p>
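<p>To make the shape of that pipeline concrete, here is a minimal sketch of the first two steps. It is not the work code: the log format, table names, and column names are invented for illustration, and it uses the sqlite3 module from the standard library rather than APSW so that it runs without extra installs.</p>

```python
import sqlite3

# Hypothetical log lines: timestamp, request name, duration in ms.
LOG_LINES = [
    "2006-07-01T10:00:00 login 120",
    "2006-07-01T10:00:05 search 340",
    "2006-07-01T10:01:10 login 95",
    "2006-07-01T10:02:30 search 410",
]

# In-memory here; a file path would let later scripts reuse the result.
db = sqlite3.connect(":memory:")

# Step 1: store the raw log records more or less verbatim.
db.execute("CREATE TABLE requests (ts TEXT, name TEXT, duration_ms INTEGER)")
rows = []
for line in LOG_LINES:
    ts, name, duration = line.split()
    rows.append((ts, name, int(duration)))
db.executemany("INSERT INTO requests VALUES (?, ?, ?)", rows)

# Step 2: compute an intermediate, derived table from the raw one.
db.execute(
    "CREATE TABLE request_stats AS "
    "SELECT name, COUNT(*) AS n, AVG(duration_ms) AS avg_ms "
    "FROM requests GROUP BY name"
)

# Step 3 would hand rows like these to matplotlib; here they are printed.
stats = db.execute(
    "SELECT name, n, avg_ms FROM request_stats ORDER BY name"
).fetchall()
print(stats)  # [('login', 2, 107.5), ('search', 2, 375.0)]
```

<p>Keeping the raw table separate from the derived one is what makes the experimenting cheap: only the second and third steps need to be rerun when the question changes.</p>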
<p>All these words about parsing and graphics invite posting of some pictures and scripts to do it.   I can&#8217;t post the work related code so I decided to perform a similar exercise on some other information for this post.</p>
<p>There is a great graphic shown in one of Edward Tufte&#8217;s books of a train schedule.   It represented stations vertically and time horizontally.  Each train was represented as a line intersecting the horizontal of a station for the time the train would be in the station.   The slope of the lines indicated how fast each train was.  I will write a script to process Caltrain&#8217;s HTML schedule into such a graphic.</p>
<p><img src="http://xythian.net/2006/07/03/weekday.jpg" height=255 width=300/><br />
<a href="http://xythian.net/2006/07/03/weekday.png">Weekday</a> (800&#215;600 PNG, 260K), <a href="http://xythian.net/2006/07/03/weekend.png">Weekend/Holiday</a> (800&#215;600 PNG, 125K)</p>
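<p>The layout described above comes down to turning each train into a list of (time, position) points. The sketch below shows that step only, with made-up stations, distances, and times. It is not taken from schedule.py, and the function names are hypothetical.</p>

```python
# Made-up stations and their distance along the line, in miles.
STATION_MILES = {"North": 0.0, "Middle": 10.0, "South": 30.0}


def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def train_line(stops):
    """Return (x, y) points: minutes after midnight vs. miles along the line."""
    return [(to_minutes(t), STATION_MILES[name]) for name, t in stops]


def speeds_mph(points):
    """Slope of each segment, converted from miles/minute to miles/hour."""
    return [
        (y1 - y0) / (x1 - x0) * 60
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


local = train_line([("North", "08:00"), ("Middle", "08:20"), ("South", "09:00")])
express = train_line([("North", "08:10"), ("South", "08:50")])

print(local)                # [(480, 0.0), (500, 10.0), (540, 30.0)]
print(speeds_mph(local))    # [30.0, 30.0]
print(speeds_mph(express))  # [45.0]
```

<p>Each list of points can then be drawn as one line, with time on the horizontal axis and position on the vertical one. The steeper line is the faster train, which is the property that makes this kind of chart readable at a glance.</p>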
<p>Starting from the <a href="http://www.caltrain.com/timetable.html">Caltrain timetable</a> and <a href="http://www.caltrain.com/caltrain_stations.html">Caltrain station info</a> in HTML, I used <a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a> (a Python HTML scraping library) to extract the timetable.   I&#8217;d never used Beautiful Soup before this but the thought of constructing the parser (or regexps) to extract the schedule from the HTML made me sad enough to learn how to use a new library.   Beautiful Soup is nice!   It was very easy to figure out how to extract the information I wanted from the HTML.</p>
<p>Incidentally, as of this writing that timetable has, in HTML comments, snippets of ELisp code.  I couldn&#8217;t figure out what the ELisp was supposed to do that was helpful, though.</p>
<p>The script isn&#8217;t very pretty, but here it is anyway: <a href="http://xythian.net/2006/07/03/schedule.py.script">schedule.py</a> (4K).   It requires Beautiful Soup and matplotlib to run.   The graphic omits stations south of San Jose because so few trains run to them and including them squished the stations most of the trains did run to into less space.   I didn&#8217;t really put together how closely spaced the Caltrain stations on the peninsula are until I saw this graphic.</p>
<p>The script would need more work before you could use the schedule for anything: because the line mixes express and local trains, I had to come up with some way to indicate whether a given train actually stopped at a station it was passing.   I used a black circle, but I can see places where overlapping trains and dots make it somewhat ambiguous which train is doing the stopping.</p>
<p>The script assumes you have timetable.html and caltrain_stations.html in the current directory &#8212; these were both just pulled straight from the Caltrain web site using the links above.</p>
<p>Now that I have all this parsing code I&#8217;ll probably finally get around to making something to print custom schedules which include only the stations I care about.  I&#8217;ll just use a simple textual table, though.</p>
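<p>For what it is worth, the textual table could be as small as the sketch below. The timetable dictionary is invented sample data rather than parsed Caltrain output, and custom_schedule is a hypothetical name.</p>

```python
# Hypothetical timetable: train number mapped to its stops and times.
TIMETABLE = {
    "101": {"North": "08:00", "Middle": "08:20", "South": "09:00"},
    "203": {"North": "08:10", "South": "08:50"},  # express, skips Middle
}


def custom_schedule(timetable, stations):
    """Format a plain-text table limited to the given stations."""
    header = "Train".ljust(6) + "".join(s.ljust(8) for s in stations)
    lines = [header.rstrip()]
    for train in sorted(timetable):
        stops = timetable[train]
        # A dash marks a station the train passes without stopping.
        cells = [stops.get(s, "-").ljust(8) for s in stations]
        lines.append((train.ljust(6) + "".join(cells)).rstrip())
    return "\n".join(lines)


table = custom_schedule(TIMETABLE, ["Middle", "South"])
print(table)
```

<p>Passing a different list of station names is all it takes to get a different custom schedule out of the same parsed data.</p>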
]]></content:encoded>
			<wfw:commentRss>http://www.notes.xythian.net/2006/07/04/log-analysis-with-python-sqlite-and-matplotlib/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
