Part 1 of the Sage experience was just installing the software. This was incredibly easy on both OS X and Linux (CentOS 5.2 and Fedora 9). For the Fedora 9 install, I just downloaded the latest version of Sage, which was compiled for Fedora 8, and it seemed to work fine.
What I really wanted was to work through a few examples that are close to “real world applications” for me.
Some things that I would like to be able to do in Sage:
1. Load in a 2-D NetCDF satellite data file and display it as a map projection. This should be really simple. I would usually just use GMT for this (a small shell script wrapping psbasemap, grdimage, and pscoast).
2. Load in a data series with dates and locations, and match this to corresponding satellite data in time and space. Normally I would use a Perl script that I wrote many moons ago to do this. I would basically sort the data, then match a block of data at a time using GMT’s grdtrack function. I know that this is inefficient. What I would really like is to pull extra data in x, y, or t and take the mean or median value, which would be more CPU intensive but better than matching just one point in space and time to the nearest pixel.
3. Load in a multivariate data series and do multivariate statistics (e.g. LME, GLM/GAM, RDA). This is where the R interface would come into play. Normally I would prepare the data elsewhere, then import the flat table into R and use the R functions. This may involve installing more packages (nlme, mgcv, etc).
4. Load in a 3-D set (x,y,t) of satellite data files and perform an EOF analysis on them (akin to SVD in Matlab). Normally I would do this in Matlab or Ferret. I’m just curious how easy it would be to do this here.
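For item 4, here is a minimal sketch of what an EOF analysis via SVD could look like with plain NumPy. It is an illustration under assumptions, not a tested Sage workflow: the function name `eof_analysis` and the synthetic `sst_stack` array are made up for the example, and the data are assumed to have no missing values (land or cloud pixels in real satellite grids would need to be masked out first).

```python
import numpy as np

def eof_analysis(data, n_modes=3):
    """EOF analysis of a (time, y, x) array via SVD.

    Hypothetical helper for illustration. Assumes no NaNs in `data`.
    Returns the spatial patterns (EOFs), their time series (PCs),
    and the fraction of variance explained by each returned mode.
    """
    nt, ny, nx = data.shape
    # Flatten each map into one row: shape (time, space)
    flat = data.reshape(nt, ny * nx)
    # Remove the time mean at every pixel so the SVD sees anomalies
    anomalies = flat - flat.mean(axis=0)
    # anomalies = U @ diag(s) @ Vt; rows of Vt are the spatial patterns
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes].reshape(n_modes, ny, nx)
    pcs = u[:, :n_modes] * s[:n_modes]
    variance_fraction = (s ** 2 / np.sum(s ** 2))[:n_modes]
    return eofs, pcs, variance_fraction

# Synthetic stand-in for a stack of monthly SST maps (made-up data):
# a seasonal cycle multiplying a north-south gradient, plus small noise.
rng = np.random.default_rng(0)
nt, ny, nx = 24, 5, 8
t = np.arange(nt)
pattern = np.outer(np.linspace(-1.0, 1.0, ny), np.ones(nx))
seasonal = np.sin(2 * np.pi * t / 12)[:, None, None]
sst_stack = 20.0 + seasonal * pattern + 0.01 * rng.standard_normal((nt, ny, nx))

eofs, pcs, var_frac = eof_analysis(sst_stack, n_modes=2)
```

With data built this way, the first mode should recover the north-south gradient (up to an arbitrary sign) and carry almost all of the variance. Real SST stacks would also usually be weighted by latitude before the decomposition, which this sketch leaves out.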
There are other things that I could do, but these are a few off the top of my head, and they are things that I am doing now, so they give me an incentive to try Sage out. For tonight, I’ll just work on #1, which should be really fast.
The data file I’m using is just a NetCDF file (created by GMT), which I can read with pupynere in Python. Here I’m going to use the scipy.io.netcdf module (which is actually based on pupynere, I believe).
sage: from scipy.io.netcdf import *
sage: from pylab import *
# Read in file metadata to object
sage: ncfile = netcdf_file('RS2006001_2006031_sst.grd', 'r')
# get the variables in the data file
sage: ncfile.variables
{'x': <scipy.io.netcdf.netcdf_variable object at 0xb47b08c>,
'y': <scipy.io.netcdf.netcdf_variable object at 0xb47b16c>,
'z': <scipy.io.netcdf.netcdf_variable object at 0xb47b1ec>}
# Yank out data
sage: longitude = ncfile.variables['x'][:]
sage: latitude = ncfile.variables['y'][:]
sage: sst = ncfile.variables['z'][:]
# just plot sst to test 2D image plotting
sage: imshow(sst)
[<matplotlib.AxesImage instance at 0xc03636c>]
Nice, but it’s upside down. Let’s flip it vertically.
sage: clf()
sage: imshow(flipud(sst))
[<matplotlib.AxesImage instance at 0xb86a2ac>]
sage: savefig('temp.png')
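A likely reason for the upside-down image, assuming the latitude variable is stored in ascending order (south to north): row 0 of the array is then the southernmost row, while an image display puts row 0 at the top by default. The sketch below shows that check with plain NumPy. The helper name `orient_north_up` and the tiny arrays are made up for illustration.

```python
import numpy as np

def orient_north_up(grid, lat):
    """Return (grid, lat) ordered so that row 0 is the northernmost row.

    Hypothetical helper: assumes `lat` is monotonic and has one entry
    per row of `grid`.
    """
    grid = np.asarray(grid)
    lat = np.asarray(lat)
    if lat[0] < lat[-1]:
        # Ascending latitude: row 0 is the southernmost, so reverse the rows
        return grid[::-1], lat[::-1]
    return grid, lat

# Tiny made-up grid: 3 latitudes by 2 longitudes
lat = np.array([-10.0, 0.0, 10.0])
lon = np.array([100.0, 110.0])
grid = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])

flipped, lat_desc = orient_north_up(grid, lat)

# (left, right, bottom, top) in data coordinates, usable for axis labels
extent = (lon.min(), lon.max(), lat.min(), lat.max())
```

With pylab, `imshow(flipped, extent=extent)` would then label the axes in degrees rather than pixel indices, and `imshow(grid, origin='lower', extent=extent)` should give the same picture without flipping the array at all. Note that `extent` marks the outer edges of the image, so using the pixel-center coordinates as above is off by half a pixel on each side.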
Easy, but I want to put this on a projection. Normally I would use the basemap tools, which are an add-on to matplotlib. I don’t see these installed, and I didn’t see them in the extra Sage packages online, so I downloaded them from SourceForge and installed them.
The first step is to install the GEOS package: just read the README in the geos folder and run
./configure
make
and then we get our first epic fail. Something in the geos chain won’t compile, and I’m just about fried enough to call it quits for this evening.
At this point I’ve been playing with this for more than 2 hours, and I still have yet to make a simple map on a projection. There has to be something I’m missing, but at this point I’m going to pause until tomorrow. So not the best testing evening, but there are some positives so far. The bundling of most packages is a plus, and the ease of loading in NetCDF files is nice. Data displays well using the Pylab interface, even though I am still forced to save to a file at this point.
So immediate goals:
1. Get a backend working for viewing plots in widgets (akin to ipython -pylab)
2. Get the basemap tools installed so that I can make a map with a projection!

July 22, 2008 at 9:49 am
Thanks for your posts, I’ll be following your experience with Sage closely. I made the switch from Matlab to Python about a year ago and I’m glad I did. I ran across Sage a while ago but have found matplotlib, R, and IPython to be sufficient for everything I need to do. I’m curious to see what advantages Sage has.
Re: your point #3 . . .
I’ve been using RPy to run redundancy analysis in R from Python. I actually have a draft of an RPy tutorial that I’m working on; I’ll add an RDA example to that.
Re: your point #4 . . .
There’s some good stuff in NumPy’s linear algebra module . . . numpy.linalg.svd() should do the trick for EOF analysis.
-ryan
July 22, 2008 at 10:09 am
Thanks Ryan,
While Sage does appear to be an incredible package, I wonder if it’s overkill for a user such as myself who already had most of the separate components installed. Most of the quantitative work that I do involves multivariate statistics, which I would do in R. RPy looks promising (I look forward to your tutorial), as I really don’t like programming/data mining in R just yet. The ability to choose the tools and do everything “in one house” is appealing, and that’s what I was hoping that Sage could offer. The only downside that I see right now is the documentation for Sage, as it’s geared more toward straight mathematics, which I didn’t use Matlab for either. That’s not a knock against Sage at all; I just wish that there were more tutorials and documentation aimed at life science/ecology examples.
Thanks again for the comment. I hope to spend more time with this during the week as time permits.