OK, another night, another trial. I must say, tonight was a lot more fun than the last couple of nights, because I really felt that I learned something, which is really the whole point of this exercise. So the example I was trying to code tonight is a simple EOF of a 3D data series. This is something that I just had to code up at work today, so it was a perfect chance for me to try out Sage. For work I ended up altering an existing m-file and running the EOFs in Matlab, but that’s OK, because now I know what I expect to see after running this in Sage.

The data names have been changed to protect the innocent.

`# Load in required modules`

sage: from scipy.io.netcdf import *

sage: from pylab import *

sage: from scipy.stats.stats import nanmean

sage: import datetime

```
```#Load data from NetCDF file

sage: ncfile = netcdf_file('file.nc','r')

sage: varnames = ncfile.variables.keys()

sage: varnames

['LONGITUDE', 'TIME', 'LATITUDE', 'DATA']

#Now that I have the order I can load into arrays

sage: lon = ncfile.variables[varnames[0]][:]

sage: lat = ncfile.variables[varnames[2]][:]

sage: dates = ncfile.variables[varnames[1]][:]

sage: raw = ncfile.variables[varnames[3]][:,0:50,:] #I only want 50 records in Y

sage: data = raw.copy() #make a copy

sage: data.shape

(124, 50, 151)

sage: (ncycles, ny, nx) = data.shape

#deal with dates

sage: ncfile.variables[varnames[1]].attributes

{'axis': 'TIME',

'time_origin': '15-JAN-1901 00:00:00',

'units': 'HOURS since 1901-01-15 00:00:00'}

sage: off = datetime.datetime(1901,1,15,0,0,0)

sage: months = ones(ncycles)

sage: for i in range(0,ncycles):

....tdel = datetime.timedelta(days=dates[i]/24)

....td = off + tdel

....months[i] = td.month

`sage: ind = where(raw<0)`

sage: data[ind] = nan

And here was the first real bottleneck, as things just slowed to a crawl as python tried to find all the instances where the data was less than zero. This is something that is instantaneous in Matlab, and took over 30 seconds to go through 124*50*151 values. There must be a faster way to do this.

`data2=data.copy()`

#Take out monthly averages

sage: mclim = ones((50,151))

sage: for i in range(1,13):

....index = where(months==i)[0]

....mclim = nanmean(data[index,:,:])

....data2[index,:,:] = data[index,:,:] - mclim

```
```data2.shape = (ncycles, nx*ny)

ltmean = nanmean(data2) #get mean of each time series

#take out long term mean

sage: anom = data2.copy()

sage: for i in range(0,ncycles):

....anom[i,:] = data2[i,:] - ltmean

`sage: EOF = nan_to_num(anom) #push land back to zero`

**sage: [u,s,v] = linalg.svd(EOF)**

sage: for i in range(0,ncycles):#build array so that we can project eigenvalues back onto timeseries

....s2[i,i] = s[i]

sage: amp = dot(s2.transpose(),u.transpose()) #get amplitude

sage: spatial = v[0:4,:]# pull out spatial fields

sage: ratios = pow(s,2)/sum(pow(s,2))*100 #get %variance explained for each mode

sage: temp = spatial[0,:]

sage: temp.shape = (ny,nx) #push back to original dims

sage: plot(amp)

sage: savefig('amplitude.png')

sage: imshow(flipud(temp))

sage: savefig('spatial.png')

Success!

I actually really felt positive about this whole example as I really learned a lot more. This also was probably too large of an array to test out (measure twice cut once!) but it’s what I was working with so I wanted a real world example. The more that I worked in sage the more comfortable I felt as well. The geographic projection issue is still there, as well as some indexing speed issues, but overall, I was really impressed with the Sage/SciPy/NumPy experience today. Overall I feel that more of a transition was made for me last night/today. Which was great timing as a co-worker actually called me and asked if I knew of any free replacements for Matlab…

Technorati Tags:

Computers, linux, matlab, programming, python, sage, scientific programming