I'm sorry Dave, I can't do that.
Preliminaries
group assignments
Is everyone assigned to a group?
Is everyone registered for the class?
Introductions
Who are you?
A Short Review
Computers work using digital circuits.
Electronic Switches
to
Digital Logic
to
CPU
----------------------
Computers can only do about five things:
simple math
byte operations
simple decisions
store and retrieve data
Internal Data Representations
Bytes
single group of 8 bits
Integers
groups of 2 or more bytes
1 sign bit plus the magnitude bits
Which is one?
00000000 00000001
or
00000001 00000000
Answer:
It depends.
The byte order for integers is machine dependent.
Little Endian
vs
Big Endian
X = 2.343 x 10^12
start with byte representation
32 bits = 4 bytes
00000000 00000000 00000000 00000000
X = +/- (0.b1 b2 b3... ) x 2^m
1 bit = sign of mantissa
1 bit = sign of exponent
z bits = the integer exponent
30 - z bits = the mantissa
Bit order is usually:
sign
sign
exponent
mantissa
Example:
52.234375
52 = 110100
0.234375 = 0.001111
therefore
52.234375 = 110100.001111
or
0.110100001111 x 2^110 (exponent written in binary: 110 = 6)
In this example, the number has an exact binary representation.
Roundoff Error
What happens when your number converts into a repeating binary number?
0.101010101010...
The number can only be given to the precision of the computer.
The computer truncates the number.
Especially important when you
subtract numbers which are close to the same value.
234.34565432245332 -
234.34565432245341
Significant Digits
given a floating point number with two sign bits and a mantissa of size z
what is the smallest positive number that can be represented in the system?
0.000000001 x 2^0
2^(-z)
Overflow and Underflow Errors
A floating point number can only represent a range of numerical values.
If z=the number of bits in the mantissa
the largest number it can represent is:
2^(31 - z-1)
Informal Programming Assignment
What are the smallest positive numbers which can be represented with single precision floating point numbers on the SGI machines?
What are the largest positive numbers which can be represented with single precision floating point numbers on the SGI machines?
Basic Data Structures
We store data with variables.
A = 3
B= 324.34653
C = 'f'
D = "uberdog"
We can also create arrays of data
a[0] = 232
a[1] = 343
...
or
b[0][0] = 345.33
b[0][1] = 2334.343
...
Arrays
An Array is one of the most basic types of data structures.
In memory:
a[0] is next to a[1]
Pointers are addresses of data.
Variables refer to the data themselves.
Imagine a two dimensional array b.
Which elements are adjacent?
A(I,J) is next to A(I+1,J) ?
Or
A(I,J) is next to A(I,J+1) ?
Row Major
in C, the elements of a row are stored next to each other
tmp[row][column]
tmp[2][12] is
two rows of 12 columns
tmp[i][j] is next to tmp[i][j+1]
------------------------
Column Major
in Fortran, the elements of a column are stored next to each other
tmp(i,j)
tmp(i,j) is next to tmp(i+1,j)
---------
it is generally faster and more efficient to access data elements which are adjacent in memory
Pointers and Variables
variables
name associated with a value and a data type
pointers
a name associated with a memory location and a data type
int *a;
int b;
int c;
c = 1242;
a = &b;      /* a now holds the address of b */
*a = 2353;   /* stores 2353 in b through the pointer */
Pointers in Fortran
integer a(100), b(100), c
integer i
equivalence (a,b)
c = 1242
do i=1,100
a(i) = i
enddo
ALL VARIABLES PASSED BETWEEN SUBROUTINES IN FORTRAN ARE POINTERS!
Why Use Pointers
Allocation and deallocation of memory
Rapid transfer of data
Pointer Arithmetic
Dynamic Data Structures
most modern computers use disk drives to expand program memory
RAM and disk are exchanged during the execution of a program
Page Faults are slow.
Chunks of memory are loaded from disk.
You want to minimize page faults by locating commonly used variables close together in RAM.
data types which are created from several basic or derived data types
allows you to create new data types by encapsulating data
can be made into arrays
can be very useful for science work
Derived Data Types
position
x, y, z
velocity
vx, vy, vz
acceleration
ax, ay, az
species, mass, ionization
Internal Representation
float x, y, z;
float vx, vy, vz;
float ax, ay, az;
unsigned char species;
double mass;
short int ionization;
float = 4 bytes
unsigned char = 1 byte
double = 8 bytes
short int = 2 bytes
9 * 4 + 1 + 8 + 2
=
47 bytes for each record
another derived data type
position
x, y, z
velocity
vx, vy, vz
acceleration
ax, ay, az
vibrational state, rotational state
components
atom1, atom2
Data Structures
dynamic - not predefined and can be updated with new data
organized
data sets
A way to organize your data
Typical Data Structure Operations
from Cormen et al.
Search(S, k)
Insert(S, x)
Delete(S, x)
Minimum(S)
Maximum(S)
Successor(S,x)
Predecessor(S,x)
where
S = a set of data, k = a key, and x = a data element
A Random Array of Data
elements are not in order
most operations require you to search the data set
max, min, successor, predecessor
deletes require memory to be moved
-------------------------------------
A Sorted Array of Data
elements are in order
most operations require no searching
max, min, successor, predecessor
deletes and inserts still require memory to be moved
basic data structures: stacks
last in, first out
(LIFO)
input data is pushed on the stack
2 23 56 3
push 5
5 2 23 56 3
output data is popped from the stack
pop x
2 23 56 3
and x=5
similar to HP calculators
basic data structures: queues
first in, first out
(FIFO)
input data is enqueued
2 23 56 3
enqueue 5
5 2 23 56 3
output data is dequeued
dequeue x
5 2 23 56
and x=3
similar to supermarkets
basic data structures: linked lists
useful for tracking binned data
---------
1) each bin points to the first element in the bin
2) each element points to the next element in the same bin
3) each element may point to the previous element in the same bin
4) the final element in a bin points to nothing
derived data types
element list
data
pointer to the next element
(pointer to the previous element)
-----------
bins list
# elements in the bin
Head of Chains (HOC)
pointer to first data element
science applications
tracking hurricanes
cities: Miami, Charlottesville, etc.
Hurricanes: A50, B50, C50,...., F96
each Hurricane hit a major city
The list may be something like:
Miami: B53, E59, A67, G75, A91
Charlottesville: H92, F96
The HOC(Miami) = B53
Next(B53) = E59
Next(E59) = A67
Next(A67) = G75
Next(G75) = A91
Next(A91) = null
Why not use simple arrays for each City?
Waste of memory
----------
Why not just sort every hurricane by city?
Waste of cpu time
basic data structures: binary trees
1) data is divided into nodes
2) each node has left and right children (pointers)
3) each node has a key used for comparison
4) each node has a parent (pointer)
elements must be inserted as nodes
nodes must be inserted at empty children
the position of the data is determined by the nodes' keys
imagine an expert system used to identify animals
Does the animal have four legs?
Yes-
Does the animal have fur?
No-
Does the animal have two legs?
Etc...
Trees can be used to maintain and update searchable data lists.
searchable databases
calculation of clustering
multipole methods
image storage and compression (quad trees)
What happens if your tree is unbalanced?
hashing
random generation of keys from data
recursive bisection
dividing data into exactly equal groups
A set of instructions used to solve a problem.
"The information superhighway will save the environment and revolutionize education in the 21st century."
-An Al Gore -ism
How many operations will it take to complete this algorithm?
Binary Tree Searches
log N operations
Finding the minimum of a data set
N operations
Fast Fourier Transforms
N log N
Finding the minimum difference between data values
N^2 operations
Simple inversion of an N x N matrix
N^3 operations
How many operations does it take to multiply an N-digit number by another N-digit number?
1 2 6 3 7 3 7 3
x 3 4 5 6 7 8 7 4
---------------
First digit is multiplied by N digits
Second digit is multiplied by N digits
....
Nth digit is multiplied by N digits
There are also additions involved.
This is an N^2 algorithm.
Can this be done more efficiently?
Yes
FFT(product) = FFT(number1) x FFT(number2), multiplied term by term
FFT's take O(N log N) operations.
Inverse FFT's take O(N log N)
The whole multiplication takes O(N log N) operations.
This is much better for large N.
Does this really make a difference?
Assume each operation takes 1 microsecond and we have a million digits.
N^2 = 1,000,000 seconds
= 11.6 days
N log N = 14 seconds
70,000 times faster!
This makes some naive assumptions, but the difference is very dramatic.
This is MUCH better than getting a faster machine.
3 key ideas
iteration
recursion
bisection
cyclically refining an answer
Example:
solving well behaved
transcendental equations
X = sin (X + e)
trial solutions
reform equation as
X(I+1) = sin(X(I) + e)
Let e= 0.3
x(0) = 0.5
then :
x(1) = 0.7174
x(2) = 0.8507
x(3) = 0.9131
x(4) = 0.9367
x(5) = 0.9447
x(6) = 0.9473
x(7) = 0.9481
x(8) = 0.9484
x(9) = 0.9485
...
x(100) = 0.9485
the iteration converges
LCG's
random number generators based on iteration
x(i+1) = (x(i) * a + c ) mod m
Generally
x(i) = integer between 0 and m-1
a , c integers
Park and Miller's minimal standard generator
a = 16807
m = 2^31 - 1 = 2147483647
c= 0
Are LCG's truly random?
No.
Are LCG's truly reliable?
No. It depends on the values of a, c, and m. Bad values can lead to disastrous results.
the infamous IBM LCG
a = 65539
m = 2^31
11 planes are visible when you plot
x(i+1) vs x(i)
"We guarantee that each number is random individually, but we don't guarantee that more than one of them is random."
anonymous computer consultant
from Press et al.
executing a subroutine from within the same subroutine
Example:
Factorial
n!
Some C code:
int fact(int x) {
    if (x > 1)
        return fact(x-1)*x;
    else
        return 1;
}
Fibonacci Series
1 1 2 3 5 8 13 ....
N(i) = N(i-1) + N(i-2)
Is this something that recursion can be applied to?
divide and conquer techniques
assume your problem scales as N^2
if you can break the problem into two groups of N/2, the calculation will take 2 (N/2)^2 = N^2/2 calculations
You can recursively repeat this technique
Can change N^2 into
N log N
Science Examples
Fast Fourier Transform
used in signal processing and field equations
changes N^2 to N log N
----------
Multipole Expansion
used primarily to solve field equations
changes N^2 to N log N
often uses tree data structures
a bad algorithm
N(N-1) calculations
O(N^2)
compares every number with every other number
Bubble(A, r)
for i = 1, r-1
    x = A(i)
    for j = i+1, r
        if A(j) < x then
            exchange A(i) and A(j)
            x = A(i)
x = 34
12 35 43 57 34 23 45 23
x=12
12 35 43 57 34 23 45 23
x = 35
12 35 43 57 34 23 45 23
12 34 43 57 35 23 45 23
12 23 43 57 35 34 45 23
a good algorithm
two parts
a partitioning routine
a quicksort routine
Typical performance
N log N
Worst Case
N^2
example taken from
Cormen, Leiserson, and Rivest
Introduction to Algorithms
QuickSort(A, p, r)
if p < r
    then q = Partition(A,p,r)
        QuickSort(A,p,q)
        QuickSort(A,q+1,r)
Notes:
initially call with
A = unsorted array
p = 1
r = size of (A)
Partition(A, p, r)
x = A[p]
i = p - 1
j = r + 1
while TRUE
do repeat j = j -1
until A[j] <= x
repeat i = i + 1
until A[i] >= x
if i < j
then exchange A[i] with A[j]
else return j
From Cormen et al.
Divides and partitions array into above and below x
Returns the index where this partition occurs
34 35 43 57 12 23 45 23
x = 34
23 35 43 57 12 23 45 34
23 23 43 57 12 35 45 34
23 23 12 57 43 35 45 34
j=3 is returned
thus
23 23 12 < 34
57 43 35 45 34 >= 34
repeat with smaller arrays
Unix Literacy
Compilers
cc gcc
f77
CC g++
basic options/flags
-g enable debugging
-o output file name
-O0 -O1 -O2 -O3
set optimization level
-c compile only, do not link
-l link to this library
-L path of other libraries
-I path of included files
cc -o dog -lm -g dog.c
to compile a simple one-file program
cc -o dog.o -g -c dog.c
cc -o cat.o -g -c cat.c
cc dog.o cat.o -lm -o bigdog
Other Unix Commands
alias mroe 'more'
path
source
.cshrc
This file is executed whenever you start a new C-shell.
redirecting io
cmd > newfile (creates or overwrites the file)
cmd >> existingfile (appends to the file)
piping
cmd1 | cmd2
echo
env
emacs
Homework Assignment
1) edit your .cshrc file and modify your path to include
~jwallin/bin/
2) source the .cshrc file to update your path
3) execute the command "ddog" and pipe the results into a
file
4) create an alias in your .cshrc file for ddog called "mdog"
5) pipe the set of environmental variables into a file
6) cp the files ~jwallin/bin/test.c, ~jwallin/bin/test.f,
~jwallin/bin/test.cc into your directory
7) compile them with either the system or gnu compilers
8) execute the fortran and c++ codes, and pipe the results into a file
9) execute the compiled test.c code
10) using emacs, copy your .cshrc file, your ddog results,
and the results of your compilations into an html file
11) call the html file "hw2.html" and place it in your public_html
directory
12) add a link to your main html page for this assignment
13) make sure the permissions are set so I can read them through
the web!